linux-kernel.vger.kernel.org archive mirror
* [PATCH v6 net-next 0/5] use bulk reads for ocelot statistics
@ 2022-02-10  4:13 Colin Foster
  2022-02-10  4:13 ` [PATCH v6 net-next 1/5] net: mscc: ocelot: fix mutex lock error during ethtool stats read Colin Foster
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Colin Foster @ 2022-02-10  4:13 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: Jakub Kicinski, David S. Miller, UNGLinuxDriver,
	Alexandre Belloni, Claudiu Manoil, Vladimir Oltean

Ocelot loops over memory regions to gather stats on different ports.
These regions are mostly contiguous and are ordered. This patch set
uses that information to break the stats reads into regions that can be
read in bulk.

The motivation is general cleanup, but also SPI. Performing two
back-to-back reads on a SPI bus requires toggling the CS line, holding,
re-toggling the CS line, sending 3 address bytes, sending N padding
bytes, then actually performing the read. Bulk reads could remove almost
all of that overhead, but require that the reads are performed via
regmap_bulk_read.
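
As a rough illustration of the overhead described above, the sketch below
counts only the bytes clocked on the bus for N single reads versus one bulk
read. It is a simplified model, not a measurement: the 3 address bytes come
from the text above, while the 4-byte register width and the function names
are assumptions made for the example, and CS toggle and hold times are
ignored entirely.

```c
#include <assert.h>
#include <stddef.h>

#define SPI_ADDR_BYTES 3u /* address bytes per transaction, per the text above */
#define SPI_REG_BYTES  4u /* assumption: each counter is a 32-bit register */

/* Bytes clocked when every register is read in its own transaction. */
size_t spi_bytes_single(size_t nregs, size_t pad_bytes)
{
	return nregs * (SPI_ADDR_BYTES + pad_bytes + SPI_REG_BYTES);
}

/* Bytes clocked when all registers are read in one bulk transaction. */
size_t spi_bytes_bulk(size_t nregs, size_t pad_bytes)
{
	if (nregs == 0)
		return 0;

	return SPI_ADDR_BYTES + pad_bytes + nregs * SPI_REG_BYTES;
}
```

Under these assumptions, ten contiguous counters with one padding byte cost
80 bytes as single reads and 44 bytes as one bulk read, before counting any
CS handling.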

Verified with eth0 hooked up to the CPU port:
# ethtool -S eth0 | grep -v ": 0"
NIC statistics:
     Good Rx Frames: 8352
     Rx Octets: 10972241
     Good Tx Frames: 1674
     Tx Octets: 146253
     Rx + Tx 65-127 Octet Frames: 2565
     Rx + Tx 128-255 Octet Frames: 93
     Rx + Tx 256-511 Octet Frames: 158
     Rx + Tx 512-1023 Octet Frames: 271
     Rx + Tx 1024-Up Octet Frames: 6939
     Net Octets: 11118494
     Rx DMA chan 0: head_enqueue: 1
     Rx DMA chan 0: tail_enqueue: 8479
     Rx DMA chan 0: busy_dequeue: 7614
     Rx DMA chan 0: good_dequeue: 8352
     Tx DMA chan 0: head_enqueue: 1335
     Tx DMA chan 0: tail_enqueue: 339
     Tx DMA chan 0: misqueued: 339
     Tx DMA chan 0: empty_dequeue: 1335
     Tx DMA chan 0: good_dequeue: 1674
     p00_rx_octets: 146253
     p00_rx_unicast: 1674
     p00_rx_frames_65_to_127_octets: 1666
     p00_rx_frames_128_to_255_octets: 7
     p00_rx_frames_over_1526_octets: 1
     p00_tx_octets: 10972241
     p00_tx_unicast: 8352
     p00_tx_frames_65_to_127_octets: 899
     p00_tx_frames_128_255_octets: 86
     p00_tx_frames_256_511_octets: 158
     p00_tx_frames_512_1023_octets: 271
     p00_tx_frames_1024_1526_octets: 222
     p00_tx_frames_over_1526_octets: 6716
     p00_tx_green_prio_0: 8352


And with swp2 connected to swp3 with STP enabled:
# ethtool -S swp2 | grep -v ": 0"
NIC statistics:
     tx_packets: 397
     tx_bytes: 20634
     rx_packets: 1
     rx_bytes: 46
     rx_octets: 64
     rx_multicast: 1
     rx_frames_below_65_octets: 1
     rx_classified_drops: 1
     tx_octets: 46586
     tx_multicast: 404
     tx_broadcast: 303
     tx_frames_below_65_octets: 397
     tx_frames_65_to_127_octets: 306
     tx_frames_128_255_octets: 4
     tx_green_prio_0: 311
     tx_green_prio_7: 396
# ethtool -S swp3 | grep -v ": 0"
NIC statistics:
     tx_packets: 1
     tx_bytes: 52
     rx_packets: 711
     rx_bytes: 34050
     rx_octets: 46848
     rx_multicast: 406
     rx_broadcast: 305
     rx_frames_below_65_octets: 399
     rx_frames_65_to_127_octets: 308
     rx_frames_128_to_255_octets: 4
     rx_classified_drops: 398
     rx_green_prio_0: 313
     tx_octets: 64
     tx_multicast: 1
     tx_frames_below_65_octets: 1
     tx_green_prio_7: 1


v1 > v2: reword commit messages
v2 > v3: correctly mark this for net-next when sending
v3 > v4: calloc array instead of zalloc per review
v4 > v5:
    Apply CR suggestions for whitespace
    Fix calloc / zalloc mixup
    Properly destroy workqueues
    Add third commit to split long macros
v5 > v6:
    Fix functionality - v5 was improperly tested
    Add bugfix for ethtool mutex lock
    Remove unnecessary ethtool stats reads


Colin Foster (5):
  net: mscc: ocelot: fix mutex lock error during ethtool stats read
  net: mscc: ocelot: remove unnecessary stat reading from ethtool
  net: ocelot: align macros for consistency
  net: mscc: ocelot: add ability to perform bulk reads
  net: mscc: ocelot: use bulk reads for stats

 drivers/net/ethernet/mscc/ocelot.c    | 97 +++++++++++++++++++++------
 drivers/net/ethernet/mscc/ocelot_io.c | 13 ++++
 include/soc/mscc/ocelot.h             | 57 +++++++++++-----
 3 files changed, 132 insertions(+), 35 deletions(-)

-- 
2.25.1



* [PATCH v6 net-next 1/5] net: mscc: ocelot: fix mutex lock error during ethtool stats read
  2022-02-10  4:13 [PATCH v6 net-next 0/5] use bulk reads for ocelot statistics Colin Foster
@ 2022-02-10  4:13 ` Colin Foster
  2022-02-10  9:50   ` Vladimir Oltean
  2022-02-10  4:13 ` [PATCH v6 net-next 2/5] net: mscc: ocelot: remove unnecessary stat reading from ethtool Colin Foster
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Colin Foster @ 2022-02-10  4:13 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: Jakub Kicinski, David S. Miller, UNGLinuxDriver,
	Alexandre Belloni, Claudiu Manoil, Vladimir Oltean,
	Vladimir Oltean

An ongoing workqueue populates the stats buffer. At the same time, a user
might query the statistics. While writing to the buffer is mutex-locked,
reading from the buffer wasn't. This could lead to buggy reads by ethtool.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Fixes: a556c76adc052 ("net: mscc: Add initial Ocelot switch support")
Reported-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/net/ethernet/mscc/ocelot.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index 455293aa6343..6933dff1dd37 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -1737,12 +1737,11 @@ void ocelot_get_strings(struct ocelot *ocelot, int port, u32 sset, u8 *data)
 }
 EXPORT_SYMBOL(ocelot_get_strings);
 
+/* Caller must hold &ocelot->stats_lock */
 static void ocelot_update_stats(struct ocelot *ocelot)
 {
 	int i, j;
 
-	mutex_lock(&ocelot->stats_lock);
-
 	for (i = 0; i < ocelot->num_phys_ports; i++) {
 		/* Configure the port to read the stats from */
 		ocelot_write(ocelot, SYS_STAT_CFG_STAT_VIEW(i), SYS_STAT_CFG);
@@ -1761,8 +1760,6 @@ static void ocelot_update_stats(struct ocelot *ocelot)
 					      ~(u64)U32_MAX) + val;
 		}
 	}
-
-	mutex_unlock(&ocelot->stats_lock);
 }
 
 static void ocelot_check_stats_work(struct work_struct *work)
@@ -1771,7 +1768,9 @@ static void ocelot_check_stats_work(struct work_struct *work)
 	struct ocelot *ocelot = container_of(del_work, struct ocelot,
 					     stats_work);
 
+	mutex_lock(&ocelot->stats_lock);
 	ocelot_update_stats(ocelot);
+	mutex_unlock(&ocelot->stats_lock);
 
 	queue_delayed_work(ocelot->stats_queue, &ocelot->stats_work,
 			   OCELOT_STATS_CHECK_DELAY);
@@ -1781,12 +1780,16 @@ void ocelot_get_ethtool_stats(struct ocelot *ocelot, int port, u64 *data)
 {
 	int i;
 
+	mutex_lock(&ocelot->stats_lock);
+
 	/* check and update now */
 	ocelot_update_stats(ocelot);
 
 	/* Copy all counters */
 	for (i = 0; i < ocelot->num_stats; i++)
 		*data++ = ocelot->stats[port * ocelot->num_stats + i];
+
+	mutex_unlock(&ocelot->stats_lock);
 }
 EXPORT_SYMBOL(ocelot_get_ethtool_stats);
 
-- 
2.25.1
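
The locking rule introduced by the patch above (the update helper expects
its caller to hold the lock, and both the periodic worker and the reader
take it) can be sketched in userspace roughly as follows. This is only an
illustration with a pthread mutex standing in for the kernel mutex; the
lock_* names, the struct layout and the hw[] array that plays the role of
the hardware counters are all made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define LOCK_NUM_STATS 4

struct lock_dev {
	pthread_mutex_t stats_lock;
	uint64_t stats[LOCK_NUM_STATS]; /* shared software copy of the counters */
	uint64_t hw[LOCK_NUM_STATS];    /* hypothetical stand-in for hardware */
};

/* Caller must hold &dev->stats_lock */
void lock_update_stats(struct lock_dev *dev)
{
	memcpy(dev->stats, dev->hw, sizeof(dev->stats));
}

/* Periodic worker: the lock is taken around the whole update. */
void lock_check_stats_work(struct lock_dev *dev)
{
	pthread_mutex_lock(&dev->stats_lock);
	lock_update_stats(dev);
	pthread_mutex_unlock(&dev->stats_lock);
}

/* Reader: update and copy-out happen under the same lock hold. */
void lock_get_stats(struct lock_dev *dev, uint64_t *data)
{
	pthread_mutex_lock(&dev->stats_lock);
	lock_update_stats(dev);
	memcpy(data, dev->stats, sizeof(dev->stats));
	pthread_mutex_unlock(&dev->stats_lock);
}
```

Keeping the copy-out inside the locked section is the point of the sketch:
the reader never observes a buffer that the worker is halfway through
updating.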



* [PATCH v6 net-next 2/5] net: mscc: ocelot: remove unnecessary stat reading from ethtool
  2022-02-10  4:13 [PATCH v6 net-next 0/5] use bulk reads for ocelot statistics Colin Foster
  2022-02-10  4:13 ` [PATCH v6 net-next 1/5] net: mscc: ocelot: fix mutex lock error during ethtool stats read Colin Foster
@ 2022-02-10  4:13 ` Colin Foster
  2022-02-10 10:34   ` Vladimir Oltean
  2022-02-10  4:13 ` [PATCH v6 net-next 3/5] net: ocelot: align macros for consistency Colin Foster
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Colin Foster @ 2022-02-10  4:13 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: Jakub Kicinski, David S. Miller, UNGLinuxDriver,
	Alexandre Belloni, Claudiu Manoil, Vladimir Oltean

The ocelot_update_stats function only needs to read from one port, yet it
was updating the stats for all ports. Update to only read the stats that
are necessary.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
---
 drivers/net/ethernet/mscc/ocelot.c | 33 +++++++++++++++---------------
 1 file changed, 16 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index 6933dff1dd37..ab36732e7d3f 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -1738,27 +1738,24 @@ void ocelot_get_strings(struct ocelot *ocelot, int port, u32 sset, u8 *data)
 EXPORT_SYMBOL(ocelot_get_strings);
 
 /* Caller must hold &ocelot->stats_lock */
-static void ocelot_update_stats(struct ocelot *ocelot)
+static void ocelot_update_stats_for_port(struct ocelot *ocelot, int port)
 {
-	int i, j;
+	int j;
 
-	for (i = 0; i < ocelot->num_phys_ports; i++) {
-		/* Configure the port to read the stats from */
-		ocelot_write(ocelot, SYS_STAT_CFG_STAT_VIEW(i), SYS_STAT_CFG);
+	/* Configure the port to read the stats from */
+	ocelot_write(ocelot, SYS_STAT_CFG_STAT_VIEW(port), SYS_STAT_CFG);
 
-		for (j = 0; j < ocelot->num_stats; j++) {
-			u32 val;
-			unsigned int idx = i * ocelot->num_stats + j;
+	for (j = 0; j < ocelot->num_stats; j++) {
+		u32 val;
+		unsigned int idx = port * ocelot->num_stats + j;
 
-			val = ocelot_read_rix(ocelot, SYS_COUNT_RX_OCTETS,
-					      ocelot->stats_layout[j].offset);
+		val = ocelot_read_rix(ocelot, SYS_COUNT_RX_OCTETS,
+				      ocelot->stats_layout[j].offset);
 
-			if (val < (ocelot->stats[idx] & U32_MAX))
-				ocelot->stats[idx] += (u64)1 << 32;
+		if (val < (ocelot->stats[idx] & U32_MAX))
+			ocelot->stats[idx] += (u64)1 << 32;
 
-			ocelot->stats[idx] = (ocelot->stats[idx] &
-					      ~(u64)U32_MAX) + val;
-		}
+		ocelot->stats[idx] = (ocelot->stats[idx] & ~(u64)U32_MAX) + val;
 	}
 }
 
@@ -1767,9 +1764,11 @@ static void ocelot_check_stats_work(struct work_struct *work)
 	struct delayed_work *del_work = to_delayed_work(work);
 	struct ocelot *ocelot = container_of(del_work, struct ocelot,
 					     stats_work);
+	int i;
 
 	mutex_lock(&ocelot->stats_lock);
-	ocelot_update_stats(ocelot);
+	for (i = 0; i < ocelot->num_phys_ports; i++)
+		ocelot_update_stats_for_port(ocelot, i);
 	mutex_unlock(&ocelot->stats_lock);
 
 	queue_delayed_work(ocelot->stats_queue, &ocelot->stats_work,
@@ -1783,7 +1782,7 @@ void ocelot_get_ethtool_stats(struct ocelot *ocelot, int port, u64 *data)
 	mutex_lock(&ocelot->stats_lock);
 
 	/* check and update now */
-	ocelot_update_stats(ocelot);
+	ocelot_update_stats_for_port(ocelot, port);
 
 	/* Copy all counters */
 	for (i = 0; i < ocelot->num_stats; i++)
-- 
2.25.1
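
The per-port update in the patch above combines two ideas: only the slice
of the stats array belonging to one port is touched, and each 32-bit
hardware value is folded into a 64-bit software counter with wraparound
detection. A minimal userspace sketch of both is given below. The port_*
names, the fixed array sizes and the hw[][] array standing in for register
reads are assumptions for the example, not the driver's data structures.

```c
#include <assert.h>
#include <stdint.h>

#define PORT_NUM_PORTS 2
#define PORT_NUM_STATS 3

struct port_dev {
	/* 64-bit software counters, laid out port after port */
	uint64_t stats[PORT_NUM_PORTS * PORT_NUM_STATS];
	/* hypothetical stand-in for the 32-bit hardware counters */
	uint32_t hw[PORT_NUM_PORTS][PORT_NUM_STATS];
};

/* Update only the counters that belong to @port. */
void port_update_stats(struct port_dev *dev, int port)
{
	for (int j = 0; j < PORT_NUM_STATS; j++) {
		unsigned int idx = port * PORT_NUM_STATS + j;
		uint32_t val = dev->hw[port][j];

		/* A smaller low word than last time means the counter wrapped. */
		if (val < (dev->stats[idx] & UINT32_MAX))
			dev->stats[idx] += (uint64_t)1 << 32;

		/* Keep the upper 32 bits, replace the lower 32 bits. */
		dev->stats[idx] = (dev->stats[idx] & ~(uint64_t)UINT32_MAX) + val;
	}
}
```

Note that this scheme can only detect a single wrap between two updates,
which is one reason the counters are also refreshed periodically.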



* [PATCH v6 net-next 3/5] net: ocelot: align macros for consistency
  2022-02-10  4:13 [PATCH v6 net-next 0/5] use bulk reads for ocelot statistics Colin Foster
  2022-02-10  4:13 ` [PATCH v6 net-next 1/5] net: mscc: ocelot: fix mutex lock error during ethtool stats read Colin Foster
  2022-02-10  4:13 ` [PATCH v6 net-next 2/5] net: mscc: ocelot: remove unnecessary stat reading from ethtool Colin Foster
@ 2022-02-10  4:13 ` Colin Foster
  2022-02-10  4:13 ` [PATCH v6 net-next 4/5] net: mscc: ocelot: add ability to perform bulk reads Colin Foster
  2022-02-10  4:13 ` [PATCH v6 net-next 5/5] net: mscc: ocelot: use bulk reads for stats Colin Foster
  4 siblings, 0 replies; 13+ messages in thread
From: Colin Foster @ 2022-02-10  4:13 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: Jakub Kicinski, David S. Miller, UNGLinuxDriver,
	Alexandre Belloni, Claudiu Manoil, Vladimir Oltean

In the ocelot.h file, several read / write macros were split across
multiple lines, while others weren't. Split all macros that exceed the 80
character column width and match the style of the rest of the file.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/soc/mscc/ocelot.h | 44 ++++++++++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
index 62cd61d4142e..14a6f4de8e1f 100644
--- a/include/soc/mscc/ocelot.h
+++ b/include/soc/mscc/ocelot.h
@@ -744,25 +744,39 @@ struct ocelot_policer {
 	u32 burst; /* bytes */
 };
 
-#define ocelot_read_ix(ocelot, reg, gi, ri) __ocelot_read_ix(ocelot, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
-#define ocelot_read_gix(ocelot, reg, gi) __ocelot_read_ix(ocelot, reg, reg##_GSZ * (gi))
-#define ocelot_read_rix(ocelot, reg, ri) __ocelot_read_ix(ocelot, reg, reg##_RSZ * (ri))
-#define ocelot_read(ocelot, reg) __ocelot_read_ix(ocelot, reg, 0)
-
-#define ocelot_write_ix(ocelot, val, reg, gi, ri) __ocelot_write_ix(ocelot, val, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
-#define ocelot_write_gix(ocelot, val, reg, gi) __ocelot_write_ix(ocelot, val, reg, reg##_GSZ * (gi))
-#define ocelot_write_rix(ocelot, val, reg, ri) __ocelot_write_ix(ocelot, val, reg, reg##_RSZ * (ri))
+#define ocelot_read_ix(ocelot, reg, gi, ri) \
+	__ocelot_read_ix(ocelot, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
+#define ocelot_read_gix(ocelot, reg, gi) \
+	__ocelot_read_ix(ocelot, reg, reg##_GSZ * (gi))
+#define ocelot_read_rix(ocelot, reg, ri) \
+	__ocelot_read_ix(ocelot, reg, reg##_RSZ * (ri))
+#define ocelot_read(ocelot, reg) \
+	__ocelot_read_ix(ocelot, reg, 0)
+
+#define ocelot_write_ix(ocelot, val, reg, gi, ri) \
+	__ocelot_write_ix(ocelot, val, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
+#define ocelot_write_gix(ocelot, val, reg, gi) \
+	__ocelot_write_ix(ocelot, val, reg, reg##_GSZ * (gi))
+#define ocelot_write_rix(ocelot, val, reg, ri) \
+	__ocelot_write_ix(ocelot, val, reg, reg##_RSZ * (ri))
 #define ocelot_write(ocelot, val, reg) __ocelot_write_ix(ocelot, val, reg, 0)
 
-#define ocelot_rmw_ix(ocelot, val, m, reg, gi, ri) __ocelot_rmw_ix(ocelot, val, m, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
-#define ocelot_rmw_gix(ocelot, val, m, reg, gi) __ocelot_rmw_ix(ocelot, val, m, reg, reg##_GSZ * (gi))
-#define ocelot_rmw_rix(ocelot, val, m, reg, ri) __ocelot_rmw_ix(ocelot, val, m, reg, reg##_RSZ * (ri))
+#define ocelot_rmw_ix(ocelot, val, m, reg, gi, ri) \
+	__ocelot_rmw_ix(ocelot, val, m, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
+#define ocelot_rmw_gix(ocelot, val, m, reg, gi) \
+	__ocelot_rmw_ix(ocelot, val, m, reg, reg##_GSZ * (gi))
+#define ocelot_rmw_rix(ocelot, val, m, reg, ri) \
+	__ocelot_rmw_ix(ocelot, val, m, reg, reg##_RSZ * (ri))
 #define ocelot_rmw(ocelot, val, m, reg) __ocelot_rmw_ix(ocelot, val, m, reg, 0)
 
-#define ocelot_field_write(ocelot, reg, val) regmap_field_write((ocelot)->regfields[(reg)], (val))
-#define ocelot_field_read(ocelot, reg, val) regmap_field_read((ocelot)->regfields[(reg)], (val))
-#define ocelot_fields_write(ocelot, id, reg, val) regmap_fields_write((ocelot)->regfields[(reg)], (id), (val))
-#define ocelot_fields_read(ocelot, id, reg, val) regmap_fields_read((ocelot)->regfields[(reg)], (id), (val))
+#define ocelot_field_write(ocelot, reg, val) \
+	regmap_field_write((ocelot)->regfields[(reg)], (val))
+#define ocelot_field_read(ocelot, reg, val) \
+	regmap_field_read((ocelot)->regfields[(reg)], (val))
+#define ocelot_fields_write(ocelot, id, reg, val) \
+	regmap_fields_write((ocelot)->regfields[(reg)], (id), (val))
+#define ocelot_fields_read(ocelot, id, reg, val) \
+	regmap_fields_read((ocelot)->regfields[(reg)], (id), (val))
 
 #define ocelot_target_read_ix(ocelot, target, reg, gi, ri) \
 	__ocelot_target_read_ix(ocelot, target, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
-- 
2.25.1



* [PATCH v6 net-next 4/5] net: mscc: ocelot: add ability to perform bulk reads
  2022-02-10  4:13 [PATCH v6 net-next 0/5] use bulk reads for ocelot statistics Colin Foster
                   ` (2 preceding siblings ...)
  2022-02-10  4:13 ` [PATCH v6 net-next 3/5] net: ocelot: align macros for consistency Colin Foster
@ 2022-02-10  4:13 ` Colin Foster
  2022-02-10  4:13 ` [PATCH v6 net-next 5/5] net: mscc: ocelot: use bulk reads for stats Colin Foster
  4 siblings, 0 replies; 13+ messages in thread
From: Colin Foster @ 2022-02-10  4:13 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: Jakub Kicinski, David S. Miller, UNGLinuxDriver,
	Alexandre Belloni, Claudiu Manoil, Vladimir Oltean

Regmap supports bulk register reads. Ocelot does not. This patch adds
support for Ocelot to invoke bulk regmap reads. That will allow any driver
that performs consecutive reads over memory regions to optimize that
access.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/ethernet/mscc/ocelot_io.c | 13 +++++++++++++
 include/soc/mscc/ocelot.h             |  5 +++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/net/ethernet/mscc/ocelot_io.c b/drivers/net/ethernet/mscc/ocelot_io.c
index 7390fa3980ec..2067382d0ee1 100644
--- a/drivers/net/ethernet/mscc/ocelot_io.c
+++ b/drivers/net/ethernet/mscc/ocelot_io.c
@@ -10,6 +10,19 @@
 
 #include "ocelot.h"
 
+int __ocelot_bulk_read_ix(struct ocelot *ocelot, u32 reg, u32 offset, void *buf,
+			  int count)
+{
+	u16 target = reg >> TARGET_OFFSET;
+
+	WARN_ON(!target);
+
+	return regmap_bulk_read(ocelot->targets[target],
+				ocelot->map[target][reg & REG_MASK] + offset,
+				buf, count);
+}
+EXPORT_SYMBOL_GPL(__ocelot_bulk_read_ix);
+
 u32 __ocelot_read_ix(struct ocelot *ocelot, u32 reg, u32 offset)
 {
 	u16 target = reg >> TARGET_OFFSET;
diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
index 14a6f4de8e1f..312b72558659 100644
--- a/include/soc/mscc/ocelot.h
+++ b/include/soc/mscc/ocelot.h
@@ -744,6 +744,9 @@ struct ocelot_policer {
 	u32 burst; /* bytes */
 };
 
+#define ocelot_bulk_read_rix(ocelot, reg, ri, buf, count) \
+	__ocelot_bulk_read_ix(ocelot, reg, reg##_RSZ * (ri), buf, count)
+
 #define ocelot_read_ix(ocelot, reg, gi, ri) \
 	__ocelot_read_ix(ocelot, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
 #define ocelot_read_gix(ocelot, reg, gi) \
@@ -800,6 +803,8 @@ struct ocelot_policer {
 u32 ocelot_port_readl(struct ocelot_port *port, u32 reg);
 void ocelot_port_writel(struct ocelot_port *port, u32 val, u32 reg);
 void ocelot_port_rmwl(struct ocelot_port *port, u32 val, u32 mask, u32 reg);
+int __ocelot_bulk_read_ix(struct ocelot *ocelot, u32 reg, u32 offset, void *buf,
+			  int count);
 u32 __ocelot_read_ix(struct ocelot *ocelot, u32 reg, u32 offset);
 void __ocelot_write_ix(struct ocelot *ocelot, u32 val, u32 reg, u32 offset);
 void __ocelot_rmw_ix(struct ocelot *ocelot, u32 val, u32 mask, u32 reg,
-- 
2.25.1
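
The shape of the bulk-read helper and its indexing macro in the patch above
can be mimicked in userspace as a sketch. Here a plain array stands in for
the regmap-backed target, and the bulk_* names, BULK_COUNT register and its
4-byte _RSZ stride are hypothetical. The bounds check is an addition of this
sketch; in the driver, range checking is left to regmap.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BULK_NUM_REGS 16

struct bulk_dev {
	uint32_t regs[BULK_NUM_REGS]; /* stand-in for a register target */
};

enum { BULK_COUNT = 0 };  /* hypothetical register at byte address 0 */
#define BULK_COUNT_RSZ 4  /* hypothetical replication stride in bytes */

/* Read @count consecutive 32-bit registers starting at @reg + @offset. */
int bulk_read_ix(struct bulk_dev *dev, uint32_t reg, uint32_t offset,
		 void *buf, size_t count)
{
	size_t first = (reg + offset) / sizeof(uint32_t);

	if (first > BULK_NUM_REGS || count > BULK_NUM_REGS - first)
		return -EINVAL;

	memcpy(buf, &dev->regs[first], count * sizeof(uint32_t));
	return 0;
}

/* Token pasting turns the register name into its stride, e.g. BULK_COUNT_RSZ. */
#define bulk_read_rix(dev, reg, ri, buf, count) \
	bulk_read_ix(dev, reg, reg##_RSZ * (ri), buf, count)
```

For instance, bulk_read_rix(&dev, BULK_COUNT, 2, buf, 3) reads the third,
fourth and fifth registers of the array in one call.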



* [PATCH v6 net-next 5/5] net: mscc: ocelot: use bulk reads for stats
  2022-02-10  4:13 [PATCH v6 net-next 0/5] use bulk reads for ocelot statistics Colin Foster
                   ` (3 preceding siblings ...)
  2022-02-10  4:13 ` [PATCH v6 net-next 4/5] net: mscc: ocelot: add ability to perform bulk reads Colin Foster
@ 2022-02-10  4:13 ` Colin Foster
  2022-02-10 10:36   ` Vladimir Oltean
  4 siblings, 1 reply; 13+ messages in thread
From: Colin Foster @ 2022-02-10  4:13 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: Jakub Kicinski, David S. Miller, UNGLinuxDriver,
	Alexandre Belloni, Claudiu Manoil, Vladimir Oltean

Create and utilize bulk regmap reads instead of single access for gathering
stats. The background reading of statistics happens frequently, and over
a few contiguous memory regions.

High speed PCIe buses and MMIO access will probably see negligible
performance increase. Lower speed buses like SPI and I2C could see
significant performance increase, since the bus configuration and register
access times account for a large percentage of data transfer time.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
---
 drivers/net/ethernet/mscc/ocelot.c | 79 +++++++++++++++++++++++++-----
 include/soc/mscc/ocelot.h          |  8 +++
 2 files changed, 75 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index ab36732e7d3f..fdbd31149dfc 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -1738,25 +1738,36 @@ void ocelot_get_strings(struct ocelot *ocelot, int port, u32 sset, u8 *data)
 EXPORT_SYMBOL(ocelot_get_strings);
 
 /* Caller must hold &ocelot->stats_lock */
-static void ocelot_update_stats_for_port(struct ocelot *ocelot, int port)
+static int ocelot_update_stats_for_port(struct ocelot *ocelot, int port)
 {
-	int j;
+	unsigned int idx = port * ocelot->num_stats;
+	struct ocelot_stats_region *region;
+	int err = 0, j;
 
 	/* Configure the port to read the stats from */
 	ocelot_write(ocelot, SYS_STAT_CFG_STAT_VIEW(port), SYS_STAT_CFG);
 
-	for (j = 0; j < ocelot->num_stats; j++) {
-		u32 val;
-		unsigned int idx = port * ocelot->num_stats + j;
+	list_for_each_entry(region, &ocelot->stats_regions, node) {
+		err = ocelot_bulk_read_rix(ocelot, SYS_COUNT_RX_OCTETS,
+					   region->offset, region->buf,
+					   region->count);
+		if (err)
+			return err;
 
-		val = ocelot_read_rix(ocelot, SYS_COUNT_RX_OCTETS,
-				      ocelot->stats_layout[j].offset);
+		for (j = 0; j < region->count; j++) {
+			u64 *stat = &ocelot->stats[idx + j];
+			u64 val = region->buf[j];
 
-		if (val < (ocelot->stats[idx] & U32_MAX))
-			ocelot->stats[idx] += (u64)1 << 32;
+			if (val < (*stat & U32_MAX))
+				*stat += (u64)1 << 32;
+
+			*stat = (*stat & ~(u64)U32_MAX) + val;
+		}
 
-		ocelot->stats[idx] = (ocelot->stats[idx] & ~(u64)U32_MAX) + val;
+		idx += region->count;
 	}
+
+	return err;
 }
 
 static void ocelot_check_stats_work(struct work_struct *work)
@@ -1777,12 +1788,14 @@ static void ocelot_check_stats_work(struct work_struct *work)
 
 void ocelot_get_ethtool_stats(struct ocelot *ocelot, int port, u64 *data)
 {
-	int i;
+	int i, err;
 
 	mutex_lock(&ocelot->stats_lock);
 
 	/* check and update now */
-	ocelot_update_stats_for_port(ocelot, port);
+	err = ocelot_update_stats_for_port(ocelot, port);
+	if (err)
+		dev_err(ocelot->dev, "Error %d updating ethtool stats\n", err);
 
 	/* Copy all counters */
 	for (i = 0; i < ocelot->num_stats; i++)
@@ -1801,6 +1814,41 @@ int ocelot_get_sset_count(struct ocelot *ocelot, int port, int sset)
 }
 EXPORT_SYMBOL(ocelot_get_sset_count);
 
+static int ocelot_prepare_stats_regions(struct ocelot *ocelot)
+{
+	struct ocelot_stats_region *region = NULL;
+	unsigned int last;
+	int i;
+
+	INIT_LIST_HEAD(&ocelot->stats_regions);
+
+	for (i = 0; i < ocelot->num_stats; i++) {
+		if (region && ocelot->stats_layout[i].offset == last + 1) {
+			region->count++;
+		} else {
+			region = devm_kzalloc(ocelot->dev, sizeof(*region),
+					      GFP_KERNEL);
+			if (!region)
+				return -ENOMEM;
+
+			region->offset = ocelot->stats_layout[i].offset;
+			region->count = 1;
+			list_add_tail(&region->node, &ocelot->stats_regions);
+		}
+
+		last = ocelot->stats_layout[i].offset;
+	}
+
+	list_for_each_entry(region, &ocelot->stats_regions, node) {
+		region->buf = devm_kcalloc(ocelot->dev, region->count,
+					   sizeof(*region->buf), GFP_KERNEL);
+		if (!region->buf)
+			return -ENOMEM;
+	}
+
+	return 0;
+}
+
 int ocelot_get_ts_info(struct ocelot *ocelot, int port,
 		       struct ethtool_ts_info *info)
 {
@@ -2801,6 +2849,13 @@ int ocelot_init(struct ocelot *ocelot)
 				 ANA_CPUQ_8021_CFG_CPUQ_BPDU_VAL(6),
 				 ANA_CPUQ_8021_CFG, i);
 
+	ret = ocelot_prepare_stats_regions(ocelot);
+	if (ret) {
+		destroy_workqueue(ocelot->stats_queue);
+		destroy_workqueue(ocelot->owq);
+		return ret;
+	}
+
 	INIT_DELAYED_WORK(&ocelot->stats_work, ocelot_check_stats_work);
 	queue_delayed_work(ocelot->stats_queue, &ocelot->stats_work,
 			   OCELOT_STATS_CHECK_DELAY);
diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
index 312b72558659..d3291a5f7e88 100644
--- a/include/soc/mscc/ocelot.h
+++ b/include/soc/mscc/ocelot.h
@@ -542,6 +542,13 @@ struct ocelot_stat_layout {
 	char name[ETH_GSTRING_LEN];
 };
 
+struct ocelot_stats_region {
+	struct list_head node;
+	u32 offset;
+	int count;
+	u32 *buf;
+};
+
 enum ocelot_tag_prefix {
 	OCELOT_TAG_PREFIX_DISABLED	= 0,
 	OCELOT_TAG_PREFIX_NONE,
@@ -673,6 +680,7 @@ struct ocelot {
 	struct regmap_field		*regfields[REGFIELD_MAX];
 	const u32 *const		*map;
 	const struct ocelot_stat_layout	*stats_layout;
+	struct list_head		stats_regions;
 	unsigned int			num_stats;
 
 	u32				pool_size[OCELOT_SB_NUM][OCELOT_SB_POOL_NUM];
-- 
2.25.1
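
The region-building step in the patch above walks the ordered stats layout
and merges consecutive offsets into one region, so each region can later be
fetched with a single bulk read. The sketch below shows that coalescing in
isolation. It uses a caller-provided array instead of the linked list and
devm allocations in the patch, and the region_* names are invented for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct region_span {
	uint32_t offset; /* first register offset in the region */
	size_t count;    /* number of consecutive registers */
};

/*
 * Coalesce ordered offsets into contiguous regions.
 * @out must have room for @n entries (the worst case: nothing is adjacent).
 * Returns the number of regions written.
 */
size_t region_build(const uint32_t *offsets, size_t n, struct region_span *out)
{
	size_t nregions = 0;

	for (size_t i = 0; i < n; i++) {
		if (nregions && offsets[i] == offsets[i - 1] + 1) {
			/* Directly follows the previous offset: extend the region. */
			out[nregions - 1].count++;
		} else {
			/* Gap (or first entry): start a new region. */
			out[nregions].offset = offsets[i];
			out[nregions].count = 1;
			nregions++;
		}
	}

	return nregions;
}
```

With offsets 0, 1, 2, 5, 6, 9 this yields three regions: (0, 3), (5, 2) and
(9, 1), so six single reads become three bulk reads.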



* Re: [PATCH v6 net-next 1/5] net: mscc: ocelot: fix mutex lock error during ethtool stats read
  2022-02-10  4:13 ` [PATCH v6 net-next 1/5] net: mscc: ocelot: fix mutex lock error during ethtool stats read Colin Foster
@ 2022-02-10  9:50   ` Vladimir Oltean
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Oltean @ 2022-02-10  9:50 UTC (permalink / raw)
  To: Colin Foster
  Cc: linux-kernel, netdev, Jakub Kicinski, David S. Miller,
	UNGLinuxDriver, Alexandre Belloni, Claudiu Manoil,
	Vladimir Oltean

On Wed, Feb 09, 2022 at 08:13:41PM -0800, Colin Foster wrote:
> An ongoing workqueue populates the stats buffer. At the same time, a user
> might query the statistics. While writing to the buffer is mutex-locked,
> reading from the buffer wasn't. This could lead to buggy reads by ethtool.
> 
> Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
> Fixes: a556c76adc052 ("net: mscc: Add initial Ocelot switch support")
> Reported-by: Vladimir Oltean <olteanv@gmail.com>

I reported this using vladimir.oltean@nxp.com btw.

Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>

If you hurry and resend this against the "net" tree, you might catch
this week's pull request, since the last one was on Feb 3->4:
https://patchwork.kernel.org/project/netdevbpf/patch/20220204000428.2889873-1-kuba@kernel.org/
Then "net" will be merged into "net-next" probably the next day or so,
and you can resend patches 2-5 towards "net-next".

> ---
>  drivers/net/ethernet/mscc/ocelot.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
> index 455293aa6343..6933dff1dd37 100644
> --- a/drivers/net/ethernet/mscc/ocelot.c
> +++ b/drivers/net/ethernet/mscc/ocelot.c
> @@ -1737,12 +1737,11 @@ void ocelot_get_strings(struct ocelot *ocelot, int port, u32 sset, u8 *data)
>  }
>  EXPORT_SYMBOL(ocelot_get_strings);
>  
> +/* Caller must hold &ocelot->stats_lock */
>  static void ocelot_update_stats(struct ocelot *ocelot)
>  {
>  	int i, j;
>  
> -	mutex_lock(&ocelot->stats_lock);
> -
>  	for (i = 0; i < ocelot->num_phys_ports; i++) {
>  		/* Configure the port to read the stats from */
>  		ocelot_write(ocelot, SYS_STAT_CFG_STAT_VIEW(i), SYS_STAT_CFG);
> @@ -1761,8 +1760,6 @@ static void ocelot_update_stats(struct ocelot *ocelot)
>  					      ~(u64)U32_MAX) + val;
>  		}
>  	}
> -
> -	mutex_unlock(&ocelot->stats_lock);
>  }
>  
>  static void ocelot_check_stats_work(struct work_struct *work)
> @@ -1771,7 +1768,9 @@ static void ocelot_check_stats_work(struct work_struct *work)
>  	struct ocelot *ocelot = container_of(del_work, struct ocelot,
>  					     stats_work);
>  
> +	mutex_lock(&ocelot->stats_lock);
>  	ocelot_update_stats(ocelot);
> +	mutex_unlock(&ocelot->stats_lock);
>  
>  	queue_delayed_work(ocelot->stats_queue, &ocelot->stats_work,
>  			   OCELOT_STATS_CHECK_DELAY);
> @@ -1781,12 +1780,16 @@ void ocelot_get_ethtool_stats(struct ocelot *ocelot, int port, u64 *data)
>  {
>  	int i;
>  
> +	mutex_lock(&ocelot->stats_lock);
> +
>  	/* check and update now */
>  	ocelot_update_stats(ocelot);
>  
>  	/* Copy all counters */
>  	for (i = 0; i < ocelot->num_stats; i++)
>  		*data++ = ocelot->stats[port * ocelot->num_stats + i];
> +
> +	mutex_unlock(&ocelot->stats_lock);
>  }
>  EXPORT_SYMBOL(ocelot_get_ethtool_stats);
>  
> -- 
> 2.25.1
>


* Re: [PATCH v6 net-next 2/5] net: mscc: ocelot: remove unnecessary stat reading from ethtool
  2022-02-10  4:13 ` [PATCH v6 net-next 2/5] net: mscc: ocelot: remove unnecessary stat reading from ethtool Colin Foster
@ 2022-02-10 10:34   ` Vladimir Oltean
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Oltean @ 2022-02-10 10:34 UTC (permalink / raw)
  To: Colin Foster
  Cc: linux-kernel, netdev, Jakub Kicinski, David S. Miller,
	UNGLinuxDriver, Alexandre Belloni, Claudiu Manoil

On Wed, Feb 09, 2022 at 08:13:42PM -0800, Colin Foster wrote:
> The ocelot_update_stats function only needs to read from one port, yet it
> was updating the stats for all ports. Update to only read the stats that
> are necessary.
> 
> Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
> ---

Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>

>  drivers/net/ethernet/mscc/ocelot.c | 33 +++++++++++++++---------------
>  1 file changed, 16 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
> index 6933dff1dd37..ab36732e7d3f 100644
> --- a/drivers/net/ethernet/mscc/ocelot.c
> +++ b/drivers/net/ethernet/mscc/ocelot.c
> @@ -1738,27 +1738,24 @@ void ocelot_get_strings(struct ocelot *ocelot, int port, u32 sset, u8 *data)
>  EXPORT_SYMBOL(ocelot_get_strings);
>  
>  /* Caller must hold &ocelot->stats_lock */
> -static void ocelot_update_stats(struct ocelot *ocelot)
> +static void ocelot_update_stats_for_port(struct ocelot *ocelot, int port)

If you need to resend, I think a name more consistent with the rest of
the driver would be "ocelot_port_update_stats".

>  {
> -	int i, j;
> +	int j;
>  
> -	for (i = 0; i < ocelot->num_phys_ports; i++) {
> -		/* Configure the port to read the stats from */
> -		ocelot_write(ocelot, SYS_STAT_CFG_STAT_VIEW(i), SYS_STAT_CFG);
> +	/* Configure the port to read the stats from */
> +	ocelot_write(ocelot, SYS_STAT_CFG_STAT_VIEW(port), SYS_STAT_CFG);
>  
> -		for (j = 0; j < ocelot->num_stats; j++) {
> -			u32 val;
> -			unsigned int idx = i * ocelot->num_stats + j;
> +	for (j = 0; j < ocelot->num_stats; j++) {
> +		u32 val;
> +		unsigned int idx = port * ocelot->num_stats + j;
>  
> -			val = ocelot_read_rix(ocelot, SYS_COUNT_RX_OCTETS,
> -					      ocelot->stats_layout[j].offset);
> +		val = ocelot_read_rix(ocelot, SYS_COUNT_RX_OCTETS,
> +				      ocelot->stats_layout[j].offset);
>  
> -			if (val < (ocelot->stats[idx] & U32_MAX))
> -				ocelot->stats[idx] += (u64)1 << 32;
> +		if (val < (ocelot->stats[idx] & U32_MAX))
> +			ocelot->stats[idx] += (u64)1 << 32;
>  
> -			ocelot->stats[idx] = (ocelot->stats[idx] &
> -					      ~(u64)U32_MAX) + val;
> -		}
> +		ocelot->stats[idx] = (ocelot->stats[idx] & ~(u64)U32_MAX) + val;
>  	}
>  }
>  
> @@ -1767,9 +1764,11 @@ static void ocelot_check_stats_work(struct work_struct *work)
>  	struct delayed_work *del_work = to_delayed_work(work);
>  	struct ocelot *ocelot = container_of(del_work, struct ocelot,
>  					     stats_work);
> +	int i;
>  
>  	mutex_lock(&ocelot->stats_lock);
> -	ocelot_update_stats(ocelot);
> +	for (i = 0; i < ocelot->num_phys_ports; i++)
> +		ocelot_update_stats_for_port(ocelot, i);
>  	mutex_unlock(&ocelot->stats_lock);
>  
>  	queue_delayed_work(ocelot->stats_queue, &ocelot->stats_work,
> @@ -1783,7 +1782,7 @@ void ocelot_get_ethtool_stats(struct ocelot *ocelot, int port, u64 *data)
>  	mutex_lock(&ocelot->stats_lock);
>  
>  	/* check and update now */
> -	ocelot_update_stats(ocelot);
> +	ocelot_update_stats_for_port(ocelot, port);
>  
>  	/* Copy all counters */
>  	for (i = 0; i < ocelot->num_stats; i++)
> -- 
> 2.25.1
>
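
The counter update in the quoted loop extends a 32-bit hardware register into a 64-bit software counter: a reading smaller than the stored low word is taken to mean the register wrapped once since the last poll. A minimal userspace sketch of that step follows. The helper name stat_accumulate is hypothetical, and the sketch assumes the register wraps at most once between polls.

```c
#include <stdint.h>

/* Hypothetical helper (not in the patch): fold a fresh 32-bit hardware
 * reading into a 64-bit software counter, mirroring the quoted loop body.
 */
static void stat_accumulate(uint64_t *stat, uint32_t val)
{
	/* A reading below the stored low word means the register wrapped. */
	if (val < (*stat & UINT32_MAX))
		*stat += (uint64_t)1 << 32;

	/* Keep the upper 32 bits and replace the lower 32 with the reading. */
	*stat = (*stat & ~(uint64_t)UINT32_MAX) + val;
}
```

Feeding it 0xfffffff0 and then 0x10 leaves the counter at 0x100000010: the second reading is lower than the first, so the upper word is bumped by one before the low word is replaced.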


* Re: [PATCH v6 net-next 5/5] net: mscc: ocelot: use bulk reads for stats
  2022-02-10  4:13 ` [PATCH v6 net-next 5/5] net: mscc: ocelot: use bulk reads for stats Colin Foster
@ 2022-02-10 10:36   ` Vladimir Oltean
  2022-02-10 14:37     ` Colin Foster
  2022-02-10 15:21     ` Colin Foster
  0 siblings, 2 replies; 13+ messages in thread
From: Vladimir Oltean @ 2022-02-10 10:36 UTC (permalink / raw)
  To: Colin Foster
  Cc: linux-kernel, netdev, Jakub Kicinski, David S. Miller,
	UNGLinuxDriver, Alexandre Belloni, Claudiu Manoil

On Wed, Feb 09, 2022 at 08:13:45PM -0800, Colin Foster wrote:
> Create and utilize bulk regmap reads instead of single access for gathering
> stats. The background reading of statistics happens frequently, and over
> a few contiguous memory regions.
> 
> High speed PCIe buses and MMIO access will probably see negligible
> performance increase. Lower speed buses like SPI and I2C could see
> significant performance increase, since the bus configuration and register
> access times account for a large percentage of data transfer time.
> 
> Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
> ---
>  drivers/net/ethernet/mscc/ocelot.c | 79 +++++++++++++++++++++++++-----
>  include/soc/mscc/ocelot.h          |  8 +++
>  2 files changed, 75 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
> index ab36732e7d3f..fdbd31149dfc 100644
> --- a/drivers/net/ethernet/mscc/ocelot.c
> +++ b/drivers/net/ethernet/mscc/ocelot.c
> @@ -1738,25 +1738,36 @@ void ocelot_get_strings(struct ocelot *ocelot, int port, u32 sset, u8 *data)
>  EXPORT_SYMBOL(ocelot_get_strings);
>  
>  /* Caller must hold &ocelot->stats_lock */
> -static void ocelot_update_stats_for_port(struct ocelot *ocelot, int port)
> +static int ocelot_update_stats_for_port(struct ocelot *ocelot, int port)
>  {
> -	int j;
> +	unsigned int idx = port * ocelot->num_stats;
> +	struct ocelot_stats_region *region;
> +	int err, j;
>  
>  	/* Configure the port to read the stats from */
>  	ocelot_write(ocelot, SYS_STAT_CFG_STAT_VIEW(port), SYS_STAT_CFG);
>  
> -	for (j = 0; j < ocelot->num_stats; j++) {
> -		u32 val;
> -		unsigned int idx = port * ocelot->num_stats + j;
> +	list_for_each_entry(region, &ocelot->stats_regions, node) {
> +		err = ocelot_bulk_read_rix(ocelot, SYS_COUNT_RX_OCTETS,
> +					   region->offset, region->buf,
> +					   region->count);
> +		if (err)
> +			return err;
>  
> -		val = ocelot_read_rix(ocelot, SYS_COUNT_RX_OCTETS,
> -				      ocelot->stats_layout[j].offset);
> +		for (j = 0; j < region->count; j++) {
> +			u64 *stat = &ocelot->stats[idx + j];
> +			u64 val = region->buf[j];
>  
> -		if (val < (ocelot->stats[idx] & U32_MAX))
> -			ocelot->stats[idx] += (u64)1 << 32;
> +			if (val < (*stat & U32_MAX))
> +				*stat += (u64)1 << 32;
> +
> +			*stat = (*stat & ~(u64)U32_MAX) + val;
> +		}
>  
> -		ocelot->stats[idx] = (ocelot->stats[idx] & ~(u64)U32_MAX) + val;
> +		idx += region->count;
>  	}
> +
> +	return err;
>  }
>  
>  static void ocelot_check_stats_work(struct work_struct *work)
> @@ -1777,12 +1788,14 @@ static void ocelot_check_stats_work(struct work_struct *work)
>  
>  void ocelot_get_ethtool_stats(struct ocelot *ocelot, int port, u64 *data)
>  {
> -	int i;
> +	int i, err;
>  
>  	mutex_lock(&ocelot->stats_lock);
>  
>  	/* check and update now */
> -	ocelot_update_stats_for_port(ocelot, port);
> +	err = ocelot_update_stats_for_port(ocelot, port);

ocelot_check_stats_work() should also check for errors.

> +	if (err)
> +		dev_err(ocelot->dev, "Error %d updating ethtool stats\n", err);
>  
>  	/* Copy all counters */
>  	for (i = 0; i < ocelot->num_stats; i++)
> @@ -1801,6 +1814,41 @@ int ocelot_get_sset_count(struct ocelot *ocelot, int port, int sset)
>  }
>  EXPORT_SYMBOL(ocelot_get_sset_count);
>  
> +static int ocelot_prepare_stats_regions(struct ocelot *ocelot)
> +{
> +	struct ocelot_stats_region *region = NULL;
> +	unsigned int last;
> +	int i;
> +
> +	INIT_LIST_HEAD(&ocelot->stats_regions);
> +
> +	for (i = 0; i < ocelot->num_stats; i++) {
> +		if (region && ocelot->stats_layout[i].offset == last + 1) {
> +			region->count++;
> +		} else {
> +			region = devm_kzalloc(ocelot->dev, sizeof(*region),
> +					      GFP_KERNEL);
> +			if (!region)
> +				return -ENOMEM;
> +
> +			region->offset = ocelot->stats_layout[i].offset;
> +			region->count = 1;
> +			list_add_tail(&region->node, &ocelot->stats_regions);
> +		}
> +
> +		last = ocelot->stats_layout[i].offset;
> +	}
> +
> +	list_for_each_entry(region, &ocelot->stats_regions, node) {
> +		region->buf = devm_kcalloc(ocelot->dev, region->count,
> +					   sizeof(*region->buf), GFP_KERNEL);
> +		if (!region->buf)
> +			return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
>  int ocelot_get_ts_info(struct ocelot *ocelot, int port,
>  		       struct ethtool_ts_info *info)
>  {
> @@ -2801,6 +2849,13 @@ int ocelot_init(struct ocelot *ocelot)
>  				 ANA_CPUQ_8021_CFG_CPUQ_BPDU_VAL(6),
>  				 ANA_CPUQ_8021_CFG, i);
>  
> +	ret = ocelot_prepare_stats_regions(ocelot);
> +	if (ret) {
> +		destroy_workqueue(ocelot->stats_queue);
> +		destroy_workqueue(ocelot->owq);
> +		return ret;
> +	}
> +
>  	INIT_DELAYED_WORK(&ocelot->stats_work, ocelot_check_stats_work);
>  	queue_delayed_work(ocelot->stats_queue, &ocelot->stats_work,
>  			   OCELOT_STATS_CHECK_DELAY);
> diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
> index 312b72558659..d3291a5f7e88 100644
> --- a/include/soc/mscc/ocelot.h
> +++ b/include/soc/mscc/ocelot.h
> @@ -542,6 +542,13 @@ struct ocelot_stat_layout {
>  	char name[ETH_GSTRING_LEN];
>  };
>  
> +struct ocelot_stats_region {
> +	struct list_head node;
> +	u32 offset;
> +	int count;
> +	u32 *buf;
> +};
> +
>  enum ocelot_tag_prefix {
>  	OCELOT_TAG_PREFIX_DISABLED	= 0,
>  	OCELOT_TAG_PREFIX_NONE,
> @@ -673,6 +680,7 @@ struct ocelot {
>  	struct regmap_field		*regfields[REGFIELD_MAX];
>  	const u32 *const		*map;
>  	const struct ocelot_stat_layout	*stats_layout;
> +	struct list_head		stats_regions;
>  	unsigned int			num_stats;
>  
>  	u32				pool_size[OCELOT_SB_NUM][OCELOT_SB_POOL_NUM];
> -- 
> 2.25.1
>
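
The region-building loop in ocelot_prepare_stats_regions() above walks the ordered stats layout and starts a new region whenever an offset is not exactly one past the previous one. A rough userspace sketch of that coalescing step is below. It is not the driver code: struct stats_region and build_regions are hypothetical names, a caller-supplied array stands in for the devm-allocated linked list, and the offsets are assumed to be sorted in ascending order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct ocelot_stats_region. */
struct stats_region {
	uint32_t offset;	/* first register offset in the run */
	int count;		/* number of consecutive registers */
};

/* Coalesce ascending offsets into contiguous runs.
 * Returns the number of regions written, or -1 if regions[] is too small.
 */
static int build_regions(const uint32_t *offsets, size_t num_stats,
			 struct stats_region *regions, size_t max_regions)
{
	size_t n = 0;
	size_t i;

	for (i = 0; i < num_stats; i++) {
		/* Extend the current run when this offset follows the last. */
		if (n > 0 && offsets[i] == offsets[i - 1] + 1) {
			regions[n - 1].count++;
			continue;
		}

		/* Otherwise start a new run at this offset. */
		if (n == max_regions)
			return -1;
		regions[n].offset = offsets[i];
		regions[n].count = 1;
		n++;
	}

	return (int)n;
}
```

With offsets 0x00, 0x01, 0x02, 0x40, 0x41, 0x80 this yields three regions of 3, 2 and 1 registers, so a port update would need three bulk reads instead of six single reads.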


* Re: [PATCH v6 net-next 5/5] net: mscc: ocelot: use bulk reads for stats
  2022-02-10 10:36   ` Vladimir Oltean
@ 2022-02-10 14:37     ` Colin Foster
  2022-02-10 15:21     ` Colin Foster
  1 sibling, 0 replies; 13+ messages in thread
From: Colin Foster @ 2022-02-10 14:37 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: linux-kernel, netdev, Jakub Kicinski, David S. Miller,
	UNGLinuxDriver, Alexandre Belloni, Claudiu Manoil

On Thu, Feb 10, 2022 at 10:36:37AM +0000, Vladimir Oltean wrote:
> On Wed, Feb 09, 2022 at 08:13:45PM -0800, Colin Foster wrote:
> >  void ocelot_get_ethtool_stats(struct ocelot *ocelot, int port, u64 *data)
> >  {
> > -	int i;
> > +	int i, err;
> >  
> >  	mutex_lock(&ocelot->stats_lock);
> >  
> >  	/* check and update now */
> > -	ocelot_update_stats_for_port(ocelot, port);
> > +	err = ocelot_update_stats_for_port(ocelot, port);
> 
> ocelot_check_stats_work() should also check for errors.

Worthy enough for a resend. I'll send the fix patch to net with your
correct email and send this update once that gets merged through.

Thanks!



* Re: [PATCH v6 net-next 5/5] net: mscc: ocelot: use bulk reads for stats
  2022-02-10 10:36   ` Vladimir Oltean
  2022-02-10 14:37     ` Colin Foster
@ 2022-02-10 15:21     ` Colin Foster
  2022-02-10 15:27       ` Vladimir Oltean
  1 sibling, 1 reply; 13+ messages in thread
From: Colin Foster @ 2022-02-10 15:21 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: linux-kernel, netdev, Jakub Kicinski, David S. Miller,
	UNGLinuxDriver, Alexandre Belloni, Claudiu Manoil

On Thu, Feb 10, 2022 at 10:36:37AM +0000, Vladimir Oltean wrote:
> On Wed, Feb 09, 2022 at 08:13:45PM -0800, Colin Foster wrote:
> >  void ocelot_get_ethtool_stats(struct ocelot *ocelot, int port, u64 *data)
> >  {
> > -	int i;
> > +	int i, err;
> >  
> >  	mutex_lock(&ocelot->stats_lock);
> >  
> >  	/* check and update now */
> > -	ocelot_update_stats_for_port(ocelot, port);
> > +	err = ocelot_update_stats_for_port(ocelot, port);
> 
> ocelot_check_stats_work() should also check for errors.

Another change I'm catching: I assume calling dev_err while holding a
mutex is frowned upon, so I'm moving this err check after the counter
copy / mutex unlock. I'll submit that fix after patch 1 gets merged into
net / mainline.



* Re: [PATCH v6 net-next 5/5] net: mscc: ocelot: use bulk reads for stats
  2022-02-10 15:21     ` Colin Foster
@ 2022-02-10 15:27       ` Vladimir Oltean
  2022-02-10 15:35         ` Colin Foster
  0 siblings, 1 reply; 13+ messages in thread
From: Vladimir Oltean @ 2022-02-10 15:27 UTC (permalink / raw)
  To: Colin Foster
  Cc: linux-kernel, netdev, Jakub Kicinski, David S. Miller,
	UNGLinuxDriver, Alexandre Belloni, Claudiu Manoil

On Thu, Feb 10, 2022 at 07:21:03AM -0800, Colin Foster wrote:
> > ocelot_check_stats_work() should also check for errors.
> 
> Another change I'm catching: I assume calling dev_err while holding a
> mutex is frowned upon, so I'm moving this err check after the counter
> copy / mutex unlock. I'll submit that fix after patch 1 gets merged into
> net / mainline.

Where did you read that? Doing work that doesn't need to be under a lock
while that lock is held isn't preferable, of course, but what is special
about dev_err?


* Re: [PATCH v6 net-next 5/5] net: mscc: ocelot: use bulk reads for stats
  2022-02-10 15:27       ` Vladimir Oltean
@ 2022-02-10 15:35         ` Colin Foster
  0 siblings, 0 replies; 13+ messages in thread
From: Colin Foster @ 2022-02-10 15:35 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: linux-kernel, netdev, Jakub Kicinski, David S. Miller,
	UNGLinuxDriver, Alexandre Belloni, Claudiu Manoil

On Thu, Feb 10, 2022 at 03:27:13PM +0000, Vladimir Oltean wrote:
> On Thu, Feb 10, 2022 at 07:21:03AM -0800, Colin Foster wrote:
> > > ocelot_check_stats_work() should also check for errors.
> > 
> > Another change I'm catching: I assume calling dev_err while holding a
> > mutex is frowned upon, so I'm moving this err check after the counter
> > copy / mutex unlock. I'll submit that fix after patch 1 gets merged into
> > net / mainline.
> 
> Where did you read that? Doing work that doesn't need to be under a lock
> while that lock is held isn't preferable, of course, but what is special
> about dev_err?

I didn't read it anywhere, just a hunch about string formatting
overhead. Either way - not necessary to do inside of stats_lock, so I
moved it out. Next round will hopefully be early next week.
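
The two review points in this subthread (have the periodic worker check the update result, and report the error once the lock is dropped) could be sketched roughly as follows. This is a userspace illustration under stated assumptions, not the driver code: struct fake_lock, struct stats_ctx, update_port, report and check_stats are all hypothetical stand-ins for the mutex, struct ocelot, ocelot_update_stats_for_port(), dev_err() and ocelot_check_stats_work(). Stopping at the first failing port is also an assumption, since the thread does not say whether the loop should continue. The point is only that the message does not need the lock, not that logging under a mutex is unsafe.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a mutex; it only tracks whether it is held. */
struct fake_lock {
	bool held;
};

static void lock_acquire(struct fake_lock *l) { l->held = true; }
static void lock_release(struct fake_lock *l) { l->held = false; }

/* Hypothetical stand-in for the driver state. */
struct stats_ctx {
	struct fake_lock lock;
	int num_ports;
	int fail_port;		/* port whose update fails, or -1 for none */
	bool logged_under_lock;	/* true if report() ran with the lock held */
	int logged_err;		/* last error passed to report(), 0 if none */
};

/* Stand-in for the per-port update; fails only on ctx->fail_port. */
static int update_port(struct stats_ctx *ctx, int port)
{
	return port == ctx->fail_port ? -EIO : 0;
}

/* Stand-in for dev_err(); records the lock state at the time of the call. */
static void report(struct stats_ctx *ctx, int err)
{
	ctx->logged_under_lock = ctx->lock.held;
	ctx->logged_err = err;
	fprintf(stderr, "Error %d updating stats\n", err);
}

/* Update every port under the lock, then report any error after unlock. */
static int check_stats(struct stats_ctx *ctx)
{
	int err = 0;
	int i;

	lock_acquire(&ctx->lock);
	for (i = 0; i < ctx->num_ports; i++) {
		err = update_port(ctx, i);
		if (err)
			break;
	}
	lock_release(&ctx->lock);

	if (err)
		report(ctx, err);

	return err;
}
```

Initializing err to 0 keeps the return value defined even when there are no ports to walk.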


Thread overview: 13+ messages
2022-02-10  4:13 [PATCH v6 net-next 0/5] use bulk reads for ocelot statistics Colin Foster
2022-02-10  4:13 ` [PATCH v6 net-next 1/5] net: mscc: ocelot: fix mutex lock error during ethtool stats read Colin Foster
2022-02-10  9:50   ` Vladimir Oltean
2022-02-10  4:13 ` [PATCH v6 net-next 2/5] net: mscc: ocelot: remove unnecessary stat reading from ethtool Colin Foster
2022-02-10 10:34   ` Vladimir Oltean
2022-02-10  4:13 ` [PATCH v6 net-next 3/5] net: ocelot: align macros for consistency Colin Foster
2022-02-10  4:13 ` [PATCH v6 net-next 4/5] net: mscc: ocelot: add ability to perform bulk reads Colin Foster
2022-02-10  4:13 ` [PATCH v6 net-next 5/5] net: mscc: ocelot: use bulk reads for stats Colin Foster
2022-02-10 10:36   ` Vladimir Oltean
2022-02-10 14:37     ` Colin Foster
2022-02-10 15:21     ` Colin Foster
2022-02-10 15:27       ` Vladimir Oltean
2022-02-10 15:35         ` Colin Foster
