All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/4] Prevent permanently closed tc-taprio gates from blocking a Felix DSA switch port
@ 2022-06-26 12:05 Vladimir Oltean
  2022-06-26 12:05 ` [PATCH net-next 1/4] time64.h: define PSEC_PER_NSEC and use it in tc-taprio Vladimir Oltean
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Vladimir Oltean @ 2022-06-26 12:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel, Andy Lutomirski,
	Thomas Gleixner, Vincenzo Frascino

Richie Pearn reports that if we install a tc-taprio schedule on a Felix
switch port, and that schedule has at least one gate that never opens
(for example TC0 below):

tc qdisc add dev swp1 root taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
	base-time 0 sched-entry S fe 1000000 flags 0x2

then packets classified to the permanently closed traffic class will not
be dequeued by the egress port. They will just remain in the queue
system, to consume resources. Frame aging does not trigger either,
because in order for that to happen, the packets need to be eligible for
egress scheduling in the first place, which they aren't. If that port is
allowed to consume the entire shared buffer of the switch (as we
configure things by default using devlink-sb), then eventually, by
sending enough packets, the entire switch will hang.

If we think enough about the problem, we realize that this is only a
special case of a more general issue, and can also be reproduced with
gates that aren't permanently closed, but are not large enough to send
an entire frame. In that sense, a permanently closed gate is simply a
case where all frames are oversized.

The ENETC has logic to reject transmitted packets that would overrun the
time window - see commit 285e8dedb4bd ("net: enetc: count the tc-taprio
window drops").

The Felix switch has no such thing on a per-packet basis, but it has a
register replicated per {egress port, TC} which essentially limits the
max MTU. A packet which exceeds the per-port-TC MTU is immediately
discarded and therefore will not hang the port anymore (albeit, sadly,
this only bumps a generic drop hardware counter and we cannot really
infer the reason such as to offer a dedicated counter for these events).

This patch set calculates the max MTU per {port, TC} when the tc-taprio
config, or link speed, or port-global MTU values change. This solves the
larger "gate too small for packet" problem, but also the original issue
with the gate permanently closed that was reported by Richie.

Q: Bug fix patch sent to net-next?
A: Yeah, after Xiaoliang started sending bug fixes to net-next himself
   (see https://patchwork.kernel.org/project/netdevbpf/patch/20220617032423.13852-1-xiaoliang.yang_1@nxp.com/)
   there is absolutely no gain in targeting "net" here - I am modifying
   the same areas of code, that have already diverged from 5.18 and
   earlier. So this is why I am also taking the opportunity to introduce
   cleanup patches 1-3, to leave things as clean as possible after the
   rework. I'd be interested if there is a better approach to this.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>

Vladimir Oltean (4):
  time64.h: define PSEC_PER_NSEC and use it in tc-taprio
  net: dsa: felix: keep reference on entire tc-taprio config
  net: dsa: felix: keep QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF) out of rmw
  net: dsa: felix: drop oversized frames with tc-taprio instead of
    hanging the port

 drivers/net/dsa/ocelot/felix.c         |   9 +
 drivers/net/dsa/ocelot/felix.h         |   1 +
 drivers/net/dsa/ocelot/felix_vsc9959.c | 241 ++++++++++++++++++++++---
 include/soc/mscc/ocelot.h              |   5 +-
 include/vdso/time64.h                  |   1 +
 net/sched/sch_taprio.c                 |   4 +-
 6 files changed, 234 insertions(+), 27 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH net-next 1/4] time64.h: define PSEC_PER_NSEC and use it in tc-taprio
  2022-06-26 12:05 [PATCH net-next 0/4] Prevent permanently closed tc-taprio gates from blocking a Felix DSA switch port Vladimir Oltean
@ 2022-06-26 12:05 ` Vladimir Oltean
  2022-06-27  7:52   ` Vincenzo Frascino
  2022-06-26 12:05 ` [PATCH net-next 2/4] net: dsa: felix: keep reference on entire tc-taprio config Vladimir Oltean
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Vladimir Oltean @ 2022-06-26 12:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel, Andy Lutomirski,
	Thomas Gleixner, Vincenzo Frascino

Time-sensitive networking code needs to work with PTP times expressed in
nanoseconds, and with packet transmission times expressed in
picoseconds, since those would be fractional at higher than gigabit
speed when expressed in nanoseconds.

Convert the existing uses in tc-taprio to a PSEC_PER_NSEC macro.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/vdso/time64.h  | 1 +
 net/sched/sch_taprio.c | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/vdso/time64.h b/include/vdso/time64.h
index b40cfa2aa33c..f1c2d02474ae 100644
--- a/include/vdso/time64.h
+++ b/include/vdso/time64.h
@@ -6,6 +6,7 @@
 #define MSEC_PER_SEC	1000L
 #define USEC_PER_MSEC	1000L
 #define NSEC_PER_USEC	1000L
+#define PSEC_PER_NSEC	1000L
 #define NSEC_PER_MSEC	1000000L
 #define USEC_PER_SEC	1000000L
 #define NSEC_PER_SEC	1000000000L
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index b9c71a304d39..55a9c5ad9954 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -176,7 +176,7 @@ static ktime_t get_interval_end_time(struct sched_gate_list *sched,
 
 static int length_to_duration(struct taprio_sched *q, int len)
 {
-	return div_u64(len * atomic64_read(&q->picos_per_byte), 1000);
+	return div_u64(len * atomic64_read(&q->picos_per_byte), PSEC_PER_NSEC);
 }
 
 /* Returns the entry corresponding to next available interval. If
@@ -551,7 +551,7 @@ static struct sk_buff *taprio_peek(struct Qdisc *sch)
 static void taprio_set_budget(struct taprio_sched *q, struct sched_entry *entry)
 {
 	atomic_set(&entry->budget,
-		   div64_u64((u64)entry->interval * 1000,
+		   div64_u64((u64)entry->interval * PSEC_PER_NSEC,
 			     atomic64_read(&q->picos_per_byte)));
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net-next 2/4] net: dsa: felix: keep reference on entire tc-taprio config
  2022-06-26 12:05 [PATCH net-next 0/4] Prevent permanently closed tc-taprio gates from blocking a Felix DSA switch port Vladimir Oltean
  2022-06-26 12:05 ` [PATCH net-next 1/4] time64.h: define PSEC_PER_NSEC and use it in tc-taprio Vladimir Oltean
@ 2022-06-26 12:05 ` Vladimir Oltean
  2022-06-26 12:05 ` [PATCH net-next 3/4] net: dsa: felix: keep QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF) out of rmw Vladimir Oltean
  2022-06-26 12:05 ` [PATCH net-next 4/4] net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port Vladimir Oltean
  3 siblings, 0 replies; 13+ messages in thread
From: Vladimir Oltean @ 2022-06-26 12:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel

In a future change we will need to remember the entire tc-taprio config
on all ports rather than just the base time, so use the
taprio_offload_get() helper function to replace ocelot_port->base_time
with ocelot_port->taprio.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/dsa/ocelot/felix_vsc9959.c | 23 +++++++++++++----------
 include/soc/mscc/ocelot.h              |  5 ++---
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
index dd9085ae0922..44bbbba4d528 100644
--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
+++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
@@ -1210,6 +1210,9 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
 			       QSYS_TAG_CONFIG_INIT_GATE_STATE_M,
 			       QSYS_TAG_CONFIG, port);
 
+		taprio_offload_free(ocelot_port->taprio);
+		ocelot_port->taprio = NULL;
+
 		mutex_unlock(&ocelot->tas_lock);
 		return 0;
 	}
@@ -1258,8 +1261,6 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
 		       QSYS_TAG_CONFIG_SCH_TRAFFIC_QUEUES_M,
 		       QSYS_TAG_CONFIG, port);
 
-	ocelot_port->base_time = taprio->base_time;
-
 	vsc9959_new_base_time(ocelot, taprio->base_time,
 			      taprio->cycle_time, &base_ts);
 	ocelot_write(ocelot, base_ts.tv_nsec, QSYS_PARAM_CFG_REG_1);
@@ -1282,6 +1283,10 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
 	ret = readx_poll_timeout(vsc9959_tas_read_cfg_status, ocelot, val,
 				 !(val & QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE),
 				 10, 100000);
+	if (ret)
+		goto err;
+
+	ocelot_port->taprio = taprio_offload_get(taprio);
 
 err:
 	mutex_unlock(&ocelot->tas_lock);
@@ -1291,17 +1296,18 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
 
 static void vsc9959_tas_clock_adjust(struct ocelot *ocelot)
 {
+	struct tc_taprio_qopt_offload *taprio;
 	struct ocelot_port *ocelot_port;
 	struct timespec64 base_ts;
-	u64 cycletime;
 	int port;
 	u32 val;
 
 	mutex_lock(&ocelot->tas_lock);
 
 	for (port = 0; port < ocelot->num_phys_ports; port++) {
-		val = ocelot_read_rix(ocelot, QSYS_TAG_CONFIG, port);
-		if (!(val & QSYS_TAG_CONFIG_ENABLE))
+		ocelot_port = ocelot->ports[port];
+		taprio = ocelot_port->taprio;
+		if (!taprio)
 			continue;
 
 		ocelot_rmw(ocelot,
@@ -1315,11 +1321,8 @@ static void vsc9959_tas_clock_adjust(struct ocelot *ocelot)
 			       QSYS_TAG_CONFIG_INIT_GATE_STATE_M,
 			       QSYS_TAG_CONFIG, port);
 
-		cycletime = ocelot_read(ocelot, QSYS_PARAM_CFG_REG_4);
-		ocelot_port = ocelot->ports[port];
-
-		vsc9959_new_base_time(ocelot, ocelot_port->base_time,
-				      cycletime, &base_ts);
+		vsc9959_new_base_time(ocelot, taprio->base_time,
+				      taprio->cycle_time, &base_ts);
 
 		ocelot_write(ocelot, base_ts.tv_nsec, QSYS_PARAM_CFG_REG_1);
 		ocelot_write(ocelot, lower_32_bits(base_ts.tv_sec),
diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
index 3737570116c3..ac151ecc7f19 100644
--- a/include/soc/mscc/ocelot.h
+++ b/include/soc/mscc/ocelot.h
@@ -670,6 +670,8 @@ struct ocelot_port {
 	/* VLAN that untagged frames are classified to, on ingress */
 	const struct ocelot_bridge_vlan	*pvid_vlan;
 
+	struct tc_taprio_qopt_offload	*taprio;
+
 	phy_interface_t			phy_mode;
 
 	unsigned int			ptp_skbs_in_flight;
@@ -692,9 +694,6 @@ struct ocelot_port {
 	int				bridge_num;
 
 	int				speed;
-
-	/* Store the AdminBaseTime of EST fetched from userspace. */
-	s64				base_time;
 };
 
 struct ocelot {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net-next 3/4] net: dsa: felix: keep QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF) out of rmw
  2022-06-26 12:05 [PATCH net-next 0/4] Prevent permanently closed tc-taprio gates from blocking a Felix DSA switch port Vladimir Oltean
  2022-06-26 12:05 ` [PATCH net-next 1/4] time64.h: define PSEC_PER_NSEC and use it in tc-taprio Vladimir Oltean
  2022-06-26 12:05 ` [PATCH net-next 2/4] net: dsa: felix: keep reference on entire tc-taprio config Vladimir Oltean
@ 2022-06-26 12:05 ` Vladimir Oltean
  2022-06-26 12:05 ` [PATCH net-next 4/4] net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port Vladimir Oltean
  3 siblings, 0 replies; 13+ messages in thread
From: Vladimir Oltean @ 2022-06-26 12:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel

In vsc9959_tas_clock_adjust(), the INIT_GATE_STATE field is not changed,
only the ENABLE field. Similarly for the disabling of the time-aware
shaper in vsc9959_qos_port_tas_set().

To reflect this, keep the QSYS_TAG_CONFIG_INIT_GATE_STATE_M mask out of
the read-modify-write procedure to make it clearer what is the intention
of the code.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/dsa/ocelot/felix_vsc9959.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
index 44bbbba4d528..7573254274b3 100644
--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
+++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
@@ -1204,10 +1204,7 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
 	mutex_lock(&ocelot->tas_lock);
 
 	if (!taprio->enable) {
-		ocelot_rmw_rix(ocelot,
-			       QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF),
-			       QSYS_TAG_CONFIG_ENABLE |
-			       QSYS_TAG_CONFIG_INIT_GATE_STATE_M,
+		ocelot_rmw_rix(ocelot, 0, QSYS_TAG_CONFIG_ENABLE,
 			       QSYS_TAG_CONFIG, port);
 
 		taprio_offload_free(ocelot_port->taprio);
@@ -1315,10 +1312,8 @@ static void vsc9959_tas_clock_adjust(struct ocelot *ocelot)
 			   QSYS_TAS_PARAM_CFG_CTRL_PORT_NUM_M,
 			   QSYS_TAS_PARAM_CFG_CTRL);
 
-		ocelot_rmw_rix(ocelot,
-			       QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF),
-			       QSYS_TAG_CONFIG_ENABLE |
-			       QSYS_TAG_CONFIG_INIT_GATE_STATE_M,
+		/* Disable time-aware shaper */
+		ocelot_rmw_rix(ocelot, 0, QSYS_TAG_CONFIG_ENABLE,
 			       QSYS_TAG_CONFIG, port);
 
 		vsc9959_new_base_time(ocelot, taprio->base_time,
@@ -1337,11 +1332,9 @@ static void vsc9959_tas_clock_adjust(struct ocelot *ocelot)
 			   QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE,
 			   QSYS_TAS_PARAM_CFG_CTRL);
 
-		ocelot_rmw_rix(ocelot,
-			       QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF) |
+		/* Re-enable time-aware shaper */
+		ocelot_rmw_rix(ocelot, QSYS_TAG_CONFIG_ENABLE,
 			       QSYS_TAG_CONFIG_ENABLE,
-			       QSYS_TAG_CONFIG_ENABLE |
-			       QSYS_TAG_CONFIG_INIT_GATE_STATE_M,
 			       QSYS_TAG_CONFIG, port);
 	}
 	mutex_unlock(&ocelot->tas_lock);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net-next 4/4] net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port
  2022-06-26 12:05 [PATCH net-next 0/4] Prevent permanently closed tc-taprio gates from blocking a Felix DSA switch port Vladimir Oltean
                   ` (2 preceding siblings ...)
  2022-06-26 12:05 ` [PATCH net-next 3/4] net: dsa: felix: keep QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF) out of rmw Vladimir Oltean
@ 2022-06-26 12:05 ` Vladimir Oltean
  2022-06-27 18:46   ` Jakub Kicinski
  3 siblings, 1 reply; 13+ messages in thread
From: Vladimir Oltean @ 2022-06-26 12:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel

Currently, sending a packet into a time gate too small for it (or always
closed) causes the queue system to hold the frame forever. Even worse,
this frame isn't subject to aging either, because for that to happen, it
needs to be scheduled for transmission in the first place. But the frame
will consume buffer memory and frame references while it is forever held
in the queue system.

Before commit a4ae997adcbd ("net: mscc: ocelot: initialize watermarks to
sane defaults"), this behavior was somewhat subtle, as the switch had a
more intricately tuned default watermark configuration out of reset,
which did not allow any single port and tc to consume the entire switch
buffer space. Nonetheless, the held frames are still there, and they
reduce the total backplane capacity of the switch.

However, after the aforementioned commit, the behavior can be very
clearly seen, since we deliberately allow each {port, tc} to consume the
entire shared buffer of the switch minus the reservations (and we
disable all reservations by default). That is to say, we allow a
permanently closed tc-taprio gate to hang the entire switch.

A careful inspection of the documentation shows that the QSYS:Q_MAX_SDU
per-port-tc registers serve 2 purposes: one is for guard band calculation
(when zero, this falls back to QSYS:PORT_MAX_SDU), and the other is to
enable oversized frame dropping (when non-zero).

Currently the QSYS:Q_MAX_SDU registers are all zero, so oversized frame
dropping is disabled. The goal of the change is to enable it seamlessly.
For that, we need to hook into the MTU change, tc-taprio change, and
port link speed change procedures, since we depend on these variables.

When a frame is dropped due to a queue system oversize condition, the
counter that increments is the generic "drop_tail" in ethtool -S, even
if this isn't a tail drop as the rest (i.e. the controlling watermarks
don't count these frames, as can be seen in "devlink sb occupancy show
pci/0000:00:00.5 sb 0"). So the hardware counter is unspecific
regarding the drop reason.

The issue exists in various forms since the tc-taprio offload was introduced.

Fixes: de143c0e274b ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload")
Reported-by: Richie Pearn <richard.pearn@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
Q: checkpatch says "quoted string split across lines", don't you see?
A: I do, but I believe it won't practically affect anybody, since the
   usual reason for not splitting up strings is for grep-ability. But
   no reasonable person would grep for a string containing a number,
   realizing that it's part of a printf-formatted string and the search
   wouldn't find any result. There are plenty of contiguous words that
   can be grepped already, and the "%d" specifier constitutes a good
   place IMO to snap an already long string into multiple lines.

 drivers/net/dsa/ocelot/felix.c         |   9 ++
 drivers/net/dsa/ocelot/felix.h         |   1 +
 drivers/net/dsa/ocelot/felix_vsc9959.c | 201 +++++++++++++++++++++++++
 3 files changed, 211 insertions(+)

diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
index 3e07dc39007a..859196898a7d 100644
--- a/drivers/net/dsa/ocelot/felix.c
+++ b/drivers/net/dsa/ocelot/felix.c
@@ -1553,9 +1553,18 @@ static void felix_txtstamp(struct dsa_switch *ds, int port,
 static int felix_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
 {
 	struct ocelot *ocelot = ds->priv;
+	struct ocelot_port *ocelot_port = ocelot->ports[port];
+	struct felix *felix = ocelot_to_felix(ocelot);
 
 	ocelot_port_set_maxlen(ocelot, port, new_mtu);
 
+	mutex_lock(&ocelot->tas_lock);
+
+	if (ocelot_port->taprio && felix->info->tas_guard_bands_update)
+		felix->info->tas_guard_bands_update(ocelot, port);
+
+	mutex_unlock(&ocelot->tas_lock);
+
 	return 0;
 }
 
diff --git a/drivers/net/dsa/ocelot/felix.h b/drivers/net/dsa/ocelot/felix.h
index 9e07eb7ee28d..deb8dde1fc19 100644
--- a/drivers/net/dsa/ocelot/felix.h
+++ b/drivers/net/dsa/ocelot/felix.h
@@ -53,6 +53,7 @@ struct felix_info {
 				    struct phylink_link_state *state);
 	int	(*port_setup_tc)(struct dsa_switch *ds, int port,
 				 enum tc_setup_type type, void *type_data);
+	void	(*tas_guard_bands_update)(struct ocelot *ocelot, int port);
 	void	(*port_sched_speed_set)(struct ocelot *ocelot, int port,
 					u32 speed);
 	struct regmap *(*init_regmap)(struct ocelot *ocelot,
diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
index 7573254274b3..00578110b8da 100644
--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
+++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
@@ -1127,9 +1127,199 @@ static void vsc9959_mdio_bus_free(struct ocelot *ocelot)
 	mdiobus_free(felix->imdio);
 }
 
+/* Extract shortest continuous gate open intervals in ns for each traffic class
+ * of a cyclic tc-taprio schedule. If a gate is always open, the duration is
+ * considered U64_MAX. If the gate is always closed, it is considered 0.
+ */
+static void vsc9959_tas_min_gate_lengths(struct tc_taprio_qopt_offload *taprio,
+					 u64 min_gate_len[OCELOT_NUM_TC])
+{
+	struct tc_taprio_sched_entry *entry;
+	u64 gate_len[OCELOT_NUM_TC];
+	int tc, i, n;
+
+	/* Initialize arrays */
+	for (tc = 0; tc < OCELOT_NUM_TC; tc++) {
+		min_gate_len[tc] = U64_MAX;
+		gate_len[tc] = 0;
+	}
+
+	/* If we don't have taprio, consider all gates as permanently open */
+	if (!taprio)
+		return;
+
+	n = taprio->num_entries;
+
+	/* Walk through the gate list twice to determine the length
+	 * of consecutively open gates for a traffic class, including
+	 * open gates that wrap around. We are just interested in the
+	 * minimum window size, and this doesn't change what the
+	 * minimum is (if the gate never closes, min_gate_len will
+	 * remain U64_MAX).
+	 */
+	for (i = 0; i < 2 * n; i++) {
+		entry = &taprio->entries[i % n];
+
+		for (tc = 0; tc < OCELOT_NUM_TC; tc++) {
+			if (entry->gate_mask & BIT(tc)) {
+				gate_len[tc] += entry->interval;
+			} else {
+				/* Gate closes now, record a potential new
+				 * minimum and reinitialize length
+				 */
+				if (min_gate_len[tc] > gate_len[tc])
+					min_gate_len[tc] = gate_len[tc];
+				gate_len[tc] = 0;
+			}
+		}
+	}
+}
+
+/* Update QSYS_PORT_MAX_SDU to make sure the static guard bands added by the
+ * switch (see the ALWAYS_GUARD_BAND_SCH_Q comment) are correct at all MTU
+ * values (the default value is 1518). Also, for traffic class windows smaller
+ * than one MTU sized frame, update QSYS_QMAXSDU_CFG to enable oversized frame
+ * dropping, such that these won't hang the port, as they will never be sent.
+ */
+static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port)
+{
+	struct ocelot_port *ocelot_port = ocelot->ports[port];
+	u64 min_gate_len[OCELOT_NUM_TC];
+	int speed, picos_per_byte;
+	u64 needed_bit_time_ps;
+	u32 val, maxlen;
+	u8 tas_speed;
+	int tc;
+
+	lockdep_assert_held(&ocelot->tas_lock);
+
+	val = ocelot_read_rix(ocelot, QSYS_TAG_CONFIG, port);
+	tas_speed = QSYS_TAG_CONFIG_LINK_SPEED_X(val);
+
+	switch (tas_speed) {
+	case OCELOT_SPEED_10:
+		speed = SPEED_10;
+		break;
+	case OCELOT_SPEED_100:
+		speed = SPEED_100;
+		break;
+	case OCELOT_SPEED_1000:
+		speed = SPEED_1000;
+		break;
+	case OCELOT_SPEED_2500:
+		speed = SPEED_2500;
+		break;
+	default:
+		return;
+	}
+
+	picos_per_byte = (USEC_PER_SEC * 8) / speed;
+
+	val = ocelot_port_readl(ocelot_port, DEV_MAC_MAXLEN_CFG);
+	/* MAXLEN_CFG accounts automatically for VLAN. We need to include it
+	 * manually in the bit time calculation, plus the preamble and SFD.
+	 */
+	maxlen = val + 2 * VLAN_HLEN;
+	/* Consider the standard Ethernet overhead of 8 octets preamble+SFD,
+	 * 4 octets FCS, 12 octets IFG.
+	 */
+	needed_bit_time_ps = (maxlen + 24) * picos_per_byte;
+
+	dev_dbg(ocelot->dev,
+		"port %d: max frame size %d needs %llu ps at speed %d\n",
+		port, maxlen, needed_bit_time_ps, speed);
+
+	vsc9959_tas_min_gate_lengths(ocelot_port->taprio, min_gate_len);
+
+	for (tc = 0; tc < OCELOT_NUM_TC; tc++) {
+		u32 max_sdu;
+
+		if (min_gate_len[tc] == U64_MAX /* Gate always open */ ||
+		    min_gate_len[tc] * PSEC_PER_NSEC > needed_bit_time_ps) {
+			/* Setting QMAXSDU_CFG to 0 disables oversized frame
+			 * dropping.
+			 */
+			max_sdu = 0;
+			dev_dbg(ocelot->dev,
+				"port %d tc %d min gate len %llu"
+				", sending all frames\n",
+				port, tc, min_gate_len[tc]);
+		} else {
+			/* If traffic class doesn't support a full MTU sized
+			 * frame, make sure to enable oversize frame dropping
+			 * for frames larger than the smallest that would fit.
+			 */
+			max_sdu = div_u64(min_gate_len[tc] * PSEC_PER_NSEC,
+					  picos_per_byte);
+			/* A TC gate may be completely closed, which is a
+			 * special case where all packets are oversized.
+			 * Any limit smaller than 64 octets accomplishes this
+			 */
+			if (!max_sdu)
+				max_sdu = 1;
+			/* Take L1 overhead into account, but just don't allow
+			 * max_sdu to go negative or to 0. Here we use 20
+			 * because QSYS_MAXSDU_CFG_* already counts the 4 FCS
+			 * octets as part of packet size.
+			 */
+			if (max_sdu > 20)
+				max_sdu -= 20;
+			dev_info(ocelot->dev,
+				 "port %d tc %d min gate length %llu"
+				 " ns not enough for max frame size %d at %d"
+				 " Mbps, dropping frames over %d"
+				 " octets including FCS\n",
+				 port, tc, min_gate_len[tc], maxlen, speed,
+				 max_sdu);
+		}
+
+		/* ocelot_write_rix is a macro that concatenates
+		 * QSYS_MAXSDU_CFG_* with _RSZ, so we need to spell out
+		 * the writes to each traffic class
+		 */
+		switch (tc) {
+		case 0:
+			ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_0,
+					 port);
+			break;
+		case 1:
+			ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_1,
+					 port);
+			break;
+		case 2:
+			ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_2,
+					 port);
+			break;
+		case 3:
+			ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_3,
+					 port);
+			break;
+		case 4:
+			ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_4,
+					 port);
+			break;
+		case 5:
+			ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_5,
+					 port);
+			break;
+		case 6:
+			ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_6,
+					 port);
+			break;
+		case 7:
+			ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_7,
+					 port);
+			break;
+		}
+	}
+
+	ocelot_write_rix(ocelot, maxlen, QSYS_PORT_MAX_SDU, port);
+}
+
 static void vsc9959_sched_speed_set(struct ocelot *ocelot, int port,
 				    u32 speed)
 {
+	struct ocelot_port *ocelot_port = ocelot->ports[port];
 	u8 tas_speed;
 
 	switch (speed) {
@@ -1154,6 +1344,13 @@ static void vsc9959_sched_speed_set(struct ocelot *ocelot, int port,
 		       QSYS_TAG_CONFIG_LINK_SPEED(tas_speed),
 		       QSYS_TAG_CONFIG_LINK_SPEED_M,
 		       QSYS_TAG_CONFIG, port);
+
+	mutex_lock(&ocelot->tas_lock);
+
+	if (ocelot_port->taprio)
+		vsc9959_tas_guard_bands_update(ocelot, port);
+
+	mutex_unlock(&ocelot->tas_lock);
 }
 
 static void vsc9959_new_base_time(struct ocelot *ocelot, ktime_t base_time,
@@ -1210,6 +1407,8 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
 		taprio_offload_free(ocelot_port->taprio);
 		ocelot_port->taprio = NULL;
 
+		vsc9959_tas_guard_bands_update(ocelot, port);
+
 		mutex_unlock(&ocelot->tas_lock);
 		return 0;
 	}
@@ -1284,6 +1483,7 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
 		goto err;
 
 	ocelot_port->taprio = taprio_offload_get(taprio);
+	vsc9959_tas_guard_bands_update(ocelot, port);
 
 err:
 	mutex_unlock(&ocelot->tas_lock);
@@ -2303,6 +2503,7 @@ static const struct felix_info felix_info_vsc9959 = {
 	.port_modes		= vsc9959_port_modes,
 	.port_setup_tc		= vsc9959_port_setup_tc,
 	.port_sched_speed_set	= vsc9959_sched_speed_set,
+	.tas_guard_bands_update	= vsc9959_tas_guard_bands_update,
 	.init_regmap		= ocelot_regmap_init,
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 1/4] time64.h: define PSEC_PER_NSEC and use it in tc-taprio
  2022-06-26 12:05 ` [PATCH net-next 1/4] time64.h: define PSEC_PER_NSEC and use it in tc-taprio Vladimir Oltean
@ 2022-06-27  7:52   ` Vincenzo Frascino
  2022-06-27  8:51     ` Vladimir Oltean
  0 siblings, 1 reply; 13+ messages in thread
From: Vincenzo Frascino @ 2022-06-27  7:52 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel, Andy Lutomirski,
	Thomas Gleixner

Hi Vladimir,

On 6/26/22 13:05, Vladimir Oltean wrote:
> Time-sensitive networking code needs to work with PTP times expressed in
> nanoseconds, and with packet transmission times expressed in
> picoseconds, since those would be fractional at higher than gigabit
> speed when expressed in nanoseconds.
> 
> Convert the existing uses in tc-taprio to a PSEC_PER_NSEC macro.
> 
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>  include/vdso/time64.h  | 1 +
>  net/sched/sch_taprio.c | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/include/vdso/time64.h b/include/vdso/time64.h
> index b40cfa2aa33c..f1c2d02474ae 100644
> --- a/include/vdso/time64.h
> +++ b/include/vdso/time64.h
> @@ -6,6 +6,7 @@
>  #define MSEC_PER_SEC	1000L
>  #define USEC_PER_MSEC	1000L
>  #define NSEC_PER_USEC	1000L
> +#define PSEC_PER_NSEC	1000L

Are you planning to use this definition in the vdso library code? If not, you
should define PSEC_PER_NSEC in "include/linux/time64.h". The vdso namespace
should contain only the definitions shared by the implementations of the kernel
and of the vdso library.

...
-- 
Regards,
Vincenzo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 1/4] time64.h: define PSEC_PER_NSEC and use it in tc-taprio
  2022-06-27  7:52   ` Vincenzo Frascino
@ 2022-06-27  8:51     ` Vladimir Oltean
  2022-06-27  9:25       ` Vincenzo Frascino
  0 siblings, 1 reply; 13+ messages in thread
From: Vladimir Oltean @ 2022-06-27  8:51 UTC (permalink / raw)
  To: Vincenzo Frascino
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel, Andy Lutomirski,
	Thomas Gleixner

Hi Vincenzo,

On Mon, Jun 27, 2022 at 08:52:51AM +0100, Vincenzo Frascino wrote:
> Hi Vladimir,
> 
> On 6/26/22 13:05, Vladimir Oltean wrote:
> > Time-sensitive networking code needs to work with PTP times expressed in
> > nanoseconds, and with packet transmission times expressed in
> > picoseconds, since those would be fractional at higher than gigabit
> > speed when expressed in nanoseconds.
> > 
> > Convert the existing uses in tc-taprio to a PSEC_PER_NSEC macro.
> > 
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> > ---
> >  include/vdso/time64.h  | 1 +
> >  net/sched/sch_taprio.c | 4 ++--
> >  2 files changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/vdso/time64.h b/include/vdso/time64.h
> > index b40cfa2aa33c..f1c2d02474ae 100644
> > --- a/include/vdso/time64.h
> > +++ b/include/vdso/time64.h
> > @@ -6,6 +6,7 @@
> >  #define MSEC_PER_SEC	1000L
> >  #define USEC_PER_MSEC	1000L
> >  #define NSEC_PER_USEC	1000L
> > +#define PSEC_PER_NSEC	1000L
> 
> Are you planning to use this definition in the vdso library code? If not, you
> should define PSEC_PER_NSEC in "include/linux/time64.h". The vdso namespace
> should contain only the definitions shared by the implementations of the kernel
> and of the vdso library.

I am not. I thought it would be ok to preserve the locality of
definitions by placing this near the others of its kind, since a macro
doesn't affect the compiled vDSO code in any way. But if you prefer, I
can create a new mini-section in linux/time64.h. Any preference on where
exactly to place that definition within the file?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 1/4] time64.h: define PSEC_PER_NSEC and use it in tc-taprio
  2022-06-27  8:51     ` Vladimir Oltean
@ 2022-06-27  9:25       ` Vincenzo Frascino
  2022-06-27 10:06         ` Vladimir Oltean
  0 siblings, 1 reply; 13+ messages in thread
From: Vincenzo Frascino @ 2022-06-27  9:25 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel, Andy Lutomirski,
	Thomas Gleixner

Hi Vladimir,

On 6/27/22 09:51, Vladimir Oltean wrote:
> Hi Vincenzo,
> 
> On Mon, Jun 27, 2022 at 08:52:51AM +0100, Vincenzo Frascino wrote:
>> Hi Vladimir,
>>
>> On 6/26/22 13:05, Vladimir Oltean wrote:
>>> Time-sensitive networking code needs to work with PTP times expressed in
>>> nanoseconds, and with packet transmission times expressed in
>>> picoseconds, since those would be fractional at higher than gigabit
>>> speed when expressed in nanoseconds.
>>>
>>> Convert the existing uses in tc-taprio to a PSEC_PER_NSEC macro.
>>>
>>> Cc: Andy Lutomirski <luto@kernel.org>
>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
>>> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
>>> ---
>>>  include/vdso/time64.h  | 1 +
>>>  net/sched/sch_taprio.c | 4 ++--
>>>  2 files changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/vdso/time64.h b/include/vdso/time64.h
>>> index b40cfa2aa33c..f1c2d02474ae 100644
>>> --- a/include/vdso/time64.h
>>> +++ b/include/vdso/time64.h
>>> @@ -6,6 +6,7 @@
>>>  #define MSEC_PER_SEC	1000L
>>>  #define USEC_PER_MSEC	1000L
>>>  #define NSEC_PER_USEC	1000L
>>> +#define PSEC_PER_NSEC	1000L
>>
>> Are you planning to use this definition in the vdso library code? If not, you
>> should define PSEC_PER_NSEC in "include/linux/time64.h". The vdso namespace
>> should contain only the definitions shared by the implementations of the kernel
>> and of the vdso library.
> 
> I am not. I thought it would be ok to preserve the locality of
> definitions by placing this near the others of its kind, since a macro
> doesn't affect the compiled vDSO code in any way. But if you prefer, I
> can create a new mini-section in linux/time64.h. Any preference on where
> exactly to place that definition within the file?

I do not have a strong opinion on where to put it. But I think that if you put a
section above TIME64_MAX should work.

-- 
Regards,
Vincenzo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 1/4] time64.h: define PSEC_PER_NSEC and use it in tc-taprio
  2022-06-27  9:25       ` Vincenzo Frascino
@ 2022-06-27 10:06         ` Vladimir Oltean
  2022-06-27 18:40           ` Jakub Kicinski
  0 siblings, 1 reply; 13+ messages in thread
From: Vladimir Oltean @ 2022-06-27 10:06 UTC (permalink / raw)
  To: Vincenzo Frascino
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel, Andy Lutomirski,
	Thomas Gleixner

On Mon, Jun 27, 2022 at 10:25:42AM +0100, Vincenzo Frascino wrote:
> Hi Vladimir,
> 
> On 6/27/22 09:51, Vladimir Oltean wrote:
> > Hi Vincenzo,
> > 
> > On Mon, Jun 27, 2022 at 08:52:51AM +0100, Vincenzo Frascino wrote:
> >> Hi Vladimir,
> >>
> >> On 6/26/22 13:05, Vladimir Oltean wrote:
> >>> Time-sensitive networking code needs to work with PTP times expressed in
> >>> nanoseconds, and with packet transmission times expressed in
> >>> picoseconds, since those would be fractional at higher than gigabit
> >>> speed when expressed in nanoseconds.
> >>>
> >>> Convert the existing uses in tc-taprio to a PSEC_PER_NSEC macro.
> >>>
> >>> Cc: Andy Lutomirski <luto@kernel.org>
> >>> Cc: Thomas Gleixner <tglx@linutronix.de>
> >>> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
> >>> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> >>> ---
> >>>  include/vdso/time64.h  | 1 +
> >>>  net/sched/sch_taprio.c | 4 ++--
> >>>  2 files changed, 3 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/include/vdso/time64.h b/include/vdso/time64.h
> >>> index b40cfa2aa33c..f1c2d02474ae 100644
> >>> --- a/include/vdso/time64.h
> >>> +++ b/include/vdso/time64.h
> >>> @@ -6,6 +6,7 @@
> >>>  #define MSEC_PER_SEC	1000L
> >>>  #define USEC_PER_MSEC	1000L
> >>>  #define NSEC_PER_USEC	1000L
> >>> +#define PSEC_PER_NSEC	1000L
> >>
> >> Are you planning to use this definition in the vdso library code? If not, you
> >> should define PSEC_PER_NSEC in "include/linux/time64.h". The vdso namespace
> >> should contain only the definitions shared by the implementations of the kernel
> >> and of the vdso library.
> > 
> > I am not. I thought it would be ok to preserve the locality of
> > definitions by placing this near the others of its kind, since a macro
> > doesn't affect the compiled vDSO code in any way. But if you prefer, I
> > can create a new mini-section in linux/time64.h. Any preference on where
> > exactly to place that definition within the file?
> 
> I do not have a strong opinion on where to put it. But I think that if you put a
> section above TIME64_MAX should work.

@networking people: do you mind if in v2 I move this patch to the end,
hardcode 1000 in the current DSA patch 4/4, and then replace it afterwards
with PSEC_PER_NSEC, together with tc-taprio? I'd like to leave the code
in a clean state, but remember this is also a patch that fixes a
functional issue, even if on net-next, so one dependency less can't
hurt, for those who'll want to backport.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 1/4] time64.h: define PSEC_PER_NSEC and use it in tc-taprio
  2022-06-27 10:06         ` Vladimir Oltean
@ 2022-06-27 18:40           ` Jakub Kicinski
  0 siblings, 0 replies; 13+ messages in thread
From: Jakub Kicinski @ 2022-06-27 18:40 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Vincenzo Frascino, netdev, David S. Miller, Eric Dumazet,
	Paolo Abeni, Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel, Andy Lutomirski,
	Thomas Gleixner

On Mon, 27 Jun 2022 10:06:15 +0000 Vladimir Oltean wrote:
> > I do not have a strong opinion on where to put it. But I think that if you put a
> > section above TIME64_MAX should work.  
> 
> @networking people: do you mind if in v2 I move this patch to the end,
> hardcode 1000 in the current DSA patch 4/4, and then replace it afterwards
> with PSEC_PER_NSEC, together with tc-taprio? I'd like to leave the code
> in a clean state, but remember this is also a patch that fixes a
> functional issue, even if on net-next, so one dependency less can't
> hurt, for those who'll want to backport.

Makes sense.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 4/4] net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port
  2022-06-26 12:05 ` [PATCH net-next 4/4] net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port Vladimir Oltean
@ 2022-06-27 18:46   ` Jakub Kicinski
  2022-06-27 19:40     ` Vladimir Oltean
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Kicinski @ 2022-06-27 18:46 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
	Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel

On Sun, 26 Jun 2022 15:05:05 +0300 Vladimir Oltean wrote:
> When a frame is dropped due to a queue system oversize condition, the
> counter that increments is the generic "drop_tail" in ethtool -S, even
> if this isn't a tail drop as the rest (i.e. the controlling watermarks
> don't count these frames, as can be seen in "devlink sb occupancy show
> pci/0000:00:00.5 sb 0"). So the hardware counter is unspecific
> regarding the drop reason.

I had mixed feelings about the stats, as I usually do, but I don't
recall if I sent that email. Are you at least counting the drop_tail
into one of the standard tx error / drop stats so admins will notice?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 4/4] net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port
  2022-06-27 18:46   ` Jakub Kicinski
@ 2022-06-27 19:40     ` Vladimir Oltean
  2022-06-27 19:46       ` Vladimir Oltean
  0 siblings, 1 reply; 13+ messages in thread
From: Vladimir Oltean @ 2022-06-27 19:40 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
	Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel

On Mon, Jun 27, 2022 at 11:46:44AM -0700, Jakub Kicinski wrote:
> On Sun, 26 Jun 2022 15:05:05 +0300 Vladimir Oltean wrote:
> > When a frame is dropped due to a queue system oversize condition, the
> > counter that increments is the generic "drop_tail" in ethtool -S, even
> > if this isn't a tail drop as the rest (i.e. the controlling watermarks
> > don't count these frames, as can be seen in "devlink sb occupancy show
> > pci/0000:00:00.5 sb 0"). So the hardware counter is unspecific
> > regarding the drop reason.
> 
> I had mixed feelings about the stats, as I usually do, but I don't
> recall if I sent that email. Are you at least counting the drop_tail
> into one of the standard tx error / drop stats so admins will notice?

Sorry for being unclear, fact is, I had even forgot this small detail
between testing and writing the commit message...

The switch, being a switch, will not 'drop' the packet per se, it will
still try to forward it towards the ports it can. It will only drop it
and record it as such if it couldn't forward it towards any port (i.e.
if we're in a 2-port bridge and there isn't any learned FDB entry, it
will still flood it towards the CPU).
Because of that, the "drop_tail" counter is incremented on the _ingress_
port. Good luck counting that into a tx_errors counter :)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 4/4] net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port
  2022-06-27 19:40     ` Vladimir Oltean
@ 2022-06-27 19:46       ` Vladimir Oltean
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Oltean @ 2022-06-27 19:46 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
	Xiaoliang Yang, Claudiu Manoil, Alexandre Belloni,
	UNGLinuxDriver, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Michael Walle, Vinicius Costa Gomes, Maxim Kochetkov,
	Colin Foster, Richie Pearn, linux-kernel

On Mon, Jun 27, 2022 at 10:40:30PM +0300, Vladimir Oltean wrote:
> I had even forgot this small detail

forgotten, blah

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-06-27 19:46 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-26 12:05 [PATCH net-next 0/4] Prevent permanently closed tc-taprio gates from blocking a Felix DSA switch port Vladimir Oltean
2022-06-26 12:05 ` [PATCH net-next 1/4] time64.h: define PSEC_PER_NSEC and use it in tc-taprio Vladimir Oltean
2022-06-27  7:52   ` Vincenzo Frascino
2022-06-27  8:51     ` Vladimir Oltean
2022-06-27  9:25       ` Vincenzo Frascino
2022-06-27 10:06         ` Vladimir Oltean
2022-06-27 18:40           ` Jakub Kicinski
2022-06-26 12:05 ` [PATCH net-next 2/4] net: dsa: felix: keep reference on entire tc-taprio config Vladimir Oltean
2022-06-26 12:05 ` [PATCH net-next 3/4] net: dsa: felix: keep QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF) out of rmw Vladimir Oltean
2022-06-26 12:05 ` [PATCH net-next 4/4] net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port Vladimir Oltean
2022-06-27 18:46   ` Jakub Kicinski
2022-06-27 19:40     ` Vladimir Oltean
2022-06-27 19:46       ` Vladimir Oltean

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.