* [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
@ 2019-09-10 15:41 Robert Beckett
  2019-09-10 15:41 ` [PATCH 1/7] net/dsa: configure autoneg for CPU port Robert Beckett
                   ` (8 more replies)
  0 siblings, 9 replies; 42+ messages in thread
From: Robert Beckett @ 2019-09-10 15:41 UTC
  To: netdev
  Cc: Robert Beckett, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S. Miller

This patch-set adds support for some features of the Marvell switch
chips that can be used to handle packet storms.

The rationale for this was a setup that requires the ability to receive
traffic from one port, while a packet storm is occurring on another port
(via an external switch with a deliberate loop). This is needed to
ensure vital data delivery from a specific port, while mitigating any
loops or DoS that a user may introduce on another port (can't guarantee
sensible users).

[patch 1/7] configures auto-negotiation for CPU ports that are connected
via PHYs, to enable pause frame propagation.

[patch 2/7] allows setting of port's default output queue priority for
any ingressing packets on that port.

[patch 3/7] dt-bindings for patch 2.

[patch 4/7] allows setting of a port's queue scheduling so that it can
prioritise egress of traffic routed from high priority ports.

[patch 5/7] dt-bindings for patch 4.

[patch 6/7] allows ports to rate limit their egress. This can be used to
stop the host CPU from becoming swamped by packet delivery and exhausting
descriptors.

[patch 7/7] dt-bindings for patch 6.


Robert Beckett (7):
  net/dsa: configure autoneg for CPU port
  net: dsa: mv88e6xxx: add ability to set default queue priorities per
    port
  dt-bindings: mv88e6xxx: add ability to set default queue priorities
    per port
  net: dsa: mv88e6xxx: add ability to set queue scheduling
  dt-bindings: mv88e6xxx: add ability to set queue scheduling
  net: dsa: mv88e6xxx: add egress rate limiting
  dt-bindings: mv88e6xxx: add egress rate limiting

 .../devicetree/bindings/net/dsa/marvell.txt   |  38 +++++
 drivers/net/dsa/mv88e6xxx/chip.c              | 122 ++++++++++++---
 drivers/net/dsa/mv88e6xxx/chip.h              |   5 +-
 drivers/net/dsa/mv88e6xxx/port.c              | 140 +++++++++++++++++-
 drivers/net/dsa/mv88e6xxx/port.h              |  24 ++-
 include/dt-bindings/net/dsa-mv88e6xxx.h       |  22 +++
 net/dsa/port.c                                |  10 ++
 7 files changed, 327 insertions(+), 34 deletions(-)
 create mode 100644 include/dt-bindings/net/dsa-mv88e6xxx.h

-- 
2.18.0



* [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-10 15:41 [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Robert Beckett
@ 2019-09-10 15:41 ` Robert Beckett
  2019-09-10 16:14   ` Vivien Didelot
                     ` (4 more replies)
  2019-09-10 15:41 ` [PATCH 2/7] net: dsa: mv88e6xxx: add ability to set default queue priorities per port Robert Beckett
                   ` (7 subsequent siblings)
  8 siblings, 5 replies; 42+ messages in thread
From: Robert Beckett @ 2019-09-10 15:41 UTC
  To: netdev
  Cc: Robert Beckett, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S. Miller

Configure autoneg for PHY-connected CPU ports.
This allows us to use autoneg between the CPU port's PHY and the link
partner's PHY.
This enables us to negotiate pause frame transmission to prioritise
packet delivery over throughput.

Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
---
 net/dsa/port.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/dsa/port.c b/net/dsa/port.c
index f071acf2842b..1b6832eac2c5 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -538,10 +538,20 @@ static int dsa_port_setup_phy_of(struct dsa_port *dp, bool enable)
 		return PTR_ERR(phydev);
 
 	if (enable) {
+		phydev->supported = PHY_GBIT_FEATURES | SUPPORTED_MII |
+				    SUPPORTED_AUI | SUPPORTED_FIBRE |
+				    SUPPORTED_BNC | SUPPORTED_Pause |
+				    SUPPORTED_Asym_Pause;
+		phydev->advertising = phydev->supported;
+
 		err = genphy_config_init(phydev);
 		if (err < 0)
 			goto err_put_dev;
 
+		err = genphy_config_aneg(phydev);
+		if (err < 0)
+			goto err_put_dev;
+
 		err = genphy_resume(phydev);
 		if (err < 0)
 			goto err_put_dev;
-- 
2.18.0



* [PATCH 2/7] net: dsa: mv88e6xxx: add ability to set default queue priorities per port
  2019-09-10 15:41 [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Robert Beckett
  2019-09-10 15:41 ` [PATCH 1/7] net/dsa: configure autoneg for CPU port Robert Beckett
@ 2019-09-10 15:41 ` Robert Beckett
  2019-09-10 16:43   ` Vivien Didelot
  2019-09-10 15:41 ` [PATCH 3/7] dt-bindings: " Robert Beckett
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 42+ messages in thread
From: Robert Beckett @ 2019-09-10 15:41 UTC
  To: netdev
  Cc: Robert Beckett, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S. Miller

Add code to set DefQPri for any port that specifies "defqpri" in its
device tree node.
This allows setting the default output queue priority for all packets
entering the switch via the port that uses this, which is useful for
prioritizing traffic based on port.

Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 25 +++++++++++++++++++++++++
 drivers/net/dsa/mv88e6xxx/chip.h |  1 +
 drivers/net/dsa/mv88e6xxx/port.c | 19 +++++++++++++++++++
 drivers/net/dsa/mv88e6xxx/port.h |  4 ++++
 4 files changed, 49 insertions(+)
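
For readers following the series, the Port Control 2 update performed by
this patch can be sketched as a standalone read-modify-write, without the
kernel's register accessors. This is only an illustration: the helper name
defqpri_apply() and the unprefixed macro names are hypothetical, while the
bit values are copied from the port.h hunk in this patch.

```c
#include <errno.h>
#include <stdint.h>

/* Bit layout copied from the port.h hunk of this patch (names shortened). */
#define PORT_CTL2_USE_DEFQPRI   0x0008
#define PORT_CTL2_DEFQPRI_MASK  0x0006
#define PORT_CTL2_DEFQPRI_SHIFT 1

/*
 * Hypothetical helper: given the current Port Control 2 value and the
 * "defqpri" device tree value, compute the value to write back.
 * Returns 0 on success or -EINVAL if pri is outside 0..3.
 */
static int defqpri_apply(uint16_t old, uint32_t pri, uint16_t *out)
{
	/* Assumption: validate the full 32-bit value before narrowing it. */
	if (pri > 3)
		return -EINVAL;

	old &= (uint16_t)~PORT_CTL2_DEFQPRI_MASK;            /* clear bits 2:1 */
	old |= (uint16_t)(pri << PORT_CTL2_DEFQPRI_SHIFT);   /* new priority   */
	old |= PORT_CTL2_USE_DEFQPRI;                        /* enforce it     */

	*out = old;
	return 0;
}
```

One difference from the patch is deliberate in this sketch: the range
check runs on the 32-bit device tree value before it is narrowed, whereas
the patch casts to u16 first, so a value such as 0x10001 would be
truncated to 1 before the check.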

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index d0a97eb73a37..5005a35493e3 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2086,6 +2086,23 @@ static int mv88e6xxx_setup_upstream_port(struct mv88e6xxx_chip *chip, int port)
 	return 0;
 }
 
+static int mv88e6xxx_set_port_defqpri(struct mv88e6xxx_chip *chip, int port)
+{
+	struct dsa_switch *ds = chip->ds;
+	struct device_node *dn = ds->ports[port].dn;
+	int err;
+	u32 pri;
+
+	if (!dn || !chip->info->ops->port_set_defqpri)
+		return 0;
+
+	err = of_property_read_u32(dn, "defqpri", &pri);
+	if (err < 0)
+		return 0;
+
+	return chip->info->ops->port_set_defqpri(chip, port, (u16)pri);
+}
+
 static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
 {
 	struct dsa_switch *ds = chip->ds;
@@ -2176,6 +2193,10 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
 			return err;
 	}
 
+	err = mv88e6xxx_set_port_defqpri(chip, port);
+	if (err)
+		return err;
+
 	/* Port Association Vector: when learning source addresses
 	 * of packets, add the address to the address database using
 	 * a port bitmap that has only the bit for this port set and
@@ -3107,6 +3128,7 @@ static const struct mv88e6xxx_ops mv88e6172_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
+	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
 	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
@@ -3190,6 +3212,7 @@ static const struct mv88e6xxx_ops mv88e6176_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
+	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
 	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
@@ -3407,6 +3430,7 @@ static const struct mv88e6xxx_ops mv88e6240_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
+	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
 	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
@@ -3750,6 +3774,7 @@ static const struct mv88e6xxx_ops mv88e6352_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
+	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
 	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index 4646e46d47f2..2d2c24f5a79d 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -383,6 +383,7 @@ struct mv88e6xxx_ops {
 				   u16 etype);
 	int (*port_set_jumbo_size)(struct mv88e6xxx_chip *chip, int port,
 				   size_t size);
+	int (*port_set_defqpri)(struct mv88e6xxx_chip *chip, int port, u16 pri);
 
 	int (*port_egress_rate_limiting)(struct mv88e6xxx_chip *chip, int port);
 	int (*port_pause_limit)(struct mv88e6xxx_chip *chip, int port, u8 in,
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index 04309ef0a1cc..3a45fcd5cd9c 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -1147,6 +1147,25 @@ int mv88e6165_port_set_jumbo_size(struct mv88e6xxx_chip *chip, int port,
 	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL2, reg);
 }
 
+int mv88e6xxx_port_set_defqpri(struct mv88e6xxx_chip *chip, int port, u16 pri)
+{
+	u16 reg;
+	int err;
+
+	if (pri > 3)
+		return -EINVAL;
+
+	err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL2, &reg);
+	if (err)
+		return err;
+
+	reg &= ~MV88E6XXX_PORT_CTL2_DEFQPRI_MASK;
+	reg |= pri << MV88E6XXX_PORT_CTL2_DEFQPRI_SHIFT;
+	reg |= MV88E6XXX_PORT_CTL2_USE_DEFQPRI;
+
+	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL2, reg);
+}
+
 /* Offset 0x09: Port Rate Control */
 
 int mv88e6095_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port)
diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
index 8d5a6cd6fb19..03884bbaa762 100644
--- a/drivers/net/dsa/mv88e6xxx/port.h
+++ b/drivers/net/dsa/mv88e6xxx/port.h
@@ -197,6 +197,9 @@
 #define MV88E6XXX_PORT_CTL2_DEFAULT_FORWARD		0x0040
 #define MV88E6XXX_PORT_CTL2_EGRESS_MONITOR		0x0020
 #define MV88E6XXX_PORT_CTL2_INGRESS_MONITOR		0x0010
+#define MV88E6XXX_PORT_CTL2_USE_DEFQPRI		0x0008
+#define MV88E6XXX_PORT_CTL2_DEFQPRI_MASK		0x0006
+#define MV88E6XXX_PORT_CTL2_DEFQPRI_SHIFT		1
 #define MV88E6095_PORT_CTL2_CPU_PORT_MASK		0x000f
 
 /* Offset 0x09: Egress Rate Control */
@@ -326,6 +329,7 @@ int mv88e6xxx_port_set_message_port(struct mv88e6xxx_chip *chip, int port,
 				    bool message_port);
 int mv88e6165_port_set_jumbo_size(struct mv88e6xxx_chip *chip, int port,
 				  size_t size);
+int mv88e6xxx_port_set_defqpri(struct mv88e6xxx_chip *chip, int port, u16 pri);
 int mv88e6095_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
 int mv88e6097_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
 int mv88e6097_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
-- 
2.18.0



* [PATCH 3/7] dt-bindings: mv88e6xxx: add ability to set default queue priorities per port
  2019-09-10 15:41 [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Robert Beckett
  2019-09-10 15:41 ` [PATCH 1/7] net/dsa: configure autoneg for CPU port Robert Beckett
  2019-09-10 15:41 ` [PATCH 2/7] net: dsa: mv88e6xxx: add ability to set default queue priorities per port Robert Beckett
@ 2019-09-10 15:41 ` " Robert Beckett
  2019-09-10 16:42   ` Florian Fainelli
  2019-09-10 15:41 ` [PATCH 4/7] net: dsa: mv88e6xxx: add ability to set queue scheduling Robert Beckett
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 42+ messages in thread
From: Robert Beckett @ 2019-09-10 15:41 UTC
  To: netdev
  Cc: Robert Beckett, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S. Miller, Rob Herring, Mark Rutland, devicetree

Document a new setting for Marvell switch chips to set the default queue
priorities per port.

Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
---
 Documentation/devicetree/bindings/net/dsa/marvell.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/dsa/marvell.txt b/Documentation/devicetree/bindings/net/dsa/marvell.txt
index 6f9538974bb9..e097c3c52eac 100644
--- a/Documentation/devicetree/bindings/net/dsa/marvell.txt
+++ b/Documentation/devicetree/bindings/net/dsa/marvell.txt
@@ -47,6 +47,10 @@ Optional properties:
 			  bus. The node must contains a compatible string of
 			  "marvell,mv88e6xxx-mdio-external"
 
+Optional properties for ports:
+- defqpri=<n>		: Enforced default queue priority for the given port.
+			  Valid range is 0..3
+
 Example:
 
 	mdio {
-- 
2.18.0



* [PATCH 4/7] net: dsa: mv88e6xxx: add ability to set queue scheduling
  2019-09-10 15:41 [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Robert Beckett
                   ` (2 preceding siblings ...)
  2019-09-10 15:41 ` [PATCH 3/7] dt-bindings: " Robert Beckett
@ 2019-09-10 15:41 ` Robert Beckett
  2019-09-10 17:18   ` Vivien Didelot
  2019-09-10 15:41 ` [PATCH 5/7] dt-bindings: " Robert Beckett
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 42+ messages in thread
From: Robert Beckett @ 2019-09-10 15:41 UTC
  To: netdev
  Cc: Robert Beckett, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S. Miller

Add code to set Schedule for any port that specifies "schedule" in its
device tree node.
This allows port prioritization in conjunction with port default queue
priorities or packet priorities.

Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 25 +++++++++++++++++++++++++
 drivers/net/dsa/mv88e6xxx/chip.h |  1 +
 drivers/net/dsa/mv88e6xxx/port.c | 21 +++++++++++++++++++++
 drivers/net/dsa/mv88e6xxx/port.h |  6 ++++++
 4 files changed, 53 insertions(+)
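
As a rough illustration of the register update in this patch, the
scheduling field can be modelled outside the kernel as below. The helper
name sched_apply() and the shortened macro names are hypothetical; the
shift and mask come from the port.h hunk in this patch, and the mode
values from the dt-bindings header added in patch 5/7.

```c
#include <errno.h>
#include <stdint.h>

/* Mode values as defined in include/dt-bindings/net/dsa-mv88e6xxx.h. */
#define PORT_SCHED_ROUND_ROBIN 0
#define PORT_SCHED_STRICT_3    1
#define PORT_SCHED_STRICT_3_2  2
#define PORT_SCHED_STRICT_ALL  3

/* Field position in Egress Rate Control 2 (offset 0x0A), per port.h. */
#define PORT_SCHED_SHIFT 12
#define PORT_SCHED_MASK  (0x3 << PORT_SCHED_SHIFT)

/*
 * Hypothetical helper: replace the scheduling field in an Egress Rate
 * Control 2 value, leaving every other bit untouched.
 * Returns 0 on success or -EINVAL for an unknown mode.
 */
static int sched_apply(uint16_t old, uint32_t sched, uint16_t *out)
{
	if (sched > PORT_SCHED_STRICT_ALL)
		return -EINVAL;

	old &= (uint16_t)~PORT_SCHED_MASK;              /* clear bits 13:12 */
	old |= (uint16_t)(sched << PORT_SCHED_SHIFT);   /* select the mode  */

	*out = old;
	return 0;
}
```

For example, starting from 0xffff and selecting round robin clears only
bits 13:12 and gives 0xcfff, while strict-all on an otherwise zero
register gives 0x3000.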

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 5005a35493e3..2bc22c59200c 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2103,6 +2103,23 @@ static int mv88e6xxx_set_port_defqpri(struct mv88e6xxx_chip *chip, int port)
 	return chip->info->ops->port_set_defqpri(chip, port, (u16)pri);
 }
 
+static int mv88e6xxx_set_port_sched(struct mv88e6xxx_chip *chip, int port)
+{
+	struct dsa_switch *ds = chip->ds;
+	struct device_node *dn = ds->ports[port].dn;
+	int err;
+	u32 sched;
+
+	if (!dn || !chip->info->ops->port_set_sched)
+		return 0;
+
+	err = of_property_read_u32(dn, "schedule", &sched);
+	if (err < 0)
+		return 0;
+
+	return chip->info->ops->port_set_sched(chip, port, (u16)sched);
+}
+
 static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
 {
 	struct dsa_switch *ds = chip->ds;
@@ -2218,6 +2235,10 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
 	if (err)
 		return err;
 
+	err = mv88e6xxx_set_port_sched(chip, port);
+	if (err)
+		return err;
+
 	if (chip->info->ops->port_pause_limit) {
 		err = chip->info->ops->port_pause_limit(chip, port, 0, 0);
 		if (err)
@@ -3130,6 +3151,7 @@ static const struct mv88e6xxx_ops mv88e6172_ops = {
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
 	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
 	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_set_sched = mv88e6xxx_port_set_sched,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3214,6 +3236,7 @@ static const struct mv88e6xxx_ops mv88e6176_ops = {
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
 	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
 	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_set_sched = mv88e6xxx_port_set_sched,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3432,6 +3455,7 @@ static const struct mv88e6xxx_ops mv88e6240_ops = {
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
 	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
 	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_set_sched = mv88e6xxx_port_set_sched,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3776,6 +3800,7 @@ static const struct mv88e6xxx_ops mv88e6352_ops = {
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
 	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
 	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_set_sched = mv88e6xxx_port_set_sched,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index 2d2c24f5a79d..ff3e35eceee0 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -386,6 +386,7 @@ struct mv88e6xxx_ops {
 	int (*port_set_defqpri)(struct mv88e6xxx_chip *chip, int port, u16 pri);
 
 	int (*port_egress_rate_limiting)(struct mv88e6xxx_chip *chip, int port);
+	int (*port_set_sched)(struct mv88e6xxx_chip *chip, int port, u16 sched);
 	int (*port_pause_limit)(struct mv88e6xxx_chip *chip, int port, u8 in,
 				u8 out);
 	int (*port_disable_learn_limit)(struct mv88e6xxx_chip *chip, int port);
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index 3a45fcd5cd9c..236732fc598d 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -1180,6 +1180,27 @@ int mv88e6097_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port)
 				    0x0001);
 }
 
+/* Offset 0x0A: Egress Rate Control 2 */
+int mv88e6xxx_port_set_sched(struct mv88e6xxx_chip *chip, int port, u16 sched)
+{
+	u16 reg;
+	int err;
+
+	if (sched > MV88E6XXX_PORT_SCHED_STRICT_ALL)
+		return -EINVAL;
+
+	err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL2,
+				  &reg);
+	if (err)
+		return err;
+
+	reg &= ~MV88E6XXX_PORT_SCHED_MASK;
+	reg |= sched << MV88E6XXX_PORT_SCHED_SHIFT;
+
+	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL2,
+				    reg);
+}
+
 /* Offset 0x0C: Port ATU Control */
 
 int mv88e6xxx_port_disable_learn_limit(struct mv88e6xxx_chip *chip, int port)
diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
index 03884bbaa762..710d6eccafae 100644
--- a/drivers/net/dsa/mv88e6xxx/port.h
+++ b/drivers/net/dsa/mv88e6xxx/port.h
@@ -11,6 +11,7 @@
 #ifndef _MV88E6XXX_PORT_H
 #define _MV88E6XXX_PORT_H
 
+#include <dt-bindings/net/dsa-mv88e6xxx.h>
 #include "chip.h"
 
 /* Offset 0x00: Port Status Register */
@@ -207,6 +208,10 @@
 
 /* Offset 0x0A: Egress Rate Control 2 */
 #define MV88E6XXX_PORT_EGRESS_RATE_CTL2		0x0a
+#define MV88E6XXX_PORT_SCHED_SHIFT		12
+#define MV88E6XXX_PORT_SCHED_MASK \
+	(0x3 << MV88E6XXX_PORT_SCHED_SHIFT)
+/* see MV88E6XXX_PORT_SCHED_* in include/dt-bindings/net/dsa-mv88e6xxx.h */
 
 /* Offset 0x0B: Port Association Vector */
 #define MV88E6XXX_PORT_ASSOC_VECTOR			0x0b
@@ -332,6 +337,7 @@ int mv88e6165_port_set_jumbo_size(struct mv88e6xxx_chip *chip, int port,
 int mv88e6xxx_port_set_defqpri(struct mv88e6xxx_chip *chip, int port, u16 pri);
 int mv88e6095_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
 int mv88e6097_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
+int mv88e6xxx_port_set_sched(struct mv88e6xxx_chip *chip, int port, u16 sched);
 int mv88e6097_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
 			       u8 out);
 int mv88e6390_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
-- 
2.18.0



* [PATCH 5/7] dt-bindings: mv88e6xxx: add ability to set queue scheduling
  2019-09-10 15:41 [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Robert Beckett
                   ` (3 preceding siblings ...)
  2019-09-10 15:41 ` [PATCH 4/7] net: dsa: mv88e6xxx: add ability to set queue scheduling Robert Beckett
@ 2019-09-10 15:41 ` " Robert Beckett
  2019-09-10 15:41 ` [PATCH 6/7] net: dsa: mv88e6xxx: add egress rate limiting Robert Beckett
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 42+ messages in thread
From: Robert Beckett @ 2019-09-10 15:41 UTC
  To: netdev
  Cc: Robert Beckett, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S. Miller, Rob Herring, Mark Rutland, devicetree

Document port queue scheduling settings.
Add definitions for specific valid values.

Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
---
 .../devicetree/bindings/net/dsa/marvell.txt     | 12 ++++++++++++
 include/dt-bindings/net/dsa-mv88e6xxx.h         | 17 +++++++++++++++++
 2 files changed, 29 insertions(+)
 create mode 100644 include/dt-bindings/net/dsa-mv88e6xxx.h

diff --git a/Documentation/devicetree/bindings/net/dsa/marvell.txt b/Documentation/devicetree/bindings/net/dsa/marvell.txt
index e097c3c52eac..7de90929c3c9 100644
--- a/Documentation/devicetree/bindings/net/dsa/marvell.txt
+++ b/Documentation/devicetree/bindings/net/dsa/marvell.txt
@@ -50,6 +50,18 @@ Optional properties:
 Optional properties for ports:
 - defqpri=<n>		: Enforced default queue priority for the given port.
 			  Valid range is 0..3
+- schedule=<n>		: Set the port's scheduling mode. Valid values are:
+			  MV88E6XXX_PORT_SCHED_ROUND_ROBIN - All output queues
+			  use a weighted round robin scheme.
+			  MV88E6XXX_PORT_SCHED_STRICT_3 - Output queue 3 uses
+			  a strict scheme, where any packets in queue 3 will be
+			  egressed first, followed by weighted round robin for
+			  the other queues.
+			  MV88E6XXX_PORT_SCHED_STRICT_3_2 - Output queues 2
+			  and 3 use strict, others use weighted round robin.
+			  MV88E6XXX_PORT_SCHED_STRICT_ALL - All queues use
+			  strict priority, where queues drain in descending
+			  queue number order.
 
 Example:
 
diff --git a/include/dt-bindings/net/dsa-mv88e6xxx.h b/include/dt-bindings/net/dsa-mv88e6xxx.h
new file mode 100644
index 000000000000..3f62003841ce
--- /dev/null
+++ b/include/dt-bindings/net/dsa-mv88e6xxx.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Device Tree constants for Marvell 88E6xxx Switch Port Registers
+ *
+ * Copyright (c) 2019, Collabora Ltd.
+ * Copyright (c) 2019, General Electric Company
+ */
+
+#ifndef _DT_BINDINGS_MV88E6XXX_H
+#define _DT_BINDINGS_MV88E6XXX_H
+
+#define MV88E6XXX_PORT_SCHED_ROUND_ROBIN	0
+#define MV88E6XXX_PORT_SCHED_STRICT_3		1
+#define MV88E6XXX_PORT_SCHED_STRICT_3_2		2
+#define MV88E6XXX_PORT_SCHED_STRICT_ALL		3
+
+#endif /* _DT_BINDINGS_MV88E6XXX_H */
-- 
2.18.0



* [PATCH 6/7] net: dsa: mv88e6xxx: add egress rate limiting
  2019-09-10 15:41 [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Robert Beckett
                   ` (4 preceding siblings ...)
  2019-09-10 15:41 ` [PATCH 5/7] dt-bindings: " Robert Beckett
@ 2019-09-10 15:41 ` Robert Beckett
  2019-09-10 17:13   ` Vivien Didelot
  2019-09-11 12:26   ` kbuild test robot
  2019-09-10 15:41 ` [PATCH 7/7] dt-bindings: " Robert Beckett
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 42+ messages in thread
From: Robert Beckett @ 2019-09-10 15:41 UTC
  To: netdev
  Cc: Robert Beckett, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S. Miller

Add code for specifying egress rate limiting per port.
The rate can be specified as ethernet frames or bits per second.

Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c |  72 ++++++++++++++-------
 drivers/net/dsa/mv88e6xxx/chip.h |   3 +-
 drivers/net/dsa/mv88e6xxx/port.c | 106 ++++++++++++++++++++++++++++---
 drivers/net/dsa/mv88e6xxx/port.h |  14 +++-
 4 files changed, 158 insertions(+), 37 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 2bc22c59200c..8c116496ab2f 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2120,6 +2120,32 @@ static int mv88e6xxx_set_port_sched(struct mv88e6xxx_chip *chip, int port)
 	return chip->info->ops->port_set_sched(chip, port, (u16)sched);
 }
 
+static int mv88e6xxx_set_port_egress_rate_limiting(struct mv88e6xxx_chip *chip,
+						   int port)
+{
+	struct dsa_switch *ds = chip->ds;
+	struct device_node *dn = ds->ports[port].dn;
+	int err;
+	u32 mode, count;
+
+	if (!dn || !chip->info->ops->port_egress_rate_limiting)
+		return 0;
+
+	err = of_property_read_u32(dn, "egress-limit-mode", &mode);
+	if (err < 0)
+		goto disable;
+
+	err = of_property_read_u32(dn, "egress-limit-count", &count);
+	if (err < 0)
+		goto disable;
+
+	return chip->info->ops->port_egress_rate_limiting(chip, port, count,
+							  mode);
+
+disable:
+	return chip->info->ops->port_egress_rate_limiting(chip, port, 0, 0);
+}
+
 static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
 {
 	struct dsa_switch *ds = chip->ds;
@@ -2263,11 +2289,9 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
 			return err;
 	}
 
-	if (chip->info->ops->port_egress_rate_limiting) {
-		err = chip->info->ops->port_egress_rate_limiting(chip, port);
-		if (err)
-			return err;
-	}
+	err = mv88e6xxx_set_port_egress_rate_limiting(chip, port);
+	if (err)
+		return err;
 
 	err = mv88e6xxx_setup_message_port(chip, port);
 	if (err)
@@ -2809,7 +2833,7 @@ static const struct mv88e6xxx_ops mv88e6085_ops = {
 	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -2879,7 +2903,7 @@ static const struct mv88e6xxx_ops mv88e6097_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6095_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -2951,7 +2975,7 @@ static const struct mv88e6xxx_ops mv88e6131_ops = {
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_upstream_port = mv88e6095_port_set_upstream_port,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_set_pause = mv88e6185_port_set_pause,
 	.port_link_state = mv88e6352_port_link_state,
@@ -2994,7 +3018,7 @@ static const struct mv88e6xxx_ops mv88e6141_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3034,7 +3058,7 @@ static const struct mv88e6xxx_ops mv88e6161_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3108,7 +3132,7 @@ static const struct mv88e6xxx_ops mv88e6171_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3150,7 +3174,7 @@ static const struct mv88e6xxx_ops mv88e6172_ops = {
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
 	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_set_sched = mv88e6xxx_port_set_sched,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
@@ -3193,7 +3217,7 @@ static const struct mv88e6xxx_ops mv88e6175_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3235,7 +3259,7 @@ static const struct mv88e6xxx_ops mv88e6176_ops = {
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
 	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_set_sched = mv88e6xxx_port_set_sched,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
@@ -3275,7 +3299,7 @@ static const struct mv88e6xxx_ops mv88e6185_ops = {
 	.port_set_speed = mv88e6185_port_set_speed,
 	.port_set_frame_mode = mv88e6085_port_set_frame_mode,
 	.port_set_egress_floods = mv88e6185_port_set_egress_floods,
-	.port_egress_rate_limiting = mv88e6095_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_set_upstream_port = mv88e6095_port_set_upstream_port,
 	.port_set_pause = mv88e6185_port_set_pause,
 	.port_link_state = mv88e6185_port_link_state,
@@ -3454,7 +3478,7 @@ static const struct mv88e6xxx_ops mv88e6240_ops = {
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
 	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_set_sched = mv88e6xxx_port_set_sched,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
@@ -3587,7 +3611,7 @@ static const struct mv88e6xxx_ops mv88e6320_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3630,7 +3654,7 @@ static const struct mv88e6xxx_ops mv88e6321_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3673,7 +3697,7 @@ static const struct mv88e6xxx_ops mv88e6341_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3716,7 +3740,7 @@ static const struct mv88e6xxx_ops mv88e6350_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3755,7 +3779,7 @@ static const struct mv88e6xxx_ops mv88e6351_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3799,7 +3823,7 @@ static const struct mv88e6xxx_ops mv88e6352_ops = {
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
 	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_set_sched = mv88e6xxx_port_set_sched,
 	.port_pause_limit = mv88e6097_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
@@ -3851,7 +3875,7 @@ static const struct mv88e6xxx_ops mv88e6390_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6390_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
@@ -3900,7 +3924,7 @@ static const struct mv88e6xxx_ops mv88e6390x_ops = {
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
 	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
-	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
 	.port_pause_limit = mv88e6390_port_pause_limit,
 	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index ff3e35eceee0..75fbd5df4aae 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -385,7 +385,8 @@ struct mv88e6xxx_ops {
 				   size_t size);
 	int (*port_set_defqpri)(struct mv88e6xxx_chip *chip, int port, u16 pri);
 
-	int (*port_egress_rate_limiting)(struct mv88e6xxx_chip *chip, int port);
+	int (*port_egress_rate_limiting)(struct mv88e6xxx_chip *chip, int port,
+					 u32 count, u32 mode);
 	int (*port_set_sched)(struct mv88e6xxx_chip *chip, int port, u16 sched);
 	int (*port_pause_limit)(struct mv88e6xxx_chip *chip, int port, u8 in,
 				u8 out);
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index 236732fc598d..41418cfaca56 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -1166,21 +1166,107 @@ int mv88e6xxx_port_set_defqpri(struct mv88e6xxx_chip *chip, int port, u16 pri)
 	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL2, reg);
 }
 
-/* Offset 0x09: Port Rate Control */
+/* Offset 0x09: Port Rate Control
+ * Offset 0x0A: Egress Rate Control 2
+ */
 
-int mv88e6095_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port)
+#define Kb			1000
+#define Mb			(1000 * Kb)
+#define Gb			(1000ull * Mb)
+#define EGRESS_FRAME_RATE_MIN	7632
+#define EGRESS_FRAME_RATE_MAX	31250000
+#define EGRESS_BPS_RATE_MIN	(64 * Kb)
+#define EGRESS_BPS_RATE_MAX	(1 * Gb)
+#define EGRESS_RATE_PERIOD	32
+int mv88e6xxx_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port,
+					u32 count, u32 mode)
 {
-	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL1,
-				    0x0000);
-}
+	u16 reg1, reg2;
+	int err;
 
-int mv88e6097_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port)
-{
-	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL1,
-				    0x0001);
+	/* quick exit for disabling */
+	if (count == 0) {
+		err = mv88e6xxx_port_read(chip, port,
+					  MV88E6XXX_PORT_EGRESS_RATE_CTL2,
+					  &reg2);
+		if (err)
+			return err;
+		reg2 &= ~MV88E6XXX_PORT_EGRESS_RATE_MASK;
+		err =  mv88e6xxx_port_write(chip, port,
+					    MV88E6XXX_PORT_EGRESS_RATE_CTL2,
+					    reg2);
+		return err;
+	}
+
+	if (mode > MV88E6XXX_PORT_EGRESS_COUNT_MODE_L3)
+		return -EINVAL;
+
+	if (mode == MV88E6XXX_PORT_EGRESS_COUNT_MODE_FRAMES &&
+	    (count < EGRESS_FRAME_RATE_MIN || count > EGRESS_FRAME_RATE_MAX))
+		return -EINVAL;
+
+	if (mode != MV88E6XXX_PORT_EGRESS_COUNT_MODE_FRAMES &&
+	    (count < EGRESS_BPS_RATE_MIN || count > EGRESS_BPS_RATE_MAX))
+		return -EINVAL;
+
+	err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL1,
+				  &reg1);
+	if (err)
+		return err;
+
+	err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL2,
+				  &reg2);
+	if (err)
+		return err;
+
+	reg1 &= ~MV88E6XXX_PORT_EGRESS_DEC_MASK;
+	reg2 &= ~MV88E6XXX_PORT_EGRESS_COUNT_MODE_MASK;
+
+	if (mode == MV88E6XXX_PORT_EGRESS_COUNT_MODE_FRAMES) {
+		u32 val;
+
+		/* recommended to use dec of 1 for frame based */
+		reg1 |= 1 << MV88E6XXX_PORT_EGRESS_DEC_SHIFT;
+
+		reg2 |= mode << MV88E6XXX_PORT_EGRESS_COUNT_MODE_SHIFT;
+		reg2 &= ~MV88E6XXX_PORT_EGRESS_RATE_MASK;
+
+		val = NSEC_PER_SEC / (EGRESS_RATE_PERIOD * count);
+		if (NSEC_PER_SEC % (EGRESS_RATE_PERIOD * count))
+			val++;
+		reg2 |= (u16)(val << MV88E6XXX_PORT_EGRESS_RATE_SHIFT);
+	} else {
+		u16 egress_dec, egress_rate;
+		u64 dec_bytes, ns_bits;
+
+		if (count < (1 * Mb))
+			egress_dec = (u16)roundup(count, (64 * Kb));
+		else if (count < (100 * Mb))
+			egress_dec = (u16)roundup(count, (1 * Mb));
+		else
+			egress_dec = (u16)roundup(count, (10 * Mb));
+
+		reg1 |= egress_dec;
+
+		dec_bytes = 8ull * NSEC_PER_SEC * egress_dec;
+		ns_bits = 32ull * count;
+		egress_rate = (u16)div64_u64(dec_bytes, ns_bits);
+		reg2 |= egress_rate;
+	}
+
+	err =  mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL1,
+				    reg1);
+	if (err)
+		return err;
+
+	err =  mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL2,
+				    reg2);
+	if (err)
+		return err;
+
+	return 0;
 }
 
-/* Offset 0x0A: Egress Rate Control 2 */
 int mv88e6xxx_port_set_sched(struct mv88e6xxx_chip *chip, int port, u16 sched)
 {
 	u16 reg;
diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
index 710d6eccafae..724f839c570a 100644
--- a/drivers/net/dsa/mv88e6xxx/port.h
+++ b/drivers/net/dsa/mv88e6xxx/port.h
@@ -205,13 +205,23 @@
 
 /* Offset 0x09: Egress Rate Control */
 #define MV88E6XXX_PORT_EGRESS_RATE_CTL1		0x09
+#define MV88E6XXX_PORT_EGRESS_DEC_SHIFT		0
+#define MV88E6XXX_PORT_EGRESS_DEC_MASK		0x7f
 
 /* Offset 0x0A: Egress Rate Control 2 */
 #define MV88E6XXX_PORT_EGRESS_RATE_CTL2		0x0a
+#define MV88E6XXX_PORT_EGRESS_COUNT_MODE_SHIFT	14
+#define MV88E6XXX_PORT_EGRESS_COUNT_MODE_MASK	\
+	(0x3 << MV88E6XXX_PORT_EGRESS_COUNT_MODE_SHIFT)
+/* see MV88E6XXX_PORT_EGRESS_COUNT_* in
+ * include/dt-bindings/net/dsa-mv88e6xxx.h
+ */
 #define MV88E6XXX_PORT_SCHED_SHIFT		12
 #define MV88E6XXX_PORT_SCHED_MASK \
 	(0x3 << MV88E6XXX_PORT_SCHED_SHIFT)
 /* see MV88E6XXX_PORT_SCHED_* in include/dt-bindings/net/dsa-mv88e6xxx.h */
+#define MV88E6XXX_PORT_EGRESS_RATE_SHIFT	0
+#define MV88E6XXX_PORT_EGRESS_RATE_MASK		0xfff
 
 /* Offset 0x0B: Port Association Vector */
 #define MV88E6XXX_PORT_ASSOC_VECTOR			0x0b
@@ -335,8 +345,8 @@ int mv88e6xxx_port_set_message_port(struct mv88e6xxx_chip *chip, int port,
 int mv88e6165_port_set_jumbo_size(struct mv88e6xxx_chip *chip, int port,
 				  size_t size);
 int mv88e6xxx_port_set_defqpri(struct mv88e6xxx_chip *chip, int port, u16 pri);
-int mv88e6095_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
-int mv88e6097_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
+int mv88e6xxx_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port,
+					u32 count, u32 mode);
 int mv88e6xxx_port_set_sched(struct mv88e6xxx_chip *chip, int port, u16 sched);
 int mv88e6097_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
 			       u8 out);
-- 
2.18.0
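
For reference, the frame-based branch of
mv88e6xxx_port_egress_rate_limiting() in the patch above reduces to a
ceiling division: the rate field holds the number of 32-unit periods
(apparently nanoseconds, given the use of NSEC_PER_SEC) per frame. Below
is a minimal user-space sketch of that arithmetic only. The helper name
frame_rate_field() is hypothetical, and the constants are copied from the
patch, not checked against a datasheet.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC		1000000000u
#define EGRESS_RATE_PERIOD	32u		/* value used in the patch */
#define EGRESS_FRAME_RATE_MIN	7632u
#define EGRESS_FRAME_RATE_MAX	31250000u
#define EGRESS_RATE_MASK	0xfffu		/* 12-bit rate field */

/* Hypothetical helper: returns the rate field for a frames-per-second
 * limit, or -1 if the limit is outside the range the patch accepts.
 */
static int frame_rate_field(uint32_t fps)
{
	uint32_t div, val;

	if (fps < EGRESS_FRAME_RATE_MIN || fps > EGRESS_FRAME_RATE_MAX)
		return -1;

	div = EGRESS_RATE_PERIOD * fps;		/* at most 1e9, fits in 32 bits */
	val = (NSEC_PER_SEC + div - 1) / div;	/* round up, as the patch does */

	assert(val <= EGRESS_RATE_MASK);	/* holds for the accepted range */
	return (int)val;
}
```

With these constants, the minimum accepted limit of 7632 frames/s yields
4095, the largest value that fits in the 12-bit field, and the maximum of
31250000 frames/s yields 1.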



* [PATCH 7/7] dt-bindings: mv88e6xxx: add egress rate limiting
  2019-09-10 15:41 [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Robert Beckett
                   ` (5 preceding siblings ...)
  2019-09-10 15:41 ` [PATCH 6/7] net: dsa: mv88e6xxx: add egress rate limiting Robert Beckett
@ 2019-09-10 15:41 ` " Robert Beckett
  2019-09-10 16:49 ` [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Florian Fainelli
  2019-09-10 17:19 ` Vivien Didelot
  8 siblings, 0 replies; 42+ messages in thread
From: Robert Beckett @ 2019-09-10 15:41 UTC (permalink / raw)
  To: netdev
  Cc: Robert Beckett, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S. Miller, Rob Herring, Mark Rutland, devicetree

Document port egress rate limiting settings.
Add defines for specifying egress rate limiting mode.

Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
---
 .../devicetree/bindings/net/dsa/marvell.txt   | 22 +++++++++++++++++++
 include/dt-bindings/net/dsa-mv88e6xxx.h       |  5 +++++
 2 files changed, 27 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/dsa/marvell.txt b/Documentation/devicetree/bindings/net/dsa/marvell.txt
index 7de90929c3c9..d33c1958f420 100644
--- a/Documentation/devicetree/bindings/net/dsa/marvell.txt
+++ b/Documentation/devicetree/bindings/net/dsa/marvell.txt
@@ -62,6 +62,28 @@ Optional properties for ports:
 			  MV88E6XXX_PORT_SCHED_STRICT_ALL - All queues use
 			  strict priority, where queues drain in descending
 			  queue number order.
+- egress-limit-mode=<n>	: Set port egress rate limiting mode. Valid values are:
+			  MV88E6XXX_PORT_EGRESS_COUNT_MODE_FRAMES - Count layer
+			  2 frames (assumed to be 64kb).
+			  MV88E6XXX_PORT_EGRESS_COUNT_MODE_L1 - Count all layer
+			  1 bits
+			  MV88E6XXX_PORT_EGRESS_COUNT_MODE_L2 - Count all layer
+			  2 bits
+			  MV88E6XXX_PORT_EGRESS_COUNT_MODE_L3 - Count all layer
+			  3 bits
+			  Must also specify egress-limit-count.
+- egress-limit-count=<n>: Set port egress rate limiting count. If
+			  egress-limit-mode is FRAMES, this specifies the
+			  maximum number of ethernet frames to allow to egress
+			  from this port per second, otherwise it is number of
+			  bits as counted based on the mode allowed to egress
+			  from this port per second.
+			  The HW has limitations which the driver adheres to:
+			  from 64 Kbps to 1 Mbps in 16 Kbps increments,
+			  from 1 Mbps to 100 Mbps in 1 Mbps increments,
+			  from 100 Mbps to 1 Gbps in 10 Mbps increments.
+			  Other values will be rounded down to the previous
+			  increment.
 
 Example:
 
diff --git a/include/dt-bindings/net/dsa-mv88e6xxx.h b/include/dt-bindings/net/dsa-mv88e6xxx.h
index 3f62003841ce..33ecd94f5e22 100644
--- a/include/dt-bindings/net/dsa-mv88e6xxx.h
+++ b/include/dt-bindings/net/dsa-mv88e6xxx.h
@@ -9,6 +9,11 @@
 #ifndef _DT_BINDINGS_MV88E6XXX_H
 #define _DT_BINDINGS_MV88E6XXX_H
 
+#define MV88E6XXX_PORT_EGRESS_COUNT_MODE_FRAMES	0
+#define MV88E6XXX_PORT_EGRESS_COUNT_MODE_L1	1
+#define MV88E6XXX_PORT_EGRESS_COUNT_MODE_L2	2
+#define MV88E6XXX_PORT_EGRESS_COUNT_MODE_L3	3
+
 #define MV88E6XXX_PORT_SCHED_ROUND_ROBIN	0
 #define MV88E6XXX_PORT_SCHED_STRICT_3		1
 #define MV88E6XXX_PORT_SCHED_STRICT_3_2		2
-- 
2.18.0
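
The rounding described for egress-limit-count in the binding text above
can be sketched in user-space C as follows. This follows the binding text
as written (16 Kbps steps below 1 Mbps, rounding down). Note that the
driver code in patch 6 instead calls roundup() with a 64 Kb step for the
lowest band, so the step table here is an assumption taken from the
documentation, not a statement of what the driver or hardware does. The
helper names are hypothetical.

```c
#include <stdint.h>

#define KBPS	1000u
#define MBPS	(1000u * KBPS)

/* Step size for a requested rate, per the binding text (assumption). */
static uint32_t egress_step(uint32_t bps)
{
	if (bps < 1 * MBPS)
		return 16 * KBPS;
	if (bps < 100 * MBPS)
		return 1 * MBPS;
	return 10 * MBPS;
}

/* Round a bits-per-second request down to the previous increment.
 * Returns 0 for requests outside the documented 64 Kbps..1 Gbps range.
 */
static uint32_t egress_round_down(uint32_t bps)
{
	uint32_t step;

	if (bps < 64 * KBPS || bps > 1000 * MBPS)
		return 0;

	step = egress_step(bps);
	return bps - bps % step;
}
```

Under these assumptions, a request of 1500000 bps would be programmed as
1 Mbps and a request of 155000000 bps as 150 Mbps.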



* Re: [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-10 15:41 ` [PATCH 1/7] net/dsa: configure autoneg for CPU port Robert Beckett
@ 2019-09-10 16:14   ` Vivien Didelot
  2019-09-10 16:56   ` Florian Fainelli
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 42+ messages in thread
From: Vivien Didelot @ 2019-09-10 16:14 UTC (permalink / raw)
  To: Robert Beckett
  Cc: netdev, Robert Beckett, Andrew Lunn, Florian Fainelli, David S. Miller

Hi Robert,

On Tue, 10 Sep 2019 16:41:47 +0100, Robert Beckett <bob.beckett@collabora.com> wrote:
> Configure autoneg for phy connected CPU ports.
> This allows us to use autoneg between the CPU port's phy and the link
> partner's phy.
> This enables us to negoatiate pause frame transmission to prioritise
> packet delivery over throughput.
> 
> Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
> ---
>  net/dsa/port.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index f071acf2842b..1b6832eac2c5 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -538,10 +538,20 @@ static int dsa_port_setup_phy_of(struct dsa_port *dp, bool enable)
>  		return PTR_ERR(phydev);
>  
>  	if (enable) {
> +		phydev->supported = PHY_GBIT_FEATURES | SUPPORTED_MII |
> +				    SUPPORTED_AUI | SUPPORTED_FIBRE |
> +				    SUPPORTED_BNC | SUPPORTED_Pause |
> +				    SUPPORTED_Asym_Pause;
> +		phydev->advertising = phydev->supported;
> +

This seems a bit intrusive to me. I'll get back to you.

>  		err = genphy_config_init(phydev);
>  		if (err < 0)
>  			goto err_put_dev;
>  
> +		err = genphy_config_aneg(phydev);
> +		if (err < 0)
> +			goto err_put_dev;
> +
>  		err = genphy_resume(phydev);
>  		if (err < 0)
>  			goto err_put_dev;


* Re: [PATCH 3/7] dt-bindings: mv88e6xxx: add ability to set default queue priorities per port
  2019-09-10 15:41 ` [PATCH 3/7] dt-bindings: " Robert Beckett
@ 2019-09-10 16:42   ` Florian Fainelli
  2019-09-10 16:49     ` Vivien Didelot
  0 siblings, 1 reply; 42+ messages in thread
From: Florian Fainelli @ 2019-09-10 16:42 UTC (permalink / raw)
  To: Robert Beckett, netdev
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Rob Herring,
	Mark Rutland, devicetree, Jiri Pirko, Ido Schimmel

On 9/10/19 8:41 AM, Robert Beckett wrote:
> Document a new setting for Marvell switch chips to set the default queue
> priorities per port.
> 
> Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
> ---
>  Documentation/devicetree/bindings/net/dsa/marvell.txt | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/dsa/marvell.txt b/Documentation/devicetree/bindings/net/dsa/marvell.txt
> index 6f9538974bb9..e097c3c52eac 100644
> --- a/Documentation/devicetree/bindings/net/dsa/marvell.txt
> +++ b/Documentation/devicetree/bindings/net/dsa/marvell.txt
> @@ -47,6 +47,10 @@ Optional properties:
>  			  bus. The node must contains a compatible string of
>  			  "marvell,mv88e6xxx-mdio-external"
>  
> +Optional properties for ports:
> +- defqpri=<n>		: Enforced default queue priority for the given port.
> +			  Valid range is 0..3

This is a vendor-specific driver/property;
marvell,default-queue-priority (why be a cheapskate on words?) would be
more readable. But still, I have some more fundamental issues with the
general approach, see my response in the cover letter.
-- 
Florian


* Re: [PATCH 2/7] net: dsa: mv88e6xxx: add ability to set default queue priorities per port
  2019-09-10 15:41 ` [PATCH 2/7] net: dsa: mv88e6xxx: add ability to set default queue priorities per port Robert Beckett
@ 2019-09-10 16:43   ` Vivien Didelot
  0 siblings, 0 replies; 42+ messages in thread
From: Vivien Didelot @ 2019-09-10 16:43 UTC (permalink / raw)
  To: Robert Beckett
  Cc: netdev, Robert Beckett, Andrew Lunn, Florian Fainelli, David S. Miller

Hi Robert,

On Tue, 10 Sep 2019 16:41:48 +0100, Robert Beckett <bob.beckett@collabora.com> wrote:
> +static int mv88e6xxx_set_port_defqpri(struct mv88e6xxx_chip *chip, int port)
> +{
> +	struct dsa_switch *ds = chip->ds;
> +	struct device_node *dn = ds->ports[port].dn;
> +	int err;
> +	u32 pri;
> +
> +	if (!dn || !chip->info->ops->port_set_defqpri)
> +		return 0;
> +
> +	err = of_property_read_u32(dn, "defqpri", &pri);
> +	if (err < 0)
> +		return 0;
> +
> +	return chip->info->ops->port_set_defqpri(chip, port, (u16)pri);
> +}
> +
>  static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
>  {
>  	struct dsa_switch *ds = chip->ds;
> @@ -2176,6 +2193,10 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
>  			return err;
>  	}
>  
> +	err = mv88e6xxx_set_port_defqpri(chip, port);
> +	if (err)
> +		return err;
> +
>  	/* Port Association Vector: when learning source addresses
>  	 * of packets, add the address to the address database using
>  	 * a port bitmap that has only the bit for this port set and
> @@ -3107,6 +3128,7 @@ static const struct mv88e6xxx_ops mv88e6172_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> +	.port_set_defqpri = mv88e6xxx_port_set_defqpri,

Please use a reference model, like mv88e6352_port_set_defqpri to avoid
confusion with a generic wrapper or implementation.

>  	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
> @@ -3190,6 +3212,7 @@ static const struct mv88e6xxx_ops mv88e6176_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> +	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
>  	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
> @@ -3407,6 +3430,7 @@ static const struct mv88e6xxx_ops mv88e6240_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> +	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
>  	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
> @@ -3750,6 +3774,7 @@ static const struct mv88e6xxx_ops mv88e6352_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> +	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
>  	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
> index 4646e46d47f2..2d2c24f5a79d 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.h
> +++ b/drivers/net/dsa/mv88e6xxx/chip.h
> @@ -383,6 +383,7 @@ struct mv88e6xxx_ops {
>  				   u16 etype);
>  	int (*port_set_jumbo_size)(struct mv88e6xxx_chip *chip, int port,
>  				   size_t size);
> +	int (*port_set_defqpri)(struct mv88e6xxx_chip *chip, int port, u16 pri);

The default queue priority seems to be an integer in the [0:3] range, not
a register mask or value per se, so an unsigned int seems more appropriate.

>  
>  	int (*port_egress_rate_limiting)(struct mv88e6xxx_chip *chip, int port);
>  	int (*port_pause_limit)(struct mv88e6xxx_chip *chip, int port, u8 in,
> diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
> index 04309ef0a1cc..3a45fcd5cd9c 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.c
> +++ b/drivers/net/dsa/mv88e6xxx/port.c
> @@ -1147,6 +1147,25 @@ int mv88e6165_port_set_jumbo_size(struct mv88e6xxx_chip *chip, int port,
>  	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL2, reg);
>  }
>  
> +int mv88e6xxx_port_set_defqpri(struct mv88e6xxx_chip *chip, int port, u16 pri)
> +{
> +	u16 reg;
> +	int err;
> +
> +	if (pri > 3)
> +		return -EINVAL;
> +
> +	err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL2, &reg);
> +	if (err)
> +		return err;
> +
> +	reg &= ~MV88E6XXX_PORT_CTL2_DEFQPRI_MASK;
> +	reg |= pri << MV88E6XXX_PORT_CTL2_DEFQPRI_SHIFT;

                      __bf_shf(MV88E6XXX_PORT_CTL2_DEFQPRI_MASK)

> +	reg |= MV88E6XXX_PORT_CTL2_USE_DEFQPRI;
> +
> +	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL2, reg);
> +}
> +
>  /* Offset 0x09: Port Rate Control */
>  
>  int mv88e6095_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port)
> diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
> index 8d5a6cd6fb19..03884bbaa762 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.h
> +++ b/drivers/net/dsa/mv88e6xxx/port.h
> @@ -197,6 +197,9 @@
>  #define MV88E6XXX_PORT_CTL2_DEFAULT_FORWARD		0x0040
>  #define MV88E6XXX_PORT_CTL2_EGRESS_MONITOR		0x0020
>  #define MV88E6XXX_PORT_CTL2_INGRESS_MONITOR		0x0010
> +#define MV88E6XXX_PORT_CTL2_USE_DEFQPRI		0x0008
> +#define MV88E6XXX_PORT_CTL2_DEFQPRI_MASK		0x0006
> +#define MV88E6XXX_PORT_CTL2_DEFQPRI_SHIFT		1

No shift macro needed, MV88E6XXX_PORT_CTL2_DEFQPRI_MASK is enough.
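
A minimal user-space sketch of what this suggestion amounts to, assuming
GCC or Clang: the kernel's __bf_shf() (include/linux/bitfield.h) derives
the shift from the lowest set bit of the mask, so only the mask needs a
macro. bf_shf() and set_defqpri() below are hypothetical stand-ins for
illustration, not driver code, and the mask value is copied from the
quoted patch.

```c
#include <assert.h>
#include <stdint.h>

#define PORT_CTL2_DEFQPRI_MASK	0x0006u

/* User-space stand-in for the kernel's __bf_shf(): index of the lowest
 * set bit of a non-zero mask.
 */
#define bf_shf(mask)	(__builtin_ffsll(mask) - 1)

/* Replace the DefQPri field of a register value; pri must be in 0..3. */
static uint16_t set_defqpri(uint16_t reg, unsigned int pri)
{
	assert(pri <= 3);

	reg &= (uint16_t)~PORT_CTL2_DEFQPRI_MASK;
	reg |= (uint16_t)(pri << bf_shf(PORT_CTL2_DEFQPRI_MASK));
	return reg;
}
```

Here bf_shf(0x0006) evaluates to 1, which is the value the separate
MV88E6XXX_PORT_CTL2_DEFQPRI_SHIFT macro would otherwise have to carry.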

>  #define MV88E6095_PORT_CTL2_CPU_PORT_MASK		0x000f
>  
>  /* Offset 0x09: Egress Rate Control */
> @@ -326,6 +329,7 @@ int mv88e6xxx_port_set_message_port(struct mv88e6xxx_chip *chip, int port,
>  				    bool message_port);
>  int mv88e6165_port_set_jumbo_size(struct mv88e6xxx_chip *chip, int port,
>  				  size_t size);
> +int mv88e6xxx_port_set_defqpri(struct mv88e6xxx_chip *chip, int port, u16 pri);
>  int mv88e6095_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
>  int mv88e6097_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
>  int mv88e6097_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,

Thanks,

	Vivien


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-10 15:41 [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Robert Beckett
                   ` (6 preceding siblings ...)
  2019-09-10 15:41 ` [PATCH 7/7] dt-bindings: " Robert Beckett
@ 2019-09-10 16:49 ` Florian Fainelli
  2019-09-11  9:43   ` Robert Beckett
  2019-09-11 11:21   ` Ido Schimmel
  2019-09-10 17:19 ` Vivien Didelot
  8 siblings, 2 replies; 42+ messages in thread
From: Florian Fainelli @ 2019-09-10 16:49 UTC (permalink / raw)
  To: Robert Beckett, netdev
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Ido Schimmel, Jiri Pirko

+Ido, Jiri,

On 9/10/19 8:41 AM, Robert Beckett wrote:
> This patch-set adds support for some features of the Marvell switch
> chips that can be used to handle packet storms.
> 
> The rationale for this was a setup that requires the ability to receive
> traffic from one port, while a packet storm is occuring on another port
> (via an external switch with a deliberate loop). This is needed to
> ensure vital data delivery from a specific port, while mitigating any
> loops or DoS that a user may introduce on another port (can't guarantee
> sensible users).

The use case is reasonable, but the implementation is not really. You
are using Device Tree, which is meant to describe hardware, as a policy
holder for setting up queue priorities and likewise for queue scheduling.

The tool that should be used for that purpose is tc and possibly an
appropriately offloaded queue scheduler in order to map the desired
scheduling class to what the hardware supports.

Jiri, Ido, how do you guys support this with mlxsw?

> 
> [patch 1/7] configures auto negotiation for CPU ports connected with
> phys to enable pause frame propogation.
> 
> [patch 2/7] allows setting of port's default output queue priority for
> any ingressing packets on that port.
> 
> [patch 3/7] dt-bindings for patch 2.
> 
> [patch 4/7] allows setting of a port's queue scheduling so that it can
> prioritise egress of traffic routed from high priority ports.
> 
> [patch 5/7] dt-bindings for patch 4.
> 
> [patch 6/7] allows ports to rate limit their egress. This can be used to
> stop the host CPU from becoming swamped by packet delivery and exhasting
> descriptors.
> 
> [patch 7/7] dt-bindings for patch 6.
> 
> 
> Robert Beckett (7):
>   net/dsa: configure autoneg for CPU port
>   net: dsa: mv88e6xxx: add ability to set default queue priorities per
>     port
>   dt-bindings: mv88e6xxx: add ability to set default queue priorities
>     per port
>   net: dsa: mv88e6xxx: add ability to set queue scheduling
>   dt-bindings: mv88e6xxx: add ability to set queue scheduling
>   net: dsa: mv88e6xxx: add egress rate limiting
>   dt-bindings: mv88e6xxx: add egress rate limiting
> 
>  .../devicetree/bindings/net/dsa/marvell.txt   |  38 +++++
>  drivers/net/dsa/mv88e6xxx/chip.c              | 122 ++++++++++++---
>  drivers/net/dsa/mv88e6xxx/chip.h              |   5 +-
>  drivers/net/dsa/mv88e6xxx/port.c              | 140 +++++++++++++++++-
>  drivers/net/dsa/mv88e6xxx/port.h              |  24 ++-
>  include/dt-bindings/net/dsa-mv88e6xxx.h       |  22 +++
>  net/dsa/port.c                                |  10 ++
>  7 files changed, 327 insertions(+), 34 deletions(-)
>  create mode 100644 include/dt-bindings/net/dsa-mv88e6xxx.h
> 


-- 
Florian


* Re: [PATCH 3/7] dt-bindings: mv88e6xxx: add ability to set default queue priorities per port
  2019-09-10 16:42   ` Florian Fainelli
@ 2019-09-10 16:49     ` Vivien Didelot
  2019-09-10 20:46       ` Vladimir Oltean
  0 siblings, 1 reply; 42+ messages in thread
From: Vivien Didelot @ 2019-09-10 16:49 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Robert Beckett, netdev, Andrew Lunn, David S. Miller,
	Rob Herring, Mark Rutland, devicetree, Jiri Pirko, Ido Schimmel

Hi Robert,

On Tue, 10 Sep 2019 09:42:24 -0700, Florian Fainelli <f.fainelli@gmail.com> wrote:
> This is a vendor specific driver/property,
> marvell,default-queue-priority (which be cheapskate on words) would be
> more readable. But still, I have some more fundamental issues with the
> general approach, see my response in the cover letter.

As Florian said, the DT is unlikely to welcome vendor-specific properties for
configuration that can be done generically through standard network userspace
tools.


Thanks,

	Vivien


* Re: [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-10 15:41 ` [PATCH 1/7] net/dsa: configure autoneg for CPU port Robert Beckett
  2019-09-10 16:14   ` Vivien Didelot
@ 2019-09-10 16:56   ` Florian Fainelli
  2019-09-10 18:26   ` Andrew Lunn
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 42+ messages in thread
From: Florian Fainelli @ 2019-09-10 16:56 UTC (permalink / raw)
  To: Robert Beckett, netdev; +Cc: Andrew Lunn, Vivien Didelot, David S. Miller

On 9/10/19 8:41 AM, Robert Beckett wrote:
> Configure autoneg for phy connected CPU ports.
> This allows us to use autoneg between the CPU port's phy and the link
> partner's phy.
> This enables us to negoatiate pause frame transmission to prioritise
> packet delivery over throughput.

s/autoneg/auto-negotiation/
s/phy/PHY/
s/negoatiate/negotiate/
s/prioritise/prioritize/ (maybe the latter is just my US English
dictionary tripping up)

Also, the subject should be "net: dsa: Configure auto-negotiation for CPU
port" to match previous submissions to that file.

Fixing up that code path sounds reasonable, but are you not hitting the
PHYLINK code path instead?

> 
> Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
> ---
>  net/dsa/port.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index f071acf2842b..1b6832eac2c5 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -538,10 +538,20 @@ static int dsa_port_setup_phy_of(struct dsa_port *dp, bool enable)
>  		return PTR_ERR(phydev);
>  
>  	if (enable) {
> +		phydev->supported = PHY_GBIT_FEATURES | SUPPORTED_MII |
> +				    SUPPORTED_AUI | SUPPORTED_FIBRE |
> +				    SUPPORTED_BNC | SUPPORTED_Pause |
> +				    SUPPORTED_Asym_Pause;
> +		phydev->advertising = phydev->supported;
> +
>  		err = genphy_config_init(phydev);
>  		if (err < 0)
>  			goto err_put_dev;
>  
> +		err = genphy_config_aneg(phydev);
> +		if (err < 0)
> +			goto err_put_dev;
> +
>  		err = genphy_resume(phydev);
>  		if (err < 0)
>  			goto err_put_dev;
> 


-- 
Florian


* Re: [PATCH 6/7] net: dsa: mv88e6xxx: add egress rate limiting
  2019-09-10 15:41 ` [PATCH 6/7] net: dsa: mv88e6xxx: add egress rate limiting Robert Beckett
@ 2019-09-10 17:13   ` Vivien Didelot
  2019-09-11 12:26   ` kbuild test robot
  1 sibling, 0 replies; 42+ messages in thread
From: Vivien Didelot @ 2019-09-10 17:13 UTC (permalink / raw)
  To: Robert Beckett
  Cc: netdev, Robert Beckett, Andrew Lunn, Florian Fainelli, David S. Miller

Hi Robert,

On Tue, 10 Sep 2019 16:41:52 +0100, Robert Beckett <bob.beckett@collabora.com> wrote:
> Add code for specifying egress rate limiting per port.
> The rate can be specified as ethernet frames or bits per second.
> 
> Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
> ---
>  drivers/net/dsa/mv88e6xxx/chip.c |  72 ++++++++++++++-------
>  drivers/net/dsa/mv88e6xxx/chip.h |   3 +-
>  drivers/net/dsa/mv88e6xxx/port.c | 106 ++++++++++++++++++++++++++++---
>  drivers/net/dsa/mv88e6xxx/port.h |  14 +++-
>  4 files changed, 158 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 2bc22c59200c..8c116496ab2f 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -2120,6 +2120,32 @@ static int mv88e6xxx_set_port_sched(struct mv88e6xxx_chip *chip, int port)
>  	return chip->info->ops->port_set_sched(chip, port, (u16)sched);
>  }
>  
> +static int mv88e6xxx_set_port_egress_rate_limiting(struct mv88e6xxx_chip *chip,
> +						   int port)
> +{
> +	struct dsa_switch *ds = chip->ds;
> +	struct device_node *dn = ds->ports[port].dn;
> +	int err;
> +	u32 mode, count;
> +
> +	if (!dn || !chip->info->ops->port_egress_rate_limiting)
> +		return 0;
> +
> +	err = of_property_read_u32(dn, "egress-limit-mode", &mode);
> +	if (err < 0)
> +		goto disable;
> +
> +	err = of_property_read_u32(dn, "egress-limit-count", &count);
> +	if (err < 0)
> +		goto disable;
> +
> +	return chip->info->ops->port_egress_rate_limiting(chip, port, count,
> +							  mode);
> +
> +disable:
> +	return chip->info->ops->port_egress_rate_limiting(chip, port, 0, 0);
> +}
> +
>  static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
>  {
>  	struct dsa_switch *ds = chip->ds;
> @@ -2263,11 +2289,9 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
>  			return err;
>  	}
>  
> -	if (chip->info->ops->port_egress_rate_limiting) {
> -		err = chip->info->ops->port_egress_rate_limiting(chip, port);
> -		if (err)
> -			return err;
> -	}
> +	err = mv88e6xxx_set_port_egress_rate_limiting(chip, port);
> +	if (err)
> +		return err;
>  
>  	err = mv88e6xxx_setup_message_port(chip, port);
>  	if (err)
> @@ -2809,7 +2833,7 @@ static const struct mv88e6xxx_ops mv88e6085_ops = {
>  	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -2879,7 +2903,7 @@ static const struct mv88e6xxx_ops mv88e6097_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6095_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -2951,7 +2975,7 @@ static const struct mv88e6xxx_ops mv88e6131_ops = {
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_upstream_port = mv88e6095_port_set_upstream_port,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_set_pause = mv88e6185_port_set_pause,
>  	.port_link_state = mv88e6352_port_link_state,
> @@ -2994,7 +3018,7 @@ static const struct mv88e6xxx_ops mv88e6141_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3034,7 +3058,7 @@ static const struct mv88e6xxx_ops mv88e6161_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3108,7 +3132,7 @@ static const struct mv88e6xxx_ops mv88e6171_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3150,7 +3174,7 @@ static const struct mv88e6xxx_ops mv88e6172_ops = {
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
>  	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_set_sched = mv88e6xxx_port_set_sched,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
> @@ -3193,7 +3217,7 @@ static const struct mv88e6xxx_ops mv88e6175_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3235,7 +3259,7 @@ static const struct mv88e6xxx_ops mv88e6176_ops = {
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
>  	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_set_sched = mv88e6xxx_port_set_sched,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
> @@ -3275,7 +3299,7 @@ static const struct mv88e6xxx_ops mv88e6185_ops = {
>  	.port_set_speed = mv88e6185_port_set_speed,
>  	.port_set_frame_mode = mv88e6085_port_set_frame_mode,
>  	.port_set_egress_floods = mv88e6185_port_set_egress_floods,
> -	.port_egress_rate_limiting = mv88e6095_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_set_upstream_port = mv88e6095_port_set_upstream_port,
>  	.port_set_pause = mv88e6185_port_set_pause,
>  	.port_link_state = mv88e6185_port_link_state,
> @@ -3454,7 +3478,7 @@ static const struct mv88e6xxx_ops mv88e6240_ops = {
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
>  	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_set_sched = mv88e6xxx_port_set_sched,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
> @@ -3587,7 +3611,7 @@ static const struct mv88e6xxx_ops mv88e6320_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3630,7 +3654,7 @@ static const struct mv88e6xxx_ops mv88e6321_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3673,7 +3697,7 @@ static const struct mv88e6xxx_ops mv88e6341_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3716,7 +3740,7 @@ static const struct mv88e6xxx_ops mv88e6350_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3755,7 +3779,7 @@ static const struct mv88e6xxx_ops mv88e6351_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3799,7 +3823,7 @@ static const struct mv88e6xxx_ops mv88e6352_ops = {
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
>  	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_set_sched = mv88e6xxx_port_set_sched,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
> @@ -3851,7 +3875,7 @@ static const struct mv88e6xxx_ops mv88e6390_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6390_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3900,7 +3924,7 @@ static const struct mv88e6xxx_ops mv88e6390x_ops = {
>  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
>  	.port_set_ether_type = mv88e6351_port_set_ether_type,
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
> -	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_egress_rate_limiting = mv88e6xxx_port_egress_rate_limiting,
>  	.port_pause_limit = mv88e6390_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
> index ff3e35eceee0..75fbd5df4aae 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.h
> +++ b/drivers/net/dsa/mv88e6xxx/chip.h
> @@ -385,7 +385,8 @@ struct mv88e6xxx_ops {
>  				   size_t size);
>  	int (*port_set_defqpri)(struct mv88e6xxx_chip *chip, int port, u16 pri);
>  
> -	int (*port_egress_rate_limiting)(struct mv88e6xxx_chip *chip, int port);
> +	int (*port_egress_rate_limiting)(struct mv88e6xxx_chip *chip, int port,
> +					 u32 count, u32 mode);
>  	int (*port_set_sched)(struct mv88e6xxx_chip *chip, int port, u16 sched);
>  	int (*port_pause_limit)(struct mv88e6xxx_chip *chip, int port, u8 in,
>  				u8 out);
> diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
> index 236732fc598d..41418cfaca56 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.c
> +++ b/drivers/net/dsa/mv88e6xxx/port.c
> @@ -1166,21 +1166,107 @@ int mv88e6xxx_port_set_defqpri(struct mv88e6xxx_chip *chip, int port, u16 pri)
>  	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL2, reg);
>  }
>  
> -/* Offset 0x09: Port Rate Control */
> +/* Offset 0x09: Port Rate Control
> + * Offset 0x0A: Egress Rate Control 2
> + */
>  
> -int mv88e6095_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port)
> +#define Kb			1000
> +#define Mb			(1000 * Kb)
> +#define Gb			(1000ull * Mb)
> +#define EGRESS_FRAME_RATE_MIN	7632
> +#define EGRESS_FRAME_RATE_MAX	31250000
> +#define EGRESS_BPS_RATE_MIN	(64 * Kb)
> +#define EGRESS_BPS_RATE_MAX	(1 * Gb)
> +#define EGRESS_RATE_PERIOD	32
> +int mv88e6xxx_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port,
> +					u32 count, u32 mode)
>  {
> -	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL1,
> -				    0x0000);
> -}
> +	u16 reg1, reg2;
> +	int err;
>  
> -int mv88e6097_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port)
> -{
> -	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL1,
> -				    0x0001);
> +	/* quick exit for disabling */
> +	if (count == 0) {
> +		err = mv88e6xxx_port_read(chip, port,
> +					  MV88E6XXX_PORT_EGRESS_RATE_CTL2,
> +					  &reg2);
> +		if (err)
> +			return err;
> +		reg2 &= ~MV88E6XXX_PORT_EGRESS_RATE_MASK;
> +		err =  mv88e6xxx_port_write(chip, port,
> +					    MV88E6XXX_PORT_EGRESS_RATE_CTL2,
> +					    reg2);
> +		return err;
> +	}
> +
> +	if (mode > MV88E6XXX_PORT_EGRESS_COUNT_MODE_L3)
> +		return -EINVAL;
> +
> +	if (mode == MV88E6XXX_PORT_EGRESS_COUNT_MODE_FRAMES &&
> +	    (count < EGRESS_FRAME_RATE_MIN || count > EGRESS_FRAME_RATE_MAX))
> +		return -EINVAL;
> +
> +	if (mode != MV88E6XXX_PORT_EGRESS_COUNT_MODE_FRAMES &&
> +	    (count < EGRESS_BPS_RATE_MIN || count > EGRESS_BPS_RATE_MAX))
> +		return -EINVAL;
> +
> +	err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL1,
> +				  &reg1);
> +	if (err)
> +		return err;
> +
> +	err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL2,
> +				  &reg2);
> +	if (err)
> +		return err;
> +
> +	reg1 &= ~MV88E6XXX_PORT_EGRESS_DEC_MASK;
> +	reg2 &= ~MV88E6XXX_PORT_EGRESS_COUNT_MODE_MASK;
> +
> +	if (mode == MV88E6XXX_PORT_EGRESS_COUNT_MODE_FRAMES) {
> +		u32 val;
> +
> +		/* recommended to use dec of 1 for frame based */
> +		reg1 |= 1 << MV88E6XXX_PORT_EGRESS_DEC_SHIFT;
> +
> +		reg2 |= mode << MV88E6XXX_PORT_EGRESS_COUNT_MODE_SHIFT;
> +		reg2 &= ~MV88E6XXX_PORT_EGRESS_RATE_MASK;
> +
> +		val = NSEC_PER_SEC / (EGRESS_RATE_PERIOD * count);
> +		if (NSEC_PER_SEC % (EGRESS_RATE_PERIOD * count))
> +			val++;
> +		reg2 |= (u16)(val << MV88E6XXX_PORT_EGRESS_RATE_SHIFT);
> +	} else {
> +		u16 egress_dec, egress_rate;
> +		u64 dec_bytes, ns_bits;
> +
> +		if (count < (1 * Mb))
> +			egress_dec = (u16)roundup(count, (64 * Kb));
> +		else if (count < (100 * Mb))
> +			egress_dec = (u16)roundup(count, (1 * Mb));
> +		else
> +			egress_dec = (u16)roundup(count, (10 * Mb));
> +
> +		reg1 |= egress_dec;
> +
> +		dec_bytes = 8ull * NSEC_PER_SEC * egress_dec;
> +		ns_bits = 32ull * count;
> +		egress_rate = (u16)div64_u64(dec_bytes, ns_bits);
> +		reg2 |= egress_rate;
> +	}
> +
> +	err =  mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL1,
> +				    reg1);
> +	if (err)
> +		return err;
> +
> +	err =  mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL2,
> +				    reg2);
> +	if (err)
> +		return err;
> +
> +	return 0;
>  }
>  
> -/* Offset 0x0A: Egress Rate Control 2 */
>  int mv88e6xxx_port_set_sched(struct mv88e6xxx_chip *chip, int port, u16 sched)
>  {
>  	u16 reg;
> diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
> index 710d6eccafae..724f839c570a 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.h
> +++ b/drivers/net/dsa/mv88e6xxx/port.h
> @@ -205,13 +205,23 @@
>  
>  /* Offset 0x09: Egress Rate Control */
>  #define MV88E6XXX_PORT_EGRESS_RATE_CTL1		0x09
> +#define MV88E6XXX_PORT_EGRESS_DEC_SHIFT		0
> +#define MV88E6XXX_PORT_EGRESS_DEC_MASK		0x7f
>  
>  /* Offset 0x0A: Egress Rate Control 2 */
>  #define MV88E6XXX_PORT_EGRESS_RATE_CTL2		0x0a
> +#define MV88E6XXX_PORT_EGRESS_COUNT_MODE_SHIFT	14
> +#define MV88E6XXX_PORT_EGRESS_COUNT_MODE_MASK	\
> +	(0x3 << MV88E6XXX_PORT_EGRESS_COUNT_MODE_SHIFT)

No shift macros please, only 0x1234 masks and their values named as in the
documentation. This way we see clearly how the 16-bit registers are organized.
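
A minimal sketch of what mask-only definitions might look like for
register 0x0A, assuming the bit positions implied by the shift/mask pairs
in this patch (count mode in bits 15:14, schedule in bits 13:12, rate in
bits 11:0). The EX_* macro names and the helper are illustrative only;
they are not taken from the datasheet or from the driver:

```c
#include <stdint.h>

/* Offset 0x0A: Egress Rate Control 2
 * Hypothetical names; layout assumed from the shift/mask pairs in the
 * patch under review. Each mask is written out as a plain 16-bit value
 * so the register organisation is visible at a glance.
 */
#define EX_PORT_EGRESS_RATE_CTL2			0x0a
#define EX_PORT_EGRESS_RATE_CTL2_COUNT_MODE_MASK	0xc000
#define EX_PORT_EGRESS_RATE_CTL2_SCHED_MASK		0x3000
#define EX_PORT_EGRESS_RATE_CTL2_RATE_MASK		0x0fff

/* Replace the 12-bit rate field, leaving the other fields untouched. */
static inline uint16_t ex_ctl2_set_rate(uint16_t reg, uint16_t rate)
{
	reg &= ~EX_PORT_EGRESS_RATE_CTL2_RATE_MASK;
	reg |= rate & EX_PORT_EGRESS_RATE_CTL2_RATE_MASK;
	return reg;
}
```

With the three masks side by side, it is easy to check that they cover
the 16-bit register without overlapping.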

> +/* see MV88E6XXX_PORT_EGRESS_COUNT_* in
> + * include/dt-bindings/net/dsa-mv88e6xxx.h
> + */
>  #define MV88E6XXX_PORT_SCHED_SHIFT		12
>  #define MV88E6XXX_PORT_SCHED_MASK \
>  	(0x3 << MV88E6XXX_PORT_SCHED_SHIFT)
>  /* see MV88E6XXX_PORT_SCHED_* in include/dt-bindings/net/dsa-mv88e6xxx.h */
> +#define MV88E6XXX_PORT_EGRESS_RATE_SHIFT	0
> +#define MV88E6XXX_PORT_EGRESS_RATE_MASK		0xfff
>  
>  /* Offset 0x0B: Port Association Vector */
>  #define MV88E6XXX_PORT_ASSOC_VECTOR			0x0b
> @@ -335,8 +345,8 @@ int mv88e6xxx_port_set_message_port(struct mv88e6xxx_chip *chip, int port,
>  int mv88e6165_port_set_jumbo_size(struct mv88e6xxx_chip *chip, int port,
>  				  size_t size);
>  int mv88e6xxx_port_set_defqpri(struct mv88e6xxx_chip *chip, int port, u16 pri);
> -int mv88e6095_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
> -int mv88e6097_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
> +int mv88e6xxx_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port,
> +					u32 count, u32 mode);
>  int mv88e6xxx_port_set_sched(struct mv88e6xxx_chip *chip, int port, u16 sched);
>  int mv88e6097_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
>  			       u8 out);

This patch does not look good. Implementations in port.c must be simple
functions, ordered by Port register, that implement read and write
operations for those registers and, where needed, reject unsupported
values. No logic in them. You may abstract some values with an enum defined
in chip.h if needed (some models don't use the same values for various
definitions, for example).
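
To illustrate the split being asked for, here is a rough, self-contained
sketch: the rate arithmetic lives in a pure helper (which would sit with
the chip-level logic), while the port-level function only rejects values
that do not fit the field and does a read-modify-write. The in-memory
register array and every ex_* / EX_* name are stand-ins invented for this
example. The 32 ns period, the frame-rate limits and the round-up mirror
the frame-mode branch of the patch above; they are not checked against
the datasheet here:

```c
#include <errno.h>
#include <stdint.h>

#define EX_NSEC_PER_SEC		1000000000ull
#define EX_EGRESS_RATE_PERIOD	32ull	/* ns, as in the patch under review */
#define EX_FRAME_RATE_MIN	7632u
#define EX_FRAME_RATE_MAX	31250000u
#define EX_CTL2_RATE_MASK	0x0fff
#define EX_NUM_PORTS		8

/* Stand-in for the switch: one 16-bit rate-control register per port. */
static uint16_t ex_ctl2[EX_NUM_PORTS];

/* Logic: convert frames per second into the rate field, rounding up. */
static int ex_frames_to_rate(uint32_t fps, uint16_t *rate)
{
	uint64_t div;

	if (fps < EX_FRAME_RATE_MIN || fps > EX_FRAME_RATE_MAX)
		return -EINVAL;

	div = EX_EGRESS_RATE_PERIOD * fps;
	*rate = (uint16_t)((EX_NSEC_PER_SEC + div - 1) / div);
	return 0;
}

/* Register access: reject values that do not fit, then read-modify-write. */
static int ex_port_set_egress_rate(int port, uint16_t rate)
{
	if (port < 0 || port >= EX_NUM_PORTS || (rate & ~EX_CTL2_RATE_MASK))
		return -EINVAL;

	ex_ctl2[port] = (ex_ctl2[port] & ~EX_CTL2_RATE_MASK) | rate;
	return 0;
}
```

A caller would run ex_frames_to_rate() first and pass the result to
ex_port_set_egress_rate(), so the register function stays free of any
unit conversion.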


Thanks,

	Vivien


* Re: [PATCH 4/7] net: dsa: mv88e6xxx: add ability to set queue scheduling
  2019-09-10 15:41 ` [PATCH 4/7] net: dsa: mv88e6xxx: add ability to set queue scheduling Robert Beckett
@ 2019-09-10 17:18   ` Vivien Didelot
  0 siblings, 0 replies; 42+ messages in thread
From: Vivien Didelot @ 2019-09-10 17:18 UTC (permalink / raw)
  To: Robert Beckett
  Cc: netdev, Robert Beckett, Andrew Lunn, Florian Fainelli, David S. Miller

Hi Robert,

On Tue, 10 Sep 2019 16:41:50 +0100, Robert Beckett <bob.beckett@collabora.com> wrote:
> Add code to set Schedule for any port that specifies "schedule" in their
> device tree node.
> This allows port prioritization in conjunction with port default queue
> priorities or packet priorities.
> 
> Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
> ---
>  drivers/net/dsa/mv88e6xxx/chip.c | 25 +++++++++++++++++++++++++
>  drivers/net/dsa/mv88e6xxx/chip.h |  1 +
>  drivers/net/dsa/mv88e6xxx/port.c | 21 +++++++++++++++++++++
>  drivers/net/dsa/mv88e6xxx/port.h |  6 ++++++
>  4 files changed, 53 insertions(+)
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 5005a35493e3..2bc22c59200c 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -2103,6 +2103,23 @@ static int mv88e6xxx_set_port_defqpri(struct mv88e6xxx_chip *chip, int port)
>  	return chip->info->ops->port_set_defqpri(chip, port, (u16)pri);
>  }
>  
> +static int mv88e6xxx_set_port_sched(struct mv88e6xxx_chip *chip, int port)
> +{
> +	struct dsa_switch *ds = chip->ds;
> +	struct device_node *dn = ds->ports[port].dn;
> +	int err;
> +	u32 sched;
> +
> +	if (!dn || !chip->info->ops->port_set_sched)
> +		return 0;
> +
> +	err = of_property_read_u32(dn, "schedule", &sched);
> +	if (err < 0)
> +		return 0;
> +
> +	return chip->info->ops->port_set_sched(chip, port, (u16)sched);
> +}
> +
>  static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
>  {
>  	struct dsa_switch *ds = chip->ds;
> @@ -2218,6 +2235,10 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
>  	if (err)
>  		return err;
>  
> +	err = mv88e6xxx_set_port_sched(chip, port);
> +	if (err)
> +		return err;
> +
>  	if (chip->info->ops->port_pause_limit) {
>  		err = chip->info->ops->port_pause_limit(chip, port, 0, 0);
>  		if (err)
> @@ -3130,6 +3151,7 @@ static const struct mv88e6xxx_ops mv88e6172_ops = {
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
>  	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
>  	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_set_sched = mv88e6xxx_port_set_sched,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3214,6 +3236,7 @@ static const struct mv88e6xxx_ops mv88e6176_ops = {
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
>  	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
>  	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_set_sched = mv88e6xxx_port_set_sched,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3432,6 +3455,7 @@ static const struct mv88e6xxx_ops mv88e6240_ops = {
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
>  	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
>  	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_set_sched = mv88e6xxx_port_set_sched,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> @@ -3776,6 +3800,7 @@ static const struct mv88e6xxx_ops mv88e6352_ops = {
>  	.port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
>  	.port_set_defqpri = mv88e6xxx_port_set_defqpri,
>  	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
> +	.port_set_sched = mv88e6xxx_port_set_sched,
>  	.port_pause_limit = mv88e6097_port_pause_limit,
>  	.port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
>  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
> index 2d2c24f5a79d..ff3e35eceee0 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.h
> +++ b/drivers/net/dsa/mv88e6xxx/chip.h
> @@ -386,6 +386,7 @@ struct mv88e6xxx_ops {
>  	int (*port_set_defqpri)(struct mv88e6xxx_chip *chip, int port, u16 pri);
>  
>  	int (*port_egress_rate_limiting)(struct mv88e6xxx_chip *chip, int port);
> +	int (*port_set_sched)(struct mv88e6xxx_chip *chip, int port, u16 sched);
>  	int (*port_pause_limit)(struct mv88e6xxx_chip *chip, int port, u8 in,
>  				u8 out);
>  	int (*port_disable_learn_limit)(struct mv88e6xxx_chip *chip, int port);
> diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
> index 3a45fcd5cd9c..236732fc598d 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.c
> +++ b/drivers/net/dsa/mv88e6xxx/port.c
> @@ -1180,6 +1180,27 @@ int mv88e6097_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port)
>  				    0x0001);
>  }
>  
> +/* Offset 0x0A: Egress Rate Control 2 */
> +int mv88e6xxx_port_set_sched(struct mv88e6xxx_chip *chip, int port, u16 sched)
> +{
> +	u16 reg;
> +	int err;
> +
> +	if (sched > MV88E6XXX_PORT_SCHED_STRICT_ALL)
> +		return -EINVAL;
> +
> +	err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL2,
> +				  &reg);
> +	if (err)
> +		return err;
> +
> +	reg &= ~MV88E6XXX_PORT_SCHED_MASK;
> +	reg |= sched << MV88E6XXX_PORT_SCHED_SHIFT;
> +
> +	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_EGRESS_RATE_CTL2,
> +				    reg);
> +}
> +
>  /* Offset 0x0C: Port ATU Control */
>  
>  int mv88e6xxx_port_disable_learn_limit(struct mv88e6xxx_chip *chip, int port)
> diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
> index 03884bbaa762..710d6eccafae 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.h
> +++ b/drivers/net/dsa/mv88e6xxx/port.h
> @@ -11,6 +11,7 @@
>  #ifndef _MV88E6XXX_PORT_H
>  #define _MV88E6XXX_PORT_H
>  
> +#include <dt-bindings/net/dsa-mv88e6xxx.h>
>  #include "chip.h"
>  
>  /* Offset 0x00: Port Status Register */
> @@ -207,6 +208,10 @@
>  
>  /* Offset 0x0A: Egress Rate Control 2 */
>  #define MV88E6XXX_PORT_EGRESS_RATE_CTL2		0x0a
> +#define MV88E6XXX_PORT_SCHED_SHIFT		12
> +#define MV88E6XXX_PORT_SCHED_MASK \
> +	(0x3 << MV88E6XXX_PORT_SCHED_SHIFT)
> +/* see MV88E6XXX_PORT_SCHED_* in include/dt-bindings/net/dsa-mv88e6xxx.h */
>  
>  /* Offset 0x0B: Port Association Vector */
>  #define MV88E6XXX_PORT_ASSOC_VECTOR			0x0b
> @@ -332,6 +337,7 @@ int mv88e6165_port_set_jumbo_size(struct mv88e6xxx_chip *chip, int port,
>  int mv88e6xxx_port_set_defqpri(struct mv88e6xxx_chip *chip, int port, u16 pri);
>  int mv88e6095_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
>  int mv88e6097_port_egress_rate_limiting(struct mv88e6xxx_chip *chip, int port);
> +int mv88e6xxx_port_set_sched(struct mv88e6xxx_chip *chip, int port, u16 sched);
>  int mv88e6097_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
>  			       u8 out);
>  int mv88e6390_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,

The same comments apply as for the other patches adding implementations in port.c.


Thanks,

	Vivien


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-10 15:41 [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Robert Beckett
                   ` (7 preceding siblings ...)
  2019-09-10 16:49 ` [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Florian Fainelli
@ 2019-09-10 17:19 ` Vivien Didelot
  2019-09-11  9:46   ` Robert Beckett
  8 siblings, 1 reply; 42+ messages in thread
From: Vivien Didelot @ 2019-09-10 17:19 UTC (permalink / raw)
  To: Robert Beckett
  Cc: netdev, Robert Beckett, Andrew Lunn, Florian Fainelli, David S. Miller

Hi Robert,

On Tue, 10 Sep 2019 16:41:46 +0100, Robert Beckett <bob.beckett@collabora.com> wrote:
> This patch-set adds support for some features of the Marvell switch
> chips that can be used to handle packet storms.
> 
> The rationale for this was a setup that requires the ability to receive
> traffic from one port, while a packet storm is occuring on another port
> (via an external switch with a deliberate loop). This is needed to
> ensure vital data delivery from a specific port, while mitigating any
> loops or DoS that a user may introduce on another port (can't guarantee
> sensible users).
> 
> [patch 1/7] configures auto negotiation for CPU ports connected with
> phys to enable pause frame propogation.
> 
> [patch 2/7] allows setting of port's default output queue priority for
> any ingressing packets on that port.
> 
> [patch 3/7] dt-bindings for patch 2.
> 
> [patch 4/7] allows setting of a port's queue scheduling so that it can
> prioritise egress of traffic routed from high priority ports.
> 
> [patch 5/7] dt-bindings for patch 4.
> 
> [patch 6/7] allows ports to rate limit their egress. This can be used to
> stop the host CPU from becoming swamped by packet delivery and exhasting
> descriptors.
> 
> [patch 7/7] dt-bindings for patch 6.
> 
> 
> Robert Beckett (7):
>   net/dsa: configure autoneg for CPU port
>   net: dsa: mv88e6xxx: add ability to set default queue priorities per
>     port
>   dt-bindings: mv88e6xxx: add ability to set default queue priorities
>     per port
>   net: dsa: mv88e6xxx: add ability to set queue scheduling
>   dt-bindings: mv88e6xxx: add ability to set queue scheduling
>   net: dsa: mv88e6xxx: add egress rate limiting
>   dt-bindings: mv88e6xxx: add egress rate limiting
> 
>  .../devicetree/bindings/net/dsa/marvell.txt   |  38 +++++
>  drivers/net/dsa/mv88e6xxx/chip.c              | 122 ++++++++++++---
>  drivers/net/dsa/mv88e6xxx/chip.h              |   5 +-
>  drivers/net/dsa/mv88e6xxx/port.c              | 140 +++++++++++++++++-
>  drivers/net/dsa/mv88e6xxx/port.h              |  24 ++-
>  include/dt-bindings/net/dsa-mv88e6xxx.h       |  22 +++
>  net/dsa/port.c                                |  10 ++
>  7 files changed, 327 insertions(+), 34 deletions(-)
>  create mode 100644 include/dt-bindings/net/dsa-mv88e6xxx.h

Feature series targeting netdev must be prefixed "PATCH net-next". As
this approach was a PoC, sending it as "RFC net-next" would be even more
appropriate.


Thank you,

	Vivien


* Re: [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-10 15:41 ` [PATCH 1/7] net/dsa: configure autoneg for CPU port Robert Beckett
  2019-09-10 16:14   ` Vivien Didelot
  2019-09-10 16:56   ` Florian Fainelli
@ 2019-09-10 18:26   ` Andrew Lunn
  2019-09-10 18:29     ` Florian Fainelli
  2019-09-11 11:43   ` kbuild test robot
  2019-09-14  7:16   ` kbuild test robot
  4 siblings, 1 reply; 42+ messages in thread
From: Andrew Lunn @ 2019-09-10 18:26 UTC (permalink / raw)
  To: Robert Beckett; +Cc: netdev, Vivien Didelot, Florian Fainelli, David S. Miller

On Tue, Sep 10, 2019 at 04:41:47PM +0100, Robert Beckett wrote:
> This enables us to negoatiate pause frame transmission to prioritise
> packet delivery over throughput.

I don't think we can unconditionally enable this. It is a big
behaviour change, and it is likely to break running systems. It has
effects on QoS, packet prioritisation, etc.

I think there needs to be a configuration knob. But unfortunately, I
don't know of a good place to put this knob. The switch CPU port is
not visible in any way.

    Andrew


* Re: [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-10 18:26   ` Andrew Lunn
@ 2019-09-10 18:29     ` Florian Fainelli
  2019-09-11  9:16       ` Robert Beckett
  0 siblings, 1 reply; 42+ messages in thread
From: Florian Fainelli @ 2019-09-10 18:29 UTC (permalink / raw)
  To: Andrew Lunn, Robert Beckett; +Cc: netdev, Vivien Didelot, David S. Miller

On 9/10/19 11:26 AM, Andrew Lunn wrote:
> On Tue, Sep 10, 2019 at 04:41:47PM +0100, Robert Beckett wrote:
>> This enables us to negoatiate pause frame transmission to prioritise
>> packet delivery over throughput.
> 
> I don't think we can unconditionally enable this. It is a big
> behaviour change, and it is likely to break running systems. It has
> affects on QoS, packet prioritisation, etc.
> 
> I think there needs to be a configuration knob. But unfortunately, i
> don't know of a good place to put this knob. The switch CPU port is
> not visible in any way.

Broadcast storm suppression is to be solved at ingress, not on the CPU
port. Once this lands on the CPU port, it's game over already.
-- 
Florian


* Re: [PATCH 3/7] dt-bindings: mv88e6xxx: add ability to set default queue priorities per port
  2019-09-10 16:49     ` Vivien Didelot
@ 2019-09-10 20:46       ` Vladimir Oltean
  0 siblings, 0 replies; 42+ messages in thread
From: Vladimir Oltean @ 2019-09-10 20:46 UTC (permalink / raw)
  To: Vivien Didelot
  Cc: Florian Fainelli, Robert Beckett, netdev, Andrew Lunn,
	David S. Miller, Rob Herring, Mark Rutland, devicetree,
	Jiri Pirko, Ido Schimmel

Hi guys,

On 10/09/2019, Vivien Didelot <vivien.didelot@gmail.com> wrote:
> Hi Robert,
>
> On Tue, 10 Sep 2019 09:42:24 -0700, Florian Fainelli <f.fainelli@gmail.com>
> wrote:
>> This is a vendor specific driver/property,
>> marvell,default-queue-priority (which be cheapskate on words) would be
>> more readable. But still, I have some more fundamental issues with the
>> general approach, see my response in the cover letter.
>
> As Florian said, the DT is unlikely to welcome vendor specific nodes for
> configuration which may be generic through standard network userspace
> tools.
>
>
> Thanks,
>
> 	Vivien
>

While I do agree that the DT bindings are a big no-no for QoS
settings, the topic is interesting.
What is the user space knob for configuring port-default priority (say
RX queue)?
Something like this maybe? (a very forced "matchall" with rxnfc)
    ethtool --config-nfc eth0 flow-type ether src 00:00:00:00:00:00 m 00:00:00:00:00:00 action 5

Regards,
-Vladimir


* Re: [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-10 18:29     ` Florian Fainelli
@ 2019-09-11  9:16       ` Robert Beckett
  2019-09-11  9:54         ` Robert Beckett
  2019-09-11 22:52         ` Andrew Lunn
  0 siblings, 2 replies; 42+ messages in thread
From: Robert Beckett @ 2019-09-11  9:16 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: netdev, Vivien Didelot, David S. Miller

On Tue, 2019-09-10 at 11:29 -0700, Florian Fainelli wrote:
> On 9/10/19 11:26 AM, Andrew Lunn wrote:
> > On Tue, Sep 10, 2019 at 04:41:47PM +0100, Robert Beckett wrote:
> > > This enables us to negoatiate pause frame transmission to
> > > prioritise
> > > packet delivery over throughput.
> > 
> > I don't think we can unconditionally enable this. It is a big
> > behaviour change, and it is likely to break running systems. It has
> > affects on QoS, packet prioritisation, etc.
> > 
> > I think there needs to be a configuration knob. But unfortunately,
> > i
> > don't know of a good place to put this knob. The switch CPU port is
> > not visible in any way.
> 
> Broadcast storm suppression is to be solved at ingress, not on the
> CPU
> port, once this lands on the CPU port, it's game over already.

It is not just for broadcast storm protection. The original issue that
made me look into all of this turned out to be rx descriptor ring
buffer exhaustion due to the CPU not being able to keep up with packet
reception.

Although the simple repro case for it is a broadcast storm, this could
happen with many legitimate small packets, and the correct way to
handle it seems to be pause frames, though I am not traditionally a
network programmer, so my knowledge may be incorrect. Please advise if
you know of a better way to handle that.

Fundamentally, with a phy to phy CPU connection, the CPU MAC may well
wish to enable pause frames for various reasons, so we should strive to
handle that I think.
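
To make the descriptor-exhaustion argument concrete, here is a minimal
userspace sketch of an rx descriptor ring with and without flow
control. It is a toy model under stated assumptions, not igb or
mv88e6xxx code: the struct, the function names, the ring size and the
watermarks are all hypothetical, and the pause is treated as taking
effect instantly, whereas real pause frames need headroom for frames
already in flight.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an rx descriptor ring; every name here is hypothetical. */
struct rx_ring_model {
	size_t size;         /* number of descriptors in the ring */
	size_t used;         /* descriptors filled and waiting for the CPU */
	size_t high_water;   /* request pause at or above this fill level */
	size_t low_water;    /* release pause at or below this fill level */
	bool pause_enabled;  /* true if flow control was negotiated */
	bool paused;         /* true while the link partner is held off */
	unsigned long dropped;
	unsigned long delivered;
};

/*
 * Advance the model by one time step: the link partner offers
 * `arrivals` frames, then the CPU drains up to `budget` descriptors.
 */
void rx_ring_step(struct rx_ring_model *r, size_t arrivals, size_t budget)
{
	size_t i, n;

	for (i = 0; i < arrivals; i++) {
		if (r->pause_enabled && r->paused)
			break;          /* sender holds the remaining frames */
		if (r->used == r->size) {
			r->dropped++;   /* no free descriptor: frame is lost */
			continue;
		}
		r->used++;
		if (r->pause_enabled && r->used >= r->high_water)
			r->paused = true;
	}

	n = budget < r->used ? budget : r->used;
	r->used -= n;
	r->delivered += n;

	if (r->paused && r->used <= r->low_water)
		r->paused = false;
}

/* Run `steps` time steps on a 256-entry ring and return the drop count. */
unsigned long rx_ring_run_drops(bool pause_enabled, size_t steps,
				size_t arrivals, size_t budget)
{
	struct rx_ring_model r = {
		.size = 256,
		.high_water = 192,
		.low_water = 64,
		.pause_enabled = pause_enabled,
	};
	size_t s;

	for (s = 0; s < steps; s++)
		rx_ring_step(&r, arrivals, budget);

	return r.dropped;
}
```

With arrivals above the drain budget, the model without pause starts
dropping as soon as the ring fills, while the paused model never drops
at the ring. The frames held back do not disappear: they queue up in
the switch, which is where per-port priority would decide what gets
discarded.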




* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-10 16:49 ` [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Florian Fainelli
@ 2019-09-11  9:43   ` Robert Beckett
  2019-09-11 11:21   ` Ido Schimmel
  1 sibling, 0 replies; 42+ messages in thread
From: Robert Beckett @ 2019-09-11  9:43 UTC (permalink / raw)
  To: Florian Fainelli, netdev
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Ido Schimmel,
	Jiri Pirko, bob.beckett

On Tue, 2019-09-10 at 09:49 -0700, Florian Fainelli wrote:
> +Ido, Jiri,
> 
> On 9/10/19 8:41 AM, Robert Beckett wrote:
> > This patch-set adds support for some features of the Marvell switch
> > chips that can be used to handle packet storms.
> > 
> > The rationale for this was a setup that requires the ability to
> > receive
> > traffic from one port, while a packet storm is occuring on another
> > port
> > (via an external switch with a deliberate loop). This is needed to
> > ensure vital data delivery from a specific port, while mitigating
> > any
> > loops or DoS that a user may introduce on another port (can't
> > guarantee
> > sensible users).
> 
> The use case is reasonable, but the implementation is not really. You
> are using Device Tree which is meant to describe hardware as a policy
> holder for setting up queue priorities and likewise for queue
> scheduling.
> 
> The tool that should be used for that purpose is tc and possibly an
> appropriately offloaded queue scheduler in order to map the desired
> scheduling class to what the hardware supports.

Thanks for the review and tip about tc. I'm currently not familiar with
that tool. I'll investigate it as an alternative approach.

> 
> Jiri, Ido, how do you guys support this with mlxsw?
> 
> > 
> > [patch 1/7] configures auto negotiation for CPU ports connected
> > with
> > phys to enable pause frame propogation.
> > 
> > [patch 2/7] allows setting of port's default output queue priority
> > for
> > any ingressing packets on that port.
> > 
> > [patch 3/7] dt-bindings for patch 2.
> > 
> > [patch 4/7] allows setting of a port's queue scheduling so that it
> > can
> > prioritise egress of traffic routed from high priority ports.
> > 
> > [patch 5/7] dt-bindings for patch 4.
> > 
> > [patch 6/7] allows ports to rate limit their egress. This can be
> > used to
> > stop the host CPU from becoming swamped by packet delivery and
> > exhasting
> > descriptors.
> > 
> > [patch 7/7] dt-bindings for patch 6.
> > 
> > 
> > Robert Beckett (7):
> >   net/dsa: configure autoneg for CPU port
> >   net: dsa: mv88e6xxx: add ability to set default queue priorities
> > per
> >     port
> >   dt-bindings: mv88e6xxx: add ability to set default queue
> > priorities
> >     per port
> >   net: dsa: mv88e6xxx: add ability to set queue scheduling
> >   dt-bindings: mv88e6xxx: add ability to set queue scheduling
> >   net: dsa: mv88e6xxx: add egress rate limiting
> >   dt-bindings: mv88e6xxx: add egress rate limiting
> > 
> >  .../devicetree/bindings/net/dsa/marvell.txt   |  38 +++++
> >  drivers/net/dsa/mv88e6xxx/chip.c              | 122 ++++++++++++
> > ---
> >  drivers/net/dsa/mv88e6xxx/chip.h              |   5 +-
> >  drivers/net/dsa/mv88e6xxx/port.c              | 140
> > +++++++++++++++++-
> >  drivers/net/dsa/mv88e6xxx/port.h              |  24 ++-
> >  include/dt-bindings/net/dsa-mv88e6xxx.h       |  22 +++
> >  net/dsa/port.c                                |  10 ++
> >  7 files changed, 327 insertions(+), 34 deletions(-)
> >  create mode 100644 include/dt-bindings/net/dsa-mv88e6xxx.h
> > 
> 
> 



* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-10 17:19 ` Vivien Didelot
@ 2019-09-11  9:46   ` Robert Beckett
  2019-09-11 15:31     ` Vivien Didelot
  2019-09-11 23:01     ` Andrew Lunn
  0 siblings, 2 replies; 42+ messages in thread
From: Robert Beckett @ 2019-09-11  9:46 UTC (permalink / raw)
  To: Vivien Didelot
  Cc: netdev, Andrew Lunn, Florian Fainelli, David S. Miller, bob.beckett

On Tue, 2019-09-10 at 13:19 -0400, Vivien Didelot wrote:
> Hi Robert,
> 
> On Tue, 10 Sep 2019 16:41:46 +0100, Robert Beckett <
> bob.beckett@collabora.com> wrote:
> > This patch-set adds support for some features of the Marvell switch
> > chips that can be used to handle packet storms.
> > 
> > The rationale for this was a setup that requires the ability to
> > receive
> > traffic from one port, while a packet storm is occuring on another
> > port
> > (via an external switch with a deliberate loop). This is needed to
> > ensure vital data delivery from a specific port, while mitigating
> > any
> > loops or DoS that a user may introduce on another port (can't
> > guarantee
> > sensible users).
> > 
> > [patch 1/7] configures auto negotiation for CPU ports connected
> > with
> > phys to enable pause frame propogation.
> > 
> > [patch 2/7] allows setting of port's default output queue priority
> > for
> > any ingressing packets on that port.
> > 
> > [patch 3/7] dt-bindings for patch 2.
> > 
> > [patch 4/7] allows setting of a port's queue scheduling so that it
> > can
> > prioritise egress of traffic routed from high priority ports.
> > 
> > [patch 5/7] dt-bindings for patch 4.
> > 
> > [patch 6/7] allows ports to rate limit their egress. This can be
> > used to
> > stop the host CPU from becoming swamped by packet delivery and
> > exhasting
> > descriptors.
> > 
> > [patch 7/7] dt-bindings for patch 6.
> > 
> > 
> > Robert Beckett (7):
> >   net/dsa: configure autoneg for CPU port
> >   net: dsa: mv88e6xxx: add ability to set default queue priorities
> > per
> >     port
> >   dt-bindings: mv88e6xxx: add ability to set default queue
> > priorities
> >     per port
> >   net: dsa: mv88e6xxx: add ability to set queue scheduling
> >   dt-bindings: mv88e6xxx: add ability to set queue scheduling
> >   net: dsa: mv88e6xxx: add egress rate limiting
> >   dt-bindings: mv88e6xxx: add egress rate limiting
> > 
> >  .../devicetree/bindings/net/dsa/marvell.txt   |  38 +++++
> >  drivers/net/dsa/mv88e6xxx/chip.c              | 122 ++++++++++++
> > ---
> >  drivers/net/dsa/mv88e6xxx/chip.h              |   5 +-
> >  drivers/net/dsa/mv88e6xxx/port.c              | 140
> > +++++++++++++++++-
> >  drivers/net/dsa/mv88e6xxx/port.h              |  24 ++-
> >  include/dt-bindings/net/dsa-mv88e6xxx.h       |  22 +++
> >  net/dsa/port.c                                |  10 ++
> >  7 files changed, 327 insertions(+), 34 deletions(-)
> >  create mode 100644 include/dt-bindings/net/dsa-mv88e6xxx.h
> 
> Feature series targeting netdev must be prefixed "PATCH net-next". As

Thanks for the info. Out of curiosity, where should I have gleaned this
info from? This is my first contribution to netdev, so I wasn't familiar
with the etiquette.

> this approach was a PoC, sending it as "RFC net-next" would be even
> more
> appropriate.
> 
> 
> Thank you,
> 
> 	Vivien



* Re: [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-11  9:16       ` Robert Beckett
@ 2019-09-11  9:54         ` Robert Beckett
  2019-09-11 22:52         ` Andrew Lunn
  1 sibling, 0 replies; 42+ messages in thread
From: Robert Beckett @ 2019-09-11  9:54 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: netdev, Vivien Didelot, David S. Miller

On Wed, 2019-09-11 at 10:16 +0100, Robert Beckett wrote:
> On Tue, 2019-09-10 at 11:29 -0700, Florian Fainelli wrote:
> > On 9/10/19 11:26 AM, Andrew Lunn wrote:
> > > On Tue, Sep 10, 2019 at 04:41:47PM +0100, Robert Beckett wrote:
> > > > This enables us to negoatiate pause frame transmission to
> > > > prioritise
> > > > packet delivery over throughput.
> > > 
> > > I don't think we can unconditionally enable this. It is a big
> > > behaviour change, and it is likely to break running systems. It
> > > has
> > > affects on QoS, packet prioritisation, etc.
> > > 
> > > I think there needs to be a configuration knob. But
> > > unfortunately,
> > > i
> > > don't know of a good place to put this knob. The switch CPU port
> > > is
> > > not visible in any way.
> > 
> > Broadcast storm suppression is to be solved at ingress, not on the
> > CPU
> > port, once this lands on the CPU port, it's game over already.
> 
> It is not just for broadcast storm protection. The original issue
> that
> made me look in to all of this turned out to be rx descritor ring
> buffer exhaustion due to the CPU not being able to keep up with
> packet
> reception.
> 
> Although the simple repro case for it is a broadcast storm, this
> could
> happen with many legitimate small packets, and the correct way to
> handle it seems to be pause frames, though I am not traditionally a
> network programmer, so my knowledge may be incorrect. Please advise
> if
> you know of a better way to handle that.
> 
> Fundamentally, with a phy to phy CPU connection, the CPU MAC may well
> wish to enable pause frames for various reasons, so we should strive
> to
> handle that I think.
> 

As an aside, do any of you have experience of trying to enable PIRL on
the Marvell switches? The first thing I tried was configuring it for
packet number based (rather than byte count based) input rate limiting,
but it never seemed to have any effect even at extreme values that
should in theory have greatly limited the number of packets allowed to
ingress.

After investigating the root cause and finding it was due to the CPU's
inability to process the received packets quickly enough, pause frames
and port prioritization seemed like the correct fix anyway.



* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-10 16:49 ` [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Florian Fainelli
  2019-09-11  9:43   ` Robert Beckett
@ 2019-09-11 11:21   ` Ido Schimmel
  2019-09-11 11:49     ` Robert Beckett
  1 sibling, 1 reply; 42+ messages in thread
From: Ido Schimmel @ 2019-09-11 11:21 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Robert Beckett, netdev, Andrew Lunn, Vivien Didelot,
	David S. Miller, Jiri Pirko

On Tue, Sep 10, 2019 at 09:49:46AM -0700, Florian Fainelli wrote:
> +Ido, Jiri,
> 
> On 9/10/19 8:41 AM, Robert Beckett wrote:
> > This patch-set adds support for some features of the Marvell switch
> > chips that can be used to handle packet storms.
> > 
> > The rationale for this was a setup that requires the ability to receive
> > traffic from one port, while a packet storm is occuring on another port
> > (via an external switch with a deliberate loop). This is needed to
> > ensure vital data delivery from a specific port, while mitigating any
> > loops or DoS that a user may introduce on another port (can't guarantee
> > sensible users).
> 
> The use case is reasonable, but the implementation is not really. You
> are using Device Tree which is meant to describe hardware as a policy
> holder for setting up queue priorities and likewise for queue scheduling.
> 
> The tool that should be used for that purpose is tc and possibly an
> appropriately offloaded queue scheduler in order to map the desired
> scheduling class to what the hardware supports.
> 
> Jiri, Ido, how do you guys support this with mlxsw?

Hi Florian,

Are you referring to policing traffic towards the CPU using a policer on
the egress of the CPU port? At least that's what I understand from the
description of patch 6 below.

If so, mlxsw sets policers for different traffic types during its
initialization sequence. These policers are not exposed to the user nor
configurable. While the default settings are good for most users, we do
want to allow users to change these and expose current settings.
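
For readers unfamiliar with the term, a policer of this kind is
commonly modelled as a token bucket. The sketch below assumes that
model and a packets-per-second rate. It is not mlxsw or mv88e6xxx
code, and the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC_MODEL 1000000000ULL

/* Hypothetical single-rate token-bucket policer, counted in packets. */
struct policer {
	uint64_t rate_pps;  /* tokens (packets) credited per second */
	uint64_t burst;     /* bucket depth in packets */
	uint64_t tokens;    /* tokens currently available */
	uint64_t last_ns;   /* time up to which tokens have been credited */
};

void policer_init(struct policer *p, uint64_t rate_pps, uint64_t burst,
		  uint64_t now_ns)
{
	p->rate_pps = rate_pps;
	p->burst = burst;
	p->tokens = burst;  /* start with a full bucket */
	p->last_ns = now_ns;
}

/* Return true if a packet arriving at now_ns conforms, false to drop it. */
bool policer_admit(struct policer *p, uint64_t now_ns)
{
	uint64_t elapsed = now_ns - p->last_ns;
	/* Fine for a sketch; very long idle gaps could overflow this product. */
	uint64_t add = elapsed * p->rate_pps / NSEC_PER_SEC_MODEL;

	if (add > 0) {
		p->tokens += add;
		if (p->tokens > p->burst)
			p->tokens = p->burst;
		/* Advance only by the time the whole tokens account for. */
		p->last_ns += add * NSEC_PER_SEC_MODEL / p->rate_pps;
	}

	if (p->tokens == 0)
		return false;

	p->tokens--;
	return true;
}

/* Offer n packets spaced gap_ns apart and return how many were admitted. */
unsigned int policer_offer(struct policer *p, uint64_t start_ns,
			   uint64_t gap_ns, unsigned int n)
{
	unsigned int i, admitted = 0;

	for (i = 0; i < n; i++)
		if (policer_admit(p, start_ns + (uint64_t)i * gap_ns))
			admitted++;

	return admitted;
}
```

The burst parameter bounds how many back-to-back packets reach the CPU
before the steady rate applies, which is the knob that matters when the
receiver has a fixed number of descriptors.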

I agree that tc seems like the right choice, but the question is where
are we going to install the filters?

> 
> > 
> > [patch 1/7] configures auto negotiation for CPU ports connected with
> > phys to enable pause frame propogation.
> > 
> > [patch 2/7] allows setting of port's default output queue priority for
> > any ingressing packets on that port.
> > 
> > [patch 3/7] dt-bindings for patch 2.
> > 
> > [patch 4/7] allows setting of a port's queue scheduling so that it can
> > prioritise egress of traffic routed from high priority ports.
> > 
> > [patch 5/7] dt-bindings for patch 4.
> > 
> > [patch 6/7] allows ports to rate limit their egress. This can be used to
> > stop the host CPU from becoming swamped by packet delivery and exhasting
> > descriptors.
> > 
> > [patch 7/7] dt-bindings for patch 6.
> > 
> > 
> > Robert Beckett (7):
> >   net/dsa: configure autoneg for CPU port
> >   net: dsa: mv88e6xxx: add ability to set default queue priorities per
> >     port
> >   dt-bindings: mv88e6xxx: add ability to set default queue priorities
> >     per port
> >   net: dsa: mv88e6xxx: add ability to set queue scheduling
> >   dt-bindings: mv88e6xxx: add ability to set queue scheduling
> >   net: dsa: mv88e6xxx: add egress rate limiting
> >   dt-bindings: mv88e6xxx: add egress rate limiting
> > 
> >  .../devicetree/bindings/net/dsa/marvell.txt   |  38 +++++
> >  drivers/net/dsa/mv88e6xxx/chip.c              | 122 ++++++++++++---
> >  drivers/net/dsa/mv88e6xxx/chip.h              |   5 +-
> >  drivers/net/dsa/mv88e6xxx/port.c              | 140 +++++++++++++++++-
> >  drivers/net/dsa/mv88e6xxx/port.h              |  24 ++-
> >  include/dt-bindings/net/dsa-mv88e6xxx.h       |  22 +++
> >  net/dsa/port.c                                |  10 ++
> >  7 files changed, 327 insertions(+), 34 deletions(-)
> >  create mode 100644 include/dt-bindings/net/dsa-mv88e6xxx.h
> > 
> 
> 
> -- 
> Florian


* Re: [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-10 15:41 ` [PATCH 1/7] net/dsa: configure autoneg for CPU port Robert Beckett
                     ` (2 preceding siblings ...)
  2019-09-10 18:26   ` Andrew Lunn
@ 2019-09-11 11:43   ` kbuild test robot
  2019-09-14  7:16   ` kbuild test robot
  4 siblings, 0 replies; 42+ messages in thread
From: kbuild test robot @ 2019-09-11 11:43 UTC (permalink / raw)
  To: Robert Beckett
  Cc: kbuild-all, netdev, Robert Beckett, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, David S. Miller

[-- Attachment #1: Type: text/plain, Size: 2922 bytes --]

Hi Robert,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3-rc8 next-20190904]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Robert-Beckett/net-dsa-mv88e6xxx-features-to-handle-network-storms/20190911-142233
config: mips-allmodconfig (attached as .config)
compiler: mips-linux-gcc (GCC) 7.4.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.4.0 make.cross ARCH=mips 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   net//dsa/port.c: In function 'dsa_port_setup_phy_of':
>> net//dsa/port.c:541:41: error: invalid operands to binary | (have 'long unsigned int *' and 'long unsigned int')
      phydev->supported = PHY_GBIT_FEATURES | SUPPORTED_MII |
                                            ^
>> net//dsa/port.c:545:23: error: assignment to expression with array type
      phydev->advertising = phydev->supported;
                          ^

vim +541 net//dsa/port.c

   525	
   526	static int dsa_port_setup_phy_of(struct dsa_port *dp, bool enable)
   527	{
   528		struct dsa_switch *ds = dp->ds;
   529		struct phy_device *phydev;
   530		int port = dp->index;
   531		int err = 0;
   532	
   533		phydev = dsa_port_get_phy_device(dp);
   534		if (!phydev)
   535			return 0;
   536	
   537		if (IS_ERR(phydev))
   538			return PTR_ERR(phydev);
   539	
   540		if (enable) {
 > 541			phydev->supported = PHY_GBIT_FEATURES | SUPPORTED_MII |
   542					    SUPPORTED_AUI | SUPPORTED_FIBRE |
   543					    SUPPORTED_BNC | SUPPORTED_Pause |
   544					    SUPPORTED_Asym_Pause;
 > 545			phydev->advertising = phydev->supported;
   546	
   547			err = genphy_config_init(phydev);
   548			if (err < 0)
   549				goto err_put_dev;
   550	
   551			err = genphy_config_aneg(phydev);
   552			if (err < 0)
   553				goto err_put_dev;
   554	
   555			err = genphy_resume(phydev);
   556			if (err < 0)
   557				goto err_put_dev;
   558	
   559			err = genphy_read_status(phydev);
   560			if (err < 0)
   561				goto err_put_dev;
   562		} else {
   563			err = genphy_suspend(phydev);
   564			if (err < 0)
   565				goto err_put_dev;
   566		}
   567	
   568		if (ds->ops->adjust_link)
   569			ds->ops->adjust_link(ds, port, phydev);
   570	
   571		dev_dbg(ds->dev, "enabled port's phy: %s", phydev_name(phydev));
   572	
   573	err_put_dev:
   574		put_device(&phydev->mdio.dev);
   575		return err;
   576	}
   577	
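
Both errors above point at the same cause: in this tree
phydev->supported and phydev->advertising are arrays of unsigned long
used as bitmaps, so they can be neither OR-ed with integer flags nor
assigned as a whole. The kernel provides helpers such as
linkmode_set_bit() and linkmode_copy() in <linux/linkmode.h> for this.
The following is only a userspace model of that pattern, assuming a
96-bit mask; the lm_* helpers, struct phy_model and the bit numbers are
stand-ins for illustration and not the actual fix for net/dsa/port.c.

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Userspace stand-ins; the width and bit numbers are illustrative only. */
#define LM_NBITS 96
#define LM_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define LM_LONGS ((LM_NBITS + LM_BITS_PER_LONG - 1) / LM_BITS_PER_LONG)

enum {
	LM_1000FULL_BIT = 5,
	LM_PAUSE_BIT = 13,
	LM_ASYM_PAUSE_BIT = 14,
	LM_HIGH_BIT = 70,   /* a mode that does not fit in a 64-bit word */
};

struct phy_model {
	unsigned long supported[LM_LONGS];
	unsigned long advertising[LM_LONGS];
};

/* Set bit nr in a multi-word mask. */
void lm_set_bit(unsigned int nr, unsigned long *mask)
{
	mask[nr / LM_BITS_PER_LONG] |= 1UL << (nr % LM_BITS_PER_LONG);
}

/* Test bit nr in a multi-word mask. */
bool lm_test_bit(unsigned int nr, const unsigned long *mask)
{
	return (mask[nr / LM_BITS_PER_LONG] >> (nr % LM_BITS_PER_LONG)) & 1UL;
}

/* Copy a whole mask; arrays cannot be assigned with '=' in C. */
void lm_copy(unsigned long *dst, const unsigned long *src)
{
	memcpy(dst, src, LM_LONGS * sizeof(unsigned long));
}

/*
 * Start from a base feature mask, add both pause modes, then advertise
 * everything that is supported.
 */
void phy_model_enable_pause(struct phy_model *phy, const unsigned long *base)
{
	lm_copy(phy->supported, base);
	lm_set_bit(LM_PAUSE_BIT, phy->supported);
	lm_set_bit(LM_ASYM_PAUSE_BIT, phy->supported);
	lm_copy(phy->advertising, phy->supported);
}
```

The mask is an array because there are more link modes than fit in one
word, which is why per-bit helpers replace the old flag arithmetic.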

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 61484 bytes --]


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-11 11:21   ` Ido Schimmel
@ 2019-09-11 11:49     ` Robert Beckett
  2019-09-11 22:58       ` Andrew Lunn
  2019-09-12  9:03       ` Ido Schimmel
  0 siblings, 2 replies; 42+ messages in thread
From: Robert Beckett @ 2019-09-11 11:49 UTC (permalink / raw)
  To: Ido Schimmel, Florian Fainelli
  Cc: netdev, Andrew Lunn, Vivien Didelot, David S. Miller, Jiri Pirko,
	bob.beckett

On Wed, 2019-09-11 at 11:21 +0000, Ido Schimmel wrote:
> On Tue, Sep 10, 2019 at 09:49:46AM -0700, Florian Fainelli wrote:
> > +Ido, Jiri,
> > 
> > On 9/10/19 8:41 AM, Robert Beckett wrote:
> > > This patch-set adds support for some features of the Marvell
> > > switch
> > > chips that can be used to handle packet storms.
> > > 
> > > The rationale for this was a setup that requires the ability to
> > > receive
> > > traffic from one port, while a packet storm is occuring on
> > > another port
> > > (via an external switch with a deliberate loop). This is needed
> > > to
> > > ensure vital data delivery from a specific port, while mitigating
> > > any
> > > loops or DoS that a user may introduce on another port (can't
> > > guarantee
> > > sensible users).
> > 
> > The use case is reasonable, but the implementation is not really.
> > You
> > are using Device Tree which is meant to describe hardware as a
> > policy
> > holder for setting up queue priorities and likewise for queue
> > scheduling.
> > 
> > The tool that should be used for that purpose is tc and possibly an
> > appropriately offloaded queue scheduler in order to map the desired
> > scheduling class to what the hardware supports.
> > 
> > Jiri, Ido, how do you guys support this with mlxsw?
> 
> Hi Florian,
> 
> Are you referring to policing traffic towards the CPU using a policer
> on
> the egress of the CPU port? At least that's what I understand from
> the
> description of patch 6 below.
> 
> If so, mlxsw sets policers for different traffic types during its
> initialization sequence. These policers are not exposed to the user
> nor
> configurable. While the default settings are good for most users, we
> do
> want to allow users to change these and expose current settings.
> 
> I agree that tc seems like the right choice, but the question is
> where
> are we going to install the filters?
> 

Before I go too far down the rabbit hole of tc traffic shaping, maybe
it would be good to explain in more detail the problem I am trying to
solve.

We have a setup as follows:

Marvell 88E6240 switch chip, accepting traffic from 4 ports. Port 1
(P1) is critical priority, no dropped packets allowed, all others can
be best effort.

The CPU port of the switch chip is connected via phy to the phy of an
Intel i210 (igb driver).

The i210 is connected via a PCIe switch to an i.MX6.

When too many small packets are delivered to the CPU port (e.g. during
a broadcast flood), we saw dropped packets.

The packets were being received by the i210 into the rx descriptor
buffer fine, but the CPU could not keep up with the load. We saw
rx_fifo_errors increasing rapidly and ksoftirqd at ~100% CPU.


With this in mind, I am wondering whether any amount of tc traffic
shaping would help? Would tc shaping require that the packet reception
manages to keep up before it can enact its policies? Does the
infrastructure have accelerator offload hooks to be able to apply it
via HW? I don't see how it would be able to inspect the packets to
apply filtering if they were dropped due to rx descriptor exhaustion.
(Please bear with me on the basic questions; I am not familiar with
this part of the stack.)

Assuming that tc is still the way to go, after a brief look into the
man pages and the documentation at lartc.org, it seems like it would
need to use the ingress qdisc, with some sort of system to segregate
and prioritise based on ingress port. Is this possible?



> > 
> > > 
> > > [patch 1/7] configures auto negotiation for CPU ports connected
> > > with
> > > phys to enable pause frame propogation.
> > > 
> > > [patch 2/7] allows setting of port's default output queue
> > > priority for
> > > any ingressing packets on that port.
> > > 
> > > [patch 3/7] dt-bindings for patch 2.
> > > 
> > > [patch 4/7] allows setting of a port's queue scheduling so that
> > > it can
> > > prioritise egress of traffic routed from high priority ports.
> > > 
> > > [patch 5/7] dt-bindings for patch 4.
> > > 
> > > [patch 6/7] allows ports to rate limit their egress. This can be
> > > used to
> > > stop the host CPU from becoming swamped by packet delivery and
> > > exhasting
> > > descriptors.
> > > 
> > > [patch 7/7] dt-bindings for patch 6.
> > > 
> > > 
> > > Robert Beckett (7):
> > >   net/dsa: configure autoneg for CPU port
> > >   net: dsa: mv88e6xxx: add ability to set default queue
> > > priorities per
> > >     port
> > >   dt-bindings: mv88e6xxx: add ability to set default queue
> > > priorities
> > >     per port
> > >   net: dsa: mv88e6xxx: add ability to set queue scheduling
> > >   dt-bindings: mv88e6xxx: add ability to set queue scheduling
> > >   net: dsa: mv88e6xxx: add egress rate limiting
> > >   dt-bindings: mv88e6xxx: add egress rate limiting
> > > 
> > >  .../devicetree/bindings/net/dsa/marvell.txt   |  38 +++++
> > >  drivers/net/dsa/mv88e6xxx/chip.c              | 122
> > > ++++++++++++---
> > >  drivers/net/dsa/mv88e6xxx/chip.h              |   5 +-
> > >  drivers/net/dsa/mv88e6xxx/port.c              | 140
> > > +++++++++++++++++-
> > >  drivers/net/dsa/mv88e6xxx/port.h              |  24 ++-
> > >  include/dt-bindings/net/dsa-mv88e6xxx.h       |  22 +++
> > >  net/dsa/port.c                                |  10 ++
> > >  7 files changed, 327 insertions(+), 34 deletions(-)
> > >  create mode 100644 include/dt-bindings/net/dsa-mv88e6xxx.h
> > > 
> > 
> > 
> > -- 
> > Florian



* Re: [PATCH 6/7] net: dsa: mv88e6xxx: add egress rate limiting
  2019-09-10 15:41 ` [PATCH 6/7] net: dsa: mv88e6xxx: add egress rate limiting Robert Beckett
  2019-09-10 17:13   ` Vivien Didelot
@ 2019-09-11 12:26   ` kbuild test robot
  1 sibling, 0 replies; 42+ messages in thread
From: kbuild test robot @ 2019-09-11 12:26 UTC (permalink / raw)
  To: Robert Beckett
  Cc: kbuild-all, netdev, Robert Beckett, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, David S. Miller

[-- Attachment #1: Type: text/plain, Size: 5217 bytes --]

Hi Robert,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3-rc8 next-20190904]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Robert-Beckett/net-dsa-mv88e6xxx-features-to-handle-network-storms/20190911-142233
config: x86_64-fedora-25 (attached as .config)
compiler: gcc-7 (Debian 7.4.0-11) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/net//dsa/mv88e6xxx/chip.c:3529:31: error: 'mv88e6097_port_egress_rate_limiting' undeclared here (not in a function); did you mean 'mv88e6xxx_port_egress_rate_limiting'?
     .port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                  mv88e6xxx_port_egress_rate_limiting

vim +3529 drivers/net//dsa/mv88e6xxx/chip.c

b3469dd8adade1 Vivien Didelot   2016-09-29  3510  
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3511  static const struct mv88e6xxx_ops mv88e6250_ops = {
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3512  	/* MV88E6XXX_FAMILY_6250 */
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3513  	.ieee_pri_map = mv88e6250_g1_ieee_pri_map,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3514  	.ip_pri_map = mv88e6085_g1_ip_pri_map,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3515  	.irl_init_all = mv88e6352_g2_irl_init_all,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3516  	.get_eeprom = mv88e6xxx_g2_get_eeprom16,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3517  	.set_eeprom = mv88e6xxx_g2_set_eeprom16,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3518  	.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3519  	.phy_read = mv88e6xxx_g2_smi_phy_read,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3520  	.phy_write = mv88e6xxx_g2_smi_phy_write,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3521  	.port_set_link = mv88e6xxx_port_set_link,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3522  	.port_set_duplex = mv88e6xxx_port_set_duplex,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3523  	.port_set_rgmii_delay = mv88e6352_port_set_rgmii_delay,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3524  	.port_set_speed = mv88e6250_port_set_speed,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3525  	.port_tag_remap = mv88e6095_port_tag_remap,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3526  	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3527  	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3528  	.port_set_ether_type = mv88e6351_port_set_ether_type,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04 @3529  	.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3530  	.port_pause_limit = mv88e6097_port_pause_limit,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3531  	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3532  	.port_link_state = mv88e6250_port_link_state,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3533  	.stats_snapshot = mv88e6320_g1_stats_snapshot,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3534  	.stats_set_histogram = mv88e6095_g1_stats_set_histogram,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3535  	.stats_get_sset_count = mv88e6250_stats_get_sset_count,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3536  	.stats_get_strings = mv88e6250_stats_get_strings,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3537  	.stats_get_stats = mv88e6250_stats_get_stats,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3538  	.set_cpu_port = mv88e6095_g1_set_cpu_port,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3539  	.set_egress_port = mv88e6095_g1_set_egress_port,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3540  	.watchdog_ops = &mv88e6250_watchdog_ops,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3541  	.mgmt_rsvd2cpu = mv88e6352_g2_mgmt_rsvd2cpu,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3542  	.pot_clear = mv88e6xxx_g2_pot_clear,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3543  	.reset = mv88e6250_g1_reset,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3544  	.vtu_getnext = mv88e6250_g1_vtu_getnext,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3545  	.vtu_loadpurge = mv88e6250_g1_vtu_loadpurge,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3546  	.phylink_validate = mv88e6065_phylink_validate,
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3547  };
1f71836f5d96e4 Rasmus Villemoes 2019-06-04  3548  

:::::: The code at line 3529 was first introduced by commit
:::::: 1f71836f5d96e4c87fad16db86d324bee47e1d30 net: dsa: mv88e6xxx: add support for mv88e6250

:::::: TO: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
:::::: CC: David S. Miller <davem@davemloft.net>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 50818 bytes --]


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-11  9:46   ` Robert Beckett
@ 2019-09-11 15:31     ` Vivien Didelot
  2019-09-11 23:01     ` Andrew Lunn
  1 sibling, 0 replies; 42+ messages in thread
From: Vivien Didelot @ 2019-09-11 15:31 UTC (permalink / raw)
  To: Robert Beckett
  Cc: netdev, Andrew Lunn, Florian Fainelli, David S. Miller, bob.beckett

Hi Robert,

On Wed, 11 Sep 2019 10:46:05 +0100, Robert Beckett <bob.beckett@collabora.com> wrote:
> > Feature series targeting netdev must be prefixed "PATCH net-next". As
> 
> Thanks for the info. Out of curiosity, where should I have gleaned this
> info from? This is my first contribution to netdev, so I wasnt familiar
> with the etiquette.
> 
> > this approach was a PoC, sending it as "RFC net-next" would be even
> > more
> > appropriate.

Netdev, being a huge subsystem, has specific rules for things like the
subject prefix and the merge window. They are described in
Documentation/networking/netdev-FAQ.rst


Thank you,

	Vivien


* Re: [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-11  9:16       ` Robert Beckett
  2019-09-11  9:54         ` Robert Beckett
@ 2019-09-11 22:52         ` Andrew Lunn
  2019-09-12 10:14           ` Robert Beckett
  1 sibling, 1 reply; 42+ messages in thread
From: Andrew Lunn @ 2019-09-11 22:52 UTC (permalink / raw)
  To: Robert Beckett; +Cc: Florian Fainelli, netdev, Vivien Didelot, David S. Miller

> It is not just for broadcast storm protection. The original issue that
> made me look in to all of this turned out to be rx descritor ring
> buffer exhaustion due to the CPU not being able to keep up with packet
> reception.

Pause frames do not really solve this problem. The switch will at
some point fill its buffers and start throwing packets away. Or it
needs to send pause frames to its peers, and then your whole switch
throughput goes down. Packets will always get thrown away, so you need
QoS in your network to give the network hints about which frames it
should throw away first.

..

> Fundamentally, with a phy to phy CPU connection, the CPU MAC may well
> wish to enable pause frames for various reasons, so we should strive to
> handle that I think.

It actually has nothing to do with PHY to PHY connections. You can use
pause frames with direct MAC to MAC connections. PHY auto-negotiation
is one way to indicate both ends support it, but there are also other
ways. e.g.

ethtool -A|--pause devname [autoneg on|off] [rx on|off] [tx on|off]

on the SoC you could do

ethtool --pause eth0 autoneg off rx on tx on

to force the SoC to send and process pause frames. Ideally I would
prefer a solution like this, since it is not a change of behaviour for
everybody else.

   Andrew


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-11 11:49     ` Robert Beckett
@ 2019-09-11 22:58       ` Andrew Lunn
  2019-09-12  9:05         ` Ido Schimmel
  2019-09-12  9:03       ` Ido Schimmel
  1 sibling, 1 reply; 42+ messages in thread
From: Andrew Lunn @ 2019-09-11 22:58 UTC (permalink / raw)
  To: Robert Beckett
  Cc: Ido Schimmel, Florian Fainelli, netdev, Vivien Didelot,
	David S. Miller, Jiri Pirko

> We have a setup as follows:
> 
> Marvell 88E6240 switch chip, accepting traffic from 4 ports. Port 1
> (P1) is critical priority, no dropped packets allowed, all others can
> be best effort.
> 
> CPU port of swtich chip is connected via phy to phy of intel i210 (igb
> driver).
> 
> i210 is connected via pcie switch to imx6.
> 
> When too many small packets attempt to be delivered to CPU port (e.g.
> during broadcast flood) we saw dropped packets.
> 
> The packets were being received by i210 in to rx descriptor buffer
> fine, but the CPU could not keep up with the load. We saw
> rx_fifo_errors increasing rapidly and ksoftirqd at ~100% CPU.
> 
> 
> With this in mind, I am wondering whether any amount of tc traffic
> shaping would help?

Hi Robert

The model in Linux is that you start with a software TC filter, and
then offload it to the hardware. So the user configures TC just as
normal, and then that is used to program the hardware to do the same
thing as what would happen in software. This is exactly the same as we
do with bridging. You create a software bridge and add interfaces to
the bridge. This then gets offloaded to the hardware and it does the
bridging for you.

So think about how you can model the Marvell switch capabilities
using TC, and implement offload support for it.
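
The "configure in software first, then offload" pattern described above
can be sketched in plain user-space C. This is only an illustration under
stated assumptions: struct port, struct sched_cfg, port_set_sched() and
demo_hw_offload() are hypothetical names, not kernel or mv88e6xxx APIs.
The point it shows is that the software configuration is always accepted,
and the hardware is programmed only when the driver can express the same
behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical scheduling request, as a user would express it. */
struct sched_cfg {
	unsigned int num_queues;
	int strict;              /* non-zero: strict priority scheduling */
};

struct port;
typedef int (*offload_fn)(struct port *port, const struct sched_cfg *cfg);

struct port {
	struct sched_cfg sw_cfg;     /* software state, always kept */
	int offloaded;               /* non-zero if hardware mirrors sw_cfg */
	offload_fn setup_offload;    /* NULL if the driver has no offload */
	unsigned int hw_max_queues;  /* hypothetical hardware limit */
};

/* Record the software configuration, then try to push it to hardware. */
int port_set_sched(struct port *port, const struct sched_cfg *cfg)
{
	int err = -EOPNOTSUPP;

	assert(port != NULL && cfg != NULL);
	port->sw_cfg = *cfg;
	if (port->setup_offload)
		err = port->setup_offload(port, cfg);
	port->offloaded = (err == 0);

	/* "Not supported" means software-only operation, not a failure. */
	if (err && err != -EOPNOTSUPP)
		return err;
	return 0;
}

/* Hypothetical driver hook: accepts only what the hardware can do. */
int demo_hw_offload(struct port *port, const struct sched_cfg *cfg)
{
	if (!cfg->strict || cfg->num_queues > port->hw_max_queues)
		return -EOPNOTSUPP;
	return 0; /* a real driver would write switch registers here */
}
```

With a 4-queue hardware limit, a 4-queue strict request ends up offloaded,
while an 8-queue request is still accepted but stays in software.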

    Andrew


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-11  9:46   ` Robert Beckett
  2019-09-11 15:31     ` Vivien Didelot
@ 2019-09-11 23:01     ` Andrew Lunn
  1 sibling, 0 replies; 42+ messages in thread
From: Andrew Lunn @ 2019-09-11 23:01 UTC (permalink / raw)
  To: Robert Beckett; +Cc: Vivien Didelot, netdev, Florian Fainelli, David S. Miller

> > Feature series targeting netdev must be prefixed "PATCH net-next". As
> 
> Thanks for the info. Out of curiosity, where should I have gleaned this
> info from? This is my first contribution to netdev, so I wasnt familiar
> with the etiquette.

It is also a good idea to 'lurk' in a mailing list for a while,
reading emails flying around, getting to know how things work. This
subject of "PATCH net-next" comes up maybe once a week. The idea of
offloads gets discussed once every couple of weeks, etc.

	 Andrew


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-11 11:49     ` Robert Beckett
  2019-09-11 22:58       ` Andrew Lunn
@ 2019-09-12  9:03       ` Ido Schimmel
  2019-09-12  9:21         ` Andrew Lunn
  2019-09-12 16:25         ` Florian Fainelli
  1 sibling, 2 replies; 42+ messages in thread
From: Ido Schimmel @ 2019-09-12  9:03 UTC (permalink / raw)
  To: Robert Beckett
  Cc: Florian Fainelli, netdev, Andrew Lunn, Vivien Didelot,
	David S. Miller, Jiri Pirko

On Wed, Sep 11, 2019 at 12:49:03PM +0100, Robert Beckett wrote:
> On Wed, 2019-09-11 at 11:21 +0000, Ido Schimmel wrote:
> > On Tue, Sep 10, 2019 at 09:49:46AM -0700, Florian Fainelli wrote:
> > > +Ido, Jiri,
> > > 
> > > On 9/10/19 8:41 AM, Robert Beckett wrote:
> > > > This patch-set adds support for some features of the Marvell
> > > > switch
> > > > chips that can be used to handle packet storms.
> > > > 
> > > > The rationale for this was a setup that requires the ability to
> > > > receive
> > > > traffic from one port, while a packet storm is occuring on
> > > > another port
> > > > (via an external switch with a deliberate loop). This is needed
> > > > to
> > > > ensure vital data delivery from a specific port, while mitigating
> > > > any
> > > > loops or DoS that a user may introduce on another port (can't
> > > > guarantee
> > > > sensible users).
> > > 
> > > The use case is reasonable, but the implementation is not really.
> > > You
> > > are using Device Tree which is meant to describe hardware as a
> > > policy
> > > holder for setting up queue priorities and likewise for queue
> > > scheduling.
> > > 
> > > The tool that should be used for that purpose is tc and possibly an
> > > appropriately offloaded queue scheduler in order to map the desired
> > > scheduling class to what the hardware supports.
> > > 
> > > Jiri, Ido, how do you guys support this with mlxsw?
> > 
> > Hi Florian,
> > 
> > Are you referring to policing traffic towards the CPU using a policer
> > on
> > the egress of the CPU port? At least that's what I understand from
> > the
> > description of patch 6 below.
> > 
> > If so, mlxsw sets policers for different traffic types during its
> > initialization sequence. These policers are not exposed to the user
> > nor
> > configurable. While the default settings are good for most users, we
> > do
> > want to allow users to change these and expose current settings.
> > 
> > I agree that tc seems like the right choice, but the question is
> > where
> > are we going to install the filters?
> > 
> 
> Before I go too far down the rabbit hole of tc traffic shaping, maybe
> it would be good to explain in more detail the problem I am trying to
> solve.
> 
> We have a setup as follows:
> 
> Marvell 88E6240 switch chip, accepting traffic from 4 ports. Port 1
> (P1) is critical priority, no dropped packets allowed, all others can
> be best effort.
> 
> CPU port of swtich chip is connected via phy to phy of intel i210 (igb
> driver).
> 
> i210 is connected via pcie switch to imx6.
> 
> When too many small packets attempt to be delivered to CPU port (e.g.
> during broadcast flood) we saw dropped packets.
> 
> The packets were being received by i210 in to rx descriptor buffer
> fine, but the CPU could not keep up with the load. We saw
> rx_fifo_errors increasing rapidly and ksoftirqd at ~100% CPU.
> 
> 
> With this in mind, I am wondering whether any amount of tc traffic
> shaping would help? Would tc shaping require that the packet reception
> manages to keep up before it can enact its policies? Does the
> infrastructure have accelerator offload hooks to be able to apply it
> via HW? I dont see how it would be able to inspect the packets to apply
> filtering if they were dropped due to rx descriptor exhaustion. (please
> bear with me with the basic questions, I am not familiar with this part
> of the stack).
> 
> Assuming that tc is still the way to go, after a brief look in to the
> man pages and the documentation at largc.org, it seems like it would
> need to use the ingress qdisc, with some sort of system to segregate
> and priortise based on ingress port. Is this possible?

Hi Robert,

As I see it, you have two problems here:

1. Classification: Based on ingress port in your case

2. Scheduling: How to schedule between the different transmission queues

Where the port from which the packets should egress is the CPU port,
before they cross the PCI towards the imx6.

Both of these issues can be solved by tc. The main problem is that today
we do not have a netdev to represent the CPU port and therefore can't
use existing infra like tc. I believe we need to create one. Besides
scheduling, we can also use it to permit/deny certain traffic from
reaching the CPU and perform policing.

Drivers can run the received packets via taps using
dev_queue_xmit_nit(), so that users will see all the traffic directed at
the host when running tcpdump on this netdev.

> 
> 
> 
> > > 
> > > > 
> > > > [patch 1/7] configures auto negotiation for CPU ports connected
> > > > with
> > > > phys to enable pause frame propogation.
> > > > 
> > > > [patch 2/7] allows setting of port's default output queue
> > > > priority for
> > > > any ingressing packets on that port.
> > > > 
> > > > [patch 3/7] dt-bindings for patch 2.
> > > > 
> > > > [patch 4/7] allows setting of a port's queue scheduling so that
> > > > it can
> > > > prioritise egress of traffic routed from high priority ports.
> > > > 
> > > > [patch 5/7] dt-bindings for patch 4.
> > > > 
> > > > [patch 6/7] allows ports to rate limit their egress. This can be
> > > > used to
> > > > stop the host CPU from becoming swamped by packet delivery and
> > > > exhasting
> > > > descriptors.
> > > > 
> > > > [patch 7/7] dt-bindings for patch 6.
> > > > 
> > > > 
> > > > Robert Beckett (7):
> > > >   net/dsa: configure autoneg for CPU port
> > > >   net: dsa: mv88e6xxx: add ability to set default queue
> > > > priorities per
> > > >     port
> > > >   dt-bindings: mv88e6xxx: add ability to set default queue
> > > > priorities
> > > >     per port
> > > >   net: dsa: mv88e6xxx: add ability to set queue scheduling
> > > >   dt-bindings: mv88e6xxx: add ability to set queue scheduling
> > > >   net: dsa: mv88e6xxx: add egress rate limiting
> > > >   dt-bindings: mv88e6xxx: add egress rate limiting
> > > > 
> > > >  .../devicetree/bindings/net/dsa/marvell.txt   |  38 +++++
> > > >  drivers/net/dsa/mv88e6xxx/chip.c              | 122
> > > > ++++++++++++---
> > > >  drivers/net/dsa/mv88e6xxx/chip.h              |   5 +-
> > > >  drivers/net/dsa/mv88e6xxx/port.c              | 140
> > > > +++++++++++++++++-
> > > >  drivers/net/dsa/mv88e6xxx/port.h              |  24 ++-
> > > >  include/dt-bindings/net/dsa-mv88e6xxx.h       |  22 +++
> > > >  net/dsa/port.c                                |  10 ++
> > > >  7 files changed, 327 insertions(+), 34 deletions(-)
> > > >  create mode 100644 include/dt-bindings/net/dsa-mv88e6xxx.h
> > > > 
> > > 
> > > 
> > > -- 
> > > Florian
> 


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-11 22:58       ` Andrew Lunn
@ 2019-09-12  9:05         ` Ido Schimmel
  0 siblings, 0 replies; 42+ messages in thread
From: Ido Schimmel @ 2019-09-12  9:05 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Robert Beckett, Florian Fainelli, netdev, Vivien Didelot,
	David S. Miller, Jiri Pirko

On Thu, Sep 12, 2019 at 12:58:41AM +0200, Andrew Lunn wrote:
> So think about how your can model the Marvell switch capabilities
> using TC, and implement offload support for it.

+1 :)


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-12  9:03       ` Ido Schimmel
@ 2019-09-12  9:21         ` Andrew Lunn
  2019-09-12 16:25         ` Florian Fainelli
  1 sibling, 0 replies; 42+ messages in thread
From: Andrew Lunn @ 2019-09-12  9:21 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Robert Beckett, Florian Fainelli, netdev, Vivien Didelot,
	David S. Miller, Jiri Pirko

> 2. Scheduling: How to schedule between the different transmission queues
> 
> Where the port from which the packets should egress is the CPU port,
> before they cross the PCI towards the imx6.

Hi Ido

This is DSA, so the switch is connected via Ethernet to the IMX6, not
PCI. Minor detail, but that really is the core of what makes DSA DSA.

     Andrew


* Re: [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-11 22:52         ` Andrew Lunn
@ 2019-09-12 10:14           ` Robert Beckett
  2019-09-12 10:43             ` Andrew Lunn
  0 siblings, 1 reply; 42+ messages in thread
From: Robert Beckett @ 2019-09-12 10:14 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, netdev, Vivien Didelot, David S. Miller, bob.beckett

On Thu, 2019-09-12 at 00:52 +0200, Andrew Lunn wrote:
> > It is not just for broadcast storm protection. The original issue
> > that
> > made me look in to all of this turned out to be rx descritor ring
> > buffer exhaustion due to the CPU not being able to keep up with
> > packet
> > reception.
> 
> Pause frames does not really solve this problem. The switch will at
> some point fill its buffers, and start throwing packets away. Or it
> needs to send pause packets it its peers. And then your whole switch
> throughput goes down. Packets will always get thrown away, so you
> need
> QoS in your network to give the network hints about which frames is
> should throw away first.
> 

Indeed. This is the understanding I was working with.
This patch series enables pause frames, output queue priority and
strict scheduling to egress the high priority queues first.
This means that when the switch starts dropping frames, it drops from
the lowest priority queues first, as the highest priority ones are
delivered at line speed without issue.
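
As an illustration of that behaviour, here is a minimal user-space C
model of one egress port with four bounded queues and strict priority
scheduling. It is a sketch only, not mv88e6xxx driver code: QUEUE_CAP,
struct egress_port and the port_*() helpers are hypothetical, and the
per-queue tail drop is an assumption about how a switch might behave
when a queue is full.

```c
#include <assert.h>

#define NUM_QUEUES 4   /* output queues per port, 3 = highest priority */
#define QUEUE_CAP  4   /* hypothetical per-queue depth, in frames */

struct egress_port {
	unsigned int depth[NUM_QUEUES];   /* frames currently queued */
	unsigned int dropped[NUM_QUEUES]; /* frames tail-dropped */
	unsigned int sent[NUM_QUEUES];    /* frames transmitted */
};

/* Queue one frame at priority prio; tail-drop if that queue is full. */
int port_enqueue(struct egress_port *p, unsigned int prio)
{
	assert(prio < NUM_QUEUES);
	if (p->depth[prio] >= QUEUE_CAP) {
		p->dropped[prio]++;
		return -1;
	}
	p->depth[prio]++;
	return 0;
}

/* Strict scheduling: always send from the highest non-empty queue.
 * Returns the queue served, or -1 if nothing was waiting. */
int port_dequeue(struct egress_port *p)
{
	for (int q = NUM_QUEUES - 1; q >= 0; q--) {
		if (p->depth[q] > 0) {
			p->depth[q]--;
			p->sent[q]++;
			return q;
		}
	}
	return -1;
}

/* One time slot: storm frames arrive at priority 0, optionally one
 * critical frame arrives at the top priority, then one frame egresses. */
void port_tick(struct egress_port *p, unsigned int storm_frames,
	       int critical_frame)
{
	for (unsigned int i = 0; i < storm_frames; i++)
		port_enqueue(p, 0);
	if (critical_frame)
		port_enqueue(p, NUM_QUEUES - 1);
	port_dequeue(p);
}
```

Running port_tick() repeatedly with three storm frames per slot and a
critical frame every second slot leaves the top queue with no drops,
while the lowest queue overflows and absorbs all of the loss.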

> ..
> 
> > Fundamentally, with a phy to phy CPU connection, the CPU MAC may
> > well
> > wish to enable pause frames for various reasons, so we should
> > strive to
> > handle that I think.
> 
> It actually has nothing to do with PHY to PHY connections. You can
> use
> pause frames with direct MAC to MAC connections. PHY auto-negotiation
> is one way to indicate both ends support it, but there are also other
> ways. e.g.
> 
> ethtool -A|--pause devname [autoneg on|off] [rx on|off] [tx on|off]
> 
> on the SoC you could do
> 
> ethtool --pause eth0 autoneg off rx on tx on
> 
> to force the SoC to send and process pause frames. Ideally i would
> prefer a solution like this, since it is not a change of behaviour
> for
> everybody else.

Good point, well made.
The reason for using autoneg in this series was due to having no netdev
to run ethtool against for the CPU port.
If we go down the route of creating a netdev for the CPU port, then we
could indeed force pause frames at both ends.

However, given that the phy on the Marvell switch is capable of autoneg,
is it not reasonable to set up the advertisement and let autoneg take
care of it when using a phy to phy connection?

> 
>    Andrew



* Re: [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-12 10:14           ` Robert Beckett
@ 2019-09-12 10:43             ` Andrew Lunn
  0 siblings, 0 replies; 42+ messages in thread
From: Andrew Lunn @ 2019-09-12 10:43 UTC (permalink / raw)
  To: Robert Beckett
  Cc: Florian Fainelli, netdev, Vivien Didelot, David S. Miller, bob.beckett

> > It actually has nothing to do with PHY to PHY connections. You can
> > use
> > pause frames with direct MAC to MAC connections. PHY auto-negotiation
> > is one way to indicate both ends support it, but there are also other
> > ways. e.g.
> > 
> > ethtool -A|--pause devname [autoneg on|off] [rx on|off] [tx on|off]
> > 
> > on the SoC you could do
> > 
> > ethtool --pause eth0 autoneg off rx on tx on
> > 
> > to force the SoC to send and process pause frames. Ideally i would
> > prefer a solution like this, since it is not a change of behaviour
> > for
> > everybody else.
> 
> Good point, well made.
> The reason for using autoneg in this series was due to having no netdev
> to run ethtool against for the CPU port.

Do you need one? It is the IMX which is the bottleneck. It is the one
which needs to send pause frames. You have a netdev for that. Have you
checked if the switch will react to pause frames without your
change? Play with the command I gave above on the master interface. It
looks like the FEC driver fully supports synchronous pause
configuration.

> However, given that the phy on the marvell switch is capable of
> autoneg , is it not reasonable to setup the advertisement and let
> autoneg take care of it if using phy to phy connection?

Most designs don't use back to back PHYs for the CPU port. They save
the cost and connect MACs back to back using RGMII, or maybe SERDES.
If we are going for a method which can configure pause between the CPU
and the switch, it needs to be generic and work for both setups.

    Andrew


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-12  9:03       ` Ido Schimmel
  2019-09-12  9:21         ` Andrew Lunn
@ 2019-09-12 16:25         ` Florian Fainelli
  2019-09-12 16:46           ` Robert Beckett
  1 sibling, 1 reply; 42+ messages in thread
From: Florian Fainelli @ 2019-09-12 16:25 UTC (permalink / raw)
  To: Ido Schimmel, Robert Beckett
  Cc: netdev, Andrew Lunn, Vivien Didelot, David S. Miller, Jiri Pirko

On 9/12/19 2:03 AM, Ido Schimmel wrote:
> On Wed, Sep 11, 2019 at 12:49:03PM +0100, Robert Beckett wrote:
>> On Wed, 2019-09-11 at 11:21 +0000, Ido Schimmel wrote:
>>> On Tue, Sep 10, 2019 at 09:49:46AM -0700, Florian Fainelli wrote:
>>>> +Ido, Jiri,
>>>>
>>>> On 9/10/19 8:41 AM, Robert Beckett wrote:
>>>>> This patch-set adds support for some features of the Marvell
>>>>> switch
>>>>> chips that can be used to handle packet storms.
>>>>>
>>>>> The rationale for this was a setup that requires the ability to
>>>>> receive
>>>>> traffic from one port, while a packet storm is occuring on
>>>>> another port
>>>>> (via an external switch with a deliberate loop). This is needed
>>>>> to
>>>>> ensure vital data delivery from a specific port, while mitigating
>>>>> any
>>>>> loops or DoS that a user may introduce on another port (can't
>>>>> guarantee
>>>>> sensible users).
>>>>
>>>> The use case is reasonable, but the implementation is not really.
>>>> You
>>>> are using Device Tree which is meant to describe hardware as a
>>>> policy
>>>> holder for setting up queue priorities and likewise for queue
>>>> scheduling.
>>>>
>>>> The tool that should be used for that purpose is tc and possibly an
>>>> appropriately offloaded queue scheduler in order to map the desired
>>>> scheduling class to what the hardware supports.
>>>>
>>>> Jiri, Ido, how do you guys support this with mlxsw?
>>>
>>> Hi Florian,
>>>
>>> Are you referring to policing traffic towards the CPU using a policer
>>> on
>>> the egress of the CPU port? At least that's what I understand from
>>> the
>>> description of patch 6 below.
>>>
>>> If so, mlxsw sets policers for different traffic types during its
>>> initialization sequence. These policers are not exposed to the user
>>> nor
>>> configurable. While the default settings are good for most users, we
>>> do
>>> want to allow users to change these and expose current settings.
>>>
>>> I agree that tc seems like the right choice, but the question is
>>> where
>>> are we going to install the filters?
>>>
>>
>> Before I go too far down the rabbit hole of tc traffic shaping, maybe
>> it would be good to explain in more detail the problem I am trying to
>> solve.
>>
>> We have a setup as follows:
>>
>> Marvell 88E6240 switch chip, accepting traffic from 4 ports. Port 1
>> (P1) is critical priority, no dropped packets allowed, all others can
>> be best effort.
>>
>> CPU port of swtich chip is connected via phy to phy of intel i210 (igb
>> driver).
>>
>> i210 is connected via pcie switch to imx6.
>>
>> When too many small packets attempt to be delivered to CPU port (e.g.
>> during broadcast flood) we saw dropped packets.
>>
>> The packets were being received by i210 in to rx descriptor buffer
>> fine, but the CPU could not keep up with the load. We saw
>> rx_fifo_errors increasing rapidly and ksoftirqd at ~100% CPU.
>>
>>
>> With this in mind, I am wondering whether any amount of tc traffic
>> shaping would help? Would tc shaping require that the packet reception
>> manages to keep up before it can enact its policies? Does the
>> infrastructure have accelerator offload hooks to be able to apply it
>> via HW? I dont see how it would be able to inspect the packets to apply
>> filtering if they were dropped due to rx descriptor exhaustion. (please
>> bear with me with the basic questions, I am not familiar with this part
>> of the stack).
>>
>> Assuming that tc is still the way to go, after a brief look in to the
>> man pages and the documentation at largc.org, it seems like it would
>> need to use the ingress qdisc, with some sort of system to segregate
>> and priortise based on ingress port. Is this possible?
> 
> Hi Robert,
> 
> As I see it, you have two problems here:
> 
> 1. Classification: Based on ingress port in your case
> 
> 2. Scheduling: How to schedule between the different transmission queues
> 
> Where the port from which the packets should egress is the CPU port,
> before they cross the PCI towards the imx6.
> 
> Both of these issues can be solved by tc. The main problem is that today
> we do not have a netdev to represent the CPU port and therefore can't
> use existing infra like tc. I believe we need to create one. Besides
> scheduling, we can also use it to permit/deny certain traffic from
> reaching the CPU and perform policing.

We do not necessarily have to create a CPU netdev, we can overlay netdev
operations onto the DSA master interface (fec in that case), and
whenever you configure the DSA master interface, we also call back into
the switch side for the CPU port. This is not necessarily the cleanest
way to do things, but that is how we support ethtool operations (and
some netdev operations incidentally), and it works
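
The overlay idea can be sketched in user-space C as follows. This is an
illustration under assumptions, not the net/dsa implementation: struct
master_dev, struct dev_ops and overlay_install() are hypothetical names,
and mirroring the rx/tx pause settings onto the CPU port is an assumed
policy chosen for the example. The sketch saves the master's original
ops, installs a wrapper, and has the wrapper call the original operation
before calling back into the switch side.

```c
#include <assert.h>
#include <stddef.h>

struct master_dev;

/* Hypothetical subset of the operations a master interface exposes. */
struct dev_ops {
	int (*set_pause)(struct master_dev *dev, int rx, int tx);
};

struct master_dev {
	const struct dev_ops *ops;      /* ops currently installed */
	const struct dev_ops *orig_ops; /* saved MAC driver ops */
	int mac_rx, mac_tx;             /* pause state of the host MAC */
	int cpu_rx, cpu_tx;             /* pause state of the switch CPU port */
};

/* Stand-in for the MAC driver's own operation. */
int mac_set_pause(struct master_dev *dev, int rx, int tx)
{
	dev->mac_rx = rx;
	dev->mac_tx = tx;
	return 0;
}

const struct dev_ops mac_ops = { .set_pause = mac_set_pause };

/* Wrapper: configure the MAC first, then the switch CPU port. */
int overlay_set_pause(struct master_dev *dev, int rx, int tx)
{
	int err;

	assert(dev->orig_ops != NULL);
	err = dev->orig_ops->set_pause(dev, rx, tx);
	if (err)
		return err;

	/* Assumed policy: the link partner mirrors the MAC, so what the
	 * MAC transmits the CPU port must accept, and vice versa. */
	dev->cpu_rx = tx;
	dev->cpu_tx = rx;
	return 0;
}

const struct dev_ops overlay_ops = { .set_pause = overlay_set_pause };

/* Save the original ops and route future calls through the wrapper. */
void overlay_install(struct master_dev *dev)
{
	dev->orig_ops = dev->ops;
	dev->ops = &overlay_ops;
}
```

After overlay_install(), a single set_pause call on the master updates
both the MAC state and the CPU port state, so no separate netdev is
needed for the CPU port in this model.
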
-- 
Florian


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-12 16:25         ` Florian Fainelli
@ 2019-09-12 16:46           ` Robert Beckett
  2019-09-12 17:41             ` Florian Fainelli
  0 siblings, 1 reply; 42+ messages in thread
From: Robert Beckett @ 2019-09-12 16:46 UTC (permalink / raw)
  To: Florian Fainelli, Ido Schimmel
  Cc: netdev, Andrew Lunn, Vivien Didelot, David S. Miller, Jiri Pirko,
	bob.beckett

On Thu, 2019-09-12 at 09:25 -0700, Florian Fainelli wrote:
> On 9/12/19 2:03 AM, Ido Schimmel wrote:
> > On Wed, Sep 11, 2019 at 12:49:03PM +0100, Robert Beckett wrote:
> > > On Wed, 2019-09-11 at 11:21 +0000, Ido Schimmel wrote:
> > > > On Tue, Sep 10, 2019 at 09:49:46AM -0700, Florian Fainelli
> > > > wrote:
> > > > > +Ido, Jiri,
> > > > > 
> > > > > On 9/10/19 8:41 AM, Robert Beckett wrote:
> > > > > > This patch-set adds support for some features of the
> > > > > > Marvell
> > > > > > switch
> > > > > > chips that can be used to handle packet storms.
> > > > > > 
> > > > > > The rationale for this was a setup that requires the
> > > > > > ability to
> > > > > > receive
> > > > > > traffic from one port, while a packet storm is occuring on
> > > > > > another port
> > > > > > (via an external switch with a deliberate loop). This is
> > > > > > needed
> > > > > > to
> > > > > > ensure vital data delivery from a specific port, while
> > > > > > mitigating
> > > > > > any
> > > > > > loops or DoS that a user may introduce on another port
> > > > > > (can't
> > > > > > guarantee
> > > > > > sensible users).
> > > > > 
> > > > > The use case is reasonable, but the implementation is not
> > > > > really.
> > > > > You
> > > > > are using Device Tree which is meant to describe hardware as
> > > > > a
> > > > > policy
> > > > > holder for setting up queue priorities and likewise for queue
> > > > > scheduling.
> > > > > 
> > > > > The tool that should be used for that purpose is tc and
> > > > > possibly an
> > > > > appropriately offloaded queue scheduler in order to map the
> > > > > desired
> > > > > scheduling class to what the hardware supports.
> > > > > 
> > > > > Jiri, Ido, how do you guys support this with mlxsw?
> > > > 
> > > > Hi Florian,
> > > > 
> > > > Are you referring to policing traffic towards the CPU using a
> > > > policer
> > > > on
> > > > the egress of the CPU port? At least that's what I understand
> > > > from
> > > > the
> > > > description of patch 6 below.
> > > > 
> > > > If so, mlxsw sets policers for different traffic types during
> > > > its
> > > > initialization sequence. These policers are not exposed to the
> > > > user
> > > > nor
> > > > configurable. While the default settings are good for most
> > > > users, we
> > > > do
> > > > want to allow users to change these and expose current
> > > > settings.
> > > > 
> > > > I agree that tc seems like the right choice, but the question
> > > > is
> > > > where
> > > > are we going to install the filters?
> > > > 
> > > 
> > > Before I go too far down the rabbit hole of tc traffic shaping,
> > > maybe
> > > it would be good to explain in more detail the problem I am
> > > trying to
> > > solve.
> > > 
> > > We have a setup as follows:
> > > 
> > > Marvell 88E6240 switch chip, accepting traffic from 4 ports. Port
> > > 1
> > > (P1) is critical priority, no dropped packets allowed, all others
> > > can
> > > be best effort.
> > > 
> > > CPU port of swtich chip is connected via phy to phy of intel i210
> > > (igb
> > > driver).
> > > 
> > > i210 is connected via pcie switch to imx6.
> > > 
> > > When too many small packets attempt to be delivered to CPU port
> > > (e.g.
> > > during broadcast flood) we saw dropped packets.
> > > 
> > > The packets were being received by i210 in to rx descriptor
> > > buffer
> > > fine, but the CPU could not keep up with the load. We saw
> > > rx_fifo_errors increasing rapidly and ksoftirqd at ~100% CPU.
> > > 
> > > 
> > > With this in mind, I am wondering whether any amount of tc
> > > traffic
> > > shaping would help? Would tc shaping require that the packet
> > > reception
> > > manages to keep up before it can enact its policies? Does the
> > > infrastructure have accelerator offload hooks to be able to apply
> > > it
> > > via HW? I dont see how it would be able to inspect the packets to
> > > apply
> > > filtering if they were dropped due to rx descriptor exhaustion.
> > > (please
> > > bear with me with the basic questions, I am not familiar with
> > > this part
> > > of the stack).
> > > 
> > > Assuming that tc is still the way to go, after a brief look in to
> > > the
> > > man pages and the documentation at largc.org, it seems like it
> > > would
> > > need to use the ingress qdisc, with some sort of system to
> > > segregate
> > > and priortise based on ingress port. Is this possible?
> > 
> > Hi Robert,
> > 
> > As I see it, you have two problems here:
> > 
> > 1. Classification: Based on ingress port in your case
> > 
> > 2. Scheduling: How to schedule between the different transmission
> > queues
> > 
> > Where the port from which the packets should egress is the CPU
> > port,
> > before they cross the PCI towards the imx6.
> > 
> > Both of these issues can be solved by tc. The main problem is that
> > today
> > we do not have a netdev to represent the CPU port and therefore
> > can't
> > use existing infra like tc. I believe we need to create one.
> > Besides
> > scheduling, we can also use it to permit/deny certain traffic from
> > reaching the CPU and perform policing.
> 
> We do not necessarily have to create a CPU netdev, we can overlay
> netdev
> operations onto the DSA master interface (fec in that case), and
> whenever you configure the DSA master interface, we also call back
> into
> the switch side for the CPU port. This is not necessarily the
> cleanest
> way to do things, but that is how we support ethtool operations (and
> some netdev operations incidentally), and it works

After reading up on tc, I am not sure how this would work given the
semantics of the tool currently.

My initial thought was to model the switch's 4 output queues using an
mqprio qdisc for the CPU port, and then use either iptables' classify
module on the input ports to set which queue traffic egresses from on
the CPU port, or use vlan tagging with id 0 and priority set (with the
many details of how to implement them still left to discover).

However, it looks like the mqprio qdisc could only be used for egress,
so without a netdev representing the CPU port, I don't know how it could
be used.

Another thing I thought of using was just to use iptables' TOS module
to set the minimal delay bit and rely on default behaviours, but I've
yet to find anything in the Marvell manual that indicates it could set
that bit on all frames entering a port.

Another option might be to use vlans with their priority bits being
used to steer to output queues, but I really don't want to introduce
more virtual interfaces in to the setup, and I can't see how to
configure an enforced default vlan tag with id 0 and priority bits set
via linux userland tools.


It does look like tc would be quite nice for configuring the egress
rate limiting, assuming we have a netdev to target with the rate
controls of the qdisc.


So far, this seems like I am trying to shoehorn this stuff in to tc.
It seems like tc is meant to configure how the ip stack handles
flow within the stack, whereas in a switch chip, the packets go nowhere
near the CPU's kernel ip stack. I can't help thinking that it would be
good to have a specific utility for configuring switches that operates
on the port level to manage flow within the chip, or maybe simple sysfs
attributes to set the port's priority.



* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-12 16:46           ` Robert Beckett
@ 2019-09-12 17:41             ` Florian Fainelli
  2019-09-13 12:47               ` Robert Beckett
  0 siblings, 1 reply; 42+ messages in thread
From: Florian Fainelli @ 2019-09-12 17:41 UTC (permalink / raw)
  To: Robert Beckett, Ido Schimmel
  Cc: netdev, Andrew Lunn, Vivien Didelot, David S. Miller, Jiri Pirko

On 9/12/19 9:46 AM, Robert Beckett wrote:
> On Thu, 2019-09-12 at 09:25 -0700, Florian Fainelli wrote:
>> On 9/12/19 2:03 AM, Ido Schimmel wrote:
>>> On Wed, Sep 11, 2019 at 12:49:03PM +0100, Robert Beckett wrote:
>>>> On Wed, 2019-09-11 at 11:21 +0000, Ido Schimmel wrote:
>>>>> On Tue, Sep 10, 2019 at 09:49:46AM -0700, Florian Fainelli
>>>>> wrote:
>>>>>> +Ido, Jiri,
>>>>>>
>>>>>> On 9/10/19 8:41 AM, Robert Beckett wrote:
>>>>>>> This patch-set adds support for some features of the
>>>>>>> Marvell
>>>>>>> switch
>>>>>>> chips that can be used to handle packet storms.
>>>>>>>
>>>>>>> The rationale for this was a setup that requires the
>>>>>>> ability to
>>>>>>> receive
>>>>>>> traffic from one port, while a packet storm is occuring on
>>>>>>> another port
>>>>>>> (via an external switch with a deliberate loop). This is
>>>>>>> needed
>>>>>>> to
>>>>>>> ensure vital data delivery from a specific port, while
>>>>>>> mitigating
>>>>>>> any
>>>>>>> loops or DoS that a user may introduce on another port
>>>>>>> (can't
>>>>>>> guarantee
>>>>>>> sensible users).
>>>>>>
>>>>>> The use case is reasonable, but the implementation is not
>>>>>> really.
>>>>>> You
>>>>>> are using Device Tree which is meant to describe hardware as
>>>>>> a
>>>>>> policy
>>>>>> holder for setting up queue priorities and likewise for queue
>>>>>> scheduling.
>>>>>>
>>>>>> The tool that should be used for that purpose is tc and
>>>>>> possibly an
>>>>>> appropriately offloaded queue scheduler in order to map the
>>>>>> desired
>>>>>> scheduling class to what the hardware supports.
>>>>>>
>>>>>> Jiri, Ido, how do you guys support this with mlxsw?
>>>>>
>>>>> Hi Florian,
>>>>>
>>>>> Are you referring to policing traffic towards the CPU using a
>>>>> policer
>>>>> on
>>>>> the egress of the CPU port? At least that's what I understand
>>>>> from
>>>>> the
>>>>> description of patch 6 below.
>>>>>
>>>>> If so, mlxsw sets policers for different traffic types during
>>>>> its
>>>>> initialization sequence. These policers are not exposed to the
>>>>> user
>>>>> nor
>>>>> configurable. While the default settings are good for most
>>>>> users, we
>>>>> do
>>>>> want to allow users to change these and expose current
>>>>> settings.
>>>>>
>>>>> I agree that tc seems like the right choice, but the question
>>>>> is
>>>>> where
>>>>> are we going to install the filters?
>>>>>
>>>>
>>>> Before I go too far down the rabbit hole of tc traffic shaping,
>>>> maybe
>>>> it would be good to explain in more detail the problem I am
>>>> trying to
>>>> solve.
>>>>
>>>> We have a setup as follows:
>>>>
>>>> Marvell 88E6240 switch chip, accepting traffic from 4 ports. Port
>>>> 1
>>>> (P1) is critical priority, no dropped packets allowed, all others
>>>> can
>>>> be best effort.
>>>>
>>>> CPU port of swtich chip is connected via phy to phy of intel i210
>>>> (igb
>>>> driver).
>>>>
>>>> i210 is connected via pcie switch to imx6.
>>>>
>>>> When too many small packets attempt to be delivered to CPU port
>>>> (e.g.
>>>> during broadcast flood) we saw dropped packets.
>>>>
>>>> The packets were being received by i210 in to rx descriptor
>>>> buffer
>>>> fine, but the CPU could not keep up with the load. We saw
>>>> rx_fifo_errors increasing rapidly and ksoftirqd at ~100% CPU.
>>>>
>>>>
>>>> With this in mind, I am wondering whether any amount of tc
>>>> traffic
>>>> shaping would help? Would tc shaping require that the packet
>>>> reception
>>>> manages to keep up before it can enact its policies? Does the
>>>> infrastructure have accelerator offload hooks to be able to apply
>>>> it
>>>> via HW? I dont see how it would be able to inspect the packets to
>>>> apply
>>>> filtering if they were dropped due to rx descriptor exhaustion.
>>>> (please
>>>> bear with me with the basic questions, I am not familiar with
>>>> this part
>>>> of the stack).
>>>>
>>>> Assuming that tc is still the way to go, after a brief look in to
>>>> the
>>>> man pages and the documentation at largc.org, it seems like it
>>>> would
>>>> need to use the ingress qdisc, with some sort of system to
>>>> segregate
>>>> and priortise based on ingress port. Is this possible?
>>>
>>> Hi Robert,
>>>
>>> As I see it, you have two problems here:
>>>
>>> 1. Classification: Based on ingress port in your case
>>>
>>> 2. Scheduling: How to schedule between the different transmission
>>> queues
>>>
>>> Where the port from which the packets should egress is the CPU
>>> port,
>>> before they cross the PCI towards the imx6.
>>>
>>> Both of these issues can be solved by tc. The main problem is that
>>> today
>>> we do not have a netdev to represent the CPU port and therefore
>>> can't
>>> use existing infra like tc. I believe we need to create one.
>>> Besides
>>> scheduling, we can also use it to permit/deny certain traffic from
>>> reaching the CPU and perform policing.
>>
>> We do not necessarily have to create a CPU netdev, we can overlay
>> netdev
>> operations onto the DSA master interface (fec in that case), and
>> whenever you configure the DSA master interface, we also call back
>> into
>> the switch side for the CPU port. This is not necessarily the
>> cleanest
>> way to do things, but that is how we support ethtool operations (and
>> some netdev operations incidentally), and it works
> 
> After reading up on tc, I am not sure how this would work given the
> semantics of the tool currently.
> 
> My initial thought was to model the switch's 4 output queues using an
> mqprio qdisc for the CPU port, and then use either iptables's classify
> module on the input ports to set which queue it egresses from on the
> CPU port, or use vlan tagging with id 0 and priority set. (with the
> many detail of how to implement them still left to discover).
> 
> However, it looks like the mqprio qdisc could only be used for egress,
> so without a netdev representing the CPU port, I dont know how it could
> be used.

If you are looking at mapping your DSA master/CPU port egress queues to
actual switch egress queues, you can look at what bcm_sf2.c and
bcmsysport.c do and read the commit messages that introduced the mapping
functionality for background on why this was done. In a nutshell, the
hardware has the ability to back pressure the Ethernet MAC behind the
CPU port in order to automatically rate limit the egress out of the
switch. So for instance, if your CPU tries to send 1Gb/sec of traffic to
a port that is linked to a link partner at 100Mbits/sec, there is out of
band information between the switch and the Ethernet DMA of the CPU port
to pace the TX completion interrupt rate to match 100Mbits/sec.

This is going to be different for you here obviously because the
hardware has not been specifically designed for that, so you do need to
rely on more standard constructs, like actual egress QoS on both ends.

> 
> Another thing I thought of using was just to use iptable's TOS module
> to set the minimal delay bit and rely on default behaviours, but Ive
> yet to find anything in the Marvell manual that indicates it could set
> that bit on all frames entering a port.
> 
> Another option might be to use vlans with their priority bits being
> used to steer to output queues, but I really dont want to introduce
> more virtual interfaces in to the setup, and I cant see how to
> configure an enforce default vlan tag with id 0 and priority bits set
> via linux userland tools.
> 
> 
> It does look like tc would be quite nice for configuring the egress
> rate limiting assuming we a netdev to target with the rate controls of
> the qdisc.
> 
> 
> So far, this seems like I am trying to shoe horn this stuff in to tc.
> It seems like tc is meant to configure how the ip stack  configures
> flow within the stack, whereas in a switch chip, the packets go nowhere
> near the CPUs kernel ip stack. I cant help thinking that it would be
> good have a specific utility for configuring switches that operates on
> the port level for manage flow within the chip, or maybe simple sysfs
> attributes to set the ports priority.

I am not looking at tc the same way you are: tc is just the tool
to configure all QoS/ingress/egress related operations on a network
device. Whether that network device can offload some of the TC
operations or not is where things get interesting.

TC has ingress filtering support, which is what you could use for
offloading broadcast storms. I would imagine that the following should
be possible to offload (this is not a working command but you get
the idea):

tc qdisc add dev sw0p0 handle ffff: ingress
tc filter add dev sw0p0 parent ffff: protocol ip prio 1 u32 match ether \
	src 0xfffffffffffff police rate 100k burst 10k skip_sw

Something along those lines is how I would implement ingress rate
limiting leveraging what the switch could do. This might mean adding
support for offloading specific TC filters. Jiri and Ido can certainly
suggest a cleverer way of achieving that same functionality.
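
To make the "police rate ... burst ..." idea above concrete, here is a
minimal token-bucket sketch in C. It is only an illustration of the general
technique, not what tc or any switch actually implements; the struct,
the function names and the choice of units are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policer state; all names here are illustrative only. */
struct policer {
	uint64_t rate;    /* refill rate, in units per second */
	uint64_t burst;   /* bucket depth, in the same units */
	uint64_t tokens;  /* tokens currently available */
	uint64_t last_ns; /* time of the last successful refill */
};

static void policer_init(struct policer *p, uint64_t rate, uint64_t burst,
			 uint64_t now_ns)
{
	assert(p && rate > 0 && burst > 0);
	p->rate = rate;
	p->burst = burst;
	p->tokens = burst; /* start full so an initial burst is allowed */
	p->last_ns = now_ns;
}

/*
 * Returns true if a frame costing 'cost' units conforms to the rate,
 * false if it exceeds it (and would be dropped by a policer).
 * Sketch only: the multiplication can overflow for very long idle
 * periods or very high rates, and fractional tokens are discarded.
 */
static bool policer_conform(struct policer *p, uint64_t cost, uint64_t now_ns)
{
	uint64_t elapsed_ns = now_ns - p->last_ns;
	uint64_t refill = elapsed_ns * p->rate / 1000000000ULL;

	if (refill > 0) {
		p->tokens += refill;
		if (p->tokens > p->burst)
			p->tokens = p->burst; /* never exceed the burst size */
		p->last_ns = now_ns;
	}

	if (p->tokens < cost)
		return false;

	p->tokens -= cost;
	return true;
}
```

Passing the frame length as 'cost' gives a byte-based limit; passing 1 per
frame would turn the same bucket into a frames-per-second limit.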
-- 
Florian


* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
  2019-09-12 17:41             ` Florian Fainelli
@ 2019-09-13 12:47               ` Robert Beckett
  0 siblings, 0 replies; 42+ messages in thread
From: Robert Beckett @ 2019-09-13 12:47 UTC (permalink / raw)
  To: Florian Fainelli, Ido Schimmel
  Cc: netdev, Andrew Lunn, Vivien Didelot, David S. Miller, Jiri Pirko,
	bob.beckett

On Thu, 2019-09-12 at 10:41 -0700, Florian Fainelli wrote:
> On 9/12/19 9:46 AM, Robert Beckett wrote:
> > On Thu, 2019-09-12 at 09:25 -0700, Florian Fainelli wrote:
> > > On 9/12/19 2:03 AM, Ido Schimmel wrote:
> > > > On Wed, Sep 11, 2019 at 12:49:03PM +0100, Robert Beckett wrote:
> > > > > On Wed, 2019-09-11 at 11:21 +0000, Ido Schimmel wrote:
> > > > > > On Tue, Sep 10, 2019 at 09:49:46AM -0700, Florian Fainelli
> > > > > > wrote:
> > > > > > > +Ido, Jiri,
> > > > > > > 
> > > > > > > On 9/10/19 8:41 AM, Robert Beckett wrote:
> > > > > > > > This patch-set adds support for some features of the
> > > > > > > > Marvell
> > > > > > > > switch
> > > > > > > > chips that can be used to handle packet storms.
> > > > > > > > 
> > > > > > > > The rationale for this was a setup that requires the
> > > > > > > > ability to
> > > > > > > > receive
> > > > > > > > traffic from one port, while a packet storm is occuring
> > > > > > > > on
> > > > > > > > another port
> > > > > > > > (via an external switch with a deliberate loop). This
> > > > > > > > is
> > > > > > > > needed
> > > > > > > > to
> > > > > > > > ensure vital data delivery from a specific port, while
> > > > > > > > mitigating
> > > > > > > > any
> > > > > > > > loops or DoS that a user may introduce on another port
> > > > > > > > (can't
> > > > > > > > guarantee
> > > > > > > > sensible users).
> > > > > > > 
> > > > > > > The use case is reasonable, but the implementation is not
> > > > > > > really.
> > > > > > > You
> > > > > > > are using Device Tree which is meant to describe hardware
> > > > > > > as
> > > > > > > a
> > > > > > > policy
> > > > > > > holder for setting up queue priorities and likewise for
> > > > > > > queue
> > > > > > > scheduling.
> > > > > > > 
> > > > > > > The tool that should be used for that purpose is tc and
> > > > > > > possibly an
> > > > > > > appropriately offloaded queue scheduler in order to map
> > > > > > > the
> > > > > > > desired
> > > > > > > scheduling class to what the hardware supports.
> > > > > > > 
> > > > > > > Jiri, Ido, how do you guys support this with mlxsw?
> > > > > > 
> > > > > > Hi Florian,
> > > > > > 
> > > > > > Are you referring to policing traffic towards the CPU using
> > > > > > a
> > > > > > policer
> > > > > > on
> > > > > > the egress of the CPU port? At least that's what I
> > > > > > understand
> > > > > > from
> > > > > > the
> > > > > > description of patch 6 below.
> > > > > > 
> > > > > > If so, mlxsw sets policers for different traffic types
> > > > > > during
> > > > > > its
> > > > > > initialization sequence. These policers are not exposed to
> > > > > > the
> > > > > > user
> > > > > > nor
> > > > > > configurable. While the default settings are good for most
> > > > > > users, we
> > > > > > do
> > > > > > want to allow users to change these and expose current
> > > > > > settings.
> > > > > > 
> > > > > > I agree that tc seems like the right choice, but the
> > > > > > question
> > > > > > is
> > > > > > where
> > > > > > are we going to install the filters?
> > > > > > 
> > > > > 
> > > > > Before I go too far down the rabbit hole of tc traffic
> > > > > shaping,
> > > > > maybe
> > > > > it would be good to explain in more detail the problem I am
> > > > > trying to
> > > > > solve.
> > > > > 
> > > > > We have a setup as follows:
> > > > > 
> > > > > Marvell 88E6240 switch chip, accepting traffic from 4 ports.
> > > > > Port
> > > > > 1
> > > > > (P1) is critical priority, no dropped packets allowed, all
> > > > > others
> > > > > can
> > > > > be best effort.
> > > > > 
> > > > > CPU port of swtich chip is connected via phy to phy of intel
> > > > > i210
> > > > > (igb
> > > > > driver).
> > > > > 
> > > > > i210 is connected via pcie switch to imx6.
> > > > > 
> > > > > When too many small packets attempt to be delivered to CPU
> > > > > port
> > > > > (e.g.
> > > > > during broadcast flood) we saw dropped packets.
> > > > > 
> > > > > The packets were being received by i210 in to rx descriptor
> > > > > buffer
> > > > > fine, but the CPU could not keep up with the load. We saw
> > > > > rx_fifo_errors increasing rapidly and ksoftirqd at ~100% CPU.
> > > > > 
> > > > > 
> > > > > With this in mind, I am wondering whether any amount of tc
> > > > > traffic
> > > > > shaping would help? Would tc shaping require that the packet
> > > > > reception
> > > > > manages to keep up before it can enact its policies? Does the
> > > > > infrastructure have accelerator offload hooks to be able to
> > > > > apply
> > > > > it
> > > > > via HW? I dont see how it would be able to inspect the
> > > > > packets to
> > > > > apply
> > > > > filtering if they were dropped due to rx descriptor
> > > > > exhaustion.
> > > > > (please
> > > > > bear with me with the basic questions, I am not familiar with
> > > > > this part
> > > > > of the stack).
> > > > > 
> > > > > Assuming that tc is still the way to go, after a brief look
> > > > > in to
> > > > > the
> > > > > man pages and the documentation at largc.org, it seems like
> > > > > it
> > > > > would
> > > > > need to use the ingress qdisc, with some sort of system to
> > > > > segregate
> > > > > and priortise based on ingress port. Is this possible?
> > > > 
> > > > Hi Robert,
> > > > 
> > > > As I see it, you have two problems here:
> > > > 
> > > > 1. Classification: Based on ingress port in your case
> > > > 
> > > > 2. Scheduling: How to schedule between the different
> > > > transmission
> > > > queues
> > > > 
> > > > Where the port from which the packets should egress is the CPU
> > > > port,
> > > > before they cross the PCI towards the imx6.
> > > > 
> > > > Both of these issues can be solved by tc. The main problem is
> > > > that
> > > > today
> > > > we do not have a netdev to represent the CPU port and therefore
> > > > can't
> > > > use existing infra like tc. I believe we need to create one.
> > > > Besides
> > > > scheduling, we can also use it to permit/deny certain traffic
> > > > from
> > > > reaching the CPU and perform policing.
> > > 
> > > We do not necessarily have to create a CPU netdev, we can overlay
> > > netdev
> > > operations onto the DSA master interface (fec in that case), and
> > > whenever you configure the DSA master interface, we also call
> > > back
> > > into
> > > the switch side for the CPU port. This is not necessarily the
> > > cleanest
> > > way to do things, but that is how we support ethtool operations
> > > (and
> > > some netdev operations incidentally), and it works
> > 
> > After reading up on tc, I am not sure how this would work given the
> > semantics of the tool currently.
> > 
> > My initial thought was to model the switch's 4 output queues using
> > an
> > mqprio qdisc for the CPU port, and then use either iptables's
> > classify
> > module on the input ports to set which queue it egresses from on
> > the
> > CPU port, or use vlan tagging with id 0 and priority set. (with the
> > many detail of how to implement them still left to discover).
> > 
> > However, it looks like the mqprio qdisc could only be used for
> > egress,
> > so without a netdev representing the CPU port, I dont know how it
> > could
> > be used.
> 
> If you are looking at mapping your DSA master/CPU port egress queues
> to
> actual switch egress queues, you can look at what bcm_sf2.c and
> bcmsysport.c do and read the commit messages that introduced the
> mapping
> functionality for background on why this was done. In a nutshell, the
> hardware has the ability to back pressure the Ethernet MAC behind the
> CPU port in order to automatically rate limit the egress out of the
> switch. So for instance, if your CPU tries to send 1Gb/sec of traffic
> to
> a port that is linked to a link partner at 100Mbits/sec, there is out
> of
> band information between the switch and the Ethernet DMA of the CPU
> port
> to pace the TX completion interrupt rate to match 100Mbits/sec.
> 
> This is going to be different for you here obviously because the
> hardware has not been specifically designed for that, so you do need
> to
> rely on more standard constructs, like actual egress QoS on both
> ends.
> 
> > 
> > Another thing I thought of using was just to use iptable's TOS
> > module
> > to set the minimal delay bit and rely on default behaviours, but
> > Ive
> > yet to find anything in the Marvell manual that indicates it could
> > set
> > that bit on all frames entering a port.
> > 
> > Another option might be to use vlans with their priority bits being
> > used to steer to output queues, but I really dont want to introduce
> > more virtual interfaces in to the setup, and I cant see how to
> > configure an enforce default vlan tag with id 0 and priority bits
> > set
> > via linux userland tools.
> > 
> > 
> > It does look like tc would be quite nice for configuring the egress
> > rate limiting assuming we a netdev to target with the rate controls
> > of
> > the qdisc.
> > 
> > 
> > So far, this seems like I am trying to shoe horn this stuff in to
> > tc.
> > It seems like tc is meant to configure how the ip stack  configures
> > flow within the stack, whereas in a switch chip, the packets go
> > nowhere
> > near the CPUs kernel ip stack. I cant help thinking that it would
> > be
> > good have a specific utility for configuring switches that operates
> > on
> > the port level for manage flow within the chip, or maybe simple
> > sysfs
> > attributes to set the ports priority.
> 
> I am not looking at tc the same way you are doing, tc is just the
> tool
> to configure all QoS/ingress/egress related operations on a network
> device. Whether that network device can offload some of the TC
> operations or not is where things get interesting.
> 
> TC has ingress filtering support, which is what you could use for
> offloading broadcast storms, I would imagine that the following
> should
> be possible to be offloaded (this is not a working command but you
> get
> the idea):
> 
> tc qdisc add dev sw0p0 handle ffff: ingress
> tc filter add dev sw0p0 parent ffff: protocol ip prio 1 u32 match
> ether
> src 0xfffffffffffff police rate 100k burst 10k skip_sw
> 
> something along those lines is how I would implement ingress rate
> limiting leveraging what the switch could do. This might mean adding
> support for offloading specific TC filters, Jiri and Ido can
> certainly
> suggest a cleverer way of achieving that same functionality.

Thanks for your thoughts on this, it's been very helpful in learning the
stack and coming up with ideas for a better design.

I wrote up a set of high level options for discussions internally, and
would appreciate any feedback you had on them:

To get this upstreamed, I think we will need something like the
following high level design:

1. Handle egress rate limiting
	1.1. Add frames per second as a rate metric usable throughout
	     tc and associated kernel interfaces.
	1.2. Handle any changes required to make a command like this
	     work:

tc qdisc add dev enp4s0 handle ffff: ingress
tc filter add dev enp4s0 parent ffff: protocol ip prio 1 u32 match \
	ether src 0xfffffffffffffffff police rate 50kfrm burst 10kfrm

	     This should mostly already work as a valid command, maybe
	     some changes for handling the new frame based rates.

	1.3. Add tc bindings in dsa driver that hook in to the parent
	     netdev's tc bindings (similar to how the ethtool bindings
	     hook in to the parent's ethtool bindings) to setup the HW
	     egress rate limiting as done in the existing patch.

2. Add ability to set output queue scheduling algorithm
Currently no netdev is created for the CPU port, so it can't be
targeted by tc or any of the other userland utilities. We need to be
able to configure settings for that port. Currently I can think of 2
options:

	2.1. Add a netdev to represent the CPU port. This will likely
	     face objections from some people upstream, though it has
	     already been suggested as a way to handle this by others
	     upstream.
	     This would likely require a lot of effort and learning to
	     figure out how to do this in a way that doesn't start to
	     break key assumptions with the rest of dsa (like the CPU
	     port not having its own IP address).
	     If this were achieved, we could then do one of the
	     following:
		2.1.1. Use the mqprio qdisc to model the 4 output
		       queues with a new parameter to select the
		       scheduling mode.
		2.1.2. Add an ethtool priv settings capability (similar
		       to priv flags) that configures the scheduling
		       mode. This would be my preferred method as it
		       allows port prioritization irrespective of
		       linux's qdisc priorities, which seems to model
		       the HW better.

	2.2. Add tc bindings similar to 1.3 above, which allow us to
	     define a new ingress qdisc parameter for scheduling mode,
	     which the CPU port code can see due to hooking in to the
	     parent device's tc bindings.
	     This might be the simplest approach to implement, but
	     feels a bit hinky w.r.t. the semantics of ingress qdisc as
	     we are actually specifying the scheduling of output queues
	     for its link partner.

3. Add ability to set default queue priority of incoming packets on a
port.
This could be done as a new parameter for tc's ingress qdisc. I would
suggest that it should specify an 802.1p priority number (e.g. "hwprio
3" would specify all traffic ingressing should be considered critical)
as this neatly lines up with the 8 priority levels used in mqprio and
could be extended further to allow/disallow priority setting per frame
from 802.1q tags (e.g. "hwprio 3 notag" or something similar).
This might balloon in required effort, particularly if we have to
handle the non-HW-offload path, requiring us to figure out and
implement priority tagging within the kernel's buffers. This would
likely be able to reuse a lot of the infrastructure in place for 802.1q
tagging that currently exists within the kernel, though I've not looked
in to those code paths to estimate difficulty.
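
As a rough illustration of options 2 and 3 above, the following C sketch
maps an 802.1p priority onto one of 4 egress queues and picks the next
queue to serve in either strict-priority or weighted round-robin mode.
It is a toy model under stated assumptions, not the mv88e6xxx behaviour:
the priority-to-queue mapping (priority >> 1), the 8:4:2:1 weights and
every identifier are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_QUEUES 4 /* the 4 output queues discussed in the thread */
#define NUM_PRIOS  8 /* 802.1p priority is a 3-bit field */

enum qsched_mode { QSCHED_STRICT, QSCHED_WRR };

/* Hypothetical per-port egress state. */
struct egress_port {
	enum qsched_mode mode;
	unsigned int depth[NUM_QUEUES];  /* frames waiting in each queue */
	unsigned int credit[NUM_QUEUES]; /* WRR credit left in this round */
};

/* Assumed weights, lowest queue first; real hardware may differ. */
static const unsigned int wrr_weight[NUM_QUEUES] = { 1, 2, 4, 8 };

/* Assumed mapping: two 802.1p priorities share each queue. */
static unsigned int pcp_to_queue(unsigned int pcp)
{
	assert(pcp < NUM_PRIOS);
	return pcp >> 1;
}

/* Dequeue one frame; returns the queue served, or -1 if all are empty. */
static int egress_pick_queue(struct egress_port *port)
{
	int q, pass;

	if (port->mode == QSCHED_STRICT) {
		/* Always drain the highest non-empty queue first. */
		for (q = NUM_QUEUES - 1; q >= 0; q--) {
			if (port->depth[q]) {
				port->depth[q]--;
				return q;
			}
		}
		return -1;
	}

	/* WRR: serve queues that still hold credit, else start a new round. */
	for (pass = 0; pass < 2; pass++) {
		for (q = NUM_QUEUES - 1; q >= 0; q--) {
			if (port->depth[q] && port->credit[q]) {
				port->depth[q]--;
				port->credit[q]--;
				return q;
			}
		}
		for (q = 0; q < NUM_QUEUES; q++)
			port->credit[q] = wrr_weight[q];
	}
	return -1;
}
```

In strict mode a flooded high-priority queue starves the others, whereas
in the weighted mode the lowest queue still gets a share of each round,
which is the trade-off a "scheduling mode" parameter would expose.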





* Re: [PATCH 1/7] net/dsa: configure autoneg for CPU port
  2019-09-10 15:41 ` [PATCH 1/7] net/dsa: configure autoneg for CPU port Robert Beckett
                     ` (3 preceding siblings ...)
  2019-09-11 11:43   ` kbuild test robot
@ 2019-09-14  7:16   ` kbuild test robot
  4 siblings, 0 replies; 42+ messages in thread
From: kbuild test robot @ 2019-09-14  7:16 UTC (permalink / raw)
  To: Robert Beckett
  Cc: kbuild-all, netdev, Robert Beckett, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, David S. Miller

Hi Robert,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[cannot apply to v5.3-rc8 next-20190904]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Robert-Beckett/net-dsa-mv88e6xxx-features-to-handle-network-storms/20190911-142233
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.1-rc1-7-g2b96cd8-dirty
        make ARCH=x86_64 allmodconfig
        make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)

   include/linux/sched.h:609:43: sparse: sparse: bad integer constant expression
   include/linux/sched.h:609:73: sparse: sparse: invalid named zero-width bitfield `value'
   include/linux/sched.h:610:43: sparse: sparse: bad integer constant expression
   include/linux/sched.h:610:67: sparse: sparse: invalid named zero-width bitfield `bucket_id'
>> net/dsa/port.c:541:55: sparse: sparse: incompatible types for operation (|)
>> net/dsa/port.c:541:55: sparse:    left side has type unsigned long *
>> net/dsa/port.c:541:55: sparse:    right side has type unsigned long

vim +541 net/dsa/port.c

   525	
   526	static int dsa_port_setup_phy_of(struct dsa_port *dp, bool enable)
   527	{
   528		struct dsa_switch *ds = dp->ds;
   529		struct phy_device *phydev;
   530		int port = dp->index;
   531		int err = 0;
   532	
   533		phydev = dsa_port_get_phy_device(dp);
   534		if (!phydev)
   535			return 0;
   536	
   537		if (IS_ERR(phydev))
   538			return PTR_ERR(phydev);
   539	
   540		if (enable) {
 > 541			phydev->supported = PHY_GBIT_FEATURES | SUPPORTED_MII |
   542					    SUPPORTED_AUI | SUPPORTED_FIBRE |
   543					    SUPPORTED_BNC | SUPPORTED_Pause |
   544					    SUPPORTED_Asym_Pause;
   545			phydev->advertising = phydev->supported;
   546	
   547			err = genphy_config_init(phydev);
   548			if (err < 0)
   549				goto err_put_dev;
   550	
   551			err = genphy_config_aneg(phydev);
   552			if (err < 0)
   553				goto err_put_dev;
   554	
   555			err = genphy_resume(phydev);
   556			if (err < 0)
   557				goto err_put_dev;
   558	
   559			err = genphy_read_status(phydev);
   560			if (err < 0)
   561				goto err_put_dev;
   562		} else {
   563			err = genphy_suspend(phydev);
   564			if (err < 0)
   565				goto err_put_dev;
   566		}
   567	
   568		if (ds->ops->adjust_link)
   569			ds->ops->adjust_link(ds, port, phydev);
   570	
   571		dev_dbg(ds->dev, "enabled port's phy: %s", phydev_name(phydev));
   572	
   573	err_put_dev:
   574		put_device(&phydev->mdio.dev);
   575		return err;
   576	}
   577	
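
The warning at line 541 suggests that, in the tree being tested,
PHY_GBIT_FEATURES and phydev->supported are multi-word link-mode bitmaps
(unsigned long arrays) rather than single words, so they cannot be
combined with '|' or assigned directly. The userspace sketch below shows
the copy-then-set-bits style that such a bitmap needs. Every demo_* name,
the bit numbers and the bitmap size are hypothetical stand-ins; in the
kernel the analogous helpers would presumably be linkmode_copy() and
linkmode_set_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for a link-mode bitmap wider than one word. */
#define DEMO_MODE_COUNT 70
#define BITS_PER_WORD   (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS      ((DEMO_MODE_COUNT + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Illustrative bit numbers only. */
enum demo_mode_bit {
	DEMO_MODE_PAUSE = 13,
	DEMO_MODE_ASYM_PAUSE = 14,
};

struct demo_phy {
	unsigned long supported[DEMO_WORDS];
	unsigned long advertising[DEMO_WORDS];
};

/* Set one bit in a multi-word bitmap. */
static void demo_set_bit(unsigned int nr, unsigned long *map)
{
	assert(nr < DEMO_MODE_COUNT);
	map[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

/* Test one bit in a multi-word bitmap. */
static bool demo_test_bit(unsigned int nr, const unsigned long *map)
{
	assert(nr < DEMO_MODE_COUNT);
	return (map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}

/* Copy a whole bitmap; plain assignment is not possible for arrays. */
static void demo_copy(unsigned long *dst, const unsigned long *src)
{
	memcpy(dst, src, DEMO_WORDS * sizeof(unsigned long));
}

/* Start from a base feature set, then add the pause bits one by one. */
static void demo_enable_pause(struct demo_phy *phy, const unsigned long *base)
{
	demo_copy(phy->supported, base);
	demo_set_bit(DEMO_MODE_PAUSE, phy->supported);
	demo_set_bit(DEMO_MODE_ASYM_PAUSE, phy->supported);
	demo_copy(phy->advertising, phy->supported);
}
```

The point of the sketch is only that each extra mode is added with a
per-bit helper on the destination bitmap instead of OR-ing constants
into a pointer.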

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


Thread overview: 42+ messages
2019-09-10 15:41 [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Robert Beckett
2019-09-10 15:41 ` [PATCH 1/7] net/dsa: configure autoneg for CPU port Robert Beckett
2019-09-10 16:14   ` Vivien Didelot
2019-09-10 16:56   ` Florian Fainelli
2019-09-10 18:26   ` Andrew Lunn
2019-09-10 18:29     ` Florian Fainelli
2019-09-11  9:16       ` Robert Beckett
2019-09-11  9:54         ` Robert Beckett
2019-09-11 22:52         ` Andrew Lunn
2019-09-12 10:14           ` Robert Beckett
2019-09-12 10:43             ` Andrew Lunn
2019-09-11 11:43   ` kbuild test robot
2019-09-14  7:16   ` kbuild test robot
2019-09-10 15:41 ` [PATCH 2/7] net: dsa: mv88e6xxx: add ability to set default queue priorities per port Robert Beckett
2019-09-10 16:43   ` Vivien Didelot
2019-09-10 15:41 ` [PATCH 3/7] dt-bindings: " Robert Beckett
2019-09-10 16:42   ` Florian Fainelli
2019-09-10 16:49     ` Vivien Didelot
2019-09-10 20:46       ` Vladimir Oltean
2019-09-10 15:41 ` [PATCH 4/7] net: dsa: mv88e6xxx: add ability to set queue scheduling Robert Beckett
2019-09-10 17:18   ` Vivien Didelot
2019-09-10 15:41 ` [PATCH 5/7] dt-bindings: " Robert Beckett
2019-09-10 15:41 ` [PATCH 6/7] net: dsa: mv88e6xxx: add egress rate limiting Robert Beckett
2019-09-10 17:13   ` Vivien Didelot
2019-09-11 12:26   ` kbuild test robot
2019-09-10 15:41 ` [PATCH 7/7] dt-bindings: " Robert Beckett
2019-09-10 16:49 ` [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Florian Fainelli
2019-09-11  9:43   ` Robert Beckett
2019-09-11 11:21   ` Ido Schimmel
2019-09-11 11:49     ` Robert Beckett
2019-09-11 22:58       ` Andrew Lunn
2019-09-12  9:05         ` Ido Schimmel
2019-09-12  9:03       ` Ido Schimmel
2019-09-12  9:21         ` Andrew Lunn
2019-09-12 16:25         ` Florian Fainelli
2019-09-12 16:46           ` Robert Beckett
2019-09-12 17:41             ` Florian Fainelli
2019-09-13 12:47               ` Robert Beckett
2019-09-10 17:19 ` Vivien Didelot
2019-09-11  9:46   ` Robert Beckett
2019-09-11 15:31     ` Vivien Didelot
2019-09-11 23:01     ` Andrew Lunn
