netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/3] Configure the MTU on DSA switches
@ 2019-11-23 19:48 Vladimir Oltean
  2019-11-23 19:48 ` [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports Vladimir Oltean
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Vladimir Oltean @ 2019-11-23 19:48 UTC (permalink / raw)
  To: andrew, f.fainelli, vivien.didelot, davem, jakub.kicinski
  Cc: netdev, Vladimir Oltean

This small series adds support for configuring the MTU on front-panel
switch ports, while seamlessly adapting the CPU port and the DSA master
to the largest value plus the tagger overhead.

Support added for the sja1105 and vsc73xx drivers at the moment. Will
add for felix as well if the series is well-received.

Vladimir Oltean (3):
  net: dsa: Configure the MTU for switch ports
  net: dsa: sja1105: Implement the port MTU callbacks
  net: dsa: vsc73xx: Make the MTU configurable

 drivers/net/dsa/sja1105/sja1105.h      |  1 +
 drivers/net/dsa/sja1105/sja1105_main.c | 48 +++++++++++++++++--
 drivers/net/dsa/vitesse-vsc73xx-core.c | 30 ++++++++----
 include/net/dsa.h                      |  2 +
 net/dsa/dsa_priv.h                     |  2 +
 net/dsa/master.c                       | 13 +++---
 net/dsa/slave.c                        | 64 +++++++++++++++++++++++++-
 7 files changed, 138 insertions(+), 22 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports
  2019-11-23 19:48 [PATCH net-next 0/3] Configure the MTU on DSA switches Vladimir Oltean
@ 2019-11-23 19:48 ` Vladimir Oltean
  2019-11-23 20:27   ` Florian Fainelli
  2019-11-24 22:48   ` Andrew Lunn
  2019-11-23 19:48 ` [PATCH net-next 2/3] net: dsa: sja1105: Implement the port MTU callbacks Vladimir Oltean
  2019-11-23 19:48 ` [PATCH net-next 3/3] net: dsa: vsc73xx: Make the MTU configurable Vladimir Oltean
  2 siblings, 2 replies; 16+ messages in thread
From: Vladimir Oltean @ 2019-11-23 19:48 UTC (permalink / raw)
  To: andrew, f.fainelli, vivien.didelot, davem, jakub.kicinski
  Cc: netdev, Vladimir Oltean

It is useful be able to configure port policers on a switch to accept
frames of various sizes:

- Increase the MTU for better throughput from the default of 1500 if it
  is known that there is no 10/100 Mbps device in the network.
- Decrease the MTU to limit the latency of high-priority frames under
  congestion.

For DSA slave ports, this is mostly a pass-through callback, called
through the regular ndo ops and at probe time (to ensure consistency
across all supported switches).

The CPU port is called with an MTU equal to the largest configured MTU
of the slave ports. The assumption is that the user might want to
sustain a bidirectional conversation with a partner over any switch
port.

The DSA master is configured the same as the CPU port, plus the tagger
overhead. Since the MTU is by definition L2 payload (sans Ethernet
header), it is up to each individual driver to figure out if it needs to
do anything special for its frame tags on the CPU port (it shouldn't
except in special cases). So the MTU does not contain the tagger
overhead on the CPU port.
However the MTU of the DSA master, minus the tagger overhead, is used as
a proxy for the MTU of the CPU port, which does not have a net device.
This is to avoid uselessly calling the .change_mtu function on the CPU
port when nothing should change.

So it is safe to assume that the DSA master and the CPU port MTUs are
apart by exactly the tagger's overhead in bytes.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 include/net/dsa.h  |  2 ++
 net/dsa/dsa_priv.h |  2 ++
 net/dsa/master.c   | 13 +++++-----
 net/dsa/slave.c    | 64 +++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 73 insertions(+), 8 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 6767dc3f66c0..c25eec2b8cc0 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -382,6 +382,8 @@ struct dsa_switch_ops {
 	int	(*setup)(struct dsa_switch *ds);
 	void	(*teardown)(struct dsa_switch *ds);
 	u32	(*get_phy_flags)(struct dsa_switch *ds, int port);
+	int	(*change_mtu)(struct dsa_switch *ds, int port, int new_mtu);
+	int	(*get_max_mtu)(struct dsa_switch *ds, int port);
 
 	/*
 	 * Access to the switch's PHY registers.
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 53e7577896b6..1501f06643ca 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -98,6 +98,8 @@ int dsa_legacy_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
 /* master.c */
 int dsa_master_setup(struct net_device *dev, struct dsa_port *cpu_dp);
 void dsa_master_teardown(struct net_device *dev);
+int dsa_master_set_mtu(struct net_device *dev, struct dsa_port *cpu_dp,
+		       int mtu);
 
 static inline struct net_device *dsa_master_find_slave(struct net_device *dev,
 						       int device, int port)
diff --git a/net/dsa/master.c b/net/dsa/master.c
index 3255dfc97f86..7efcce6ef05d 100644
--- a/net/dsa/master.c
+++ b/net/dsa/master.c
@@ -284,18 +284,19 @@ static const struct attribute_group dsa_group = {
 	.attrs	= dsa_slave_attrs,
 };
 
-static void dsa_master_set_mtu(struct net_device *dev, struct dsa_port *cpu_dp)
+/* Needs to be called under rtnl_lock */
+int dsa_master_set_mtu(struct net_device *dev, struct dsa_port *cpu_dp,
+		       int mtu)
 {
-	unsigned int mtu = ETH_DATA_LEN + cpu_dp->tag_ops->overhead;
-	int err;
+	int err = -ERANGE;
 
-	rtnl_lock();
 	if (mtu <= dev->max_mtu) {
 		err = dev_set_mtu(dev, mtu);
 		if (err)
 			netdev_dbg(dev, "Unable to set MTU to include for DSA overheads\n");
 	}
-	rtnl_unlock();
+
+	return err;
 }
 
 static void dsa_master_reset_mtu(struct net_device *dev)
@@ -314,8 +315,6 @@ int dsa_master_setup(struct net_device *dev, struct dsa_port *cpu_dp)
 {
 	int ret;
 
-	dsa_master_set_mtu(dev,  cpu_dp);
-
 	/* If we use a tagging format that doesn't have an ethertype
 	 * field, make sure that all packets from this point on get
 	 * sent to the tag format's receive function.
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 78ffc87dc25e..acdeeee191d7 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -149,6 +149,60 @@ static void dsa_slave_change_rx_flags(struct net_device *dev, int change)
 	}
 }
 
+static int dsa_slave_change_mtu(struct net_device *dev, int new_mtu)
+{
+	struct net_device *master = dsa_slave_to_master(dev);
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->dp->ds;
+	struct dsa_port *cpu_dp;
+	int port = p->dp->index;
+	int max_mtu = 0;
+	int cpu_mtu;
+	int err, i;
+
+	if (!ds->ops->change_mtu)
+		return -EOPNOTSUPP;
+
+	err = ds->ops->change_mtu(ds, port, new_mtu);
+	if (err < 0)
+		return err;
+
+	dev->mtu = new_mtu;
+
+	for (i = 0; i < ds->num_ports; i++) {
+		if (!dsa_is_user_port(ds, i))
+			continue;
+
+		/* During probe, this function will be called for each slave
+		 * device, while not all of them have been allocated. That's
+		 * ok, it doesn't change what the maximum is, so ignore it.
+		 */
+		if (!dsa_to_port(ds, i)->slave)
+			continue;
+
+		if (max_mtu < dsa_to_port(ds, i)->slave->mtu)
+			max_mtu = dsa_to_port(ds, i)->slave->mtu;
+	}
+
+	cpu_dp = dsa_to_port(ds, port)->cpu_dp;
+
+	max_mtu += cpu_dp->tag_ops->overhead;
+	cpu_mtu = master->mtu;
+
+	if (max_mtu != cpu_mtu) {
+		err = ds->ops->change_mtu(ds, dsa_upstream_port(ds, port),
+					  max_mtu - cpu_dp->tag_ops->overhead);
+		if (err < 0)
+			return err;
+
+		err = dsa_master_set_mtu(master, cpu_dp, max_mtu);
+		if (err < 0)
+			return err;
+	}
+
+	return err;
+}
+
 static void dsa_slave_set_rx_mode(struct net_device *dev)
 {
 	struct net_device *master = dsa_slave_to_master(dev);
@@ -1248,6 +1302,7 @@ static const struct net_device_ops dsa_slave_netdev_ops = {
 	.ndo_stop		= dsa_slave_close,
 	.ndo_start_xmit		= dsa_slave_xmit,
 	.ndo_change_rx_flags	= dsa_slave_change_rx_flags,
+	.ndo_change_mtu		= dsa_slave_change_mtu,
 	.ndo_set_rx_mode	= dsa_slave_set_rx_mode,
 	.ndo_set_mac_address	= dsa_slave_set_mac_address,
 	.ndo_fdb_add		= dsa_legacy_fdb_add,
@@ -1440,7 +1495,10 @@ int dsa_slave_create(struct dsa_port *port)
 	slave_dev->priv_flags |= IFF_NO_QUEUE;
 	slave_dev->netdev_ops = &dsa_slave_netdev_ops;
 	slave_dev->min_mtu = 0;
-	slave_dev->max_mtu = ETH_MAX_MTU;
+	if (ds->ops->get_max_mtu)
+		slave_dev->max_mtu = ds->ops->get_max_mtu(ds, port->index);
+	else
+		slave_dev->max_mtu = ETH_MAX_MTU;
 	SET_NETDEV_DEVTYPE(slave_dev, &dsa_type);
 
 	SET_NETDEV_DEV(slave_dev, port->ds->dev);
@@ -1460,6 +1518,10 @@ int dsa_slave_create(struct dsa_port *port)
 	p->xmit = cpu_dp->tag_ops->xmit;
 	port->slave = slave_dev;
 
+	rtnl_lock();
+	dsa_slave_change_mtu(slave_dev, ETH_DATA_LEN);
+	rtnl_unlock();
+
 	netif_carrier_off(slave_dev);
 
 	ret = dsa_slave_phy_setup(slave_dev);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 2/3] net: dsa: sja1105: Implement the port MTU callbacks
  2019-11-23 19:48 [PATCH net-next 0/3] Configure the MTU on DSA switches Vladimir Oltean
  2019-11-23 19:48 ` [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports Vladimir Oltean
@ 2019-11-23 19:48 ` Vladimir Oltean
  2019-11-23 20:30   ` Florian Fainelli
  2019-11-23 19:48 ` [PATCH net-next 3/3] net: dsa: vsc73xx: Make the MTU configurable Vladimir Oltean
  2 siblings, 1 reply; 16+ messages in thread
From: Vladimir Oltean @ 2019-11-23 19:48 UTC (permalink / raw)
  To: andrew, f.fainelli, vivien.didelot, davem, jakub.kicinski
  Cc: netdev, Vladimir Oltean

On this switch, the frame length enforcements are performed by the
ingress policers. There are 2 types of those: regular L2 (also called
best-effort) and Virtual Link policers (an ARINC664/AFDX concept for
defining L2 streams with certain QoS abilities). To avoid future
confusion, I prefer to call the reset reason "Best-effort policers",
even though the VL policers are not yet supported.

We also need to change the setup of the initial static config, such that
DSA calls to .change_mtu (which are expensive) become no-ops and don't
reset the switch 5 times.

A driver-level decision is to unconditionally allow single VLAN-tagged
traffic on all ports. The CPU port must accept an additional VLAN header
for the DSA tag, which is again a driver-level decision.

The policers actually count bytes not only from the SDU, but also from
the Ethernet header and FCS, so those need to be accounted for as well.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/net/dsa/sja1105/sja1105.h      |  1 +
 drivers/net/dsa/sja1105/sja1105_main.c | 48 +++++++++++++++++++++++---
 2 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index d801fc204d19..3a5c8acb6e2a 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -122,6 +122,7 @@ enum sja1105_reset_reason {
 	SJA1105_RX_HWTSTAMPING,
 	SJA1105_AGEING_TIME,
 	SJA1105_SCHEDULING,
+	SJA1105_BEST_EFFORT_POLICING,
 };
 
 int sja1105_static_config_reload(struct sja1105_private *priv,
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index b60224c55244..3d55dd3c7e83 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -459,12 +459,12 @@ static int sja1105_init_general_params(struct sja1105_private *priv)
 #define SJA1105_RATE_MBPS(speed) (((speed) * 64000) / 1000)
 
 static void sja1105_setup_policer(struct sja1105_l2_policing_entry *policing,
-				  int index)
+				  int index, int mtu)
 {
 	policing[index].sharindx = index;
 	policing[index].smax = 65535; /* Burst size in bytes */
 	policing[index].rate = SJA1105_RATE_MBPS(1000);
-	policing[index].maxlen = ETH_FRAME_LEN + VLAN_HLEN + ETH_FCS_LEN;
+	policing[index].maxlen = mtu;
 	policing[index].partition = 0;
 }
 
@@ -496,12 +496,16 @@ static int sja1105_init_l2_policing(struct sja1105_private *priv)
 	 */
 	for (i = 0, k = 0; i < SJA1105_NUM_PORTS; i++) {
 		int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + i;
+		int mtu = VLAN_ETH_FRAME_LEN + ETH_FCS_LEN;
+
+		if (dsa_is_cpu_port(priv->ds, i))
+			mtu += VLAN_HLEN;
 
 		for (j = 0; j < SJA1105_NUM_TC; j++, k++)
-			sja1105_setup_policer(policing, k);
+			sja1105_setup_policer(policing, k, mtu);
 
 		/* Set up this port's policer for broadcast traffic */
-		sja1105_setup_policer(policing, bcast);
+		sja1105_setup_policer(policing, bcast, mtu);
 	}
 	return 0;
 }
@@ -1346,6 +1350,7 @@ static const char * const sja1105_reset_reasons[] = {
 	[SJA1105_RX_HWTSTAMPING] = "RX timestamping",
 	[SJA1105_AGEING_TIME] = "Ageing time",
 	[SJA1105_SCHEDULING] = "Time-aware scheduling",
+	[SJA1105_BEST_EFFORT_POLICING] = "Best-effort policing",
 };
 
 /* For situations where we need to change a setting at runtime that is only
@@ -1886,6 +1891,39 @@ static int sja1105_set_ageing_time(struct dsa_switch *ds,
 	return sja1105_static_config_reload(priv, SJA1105_AGEING_TIME);
 }
 
+static int sja1105_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
+{
+	int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + port;
+	struct sja1105_l2_policing_entry *policing;
+	struct sja1105_private *priv = ds->priv;
+	int tc;
+
+	new_mtu += VLAN_ETH_HLEN + ETH_FCS_LEN;
+
+	if (dsa_is_cpu_port(ds, port))
+		new_mtu += VLAN_HLEN;
+
+	policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
+
+	/* We set all 9 port policers to the same value, so just checking the
+	 * broadcast one is fine.
+	 */
+	if (policing[bcast].maxlen == new_mtu)
+		return 0;
+
+	for (tc = 0; tc < SJA1105_NUM_TC; tc++)
+		policing[port * SJA1105_NUM_TC + tc].maxlen = new_mtu;
+
+	policing[bcast].maxlen = new_mtu;
+
+	return sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
+}
+
+static int sja1105_get_max_mtu(struct dsa_switch *ds, int port)
+{
+	return 2043 - VLAN_ETH_HLEN - ETH_FCS_LEN;
+}
+
 static int sja1105_port_setup_tc(struct dsa_switch *ds, int port,
 				 enum tc_setup_type type,
 				 void *type_data)
@@ -1981,6 +2019,8 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
 	.setup			= sja1105_setup,
 	.teardown		= sja1105_teardown,
 	.set_ageing_time	= sja1105_set_ageing_time,
+	.change_mtu		= sja1105_change_mtu,
+	.get_max_mtu		= sja1105_get_max_mtu,
 	.phylink_validate	= sja1105_phylink_validate,
 	.phylink_mac_config	= sja1105_mac_config,
 	.phylink_mac_link_up	= sja1105_mac_link_up,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 3/3] net: dsa: vsc73xx: Make the MTU configurable
  2019-11-23 19:48 [PATCH net-next 0/3] Configure the MTU on DSA switches Vladimir Oltean
  2019-11-23 19:48 ` [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports Vladimir Oltean
  2019-11-23 19:48 ` [PATCH net-next 2/3] net: dsa: sja1105: Implement the port MTU callbacks Vladimir Oltean
@ 2019-11-23 19:48 ` Vladimir Oltean
  2 siblings, 0 replies; 16+ messages in thread
From: Vladimir Oltean @ 2019-11-23 19:48 UTC (permalink / raw)
  To: andrew, f.fainelli, vivien.didelot, davem, jakub.kicinski
  Cc: netdev, Vladimir Oltean

Instead of hardcoding the MTU to the maximum value allowed by the
hardware, obey the value known by the operating system.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/net/dsa/vitesse-vsc73xx-core.c | 30 +++++++++++++++++---------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/net/dsa/vitesse-vsc73xx-core.c b/drivers/net/dsa/vitesse-vsc73xx-core.c
index 42c1574d45f2..eed98566e2bf 100644
--- a/drivers/net/dsa/vitesse-vsc73xx-core.c
+++ b/drivers/net/dsa/vitesse-vsc73xx-core.c
@@ -663,16 +663,6 @@ static void vsc73xx_init_port(struct vsc73xx *vsc, int port)
 		      VSC73XX_MAC_CFG_TX_EN |
 		      VSC73XX_MAC_CFG_RX_EN);
 
-	/* Max length, we can do up to 9.6 KiB, so allow that.
-	 * According to application not "VSC7398 Jumbo Frames" setting
-	 * up the MTU to 9.6 KB does not affect the performance on standard
-	 * frames, so just enable it. It is clear from the application note
-	 * that "9.6 kilobytes" == 9600 bytes.
-	 */
-	vsc73xx_write(vsc, VSC73XX_BLOCK_MAC,
-		      port,
-		      VSC73XX_MAXLEN, 9600);
-
 	/* Flow control for the CPU port:
 	 * Use a zero delay pause frame when pause condition is left
 	 * Obey pause control frames
@@ -1029,6 +1019,24 @@ static void vsc73xx_get_ethtool_stats(struct dsa_switch *ds, int port,
 	}
 }
 
+static int vsc73xx_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
+{
+	struct vsc73xx *vsc = ds->priv;
+
+	return vsc73xx_write(vsc, VSC73XX_BLOCK_MAC, port,
+			     VSC73XX_MAXLEN, new_mtu);
+}
+
+/* According to application not "VSC7398 Jumbo Frames" setting
+ * up the MTU to 9.6 KB does not affect the performance on standard
+ * frames. It is clear from the application note that
+ * "9.6 kilobytes" == 9600 bytes.
+ */
+static int vsc73xx_get_max_mtu(struct dsa_switch *ds, int port)
+{
+	return 9600;
+}
+
 static const struct dsa_switch_ops vsc73xx_ds_ops = {
 	.get_tag_protocol = vsc73xx_get_tag_protocol,
 	.setup = vsc73xx_setup,
@@ -1040,6 +1048,8 @@ static const struct dsa_switch_ops vsc73xx_ds_ops = {
 	.get_sset_count = vsc73xx_get_sset_count,
 	.port_enable = vsc73xx_port_enable,
 	.port_disable = vsc73xx_port_disable,
+	.change_mtu = vsc73xx_change_mtu,
+	.get_max_mtu = vsc73xx_get_max_mtu,
 };
 
 static int vsc73xx_gpio_get(struct gpio_chip *chip, unsigned int offset)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports
  2019-11-23 19:48 ` [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports Vladimir Oltean
@ 2019-11-23 20:27   ` Florian Fainelli
  2019-11-23 20:46     ` Vladimir Oltean
  2019-11-23 20:51     ` Vladimir Oltean
  2019-11-24 22:48   ` Andrew Lunn
  1 sibling, 2 replies; 16+ messages in thread
From: Florian Fainelli @ 2019-11-23 20:27 UTC (permalink / raw)
  To: Vladimir Oltean, andrew, vivien.didelot, davem, jakub.kicinski; +Cc: netdev

Hi Vladimir,

On 11/23/2019 11:48 AM, Vladimir Oltean wrote:
> It is useful be able to configure port policers on a switch to accept
> frames of various sizes:
> 
> - Increase the MTU for better throughput from the default of 1500 if it
>   is known that there is no 10/100 Mbps device in the network.
> - Decrease the MTU to limit the latency of high-priority frames under
>   congestion.
> 
> For DSA slave ports, this is mostly a pass-through callback, called
> through the regular ndo ops and at probe time (to ensure consistency
> across all supported switches).
> 
> The CPU port is called with an MTU equal to the largest configured MTU
> of the slave ports. The assumption is that the user might want to
> sustain a bidirectional conversation with a partner over any switch
> port.
> 
> The DSA master is configured the same as the CPU port, plus the tagger
> overhead. Since the MTU is by definition L2 payload (sans Ethernet
> header), it is up to each individual driver to figure out if it needs to
> do anything special for its frame tags on the CPU port (it shouldn't
> except in special cases). So the MTU does not contain the tagger
> overhead on the CPU port.
> However the MTU of the DSA master, minus the tagger overhead, is used as
> a proxy for the MTU of the CPU port, which does not have a net device.
> This is to avoid uselessly calling the .change_mtu function on the CPU
> port when nothing should change.
> 
> So it is safe to assume that the DSA master and the CPU port MTUs are
> apart by exactly the tagger's overhead in bytes.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---

[snip]
> +static int dsa_slave_change_mtu(struct net_device *dev, int new_mtu)
> +{
> +	struct net_device *master = dsa_slave_to_master(dev);
> +	struct dsa_slave_priv *p = netdev_priv(dev);
> +	struct dsa_switch *ds = p->dp->ds;
> +	struct dsa_port *cpu_dp;
> +	int port = p->dp->index;
> +	int max_mtu = 0;
> +	int cpu_mtu;
> +	int err, i;
> +
> +	if (!ds->ops->change_mtu)
> +		return -EOPNOTSUPP;
> +
> +	err = ds->ops->change_mtu(ds, port, new_mtu);
> +	if (err < 0)
> +		return err;
> +
> +	dev->mtu = new_mtu;
> +
> +	for (i = 0; i < ds->num_ports; i++) {
> +		if (!dsa_is_user_port(ds, i))
> +			continue;
> +
> +		/* During probe, this function will be called for each slave
> +		 * device, while not all of them have been allocated. That's
> +		 * ok, it doesn't change what the maximum is, so ignore it.
> +		 */
> +		if (!dsa_to_port(ds, i)->slave)
> +			continue;
> +
> +		if (max_mtu < dsa_to_port(ds, i)->slave->mtu)
> +			max_mtu = dsa_to_port(ds, i)->slave->mtu;
> +	}
> +
> +	cpu_dp = dsa_to_port(ds, port)->cpu_dp;
> +
> +	max_mtu += cpu_dp->tag_ops->overhead;
> +	cpu_mtu = master->mtu;
> +
> +	if (max_mtu != cpu_mtu) {
> +		err = ds->ops->change_mtu(ds, dsa_upstream_port(ds, port),
> +					  max_mtu - cpu_dp->tag_ops->overhead);
> +		if (err < 0)
> +			return err;

Before changing and committing the slave_dev's MTU you should actually
perform these two operations first to make sure that you can honor the
user port MTU that is requested. Here, you would possibly leave an user
port configured for a MTU value that is unsupported by the upstream
port(s) and/or the CPU port and/or the DSA master device, which could
possibly break frame forwarding depending on what the switch is willing
to accept.

I had prepared a patch series with Murali doing nearly the same thing
and targeting Broadcom switches nearly a year ago but since I never got
feedback whether this worked properly for the use case he was after, I
did not submit it since I did not need it personally and found it to be
a nice can of worms.

Another thing that I had not gotten around testing was making sure that
when a slave_dev gets enslaved as a bridge port member, that bridge MTU
normalization would kick in and make sure that if you have say: port 0
configured with MTU 1500 and port 1 configured with MTU 9000, the bridge
would normalize to MTU 1500 as you would expect.

https://github.com/ffainelli/linux/commits/dsa-mtu

This should be a DSA switch fabric notifier IMHO because changing the
MTU on an user port implies changing the MTU on every DSA port in
between plus the CPU port. Your approach here works for the first
upstream port, but not for the ones in between, and there can be more,
as is common with the ZII devel Rev. B and C boards.
-- 
Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/3] net: dsa: sja1105: Implement the port MTU callbacks
  2019-11-23 19:48 ` [PATCH net-next 2/3] net: dsa: sja1105: Implement the port MTU callbacks Vladimir Oltean
@ 2019-11-23 20:30   ` Florian Fainelli
  2019-11-23 20:48     ` Vladimir Oltean
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Fainelli @ 2019-11-23 20:30 UTC (permalink / raw)
  To: Vladimir Oltean, andrew, vivien.didelot, davem, jakub.kicinski; +Cc: netdev



On 11/23/2019 11:48 AM, Vladimir Oltean wrote:
> On this switch, the frame length enforcements are performed by the
> ingress policers. There are 2 types of those: regular L2 (also called
> best-effort) and Virtual Link policers (an ARINC664/AFDX concept for
> defining L2 streams with certain QoS abilities). To avoid future
> confusion, I prefer to call the reset reason "Best-effort policers",
> even though the VL policers are not yet supported.
> 
> We also need to change the setup of the initial static config, such that
> DSA calls to .change_mtu (which are expensive) become no-ops and don't
> reset the switch 5 times.
> 
> A driver-level decision is to unconditionally allow single VLAN-tagged
> traffic on all ports. The CPU port must accept an additional VLAN header
> for the DSA tag, which is again a driver-level decision.
> 
> The policers actually count bytes not only from the SDU, but also from
> the Ethernet header and FCS, so those need to be accounted for as well.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---
>  drivers/net/dsa/sja1105/sja1105.h      |  1 +
>  drivers/net/dsa/sja1105/sja1105_main.c | 48 +++++++++++++++++++++++---
>  2 files changed, 45 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
> index d801fc204d19..3a5c8acb6e2a 100644
> --- a/drivers/net/dsa/sja1105/sja1105.h
> +++ b/drivers/net/dsa/sja1105/sja1105.h
> @@ -122,6 +122,7 @@ enum sja1105_reset_reason {
>  	SJA1105_RX_HWTSTAMPING,
>  	SJA1105_AGEING_TIME,
>  	SJA1105_SCHEDULING,
> +	SJA1105_BEST_EFFORT_POLICING,
>  };
>  
>  int sja1105_static_config_reload(struct sja1105_private *priv,
> diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
> index b60224c55244..3d55dd3c7e83 100644
> --- a/drivers/net/dsa/sja1105/sja1105_main.c
> +++ b/drivers/net/dsa/sja1105/sja1105_main.c
> @@ -459,12 +459,12 @@ static int sja1105_init_general_params(struct sja1105_private *priv)
>  #define SJA1105_RATE_MBPS(speed) (((speed) * 64000) / 1000)
>  
>  static void sja1105_setup_policer(struct sja1105_l2_policing_entry *policing,
> -				  int index)
> +				  int index, int mtu)
>  {
>  	policing[index].sharindx = index;
>  	policing[index].smax = 65535; /* Burst size in bytes */
>  	policing[index].rate = SJA1105_RATE_MBPS(1000);
> -	policing[index].maxlen = ETH_FRAME_LEN + VLAN_HLEN + ETH_FCS_LEN;
> +	policing[index].maxlen = mtu;
>  	policing[index].partition = 0;
>  }
>  
> @@ -496,12 +496,16 @@ static int sja1105_init_l2_policing(struct sja1105_private *priv)
>  	 */
>  	for (i = 0, k = 0; i < SJA1105_NUM_PORTS; i++) {
>  		int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + i;
> +		int mtu = VLAN_ETH_FRAME_LEN + ETH_FCS_LEN;
> +
> +		if (dsa_is_cpu_port(priv->ds, i))
> +			mtu += VLAN_HLEN;

That really seems like a layering violation it so happens that you use
DSA_TAG_8021Q which is why you need VLAN_ETH_HLEN, but you should not
assume that from with your driver, even if this one is special on so
many counts. How about using use dsa_port(port)->tag_ops->overhead +
ETH_HLEN here?
>  
>  		for (j = 0; j < SJA1105_NUM_TC; j++, k++)
> -			sja1105_setup_policer(policing, k);
> +			sja1105_setup_policer(policing, k, mtu);
>  
>  		/* Set up this port's policer for broadcast traffic */
> -		sja1105_setup_policer(policing, bcast);
> +		sja1105_setup_policer(policing, bcast, mtu);
>  	}
>  	return 0;
>  }
> @@ -1346,6 +1350,7 @@ static const char * const sja1105_reset_reasons[] = {
>  	[SJA1105_RX_HWTSTAMPING] = "RX timestamping",
>  	[SJA1105_AGEING_TIME] = "Ageing time",
>  	[SJA1105_SCHEDULING] = "Time-aware scheduling",
> +	[SJA1105_BEST_EFFORT_POLICING] = "Best-effort policing",
>  };
>  
>  /* For situations where we need to change a setting at runtime that is only
> @@ -1886,6 +1891,39 @@ static int sja1105_set_ageing_time(struct dsa_switch *ds,
>  	return sja1105_static_config_reload(priv, SJA1105_AGEING_TIME);
>  }
>  
> +static int sja1105_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
> +{
> +	int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + port;
> +	struct sja1105_l2_policing_entry *policing;
> +	struct sja1105_private *priv = ds->priv;
> +	int tc;
> +
> +	new_mtu += VLAN_ETH_HLEN + ETH_FCS_LEN;

Likewise
-- 
Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports
  2019-11-23 20:27   ` Florian Fainelli
@ 2019-11-23 20:46     ` Vladimir Oltean
  2019-11-23 21:14       ` Florian Fainelli
  2019-11-24 22:43       ` Andrew Lunn
  2019-11-23 20:51     ` Vladimir Oltean
  1 sibling, 2 replies; 16+ messages in thread
From: Vladimir Oltean @ 2019-11-23 20:46 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Jakub Kicinski, netdev

On Sat, 23 Nov 2019 at 22:28, Florian Fainelli <f.fainelli@gmail.com> wrote:
>
> Hi Vladimir,
>
> On 11/23/2019 11:48 AM, Vladimir Oltean wrote:
> > It is useful be able to configure port policers on a switch to accept
> > frames of various sizes:
> >
> > - Increase the MTU for better throughput from the default of 1500 if it
> >   is known that there is no 10/100 Mbps device in the network.
> > - Decrease the MTU to limit the latency of high-priority frames under
> >   congestion.
> >
> > For DSA slave ports, this is mostly a pass-through callback, called
> > through the regular ndo ops and at probe time (to ensure consistency
> > across all supported switches).
> >
> > The CPU port is called with an MTU equal to the largest configured MTU
> > of the slave ports. The assumption is that the user might want to
> > sustain a bidirectional conversation with a partner over any switch
> > port.
> >
> > The DSA master is configured the same as the CPU port, plus the tagger
> > overhead. Since the MTU is by definition L2 payload (sans Ethernet
> > header), it is up to each individual driver to figure out if it needs to
> > do anything special for its frame tags on the CPU port (it shouldn't
> > except in special cases). So the MTU does not contain the tagger
> > overhead on the CPU port.
> > However the MTU of the DSA master, minus the tagger overhead, is used as
> > a proxy for the MTU of the CPU port, which does not have a net device.
> > This is to avoid uselessly calling the .change_mtu function on the CPU
> > port when nothing should change.
> >
> > So it is safe to assume that the DSA master and the CPU port MTUs are
> > apart by exactly the tagger's overhead in bytes.
> >
> > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > ---
>
> [snip]
> > +static int dsa_slave_change_mtu(struct net_device *dev, int new_mtu)
> > +{
> > +     struct net_device *master = dsa_slave_to_master(dev);
> > +     struct dsa_slave_priv *p = netdev_priv(dev);
> > +     struct dsa_switch *ds = p->dp->ds;
> > +     struct dsa_port *cpu_dp;
> > +     int port = p->dp->index;
> > +     int max_mtu = 0;
> > +     int cpu_mtu;
> > +     int err, i;
> > +
> > +     if (!ds->ops->change_mtu)
> > +             return -EOPNOTSUPP;
> > +
> > +     err = ds->ops->change_mtu(ds, port, new_mtu);
> > +     if (err < 0)
> > +             return err;
> > +
> > +     dev->mtu = new_mtu;
> > +
> > +     for (i = 0; i < ds->num_ports; i++) {
> > +             if (!dsa_is_user_port(ds, i))
> > +                     continue;
> > +
> > +             /* During probe, this function will be called for each slave
> > +              * device, while not all of them have been allocated. That's
> > +              * ok, it doesn't change what the maximum is, so ignore it.
> > +              */
> > +             if (!dsa_to_port(ds, i)->slave)
> > +                     continue;
> > +
> > +             if (max_mtu < dsa_to_port(ds, i)->slave->mtu)
> > +                     max_mtu = dsa_to_port(ds, i)->slave->mtu;
> > +     }
> > +
> > +     cpu_dp = dsa_to_port(ds, port)->cpu_dp;
> > +
> > +     max_mtu += cpu_dp->tag_ops->overhead;
> > +     cpu_mtu = master->mtu;
> > +
> > +     if (max_mtu != cpu_mtu) {
> > +             err = ds->ops->change_mtu(ds, dsa_upstream_port(ds, port),
> > +                                       max_mtu - cpu_dp->tag_ops->overhead);
> > +             if (err < 0)
> > +                     return err;
>
> Before changing and committing the slave_dev's MTU you should actually
> perform these two operations first to make sure that you can honor the
> user port MTU that is requested. Here, you would possibly leave an user
> port configured for a MTU value that is unsupported by the upstream
> port(s) and/or the CPU port and/or the DSA master device, which could
> possibly break frame forwarding depending on what the switch is willing
> to accept.
>

Correct. I was actually held back a bit while looking at Andrew's
patch dc0fe7d47f9f ("net: dsa: Set the master device's MTU to account
for DSA overheads") where he basically discarded errors, so that's the
approach I took too (thinking that some DSA masters would not have ops
for changing or reporting the MTU).

> I had prepared a patch series with Murali doing nearly the same thing
> and targeting Broadcom switches nearly a year ago but since I never got
> feedback whether this worked properly for the use case he was after, I
> did not submit it since I did not need it personally and found it to be
> a nice can of worms.
>

Nice, do you mind if I take your series instead then?

> Another thing that I had not gotten around testing was making sure that
> when a slave_dev gets enslaved as a bridge port member, that bridge MTU
> normalization would kick in and make sure that if you have say: port 0
> configured with MTU 1500 and port 1 configured with MTU 9000, the bridge
> would normalize to MTU 1500 as you would expect.
>

Nope, that doesn't happen by default, at least in my implementation.
Is there code in the bridge core for it?

> https://github.com/ffainelli/linux/commits/dsa-mtu
>
> This should be a DSA switch fabric notifier IMHO because changing the
> MTU on an user port implies changing the MTU on every DSA port in
> between plus the CPU port. Your approach here works for the first
> upstream port, but not for the ones in between, and there can be more,
> as is common with the ZII devel Rev. B and C boards.

Yes, correct. Your patch implements notifiers which is definitely
good. I don't have a cascaded setup to test yet (my Turris Mox was
supposed to arrive but for some reason it was returned to the seller
by the shipping company...).

> --
> Florian

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/3] net: dsa: sja1105: Implement the port MTU callbacks
  2019-11-23 20:30   ` Florian Fainelli
@ 2019-11-23 20:48     ` Vladimir Oltean
  0 siblings, 0 replies; 16+ messages in thread
From: Vladimir Oltean @ 2019-11-23 20:48 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Jakub Kicinski, netdev

On Sat, 23 Nov 2019 at 22:30, Florian Fainelli <f.fainelli@gmail.com> wrote:
>
>
>
> On 11/23/2019 11:48 AM, Vladimir Oltean wrote:
> > On this switch, the frame length enforcements are performed by the
> > ingress policers. There are 2 types of those: regular L2 (also called
> > best-effort) and Virtual Link policers (an ARINC664/AFDX concept for
> > defining L2 streams with certain QoS abilities). To avoid future
> > confusion, I prefer to call the reset reason "Best-effort policers",
> > even though the VL policers are not yet supported.
> >
> > We also need to change the setup of the initial static config, such that
> > DSA calls to .change_mtu (which are expensive) become no-ops and don't
> > reset the switch 5 times.
> >
> > A driver-level decision is to unconditionally allow single VLAN-tagged
> > traffic on all ports. The CPU port must accept an additional VLAN header
> > for the DSA tag, which is again a driver-level decision.
> >
> > The policers actually count bytes not only from the SDU, but also from
> > the Ethernet header and FCS, so those need to be accounted for as well.
> >
> > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > ---
> >  drivers/net/dsa/sja1105/sja1105.h      |  1 +
> >  drivers/net/dsa/sja1105/sja1105_main.c | 48 +++++++++++++++++++++++---
> >  2 files changed, 45 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
> > index d801fc204d19..3a5c8acb6e2a 100644
> > --- a/drivers/net/dsa/sja1105/sja1105.h
> > +++ b/drivers/net/dsa/sja1105/sja1105.h
> > @@ -122,6 +122,7 @@ enum sja1105_reset_reason {
> >       SJA1105_RX_HWTSTAMPING,
> >       SJA1105_AGEING_TIME,
> >       SJA1105_SCHEDULING,
> > +     SJA1105_BEST_EFFORT_POLICING,
> >  };
> >
> >  int sja1105_static_config_reload(struct sja1105_private *priv,
> > diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
> > index b60224c55244..3d55dd3c7e83 100644
> > --- a/drivers/net/dsa/sja1105/sja1105_main.c
> > +++ b/drivers/net/dsa/sja1105/sja1105_main.c
> > @@ -459,12 +459,12 @@ static int sja1105_init_general_params(struct sja1105_private *priv)
> >  #define SJA1105_RATE_MBPS(speed) (((speed) * 64000) / 1000)
> >
> >  static void sja1105_setup_policer(struct sja1105_l2_policing_entry *policing,
> > -                               int index)
> > +                               int index, int mtu)
> >  {
> >       policing[index].sharindx = index;
> >       policing[index].smax = 65535; /* Burst size in bytes */
> >       policing[index].rate = SJA1105_RATE_MBPS(1000);
> > -     policing[index].maxlen = ETH_FRAME_LEN + VLAN_HLEN + ETH_FCS_LEN;
> > +     policing[index].maxlen = mtu;
> >       policing[index].partition = 0;
> >  }
> >
> > @@ -496,12 +496,16 @@ static int sja1105_init_l2_policing(struct sja1105_private *priv)
> >        */
> >       for (i = 0, k = 0; i < SJA1105_NUM_PORTS; i++) {
> >               int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + i;
> > +             int mtu = VLAN_ETH_FRAME_LEN + ETH_FCS_LEN;
> > +
> > +             if (dsa_is_cpu_port(priv->ds, i))
> > +                     mtu += VLAN_HLEN;
>
> That really seems like a layering violation it so happens that you use
> DSA_TAG_8021Q which is why you need VLAN_ETH_HLEN, but you should not
> assume that from with your driver, even if this one is special on so
> many counts. How about using use dsa_port(port)->tag_ops->overhead +
> ETH_HLEN here?

True here.

> >
> >               for (j = 0; j < SJA1105_NUM_TC; j++, k++)
> > -                     sja1105_setup_policer(policing, k);
> > +                     sja1105_setup_policer(policing, k, mtu);
> >
> >               /* Set up this port's policer for broadcast traffic */
> > -             sja1105_setup_policer(policing, bcast);
> > +             sja1105_setup_policer(policing, bcast, mtu);
> >       }
> >       return 0;
> >  }
> > @@ -1346,6 +1350,7 @@ static const char * const sja1105_reset_reasons[] = {
> >       [SJA1105_RX_HWTSTAMPING] = "RX timestamping",
> >       [SJA1105_AGEING_TIME] = "Ageing time",
> >       [SJA1105_SCHEDULING] = "Time-aware scheduling",
> > +     [SJA1105_BEST_EFFORT_POLICING] = "Best-effort policing",
> >  };
> >
> >  /* For situations where we need to change a setting at runtime that is only
> > @@ -1886,6 +1891,39 @@ static int sja1105_set_ageing_time(struct dsa_switch *ds,
> >       return sja1105_static_config_reload(priv, SJA1105_AGEING_TIME);
> >  }
> >
> > +static int sja1105_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
> > +{
> > +     int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + port;
> > +     struct sja1105_l2_policing_entry *policing;
> > +     struct sja1105_private *priv = ds->priv;
> > +     int tc;
> > +
> > +     new_mtu += VLAN_ETH_HLEN + ETH_FCS_LEN;
>
> Likewise

Not the same thing here. I wrote about this one in the commit message:
"A driver-level decision is to unconditionally allow single
VLAN-tagged traffic on all ports". How is this handled more correctly?

> --
> Florian

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports
  2019-11-23 20:27   ` Florian Fainelli
  2019-11-23 20:46     ` Vladimir Oltean
@ 2019-11-23 20:51     ` Vladimir Oltean
  2019-11-23 21:09       ` Florian Fainelli
  1 sibling, 1 reply; 16+ messages in thread
From: Vladimir Oltean @ 2019-11-23 20:51 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Jakub Kicinski, netdev

On Sat, 23 Nov 2019 at 22:28, Florian Fainelli <f.fainelli@gmail.com> wrote:
>
> Hi Vladimir,
>
[snip]
>
> I had prepared a patch series with Murali doing nearly the same thing
> and targeting Broadcom switches nearly a year ago but since I never got
> feedback whether this worked properly for the use case he was after, I
> did not submit it since I did not need it personally and found it to be
> a nice can of worms.
>
[snip]

Also, is there another can of worms beyond what you've described here?

> --
> Florian

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports
  2019-11-23 20:51     ` Vladimir Oltean
@ 2019-11-23 21:09       ` Florian Fainelli
  0 siblings, 0 replies; 16+ messages in thread
From: Florian Fainelli @ 2019-11-23 21:09 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Jakub Kicinski, netdev



On 11/23/2019 12:51 PM, Vladimir Oltean wrote:
> On Sat, 23 Nov 2019 at 22:28, Florian Fainelli <f.fainelli@gmail.com> wrote:
>>
>> Hi Vladimir,
>>
> [snip]
>>
>> I had prepared a patch series with Murali doing nearly the same thing
>> and targeting Broadcom switches nearly a year ago but since I never got
>> feedback whether this worked properly for the use case he was after, I
>> did not submit it since I did not need it personally and found it to be
>> a nice can of worms.
>>
> [snip]
> 
> Also, is there another can of worms beyond what you've described here?

Not that I can think of, the patch series I mentioned was done without
considering the bridge MTU normalization and that is where I left.
-- 
Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports
  2019-11-23 20:46     ` Vladimir Oltean
@ 2019-11-23 21:14       ` Florian Fainelli
  2019-11-23 21:29         ` Vladimir Oltean
  2019-11-24 22:43       ` Andrew Lunn
  1 sibling, 1 reply; 16+ messages in thread
From: Florian Fainelli @ 2019-11-23 21:14 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Jakub Kicinski, netdev



On 11/23/2019 12:46 PM, Vladimir Oltean wrote:
> 
> Correct. I was actually held back a bit while looking at Andrew's
> patch dc0fe7d47f9f ("net: dsa: Set the master device's MTU to account
> for DSA overheads") where he basically discarded errors, so that's the
> approach I took too (thinking that some DSA masters would not have ops
> for changing or reporting the MTU).
> 
>> I had prepared a patch series with Murali doing nearly the same thing
>> and targeting Broadcom switches nearly a year ago but since I never got
>> feedback whether this worked properly for the use case he was after, I
>> did not submit it since I did not need it personally and found it to be
>> a nice can of worms.
>>
> 
> Nice, do you mind if I take your series instead then?

Not at all, if it works, please go ahead, not sure how hard it is going
to be to rebase.

> 
>> Another thing that I had not gotten around testing was making sure that
>> when a slave_dev gets enslaved as a bridge port member, that bridge MTU
>> normalization would kick in and make sure that if you have say: port 0
>> configured with MTU 1500 and port 1 configured with MTU 9000, the bridge
>> would normalize to MTU 1500 as you would expect.
>>
> 
> Nope, that doesn't happen by default, at least in my implementation.
> Is there code in the bridge core for it?

net/bridge/br_if.c::br_mtu_auto_adjust() takes care of adjusting the
bridge master device's MTU based on the minimum MTU of all ports within
the bridge, but what it seems to be missing is ensuring that if bridge
ports are enslaved, and those bridge ports happen to be part of the same
switch id (similar decision path to setting skb->fwd_offload_mark), then
the bridge port's MTU should also be auto adjusted. mlxsw also supports
changing the MTU, so I am surprised this is not something they fixed
already.

> 
>> https://github.com/ffainelli/linux/commits/dsa-mtu
>>
>> This should be a DSA switch fabric notifier IMHO because changing the
>> MTU on an user port implies changing the MTU on every DSA port in
>> between plus the CPU port. Your approach here works for the first
>> upstream port, but not for the ones in between, and there can be more,
>> as is common with the ZII devel Rev. B and C boards.
> 
> Yes, correct. Your patch implements notifiers which is definitely
> good. I don't have a cascaded setup to test yet (my Turris Mox was
> supposed to arrive but for some reason it was returned to the seller
> by the shipping company...).

Andrew and Vivien may know whether it is possible to get you a ZII Devel
rev. B or C since those have 3 switches in the fabric and have
constituted nice test platforms. Not sure if there is a version with a
CPU port that is Gigabit capable though.
-- 
Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports
  2019-11-23 21:14       ` Florian Fainelli
@ 2019-11-23 21:29         ` Vladimir Oltean
  2019-11-23 21:47           ` Florian Fainelli
  0 siblings, 1 reply; 16+ messages in thread
From: Vladimir Oltean @ 2019-11-23 21:29 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Jakub Kicinski, netdev

On Sat, 23 Nov 2019 at 23:14, Florian Fainelli <f.fainelli@gmail.com> wrote:
>
>
>
> On 11/23/2019 12:46 PM, Vladimir Oltean wrote:
> >
> > Correct. I was actually held back a bit while looking at Andrew's
> > patch dc0fe7d47f9f ("net: dsa: Set the master device's MTU to account
> > for DSA overheads") where he basically discarded errors, so that's the
> > approach I took too (thinking that some DSA masters would not have ops
> > for changing or reporting the MTU).
> >
> >> I had prepared a patch series with Murali doing nearly the same thing
> >> and targeting Broadcom switches nearly a year ago but since I never got
> >> feedback whether this worked properly for the use case he was after, I
> >> did not submit it since I did not need it personally and found it to be
> >> a nice can of worms.
> >>
> >
> > Nice, do you mind if I take your series instead then?
>
> Not at all, if it works, please go ahead, not sure how hard it is going
> to be to rebase.
>
> >
> >> Another thing that I had not gotten around testing was making sure that
> >> when a slave_dev gets enslaved as a bridge port member, that bridge MTU
> >> normalization would kick in and make sure that if you have say: port 0
> >> configured with MTU 1500 and port 1 configured with MTU 9000, the bridge
> >> would normalize to MTU 1500 as you would expect.
> >>
> >
> > Nope, that doesn't happen by default, at least in my implementation.
> > Is there code in the bridge core for it?
>
> net/bridge/br_if.c::br_mtu_auto_adjust() takes care of adjusting the
> bridge master device's MTU based on the minimum MTU of all ports within
> the bridge, but what it seems to be missing is ensuring that if bridge
> ports are enslaved, and those bridge ports happen to be part of the same
> switch id (similar decision path to setting skb->fwd_offload_mark), then
> the bridge port's MTU should also be auto adjusted. mlxsw also supports
> changing the MTU, so I am surprised this is not something they fixed
> already.
>

But then how would you even change a bridged interface's MTU? Delete
bridge, change MTU of all ports to same value, create bridge again?

> >
> >> https://github.com/ffainelli/linux/commits/dsa-mtu
> >>
> >> This should be a DSA switch fabric notifier IMHO because changing the
> >> MTU on an user port implies changing the MTU on every DSA port in
> >> between plus the CPU port. Your approach here works for the first
> >> upstream port, but not for the ones in between, and there can be more,
> >> as is common with the ZII devel Rev. B and C boards.
> >
> > Yes, correct. Your patch implements notifiers which is definitely
> > good. I don't have a cascaded setup to test yet (my Turris Mox was
> > supposed to arrive but for some reason it was returned to the seller
> > by the shipping company...).
>
> Andrew and Vivien may know whether it is possible to get you a ZII Devel
> rev. B or C since those have 3 switches in the fabric and have
> constituted nice test platforms. Not sure if there is a version with a
> CPU port that is Gigabit capable though.

Well, I've already ordered it again, this time with more expensive
shipping... Hopefully I got their hint right.

> --
> Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports
  2019-11-23 21:29         ` Vladimir Oltean
@ 2019-11-23 21:47           ` Florian Fainelli
  2019-11-23 21:54             ` Vladimir Oltean
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Fainelli @ 2019-11-23 21:47 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Jakub Kicinski, netdev



On 11/23/2019 1:29 PM, Vladimir Oltean wrote:
> On Sat, 23 Nov 2019 at 23:14, Florian Fainelli <f.fainelli@gmail.com> wrote:
>>
>>
>>
>> On 11/23/2019 12:46 PM, Vladimir Oltean wrote:
>>>
>>> Correct. I was actually held back a bit while looking at Andrew's
>>> patch dc0fe7d47f9f ("net: dsa: Set the master device's MTU to account
>>> for DSA overheads") where he basically discarded errors, so that's the
>>> approach I took too (thinking that some DSA masters would not have ops
>>> for changing or reporting the MTU).
>>>
>>>> I had prepared a patch series with Murali doing nearly the same thing
>>>> and targeting Broadcom switches nearly a year ago but since I never got
>>>> feedback whether this worked properly for the use case he was after, I
>>>> did not submit it since I did not need it personally and found it to be
>>>> a nice can of worms.
>>>>
>>>
>>> Nice, do you mind if I take your series instead then?
>>
>> Not at all, if it works, please go ahead, not sure how hard it is going
>> to be to rebase.
>>
>>>
>>>> Another thing that I had not gotten around testing was making sure that
>>>> when a slave_dev gets enslaved as a bridge port member, that bridge MTU
>>>> normalization would kick in and make sure that if you have say: port 0
>>>> configured with MTU 1500 and port 1 configured with MTU 9000, the bridge
>>>> would normalize to MTU 1500 as you would expect.
>>>>
>>>
>>> Nope, that doesn't happen by default, at least in my implementation.
>>> Is there code in the bridge core for it?
>>
>> net/bridge/br_if.c::br_mtu_auto_adjust() takes care of adjusting the
>> bridge master device's MTU based on the minimum MTU of all ports within
>> the bridge, but what it seems to be missing is ensuring that if bridge
>> ports are enslaved, and those bridge ports happen to be part of the same
>> switch id (similar decision path to setting skb->fwd_offload_mark), then
>> the bridge port's MTU should also be auto adjusted. mlxsw also supports
>> changing the MTU, so I am surprised this is not something they fixed
>> already.
>>
> 
> But then how would you even change a bridged interface's MTU? Delete
> bridge, change MTU of all ports to same value, create bridge again?

I am afraid so, given that the NETDEV_CHANGEMTU even for which
br_device_event() listens to and processes with br_mtu_auto_adjust()
would lead to selecting the lowest MTU again. Unfortunately, I don't
really see a way to solve that other than walk all ports (which could be
any network device driver) and ask them if they support the new MTU of
that other port, and if so, commit, else rollback. Do you see another way?
-- 
Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports
  2019-11-23 21:47           ` Florian Fainelli
@ 2019-11-23 21:54             ` Vladimir Oltean
  0 siblings, 0 replies; 16+ messages in thread
From: Vladimir Oltean @ 2019-11-23 21:54 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Jakub Kicinski, netdev

On Sat, 23 Nov 2019 at 23:47, Florian Fainelli <f.fainelli@gmail.com> wrote:
> On 11/23/2019 1:29 PM, Vladimir Oltean wrote:
> > On Sat, 23 Nov 2019 at 23:14, Florian Fainelli <f.fainelli@gmail.com> wrote:
> >> On 11/23/2019 12:46 PM, Vladimir Oltean wrote:

[snip]

> >>>> Another thing that I had not gotten around testing was making sure that
> >>>> when a slave_dev gets enslaved as a bridge port member, that bridge MTU
> >>>> normalization would kick in and make sure that if you have say: port 0
> >>>> configured with MTU 1500 and port 1 configured with MTU 9000, the bridge
> >>>> would normalize to MTU 1500 as you would expect.
> >>>>
> >>>
> >>> Nope, that doesn't happen by default, at least in my implementation.
> >>> Is there code in the bridge core for it?
> >>
> >> net/bridge/br_if.c::br_mtu_auto_adjust() takes care of adjusting the
> >> bridge master device's MTU based on the minimum MTU of all ports within
> >> the bridge, but what it seems to be missing is ensuring that if bridge
> >> ports are enslaved, and those bridge ports happen to be part of the same
> >> switch id (similar decision path to setting skb->fwd_offload_mark), then
> >> the bridge port's MTU should also be auto adjusted. mlxsw also supports
> >> changing the MTU, so I am surprised this is not something they fixed
> >> already.
> >>
> >
> > But then how would you even change a bridged interface's MTU? Delete
> > bridge, change MTU of all ports to same value, create bridge again?
>
> I am afraid so, given that the NETDEV_CHANGEMTU even for which
> br_device_event() listens to and processes with br_mtu_auto_adjust()
> would lead to selecting the lowest MTU again. Unfortunately, I don't
> really see a way to solve that other than walk all ports (which could be
> any network device driver) and ask them if they support the new MTU of
> that other port, and if so, commit, else rollback. Do you see another way?
> --
> Florian

Something like that would be preferable. I think it would be
frustrating for the bridge to restore the old MTU right after you
change it. I have no idea how hard it would be to prevent the bridge
from doing this. I'll poke it with a stick and see what happens.

-Vladimir

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports
  2019-11-23 20:46     ` Vladimir Oltean
  2019-11-23 21:14       ` Florian Fainelli
@ 2019-11-24 22:43       ` Andrew Lunn
  1 sibling, 0 replies; 16+ messages in thread
From: Andrew Lunn @ 2019-11-24 22:43 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, Vivien Didelot, David S. Miller,
	Jakub Kicinski, netdev

> Correct. I was actually held back a bit while looking at Andrew's
> patch dc0fe7d47f9f ("net: dsa: Set the master device's MTU to account
> for DSA overheads") where he basically discarded errors, so that's the
> approach I took too (thinking that some DSA masters would not have ops
> for changing or reporting the MTU).

Ignoring errors is deliberate because some master interfaces just
worked without having to set the MTU. I was worried that some that
just worked did not implement MTU changes, so i could not error out.

And my experience when things did not work was mostly the MTU did not
matter, but MRU did. The MAC would send frames with a header, but not
receive them with the header. Setting the MTU actually seems to set
the MRU on most MACs.

But when you are thinking about jumbo frames, i would not ignore the
error. We do need to be sure the master interface can support jumbo,
and big enough jumbo to support the header.

    Andrew

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports
  2019-11-23 19:48 ` [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports Vladimir Oltean
  2019-11-23 20:27   ` Florian Fainelli
@ 2019-11-24 22:48   ` Andrew Lunn
  1 sibling, 0 replies; 16+ messages in thread
From: Andrew Lunn @ 2019-11-24 22:48 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: f.fainelli, vivien.didelot, davem, jakub.kicinski, netdev

> The CPU port is called with an MTU equal to the largest configured MTU
> of the slave ports. The assumption is that the user might want to
> sustain a bidirectional conversation with a partner over any switch
> port.

There is an assumption here that the MTU is enforced before the DSA
header is applied on the egress patch. And MRU is also applied once
the header is removed on ingress.  I've no idea if this is true for
all the switches we support in DSA.

We should clearly document that if the switch need the header to be
included, the function to set the MTU needs to add it.

	  Andrew

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-11-24 22:48 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-23 19:48 [PATCH net-next 0/3] Configure the MTU on DSA switches Vladimir Oltean
2019-11-23 19:48 ` [PATCH net-next 1/3] net: dsa: Configure the MTU for switch ports Vladimir Oltean
2019-11-23 20:27   ` Florian Fainelli
2019-11-23 20:46     ` Vladimir Oltean
2019-11-23 21:14       ` Florian Fainelli
2019-11-23 21:29         ` Vladimir Oltean
2019-11-23 21:47           ` Florian Fainelli
2019-11-23 21:54             ` Vladimir Oltean
2019-11-24 22:43       ` Andrew Lunn
2019-11-23 20:51     ` Vladimir Oltean
2019-11-23 21:09       ` Florian Fainelli
2019-11-24 22:48   ` Andrew Lunn
2019-11-23 19:48 ` [PATCH net-next 2/3] net: dsa: sja1105: Implement the port MTU callbacks Vladimir Oltean
2019-11-23 20:30   ` Florian Fainelli
2019-11-23 20:48     ` Vladimir Oltean
2019-11-23 19:48 ` [PATCH net-next 3/3] net: dsa: vsc73xx: Make the MTU configurable Vladimir Oltean

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).