All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/2] net: dsa: integration with SWITCHDEV for HW bridging
@ 2015-02-17 19:26 Florian Fainelli
  2015-02-17 19:26 ` [PATCH RFC 1/2] net: dsa: integrate " Florian Fainelli
                   ` (2 more replies)
  0 siblings, 3 replies; 65+ messages in thread
From: Florian Fainelli @ 2015-02-17 19:26 UTC (permalink / raw)
  To: netdev
  Cc: davem, Florian Fainelli, vivien.didelot, jerome.oufella, linux,
	andrew, cphealy

Hi all,

This patchset is mostly to demonstrate how we could integrate with SWITCHDEV
to provide HW bridging support for DSA switch drivers that are currently in
tree. Since 'net-next' is currently closed, this gives us some time to work
on the abstraction we want to have in DSA to facilitate our life.

Attached is an example of how this winds up looking like for the bcm_sf2 driver.

Florian Fainelli (2):
  net: dsa: integrate with SWITCHDEV for HW bridging
  net: dsa: bcm_sf2: implement HW bridging operations

 drivers/net/dsa/bcm_sf2.c |  93 ++++++++++++++++++++++++++++++++++++
 include/net/dsa.h         |  10 ++++
 net/dsa/Kconfig           |   1 +
 net/dsa/dsa.c             |   7 +++
 net/dsa/dsa_priv.h        |   2 +
 net/dsa/slave.c           | 117 ++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 230 insertions(+)

-- 
2.1.0

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-17 19:26 [PATCH RFC 0/2] net: dsa: integration with SWITCHDEV for HW bridging Florian Fainelli
@ 2015-02-17 19:26 ` Florian Fainelli
  2015-02-17 20:13   ` Guenter Roeck
                     ` (2 more replies)
  2015-02-17 19:26 ` [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations Florian Fainelli
  2015-02-18  0:48 ` [PATCH RFC 0/2] net: dsa: integration with SWITCHDEV for HW bridging Scott Feldman
  2 siblings, 3 replies; 65+ messages in thread
From: Florian Fainelli @ 2015-02-17 19:26 UTC (permalink / raw)
  To: netdev
  Cc: davem, Florian Fainelli, vivien.didelot, jerome.oufella, linux,
	andrew, cphealy

In order to support bridging offloads in DSA switch drivers, select
NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
NDOs that we are required to implement.

To facilitate the integratation at the DSA driver level, we implement 3
types of operations:

- port_join_bridge
- port_leave_bridge
- port_stp_update

DSA will resolve which switch ports that are currently bridge port
members as some Switch hardware/drivers need to know about that to limit
the register programming to just the relevant registers (especially for
slow MDIO buses).

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 include/net/dsa.h  |  10 +++++
 net/dsa/Kconfig    |   1 +
 net/dsa/dsa.c      |   7 ++++
 net/dsa/dsa_priv.h |   2 +
 net/dsa/slave.c    | 117 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 137 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index ed3c34bbb67a..92be34791963 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -275,6 +275,16 @@ struct dsa_switch_driver {
 	int	(*get_regs_len)(struct dsa_switch *ds, int port);
 	void	(*get_regs)(struct dsa_switch *ds, int port,
 			    struct ethtool_regs *regs, void *p);
+
+	/*
+	 * Bridge integration
+	 */
+	int	(*port_join_bridge)(struct dsa_switch *ds, int port,
+				    u32 br_port_mask);
+	int	(*port_leave_bridge)(struct dsa_switch *ds, int port,
+				     u32 br_port_mask);
+	int	(*port_stp_update)(struct dsa_switch *ds, int port,
+				   u8 state);
 };
 
 void register_switch_driver(struct dsa_switch_driver *type);
diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
index 5f8ac404535b..b45206e8dd3e 100644
--- a/net/dsa/Kconfig
+++ b/net/dsa/Kconfig
@@ -8,6 +8,7 @@ config NET_DSA
 	tristate
 	depends on HAVE_NET_DSA
 	select PHYLIB
+	select NET_SWITCHDEV
 
 if NET_DSA
 
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 2173402d87e0..293e0c3cc445 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -830,6 +830,10 @@ static struct packet_type dsa_pack_type __read_mostly = {
 	.func	= dsa_switch_rcv,
 };
 
+static struct notifier_block dsa_netdevice_nb __read_mostly = {
+	.notifier_call	= dsa_slave_netdevice_event,
+};
+
 #ifdef CONFIG_PM_SLEEP
 static int dsa_suspend(struct device *d)
 {
@@ -888,6 +892,8 @@ static int __init dsa_init_module(void)
 {
 	int rc;
 
+	register_netdevice_notifier(&dsa_netdevice_nb);
+
 	rc = platform_driver_register(&dsa_driver);
 	if (rc)
 		return rc;
@@ -900,6 +906,7 @@ module_init(dsa_init_module);
 
 static void __exit dsa_cleanup_module(void)
 {
+	unregister_netdevice_notifier(&dsa_netdevice_nb);
 	dev_remove_pack(&dsa_pack_type);
 	platform_driver_unregister(&dsa_driver);
 }
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index dc9756d3154c..e6f4301e719e 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -58,6 +58,8 @@ struct net_device *dsa_slave_create(struct dsa_switch *ds,
 				    int port, char *name);
 int dsa_slave_suspend(struct net_device *slave_dev);
 int dsa_slave_resume(struct net_device *slave_dev);
+int dsa_slave_netdevice_event(struct notifier_block *unused,
+			      unsigned long event, void *ptr);
 
 /* tag_dsa.c */
 extern const struct dsa_device_ops dsa_netdev_ops;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index d104ae15836f..ea31b9be320f 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -10,10 +10,12 @@
 
 #include <linux/list.h>
 #include <linux/etherdevice.h>
+#include <linux/netdevice.h>
 #include <linux/phy.h>
 #include <linux/phy_fixed.h>
 #include <linux/of_net.h>
 #include <linux/of_mdio.h>
+#include <net/rtnetlink.h>
 #include "dsa_priv.h"
 
 /* slave mii_bus handling ***************************************************/
@@ -194,6 +196,77 @@ static int dsa_slave_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	return -EOPNOTSUPP;
 }
 
+/* Return a bitmask of all ports being currently bridged. Note that on
+ * leave, the mask will still return the bitmask of ports currently bridged,
+ * prior to port removal, and this is exactly what we want.
+ */
+static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
+{
+	unsigned int port;
+	u32 mask = 0;
+
+	for (port = 0; port < DSA_MAX_PORTS; port++) {
+		if (!((1 << port) & ds->phys_port_mask))
+			continue;
+
+		if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
+			mask |= 1 << port;
+	}
+
+       return mask;
+}
+
+static int dsa_slave_bridge_port_join(struct net_device *dev,
+				      struct net_device *bridge)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+	int ret = -EOPNOTSUPP;
+
+	if (ds->drv->port_join_bridge)
+		ret = ds->drv->port_join_bridge(ds, p->port,
+						dsa_slave_br_port_mask(ds));
+
+	return ret;
+}
+
+static int dsa_slave_bridge_port_leave(struct net_device *dev)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+	int ret = -EOPNOTSUPP;
+
+	if (ds->drv->port_leave_bridge)
+		ret = ds->drv->port_leave_bridge(ds, p->port,
+						 dsa_slave_br_port_mask(ds));
+
+	return ret;
+}
+
+static int dsa_slave_parent_id_get(struct net_device *dev,
+				   struct netdev_phys_item_id *psid)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+
+	psid->id_len = sizeof(ds->index);
+	memcpy(&psid->id, &ds->index, psid->id_len);
+
+	return 0;
+}
+
+static int dsa_slave_stp_update(struct net_device *dev, u8 state)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+	int ret = -EOPNOTSUPP;
+
+	if (ds->drv->port_stp_update)
+		ret = ds->drv->port_stp_update(ds, p->port, state);
+
+	return ret;
+}
+
 static netdev_tx_t dsa_slave_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct dsa_slave_priv *p = netdev_priv(dev);
@@ -470,6 +543,8 @@ static const struct net_device_ops dsa_slave_netdev_ops = {
 	.ndo_set_rx_mode	= dsa_slave_set_rx_mode,
 	.ndo_set_mac_address	= dsa_slave_set_mac_address,
 	.ndo_do_ioctl		= dsa_slave_ioctl,
+	.ndo_switch_parent_id_get = dsa_slave_parent_id_get,
+	.ndo_switch_port_stp_update = dsa_slave_stp_update,
 };
 
 static void dsa_slave_adjust_link(struct net_device *dev)
@@ -678,3 +753,45 @@ dsa_slave_create(struct dsa_switch *ds, struct device *parent,
 
 	return slave_dev;
 }
+
+static bool dsa_slave_dev_check(struct net_device *dev)
+{
+	return dev->netdev_ops == &dsa_slave_netdev_ops;
+}
+
+static int dsa_slave_master_changed(struct net_device *dev)
+{
+	struct net_device *master = netdev_master_upper_dev_get(dev);
+	int err = 0;
+
+	if (master && master->rtnl_link_ops &&
+	    !strcmp(master->rtnl_link_ops->kind, "bridge"))
+		err = dsa_slave_bridge_port_join(dev, master);
+	else
+		err = dsa_slave_bridge_port_leave(dev);
+
+	return err;
+}
+
+int dsa_slave_netdevice_event(struct notifier_block *unused,
+			      unsigned long event, void *ptr)
+{
+	struct netdevice_dev *dev;
+	int err = 0;
+
+	switch (event) {
+	case NETDEV_CHANGEUPPER:
+		dev = netdev_notifier_info_to_dev(ptr);
+		if (!dsa_slave_dev_check(dev))
+			goto out;
+
+		err = dsa_slave_master_changed(dev);
+		if (err)
+			netdev_warn(dev, "failed to reflect master change\n");
+
+		break;
+	}
+
+out:
+	return NOTIFY_DONE;
+}
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-17 19:26 [PATCH RFC 0/2] net: dsa: integration with SWITCHDEV for HW bridging Florian Fainelli
  2015-02-17 19:26 ` [PATCH RFC 1/2] net: dsa: integrate " Florian Fainelli
@ 2015-02-17 19:26 ` Florian Fainelli
  2015-02-19  2:48   ` Florian Fainelli
  2015-02-18  0:48 ` [PATCH RFC 0/2] net: dsa: integration with SWITCHDEV for HW bridging Scott Feldman
  2 siblings, 1 reply; 65+ messages in thread
From: Florian Fainelli @ 2015-02-17 19:26 UTC (permalink / raw)
  To: netdev
  Cc: davem, Florian Fainelli, vivien.didelot, jerome.oufella, linux,
	andrew, cphealy

Update the Broadcom Starfighter 2 switch driver to implement the
join/leave/stp_update callbacks required for basic hardware bridging
support.

There is not much to be done at the driver level but translating the
STP state from Linux to their HW values.

Joining a bridge means that the joining port and the other port members
need to be in the same VLAN membership as the CPU, while leaving the
bridge puts the port back into a separate VLAN membership with only the
CPU.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/dsa/bcm_sf2.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 4daffb284931..006c86a4df54 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -23,6 +23,7 @@
 #include <linux/of_address.h>
 #include <net/dsa.h>
 #include <linux/ethtool.h>
+#include <linux/if_bridge.h>
 
 #include "bcm_sf2.h"
 #include "bcm_sf2_regs.h"
@@ -400,6 +401,95 @@ static int bcm_sf2_sw_set_eee(struct dsa_switch *ds, int port,
 	return 0;
 }
 
+static int bcm_sf2_sw_br_join(struct dsa_switch *ds, int port,
+			      u32 br_port_mask)
+{
+	struct bcm_sf2_priv *priv = ds_to_priv(ds);
+	unsigned int i;
+	u32 reg, p_ctl;
+
+	p_ctl = core_readl(priv, CORE_PORT_VLAN_CTL_PORT(port));
+
+	for (i = 0; i < priv->hw_params.num_ports; i++) {
+		if (!((1 << i) & br_port_mask))
+			continue;
+
+		reg = core_readl(priv, CORE_PORT_VLAN_CTL_PORT(i));
+		reg |= 1 << port;
+		core_writel(priv, reg, CORE_PORT_VLAN_CTL_PORT(i));
+
+		p_ctl |= 1 << i;
+	}
+
+	core_writel(priv, p_ctl, CORE_PORT_VLAN_CTL_PORT(port));
+
+	return 0;
+}
+
+static int bcm_sf2_sw_br_leave(struct dsa_switch *ds, int port,
+			       u32 br_port_mask)
+{
+	struct bcm_sf2_priv *priv = ds_to_priv(ds);
+	unsigned int i;
+	u32 reg, p_ctl;
+
+	p_ctl = core_readl(priv, CORE_PORT_VLAN_CTL_PORT(port));
+
+	for (i = 0; i < priv->hw_params.num_ports; i++) {
+		/* Don't touch the remaining ports */
+		if (!((1 << i) & br_port_mask))
+			continue;
+
+		reg = core_readl(priv, CORE_PORT_VLAN_CTL_PORT(i));
+		reg &= ~(1 << port);
+		core_writel(priv, reg, CORE_PORT_VLAN_CTL_PORT(i));
+
+		/* Prevent self removal to preserve isolation */
+		if (port != i)
+			p_ctl &= ~(1 << i);
+	}
+
+	core_writel(priv, p_ctl, CORE_PORT_VLAN_CTL_PORT(port));
+
+	return 0;
+}
+
+static int bcm_sf2_sw_br_stp_update(struct dsa_switch *ds, int port,
+				    u8 state)
+{
+	struct bcm_sf2_priv *priv = ds_to_priv(ds);
+	u32 reg;
+	u8 hw_state;
+
+	switch (state) {
+	case BR_STATE_DISABLED:
+		hw_state = G_MISTP_DIS_STATE;
+		break;
+	case BR_STATE_LISTENING:
+		hw_state = G_MISTP_LISTEN_STATE;
+		break;
+	case BR_STATE_LEARNING:
+		hw_state = G_MISTP_LEARN_STATE;
+		break;
+	case BR_STATE_FORWARDING:
+		hw_state = G_MISTP_FWD_STATE;
+		break;
+	case BR_STATE_BLOCKING:
+		hw_state = G_MISTP_BLOCK_STATE;
+		break;
+	default:
+		pr_err("%s: invalid STP state: %d\n", __func__, state);
+		return -EINVAL;
+	}
+
+	reg = core_readl(priv, CORE_G_PCTL_PORT(port));
+	reg &= ~(G_MISTP_STATE_MASK << G_MISTP_STATE_SHIFT);
+	reg |= hw_state;
+	core_writel(priv, reg, CORE_G_PCTL_PORT(port));
+
+	return 0;
+}
+
 static irqreturn_t bcm_sf2_switch_0_isr(int irq, void *dev_id)
 {
 	struct bcm_sf2_priv *priv = dev_id;
@@ -916,6 +1006,9 @@ static struct dsa_switch_driver bcm_sf2_switch_driver = {
 	.port_disable		= bcm_sf2_port_disable,
 	.get_eee		= bcm_sf2_sw_get_eee,
 	.set_eee		= bcm_sf2_sw_set_eee,
+	.port_join_bridge	= bcm_sf2_sw_br_join,
+	.port_leave_bridge	= bcm_sf2_sw_br_leave,
+	.port_stp_update	= bcm_sf2_sw_br_stp_update,
 };
 
 static int __init bcm_sf2_init(void)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-17 19:26 ` [PATCH RFC 1/2] net: dsa: integrate " Florian Fainelli
@ 2015-02-17 20:13   ` Guenter Roeck
  2015-02-17 20:24     ` Florian Fainelli
  2015-02-18  1:19   ` Andrew Lunn
  2015-02-23  2:20   ` Guenter Roeck
  2 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-17 20:13 UTC (permalink / raw)
  To: Florian Fainelli, netdev
  Cc: davem, vivien.didelot, jerome.oufella, andrew, cphealy

On 02/17/2015 11:26 AM, Florian Fainelli wrote:
> In order to support bridging offloads in DSA switch drivers, select
> NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
> NDOs that we are required to implement.
>
> To facilitate the integratation at the DSA driver level, we implement 3
> types of operations:
>
> - port_join_bridge
> - port_leave_bridge
> - port_stp_update
>
> DSA will resolve which switch ports that are currently bridge port
> members as some Switch hardware/drivers need to know about that to limit
> the register programming to just the relevant registers (especially for
> slow MDIO buses).
>
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
>   include/net/dsa.h  |  10 +++++
>   net/dsa/Kconfig    |   1 +
>   net/dsa/dsa.c      |   7 ++++
>   net/dsa/dsa_priv.h |   2 +
>   net/dsa/slave.c    | 117 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>   5 files changed, 137 insertions(+)
>
> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index ed3c34bbb67a..92be34791963 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -275,6 +275,16 @@ struct dsa_switch_driver {
>   	int	(*get_regs_len)(struct dsa_switch *ds, int port);
>   	void	(*get_regs)(struct dsa_switch *ds, int port,
>   			    struct ethtool_regs *regs, void *p);
> +
> +	/*
> +	 * Bridge integration
> +	 */
> +	int	(*port_join_bridge)(struct dsa_switch *ds, int port,
> +				    u32 br_port_mask);
> +	int	(*port_leave_bridge)(struct dsa_switch *ds, int port,
> +				     u32 br_port_mask);
> +	int	(*port_stp_update)(struct dsa_switch *ds, int port,
> +				   u8 state);
>   };
>
>   void register_switch_driver(struct dsa_switch_driver *type);
> diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
> index 5f8ac404535b..b45206e8dd3e 100644
> --- a/net/dsa/Kconfig
> +++ b/net/dsa/Kconfig
> @@ -8,6 +8,7 @@ config NET_DSA
>   	tristate
>   	depends on HAVE_NET_DSA
>   	select PHYLIB
> +	select NET_SWITCHDEV

Should this be "select" or "depends on" ?

Downside of depends is that we'll need some ifdefs in the code,
but on the other side it would let people disable it if it is
not needed.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-17 20:13   ` Guenter Roeck
@ 2015-02-17 20:24     ` Florian Fainelli
  2015-02-17 20:28       ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Florian Fainelli @ 2015-02-17 20:24 UTC (permalink / raw)
  To: Guenter Roeck, netdev
  Cc: davem, vivien.didelot, jerome.oufella, andrew, cphealy

On 17/02/15 12:13, Guenter Roeck wrote:
> On 02/17/2015 11:26 AM, Florian Fainelli wrote:
>> In order to support bridging offloads in DSA switch drivers, select
>> NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
>> NDOs that we are required to implement.
>>
>> To facilitate the integratation at the DSA driver level, we implement 3
>> types of operations:
>>
>> - port_join_bridge
>> - port_leave_bridge
>> - port_stp_update
>>
>> DSA will resolve which switch ports that are currently bridge port
>> members as some Switch hardware/drivers need to know about that to limit
>> the register programming to just the relevant registers (especially for
>> slow MDIO buses).
>>
>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>> ---
>>   include/net/dsa.h  |  10 +++++
>>   net/dsa/Kconfig    |   1 +
>>   net/dsa/dsa.c      |   7 ++++
>>   net/dsa/dsa_priv.h |   2 +
>>   net/dsa/slave.c    | 117
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   5 files changed, 137 insertions(+)
>>
>> diff --git a/include/net/dsa.h b/include/net/dsa.h
>> index ed3c34bbb67a..92be34791963 100644
>> --- a/include/net/dsa.h
>> +++ b/include/net/dsa.h
>> @@ -275,6 +275,16 @@ struct dsa_switch_driver {
>>       int    (*get_regs_len)(struct dsa_switch *ds, int port);
>>       void    (*get_regs)(struct dsa_switch *ds, int port,
>>                   struct ethtool_regs *regs, void *p);
>> +
>> +    /*
>> +     * Bridge integration
>> +     */
>> +    int    (*port_join_bridge)(struct dsa_switch *ds, int port,
>> +                    u32 br_port_mask);
>> +    int    (*port_leave_bridge)(struct dsa_switch *ds, int port,
>> +                     u32 br_port_mask);
>> +    int    (*port_stp_update)(struct dsa_switch *ds, int port,
>> +                   u8 state);
>>   };
>>
>>   void register_switch_driver(struct dsa_switch_driver *type);
>> diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
>> index 5f8ac404535b..b45206e8dd3e 100644
>> --- a/net/dsa/Kconfig
>> +++ b/net/dsa/Kconfig
>> @@ -8,6 +8,7 @@ config NET_DSA
>>       tristate
>>       depends on HAVE_NET_DSA
>>       select PHYLIB
>> +    select NET_SWITCHDEV
> 
> Should this be "select" or "depends on" ?
> 
> Downside of depends is that we'll need some ifdefs in the code,
> but on the other side it would let people disable it if it is
> not needed.

The code overhead is not huge, and I would think that by enforcing
NET_SWITCHDEV we encourage better DSA driver practices and promote HW
bridging, if you think this should be made conditional, I guess we can
do that.
-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-17 20:24     ` Florian Fainelli
@ 2015-02-17 20:28       ` Guenter Roeck
  0 siblings, 0 replies; 65+ messages in thread
From: Guenter Roeck @ 2015-02-17 20:28 UTC (permalink / raw)
  To: Florian Fainelli, netdev
  Cc: davem, vivien.didelot, jerome.oufella, andrew, cphealy

On 02/17/2015 12:24 PM, Florian Fainelli wrote:
> On 17/02/15 12:13, Guenter Roeck wrote:
>> On 02/17/2015 11:26 AM, Florian Fainelli wrote:
>>> In order to support bridging offloads in DSA switch drivers, select
>>> NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
>>> NDOs that we are required to implement.
>>>
>>> To facilitate the integratation at the DSA driver level, we implement 3
>>> types of operations:
>>>
>>> - port_join_bridge
>>> - port_leave_bridge
>>> - port_stp_update
>>>
>>> DSA will resolve which switch ports that are currently bridge port
>>> members as some Switch hardware/drivers need to know about that to limit
>>> the register programming to just the relevant registers (especially for
>>> slow MDIO buses).
>>>
>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>>> ---
>>>    include/net/dsa.h  |  10 +++++
>>>    net/dsa/Kconfig    |   1 +
>>>    net/dsa/dsa.c      |   7 ++++
>>>    net/dsa/dsa_priv.h |   2 +
>>>    net/dsa/slave.c    | 117
>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>    5 files changed, 137 insertions(+)
>>>
>>> diff --git a/include/net/dsa.h b/include/net/dsa.h
>>> index ed3c34bbb67a..92be34791963 100644
>>> --- a/include/net/dsa.h
>>> +++ b/include/net/dsa.h
>>> @@ -275,6 +275,16 @@ struct dsa_switch_driver {
>>>        int    (*get_regs_len)(struct dsa_switch *ds, int port);
>>>        void    (*get_regs)(struct dsa_switch *ds, int port,
>>>                    struct ethtool_regs *regs, void *p);
>>> +
>>> +    /*
>>> +     * Bridge integration
>>> +     */
>>> +    int    (*port_join_bridge)(struct dsa_switch *ds, int port,
>>> +                    u32 br_port_mask);
>>> +    int    (*port_leave_bridge)(struct dsa_switch *ds, int port,
>>> +                     u32 br_port_mask);
>>> +    int    (*port_stp_update)(struct dsa_switch *ds, int port,
>>> +                   u8 state);
>>>    };
>>>
>>>    void register_switch_driver(struct dsa_switch_driver *type);
>>> diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
>>> index 5f8ac404535b..b45206e8dd3e 100644
>>> --- a/net/dsa/Kconfig
>>> +++ b/net/dsa/Kconfig
>>> @@ -8,6 +8,7 @@ config NET_DSA
>>>        tristate
>>>        depends on HAVE_NET_DSA
>>>        select PHYLIB
>>> +    select NET_SWITCHDEV
>>
>> Should this be "select" or "depends on" ?
>>
>> Downside of depends is that we'll need some ifdefs in the code,
>> but on the other side it would let people disable it if it is
>> not needed.
>
> The code overhead is not huge, and I would think that by enforcing
> NET_SWITCHDEV we encourage better DSA driver practices and promote HW
> bridging, if you think this should be made conditional, I guess we can
> do that.
>

For sure not me ... I am happy forcing it. In my use case it would
always be enabled. I just don't want to go too far along that route
just to have to add a bunch of ifdefs later on.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 0/2] net: dsa: integration with SWITCHDEV for HW bridging
  2015-02-17 19:26 [PATCH RFC 0/2] net: dsa: integration with SWITCHDEV for HW bridging Florian Fainelli
  2015-02-17 19:26 ` [PATCH RFC 1/2] net: dsa: integrate " Florian Fainelli
  2015-02-17 19:26 ` [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations Florian Fainelli
@ 2015-02-18  0:48 ` Scott Feldman
  2015-02-18  1:09   ` Florian Fainelli
  2 siblings, 1 reply; 65+ messages in thread
From: Scott Feldman @ 2015-02-18  0:48 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Netdev, David S. Miller, vivien.didelot, jerome.oufella, linux,
	andrew, cphealy

On Tue, Feb 17, 2015 at 2:26 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
> Hi all,
>
> This patchset is mostly to demonstrate how we could integrate with SWITCHDEV
> to provide HW bridging support for DSA switch drivers that are currently in
> tree. Since 'net-next' is currently closed, this gives us some time to work
> on the abstraction we want to have in DSA to facilitate our life.
>
> Attached is an example of how this winds up looking like for the bcm_sf2 driver.

It looks good Florian.

Does bcm_sf2 do mac/vlan learning?

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 0/2] net: dsa: integration with SWITCHDEV for HW bridging
  2015-02-18  0:48 ` [PATCH RFC 0/2] net: dsa: integration with SWITCHDEV for HW bridging Scott Feldman
@ 2015-02-18  1:09   ` Florian Fainelli
  0 siblings, 0 replies; 65+ messages in thread
From: Florian Fainelli @ 2015-02-18  1:09 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Netdev, David S. Miller, vivien.didelot, jerome.oufella, linux,
	andrew, cphealy

On 17/02/15 16:48, Scott Feldman wrote:
> On Tue, Feb 17, 2015 at 2:26 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
>> Hi all,
>>
>> This patchset is mostly to demonstrate how we could integrate with SWITCHDEV
>> to provide HW bridging support for DSA switch drivers that are currently in
>> tree. Since 'net-next' is currently closed, this gives us some time to work
>> on the abstraction we want to have in DSA to facilitate our life.
>>
>> Attached is an example of how this winds up looking like for the bcm_sf2 driver.
> 
> It looks good Florian.
> 
> Does bcm_sf2 do mac/vlan learning?
> 

Yes, it can do that, that's next on my list of things to get working ;)
-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-17 19:26 ` [PATCH RFC 1/2] net: dsa: integrate " Florian Fainelli
  2015-02-17 20:13   ` Guenter Roeck
@ 2015-02-18  1:19   ` Andrew Lunn
  2015-02-18  1:43     ` Florian Fainelli
  2015-02-18  3:53     ` Guenter Roeck
  2015-02-23  2:20   ` Guenter Roeck
  2 siblings, 2 replies; 65+ messages in thread
From: Andrew Lunn @ 2015-02-18  1:19 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, davem, vivien.didelot, jerome.oufella, linux, cphealy

> +/* Return a bitmask of all ports being currently bridged. Note that on
> + * leave, the mask will still return the bitmask of ports currently bridged,
> + * prior to port removal, and this is exactly what we want.
> + */
> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
> +{
> +	unsigned int port;
> +	u32 mask = 0;
> +
> +	for (port = 0; port < DSA_MAX_PORTS; port++) {
> +		if (!((1 << port) & ds->phys_port_mask))
> +			continue;
> +
> +		if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
> +			mask |= 1 << port;
> +	}
> +
> +       return mask;
> +}
> +
> +static int dsa_slave_bridge_port_join(struct net_device *dev,
> +				      struct net_device *bridge)
> +{
> +	struct dsa_slave_priv *p = netdev_priv(dev);
> +	struct dsa_switch *ds = p->parent;
> +	int ret = -EOPNOTSUPP;
> +
> +	if (ds->drv->port_join_bridge)
> +		ret = ds->drv->port_join_bridge(ds, p->port,
> +						dsa_slave_br_port_mask(ds));

Hi Florian

Shouldn't this bridge port mask also be dependent on bridge?

I'm thinking of cases like

brctl addbr br0
brctl addif br0 lan0
brctl addif br0 lan1

brctl addbr br1
brctl addif br1 lan2
brctl addif br1 lan3

We have two software bridges, so need two masks.  It does look like
your hardware and the Marvell hardware supports this, disjoint sets of
bridged ports.  But with the current implementation, your going to end
up with one hardware bridge with four ports, and broken STP.

       Andrew

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-18  1:19   ` Andrew Lunn
@ 2015-02-18  1:43     ` Florian Fainelli
  2015-02-18  3:53     ` Guenter Roeck
  1 sibling, 0 replies; 65+ messages in thread
From: Florian Fainelli @ 2015-02-18  1:43 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, davem, vivien.didelot, jerome.oufella, linux, cphealy,
	Scott Feldman, Jiri Pirko

On 17/02/15 17:19, Andrew Lunn wrote:
>> +/* Return a bitmask of all ports being currently bridged. Note that on
>> + * leave, the mask will still return the bitmask of ports currently bridged,
>> + * prior to port removal, and this is exactly what we want.
>> + */
>> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>> +{
>> +	unsigned int port;
>> +	u32 mask = 0;
>> +
>> +	for (port = 0; port < DSA_MAX_PORTS; port++) {
>> +		if (!((1 << port) & ds->phys_port_mask))
>> +			continue;
>> +
>> +		if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>> +			mask |= 1 << port;
>> +	}
>> +
>> +       return mask;
>> +}
>> +
>> +static int dsa_slave_bridge_port_join(struct net_device *dev,
>> +				      struct net_device *bridge)
>> +{
>> +	struct dsa_slave_priv *p = netdev_priv(dev);
>> +	struct dsa_switch *ds = p->parent;
>> +	int ret = -EOPNOTSUPP;
>> +
>> +	if (ds->drv->port_join_bridge)
>> +		ret = ds->drv->port_join_bridge(ds, p->port,
>> +						dsa_slave_br_port_mask(ds));
> 
> Hi Florian
> 
> Shouldn't this bridge port mask also be dependent on bridge?

Yes, you are very right, thankfully this is a RFC patch because of that ;)

> 
> I'm thinking of cases like
> 
> brctl addbr br0
> brctl addif br0 lan0
> brctl addif br0 lan1
> 
> brctl addbr br1
> brctl addif br1 lan2

mask = 0x4 | 0x2 | 0x1 # FAIL

> brctl addif br1 lan3

mask = 0x8 | 0x4 | 0x2 | 0x1 # FAIL

> 
> We have two software bridges, so need two masks.  It does look like
> your hardware and the Marvell hardware supports this, disjoint sets of
> bridged ports.  But with the current implementation, your going to end
> up with one hardware bridge with four ports, and broken STP.

Yep, I will rework that patch set to address that, and I can actually
test that, which is even better, thanks!
-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-18  1:19   ` Andrew Lunn
  2015-02-18  1:43     ` Florian Fainelli
@ 2015-02-18  3:53     ` Guenter Roeck
  2015-02-18  4:53       ` Florian Fainelli
  1 sibling, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-18  3:53 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli
  Cc: netdev, davem, vivien.didelot, jerome.oufella, cphealy

On 02/17/2015 05:19 PM, Andrew Lunn wrote:
>> +/* Return a bitmask of all ports being currently bridged. Note that on
>> + * leave, the mask will still return the bitmask of ports currently bridged,
>> + * prior to port removal, and this is exactly what we want.
>> + */
>> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>> +{
>> +	unsigned int port;
>> +	u32 mask = 0;
>> +
>> +	for (port = 0; port < DSA_MAX_PORTS; port++) {
>> +		if (!((1 << port) & ds->phys_port_mask))
>> +			continue;
>> +
>> +		if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>> +			mask |= 1 << port;
>> +	}
>> +
>> +       return mask;
>> +}
>> +
>> +static int dsa_slave_bridge_port_join(struct net_device *dev,
>> +				      struct net_device *bridge)
>> +{
>> +	struct dsa_slave_priv *p = netdev_priv(dev);
>> +	struct dsa_switch *ds = p->parent;
>> +	int ret = -EOPNOTSUPP;
>> +
>> +	if (ds->drv->port_join_bridge)
>> +		ret = ds->drv->port_join_bridge(ds, p->port,
>> +						dsa_slave_br_port_mask(ds));
>
> Hi Florian
>
> Shouldn't this bridge port mask also be dependent on bridge?
>
> I'm thinking of cases like
>
> brctl addbr br0
> brctl addif br0 lan0
> brctl addif br0 lan1
>
> brctl addbr br1
> brctl addif br1 lan2
> brctl addif br1 lan3
>
> We have two software bridges, so need two masks.  It does look like
> your hardware and the Marvell hardware supports this, disjoint sets of
> bridged ports.  But with the current implementation, your going to end
> up with one hardware bridge with four ports, and broken STP.
>

Same problem here.

With the code above I see the ports which are part of "a" bridge group,
but I can not determine which group the port is supposed to join.

I don't really see the value in providing the port mask to
port_join_bridge, but maybe I am missing something. In my
implementation I just passed on bridge * and used it to
determine which port belongs to which bridge group.
It doesn't have to be that, but maybe we can use the bridge
ifindex or anything else that lets me determine which bridge
group the port belongs to.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-18  3:53     ` Guenter Roeck
@ 2015-02-18  4:53       ` Florian Fainelli
  2015-02-18  6:14         ` Guenter Roeck
  2015-02-18 13:49         ` Andrew Lunn
  0 siblings, 2 replies; 65+ messages in thread
From: Florian Fainelli @ 2015-02-18  4:53 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

2015-02-17 19:53 GMT-08:00 Guenter Roeck <linux@roeck-us.net>:
> On 02/17/2015 05:19 PM, Andrew Lunn wrote:
>>>
>>> +/* Return a bitmask of all ports being currently bridged. Note that on
>>> + * leave, the mask will still return the bitmask of ports currently
>>> bridged,
>>> + * prior to port removal, and this is exactly what we want.
>>> + */
>>> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>>> +{
>>> +       unsigned int port;
>>> +       u32 mask = 0;
>>> +
>>> +       for (port = 0; port < DSA_MAX_PORTS; port++) {
>>> +               if (!((1 << port) & ds->phys_port_mask))
>>> +                       continue;
>>> +
>>> +               if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>>> +                       mask |= 1 << port;
>>> +       }
>>> +
>>> +       return mask;
>>> +}
>>> +
>>> +static int dsa_slave_bridge_port_join(struct net_device *dev,
>>> +                                     struct net_device *bridge)
>>> +{
>>> +       struct dsa_slave_priv *p = netdev_priv(dev);
>>> +       struct dsa_switch *ds = p->parent;
>>> +       int ret = -EOPNOTSUPP;
>>> +
>>> +       if (ds->drv->port_join_bridge)
>>> +               ret = ds->drv->port_join_bridge(ds, p->port,
>>> +
>>> dsa_slave_br_port_mask(ds));
>>
>>
>> Hi Florian
>>
>> Shouldn't this bridge port mask also be dependent on bridge?
>>
>> I'm thinking of cases like
>>
>> brctl addbr br0
>> brctl addif br0 lan0
>> brctl addif br0 lan1
>>
>> brctl addbr br1
>> brctl addif br1 lan2
>> brctl addif br1 lan3
>>
>> We have two software bridges, so need two masks.  It does look like
>> your hardware and the Marvell hardware supports this, disjoint sets of
>> bridged ports.  But with the current implementation, your going to end
>> up with one hardware bridge with four ports, and broken STP.
>>
>
> Same problem here.
>
> With the code above I see the ports which are part of "a" bridge group,
> but I can not determine which group the port is supposed to join.

I have put a v2 here which addresses that by retaining which bridge
the port was added to and comparing that against the bridge net_device
we want to join:

https://github.com/ffainelli/linux/tree/dsa-hw-bridge-v2

>
> I don't really see the value in providing the port mask to
> port_join_bridge, but maybe I am missing something. In my
> implementation I just passed on bridge * and used it to
> determine which port belongs to which bridge group.
> It doesn't have to be that, but maybe we can use the bridge
> ifindex or anything else that lets me determine which bridge
> group the port belongs to.

As I described in the cover letter, I tend to think that resolving
this bitmask is part of the abstraction DSA should provide to its
driver, maybe it is more future proof and better to give access to the
bridge net_device pointer such that drivers can figure out the bridge
details themselves. With that in mind, maybe we can do something like
this:

- add the bridge net_device pointer to the join/leave callbacks
- provide a helper like dsa_slave_br_port_mask that drivers use or not
-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-18  4:53       ` Florian Fainelli
@ 2015-02-18  6:14         ` Guenter Roeck
  2015-02-18 13:49         ` Andrew Lunn
  1 sibling, 0 replies; 65+ messages in thread
From: Guenter Roeck @ 2015-02-18  6:14 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

On 02/17/2015 08:53 PM, Florian Fainelli wrote:
> 2015-02-17 19:53 GMT-08:00 Guenter Roeck <linux@roeck-us.net>:
[...]
>
> I have put a v2 here which addresses that by retaining which bridge
> the port was added to and comparing that against the bridge net_device
> we want to join:
>
> https://github.com/ffainelli/linux/tree/dsa-hw-bridge-v2
>
>>
>> I don't really see the value in providing the port mask to
>> port_join_bridge, but maybe I am missing something. In my
>> implementation I just passed on bridge * and used it to
>> determine which port belongs to which bridge group.
>> It doesn't have to be that, but maybe we can use the bridge
>> ifindex or anything else that lets me determine which bridge
>> group the port belongs to.
>
> As I described in the cover letter, I tend to think that resolving
> this bitmask is part of the abstraction DSA should provide to its
> driver, maybe it is more future proof and better to give access to the
> bridge net_device pointer such that drivers can figure out the bridge
> details themselves. With that in mind, maybe we can do something like
> this:
>
> - add the bridge net_device pointer to the join/leave callbacks
> - provide a helper like dsa_slave_br_port_mask that drivers use or not
>
Not sure if that is needed.

At first glance v2 should do it. I only use the bridge pointer
to create a set of bit masks, one for each bridge group. This is
pretty much what you do now. So I should have the information I need,
and it should actually simplify my code to some degree.

Unfortunately I won't be able to work on this for the next few days,
since I'll be at the Linux Collaboration Summit in Santa Rosa.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-18  4:53       ` Florian Fainelli
  2015-02-18  6:14         ` Guenter Roeck
@ 2015-02-18 13:49         ` Andrew Lunn
  2015-02-18 14:05           ` Guenter Roeck
  1 sibling, 1 reply; 65+ messages in thread
From: Andrew Lunn @ 2015-02-18 13:49 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Guenter Roeck, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

> I have put a v2 here which addresses that by retaining which bridge
> the port was added to and comparing that against the bridge net_device
> we want to join:
> 
> https://github.com/ffainelli/linux/tree/dsa-hw-bridge-v2

Hi Florian

This looks O.K.

If we ever get a switch which does not allow disjoint sets of ports,
it will need some changes, but lets not over engineer it now.

   Andrew

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-18 13:49         ` Andrew Lunn
@ 2015-02-18 14:05           ` Guenter Roeck
  0 siblings, 0 replies; 65+ messages in thread
From: Guenter Roeck @ 2015-02-18 14:05 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli
  Cc: netdev, David Miller, Vivien Didelot, jerome.oufella, Chris Healy

On 02/18/2015 05:49 AM, Andrew Lunn wrote:
>> I have put a v2 here which addresses that by retaining which bridge
>> the port was added to and comparing that against the bridge net_device
>> we want to join:
>>
>> https://github.com/ffainelli/linux/tree/dsa-hw-bridge-v2
>
> Hi Florian
>
> This looks O.K.
>
> If we ever get a switch which does not allow disjoint sets of ports,
> it will need some changes, but lets not over engineer it now.
>

Good point. The driver for that chip would have to detect the situation,
maybe by storing the bridge mask, and reject to configure the port
if there is a bridge mask mismatch. So I think this could work with the
current API.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-17 19:26 ` [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations Florian Fainelli
@ 2015-02-19  2:48   ` Florian Fainelli
  2015-02-19  5:59     ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Florian Fainelli @ 2015-02-19  2:48 UTC (permalink / raw)
  To: netdev; +Cc: davem, vivien.didelot, jerome.oufella, linux, andrew, cphealy

On 17/02/15 11:26, Florian Fainelli wrote:
> Update the Broadcom Starfighter 2 switch driver to implement the
> join/leave/stp_update callbacks required for basic hardware bridging
> support.
> 
> There is not much to be done at the driver level but translating the
> STP state from Linux to their HW values.
> 
> Joining a bridge means that the joining port and the other port members
> need to be in the same VLAN membership as the CPU, while leaving the
> bridge puts the port back into a separate VLAN membership with only the
> CPU.

I found a couple additional issues while testing:

- manipulating UP/DOWN state of interfaces that are part of a bridge
would not restore their bridge membership

- removing an interface from a bridge and bringing it back up would
leave it in blocked state

> 
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
>  drivers/net/dsa/bcm_sf2.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 93 insertions(+)
> 
> diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
> index 4daffb284931..006c86a4df54 100644
> --- a/drivers/net/dsa/bcm_sf2.c
> +++ b/drivers/net/dsa/bcm_sf2.c
> @@ -23,6 +23,7 @@
>  #include <linux/of_address.h>
>  #include <net/dsa.h>
>  #include <linux/ethtool.h>
> +#include <linux/if_bridge.h>
>  
>  #include "bcm_sf2.h"
>  #include "bcm_sf2_regs.h"
> @@ -400,6 +401,95 @@ static int bcm_sf2_sw_set_eee(struct dsa_switch *ds, int port,
>  	return 0;
>  }
>  
> +static int bcm_sf2_sw_br_join(struct dsa_switch *ds, int port,
> +			      u32 br_port_mask)
> +{
> +	struct bcm_sf2_priv *priv = ds_to_priv(ds);
> +	unsigned int i;
> +	u32 reg, p_ctl;
> +
> +	p_ctl = core_readl(priv, CORE_PORT_VLAN_CTL_PORT(port));
> +
> +	for (i = 0; i < priv->hw_params.num_ports; i++) {
> +		if (!((1 << i) & br_port_mask))
> +			continue;
> +
> +		reg = core_readl(priv, CORE_PORT_VLAN_CTL_PORT(i));
> +		reg |= 1 << port;
> +		core_writel(priv, reg, CORE_PORT_VLAN_CTL_PORT(i));
> +
> +		p_ctl |= 1 << i;
> +	}
> +
> +	core_writel(priv, p_ctl, CORE_PORT_VLAN_CTL_PORT(port));
> +
> +	return 0;
> +}
> +
> +static int bcm_sf2_sw_br_leave(struct dsa_switch *ds, int port,
> +			       u32 br_port_mask)
> +{
> +	struct bcm_sf2_priv *priv = ds_to_priv(ds);
> +	unsigned int i;
> +	u32 reg, p_ctl;
> +
> +	p_ctl = core_readl(priv, CORE_PORT_VLAN_CTL_PORT(port));
> +
> +	for (i = 0; i < priv->hw_params.num_ports; i++) {
> +		/* Don't touch the remaining ports */
> +		if (!((1 << i) & br_port_mask))
> +			continue;
> +
> +		reg = core_readl(priv, CORE_PORT_VLAN_CTL_PORT(i));
> +		reg &= ~(1 << port);
> +		core_writel(priv, reg, CORE_PORT_VLAN_CTL_PORT(i));
> +
> +		/* Prevent self removal to preserve isolation */
> +		if (port != i)
> +			p_ctl &= ~(1 << i);
> +	}
> +
> +	core_writel(priv, p_ctl, CORE_PORT_VLAN_CTL_PORT(port));
> +
> +	return 0;
> +}
> +
> +static int bcm_sf2_sw_br_stp_update(struct dsa_switch *ds, int port,
> +				    u8 state)
> +{
> +	struct bcm_sf2_priv *priv = ds_to_priv(ds);
> +	u32 reg;
> +	u8 hw_state;
> +
> +	switch (state) {
> +	case BR_STATE_DISABLED:
> +		hw_state = G_MISTP_DIS_STATE;
> +		break;
> +	case BR_STATE_LISTENING:
> +		hw_state = G_MISTP_LISTEN_STATE;
> +		break;
> +	case BR_STATE_LEARNING:
> +		hw_state = G_MISTP_LEARN_STATE;
> +		break;
> +	case BR_STATE_FORWARDING:
> +		hw_state = G_MISTP_FWD_STATE;
> +		break;
> +	case BR_STATE_BLOCKING:
> +		hw_state = G_MISTP_BLOCK_STATE;
> +		break;
> +	default:
> +		pr_err("%s: invalid STP state: %d\n", __func__, state);
> +		return -EINVAL;
> +	}
> +
> +	reg = core_readl(priv, CORE_G_PCTL_PORT(port));
> +	reg &= ~(G_MISTP_STATE_MASK << G_MISTP_STATE_SHIFT);
> +	reg |= hw_state;
> +	core_writel(priv, reg, CORE_G_PCTL_PORT(port));
> +
> +	return 0;
> +}
> +
>  static irqreturn_t bcm_sf2_switch_0_isr(int irq, void *dev_id)
>  {
>  	struct bcm_sf2_priv *priv = dev_id;
> @@ -916,6 +1006,9 @@ static struct dsa_switch_driver bcm_sf2_switch_driver = {
>  	.port_disable		= bcm_sf2_port_disable,
>  	.get_eee		= bcm_sf2_sw_get_eee,
>  	.set_eee		= bcm_sf2_sw_set_eee,
> +	.port_join_bridge	= bcm_sf2_sw_br_join,
> +	.port_leave_bridge	= bcm_sf2_sw_br_leave,
> +	.port_stp_update	= bcm_sf2_sw_br_stp_update,
>  };
>  
>  static int __init bcm_sf2_init(void)
> 


-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-19  2:48   ` Florian Fainelli
@ 2015-02-19  5:59     ` Guenter Roeck
  2015-02-19 17:27       ` Florian Fainelli
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-19  5:59 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, davem, vivien.didelot, jerome.oufella, andrew, cphealy

On Wed, Feb 18, 2015 at 06:48:19PM -0800, Florian Fainelli wrote:
> On 17/02/15 11:26, Florian Fainelli wrote:
> > Update the Broadcom Starfighter 2 switch driver to implement the
> > join/leave/stp_update callbacks required for basic hardware bridging
> > support.
> > 
> > There is not much to be done at the driver level but translating the
> > STP state from Linux to their HW values.
> > 
> > Joining a bridge means that the joining port and the other port members
> > need to be in the same VLAN membership as the CPU, while leaving the
> > bridge puts the port back into a separate VLAN membership with only the
> > CPU.
> 
> I found a couple additional issues while testing:
> 
> - manipulating UP/DOWN state of interfaces that are part of a bridge
> would not restore their bridge membership
> 
> - removing an interface from a bridge and bringing it back up would
> leave it in blocked state
> 
Is this a problem with your implementation for sf2 or a generic problem
with the first patch, such as some missing state transitions ?

For sf2, you might have to set the port state as well as the bridge
association in the port_setup function. That is of course just a
wild guess.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-19  5:59     ` Guenter Roeck
@ 2015-02-19 17:27       ` Florian Fainelli
  2015-02-19 17:46         ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Florian Fainelli @ 2015-02-19 17:27 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: netdev, davem, vivien.didelot, jerome.oufella, andrew, cphealy

On 18/02/15 21:59, Guenter Roeck wrote:
> On Wed, Feb 18, 2015 at 06:48:19PM -0800, Florian Fainelli wrote:
>> On 17/02/15 11:26, Florian Fainelli wrote:
>>> Update the Broadcom Starfighter 2 switch driver to implement the
>>> join/leave/stp_update callbacks required for basic hardware bridging
>>> support.
>>>
>>> There is not much to be done at the driver level but translating the
>>> STP state from Linux to their HW values.
>>>
>>> Joining a bridge means that the joining port and the other port members
>>> need to be in the same VLAN membership as the CPU, while leaving the
>>> bridge puts the port back into a separate VLAN membership with only the
>>> CPU.
>>
>> I found a couple additional issues while testing:
>>
>> - manipulating UP/DOWN state of interfaces that are part of a bridge
>> would not restore their bridge membership
>>
>> - removing an interface from a bridge and bringing it back up would
>> leave it in blocked state
>>
> Is this a problem with your implementation for sf2 or a generic problem
> with the first patch, such as some missing state transitions ?

This is more of a side effect of having HW (an Ethernet switch) that can
really block a given port, based on last night's discussion with Roopa,
we can either fix this at the DSA level (in our case) or better fix this
at the bridge layer, I will propose a fix for this shortly.

> 
> For sf2, you might have to set the port state as well as the bridge
> association in the port_setup function. That is of course just a
> wild guess.

Right, that's what I ended up doing. Thanks!
-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-19 17:27       ` Florian Fainelli
@ 2015-02-19 17:46         ` Guenter Roeck
  2015-02-19 23:50           ` Florian Fainelli
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-19 17:46 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, davem, vivien.didelot, jerome.oufella, andrew, cphealy

On Thu, Feb 19, 2015 at 09:27:23AM -0800, Florian Fainelli wrote:
> On 18/02/15 21:59, Guenter Roeck wrote:
> > On Wed, Feb 18, 2015 at 06:48:19PM -0800, Florian Fainelli wrote:
> >> On 17/02/15 11:26, Florian Fainelli wrote:
> >>> Update the Broadcom Starfighter 2 switch driver to implement the
> >>> join/leave/stp_update callbacks required for basic hardware bridging
> >>> support.
> >>>
> >>> There is not much to be done at the driver level but translating the
> >>> STP state from Linux to their HW values.
> >>>
> >>> Joining a bridge means that the joining port and the other port members
> >>> need to be in the same VLAN membership as the CPU, while leaving the
> >>> bridge puts the port back into a separate VLAN membership with only the
> >>> CPU.
> >>
> >> I found a couple additional issues while testing:
> >>
> >> - manipulating UP/DOWN state of interfaces that are part of a bridge
> >> would not restore their bridge membership
> >>
> >> - removing an interface from a bridge and bringing it back up would
> >> leave it in blocked state
> >>
> > Is this a problem with your implementation for sf2 or a generic problem
> > with the first patch, such as some missing state transitions ?
> 
> This is more of a side effect of having HW (an Ethernet switch) that can
> really block a given port, based on last night's discussion with Roopa,
> we can either fix this at the DSA level (in our case) or better fix this
> at the bridge layer, I will propose a fix for this shortly.
> 
> > 
> > For sf2, you might have to set the port state as well as the bridge
> > association in the port_setup function. That is of course just a
> > wild guess.
> 
> Right, that's what I ended up doing. Thanks!

Great.

Another question: How do you handle flushing the forwarding database ?

My current code for mv88e6352 flushes the forwarding database for a bridge
group if the port association for that group changes (whenever a port joins
or leaves a group, or whenever the state of a port in a group changes).
There is, however, another situation where the forwarding database may
have to be flushed - essentially on each topology change.

How do you handle this situation ? Is it a real problem or do I just
imagine that it is ?

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-19 17:46         ` Guenter Roeck
@ 2015-02-19 23:50           ` Florian Fainelli
  2015-02-20  0:09             ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Florian Fainelli @ 2015-02-19 23:50 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: netdev, davem, vivien.didelot, jerome.oufella, andrew, cphealy

On 19/02/15 09:46, Guenter Roeck wrote:
> On Thu, Feb 19, 2015 at 09:27:23AM -0800, Florian Fainelli wrote:
>> On 18/02/15 21:59, Guenter Roeck wrote:
>>> On Wed, Feb 18, 2015 at 06:48:19PM -0800, Florian Fainelli wrote:
>>>> On 17/02/15 11:26, Florian Fainelli wrote:
>>>>> Update the Broadcom Starfighter 2 switch driver to implement the
>>>>> join/leave/stp_update callbacks required for basic hardware bridging
>>>>> support.
>>>>>
>>>>> There is not much to be done at the driver level but translating the
>>>>> STP state from Linux to their HW values.
>>>>>
>>>>> Joining a bridge means that the joining port and the other port members
>>>>> need to be in the same VLAN membership as the CPU, while leaving the
>>>>> bridge puts the port back into a separate VLAN membership with only the
>>>>> CPU.
>>>>
>>>> I found a couple additional issues while testing:
>>>>
>>>> - manipulating UP/DOWN state of interfaces that are part of a bridge
>>>> would not restore their bridge membership
>>>>
>>>> - removing an interface from a bridge and bringing it back up would
>>>> leave it in blocked state
>>>>
>>> Is this a problem with your implementation for sf2 or a generic problem
>>> with the first patch, such as some missing state transitions ?
>>
>> This is more of a side effect of having HW (an Ethernet switch) that can
>> really block a given port, based on last night's discussion with Roopa,
>> we can either fix this at the DSA level (in our case) or better fix this
>> at the bridge layer, I will propose a fix for this shortly.
>>
>>>
>>> For sf2, you might have to set the port state as well as the bridge
>>> association in the port_setup function. That is of course just a
>>> wild guess.
>>
>> Right, that's what I ended up doing. Thanks!
> 
> Great.
> 
> Another question: How do you handle flushing the forwarding database ?
> 
> My current code for mv88e6352 flushes the forwarding database for a bridge
> group if the port association for that group changes (whenever a port joins
> or leaves a group, or whenever the state of a port in a group changes).
> There is, however, another situation where the forwarding database may
> have to be flushed - essentially on each topology change.
> 
> How do you handle this situation ? Is it a real problem or do I just
> imagine that it is ?

This is a real problem, for once I was working under the assumption that
the SF2 hardware was doing an automatic FDB flushing, but after
re-reading the documentation, this is not the case. My lab network does
not have many stations and I certainly did not catch that case.
-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-19 23:50           ` Florian Fainelli
@ 2015-02-20  0:09             ` Guenter Roeck
  2015-02-20  0:51               ` roopa
                                 ` (2 more replies)
  0 siblings, 3 replies; 65+ messages in thread
From: Guenter Roeck @ 2015-02-20  0:09 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, davem, vivien.didelot, jerome.oufella, andrew, cphealy

On Thu, Feb 19, 2015 at 03:50:53PM -0800, Florian Fainelli wrote:
> On 19/02/15 09:46, Guenter Roeck wrote:
> > On Thu, Feb 19, 2015 at 09:27:23AM -0800, Florian Fainelli wrote:
> >> On 18/02/15 21:59, Guenter Roeck wrote:
> >>> On Wed, Feb 18, 2015 at 06:48:19PM -0800, Florian Fainelli wrote:
> >>>> On 17/02/15 11:26, Florian Fainelli wrote:
> >>>>> Update the Broadcom Starfighter 2 switch driver to implement the
> >>>>> join/leave/stp_update callbacks required for basic hardware bridging
> >>>>> support.
> >>>>>
> >>>>> There is not much to be done at the driver level but translating the
> >>>>> STP state from Linux to their HW values.
> >>>>>
> >>>>> Joining a bridge means that the joining port and the other port members
> >>>>> need to be in the same VLAN membership as the CPU, while leaving the
> >>>>> bridge puts the port back into a separate VLAN membership with only the
> >>>>> CPU.
> >>>>
> >>>> I found a couple additional issues while testing:
> >>>>
> >>>> - manipulating UP/DOWN state of interfaces that are part of a bridge
> >>>> would not restore their bridge membership
> >>>>
> >>>> - removing an interface from a bridge and bringing it back up would
> >>>> leave it in blocked state
> >>>>
> >>> Is this a problem with your implementation for sf2 or a generic problem
> >>> with the first patch, such as some missing state transitions ?
> >>
> >> This is more of a side effect of having HW (an Ethernet switch) that can
> >> really block a given port, based on last night's discussion with Roopa,
> >> we can either fix this at the DSA level (in our case) or better fix this
> >> at the bridge layer, I will propose a fix for this shortly.
> >>
> >>>
> >>> For sf2, you might have to set the port state as well as the bridge
> >>> association in the port_setup function. That is of course just a
> >>> wild guess.
> >>
> >> Right, that's what I ended up doing. Thanks!
> > 
> > Great.
> > 
> > Another question: How do you handle flushing the forwarding database ?
> > 
> > My current code for mv88e6352 flushes the forwarding database for a bridge
> > group if the port association for that group changes (whenever a port joins
> > or leaves a group, or whenever the state of a port in a group changes).
> > There is, however, another situation where the forwarding database may
> > have to be flushed - essentially on each topology change.
> > 
> > How do you handle this situation ? Is it a real problem or do I just
> > imagine that it is ?
> 
> This is a real problem, for once I was working under the assumption that
> the SF2 hardware was doing an automatic FDB flushing, but after
> re-reading the documentation, this is not the case. My lab network does
> not have many stations and I certainly did not catch that case.

The rocker code implements the fdb flush operation pretty much the same
way I do, so I seem to be doing something right.

Not sure yet what to do about setting the fdb aging time. I don't see a
mechanism to do that. No idea how important that is.

I think we'll also need support for ndo_vlan_rx_add_vid, ndo_vlan_rx_kill_vid,
ndo_fdb_add, ndo_fdb_del, and ndo_fdb_dump. Have you thought about those ?

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  0:09             ` Guenter Roeck
@ 2015-02-20  0:51               ` roopa
  2015-02-20  1:03                 ` Guenter Roeck
  2015-02-20  2:02               ` Andrew Lunn
  2015-02-20  2:03               ` Florian Fainelli
  2 siblings, 1 reply; 65+ messages in thread
From: roopa @ 2015-02-20  0:51 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Florian Fainelli, netdev, davem, vivien.didelot, jerome.oufella,
	andrew, cphealy

On 2/19/15, 4:09 PM, Guenter Roeck wrote:
> On Thu, Feb 19, 2015 at 03:50:53PM -0800, Florian Fainelli wrote:
>> On 19/02/15 09:46, Guenter Roeck wrote:
>>> On Thu, Feb 19, 2015 at 09:27:23AM -0800, Florian Fainelli wrote:
>>>> On 18/02/15 21:59, Guenter Roeck wrote:
>>>>> On Wed, Feb 18, 2015 at 06:48:19PM -0800, Florian Fainelli wrote:
>>>>>> On 17/02/15 11:26, Florian Fainelli wrote:
>>>>>>> Update the Broadcom Starfighter 2 switch driver to implement the
>>>>>>> join/leave/stp_update callbacks required for basic hardware bridging
>>>>>>> support.
>>>>>>>
>>>>>>> There is not much to be done at the driver level but translating the
>>>>>>> STP state from Linux to their HW values.
>>>>>>>
>>>>>>> Joining a bridge means that the joining port and the other port members
>>>>>>> need to be in the same VLAN membership as the CPU, while leaving the
>>>>>>> bridge puts the port back into a separate VLAN membership with only the
>>>>>>> CPU.
>>>>>> I found a couple additional issues while testing:
>>>>>>
>>>>>> - manipulating UP/DOWN state of interfaces that are part of a bridge
>>>>>> would not restore their bridge membership
>>>>>>
>>>>>> - removing an interface from a bridge and bringing it back up would
>>>>>> leave it in blocked state
>>>>>>
>>>>> Is this a problem with your implementation for sf2 or a generic problem
>>>>> with the first patch, such as some missing state transitions ?
>>>> This is more of a side effect of having HW (an Ethernet switch) that can
>>>> really block a given port, based on last night's discussion with Roopa,
>>>> we can either fix this at the DSA level (in our case) or better fix this
>>>> at the bridge layer, I will propose a fix for this shortly.
>>>>
>>>>> For sf2, you might have to set the port state as well as the bridge
>>>>> association in the port_setup function. That is of course just a
>>>>> wild guess.
>>>> Right, that's what I ended up doing. Thanks!
>>> Great.
>>>
>>> Another question: How do you handle flushing the forwarding database ?
>>>
>>> My current code for mv88e6352 flushes the forwarding database for a bridge
>>> group if the port association for that group changes (whenever a port joins
>>> or leaves a group, or whenever the state of a port in a group changes).
>>> There is, however, another situation where the forwarding database may
>>> have to be flushed - essentially on each topology change.
>>>
>>> How do you handle this situation ? Is it a real problem or do I just
>>> imagine that it is ?
>> This is a real problem, for once I was working under the assumption that
>> the SF2 hardware was doing an automatic FDB flushing, but after
>> re-reading the documentation, this is not the case. My lab network does
>> not have many stations and I certainly did not catch that case.
> The rocker code implements the fdb flush operation pretty much the same
> way I do, so I seem to be doing something right.
>
> Not sure yet what to do about setting the fdb aging time. I don't see a
> mechanism to do that. No idea how important that is.
rocker, the only consumer today relies on the bridge driver aging of 
learnt entries.
You could do the same.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  0:51               ` roopa
@ 2015-02-20  1:03                 ` Guenter Roeck
  2015-02-20  1:46                   ` roopa
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-20  1:03 UTC (permalink / raw)
  To: roopa
  Cc: Florian Fainelli, netdev, davem, vivien.didelot, jerome.oufella,
	andrew, cphealy

On Thu, Feb 19, 2015 at 04:51:30PM -0800, roopa wrote:
> >
> >Not sure yet what to do about setting the fdb aging time. I don't see a
> >mechanism to do that. No idea how important that is.
> rocker, the only consumer today relies on the bridge driver aging of learnt
> entries.
> You could do the same.
> 

Remember that we are dealing with hardware switch chips. Those chips
won't time out fdb entries just because the kernel's bridge driver
thinks that it should.

Am I missing something ?

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  1:03                 ` Guenter Roeck
@ 2015-02-20  1:46                   ` roopa
  2015-02-20  2:00                     ` Florian Fainelli
  2015-02-20  2:18                     ` Andrew Lunn
  0 siblings, 2 replies; 65+ messages in thread
From: roopa @ 2015-02-20  1:46 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Florian Fainelli, netdev, davem, vivien.didelot, jerome.oufella,
	andrew, cphealy

On 2/19/15, 5:03 PM, Guenter Roeck wrote:
> On Thu, Feb 19, 2015 at 04:51:30PM -0800, roopa wrote:
>>> Not sure yet what to do about setting the fdb aging time. I don't see a
>>> mechanism to do that. No idea how important that is.
>> rocker, the only consumer today relies on the bridge driver aging of learnt
>> entries.
>> You could do the same.
>>
> Remember that we are dealing with hardware switch chips. Those chips
> won't time out fdb entries just because the kernel's bridge driver
> thinks that it should.
Oh, they dont..?. sorry,  I dont know the details about your hardware. 
But, if these are entries learnt by hw, there should be a hw config to 
age them (I guess that is what you are talking about). Which the swicth 
driver can set.
If you disable hw aging, you can sync these entries to the bridge 
driver, and make the bridge driver age them followed by a subsequent 
delete in hw.

just trying to see if what we do will help here.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  1:46                   ` roopa
@ 2015-02-20  2:00                     ` Florian Fainelli
  2015-02-20  2:41                       ` roopa
  2015-02-20  3:20                       ` Andy Gospodarek
  2015-02-20  2:18                     ` Andrew Lunn
  1 sibling, 2 replies; 65+ messages in thread
From: Florian Fainelli @ 2015-02-20  2:00 UTC (permalink / raw)
  To: roopa, Guenter Roeck
  Cc: netdev, davem, vivien.didelot, jerome.oufella, andrew, cphealy

On 19/02/15 17:46, roopa wrote:
> On 2/19/15, 5:03 PM, Guenter Roeck wrote:
>> On Thu, Feb 19, 2015 at 04:51:30PM -0800, roopa wrote:
>>>> Not sure yet what to do about setting the fdb aging time. I don't see a
>>>> mechanism to do that. No idea how important that is.
>>> rocker, the only consumer today relies on the bridge driver aging of
>>> learnt
>>> entries.
>>> You could do the same.
>>>
>> Remember that we are dealing with hardware switch chips. Those chips
>> won't time out fdb entries just because the kernel's bridge driver
>> thinks that it should.
> Oh, they dont..?. sorry,  I dont know the details about your hardware.
> But, if these are entries learnt by hw, there should be a hw config to
> age them (I guess that is what you are talking about). Which the swicth
> driver can set.
> If you disable hw aging, you can sync these entries to the bridge
> driver, and make the bridge driver age them followed by a subsequent
> delete in hw.

The SF2 HW has and aging and a valid bit available, I guess my question
would be, do we have anything today in "net-next" that allows
configuring HW aging vs. SW aging (implying doing a HW to SW sync)?
-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  0:09             ` Guenter Roeck
  2015-02-20  0:51               ` roopa
@ 2015-02-20  2:02               ` Andrew Lunn
  2015-02-20  3:55                 ` Guenter Roeck
  2015-02-20  2:03               ` Florian Fainelli
  2 siblings, 1 reply; 65+ messages in thread
From: Andrew Lunn @ 2015-02-20  2:02 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Florian Fainelli, netdev, davem, vivien.didelot, jerome.oufella, cphealy

> Not sure yet what to do about setting the fdb aging time. I don't
> see a mechanism to do that. No idea how important that is.

Hi Guenter

I don't know about your chip, but the public data sheet for the
88e6060 talks about being able to set the age time globally in the ATU
control register. The granularity is not so good, multiple of 16
seconds, so it could be the hardware bridge times out an entry faster
than the software bridge.

	 Andrew

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  0:09             ` Guenter Roeck
  2015-02-20  0:51               ` roopa
  2015-02-20  2:02               ` Andrew Lunn
@ 2015-02-20  2:03               ` Florian Fainelli
  2015-02-20  2:46                 ` roopa
  2 siblings, 1 reply; 65+ messages in thread
From: Florian Fainelli @ 2015-02-20  2:03 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: netdev, davem, vivien.didelot, jerome.oufella, andrew, cphealy,
	Scott Feldman, Jiri Pirko

On 19/02/15 16:09, Guenter Roeck wrote:
> On Thu, Feb 19, 2015 at 03:50:53PM -0800, Florian Fainelli wrote:
>> On 19/02/15 09:46, Guenter Roeck wrote:
>>> On Thu, Feb 19, 2015 at 09:27:23AM -0800, Florian Fainelli wrote:
>>>> On 18/02/15 21:59, Guenter Roeck wrote:
>>>>> On Wed, Feb 18, 2015 at 06:48:19PM -0800, Florian Fainelli wrote:
>>>>>> On 17/02/15 11:26, Florian Fainelli wrote:
>>>>>>> Update the Broadcom Starfighter 2 switch driver to implement the
>>>>>>> join/leave/stp_update callbacks required for basic hardware bridging
>>>>>>> support.
>>>>>>>
>>>>>>> There is not much to be done at the driver level but translating the
>>>>>>> STP state from Linux to their HW values.
>>>>>>>
>>>>>>> Joining a bridge means that the joining port and the other port members
>>>>>>> need to be in the same VLAN membership as the CPU, while leaving the
>>>>>>> bridge puts the port back into a separate VLAN membership with only the
>>>>>>> CPU.
>>>>>>
>>>>>> I found a couple additional issues while testing:
>>>>>>
>>>>>> - manipulating UP/DOWN state of interfaces that are part of a bridge
>>>>>> would not restore their bridge membership
>>>>>>
>>>>>> - removing an interface from a bridge and bringing it back up would
>>>>>> leave it in blocked state
>>>>>>
>>>>> Is this a problem with your implementation for sf2 or a generic problem
>>>>> with the first patch, such as some missing state transitions ?
>>>>
>>>> This is more of a side effect of having HW (an Ethernet switch) that can
>>>> really block a given port, based on last night's discussion with Roopa,
>>>> we can either fix this at the DSA level (in our case) or better fix this
>>>> at the bridge layer, I will propose a fix for this shortly.
>>>>
>>>>>
>>>>> For sf2, you might have to set the port state as well as the bridge
>>>>> association in the port_setup function. That is of course just a
>>>>> wild guess.
>>>>
>>>> Right, that's what I ended up doing. Thanks!
>>>
>>> Great.
>>>
>>> Another question: How do you handle flushing the forwarding database ?
>>>
>>> My current code for mv88e6352 flushes the forwarding database for a bridge
>>> group if the port association for that group changes (whenever a port joins
>>> or leaves a group, or whenever the state of a port in a group changes).
>>> There is, however, another situation where the forwarding database may
>>> have to be flushed - essentially on each topology change.
>>>
>>> How do you handle this situation ? Is it a real problem or do I just
>>> imagine that it is ?
>>
>> This is a real problem, for once I was working under the assumption that
>> the SF2 hardware was doing an automatic FDB flushing, but after
>> re-reading the documentation, this is not the case. My lab network does
>> not have many stations and I certainly did not catch that case.
> 
> The rocker code implements the fdb flush operation pretty much the same
> way I do, so I seem to be doing something right.
> 
> Not sure yet what to do about setting the fdb aging time. I don't see a
> mechanism to do that. No idea how important that is.
> 
> I think we'll also need support for ndo_vlan_rx_add_vid, ndo_vlan_rx_kill_vid,
> ndo_fdb_add, ndo_fdb_del, and ndo_fdb_dump. Have you thought about those ?

Yes, we will need those as well, although for now, I think we might just
be able to get basic HW bridging to work with the current patches +
driver specific FDB flushing, what do you think?

The only part thing that I cannot really map to HW switches today is
that the ndo_vlan_rx_add_vid() does not specify whether the VLAN id you
are adding is tagged or untagged (aka native). You can configure that
using the "bridge" sub-command from iproute2, but I am not sure how this
translates in terms of programming the hardware with the tagged/untagged
bit yet.

Maybe we should update ndo_vlan_rx_add_vid() to allow for a tag bit to
specified?
-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  1:46                   ` roopa
  2015-02-20  2:00                     ` Florian Fainelli
@ 2015-02-20  2:18                     ` Andrew Lunn
  2015-02-20  2:51                       ` roopa
  1 sibling, 1 reply; 65+ messages in thread
From: Andrew Lunn @ 2015-02-20  2:18 UTC (permalink / raw)
  To: roopa
  Cc: Guenter Roeck, Florian Fainelli, netdev, davem, vivien.didelot,
	jerome.oufella, cphealy

> >Remember that we are dealing with hardware switch chips. Those chips
> >won't time out fdb entries just because the kernel's bridge driver
> >thinks that it should.

> Oh, they dont..?

To some extent, it is better to think of these as two switches
connected to each other, not one switch. The HW switch needs help with
STP, but otherwise it is a fully functional and autonomous switch.

This is going to make displaying the forwarding database interesting,
because the SW bridge fdb and the HW bridge fdb are each subsets of
the big picture and possible even contradictory since they are not
updated atomically.

	Andrew

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  2:00                     ` Florian Fainelli
@ 2015-02-20  2:41                       ` roopa
  2015-02-20  4:05                         ` Guenter Roeck
  2015-02-20  3:20                       ` Andy Gospodarek
  1 sibling, 1 reply; 65+ messages in thread
From: roopa @ 2015-02-20  2:41 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Guenter Roeck, netdev, davem, vivien.didelot, jerome.oufella,
	andrew, cphealy

On 2/19/15, 6:00 PM, Florian Fainelli wrote:
> On 19/02/15 17:46, roopa wrote:
>> On 2/19/15, 5:03 PM, Guenter Roeck wrote:
>>> On Thu, Feb 19, 2015 at 04:51:30PM -0800, roopa wrote:
>>>>> Not sure yet what to do about setting the fdb aging time. I don't see a
>>>>> mechanism to do that. No idea how important that is.
>>>> rocker, the only consumer today relies on the bridge driver aging of
>>>> learnt
>>>> entries.
>>>> You could do the same.
>>>>
>>> Remember that we are dealing with hardware switch chips. Those chips
>>> won't time out fdb entries just because the kernel's bridge driver
>>> thinks that it should.
>> Oh, they dont..?. sorry,  I dont know the details about your hardware.
>> But, if these are entries learnt by hw, there should be a hw config to
>> age them (I guess that is what you are talking about). Which the swicth
>> driver can set.
>> If you disable hw aging, you can sync these entries to the bridge
>> driver, and make the bridge driver age them followed by a subsequent
>> delete in hw.
> The SF2 HW has and aging and a valid bit available, I guess my question
> would be, do we have anything today in "net-next" that allows
> configuring HW aging vs. SW aging (implying doing a HW to SW sync)?


There is no config parameter to set HW aging vs SW aging. But if you 
want SW aging, an example is the rocker implementation today.
And there is BR_LEARNING_SYNC per bridge port flag to sync HW to SW 
using notifiers (see rocker).

There is no netlink based age time sets AFAIK. But there is age time 
sets from sysctl. And there is no offload support for this today.
The offload support has been discussed previously and there was no need 
to add it immediately.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  2:03               ` Florian Fainelli
@ 2015-02-20  2:46                 ` roopa
  0 siblings, 0 replies; 65+ messages in thread
From: roopa @ 2015-02-20  2:46 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Guenter Roeck, netdev, davem, vivien.didelot, jerome.oufella,
	andrew, cphealy, Scott Feldman, Jiri Pirko

On 2/19/15, 6:03 PM, Florian Fainelli wrote:
> On 19/02/15 16:09, Guenter Roeck wrote:
>> On Thu, Feb 19, 2015 at 03:50:53PM -0800, Florian Fainelli wrote:
>>> On 19/02/15 09:46, Guenter Roeck wrote:
>>>> On Thu, Feb 19, 2015 at 09:27:23AM -0800, Florian Fainelli wrote:
>>>>> On 18/02/15 21:59, Guenter Roeck wrote:
>>>>>> On Wed, Feb 18, 2015 at 06:48:19PM -0800, Florian Fainelli wrote:
>>>>>>> On 17/02/15 11:26, Florian Fainelli wrote:
>>>>>>>> Update the Broadcom Starfighter 2 switch driver to implement the
>>>>>>>> join/leave/stp_update callbacks required for basic hardware bridging
>>>>>>>> support.
>>>>>>>>
>>>>>>>> There is not much to be done at the driver level but translating the
>>>>>>>> STP state from Linux to their HW values.
>>>>>>>>
>>>>>>>> Joining a bridge means that the joining port and the other port members
>>>>>>>> need to be in the same VLAN membership as the CPU, while leaving the
>>>>>>>> bridge puts the port back into a separate VLAN membership with only the
>>>>>>>> CPU.
>>>>>>> I found a couple additional issues while testing:
>>>>>>>
>>>>>>> - manipulating UP/DOWN state of interfaces that are part of a bridge
>>>>>>> would not restore their bridge membership
>>>>>>>
>>>>>>> - removing an interface from a bridge and bringing it back up would
>>>>>>> leave it in blocked state
>>>>>>>
>>>>>> Is this a problem with your implementation for sf2 or a generic problem
>>>>>> with the first patch, such as some missing state transitions ?
>>>>> This is more of a side effect of having HW (an Ethernet switch) that can
>>>>> really block a given port, based on last night's discussion with Roopa,
>>>>> we can either fix this at the DSA level (in our case) or better fix this
>>>>> at the bridge layer, I will propose a fix for this shortly.
>>>>>
>>>>>> For sf2, you might have to set the port state as well as the bridge
>>>>>> association in the port_setup function. That is of course just a
>>>>>> wild guess.
>>>>> Right, that's what I ended up doing. Thanks!
>>>> Great.
>>>>
>>>> Another question: How do you handle flushing the forwarding database ?
>>>>
>>>> My current code for mv88e6352 flushes the forwarding database for a bridge
>>>> group if the port association for that group changes (whenever a port joins
>>>> or leaves a group, or whenever the state of a port in a group changes).
>>>> There is, however, another situation where the forwarding database may
>>>> have to be flushed - essentially on each topology change.
>>>>
>>>> How do you handle this situation ? Is it a real problem or do I just
>>>> imagine that it is ?
>>> This is a real problem, for once I was working under the assumption that
>>> the SF2 hardware was doing an automatic FDB flushing, but after
>>> re-reading the documentation, this is not the case. My lab network does
>>> not have many stations and I certainly did not catch that case.
>> The rocker code implements the fdb flush operation pretty much the same
>> way I do, so I seem to be doing something right.
>>
>> Not sure yet what to do about setting the fdb aging time. I don't see a
>> mechanism to do that. No idea how important that is.
>>
>> I think we'll also need support for ndo_vlan_rx_add_vid, ndo_vlan_rx_kill_vid,
>> ndo_fdb_add, ndo_fdb_del, and ndo_fdb_dump. Have you thought about those ?
> Yes, we will need those as well, although for now, I think we might just
> be able to get basic HW bridging to work with the current patches +
> driver specific FDB flushing, what do you think?
>
> The only part thing that I cannot really map to HW switches today is
> that the ndo_vlan_rx_add_vid() does not specify whether the VLAN id you
> are adding is tagged or untagged (aka native). You can configure that
> using the "bridge" sub-command from iproute2, but I am not sure how this
> translates in terms of programming the hardware with the tagged/untagged
> bit yet.
>
> Maybe we should update ndo_vlan_rx_add_vid() to allow for a tag bit to
> specified?
You should be able to use switchdev api 
netdev_switch_port_bridge_setlink/dellink that in turn calls 
ndo_bridge_setlink/dellink for this.
PF_BRIDGE dellink/setlink calls from the user for vlan add/dels are 
mapped to these ndos. These include the vlan tagged/untagged/pvid flags.
(IFLA_BRIDGE_VLAN_INFO which is of type struct  bridge_vlan_info).
But note that these are only available with the vlan filtering bridge.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  2:18                     ` Andrew Lunn
@ 2015-02-20  2:51                       ` roopa
  2015-02-20  3:52                         ` Andrew Lunn
  0 siblings, 1 reply; 65+ messages in thread
From: roopa @ 2015-02-20  2:51 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Guenter Roeck, Florian Fainelli, netdev, davem, vivien.didelot,
	jerome.oufella, cphealy

On 2/19/15, 6:18 PM, Andrew Lunn wrote:
>>> Remember that we are dealing with hardware switch chips. Those chips
>>> won't time out fdb entries just because the kernel's bridge driver
>>> thinks that it should.
>> Oh, they dont..?
> To some extent, it is better to think of these as two switches
> connected to each other, not one switch. The HW switch needs help with
> STP, but otherwise it is a fully functional and autonomous switch.
yes, exactly.
>
> This is going to make displaying the forwarding database interesting,
> because the SW bridge fdb and the HW bridge fdb are each subsets of
> the big picture and possible even contradictory since they are not
> updated atomically.
>
What we do on our switches is,
- disable SW bridge learning
- enable HW bridge learning
- we keep the SW bridge fdb in sync with the HW fdb, which leads to:
         - HW learnt entries are pushed to SW bridge fdb
         - SW static fdb entries (added by the user) are pushed to HW

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  2:00                     ` Florian Fainelli
  2015-02-20  2:41                       ` roopa
@ 2015-02-20  3:20                       ` Andy Gospodarek
  2015-02-20  3:53                         ` Viswanath Bandaru
  2015-02-20  3:56                         ` Andy Gospodarek
  1 sibling, 2 replies; 65+ messages in thread
From: Andy Gospodarek @ 2015-02-20  3:20 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: roopa, Guenter Roeck, netdev, davem, vivien.didelot,
	jerome.oufella, andrew, cphealy

On Thu, Feb 19, 2015 at 06:00:22PM -0800, Florian Fainelli wrote:
> On 19/02/15 17:46, roopa wrote:
> > On 2/19/15, 5:03 PM, Guenter Roeck wrote:
> >> On Thu, Feb 19, 2015 at 04:51:30PM -0800, roopa wrote:
> >>>> Not sure yet what to do about setting the fdb aging time. I don't see a
> >>>> mechanism to do that. No idea how important that is.
> >>> rocker, the only consumer today relies on the bridge driver aging of
> >>> learnt
> >>> entries.
> >>> You could do the same.
> >>>
> >> Remember that we are dealing with hardware switch chips. Those chips
> >> won't time out fdb entries just because the kernel's bridge driver
> >> thinks that it should.
> > Oh, they dont..?. sorry,  I dont know the details about your hardware.
> > But, if these are entries learnt by hw, there should be a hw config to
> > age them (I guess that is what you are talking about). Which the swicth
> > driver can set.
> > If you disable hw aging, you can sync these entries to the bridge
> > driver, and make the bridge driver age them followed by a subsequent
> > delete in hw.
> 
> The SF2 HW has and aging and a valid bit available, I guess my question
> would be, do we have anything today in "net-next" that allows
> configuring HW aging vs. SW aging (implying doing a HW to SW sync)?

Yes, the setting of the BR_LEARNING_SYNC bit in bridge port flags should
signal to the hardware that it should send learning notifications up to
the kernel bridge.  This is set via the IFLA_BRPORT_LEARNING attribute
in a setlink message.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  2:51                       ` roopa
@ 2015-02-20  3:52                         ` Andrew Lunn
  2015-02-20  4:07                           ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Lunn @ 2015-02-20  3:52 UTC (permalink / raw)
  To: roopa
  Cc: Guenter Roeck, Florian Fainelli, netdev, davem, vivien.didelot,
	jerome.oufella, cphealy

> - we keep the SW bridge fdb in sync with the HW fdb, which leads to:

I don't think this is going to be easy or efficient. At least for the
88e6060 there is no interrupt when there is a change to the
database. So we are going to have to poll. And the only way to figure
out if anything has changed is to read it all out and compare against
the last read.

I suppose we only need to update the SW fdb shortly after the SW
bridge had to perform a flood because it had no entry for the
destination MAC, or after an entry sync'ed into the SW fdb reaches
timeout age.

    Andrew

^ permalink raw reply	[flat|nested] 65+ messages in thread

* RE: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  3:20                       ` Andy Gospodarek
@ 2015-02-20  3:53                         ` Viswanath Bandaru
  2015-02-20  3:56                         ` Andy Gospodarek
  1 sibling, 0 replies; 65+ messages in thread
From: Viswanath Bandaru @ 2015-02-20  3:53 UTC (permalink / raw)
  To: Andy Gospodarek, Florian Fainelli
  Cc: roopa, Guenter Roeck, netdev, davem, vivien.didelot,
	jerome.oufella, andrew, cphealy, sfeldma

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Andy Gospodarek
> Sent: Friday, February 20, 2015 8:51 AM
> To: Florian Fainelli
> Cc: roopa; Guenter Roeck; netdev@vger.kernel.org; davem@davemloft.net;
> vivien.didelot@savoirfairelinux.com; jerome.oufella@savoirfairelinux.com;
> andrew@lunn.ch; cphealy@gmail.com
> Subject: Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging
> operations
> 
> On Thu, Feb 19, 2015 at 06:00:22PM -0800, Florian Fainelli wrote:
> > On 19/02/15 17:46, roopa wrote:
> > > On 2/19/15, 5:03 PM, Guenter Roeck wrote:
> > >> On Thu, Feb 19, 2015 at 04:51:30PM -0800, roopa wrote:
> > >>>> Not sure yet what to do about setting the fdb aging time. I don't
> > >>>> see a mechanism to do that. No idea how important that is.
> > >>> rocker, the only consumer today relies on the bridge driver aging
> > >>> of learnt entries.
> > >>> You could do the same.
> > >>>
> > >> Remember that we are dealing with hardware switch chips. Those
> > >> chips won't time out fdb entries just because the kernel's bridge
> > >> driver thinks that it should.
> > > Oh, they dont..?. sorry,  I dont know the details about your hardware.
> > > But, if these are entries learnt by hw, there should be a hw config
> > > to age them (I guess that is what you are talking about). Which the
> > > swicth driver can set.
> > > If you disable hw aging, you can sync these entries to the bridge
> > > driver, and make the bridge driver age them followed by a subsequent
> > > delete in hw.
> >
> > The SF2 HW has and aging and a valid bit available, I guess my
> > question would be, do we have anything today in "net-next" that allows
> > configuring HW aging vs. SW aging (implying doing a HW to SW sync)?
> 
> Yes, the setting of the BR_LEARNING_SYNC bit in bridge port flags should
> signal to the hardware that it should send learning notifications up to the
> kernel bridge.  This is set via the IFLA_BRPORT_LEARNING attribute in a
> setlink message.
> 

We had a recent discussion on this subject on roker and ageing.  Siva submitted a patch for bridge not to age the 'externally learnt entries'. This broke Rocker because it depended on bridge to age out entries even if they are 'externally learnt'. 

After some discussion, Scott agreed that rocker should also take care of ageing out the entries (in addition to learning them), just like a real silicon would do.  He is going to submit a patch for Rocker soon.

Vissu

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in the body
> of a message to majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  2:02               ` Andrew Lunn
@ 2015-02-20  3:55                 ` Guenter Roeck
  0 siblings, 0 replies; 65+ messages in thread
From: Guenter Roeck @ 2015-02-20  3:55 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, netdev, davem, vivien.didelot, jerome.oufella, cphealy

On Fri, Feb 20, 2015 at 03:02:41AM +0100, Andrew Lunn wrote:
> > Not sure yet what to do about setting the fdb aging time. I don't
> > see a mechanism to do that. No idea how important that is.
> 
> Hi Guenter
> 
> I don't know about your chip, but the public data sheet for the
> 88e6060 talks about being able to set the age time globally in the ATU
> control register. The granularity is not so good, multiple of 16
> seconds, so it could be the hardware bridge times out an entry faster
> than the software bridge.
> 
Yes, but only if the minimum is configured. The default is about 5.5
minutes on mv88e6352. I didn't check the default on the other 886exxx
chips, but I assume it is the same.

Problem is topology changes or non-standard aging times. Maybe fdb flush
on port status changes takes care of those cases; I am just not sure if
that is sufficient.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  3:20                       ` Andy Gospodarek
  2015-02-20  3:53                         ` Viswanath Bandaru
@ 2015-02-20  3:56                         ` Andy Gospodarek
  1 sibling, 0 replies; 65+ messages in thread
From: Andy Gospodarek @ 2015-02-20  3:56 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: roopa, Guenter Roeck, netdev, davem, vivien.didelot,
	jerome.oufella, andrew, cphealy

On Thu, Feb 19, 2015 at 10:20:58PM -0500, Andy Gospodarek wrote:
> On Thu, Feb 19, 2015 at 06:00:22PM -0800, Florian Fainelli wrote:
> > On 19/02/15 17:46, roopa wrote:
> > > On 2/19/15, 5:03 PM, Guenter Roeck wrote:
> > >> On Thu, Feb 19, 2015 at 04:51:30PM -0800, roopa wrote:
> > >>>> Not sure yet what to do about setting the fdb aging time. I don't see a
> > >>>> mechanism to do that. No idea how important that is.
> > >>> rocker, the only consumer today relies on the bridge driver aging of
> > >>> learnt
> > >>> entries.
> > >>> You could do the same.
> > >>>
> > >> Remember that we are dealing with hardware switch chips. Those chips
> > >> won't time out fdb entries just because the kernel's bridge driver
> > >> thinks that it should.
> > > Oh, they dont..?. sorry,  I dont know the details about your hardware.
> > > But, if these are entries learnt by hw, there should be a hw config to
> > > age them (I guess that is what you are talking about). Which the swicth
> > > driver can set.
> > > If you disable hw aging, you can sync these entries to the bridge
> > > driver, and make the bridge driver age them followed by a subsequent
> > > delete in hw.
> > 
> > The SF2 HW has and aging and a valid bit available, I guess my question
> > would be, do we have anything today in "net-next" that allows
> > configuring HW aging vs. SW aging (implying doing a HW to SW sync)?
> 
> Yes, the setting of the BR_LEARNING_SYNC bit in bridge port flags should
> signal to the hardware that it should send learning notifications up to
> the kernel bridge.  This is set via the IFLA_BRPORT_LEARNING attribute
> in a setlink message.

Sorry I should offer some clarification on this point.  The flag I
mentioned is only used to signal to drivers that they should notify the
core kernel about new MACs that are learned.

If there is a need for different driver behavior depending on whether
hardware learning is enabled or not it might be time to add a
configuration option up into the stack.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  2:41                       ` roopa
@ 2015-02-20  4:05                         ` Guenter Roeck
  2015-02-20  4:58                           ` Scott Feldman
  2015-02-20  4:59                           ` roopa
  0 siblings, 2 replies; 65+ messages in thread
From: Guenter Roeck @ 2015-02-20  4:05 UTC (permalink / raw)
  To: roopa
  Cc: Florian Fainelli, netdev, davem, vivien.didelot, jerome.oufella,
	andrew, cphealy

On Thu, Feb 19, 2015 at 06:41:34PM -0800, roopa wrote:
> On 2/19/15, 6:00 PM, Florian Fainelli wrote:
> >On 19/02/15 17:46, roopa wrote:
> >>On 2/19/15, 5:03 PM, Guenter Roeck wrote:
> >>>On Thu, Feb 19, 2015 at 04:51:30PM -0800, roopa wrote:
> >>>>>Not sure yet what to do about setting the fdb aging time. I don't see a
> >>>>>mechanism to do that. No idea how important that is.
> >>>>rocker, the only consumer today relies on the bridge driver aging of
> >>>>learnt
> >>>>entries.
> >>>>You could do the same.
> >>>>
> >>>Remember that we are dealing with hardware switch chips. Those chips
> >>>won't time out fdb entries just because the kernel's bridge driver
> >>>thinks that it should.
> >>Oh, they dont..?. sorry,  I dont know the details about your hardware.
> >>But, if these are entries learnt by hw, there should be a hw config to
> >>age them (I guess that is what you are talking about). Which the swicth
> >>driver can set.
> >>If you disable hw aging, you can sync these entries to the bridge
> >>driver, and make the bridge driver age them followed by a subsequent
> >>delete in hw.
> >The SF2 HW has and aging and a valid bit available, I guess my question
> >would be, do we have anything today in "net-next" that allows
> >configuring HW aging vs. SW aging (implying doing a HW to SW sync)?
> 
> 
> There is no config parameter to set HW aging vs SW aging. But if you want SW
> aging, an example is the rocker implementation today.
> And there is BR_LEARNING_SYNC per bridge port flag to sync HW to SW using
> notifiers (see rocker).
> 
SW aging is not practical. In my specific use case there can be 800+ mac
addresses in a single mac domain, connected through an mdio bus on gpio pins.
Besides, it doesn't really make much sense to burden SW with something that
can easily be handled in HW, just because SW doesn't have the means to pass
the necessary parameter - aging time - to the HW.

I think the question here was how to communicate aging time to the switch chip,
not SW aging vs. HW aging.

> There is no netlink based age time sets AFAIK. But there is age time sets
> from sysctl. And there is no offload support for this today.
> The offload support has been discussed previously and there was no need to
> add it immediately.
> 
Ok with me; just hope it doesn't cause any trouble.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  3:52                         ` Andrew Lunn
@ 2015-02-20  4:07                           ` Guenter Roeck
  0 siblings, 0 replies; 65+ messages in thread
From: Guenter Roeck @ 2015-02-20  4:07 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: roopa, Florian Fainelli, netdev, davem, vivien.didelot,
	jerome.oufella, cphealy

On Fri, Feb 20, 2015 at 04:52:37AM +0100, Andrew Lunn wrote:
> > - we keep the SW bridge fdb in sync with the HW fdb, which leads to:
> 
> I don't think this is going to be easy or efficient. At least for the
> 88e6060 there is no interrupt when there is a change to the
> database. So we are going to have to poll. And the only way to figure
> out if anything has changed is to read it all out and compare against
> the last read.
> 
Agreed.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  4:05                         ` Guenter Roeck
@ 2015-02-20  4:58                           ` Scott Feldman
  2015-02-20  4:59                           ` roopa
  1 sibling, 0 replies; 65+ messages in thread
From: Scott Feldman @ 2015-02-20  4:58 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: roopa, Florian Fainelli, Netdev, David S. Miller, vivien.didelot,
	jerome.oufella, andrew, Chris Healy

On Thu, Feb 19, 2015 at 11:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> On Thu, Feb 19, 2015 at 06:41:34PM -0800, roopa wrote:
>> On 2/19/15, 6:00 PM, Florian Fainelli wrote:
>> >On 19/02/15 17:46, roopa wrote:
>> >>On 2/19/15, 5:03 PM, Guenter Roeck wrote:
>> >>>On Thu, Feb 19, 2015 at 04:51:30PM -0800, roopa wrote:
>> >>>>>Not sure yet what to do about setting the fdb aging time. I don't see a
>> >>>>>mechanism to do that. No idea how important that is.
>> >>>>rocker, the only consumer today relies on the bridge driver aging of
>> >>>>learnt
>> >>>>entries.
>> >>>>You could do the same.
>> >>>>
>> >>>Remember that we are dealing with hardware switch chips. Those chips
>> >>>won't time out fdb entries just because the kernel's bridge driver
>> >>>thinks that it should.
>> >>Oh, they dont..?. sorry,  I dont know the details about your hardware.
>> >>But, if these are entries learnt by hw, there should be a hw config to
>> >>age them (I guess that is what you are talking about). Which the swicth
>> >>driver can set.
>> >>If you disable hw aging, you can sync these entries to the bridge
>> >>driver, and make the bridge driver age them followed by a subsequent
>> >>delete in hw.
>> >The SF2 HW has and aging and a valid bit available, I guess my question
>> >would be, do we have anything today in "net-next" that allows
>> >configuring HW aging vs. SW aging (implying doing a HW to SW sync)?
>>
>>
>> There is no config parameter to set HW aging vs SW aging. But if you want SW
>> aging, an example is the rocker implementation today.
>> And there is BR_LEARNING_SYNC per bridge port flag to sync HW to SW using
>> notifiers (see rocker).
>>
> SW aging is not practical. In my specific use case there can be 800+ mac
> addresses in a single mac domain, connected through an mdio bus on gpio pins.
> Besides, it doesn't really make much sense to burden SW with something that
> can easily be handled in HW, just because SW doesn't have the means to pass
> the necessary parameter - aging time - to the HW.

I have a patch which I'll post shortly as an RFC that addresses ageing
either by SW or by HW.

-scott

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations
  2015-02-20  4:05                         ` Guenter Roeck
  2015-02-20  4:58                           ` Scott Feldman
@ 2015-02-20  4:59                           ` roopa
  1 sibling, 0 replies; 65+ messages in thread
From: roopa @ 2015-02-20  4:59 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Florian Fainelli, netdev, davem, vivien.didelot, jerome.oufella,
	andrew, cphealy

On 2/19/15, 8:05 PM, Guenter Roeck wrote:
> On Thu, Feb 19, 2015 at 06:41:34PM -0800, roopa wrote:
>> On 2/19/15, 6:00 PM, Florian Fainelli wrote:
>>> On 19/02/15 17:46, roopa wrote:
>>>> On 2/19/15, 5:03 PM, Guenter Roeck wrote:
>>>>> On Thu, Feb 19, 2015 at 04:51:30PM -0800, roopa wrote:
>>>>>>> Not sure yet what to do about setting the fdb aging time. I don't see a
>>>>>>> mechanism to do that. No idea how important that is.
>>>>>> rocker, the only consumer today relies on the bridge driver aging of
>>>>>> learnt
>>>>>> entries.
>>>>>> You could do the same.
>>>>>>
>>>>> Remember that we are dealing with hardware switch chips. Those chips
>>>>> won't time out fdb entries just because the kernel's bridge driver
>>>>> thinks that it should.
>>>> Oh, they dont..?. sorry,  I dont know the details about your hardware.
>>>> But, if these are entries learnt by hw, there should be a hw config to
>>>> age them (I guess that is what you are talking about). Which the swicth
>>>> driver can set.
>>>> If you disable hw aging, you can sync these entries to the bridge
>>>> driver, and make the bridge driver age them followed by a subsequent
>>>> delete in hw.
>>> The SF2 HW has and aging and a valid bit available, I guess my question
>>> would be, do we have anything today in "net-next" that allows
>>> configuring HW aging vs. SW aging (implying doing a HW to SW sync)?
>>
>> There is no config parameter to set HW aging vs SW aging. But if you want SW
>> aging, an example is the rocker implementation today.
>> And there is BR_LEARNING_SYNC per bridge port flag to sync HW to SW using
>> notifiers (see rocker).
>>
> SW aging is not practical. In my specific use case there can be 800+ mac
> addresses in a single mac domain, connected through an mdio bus on gpio pins.
> Besides, it doesn't really make much sense to burden SW with something that
> can easily be handled in HW, just because SW doesn't have the means to pass
> the necessary parameter - aging time - to the HW.
>
> I think the question here was how to communicate aging time to the switch chip,
> not SW aging vs. HW aging.

I think florians last email was just a question about HW aging vs SW 
aging and so we were discussing that.
But sure if SW aging is not an option for these chip, Lets only discuss 
HW aging.
And for HW aging, you could leave it at the default hw value or have the 
switch driver set a default aging value.
and if there is a need to make it configurable per bridge we can add an 
API to pass it via the bridge
ageing time to the switch driver.

>
>> There is no netlink based age time sets AFAIK. But there is age time sets
>> from sysctl. And there is no offload support for this today.
>> The offload support has been discussed previously and there was no need to
>> add it immediately.
>>
> Ok with me; just hope it doesn't cause any trouble.
>
> Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-17 19:26 ` [PATCH RFC 1/2] net: dsa: integrate " Florian Fainelli
  2015-02-17 20:13   ` Guenter Roeck
  2015-02-18  1:19   ` Andrew Lunn
@ 2015-02-23  2:20   ` Guenter Roeck
  2015-02-23  3:14     ` Andrew Lunn
  2 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-23  2:20 UTC (permalink / raw)
  To: Florian Fainelli, netdev
  Cc: davem, vivien.didelot, jerome.oufella, andrew, cphealy

On 02/17/2015 11:26 AM, Florian Fainelli wrote:
> In order to support bridging offloads in DSA switch drivers, select
> NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
> NDOs that we are required to implement.
>
> To facilitate the integratation at the DSA driver level, we implement 3
> types of operations:
>

Hi Florian,

>
> +/* Return a bitmask of all ports being currently bridged. Note that on
> + * leave, the mask will still return the bitmask of ports currently bridged,
> + * prior to port removal, and this is exactly what we want.
> + */
> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
> +{
> +	unsigned int port;
> +	u32 mask = 0;
> +
> +	for (port = 0; port < DSA_MAX_PORTS; port++) {
> +		if (!((1 << port) & ds->phys_port_mask))
> +			continue;
> +
> +		if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
> +			mask |= 1 << port;

Problem is that the function can be called through dsa_slave_netdevice_event
before the slave devices are fully initialized.

After adding

+               if (!ds->ports[port]) {
+                       netdev_err(bridge,
+                                  "No ports data for port %d, mask=0x%x\n",
+                                  port, ds->phys_port_mask);
+                       continue;
+               }

and with some more debug messages added to dsa_switch_setup(), I see the following.

[   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 1(port1)
[   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 2(port2)
[   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 3(port3)
[   14.472002] br0: No ports data for port 3, mask=0x1e
[   14.472053] br0: No ports data for port 4, mask=0x1e
[   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 4(host2esb)

This happens if I add the bridge configuration to /etc/network/interfaces instead
of creating the bridge manually. Apparently dsa_switch_setup() is not yet complete
when dsa_slave_netdevice_event is executed to handle a state change on one of its
newly created slave interfaces.

The relevant information from /etc/network/interfaces is:

auto br0

iface port1 inet manual
iface port2 inet manual

iface br0 inet dhcp
	bridge_ports port1 port2
	up ifconfig br0 mtu 1492

Any idea how to best address this ?

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23  2:20   ` Guenter Roeck
@ 2015-02-23  3:14     ` Andrew Lunn
  2015-02-23  4:07       ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Lunn @ 2015-02-23  3:14 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Florian Fainelli, netdev, davem, vivien.didelot, jerome.oufella, cphealy

On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
> On 02/17/2015 11:26 AM, Florian Fainelli wrote:
> >In order to support bridging offloads in DSA switch drivers, select
> >NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
> >NDOs that we are required to implement.
> >
> >To facilitate the integratation at the DSA driver level, we implement 3
> >types of operations:
> >
> 
> Hi Florian,
> 
> >
> >+/* Return a bitmask of all ports being currently bridged. Note that on
> >+ * leave, the mask will still return the bitmask of ports currently bridged,
> >+ * prior to port removal, and this is exactly what we want.
> >+ */
> >+static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
> >+{
> >+	unsigned int port;
> >+	u32 mask = 0;
> >+
> >+	for (port = 0; port < DSA_MAX_PORTS; port++) {
> >+		if (!((1 << port) & ds->phys_port_mask))
> >+			continue;
> >+
> >+		if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
> >+			mask |= 1 << port;
> 
> Problem is that the function can be called through dsa_slave_netdevice_event
> before the slave devices are fully initialized.
> 
> After adding
> 
> +               if (!ds->ports[port]) {
> +                       netdev_err(bridge,
> +                                  "No ports data for port %d, mask=0x%x\n",
> +                                  port, ds->phys_port_mask);
> +                       continue;
> +               }
> 
> and with some more debug messages added to dsa_switch_setup(), I see the following.
> 
> [   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 1(port1)
> [   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 2(port2)
> [   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 3(port3)
> [   14.472002] br0: No ports data for port 3, mask=0x1e
> [   14.472053] br0: No ports data for port 4, mask=0x1e
> [   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 4(host2esb)
> 
> This happens if I add the bridge configuration to /etc/network/interfaces instead
> of creating the bridge manually. Apparently dsa_switch_setup() is not yet complete
> when dsa_slave_netdevice_event is executed to handle a state change on one of its
> newly created slave interfaces.
> 
> The relevant information from /etc/network/interfaces is:
> 
> auto br0
> 
> iface port1 inet manual
> iface port2 inet manual
> 
> iface br0 inet dhcp
> 	bridge_ports port1 port2

Hi Guenter

Does this actually matter? The ports which don't exists yet are not
being added to the bridge. The mask will come out correct. What
happens when port4 is made a member of the bridge? I expect it
works. It is the creation of the interface which triggers hotplug to
read interfaces and add the interface to the port.

It might however be a good idea to document somewhere that each
interface should be configured without assuming anything about the
state of other interfaces.

     Andrew

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23  3:14     ` Andrew Lunn
@ 2015-02-23  4:07       ` Guenter Roeck
  2015-02-23  4:22         ` Andrew Lunn
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-23  4:07 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, netdev, davem, vivien.didelot, jerome.oufella, cphealy

On 02/22/2015 07:14 PM, Andrew Lunn wrote:
> On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
>> On 02/17/2015 11:26 AM, Florian Fainelli wrote:
>>> In order to support bridging offloads in DSA switch drivers, select
>>> NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
>>> NDOs that we are required to implement.
>>>
>>> To facilitate the integratation at the DSA driver level, we implement 3
>>> types of operations:
>>>
>>
>> Hi Florian,
>>
>>>
>>> +/* Return a bitmask of all ports being currently bridged. Note that on
>>> + * leave, the mask will still return the bitmask of ports currently bridged,
>>> + * prior to port removal, and this is exactly what we want.
>>> + */
>>> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>>> +{
>>> +	unsigned int port;
>>> +	u32 mask = 0;
>>> +
>>> +	for (port = 0; port < DSA_MAX_PORTS; port++) {
>>> +		if (!((1 << port) & ds->phys_port_mask))
>>> +			continue;
>>> +
>>> +		if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>>> +			mask |= 1 << port;
>>
>> Problem is that the function can be called through dsa_slave_netdevice_event
>> before the slave devices are fully initialized.
>>
>> After adding
>>
>> +               if (!ds->ports[port]) {
>> +                       netdev_err(bridge,
>> +                                  "No ports data for port %d, mask=0x%x\n",
>> +                                  port, ds->phys_port_mask);
>> +                       continue;
>> +               }
>>
>> and with some more debug messages added to dsa_switch_setup(), I see the following.
>>
>> [   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 1(port1)
>> [   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 2(port2)
>> [   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 3(port3)
>> [   14.472002] br0: No ports data for port 3, mask=0x1e
>> [   14.472053] br0: No ports data for port 4, mask=0x1e
>> [   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 4(host2esb)
>>
>> This happens if I add the bridge configuration to /etc/network/interfaces instead
>> of creating the bridge manually. Apparently dsa_switch_setup() is not yet complete
>> when dsa_slave_netdevice_event is executed to handle a state change on one of its
>> newly created slave interfaces.
>>
>> The relevant information from /etc/network/interfaces is:
>>
>> auto br0
>>
>> iface port1 inet manual
>> iface port2 inet manual
>>
>> iface br0 inet dhcp
>> 	bridge_ports port1 port2
>
> Hi Guenter
>
> Does this actually matter? The ports which don't exists yet are not
> being added to the bridge. The mask will come out correct. What
> happens when port4 is made a member of the bridge? I expect it
> works. It is the creation of the interface which triggers hotplug to
> read interfaces and add the interface to the port.
>

	if (!ds->ports[port])
		continue;

might be an option. However, I am not sure that what you say is correct,
at least not strictly speaking. dsa_slave_create() returns the created
slave device, which is added to ds->ports[port] in dsa_switch_setup().
Since there is no protection in dsa_switch_setup(), there is no guarantee
that the callback doesn't happen prior to the initialization of
ds->ports[port]. So the above would leave a race condition, where the
port being added to the bridge _is_ one for which ds->ports[port] is
not yet initialized.

Protecting the entire slave creation loop in dsa_switch_setup()
and using register_netdevice() in dsa_slave_create() solves the problem
as far as I can see, I just don't know if it is an acceptable solution.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23  4:07       ` Guenter Roeck
@ 2015-02-23  4:22         ` Andrew Lunn
  2015-02-23  4:33           ` Florian Fainelli
  2015-02-23  4:38           ` Guenter Roeck
  0 siblings, 2 replies; 65+ messages in thread
From: Andrew Lunn @ 2015-02-23  4:22 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Florian Fainelli, netdev, davem, vivien.didelot, jerome.oufella, cphealy

On Sun, Feb 22, 2015 at 08:07:03PM -0800, Guenter Roeck wrote:
> On 02/22/2015 07:14 PM, Andrew Lunn wrote:
> >On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
> >>On 02/17/2015 11:26 AM, Florian Fainelli wrote:
> >>>In order to support bridging offloads in DSA switch drivers, select
> >>>NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
> >>>NDOs that we are required to implement.
> >>>
> >>>To facilitate the integratation at the DSA driver level, we implement 3
> >>>types of operations:
> >>>
> >>
> >>Hi Florian,
> >>
> >>>
> >>>+/* Return a bitmask of all ports being currently bridged. Note that on
> >>>+ * leave, the mask will still return the bitmask of ports currently bridged,
> >>>+ * prior to port removal, and this is exactly what we want.
> >>>+ */
> >>>+static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
> >>>+{
> >>>+	unsigned int port;
> >>>+	u32 mask = 0;
> >>>+
> >>>+	for (port = 0; port < DSA_MAX_PORTS; port++) {
> >>>+		if (!((1 << port) & ds->phys_port_mask))
> >>>+			continue;
> >>>+
> >>>+		if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
> >>>+			mask |= 1 << port;
> >>
> >>Problem is that the function can be called through dsa_slave_netdevice_event
> >>before the slave devices are fully initialized.
> >>
> >>After adding
> >>
> >>+               if (!ds->ports[port]) {
> >>+                       netdev_err(bridge,
> >>+                                  "No ports data for port %d, mask=0x%x\n",
> >>+                                  port, ds->phys_port_mask);
> >>+                       continue;
> >>+               }
> >>
> >>and with some more debug messages added to dsa_switch_setup(), I see the following.
> >>
> >>[   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 1(port1)
> >>[   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 2(port2)
> >>[   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 3(port3)
> >>[   14.472002] br0: No ports data for port 3, mask=0x1e
> >>[   14.472053] br0: No ports data for port 4, mask=0x1e
> >>[   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 4(host2esb)
> >>
> >>This happens if I add the bridge configuration to /etc/network/interfaces instead
> >>of creating the bridge manually. Apparently dsa_switch_setup() is not yet complete
> >>when dsa_slave_netdevice_event is executed to handle a state change on one of its
> >>newly created slave interfaces.
> >>
> >>The relevant information from /etc/network/interfaces is:
> >>
> >>auto br0
> >>
> >>iface port1 inet manual
> >>iface port2 inet manual
> >>
> >>iface br0 inet dhcp
> >>	bridge_ports port1 port2
> >
> >Hi Guenter
> >
> >Does this actually matter? The ports which don't exists yet are not
> >being added to the bridge. The mask will come out correct. What
> >happens when port4 is made a member of the bridge? I expect it
> >works. It is the creation of the interface which triggers hotplug to
> >read interfaces and add the interface to the port.
> >
> 
> 	if (!ds->ports[port])
> 		continue;
> 
> might be an option. However, I am not sure that what you say is correct,
> at least not strictly speaking. dsa_slave_create() returns the created
> slave device, which is added to ds->ports[port] in dsa_switch_setup().
> Since there is no protection in dsa_switch_setup(), there is no guarantee
> that the callback doesn't happen prior to the initialization of
> ds->ports[port]. So the above would leave a race condition, where the
> port being added to the bridge _is_ one for which ds->ports[port] is
> not yet initialized.

Yes, you are correct, there is a race here.


> Protecting the entire slave creation loop in dsa_switch_setup()
> and using register_netdevice() in dsa_slave_create() solves the problem
> as far as I can see, I just don't know if it is an acceptable solution.

Or:

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 2173402d87e0..1aa120d6d0e4 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -325,8 +325,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
                                   index, i, pd->port_names[i]);
                        continue;
                }
-
-               ds->ports[i] = slave_dev;
        }
 
 #ifdef CONFIG_NET_DSA_HWMON
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index f23deadf42a0..d6004072a957 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -669,12 +669,14 @@ dsa_slave_create(struct dsa_switch *ds, struct device *parent,
                free_netdev(slave_dev);
                return NULL;
        }
+       ds->ports[port] = slave_dev;
 
        ret = register_netdev(slave_dev);
        if (ret) {
                netdev_err(master, "error %d registering interface %s\n",
                           ret, slave_dev->name);
                phy_disconnect(p->phy);
+               ds->ports[port] = NULL;
                free_netdev(slave_dev);
                return NULL;
        }

Not tested. But the point being, ensure everything is setup before
calling register_netdev().

	Andrew

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23  4:22         ` Andrew Lunn
@ 2015-02-23  4:33           ` Florian Fainelli
  2015-02-23  4:38           ` Guenter Roeck
  1 sibling, 0 replies; 65+ messages in thread
From: Florian Fainelli @ 2015-02-23  4:33 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Guenter Roeck, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

2015-02-22 20:22 GMT-08:00 Andrew Lunn <andrew@lunn.ch>:
> On Sun, Feb 22, 2015 at 08:07:03PM -0800, Guenter Roeck wrote:
>> On 02/22/2015 07:14 PM, Andrew Lunn wrote:
>> >On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
>> >>On 02/17/2015 11:26 AM, Florian Fainelli wrote:
>> >>>In order to support bridging offloads in DSA switch drivers, select
>> >>>NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
>> >>>NDOs that we are required to implement.
>> >>>
>> >>>To facilitate the integratation at the DSA driver level, we implement 3
>> >>>types of operations:
>> >>>
>> >>
>> >>Hi Florian,
>> >>
>> >>>
>> >>>+/* Return a bitmask of all ports being currently bridged. Note that on
>> >>>+ * leave, the mask will still return the bitmask of ports currently bridged,
>> >>>+ * prior to port removal, and this is exactly what we want.
>> >>>+ */
>> >>>+static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>> >>>+{
>> >>>+  unsigned int port;
>> >>>+  u32 mask = 0;
>> >>>+
>> >>>+  for (port = 0; port < DSA_MAX_PORTS; port++) {
>> >>>+          if (!((1 << port) & ds->phys_port_mask))
>> >>>+                  continue;
>> >>>+
>> >>>+          if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>> >>>+                  mask |= 1 << port;
>> >>
>> >>Problem is that the function can be called through dsa_slave_netdevice_event
>> >>before the slave devices are fully initialized.
>> >>
>> >>After adding
>> >>
>> >>+               if (!ds->ports[port]) {
>> >>+                       netdev_err(bridge,
>> >>+                                  "No ports data for port %d, mask=0x%x\n",
>> >>+                                  port, ds->phys_port_mask);
>> >>+                       continue;
>> >>+               }
>> >>
>> >>and with some more debug messages added to dsa_switch_setup(), I see the following.
>> >>
>> >>[   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 1(port1)
>> >>[   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 2(port2)
>> >>[   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 3(port3)
>> >>[   14.472002] br0: No ports data for port 3, mask=0x1e
>> >>[   14.472053] br0: No ports data for port 4, mask=0x1e
>> >>[   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 4(host2esb)
>> >>
>> >>This happens if I add the bridge configuration to /etc/network/interfaces instead
>> >>of creating the bridge manually. Apparently dsa_switch_setup() is not yet complete
>> >>when dsa_slave_netdevice_event is executed to handle a state change on one of its
>> >>newly created slave interfaces.
>> >>
>> >>The relevant information from /etc/network/interfaces is:
>> >>
>> >>auto br0
>> >>
>> >>iface port1 inet manual
>> >>iface port2 inet manual
>> >>
>> >>iface br0 inet dhcp
>> >>    bridge_ports port1 port2
>> >
>> >Hi Guenter
>> >
>> >Does this actually matter? The ports which don't exists yet are not
>> >being added to the bridge. The mask will come out correct. What
>> >happens when port4 is made a member of the bridge? I expect it
>> >works. It is the creation of the interface which triggers hotplug to
>> >read interfaces and add the interface to the port.
>> >
>>
>>       if (!ds->ports[port])
>>               continue;
>>
>> might be an option. However, I am not sure that what you say is correct,
>> at least not strictly speaking. dsa_slave_create() returns the created
>> slave device, which is added to ds->ports[port] in dsa_switch_setup().
>> Since there is no protection in dsa_switch_setup(), there is no guarantee
>> that the callback doesn't happen prior to the initialization of
>> ds->ports[port]. So the above would leave a race condition, where the
>> port being added to the bridge _is_ one for which ds->ports[port] is
>> not yet initialized.
>
> Yes, you are correct, there is a race here.
>
>
>> Protecting the entire slave creation loop in dsa_switch_setup()
>> and using register_netdevice() in dsa_slave_create() solves the problem
>> as far as I can see, I just don't know if it is an acceptable solution.
>
> Or:
>
> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
> index 2173402d87e0..1aa120d6d0e4 100644
> --- a/net/dsa/dsa.c
> +++ b/net/dsa/dsa.c
> @@ -325,8 +325,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
>                                    index, i, pd->port_names[i]);
>                         continue;
>                 }
> -
> -               ds->ports[i] = slave_dev;
>         }
>
>  #ifdef CONFIG_NET_DSA_HWMON
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index f23deadf42a0..d6004072a957 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -669,12 +669,14 @@ dsa_slave_create(struct dsa_switch *ds, struct device *parent,
>                 free_netdev(slave_dev);
>                 return NULL;
>         }
> +       ds->ports[port] = slave_dev;
>
>         ret = register_netdev(slave_dev);
>         if (ret) {
>                 netdev_err(master, "error %d registering interface %s\n",
>                            ret, slave_dev->name);
>                 phy_disconnect(p->phy);
> +               ds->ports[port] = NULL;
>                 free_netdev(slave_dev);
>                 return NULL;
>         }
>
> Not tested. But the point being, ensure everything is setup before
> calling register_netdev().

I prefer that too over using locking around the dsa slave network
device creation, provided that this works for Guenter as well.
-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23  4:22         ` Andrew Lunn
  2015-02-23  4:33           ` Florian Fainelli
@ 2015-02-23  4:38           ` Guenter Roeck
  2015-02-23  4:43             ` Florian Fainelli
  1 sibling, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-23  4:38 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, netdev, davem, vivien.didelot, jerome.oufella, cphealy

On 02/22/2015 08:22 PM, Andrew Lunn wrote:
> On Sun, Feb 22, 2015 at 08:07:03PM -0800, Guenter Roeck wrote:
>> On 02/22/2015 07:14 PM, Andrew Lunn wrote:
>>> On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
>>>> On 02/17/2015 11:26 AM, Florian Fainelli wrote:
>>>>> In order to support bridging offloads in DSA switch drivers, select
>>>>> NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
>>>>> NDOs that we are required to implement.
>>>>>
>>>>> To facilitate the integratation at the DSA driver level, we implement 3
>>>>> types of operations:
>>>>>
>>>>
>>>> Hi Florian,
>>>>
>>>>>
>>>>> +/* Return a bitmask of all ports being currently bridged. Note that on
>>>>> + * leave, the mask will still return the bitmask of ports currently bridged,
>>>>> + * prior to port removal, and this is exactly what we want.
>>>>> + */
>>>>> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>>>>> +{
>>>>> +	unsigned int port;
>>>>> +	u32 mask = 0;
>>>>> +
>>>>> +	for (port = 0; port < DSA_MAX_PORTS; port++) {
>>>>> +		if (!((1 << port) & ds->phys_port_mask))
>>>>> +			continue;
>>>>> +
>>>>> +		if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>>>>> +			mask |= 1 << port;
>>>>
>>>> Problem is that the function can be called through dsa_slave_netdevice_event
>>>> before the slave devices are fully initialized.
>>>>
>>>> After adding
>>>>
>>>> +               if (!ds->ports[port]) {
>>>> +                       netdev_err(bridge,
>>>> +                                  "No ports data for port %d, mask=0x%x\n",
>>>> +                                  port, ds->phys_port_mask);
>>>> +                       continue;
>>>> +               }
>>>>
>>>> and with some more debug messages added to dsa_switch_setup(), I see the following.
>>>>
>>>> [   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 1(port1)
>>>> [   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 2(port2)
>>>> [   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 3(port3)
>>>> [   14.472002] br0: No ports data for port 3, mask=0x1e
>>>> [   14.472053] br0: No ports data for port 4, mask=0x1e
>>>> [   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for port 4(host2esb)
>>>>
>>>> This happens if I add the bridge configuration to /etc/network/interfaces instead
>>>> of creating the bridge manually. Apparently dsa_switch_setup() is not yet complete
>>>> when dsa_slave_netdevice_event is executed to handle a state change on one of its
>>>> newly created slave interfaces.
>>>>
>>>> The relevant information from /etc/network/interfaces is:
>>>>
>>>> auto br0
>>>>
>>>> iface port1 inet manual
>>>> iface port2 inet manual
>>>>
>>>> iface br0 inet dhcp
>>>> 	bridge_ports port1 port2
>>>
>>> Hi Guenter
>>>
>>> Does this actually matter? The ports which don't exists yet are not
>>> being added to the bridge. The mask will come out correct. What
>>> happens when port4 is made a member of the bridge? I expect it
>>> works. It is the creation of the interface which triggers hotplug to
>>> read interfaces and add the interface to the port.
>>>
>>
>> 	if (!ds->ports[port])
>> 		continue;
>>
>> might be an option. However, I am not sure that what you say is correct,
>> at least not strictly speaking. dsa_slave_create() returns the created
>> slave device, which is added to ds->ports[port] in dsa_switch_setup().
>> Since there is no protection in dsa_switch_setup(), there is no guarantee
>> that the callback doesn't happen prior to the initialization of
>> ds->ports[port]. So the above would leave a race condition, where the
>> port being added to the bridge _is_ one for which ds->ports[port] is
>> not yet initialized.
>
> Yes, you are correct, there is a race here.
>
>
>> Protecting the entire slave creation loop in dsa_switch_setup()
>> and using register_netdevice() in dsa_slave_create() solves the problem
>> as far as I can see, I just don't know if it is an acceptable solution.
>
> Or:
>
> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
> index 2173402d87e0..1aa120d6d0e4 100644
> --- a/net/dsa/dsa.c
> +++ b/net/dsa/dsa.c
> @@ -325,8 +325,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
>                                     index, i, pd->port_names[i]);
>                          continue;
>                  }
> -
> -               ds->ports[i] = slave_dev;
>          }
>
>   #ifdef CONFIG_NET_DSA_HWMON
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index f23deadf42a0..d6004072a957 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -669,12 +669,14 @@ dsa_slave_create(struct dsa_switch *ds, struct device *parent,
>                  free_netdev(slave_dev);
>                  return NULL;
>          }
> +       ds->ports[port] = slave_dev;
>
>          ret = register_netdev(slave_dev);
>          if (ret) {
>                  netdev_err(master, "error %d registering interface %s\n",
>                             ret, slave_dev->name);
>                  phy_disconnect(p->phy);
> +               ds->ports[port] = NULL;
>                  free_netdev(slave_dev);
>                  return NULL;
>          }
>
> Not tested. But the point being, ensure everything is setup before
> calling register_netdev().
>

That should work. I'll give it a try.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23  4:38           ` Guenter Roeck
@ 2015-02-23  4:43             ` Florian Fainelli
  2015-02-23  6:19               ` Guenter Roeck
  2015-02-23 13:34               ` Andrew Lunn
  0 siblings, 2 replies; 65+ messages in thread
From: Florian Fainelli @ 2015-02-23  4:43 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

2015-02-22 20:38 GMT-08:00 Guenter Roeck <linux@roeck-us.net>:
> On 02/22/2015 08:22 PM, Andrew Lunn wrote:
>>
>> On Sun, Feb 22, 2015 at 08:07:03PM -0800, Guenter Roeck wrote:
>>>
>>> On 02/22/2015 07:14 PM, Andrew Lunn wrote:
>>>>
>>>> On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
>>>>>
>>>>> On 02/17/2015 11:26 AM, Florian Fainelli wrote:
>>>>>>
>>>>>> In order to support bridging offloads in DSA switch drivers, select
>>>>>> NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
>>>>>> NDOs that we are required to implement.
>>>>>>
>>>>>> To facilitate the integratation at the DSA driver level, we implement
>>>>>> 3
>>>>>> types of operations:
>>>>>>
>>>>>
>>>>> Hi Florian,
>>>>>
>>>>>>
>>>>>> +/* Return a bitmask of all ports being currently bridged. Note that
>>>>>> on
>>>>>> + * leave, the mask will still return the bitmask of ports currently
>>>>>> bridged,
>>>>>> + * prior to port removal, and this is exactly what we want.
>>>>>> + */
>>>>>> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>>>>>> +{
>>>>>> +       unsigned int port;
>>>>>> +       u32 mask = 0;
>>>>>> +
>>>>>> +       for (port = 0; port < DSA_MAX_PORTS; port++) {
>>>>>> +               if (!((1 << port) & ds->phys_port_mask))
>>>>>> +                       continue;
>>>>>> +
>>>>>> +               if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>>>>>> +                       mask |= 1 << port;
>>>>>
>>>>>
>>>>> Problem is that the function can be called through
>>>>> dsa_slave_netdevice_event
>>>>> before the slave devices are fully initialized.
>>>>>
>>>>> After adding
>>>>>
>>>>> +               if (!ds->ports[port]) {
>>>>> +                       netdev_err(bridge,
>>>>> +                                  "No ports data for port %d,
>>>>> mask=0x%x\n",
>>>>> +                                  port, ds->phys_port_mask);
>>>>> +                       continue;
>>>>> +               }
>>>>>
>>>>> and with some more debug messages added to dsa_switch_setup(), I see
>>>>> the following.
>>>>>
>>>>> [   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>> port 1(port1)
>>>>> [   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>> port 2(port2)
>>>>> [   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>> port 3(port3)
>>>>> [   14.472002] br0: No ports data for port 3, mask=0x1e
>>>>> [   14.472053] br0: No ports data for port 4, mask=0x1e
>>>>> [   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>> port 4(host2esb)
>>>>>
>>>>> This happens if I add the bridge configuration to
>>>>> /etc/network/interfaces instead
>>>>> of creating the bridge manually. Apparently dsa_switch_setup() is not
>>>>> yet complete
>>>>> when dsa_slave_netdevice_event is executed to handle a state change on
>>>>> one of its
>>>>> newly created slave interfaces.
>>>>>
>>>>> The relevant information from /etc/network/interfaces is:
>>>>>
>>>>> auto br0
>>>>>
>>>>> iface port1 inet manual
>>>>> iface port2 inet manual
>>>>>
>>>>> iface br0 inet dhcp
>>>>>         bridge_ports port1 port2
>>>>
>>>>
>>>> Hi Guenter
>>>>
>>>> Does this actually matter? The ports which don't exists yet are not
>>>> being added to the bridge. The mask will come out correct. What
>>>> happens when port4 is made a member of the bridge? I expect it
>>>> works. It is the creation of the interface which triggers hotplug to
>>>> read interfaces and add the interface to the port.
>>>>
>>>
>>>         if (!ds->ports[port])
>>>                 continue;
>>>
>>> might be an option. However, I am not sure that what you say is correct,
>>> at least not strictly speaking. dsa_slave_create() returns the created
>>> slave device, which is added to ds->ports[port] in dsa_switch_setup().
>>> Since there is no protection in dsa_switch_setup(), there is no guarantee
>>> that the callback doesn't happen prior to the initialization of
>>> ds->ports[port]. So the above would leave a race condition, where the
>>> port being added to the bridge _is_ one for which ds->ports[port] is
>>> not yet initialized.
>>
>>
>> Yes, you are correct, there is a race here.
>>
>>
>>> Protecting the entire slave creation loop in dsa_switch_setup()
>>> and using register_netdevice() in dsa_slave_create() solves the problem
>>> as far as I can see, I just don't know if it is an acceptable solution.
>>
>>
>> Or:
>>
>> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
>> index 2173402d87e0..1aa120d6d0e4 100644
>> --- a/net/dsa/dsa.c
>> +++ b/net/dsa/dsa.c
>> @@ -325,8 +325,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int
>> index,
>>                                     index, i, pd->port_names[i]);
>>                          continue;
>>                  }
>> -
>> -               ds->ports[i] = slave_dev;
>>          }
>>
>>   #ifdef CONFIG_NET_DSA_HWMON
>> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
>> index f23deadf42a0..d6004072a957 100644
>> --- a/net/dsa/slave.c
>> +++ b/net/dsa/slave.c
>> @@ -669,12 +669,14 @@ dsa_slave_create(struct dsa_switch *ds, struct
>> device *parent,
>>                  free_netdev(slave_dev);
>>                  return NULL;
>>          }
>> +       ds->ports[port] = slave_dev;
>>
>>          ret = register_netdev(slave_dev);
>>          if (ret) {
>>                  netdev_err(master, "error %d registering interface %s\n",
>>                             ret, slave_dev->name);
>>                  phy_disconnect(p->phy);
>> +               ds->ports[port] = NULL;
>>                  free_netdev(slave_dev);
>>                  return NULL;
>>          }
>>
>> Not tested. But the point being, ensure everything is setup before
>> calling register_netdev().
>>
>
> That should work. I'll give it a try.

BTW, before I re-submit this patch series, do you think we should
introduce a fdb_flush() callback that switch drivers are required to
implement, and invoke it from net/dsa/slave.c upon port join/leave?

Thanks!
-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23  4:43             ` Florian Fainelli
@ 2015-02-23  6:19               ` Guenter Roeck
  2015-02-23 13:34               ` Andrew Lunn
  1 sibling, 0 replies; 65+ messages in thread
From: Guenter Roeck @ 2015-02-23  6:19 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

On 02/22/2015 08:43 PM, Florian Fainelli wrote:
> 2015-02-22 20:38 GMT-08:00 Guenter Roeck <linux@roeck-us.net>:
>> On 02/22/2015 08:22 PM, Andrew Lunn wrote:
>>>
>>> On Sun, Feb 22, 2015 at 08:07:03PM -0800, Guenter Roeck wrote:
>>>>
>>>> On 02/22/2015 07:14 PM, Andrew Lunn wrote:
>>>>>
>>>>> On Sun, Feb 22, 2015 at 06:20:44PM -0800, Guenter Roeck wrote:
>>>>>>
>>>>>> On 02/17/2015 11:26 AM, Florian Fainelli wrote:
>>>>>>>
>>>>>>> In order to support bridging offloads in DSA switch drivers, select
>>>>>>> NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
>>>>>>> NDOs that we are required to implement.
>>>>>>>
>>>>>>> To facilitate the integratation at the DSA driver level, we implement
>>>>>>> 3
>>>>>>> types of operations:
>>>>>>>
>>>>>>
>>>>>> Hi Florian,
>>>>>>
>>>>>>>
>>>>>>> +/* Return a bitmask of all ports being currently bridged. Note that
>>>>>>> on
>>>>>>> + * leave, the mask will still return the bitmask of ports currently
>>>>>>> bridged,
>>>>>>> + * prior to port removal, and this is exactly what we want.
>>>>>>> + */
>>>>>>> +static u32 dsa_slave_br_port_mask(struct dsa_switch *ds)
>>>>>>> +{
>>>>>>> +       unsigned int port;
>>>>>>> +       u32 mask = 0;
>>>>>>> +
>>>>>>> +       for (port = 0; port < DSA_MAX_PORTS; port++) {
>>>>>>> +               if (!((1 << port) & ds->phys_port_mask))
>>>>>>> +                       continue;
>>>>>>> +
>>>>>>> +               if (ds->ports[port]->priv_flags & IFF_BRIDGE_PORT)
>>>>>>> +                       mask |= 1 << port;
>>>>>>
>>>>>>
>>>>>> Problem is that the function can be called through
>>>>>> dsa_slave_netdevice_event
>>>>>> before the slave devices are fully initialized.
>>>>>>
>>>>>> After adding
>>>>>>
>>>>>> +               if (!ds->ports[port]) {
>>>>>> +                       netdev_err(bridge,
>>>>>> +                                  "No ports data for port %d,
>>>>>> mask=0x%x\n",
>>>>>> +                                  port, ds->phys_port_mask);
>>>>>> +                       continue;
>>>>>> +               }
>>>>>>
>>>>>> and with some more debug messages added to dsa_switch_setup(), I see
>>>>>> the following.
>>>>>>
>>>>>> [   14.187290] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>>> port 1(port1)
>>>>>> [   14.272605] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>>> port 2(port2)
>>>>>> [   14.353118] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>>> port 3(port3)
>>>>>> [   14.472002] br0: No ports data for port 3, mask=0x1e
>>>>>> [   14.472053] br0: No ports data for port 4, mask=0x1e
>>>>>> [   14.472753] e1000e 0000:00:19.0 em1: [0]: Creating slave device for
>>>>>> port 4(host2esb)
>>>>>>
>>>>>> This happens if I add the bridge configuration to
>>>>>> /etc/network/interfaces instead
>>>>>> of creating the bridge manually. Apparently dsa_switch_setup() is not
>>>>>> yet complete
>>>>>> when dsa_slave_netdevice_event is executed to handle a state change on
>>>>>> one of its
>>>>>> newly created slave interfaces.
>>>>>>
>>>>>> The relevant information from /etc/network/interfaces is:
>>>>>>
>>>>>> auto br0
>>>>>>
>>>>>> iface port1 inet manual
>>>>>> iface port2 inet manual
>>>>>>
>>>>>> iface br0 inet dhcp
>>>>>>          bridge_ports port1 port2
>>>>>
>>>>>
>>>>> Hi Guenter
>>>>>
>>>>> Does this actually matter? The ports which don't exists yet are not
>>>>> being added to the bridge. The mask will come out correct. What
>>>>> happens when port4 is made a member of the bridge? I expect it
>>>>> works. It is the creation of the interface which triggers hotplug to
>>>>> read interfaces and add the interface to the port.
>>>>>
>>>>
>>>>          if (!ds->ports[port])
>>>>                  continue;
>>>>
>>>> might be an option. However, I am not sure that what you say is correct,
>>>> at least not strictly speaking. dsa_slave_create() returns the created
>>>> slave device, which is added to ds->ports[port] in dsa_switch_setup().
>>>> Since there is no protection in dsa_switch_setup(), there is no guarantee
>>>> that the callback doesn't happen prior to the initialization of
>>>> ds->ports[port]. So the above would leave a race condition, where the
>>>> port being added to the bridge _is_ one for which ds->ports[port] is
>>>> not yet initialized.
>>>
>>>
>>> Yes, you are correct, there is a race here.
>>>
>>>
>>>> Protecting the entire slave creation loop in dsa_switch_setup()
>>>> and using register_netdevice() in dsa_slave_create() solves the problem
>>>> as far as I can see, I just don't know if it is an acceptable solution.
>>>
>>>
>>> Or:
>>>
>>> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
>>> index 2173402d87e0..1aa120d6d0e4 100644
>>> --- a/net/dsa/dsa.c
>>> +++ b/net/dsa/dsa.c
>>> @@ -325,8 +325,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int
>>> index,
>>>                                      index, i, pd->port_names[i]);
>>>                           continue;
>>>                   }
>>> -
>>> -               ds->ports[i] = slave_dev;
>>>           }
>>>
>>>    #ifdef CONFIG_NET_DSA_HWMON
>>> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
>>> index f23deadf42a0..d6004072a957 100644
>>> --- a/net/dsa/slave.c
>>> +++ b/net/dsa/slave.c
>>> @@ -669,12 +669,14 @@ dsa_slave_create(struct dsa_switch *ds, struct
>>> device *parent,
>>>                   free_netdev(slave_dev);
>>>                   return NULL;
>>>           }
>>> +       ds->ports[port] = slave_dev;
>>>
>>>           ret = register_netdev(slave_dev);
>>>           if (ret) {
>>>                   netdev_err(master, "error %d registering interface %s\n",
>>>                              ret, slave_dev->name);
>>>                   phy_disconnect(p->phy);
>>> +               ds->ports[port] = NULL;
>>>                   free_netdev(slave_dev);
>>>                   return NULL;
>>>           }
>>>
>>> Not tested. But the point being, ensure everything is setup before
>>> calling register_netdev().
>>>
>>
>> That should work. I'll give it a try.
>
Looks like it works, with a couple of additional changes.
dsa_slave_create doesn't need to return slave_dev anymore,
and I still need to check if ds->ports[port] is NULL in
dsa_slave_br_port_mask.

> BTW, before I re-submit this patch series, do you think we should
> introduce a fdb_flush() callback that switch drivers are required to
> implement, and invoke it from net/dsa/slave.c upon port join/leave?
>

It also needs to be called for state changes. Right now I call it from
the driver code. I don't really have a strong opinion either way.

One caveat: state updates may be called with soft interrupts disabled,
meaning I can only schedule a state change but not directly execute it.
You don't see that in the sf2 code since you don't use mdio to talk
with the switch. This should probably be documented somewhere.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23  4:43             ` Florian Fainelli
  2015-02-23  6:19               ` Guenter Roeck
@ 2015-02-23 13:34               ` Andrew Lunn
  2015-02-23 14:23                 ` Guenter Roeck
  1 sibling, 1 reply; 65+ messages in thread
From: Andrew Lunn @ 2015-02-23 13:34 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Guenter Roeck, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

> BTW, before I re-submit this patch series, do you think we should
> introduce a fdb_flush() callback that switch drivers are required to
> implement, and invoke it from net/dsa/slave.c upon port join/leave?

Probably a good idea.

We should define exactly what is flushed. Everything? Or only dynamic
entries? The Marvell hardware can also have multicast addresses in its
tables, which can age, or static unicast and multicast entries.

I guess at some point, somebody will start thinking about IGMP
snooping and then we might need separated operations.

	Andrew

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23 13:34               ` Andrew Lunn
@ 2015-02-23 14:23                 ` Guenter Roeck
  2015-02-23 16:01                   ` Andrew Lunn
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-23 14:23 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli
  Cc: netdev, David Miller, Vivien Didelot, jerome.oufella, Chris Healy

On 02/23/2015 05:34 AM, Andrew Lunn wrote:
>> BTW, before I re-submit this patch series, do you think we should
>> introduce a fdb_flush() callback that switch drivers are required to
>> implement, and invoke it from net/dsa/slave.c upon port join/leave?
>
> Probably a good idea.
>
> We should define exactly what is flushed. Everything? Or only dynamic
> entries? The Marvell hardware can also have multicast addresses in its
> tables, which can age, or static unicast and multicast entries.
>

I currently use ATU command 110 (flush all non-static entries in a
particular FID). I see means to flush either all entries or all
non-static entries, but no means to only flush unicast or multicast
entries. Does any of the standards distinguish between learned unicast
and multicast addresses ? Flushing those selectively might be a
challenge.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23 14:23                 ` Guenter Roeck
@ 2015-02-23 16:01                   ` Andrew Lunn
  2015-02-23 18:05                     ` Florian Fainelli
  0 siblings, 1 reply; 65+ messages in thread
From: Andrew Lunn @ 2015-02-23 16:01 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Florian Fainelli, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

> I currently use ATU command 110 (flush all non-static entries in a
> particular FID). I see means to flush either all entries or all
> non-static entries, but no means to only flush unicast or multicast
> entries. Does any of the standards distinguish between learned unicast
> and multicast addresses ? Flushing those selectively might be a
> challenge.

You might need to walk the table and flush records individually if you
are only interested in one type.

We should also consider do we need to make these flush operations
atomic with respect to other operations? Do we need to disable
learning, flush, change the port STP status, and then enable learning?

	  Andrew

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23 16:01                   ` Andrew Lunn
@ 2015-02-23 18:05                     ` Florian Fainelli
  2015-02-23 18:35                       ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Florian Fainelli @ 2015-02-23 18:05 UTC (permalink / raw)
  To: Andrew Lunn, Guenter Roeck
  Cc: netdev, David Miller, Vivien Didelot, jerome.oufella, Chris Healy

On 23/02/15 08:01, Andrew Lunn wrote:
>> I currently use ATU command 110 (flush all non-static entries in a
>> particular FID). I see means to flush either all entries or all
>> non-static entries, but no means to only flush unicast or multicast
>> entries. Does any of the standards distinguish between learned unicast
>> and multicast addresses ? Flushing those selectively might be a
>> challenge.

Lucky you, on Broadcom switches you have to issue an ARL search, get the
results (there are all valid MAC entries, fortunately), and invalidate
the entries one by one for your particular ports of interest, there is
no "flush all non-static entries".

> 
> You might need to walk the table and flush records individually if you
> are only interested in one type.
> 
> We should also consider do we need to make these flush operations
> atomic with respect to other operations? Do we need to disable
> learning, flush, change the port STP status, and then enable learning?

I think we may have to do this to guarantee no race conditions between
flushing the switch's FDB, although it would look like only "joining" a
bridge needs to be a more controlled operation, on leave we can probably
just leave the bridge, flush entries and the switch port will start
learning new MAC addresses, right?

Alternatively, would not setting a very low aging timeout and
maintaining HW learning still allow us to simplify these operations?
-- 
Florian

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23 18:05                     ` Florian Fainelli
@ 2015-02-23 18:35                       ` Guenter Roeck
  2015-02-25 13:43                         ` Andrey Volkov
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-23 18:35 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

On Mon, Feb 23, 2015 at 10:05:41AM -0800, Florian Fainelli wrote:
> On 23/02/15 08:01, Andrew Lunn wrote:
> >> I currently use ATU command 110 (flush all non-static entries in a
> >> particular FID). I see means to flush either all entries or all
> >> non-static entries, but no means to only flush unicast or multicast
> >> entries. Does any of the standards distinguish between learned unicast
> >> and multicast addresses ? Flushing those selectively might be a
> >> challenge.
> 
> Lucky you, on Broadcom switches you have to issue an ARL search, get the
> results (there are all valid MAC entries, fortunately), and invalidate
> the entries one by one for your particular ports of interest, there is
> no "flush all non-static entries".
> 
> > 
> > You might need to walk the table and flush records individually if you
> > are only interested in one type.
> > 
> > We should also consider do we need to make these flush operations
> > atomic with respect to other operations? Do we need to disable
> > learning, flush, change the port STP status, and then enable learning?
> 
Wonder what if anything RSTP specifies for flush operation details.

> I think we may have to do this to guarantee no race conditions between
> flushing the switch's FDB, although it would look like only "joining" a
> bridge needs to be a more controlled operation, on leave we can probably
> just leave the bridge, flush entries and the switch port will start
> learning new MAC addresses, right?
> 
> Alternatively, would not setting a very low aging timeout and
> maintaining HW learning still allow us to simplify these operations?

That is what STP specifies. With RSTP, the expectation is that the database
is flushed immediately on port status changes. Also, the minimum aging
period on Marvell switches is 15 seconds, which is way too long for RSTP.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-23 18:35                       ` Guenter Roeck
@ 2015-02-25 13:43                         ` Andrey Volkov
  2015-02-25 14:25                           ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Andrey Volkov @ 2015-02-25 13:43 UTC (permalink / raw)
  To: Guenter Roeck, Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

Gunter, Florian,

Le 23/02/2015 19:35, Guenter Roeck a wrote :
> On Mon, Feb 23, 2015 at 10:05:41AM -0800, Florian Fainelli wrote:
>> On 23/02/15 08:01, Andrew Lunn wrote:
>>>> I currently use ATU command 110 (flush all non-static entries in a
>>>> particular FID). I see means to flush either all entries or all
>>>> non-static entries, but no means to only flush unicast or multicast
>>>> entries. Does any of the standards distinguish between learned unicast
>>>> and multicast addresses ? Flushing those selectively might be a
>>>> challenge.
>> Lucky you, on Broadcom switches you have to issue an ARL search, get the
>> results (there are all valid MAC entries, fortunately), and invalidate
>> the entries one by one for your particular ports of interest, there is
>> no "flush all non-static entries".
>>
>>> You might need to walk the table and flush records individually if you
>>> are only interested in one type.
>>>
>>> We should also consider do we need to make these flush operations
>>> atomic with respect to other operations? Do we need to disable
>>> learning, flush, change the port STP status, and then enable learning?
> Wonder what if anything RSTP specifies for flush operation details.
>
>> I think we may have to do this to guarantee no race conditions between
>> flushing the switch's FDB, although it would look like only "joining" a
>> bridge needs to be a more controlled operation, on leave we can probably
>> just leave the bridge, flush entries and the switch port will start
>> learning new MAC addresses, right?
>>
>> Alternatively, would not setting a very low aging timeout and
>> maintaining HW learning still allow us to simplify these operations?
> That is what STP specifies. With RSTP, the expectation is that the database
> is flushed immediately on port status changes. Also, the minimum aging
> period on Marvell switches is 15 seconds, which is way too long for RSTP.
>
> Guenter
>
I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
(I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will 
contain something useful at the enter time. Also I've look through yours patches and I haven't
seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?

Btw your current FID implementation contain funny security problem: same ports in the different chips,
interconnected by DSA, will have same FID and as result they will treated as bridged together by
internal switch logic...

Andrey

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-25 13:43                         ` Andrey Volkov
@ 2015-02-25 14:25                           ` Guenter Roeck
  2015-02-25 16:05                             ` Andrey Volkov
  2015-02-27 17:09                             ` Andrey Volkov
  0 siblings, 2 replies; 65+ messages in thread
From: Guenter Roeck @ 2015-02-25 14:25 UTC (permalink / raw)
  To: Andrey Volkov, Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

Andrey,

On 02/25/2015 05:43 AM, Andrey Volkov wrote:
> Gunter, Florian,
>
> Le 23/02/2015 19:35, Guenter Roeck a wrote :
>> On Mon, Feb 23, 2015 at 10:05:41AM -0800, Florian Fainelli wrote:
>>> On 23/02/15 08:01, Andrew Lunn wrote:
>>>>> I currently use ATU command 110 (flush all non-static entries in a
>>>>> particular FID). I see means to flush either all entries or all
>>>>> non-static entries, but no means to only flush unicast or multicast
>>>>> entries. Does any of the standards distinguish between learned unicast
>>>>> and multicast addresses ? Flushing those selectively might be a
>>>>> challenge.
>>> Lucky you, on Broadcom switches you have to issue an ARL search, get the
>>> results (there are all valid MAC entries, fortunately), and invalidate
>>> the entries one by one for your particular ports of interest, there is
>>> no "flush all non-static entries".
>>>
>>>> You might need to walk the table and flush records individually if you
>>>> are only interested in one type.
>>>>
>>>> We should also consider do we need to make these flush operations
>>>> atomic with respect to other operations? Do we need to disable
>>>> learning, flush, change the port STP status, and then enable learning?
>> Wonder what if anything RSTP specifies for flush operation details.
>>
>>> I think we may have to do this to guarantee no race conditions between
>>> flushing the switch's FDB, although it would look like only "joining" a
>>> bridge needs to be a more controlled operation, on leave we can probably
>>> just leave the bridge, flush entries and the switch port will start
>>> learning new MAC addresses, right?
>>>
>>> Alternatively, would not setting a very low aging timeout and
>>> maintaining HW learning still allow us to simplify these operations?
>> That is what STP specifies. With RSTP, the expectation is that the database
>> is flushed immediately on port status changes. Also, the minimum aging
>> period on Marvell switches is 15 seconds, which is way too long for RSTP.
>>
>> Guenter
>>
> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
> contain something useful at the enter time. Also I've look through yours patches and I haven't

Does removing a port from a fid clean up the entries associated with it
in the database ?

> seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?
>
Not me. That would be difficult to test without real hardware.

The above suggests that you have a HW bridge implementation for Marvell chips as well.
Would it make sense to merge our implementations, or just use yours if it is better ?

> Btw your current FID implementation contain funny security problem: same ports in the different chips,
> interconnected by DSA, will have same FID and as result they will treated as bridged together by
> internal switch logic...
>
You mean if multiple switch chips are used ? Those ports are configured to only send
data to the CPU port. Doesn't that take care of the problem ? Granted, I have not
looked into multi-chip applications, so there may well be some problems. Maybe
it is possible to merge a chip ID into the fid to solve it.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-25 14:25                           ` Guenter Roeck
@ 2015-02-25 16:05                             ` Andrey Volkov
  2015-02-25 17:41                               ` Guenter Roeck
  2015-02-27 17:09                             ` Andrey Volkov
  1 sibling, 1 reply; 65+ messages in thread
From: Andrey Volkov @ 2015-02-25 16:05 UTC (permalink / raw)
  To: Guenter Roeck, Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

Le 25/02/2015 15:25, Guenter Roeck wrote:
> Andrey,
> 
> On 02/25/2015 05:43 AM, Andrey Volkov wrote:
>> Gunter, Florian,
>>
>> Le 23/02/2015 19:35, Guenter Roeck a wrote :
>>> On Mon, Feb 23, 2015 at 10:05:41AM -0800, Florian Fainelli wrote:
>>>> On 23/02/15 08:01, Andrew Lunn wrote:
>>>>>> I currently use ATU command 110 (flush all non-static entries in a
>>>>>> particular FID). I see means to flush either all entries or all
>>>>>> non-static entries, but no means to only flush unicast or multicast
>>>>>> entries. Does any of the standards distinguish between learned unicast
>>>>>> and multicast addresses ? Flushing those selectively might be a
>>>>>> challenge.
>>>> Lucky you, on Broadcom switches you have to issue an ARL search, get the
>>>> results (there are all valid MAC entries, fortunately), and invalidate
>>>> the entries one by one for your particular ports of interest, there is
>>>> no "flush all non-static entries".
>>>>
>>>>> You might need to walk the table and flush records individually if you
>>>>> are only interested in one type.
>>>>>
>>>>> We should also consider do we need to make these flush operations
>>>>> atomic with respect to other operations? Do we need to disable
>>>>> learning, flush, change the port STP status, and then enable learning?
>>> Wonder what if anything RSTP specifies for flush operation details.
>>>
>>>> I think we may have to do this to guarantee no race conditions between
>>>> flushing the switch's FDB, although it would look like only "joining" a
>>>> bridge needs to be a more controlled operation, on leave we can probably
>>>> just leave the bridge, flush entries and the switch port will start
>>>> learning new MAC addresses, right?
>>>>
>>>> Alternatively, would not setting a very low aging timeout and
>>>> maintaining HW learning still allow us to simplify these operations?
>>> That is what STP specifies. With RSTP, the expectation is that the database
>>> is flushed immediately on port status changes. Also, the minimum aging
>>> period on Marvell switches is 15 seconds, which is way too long for RSTP.
>>>
>>> Guenter
>>>
>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>> contain something useful at the enter time. Also I've look through yours patches and I haven't
> 
> Does removing a port from a fid clean up the entries associated with it
> in the database ?
> 
It doesn't, sorry that I didn't described it clearly: I've tried to point to that fact that 
changing FID will cause 2 things:
 - learn/discard/... process for all following packets will begin from scratch (as it should be)
 - we could start (potentially) slow database cleanup process in dedicated thread/work, and we may not
   care about appearing of new ATU rules for the removed port, since packets now will be rejected 
   by port's logic.

>> seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?
>>
> Not me. That would be difficult to test without real hardware.
Not a problem for me :), I've already monster switch containing three different types of Marvell's chips 
just before me on my table.

> 
> The above suggests that you have a HW bridge implementation for Marvell chips as well.
> Would it make sense to merge our implementations, or just use yours if it is better ?
I've implemented same thing almost by same way, so for me it will be easer to rebase on top of your jobs,
especially due to the fact that I've enforced to use very old kernel: proprietary binary blobs...

> 
>> Btw your current FID implementation contain funny security problem: same ports in the different chips,
>> interconnected by DSA, will have same FID and as result they will treated as bridged together by
>> internal switch logic...
>>
> You mean if multiple switch chips are used ? Those ports are configured to only send
> data to the CPU port. Doesn't that take care of the problem ? Granted, I have not
> looked into multi-chip applications, so there may well be some problems.

My current project is to implement support of something like:

       .----------.    .--------.
       |  CPU1    |    |  CPU2  |
 .DSA--o (master) |    |        |
 |     '----------'    'o-------'
 |                  .---'
 | .-----.       .--o--.       .-----.
 '-o SW1 o--DSA--o SW2 o==DSA==o SW3 |
   '-----'       '-----'       '-----'
     |             |              |
   ports         ports          ports

Where SW2 and SW3 are interconnected by "trunk", everything managed by CPU1,
some ports of SW1-SW3 are bridged with CPU2, some with CPU1, and some bridged 
independently of CPUs. Also, as I told before, all SWs are from 
different chips families, so I'm using all, except 88e6060 and 6171, Marvell's drivers.

> Maybe
> it is possible to merge a chip ID into the fid to solve it.
Will not work IMHO, since to support interswitch bridges, some ports must have common id's,
so we should have some enumeration management at level of the DSA tree.
I've already implemented it as a free running counter, but implementation is wrong, terrible
and must be redesigned by hlists or alike.

Regards
Andrey

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-25 16:05                             ` Andrey Volkov
@ 2015-02-25 17:41                               ` Guenter Roeck
  2015-02-26  1:18                                 ` Andrey Volkov
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-25 17:41 UTC (permalink / raw)
  To: Andrey Volkov
  Cc: Florian Fainelli, Andrew Lunn, netdev, David Miller,
	Vivien Didelot, jerome.oufella, Chris Healy

Andrey,

On Wed, Feb 25, 2015 at 05:05:12PM +0100, Andrey Volkov wrote:
> > 
> > Does removing a port from a fid clean up the entries associated with it
> > in the database ?
> > 
> It doesn't, sorry that I didn't described it clearly: I've tried to point to that fact that 
> changing FID will cause 2 things:
>  - learn/discard/... process for all following packets will begin from scratch (as it should be)
>  - we could start (potentially) slow database cleanup process in dedicated thread/work, and we may not
>    care about appearing of new ATU rules for the removed port, since packets now will be rejected 
>    by port's logic.
> 
Any idea what happens if a packet is received which has an fdb entry
pointing to port X, which was just removed from the bridge group ?

> >> seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?
> >>
> > Not me. That would be difficult to test without real hardware.
> Not a problem for me :), I've already monster switch containing three different types of Marvell's chips 
> just before me on my table.
> 

Lucky (or unlucky ;-) you.

> > 
> > The above suggests that you have a HW bridge implementation for Marvell chips as well.
> > Would it make sense to merge our implementations, or just use yours if it is better ?
> I've implemented same thing almost by same way, so for me it will be easer to rebase on top of your jobs,
> especially due to the fact that I've enforced to use very old kernel: proprietary binary blobs...
> 

Can you by any chance share your code, and or do you plan
to submit it ?

I'll have to look into multi-bridge implementations at some point
in the future, so that would help a lot.

> > 
> >> Btw your current FID implementation contain funny security problem: same ports in the different chips,
> >> interconnected by DSA, will have same FID and as result they will treated as bridged together by
> >> internal switch logic...
> >>
> > You mean if multiple switch chips are used ? Those ports are configured to only send
> > data to the CPU port. Doesn't that take care of the problem ? Granted, I have not
> > looked into multi-chip applications, so there may well be some problems.
> 
> My current project is to implement support of something like:
> 
>        .----------.    .--------.
>        |  CPU1    |    |  CPU2  |
>  .DSA--o (master) |    |        |
>  |     '----------'    'o-------'
>  |                  .---'
>  | .-----.       .--o--.       .-----.
>  '-o SW1 o--DSA--o SW2 o==DSA==o SW3 |
>    '-----'       '-----'       '-----'
>      |             |              |
>    ports         ports          ports
> 
> Where SW2 and SW3 are interconnected by "trunk", everything managed by CPU1,
> some ports of SW1-SW3 are bridged with CPU2, some with CPU1, and some bridged 
> independently of CPUs. Also, as I told before, all SWs are from 
> different chips families, so I'm using all, except 88e6060 and 6171, Marvell's drivers.
> 
Sounds like a lot of fun, especially if/when both CPUs start messing
with switch configuration.

> > Maybe
> > it is possible to merge a chip ID into the fid to solve it.
> Will not work IMHO, since to support interswitch bridges, some ports must have common id's,
> so we should have some enumeration management at level of the DSA tree.
> I've already implemented it as a free running counter, but implementation is wrong, terrible
> and must be redesigned by hlists or alike.
> 
Maybe use ida to get a well defined id for each bridge group touched
by dsa ? This id could then be used by the driver to identify a bridge.

Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-25 17:41                               ` Guenter Roeck
@ 2015-02-26  1:18                                 ` Andrey Volkov
  0 siblings, 0 replies; 65+ messages in thread
From: Andrey Volkov @ 2015-02-26  1:18 UTC (permalink / raw)
  To: Guenter Roeck, Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

Guenter,

Le 25/02/2015 18:41, Guenter Roeck wrote :
> Andrey,
>
> On Wed, Feb 25, 2015 at 05:05:12PM +0100, Andrey Volkov wrote:
>>> Does removing a port from a fid clean up the entries associated with it
>>> in the database ?
>>>
>> It doesn't, sorry that I didn't described it clearly: I've tried to point to that fact that
>> changing FID will cause 2 things:
>>   - learn/discard/... process for all following packets will begin from scratch (as it should be)
>>   - we could start (potentially) slow database cleanup process in dedicated thread/work, and we may not
>>     care about appearing of new ATU rules for the removed port, since packets now will be rejected
>>     by port's logic.
>>
> Any idea what happens if a packet is received which has an fdb entry
> pointing to port X, which was just removed from the bridge group ?
I have some ideas (looks like it must be filtered out), but not sure,
I will do some experiments tomorrow.

>>>> seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?
>>>>
>>> Not me. That would be difficult to test without real hardware.
>> Not a problem for me :), I've already monster switch containing three different types of Marvell's chips
>> just before me on my table.
>>
> Lucky (or unlucky ;-) you.
M-m-m, how about undocumented 'PPU enable' 14'th bit and other such
gifts :D?

>>> The above suggests that you have a HW bridge implementation for Marvell chips as well.
>>> Would it make sense to merge our implementations, or just use yours if it is better ?
>> I've implemented same thing almost by same way, so for me it will be easer to rebase on top of your jobs,
>> especially due to the fact that I've enforced to use very old kernel: proprietary binary blobs...
>>
> Can you by any chance share your code, and or do you plan
> to submit it ?
Yes, sure, I plan to start submitting it in the first half of the March.

>
> I'll have to look into multi-bridge implementations at some point
> in the future, so that would help a lot.
>
>>>> Btw your current FID implementation contain funny security problem: same ports in the different chips,
>>>> interconnected by DSA, will have same FID and as result they will treated as bridged together by
>>>> internal switch logic...
>>>>
>>> You mean if multiple switch chips are used ? Those ports are configured to only send
>>> data to the CPU port. Doesn't that take care of the problem ? Granted, I have not
>>> looked into multi-chip applications, so there may well be some problems.
>> My current project is to implement support of something like:
>>
>>         .----------.    .--------.
>>         |  CPU1    |    |  CPU2  |
>>   .DSA--o (master) |    |        |
>>   |     '----------'    'o-------'
>>   |                  .---'
>>   | .-----.       .--o--.       .-----.
>>   '-o SW1 o--DSA--o SW2 o==DSA==o SW3 |
>>     '-----'       '-----'       '-----'
>>       |             |              |
>>     ports         ports          ports
>>
>> Where SW2 and SW3 are interconnected by "trunk", everything managed by CPU1,
>> some ports of SW1-SW3 are bridged with CPU2, some with CPU1, and some bridged
>> independently of CPUs. Also, as I told before, all SWs are from
>> different chips families, so I'm using all, except 88e6060 and 6171, Marvell's drivers.
>>
> Sounds like a lot of fun, especially if/when both CPUs start messing
> with switch configuration.
Yeah, the toy is really interesting and powerful, and fortunately not so terrible:
CPU2  connected like "normal" client, meanwhile mixture of trunk/bridge/normal ports
connected through mixture of PHYs still "little bit" disturbing me :).

>
>>> Maybe
>>> it is possible to merge a chip ID into the fid to solve it.
>> Will not work IMHO, since to support interswitch bridges, some ports must have common id's,
>> so we should have some enumeration management at level of the DSA tree.
>> I've already implemented it as a free running counter, but implementation is wrong, terrible
>> and must be redesigned by hlists or alike.
>>
> Maybe use ida to get a well defined id for each bridge group touched
> by dsa ? This id could then be used by the driver to identify a bridge.
Hmm, probably you are right, I need thinking about it little bit more.
Just for any case: these shared ids must be unique not only for bridges,
but for the Marvell's trunks (hidden from user space) and normal ports too.

Florian, I suspect that Broadcom have same document's policy as Marvell,
so you can not tell us too much, but is the trunk/multiswitch stuff will be
interesting for your too?

Andrey

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-25 14:25                           ` Guenter Roeck
  2015-02-25 16:05                             ` Andrey Volkov
@ 2015-02-27 17:09                             ` Andrey Volkov
  2015-02-28  7:53                               ` Guenter Roeck
  1 sibling, 1 reply; 65+ messages in thread
From: Andrey Volkov @ 2015-02-27 17:09 UTC (permalink / raw)
  To: Guenter Roeck, Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

Gunter,

Sorry with response delay, I very was busy yesterday

Le 25/02/2015 15:25, Guenter Roeck a écrit :
> Andrey,

------- snip ------- 

>>>
>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>> contain something useful at the enter time. Also I've look through yours patches and I haven't
> 
> Does removing a port from a fid clean up the entries associated with it
> in the database ?

I've checked what happened when port changed its FID: switch logic block traffic to it
immediately, as far as I can see, meanwhile record still exists in the bridge database,
it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it 
will happened.

> 
>> seen any mutichip bridges/hardwared "trunks" support (in the Marvell's sense), did anyone, except me, use it?
>>
> Not me. That would be difficult to test without real hardware.
> 
> The above suggests that you have a HW bridge implementation for Marvell chips as well.
> Would it make sense to merge our implementations, or just use yours if it is better ?
> 
>> Btw your current FID implementation contain funny security problem: same ports in the different chips,
>> interconnected by DSA, will have same FID and as result they will treated as bridged together by
>> internal switch logic...
>>
> You mean if multiple switch chips are used ? Those ports are configured to only send
> data to the CPU port. Doesn't that take care of the problem ? Granted, I have not
> looked into multi-chip applications, so there may well be some problems. Maybe
> it is possible to merge a chip ID into the fid to solve it.
> 
> Thanks,
> Guenter
> 

Regards,
Andrey

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-27 17:09                             ` Andrey Volkov
@ 2015-02-28  7:53                               ` Guenter Roeck
  2015-03-02 14:38                                 ` Andrey Volkov
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-02-28  7:53 UTC (permalink / raw)
  To: Andrey Volkov, Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

On 02/27/2015 09:09 AM, Andrey Volkov wrote:
> Gunter,
>
> Sorry with response delay, I very was busy yesterday
>
> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>> Andrey,
>
> ------- snip -------
>
>>>>
>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>
>> Does removing a port from a fid clean up the entries associated with it
>> in the database ?
>
> I've checked what happened when port changed its FID: switch logic block traffic to it
> immediately, as far as I can see, meanwhile record still exists in the bridge database,
> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
> will happened.
>
Hmm - interesting. I assume you mean updating port registers 5 and 6 ?

Different question: For 6185, did you write a new driver or extend an existing one ?
I found that it is quite similar to 6131, and that adding support for it to the 6131
driver should be straightforward.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-02-28  7:53                               ` Guenter Roeck
@ 2015-03-02 14:38                                 ` Andrey Volkov
  2015-03-02 14:49                                   ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Andrey Volkov @ 2015-03-02 14:38 UTC (permalink / raw)
  To: Guenter Roeck, Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy


On 28/02/2015 08:53, Guenter Roeck wrote:
> On 02/27/2015 09:09 AM, Andrey Volkov wrote:
>> Gunter,
>>
>> Sorry with response delay, I very was busy yesterday
>>
>> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>>> Andrey,
>>
>> ------- snip -------
>>
>>>>>
>>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>>
>>> Does removing a port from a fid clean up the entries associated with it
>>> in the database ?
>>
>> I've checked what happened when port changed its FID: switch logic block traffic to it
>> immediately, as far as I can see, meanwhile record still exists in the bridge database,
>> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
>> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
>> will happened.
>>
> Hmm - interesting. I assume you mean updating port registers 5 and 6 ?
Yes sure, it's reason why we must disable the port before changing the FID.

> 
> Different question: For 6185, did you write a new driver or extend an existing one ?
> I found that it is quite similar to 6131, and that adding support for it to the 6131
> driver should be straightforward.
Yes again :), and 88E6097 have same core as 6123_61_65. Difference in both cases only in the number
of supported ports, and it was main reason why hardcoded port's number was unacceptable for me, difference is 
large enough: for ex. 88e6123 have only 3 ports, but 88E6097 - 11.

> 
> Thanks,
> Guenter
> 
> 
--
Regards
Andrey

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-03-02 14:38                                 ` Andrey Volkov
@ 2015-03-02 14:49                                   ` Guenter Roeck
  2015-03-02 15:27                                     ` Andrey Volkov
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-03-02 14:49 UTC (permalink / raw)
  To: Andrey Volkov, Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

On 03/02/2015 06:38 AM, Andrey Volkov wrote:
>
> On 28/02/2015 08:53, Guenter Roeck wrote:
>> On 02/27/2015 09:09 AM, Andrey Volkov wrote:
>>> Gunter,
>>>
>>> Sorry with response delay, I very was busy yesterday
>>>
>>> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>>>> Andrey,
>>>
>>> ------- snip -------
>>>
>>>>>>
>>>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>>>
>>>> Does removing a port from a fid clean up the entries associated with it
>>>> in the database ?
>>>
>>> I've checked what happened when port changed its FID: switch logic block traffic to it
>>> immediately, as far as I can see, meanwhile record still exists in the bridge database,
>>> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
>>> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
>>> will happened.
>>>
>> Hmm - interesting. I assume you mean updating port registers 5 and 6 ?
> Yes sure, it's reason why we must disable the port before changing the FID.
>
Yes, I think we'll need to do that once we use the bits in register 5.

>>
>> Different question: For 6185, did you write a new driver or extend an existing one ?
>> I found that it is quite similar to 6131, and that adding support for it to the 6131
>> driver should be straightforward.
> Yes again :), and 88E6097 have same core as 6123_61_65. Difference in both cases only in the number
> of supported ports, and it was main reason why hardcoded port's number was unacceptable for me, difference is
> large enough: for ex. 88e6123 have only 3 ports, but 88E6097 - 11.
>

I have a patch set to change the number of ports to a variable in the 6131 driver.
Want me to submit it now ? Though I guess you must have pretty much the same,
so we can also use your approach. Let me know.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-03-02 14:49                                   ` Guenter Roeck
@ 2015-03-02 15:27                                     ` Andrey Volkov
  2015-03-02 17:16                                       ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Andrey Volkov @ 2015-03-02 15:27 UTC (permalink / raw)
  To: Guenter Roeck, Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

On 02/03/2015 15:49, Guenter Roeck wrote:
> On 03/02/2015 06:38 AM, Andrey Volkov wrote:
>>
>> On 28/02/2015 08:53, Guenter Roeck wrote:
>>> On 02/27/2015 09:09 AM, Andrey Volkov wrote:
>>>> Gunter,
>>>>
>>>> Sorry with response delay, I very was busy yesterday
>>>>
>>>> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>>>>> Andrey,
>>>>
>>>> ------- snip -------
>>>>
>>>>>>>
>>>>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>>>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>>>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>>>>
>>>>> Does removing a port from a fid clean up the entries associated with it
>>>>> in the database ?
>>>>
>>>> I've checked what happened when port changed its FID: switch logic block traffic to it
>>>> immediately, as far as I can see, meanwhile record still exists in the bridge database,
>>>> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
>>>> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
>>>> will happened.
>>>>
>>> Hmm - interesting. I assume you mean updating port registers 5 and 6 ?
>> Yes sure, it's reason why we must disable the port before changing the FID.
>>
> Yes, I think we'll need to do that once we use the bits in register 5.
> 
>>>
>>> Different question: For 6185, did you write a new driver or extend an existing one ?
>>> I found that it is quite similar to 6131, and that adding support for it to the 6131
>>> driver should be straightforward.
>> Yes again :), and 88E6097 have same core as 6123_61_65. Difference in both cases only in the number
>> of supported ports, and it was main reason why hardcoded port's number was unacceptable for me, difference is
>> large enough: for ex. 88e6123 have only 3 ports, but 88E6097 - 11.
>>
> 
> I have a patch set to change the number of ports to a variable in the 6131 driver.
> Want me to submit it now ? Though I guess you must have pretty much the same,
> so we can also use your approach. Let me know.

I think that better to start from my patches: they are more generic and have support of sysfs
(should be useful for "MII over ethernet"). Also IMO it will be better if we continue exchange/prereview
our patches in more narrow mail list, since I do not want to pollute netdev by useless discussions about drafts.
Objections/suggestions?

--
Regards
Andrey

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-03-02 15:27                                     ` Andrey Volkov
@ 2015-03-02 17:16                                       ` Guenter Roeck
  2015-03-04 10:07                                         ` Andrey Volkov
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2015-03-02 17:16 UTC (permalink / raw)
  To: Andrey Volkov, Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

On 03/02/2015 07:27 AM, Andrey Volkov wrote:
> On 02/03/2015 15:49, Guenter Roeck wrote:
>> On 03/02/2015 06:38 AM, Andrey Volkov wrote:
>>>
>>> On 28/02/2015 08:53, Guenter Roeck wrote:
>>>> On 02/27/2015 09:09 AM, Andrey Volkov wrote:
>>>>> Gunter,
>>>>>
>>>>> Sorry with response delay, I very was busy yesterday
>>>>>
>>>>> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>>>>>> Andrey,
>>>>>
>>>>> ------- snip -------
>>>>>
>>>>>>>>
>>>>>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>>>>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>>>>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>>>>>
>>>>>> Does removing a port from a fid clean up the entries associated with it
>>>>>> in the database ?
>>>>>
>>>>> I've checked what happened when port changed its FID: switch logic block traffic to it
>>>>> immediately, as far as I can see, meanwhile record still exists in the bridge database,
>>>>> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
>>>>> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
>>>>> will happened.
>>>>>
>>>> Hmm - interesting. I assume you mean updating port registers 5 and 6 ?
>>> Yes sure, it's reason why we must disable the port before changing the FID.
>>>
>> Yes, I think we'll need to do that once we use the bits in register 5.
>>
>>>>
>>>> Different question: For 6185, did you write a new driver or extend an existing one ?
>>>> I found that it is quite similar to 6131, and that adding support for it to the 6131
>>>> driver should be straightforward.
>>> Yes again :), and 88E6097 have same core as 6123_61_65. Difference in both cases only in the number
>>> of supported ports, and it was main reason why hardcoded port's number was unacceptable for me, difference is
>>> large enough: for ex. 88e6123 have only 3 ports, but 88E6097 - 11.
>>>
>>
>> I have a patch set to change the number of ports to a variable in the 6131 driver.
>> Want me to submit it now ? Though I guess you must have pretty much the same,
>> so we can also use your approach. Let me know.
>
> I think that better to start from my patches: they are more generic and have support of sysfs
> (should be useful for "MII over ethernet"). Also IMO it will be better if we continue exchange/prereview

Sure, no problem. Only concern I have is that your patches don't seem to be available
in public, or maybe I missed the reference to it.

> our patches in more narrow mail list, since I do not want to pollute netdev by useless discussions about drafts.
> Objections/suggestions?
>
Sure, no problem, though personally I have no issues with the discussions
or with submitting draft patches, and I did not have the impression that
they are useless.

My patches are all in my repository at kernel.org; it is fine with me to keep
them (only) there if submitting drafts to netdev is considered pollution.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFC 1/2] net: dsa: integrate with SWITCHDEV for HW bridging
  2015-03-02 17:16                                       ` Guenter Roeck
@ 2015-03-04 10:07                                         ` Andrey Volkov
  0 siblings, 0 replies; 65+ messages in thread
From: Andrey Volkov @ 2015-03-04 10:07 UTC (permalink / raw)
  To: Guenter Roeck, Florian Fainelli
  Cc: Andrew Lunn, netdev, David Miller, Vivien Didelot,
	jerome.oufella, Chris Healy

Gunter,

On 02/03/2015 18:16, Guenter Roeck wrote:
> On 03/02/2015 07:27 AM, Andrey Volkov wrote:
>> On 02/03/2015 15:49, Guenter Roeck wrote:
>>> On 03/02/2015 06:38 AM, Andrey Volkov wrote:
>>>>
>>>> On 28/02/2015 08:53, Guenter Roeck wrote:
>>>>> On 02/27/2015 09:09 AM, Andrey Volkov wrote:
>>>>>> Gunter,
>>>>>>
>>>>>> Sorry with response delay, I very was busy yesterday
>>>>>>
>>>>>> Le 25/02/2015 15:25, Guenter Roeck a écrit :
>>>>>>> Andrey,
>>>>>>
>>>>>> ------- snip -------
>>>>>>
>>>>>>>>>
>>>>>>>> I simply modify port's fid to the new one in the leave routine and set to common bridge FID in enter
>>>>>>>> (I'm using Marvell's chips). So the port's database will cleaned up automatically for the leave and will
>>>>>>>> contain something useful at the enter time. Also I've look through yours patches and I haven't
>>>>>>>
>>>>>>> Does removing a port from a fid clean up the entries associated with it
>>>>>>> in the database ?
>>>>>>
>>>>>> I've checked what happened when port changed its FID: switch logic block traffic to it
>>>>>> immediately, as far as I can see, meanwhile record still exists in the bridge database,
>>>>>> it was checked on 88e6185, 88e6097 and 88e6352 chips. And yet another 5c: changing of group membership is
>>>>>> not atomic operation in the Marvell's chips known for me, so the port must be in the disabled state when it
>>>>>> will happened.
>>>>>>
>>>>> Hmm - interesting. I assume you mean updating port registers 5 and 6 ?
>>>> Yes sure, it's reason why we must disable the port before changing the FID.
>>>>
>>> Yes, I think we'll need to do that once we use the bits in register 5.
>>>
>>>>>
>>>>> Different question: For 6185, did you write a new driver or extend an existing one ?
>>>>> I found that it is quite similar to 6131, and that adding support for it to the 6131
>>>>> driver should be straightforward.
>>>> Yes again :), and 88E6097 have same core as 6123_61_65. Difference in both cases only in the number
>>>> of supported ports, and it was main reason why hardcoded port's number was unacceptable for me, difference is
>>>> large enough: for ex. 88e6123 have only 3 ports, but 88E6097 - 11.
>>>>
>>>
>>> I have a patch set to change the number of ports to a variable in the 6131 driver.
>>> Want me to submit it now ? Though I guess you must have pretty much the same,
>>> so we can also use your approach. Let me know.
>>
>> I think that better to start from my patches: they are more generic and have support of sysfs
>> (should be useful for "MII over ethernet"). Also IMO it will be better if we continue exchange/prereview
> 
> Sure, no problem. Only concern I have is that your patches don't seem to be available
> in public, or maybe I missed the reference to it.
No I didn't publish them yet, sorry, I plan to submit them at the end of next week,
as I've wrote before, but if this date is late for you, then yes, 
I think that we could begin from your patches.

> 
>> our patches in more narrow mail list, since I do not want to pollute netdev by useless discussions about drafts.
>> Objections/suggestions?
>>
> Sure, no problem, though personally I have no issues with the discussions
> or with submitting draft patches, and I did not have the impression that
> they are useless.
Ok, I've just mean to simplify discussion, especially that David told about his 
fresh hardware switches related project, so ok this mail list then then this mail list.

> 
> My patches are all in my repository at kernel.org; it is fine with me to keep
> them (only) there if submitting drafts to netdev is considered pollution.
Nice, I'll keep this thing in mind.

Thank you
Andrey

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2015-03-04 11:24 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-17 19:26 [PATCH RFC 0/2] net: dsa: integration with SWITCHDEV for HW bridging Florian Fainelli
2015-02-17 19:26 ` [PATCH RFC 1/2] net: dsa: integrate " Florian Fainelli
2015-02-17 20:13   ` Guenter Roeck
2015-02-17 20:24     ` Florian Fainelli
2015-02-17 20:28       ` Guenter Roeck
2015-02-18  1:19   ` Andrew Lunn
2015-02-18  1:43     ` Florian Fainelli
2015-02-18  3:53     ` Guenter Roeck
2015-02-18  4:53       ` Florian Fainelli
2015-02-18  6:14         ` Guenter Roeck
2015-02-18 13:49         ` Andrew Lunn
2015-02-18 14:05           ` Guenter Roeck
2015-02-23  2:20   ` Guenter Roeck
2015-02-23  3:14     ` Andrew Lunn
2015-02-23  4:07       ` Guenter Roeck
2015-02-23  4:22         ` Andrew Lunn
2015-02-23  4:33           ` Florian Fainelli
2015-02-23  4:38           ` Guenter Roeck
2015-02-23  4:43             ` Florian Fainelli
2015-02-23  6:19               ` Guenter Roeck
2015-02-23 13:34               ` Andrew Lunn
2015-02-23 14:23                 ` Guenter Roeck
2015-02-23 16:01                   ` Andrew Lunn
2015-02-23 18:05                     ` Florian Fainelli
2015-02-23 18:35                       ` Guenter Roeck
2015-02-25 13:43                         ` Andrey Volkov
2015-02-25 14:25                           ` Guenter Roeck
2015-02-25 16:05                             ` Andrey Volkov
2015-02-25 17:41                               ` Guenter Roeck
2015-02-26  1:18                                 ` Andrey Volkov
2015-02-27 17:09                             ` Andrey Volkov
2015-02-28  7:53                               ` Guenter Roeck
2015-03-02 14:38                                 ` Andrey Volkov
2015-03-02 14:49                                   ` Guenter Roeck
2015-03-02 15:27                                     ` Andrey Volkov
2015-03-02 17:16                                       ` Guenter Roeck
2015-03-04 10:07                                         ` Andrey Volkov
2015-02-17 19:26 ` [PATCH RFC 2/2] net: dsa: bcm_sf2: implement HW bridging operations Florian Fainelli
2015-02-19  2:48   ` Florian Fainelli
2015-02-19  5:59     ` Guenter Roeck
2015-02-19 17:27       ` Florian Fainelli
2015-02-19 17:46         ` Guenter Roeck
2015-02-19 23:50           ` Florian Fainelli
2015-02-20  0:09             ` Guenter Roeck
2015-02-20  0:51               ` roopa
2015-02-20  1:03                 ` Guenter Roeck
2015-02-20  1:46                   ` roopa
2015-02-20  2:00                     ` Florian Fainelli
2015-02-20  2:41                       ` roopa
2015-02-20  4:05                         ` Guenter Roeck
2015-02-20  4:58                           ` Scott Feldman
2015-02-20  4:59                           ` roopa
2015-02-20  3:20                       ` Andy Gospodarek
2015-02-20  3:53                         ` Viswanath Bandaru
2015-02-20  3:56                         ` Andy Gospodarek
2015-02-20  2:18                     ` Andrew Lunn
2015-02-20  2:51                       ` roopa
2015-02-20  3:52                         ` Andrew Lunn
2015-02-20  4:07                           ` Guenter Roeck
2015-02-20  2:02               ` Andrew Lunn
2015-02-20  3:55                 ` Guenter Roeck
2015-02-20  2:03               ` Florian Fainelli
2015-02-20  2:46                 ` roopa
2015-02-18  0:48 ` [PATCH RFC 0/2] net: dsa: integration with SWITCHDEV for HW bridging Scott Feldman
2015-02-18  1:09   ` Florian Fainelli

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.