All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH v2 net-next 00/17] RX filtering in DSA
@ 2021-02-24 11:43 Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 01/17] net: dsa: reference count the host mdb addresses Vladimir Oltean
                   ` (16 more replies)
  0 siblings, 17 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

This is my second stab at creating a list of unicast and multicast
addresses that the DSA CPU port must trap. I am reusing a lot of
Tobias's work which he submitted here:
https://patchwork.kernel.org/project/netdevbpf/cover/20210116012515.3152-1-tobias@waldekranz.com/

I did not yet have the guts to disable flooding towards the CPU port,
but if feedback for the approach taken here is positive, that will be
next.

Tobias Waldekranz (6):
  net: bridge: switchdev: refactor br_switchdev_fdb_notify
  net: bridge: switchdev: include local flag in FDB notifications
  net: bridge: switchdev: send FDB notifications for host addresses
  net: dsa: include bridge addresses which are local in the host fdb
    list
  net: dsa: sync static FDB entries on foreign interfaces to hardware
  net: dsa: mv88e6xxx: Request assisted learning on CPU port

Vladimir Oltean (11):
  net: dsa: reference count the host mdb addresses
  net: dsa: reference count the host fdb addresses
  net: dsa: install the host MDB and FDB entries in the master's RX
    filter
  net: dsa: install the port MAC addresses as host fdb entries
  net: bridge: implement unicast filtering for the bridge device
  net: dsa: add addresses obtained from RX filtering to host addresses
  net: dsa: include fdb entries pointing to bridge in the host fdb list
  net: dsa: replay port and host-joined mdb entries when joining the
    bridge
  net: dsa: replay port and local fdb entries when joining the bridge
  net: bridge: switchdev: let drivers inform which bridge ports are
    offloaded
  net: bridge: offloaded ports are always promiscuous

 drivers/net/dsa/mv88e6xxx/chip.c              |   1 +
 .../marvell/prestera/prestera_switchdev.c     |   4 +-
 .../mellanox/mlxsw/spectrum_switchdev.c       |   7 +-
 drivers/net/ethernet/mscc/ocelot_net.c        |  43 +-
 drivers/net/ethernet/rocker/rocker_main.c     |   4 +-
 drivers/net/ethernet/rocker/rocker_ofdpa.c    |   2 +
 drivers/net/ethernet/ti/am65-cpsw-nuss.c      |   2 +
 drivers/net/ethernet/ti/am65-cpsw-switchdev.c |   4 +-
 drivers/net/ethernet/ti/cpsw_new.c            |   1 +
 drivers/net/ethernet/ti/cpsw_switchdev.c      |   4 +-
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c       |   6 +-
 include/linux/if_bridge.h                     |  18 +
 include/net/dsa.h                             |  35 ++
 include/net/switchdev.h                       |  19 +
 net/bridge/br.c                               |   8 +-
 net/bridge/br_device.c                        |  23 +-
 net/bridge/br_fdb.c                           |  57 +-
 net/bridge/br_if.c                            |  19 +-
 net/bridge/br_mdb.c                           | 117 ++++
 net/bridge/br_private.h                       |  14 +-
 net/bridge/br_switchdev.c                     |  74 +--
 net/dsa/dsa2.c                                |  30 +-
 net/dsa/dsa_priv.h                            |  23 +-
 net/dsa/port.c                                |  51 ++
 net/dsa/slave.c                               | 589 ++++++++++++++----
 net/dsa/switch.c                              | 123 +++-
 net/switchdev/switchdev.c                     |  18 +
 27 files changed, 1047 insertions(+), 249 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 01/17] net: dsa: reference count the host mdb addresses
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-26  9:20   ` Tobias Waldekranz
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 02/17] net: dsa: reference count the host fdb addresses Vladimir Oltean
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

Currently any DSA switch that is strict when implementing the mdb
operations prints these benign errors after the addresses expire, with
at least 2 ports bridged:

[  286.013814] mscc_felix 0000:00:00.5 swp3: failed (err=-2) to del object (id=3)

The reason has to do with this piece of code:

	netdev_for_each_lower_dev(dev, lower_dev, iter)
		br_mdb_switchdev_host_port(dev, lower_dev, mp, type);

called from:

br_multicast_group_expired
-> br_multicast_host_leave
   -> br_mdb_notify
      -> br_mdb_switchdev_host

Basically, the bridge code is correct. It tells each switchdev port that
the host can leave that multicast group. But in the case of DSA, all
user ports are connected to the host through the same pipe. So, because
DSA blindly translates a host MDB to a normal MDB on the CPU port, this
means that when all user ports leave a multicast group, DSA tries to
remove it N times from the CPU port.

We should be reference-counting these addresses. Otherwise, the first
port on which the MDB expires will cause an entry removal from the CPU
port, which will break the host MDB for the remaining ports.

There is more. In expectation of future support for multiple CPUs, the
addresses filtered towards the host should be kept in a list per CPU
port. And cross-chip setups with multiple CPU ports can be really
unconventional, nobody says that the switches must be cascaded. Consider
just this example:

               +-------------------------+
               |        Host system      |
               |                         |
               |  eth0             eth1  |
               +-------------------------+
          CPU port |                 | CPU port
             +-----------+     +-----------+
             |           | DSA |           |
             |           |-----|           |
             |           | link|           |
             +-----------+     +-----------+
               |  |  |  |       |  |  |  |
               sw0pN user       sw1pN user
                 ports            ports

It is clear that the host addresses of sw0 should not interfere with the
host addresses of sw1, and one switch should not even respond to the
notifiers emitted for the other's host addresses.

An entirely different thing can be said about the typical cross-chip
setup that DSA works with today:

               +-----------------+
               |   Host system   |
               |                 |
               |       eth0      |
               +-----------------+
                     CPU | port
                   +-----------+
            sw0pN -|           |
             user -|           |
            ports -|           |
                  -|           |
                   +-----------+
                     DSA | link
                   +-----------+
            sw1pN -|           |
             user -|           |
            ports -|           |
                  -|           |
                   +-----------+

where a host MDB entry installed on a user port of sw1pN should be
installed both on its upstream DSA link as well as on the CPU port
(but not on the downstream DSA link of sw0pN).

Basically this calls for the introduction of a separate notifier for
host addresses, which matches on all upstream ports of the targeted user
port (be they CPU ports or DSA links). This means we can simplify the
normal MDB notifiers to be identical to the FDB notifiers now.

Note that for switches which don't implement .port_mdb_add and
.port_mdb_del, we don't even attempt to keep the address lists, only to
fail at install time.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/net/dsa.h  | 34 ++++++++++++++++++
 net/dsa/dsa2.c     | 24 +++++++++++--
 net/dsa/dsa_priv.h |  6 ++++
 net/dsa/port.c     | 24 +++++++++++++
 net/dsa/slave.c    | 89 +++++++++++++++++++++++++++++++++++++++++-----
 net/dsa/switch.c   | 73 +++++++++++++++++++++++++++++--------
 6 files changed, 226 insertions(+), 24 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 83a933e563fe..31381bfcf35c 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -290,6 +290,11 @@ struct dsa_port {
 	 */
 	const struct dsa_netdevice_ops *netdev_ops;
 
+	/* List of MAC addresses that must be extracted from the fabric
+	 * through this CPU port. Valid only for DSA_PORT_TYPE_CPU.
+	 */
+	struct list_head	host_mdb;
+
 	bool setup;
 };
 
@@ -304,6 +309,13 @@ struct dsa_link {
 	struct list_head list;
 };
 
+struct dsa_host_addr {
+	unsigned char addr[ETH_ALEN];
+	u16 vid;
+	refcount_t refcount;
+	struct list_head list;
+};
+
 struct dsa_switch {
 	bool setup;
 
@@ -481,6 +493,28 @@ static inline unsigned int dsa_upstream_port(struct dsa_switch *ds, int port)
 	return dsa_towards_port(ds, cpu_dp->ds->index, cpu_dp->index);
 }
 
+static inline bool dsa_is_upstream_port(struct dsa_switch *ds, int port)
+{
+	if (dsa_is_unused_port(ds, port))
+		return false;
+
+	return port == dsa_upstream_port(ds, port);
+}
+
+/* Return true if @upstream_ds is an upstream switch of @downstream_ds. */
+static inline bool dsa_switch_is_upstream_of(struct dsa_switch *upstream_ds,
+					     struct dsa_switch *downstream_ds)
+{
+	int routing_port;
+
+	if (upstream_ds == downstream_ds)
+		return true;
+
+	routing_port = dsa_routing_port(downstream_ds, upstream_ds->index);
+
+	return dsa_is_upstream_port(downstream_ds, routing_port);
+}
+
 static inline bool dsa_port_is_vlan_filtering(const struct dsa_port *dp)
 {
 	const struct dsa_switch *ds = dp->ds;
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 4d4956ed303b..d64f1287625d 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -326,6 +326,11 @@ static struct dsa_port *dsa_tree_find_first_cpu(struct dsa_switch_tree *dst)
 	return NULL;
 }
 
+static void dsa_setup_cpu_port(struct dsa_port *cpu_dp)
+{
+	INIT_LIST_HEAD(&cpu_dp->host_mdb);
+}
+
 static int dsa_tree_setup_default_cpu(struct dsa_switch_tree *dst)
 {
 	struct dsa_port *cpu_dp, *dp;
@@ -336,6 +341,8 @@ static int dsa_tree_setup_default_cpu(struct dsa_switch_tree *dst)
 		return -EINVAL;
 	}
 
+	dsa_setup_cpu_port(cpu_dp);
+
 	/* Assign the default CPU port to all ports of the fabric */
 	list_for_each_entry(dp, &dst->ports, list)
 		if (dsa_port_is_user(dp) || dsa_port_is_dsa(dp))
@@ -344,13 +351,26 @@ static int dsa_tree_setup_default_cpu(struct dsa_switch_tree *dst)
 	return 0;
 }
 
+static void dsa_teardown_cpu_port(struct dsa_port *cpu_dp)
+{
+	struct dsa_host_addr *a, *tmp;
+
+	list_for_each_entry_safe(a, tmp, &cpu_dp->host_mdb, list) {
+		list_del(&a->list);
+		kfree(a);
+	}
+}
+
 static void dsa_tree_teardown_default_cpu(struct dsa_switch_tree *dst)
 {
 	struct dsa_port *dp;
 
-	list_for_each_entry(dp, &dst->ports, list)
-		if (dsa_port_is_user(dp) || dsa_port_is_dsa(dp))
+	list_for_each_entry(dp, &dst->ports, list) {
+		if (dsa_port_is_user(dp) || dsa_port_is_dsa(dp)) {
+			dsa_teardown_cpu_port(dp->cpu_dp);
 			dp->cpu_dp = NULL;
+		}
+	}
 }
 
 static int dsa_port_setup(struct dsa_port *dp)
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 2eeaa42f2e08..c730d40b81b9 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -27,6 +27,8 @@ enum {
 	DSA_NOTIFIER_LAG_LEAVE,
 	DSA_NOTIFIER_MDB_ADD,
 	DSA_NOTIFIER_MDB_DEL,
+	DSA_NOTIFIER_HOST_MDB_ADD,
+	DSA_NOTIFIER_HOST_MDB_DEL,
 	DSA_NOTIFIER_VLAN_ADD,
 	DSA_NOTIFIER_VLAN_DEL,
 	DSA_NOTIFIER_MTU,
@@ -203,6 +205,10 @@ int dsa_port_mdb_add(const struct dsa_port *dp,
 		     const struct switchdev_obj_port_mdb *mdb);
 int dsa_port_mdb_del(const struct dsa_port *dp,
 		     const struct switchdev_obj_port_mdb *mdb);
+int dsa_port_host_mdb_add(const struct dsa_port *dp,
+			  const struct switchdev_obj_port_mdb *mdb);
+int dsa_port_host_mdb_del(const struct dsa_port *dp,
+			  const struct switchdev_obj_port_mdb *mdb);
 int dsa_port_pre_bridge_flags(const struct dsa_port *dp,
 			      struct switchdev_brport_flags flags,
 			      struct netlink_ext_ack *extack);
diff --git a/net/dsa/port.c b/net/dsa/port.c
index c9c6d7ab3f47..df9ba9b67675 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -538,6 +538,30 @@ int dsa_port_mdb_del(const struct dsa_port *dp,
 	return dsa_port_notify(dp, DSA_NOTIFIER_MDB_DEL, &info);
 }
 
+int dsa_port_host_mdb_add(const struct dsa_port *dp,
+			  const struct switchdev_obj_port_mdb *mdb)
+{
+	struct dsa_notifier_mdb_info info = {
+		.sw_index = dp->ds->index,
+		.port = dp->index,
+		.mdb = mdb,
+	};
+
+	return dsa_port_notify(dp, DSA_NOTIFIER_HOST_MDB_ADD, &info);
+}
+
+int dsa_port_host_mdb_del(const struct dsa_port *dp,
+			  const struct switchdev_obj_port_mdb *mdb)
+{
+	struct dsa_notifier_mdb_info info = {
+		.sw_index = dp->ds->index,
+		.port = dp->index,
+		.mdb = mdb,
+	};
+
+	return dsa_port_notify(dp, DSA_NOTIFIER_HOST_MDB_DEL, &info);
+}
+
 int dsa_port_vlan_add(struct dsa_port *dp,
 		      const struct switchdev_obj_port_vlan *vlan,
 		      struct netlink_ext_ack *extack)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 491e3761b5f4..14a51503efe0 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -23,6 +23,85 @@
 
 #include "dsa_priv.h"
 
+static struct dsa_host_addr *dsa_host_addr_find(struct list_head *addr_list,
+						const unsigned char *addr,
+						u16 vid)
+{
+	struct dsa_host_addr *a;
+
+	list_for_each_entry(a, addr_list, list)
+		if (ether_addr_equal(a->addr, addr) && a->vid == vid)
+			return a;
+
+	return NULL;
+}
+
+/* DSA can directly translate this to a normal MDB add, but on the CPU port.
+ * But because multiple user ports can join the same multicast group and the
+ * bridge will emit a notification for each port, we need to add/delete the
+ * entry towards the host only once, so we reference count it.
+ */
+static int dsa_host_mdb_add(struct dsa_port *dp,
+			    const struct switchdev_obj_port_mdb *mdb)
+{
+	struct dsa_port *cpu_dp = dp->cpu_dp;
+	struct dsa_host_addr *a;
+	int err;
+
+	if (!dp->ds->ops->port_mdb_add || !dp->ds->ops->port_mdb_del)
+		return -EOPNOTSUPP;
+
+	a = dsa_host_addr_find(&cpu_dp->host_mdb, mdb->addr, mdb->vid);
+	if (a) {
+		refcount_inc(&a->refcount);
+		return 0;
+	}
+
+	a = kzalloc(sizeof(*a), GFP_KERNEL);
+	if (!a)
+		return -ENOMEM;
+
+	err = dsa_port_host_mdb_add(dp, mdb);
+	if (err) {
+		kfree(a);
+		return err;
+	}
+
+	ether_addr_copy(a->addr, mdb->addr);
+	a->vid = mdb->vid;
+	refcount_set(&a->refcount, 1);
+	list_add_tail(&a->list, &cpu_dp->host_mdb);
+
+	return 0;
+}
+
+static int dsa_host_mdb_del(struct dsa_port *dp,
+			    const struct switchdev_obj_port_mdb *mdb)
+{
+	struct dsa_port *cpu_dp = dp->cpu_dp;
+	struct dsa_host_addr *a;
+	int err;
+
+	if (!dp->ds->ops->port_mdb_add || !dp->ds->ops->port_mdb_del)
+		return -EOPNOTSUPP;
+
+	a = dsa_host_addr_find(&cpu_dp->host_mdb, mdb->addr, mdb->vid);
+	if (!a)
+		return -ENOENT;
+
+	if (!refcount_dec_and_test(&a->refcount))
+		return 0;
+
+	err = dsa_port_host_mdb_del(dp, mdb);
+	if (err)
+		return err;
+
+	list_del(&a->list);
+	kfree(a);
+
+	return 0;
+}
+
 /* slave mii_bus handling ***************************************************/
 static int dsa_slave_phy_read(struct mii_bus *bus, int addr, int reg)
 {
@@ -396,10 +475,7 @@ static int dsa_slave_port_obj_add(struct net_device *dev,
 		err = dsa_port_mdb_add(dp, SWITCHDEV_OBJ_PORT_MDB(obj));
 		break;
 	case SWITCHDEV_OBJ_ID_HOST_MDB:
-		/* DSA can directly translate this to a normal MDB add,
-		 * but on the CPU port.
-		 */
-		err = dsa_port_mdb_add(dp->cpu_dp, SWITCHDEV_OBJ_PORT_MDB(obj));
+		err = dsa_host_mdb_add(dp, SWITCHDEV_OBJ_PORT_MDB(obj));
 		break;
 	case SWITCHDEV_OBJ_ID_PORT_VLAN:
 		err = dsa_slave_vlan_add(dev, obj, extack);
@@ -464,10 +540,7 @@ static int dsa_slave_port_obj_del(struct net_device *dev,
 		err = dsa_port_mdb_del(dp, SWITCHDEV_OBJ_PORT_MDB(obj));
 		break;
 	case SWITCHDEV_OBJ_ID_HOST_MDB:
-		/* DSA can directly translate this to a normal MDB add,
-		 * but on the CPU port.
-		 */
-		err = dsa_port_mdb_del(dp->cpu_dp, SWITCHDEV_OBJ_PORT_MDB(obj));
+		err = dsa_host_mdb_del(dp, SWITCHDEV_OBJ_PORT_MDB(obj));
 		break;
 	case SWITCHDEV_OBJ_ID_PORT_VLAN:
 		err = dsa_slave_vlan_del(dev, obj);
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 4b5da89dc27a..94996e213469 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -148,6 +148,25 @@ static int dsa_switch_bridge_leave(struct dsa_switch *ds,
 	return 0;
 }
 
+/* Matches for all upstream-facing ports (the CPU port and all upstream-facing
+ * DSA links) that sit between the port that emitted the notification and the
+ * DSA master.
+ */
+static bool dsa_switch_host_address_match(struct dsa_switch *ds, int port,
+					  int info_sw_index, int info_port)
+{
+	struct dsa_switch *downstream_ds;
+
+	downstream_ds = dsa_switch_find(ds->dst->index, info_sw_index);
+	if (WARN_ON(!downstream_ds))
+		return false;
+
+	if (dsa_switch_is_upstream_of(ds, downstream_ds))
+		return dsa_is_upstream_port(ds, port);
+
+	return false;
+}
+
 static int dsa_switch_fdb_add(struct dsa_switch *ds,
 			      struct dsa_notifier_fdb_info *info)
 {
@@ -229,20 +248,30 @@ static int dsa_switch_lag_leave(struct dsa_switch *ds,
 	return 0;
 }
 
-static bool dsa_switch_mdb_match(struct dsa_switch *ds, int port,
-				 struct dsa_notifier_mdb_info *info)
+static int dsa_switch_mdb_add(struct dsa_switch *ds,
+			      struct dsa_notifier_mdb_info *info)
 {
-	if (ds->index == info->sw_index && port == info->port)
-		return true;
+	int port = dsa_towards_port(ds, info->sw_index, info->port);
 
-	if (dsa_is_dsa_port(ds, port))
-		return true;
+	if (!ds->ops->port_mdb_add)
+		return -EOPNOTSUPP;
 
-	return false;
+	return ds->ops->port_mdb_add(ds, port, info->mdb);
 }
 
-static int dsa_switch_mdb_add(struct dsa_switch *ds,
+static int dsa_switch_mdb_del(struct dsa_switch *ds,
 			      struct dsa_notifier_mdb_info *info)
+{
+	int port = dsa_towards_port(ds, info->sw_index, info->port);
+
+	if (!ds->ops->port_mdb_del)
+		return -EOPNOTSUPP;
+
+	return ds->ops->port_mdb_del(ds, port, info->mdb);
+}
+
+static int dsa_switch_host_mdb_add(struct dsa_switch *ds,
+				   struct dsa_notifier_mdb_info *info)
 {
 	int err = 0;
 	int port;
@@ -251,7 +280,8 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
 		return -EOPNOTSUPP;
 
 	for (port = 0; port < ds->num_ports; port++) {
-		if (dsa_switch_mdb_match(ds, port, info)) {
+		if (dsa_switch_host_address_match(ds, port, info->sw_index,
+						  info->port)) {
 			err = ds->ops->port_mdb_add(ds, port, info->mdb);
 			if (err)
 				break;
@@ -261,16 +291,25 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
 	return err;
 }
 
-static int dsa_switch_mdb_del(struct dsa_switch *ds,
-			      struct dsa_notifier_mdb_info *info)
+static int dsa_switch_host_mdb_del(struct dsa_switch *ds,
+				   struct dsa_notifier_mdb_info *info)
 {
+	int err = 0;
+	int port;
+
 	if (!ds->ops->port_mdb_del)
 		return -EOPNOTSUPP;
 
-	if (ds->index == info->sw_index)
-		return ds->ops->port_mdb_del(ds, info->port, info->mdb);
+	for (port = 0; port < ds->num_ports; port++) {
+		if (dsa_switch_host_address_match(ds, port, info->sw_index,
+						  info->port)) {
+			err = ds->ops->port_mdb_del(ds, info->port, info->mdb);
+			if (err)
+				break;
+		}
+	}
 
-	return 0;
+	return err;
 }
 
 static bool dsa_switch_vlan_match(struct dsa_switch *ds, int port,
@@ -508,6 +547,12 @@ static int dsa_switch_event(struct notifier_block *nb,
 	case DSA_NOTIFIER_MDB_DEL:
 		err = dsa_switch_mdb_del(ds, info);
 		break;
+	case DSA_NOTIFIER_HOST_MDB_ADD:
+		err = dsa_switch_host_mdb_add(ds, info);
+		break;
+	case DSA_NOTIFIER_HOST_MDB_DEL:
+		err = dsa_switch_host_mdb_del(ds, info);
+		break;
 	case DSA_NOTIFIER_VLAN_ADD:
 		err = dsa_switch_vlan_add(ds, info);
 		break;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 02/17] net: dsa: reference count the host fdb addresses
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 01/17] net: dsa: reference count the host mdb addresses Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 03/17] net: dsa: install the host MDB and FDB entries in the master's RX filter Vladimir Oltean
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

In preparation of unicast filtering towards the CPU, DSA will need to
send to the CPU several classes of addresses:
- local FDB entries from the bridge
- FDB entries pointing to the software bridge itself
- the MAC addresses used for termination in standalone mode
- the MAC addresses of various upper interfaces (8021q, macvlan) that
  DSA ports might have

So it will no longer be sufficient to gather the address list from a
single source such as the bridge. Consider the fact that the bridge
currently records the MAC address of every bridge port as a local
('permanent') address placed in the FDB. Sure we could use that as an
indication that the address must be sent to the CPU, but that address
needs to be sent to the CPU anyway, even if we're operating in
standalone mode. So the bridge can't dictate anything, we must keep more
addresses, and if we do that, there is a risk that we might delete an
address when it was still used, if we just listen to the deletion event
emitted by the bridge through switchdev. And that is where the
requirement for reference counting comes from.

Similar to host MDB entries, we should create a new notifier in DSA, and
keep the existing one for out-facing FDB entries untouched. We can also
simplify dsa_slave_switchdev_event_work a little bit now, since we
always schedule the work item for a user port now, we can unconditionally
take the refcount on a net_device.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/net/dsa.h  |  1 +
 net/dsa/dsa2.c     |  6 ++++
 net/dsa/dsa_priv.h |  7 ++++
 net/dsa/port.c     | 27 ++++++++++++++
 net/dsa/slave.c    | 88 +++++++++++++++++++++++++++++++++++++++++-----
 net/dsa/switch.c   | 50 ++++++++++++++++++++++++++
 6 files changed, 170 insertions(+), 9 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 31381bfcf35c..0210b49f291e 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -293,6 +293,7 @@ struct dsa_port {
 	/* List of MAC addresses that must be extracted from the fabric
 	 * through this CPU port. Valid only for DSA_PORT_TYPE_CPU.
 	 */
+	struct list_head	host_fdb;
 	struct list_head	host_mdb;
 
 	bool setup;
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index d64f1287625d..27654cac1c61 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -328,6 +328,7 @@ static struct dsa_port *dsa_tree_find_first_cpu(struct dsa_switch_tree *dst)
 
 static void dsa_setup_cpu_port(struct dsa_port *cpu_dp)
 {
+	INIT_LIST_HEAD(&cpu_dp->host_fdb);
 	INIT_LIST_HEAD(&cpu_dp->host_mdb);
 }
 
@@ -355,6 +356,11 @@ static void dsa_teardown_cpu_port(struct dsa_port *cpu_dp)
 {
 	struct dsa_host_addr *a, *tmp;
 
+	list_for_each_entry_safe(a, tmp, &cpu_dp->host_fdb, list) {
+		list_del(&a->list);
+		kfree(a);
+	}
+
 	list_for_each_entry_safe(a, tmp, &cpu_dp->host_mdb, list) {
 		list_del(&a->list);
 		kfree(a);
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index c730d40b81b9..4043da2bacc0 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -20,6 +20,8 @@ enum {
 	DSA_NOTIFIER_BRIDGE_LEAVE,
 	DSA_NOTIFIER_FDB_ADD,
 	DSA_NOTIFIER_FDB_DEL,
+	DSA_NOTIFIER_HOST_FDB_ADD,
+	DSA_NOTIFIER_HOST_FDB_DEL,
 	DSA_NOTIFIER_HSR_JOIN,
 	DSA_NOTIFIER_HSR_LEAVE,
 	DSA_NOTIFIER_LAG_CHANGE,
@@ -121,6 +123,7 @@ struct dsa_switchdev_event_work {
 	 */
 	unsigned char addr[ETH_ALEN];
 	u16 vid;
+	bool host_addr;
 };
 
 /* DSA_NOTIFIER_HSR_* */
@@ -200,6 +203,10 @@ int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 		     u16 vid);
 int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
 		     u16 vid);
+int dsa_port_host_fdb_add(struct dsa_port *dp, const unsigned char *addr,
+			  u16 vid);
+int dsa_port_host_fdb_del(struct dsa_port *dp, const unsigned char *addr,
+			  u16 vid);
 int dsa_port_fdb_dump(struct dsa_port *dp, dsa_fdb_dump_cb_t *cb, void *data);
 int dsa_port_mdb_add(const struct dsa_port *dp,
 		     const struct switchdev_obj_port_mdb *mdb);
diff --git a/net/dsa/port.c b/net/dsa/port.c
index df9ba9b67675..d9ff222c041c 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -503,6 +503,33 @@ int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
 	return dsa_port_notify(dp, DSA_NOTIFIER_FDB_DEL, &info);
 }
 
+int dsa_port_host_fdb_add(struct dsa_port *dp, const unsigned char *addr,
+			  u16 vid)
+{
+	struct dsa_notifier_fdb_info info = {
+		.sw_index = dp->ds->index,
+		.port = dp->index,
+		.addr = addr,
+		.vid = vid,
+	};
+
+	return dsa_port_notify(dp, DSA_NOTIFIER_HOST_FDB_ADD, &info);
+}
+
+int dsa_port_host_fdb_del(struct dsa_port *dp, const unsigned char *addr,
+			  u16 vid)
+{
+	struct dsa_notifier_fdb_info info = {
+		.sw_index = dp->ds->index,
+		.port = dp->index,
+		.addr = addr,
+		.vid = vid,
+
+	};
+
+	return dsa_port_notify(dp, DSA_NOTIFIER_HOST_FDB_DEL, &info);
+}
+
 int dsa_port_fdb_dump(struct dsa_port *dp, dsa_fdb_dump_cb_t *cb, void *data)
 {
 	struct dsa_switch *ds = dp->ds;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 14a51503efe0..12d51bdb5eea 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -102,6 +102,67 @@ static int dsa_host_mdb_del(struct dsa_port *dp,
 	return 0;
 }
 
+static int dsa_host_fdb_add(struct dsa_port *dp, const unsigned char *addr,
+			    u16 vid)
+{
+	struct dsa_port *cpu_dp = dp->cpu_dp;
+	struct dsa_host_addr *a;
+	int err;
+
+	if (!dp->ds->ops->port_fdb_add || !dp->ds->ops->port_fdb_del)
+		return -EOPNOTSUPP;
+
+	a = dsa_host_addr_find(&cpu_dp->host_fdb, addr, vid);
+	if (a) {
+		refcount_inc(&a->refcount);
+		return 0;
+	}
+
+	a = kzalloc(sizeof(*a), GFP_KERNEL);
+	if (!a)
+		return -ENOMEM;
+
+	err = dsa_port_host_fdb_add(dp, addr, vid);
+	if (err) {
+		kfree(a);
+		return err;
+	}
+
+	ether_addr_copy(a->addr, addr);
+	a->vid = vid;
+	refcount_set(&a->refcount, 1);
+	list_add_tail(&a->list, &cpu_dp->host_fdb);
+
+	return 0;
+}
+
+static int dsa_host_fdb_del(struct dsa_port *dp, const unsigned char *addr,
+			    u16 vid)
+{
+	struct dsa_port *cpu_dp = dp->cpu_dp;
+	struct dsa_host_addr *a;
+	int err;
+
+	if (!dp->ds->ops->port_fdb_add || !dp->ds->ops->port_fdb_del)
+		return -EOPNOTSUPP;
+
+	a = dsa_host_addr_find(&cpu_dp->host_fdb, addr, vid);
+	if (!a)
+		return -ENOENT;
+
+	if (!refcount_dec_and_test(&a->refcount))
+		return 0;
+
+	err = dsa_port_host_fdb_del(dp, addr, vid);
+	if (err)
+		return err;
+
+	list_del(&a->list);
+	kfree(a);
+
+	return 0;
+}
+
 /* slave mii_bus handling ***************************************************/
 static int dsa_slave_phy_read(struct mii_bus *bus, int addr, int reg)
 {
@@ -2264,8 +2325,12 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work)
 	rtnl_lock();
 	switch (switchdev_work->event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-		err = dsa_port_fdb_add(dp, switchdev_work->addr,
-				       switchdev_work->vid);
+		if (switchdev_work->host_addr)
+			err = dsa_host_fdb_add(dp, switchdev_work->addr,
+					       switchdev_work->vid);
+		else
+			err = dsa_port_fdb_add(dp, switchdev_work->addr,
+					       switchdev_work->vid);
 		if (err) {
 			dev_err(ds->dev,
 				"port %d failed to add %pM vid %d to fdb: %d\n",
@@ -2277,8 +2342,12 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work)
 		break;
 
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		err = dsa_port_fdb_del(dp, switchdev_work->addr,
-				       switchdev_work->vid);
+		if (switchdev_work->host_addr)
+			err = dsa_host_fdb_del(dp, switchdev_work->addr,
+					       switchdev_work->vid);
+		else
+			err = dsa_port_fdb_del(dp, switchdev_work->addr,
+					       switchdev_work->vid);
 		if (err) {
 			dev_err(ds->dev,
 				"port %d failed to delete %pM vid %d from fdb: %d\n",
@@ -2291,8 +2360,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work)
 	rtnl_unlock();
 
 	kfree(switchdev_work);
-	if (dsa_is_user_port(ds, dp->index))
-		dev_put(dp->slave);
+	dev_put(dp->slave);
 }
 
 static int dsa_lower_dev_walk(struct net_device *lower_dev,
@@ -2324,6 +2392,7 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
 	const struct switchdev_notifier_fdb_info *fdb_info;
 	struct dsa_switchdev_event_work *switchdev_work;
+	bool host_addr = false;
 	struct dsa_port *dp;
 	int err;
 
@@ -2361,7 +2430,8 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 			if (!p)
 				return NOTIFY_DONE;
 
-			dp = p->dp->cpu_dp;
+			dp = p->dp;
+			host_addr = true;
 
 			if (!dp->ds->assisted_learning_on_cpu_port)
 				return NOTIFY_DONE;
@@ -2391,10 +2461,10 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 		ether_addr_copy(switchdev_work->addr,
 				fdb_info->addr);
 		switchdev_work->vid = fdb_info->vid;
+		switchdev_work->host_addr = host_addr;
 
 		/* Hold a reference on the slave for dsa_fdb_offload_notify */
-		if (dsa_is_user_port(dp->ds, dp->index))
-			dev_hold(dev);
+		dev_hold(dev);
 		dsa_schedule_work(&switchdev_work->work);
 		break;
 	default:
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 94996e213469..a89363117f6f 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -167,6 +167,50 @@ static bool dsa_switch_host_address_match(struct dsa_switch *ds, int port,
 	return false;
 }
 
+static int dsa_switch_host_fdb_add(struct dsa_switch *ds,
+				   struct dsa_notifier_fdb_info *info)
+{
+	int err = 0;
+	int port;
+
+	if (!ds->ops->port_fdb_add)
+		return -EOPNOTSUPP;
+
+	for (port = 0; port < ds->num_ports; port++) {
+		if (dsa_switch_host_address_match(ds, port, info->sw_index,
+						  info->port)) {
+			err = ds->ops->port_fdb_add(ds, port, info->addr,
+						    info->vid);
+			if (err)
+				break;
+		}
+	}
+
+	return err;
+}
+
+static int dsa_switch_host_fdb_del(struct dsa_switch *ds,
+				   struct dsa_notifier_fdb_info *info)
+{
+	int err = 0;
+	int port;
+
+	if (!ds->ops->port_fdb_del)
+		return -EOPNOTSUPP;
+
+	for (port = 0; port < ds->num_ports; port++) {
+		if (dsa_switch_host_address_match(ds, port, info->sw_index,
+						  info->port)) {
+			err = ds->ops->port_fdb_del(ds, info->port, info->addr,
+						    info->vid);
+			if (err)
+				break;
+		}
+	}
+
+	return err;
+}
+
 static int dsa_switch_fdb_add(struct dsa_switch *ds,
 			      struct dsa_notifier_fdb_info *info)
 {
@@ -526,6 +570,12 @@ static int dsa_switch_event(struct notifier_block *nb,
 	case DSA_NOTIFIER_FDB_DEL:
 		err = dsa_switch_fdb_del(ds, info);
 		break;
+	case DSA_NOTIFIER_HOST_FDB_ADD:
+		err = dsa_switch_host_fdb_add(ds, info);
+		break;
+	case DSA_NOTIFIER_HOST_FDB_DEL:
+		err = dsa_switch_host_fdb_del(ds, info);
+		break;
 	case DSA_NOTIFIER_HSR_JOIN:
 		err = dsa_switch_hsr_join(ds, info);
 		break;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 03/17] net: dsa: install the host MDB and FDB entries in the master's RX filter
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 01/17] net: dsa: reference count the host mdb addresses Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 02/17] net: dsa: reference count the host fdb addresses Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 04/17] net: dsa: install the port MAC addresses as host fdb entries Vladimir Oltean
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

If the DSA master implements strict address filtering, then the unicast
and multicast addresses kept by the DSA CPU ports should be synchronized
with the address lists of the DSA master.

Note that we want the synchronization of the master's address lists even
if the DSA switch doesn't support unicast/multicast database operations,
on the premises that the packets will be flooded to the CPU in that
case, and we should still instruct the master to receive them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/dsa/slave.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 12d51bdb5eea..b6ea7e21b3b6 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -48,6 +48,8 @@ static int dsa_host_mdb_add(struct dsa_port *dp,
 	struct dsa_host_addr *a;
 	int err;
 
+	dev_mc_add(cpu_dp->master, mdb->addr);
+
 	if (!dp->ds->ops->port_mdb_add || !dp->ds->ops->port_mdb_del)
 		return -EOPNOTSUPP;
 
@@ -82,6 +84,8 @@ static int dsa_host_mdb_del(struct dsa_port *dp,
 	struct dsa_host_addr *a;
 	int err;
 
+	dev_mc_del(cpu_dp->master, mdb->addr);
+
 	if (!dp->ds->ops->port_mdb_add || !dp->ds->ops->port_mdb_del)
 		return -EOPNOTSUPP;
 
@@ -109,6 +113,8 @@ static int dsa_host_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 	struct dsa_host_addr *a;
 	int err;
 
+	dev_uc_add(cpu_dp->master, addr);
+
 	if (!dp->ds->ops->port_fdb_add || !dp->ds->ops->port_fdb_del)
 		return -EOPNOTSUPP;
 
@@ -143,6 +149,8 @@ static int dsa_host_fdb_del(struct dsa_port *dp, const unsigned char *addr,
 	struct dsa_host_addr *a;
 	int err;
 
+	dev_uc_del(cpu_dp->master, addr);
+
 	if (!dp->ds->ops->port_fdb_add || !dp->ds->ops->port_fdb_del)
 		return -EOPNOTSUPP;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 04/17] net: dsa: install the port MAC addresses as host fdb entries
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (2 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 03/17] net: dsa: install the host MDB and FDB entries in the master's RX filter Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device Vladimir Oltean
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

In preparation of support for IFF_UNICAST_FLT for DSA standalone ports,
the unicast address of the DSA interfaces should always be known to the
switch and installed as an FDB entry towards the CPU port associated
with that user port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/dsa/slave.c | 32 ++++++++++----------------------
 1 file changed, 10 insertions(+), 22 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index b6ea7e21b3b6..6544a4ec69f4 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -223,11 +223,9 @@ static int dsa_slave_open(struct net_device *dev)
 		goto out;
 	}
 
-	if (!ether_addr_equal(dev->dev_addr, master->dev_addr)) {
-		err = dev_uc_add(master, dev->dev_addr);
-		if (err < 0)
-			goto out;
-	}
+	err = dsa_host_fdb_add(dp, dev->dev_addr, 0);
+	if (err && err != -EOPNOTSUPP)
+		goto out;
 
 	if (dev->flags & IFF_ALLMULTI) {
 		err = dev_set_allmulti(master, 1);
@@ -253,8 +251,7 @@ static int dsa_slave_open(struct net_device *dev)
 	if (dev->flags & IFF_ALLMULTI)
 		dev_set_allmulti(master, -1);
 del_unicast:
-	if (!ether_addr_equal(dev->dev_addr, master->dev_addr))
-		dev_uc_del(master, dev->dev_addr);
+	dsa_host_fdb_del(dp, dev->dev_addr, 0);
 out:
 	return err;
 }
@@ -273,8 +270,7 @@ static int dsa_slave_close(struct net_device *dev)
 	if (dev->flags & IFF_PROMISC)
 		dev_set_promiscuity(master, -1);
 
-	if (!ether_addr_equal(dev->dev_addr, master->dev_addr))
-		dev_uc_del(master, dev->dev_addr);
+	dsa_host_fdb_del(dp, dev->dev_addr, 0);
 
 	return 0;
 }
@@ -302,26 +298,18 @@ static void dsa_slave_set_rx_mode(struct net_device *dev)
 
 static int dsa_slave_set_mac_address(struct net_device *dev, void *a)
 {
-	struct net_device *master = dsa_slave_to_master(dev);
+	struct dsa_port *dp = dsa_slave_to_port(dev);
 	struct sockaddr *addr = a;
 	int err;
 
 	if (!is_valid_ether_addr(addr->sa_data))
 		return -EADDRNOTAVAIL;
 
-	if (!(dev->flags & IFF_UP))
-		goto out;
-
-	if (!ether_addr_equal(addr->sa_data, master->dev_addr)) {
-		err = dev_uc_add(master, addr->sa_data);
-		if (err < 0)
-			return err;
-	}
-
-	if (!ether_addr_equal(dev->dev_addr, master->dev_addr))
-		dev_uc_del(master, dev->dev_addr);
+	err = dsa_host_fdb_add(dp, addr->sa_data, 0);
+	if (err && err != -EOPNOTSUPP)
+		return err;
 
-out:
+	dsa_host_fdb_del(dp, dev->dev_addr, 0);
 	ether_addr_copy(dev->dev_addr, addr->sa_data);
 
 	return 0;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (3 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 04/17] net: dsa: install the port MAC addresses as host fdb entries Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-03-01 15:22   ` Ido Schimmel
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 06/17] net: dsa: add addresses obtained from RX filtering to host addresses Vladimir Oltean
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

The bridge device currently goes into promiscuous mode when it has an
upper with a different MAC address than itself. But it could do better:
it could sync the MAC addresses of its uppers to the software FDB, as
local entries pointing to the bridge itself. This is compatible with
switchdev, since drivers are now instructed to trap these MAC addresses
to the CPU.

Note that the dev_uc_add API does not propagate VLAN ID, so this only
works for VLAN-unaware bridges.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_device.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 3f2f06b4dd27..a7d9d35e70d0 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -179,8 +179,25 @@ static int br_dev_open(struct net_device *dev)
 	return 0;
 }
 
-static void br_dev_set_multicast_list(struct net_device *dev)
+static int br_dev_sync_uc(struct net_device *dev, const unsigned char *addr)
 {
+	struct net_bridge *br = netdev_priv(dev);
+
+	return br_fdb_insert(br, NULL, addr, 0);
+}
+
+static int br_dev_unsync_uc(struct net_device *dev, const unsigned char *addr)
+{
+	struct net_bridge *br = netdev_priv(dev);
+
+	br_fdb_find_delete_local(br, NULL, addr, 0);
+
+	return 0;
+}
+
+static void br_dev_set_rx_mode(struct net_device *dev)
+{
+	__dev_uc_sync(dev, br_dev_sync_uc, br_dev_unsync_uc);
 }
 
 static void br_dev_change_rx_flags(struct net_device *dev, int change)
@@ -399,7 +416,7 @@ static const struct net_device_ops br_netdev_ops = {
 	.ndo_start_xmit		 = br_dev_xmit,
 	.ndo_get_stats64	 = dev_get_tstats64,
 	.ndo_set_mac_address	 = br_set_mac_address,
-	.ndo_set_rx_mode	 = br_dev_set_multicast_list,
+	.ndo_set_rx_mode	 = br_dev_set_rx_mode,
 	.ndo_change_rx_flags	 = br_dev_change_rx_flags,
 	.ndo_change_mtu		 = br_change_mtu,
 	.ndo_do_ioctl		 = br_dev_ioctl,
@@ -436,7 +453,7 @@ void br_dev_setup(struct net_device *dev)
 	dev->needs_free_netdev = true;
 	dev->ethtool_ops = &br_ethtool_ops;
 	SET_NETDEV_DEVTYPE(dev, &br_type);
-	dev->priv_flags = IFF_EBRIDGE | IFF_NO_QUEUE;
+	dev->priv_flags = IFF_EBRIDGE | IFF_NO_QUEUE | IFF_UNICAST_FLT;
 
 	dev->features = COMMON_FEATURES | NETIF_F_LLTX | NETIF_F_NETNS_LOCAL |
 			NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 06/17] net: dsa: add addresses obtained from RX filtering to host addresses
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (4 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-26 10:59   ` Tobias Waldekranz
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 07/17] net: bridge: switchdev: refactor br_switchdev_fdb_notify Vladimir Oltean
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

In case we have ptp4l running on a bridged DSA switch interface, the PTP
traffic is classified as link-local (in the default profile, the MAC
addresses are 01:1b:19:00:00:00 and 01:80:c2:00:00:0e), which means it
isn't the responsibility of the bridge to make sure it gets trapped to
the CPU.

The solution is to implement the standard callbacks for dev_uc_add and
dev_mc_add, and behave just like any other network interface: ensure
that the user space program can see those packets.

Note that since ndo_set_rx_mode runs in atomic context, we must schedule
the dsa_slave_switchdev_event_work in order to install the FDB and MDB
entries to a DSA switch that may sleep. But since the DSA switchdev
event logic only deals with FDB entries, we must fake some events for
MDB ones.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/dsa/dsa_priv.h |  10 +-
 net/dsa/slave.c    | 329 ++++++++++++++++++++++++++++++++-------------
 2 files changed, 239 insertions(+), 100 deletions(-)

diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 4043da2bacc0..c03c67631e23 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -117,10 +117,12 @@ struct dsa_switchdev_event_work {
 	struct dsa_switch *ds;
 	int port;
 	struct work_struct work;
-	unsigned long event;
-	/* Specific for SWITCHDEV_FDB_ADD_TO_DEVICE and
-	 * SWITCHDEV_FDB_DEL_TO_DEVICE
-	 */
+	enum dsa_switchdev_event {
+		DSA_EVENT_FDB_ADD,
+		DSA_EVENT_FDB_DEL,
+		DSA_EVENT_MDB_ADD,
+		DSA_EVENT_MDB_DEL,
+	} event;
 	unsigned char addr[ETH_ALEN];
 	u16 vid;
 	bool host_addr;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 6544a4ec69f4..65b3c1166fe7 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -171,6 +171,220 @@ static int dsa_host_fdb_del(struct dsa_port *dp, const unsigned char *addr,
 	return 0;
 }
 
+static void
+dsa_fdb_offload_notify(struct dsa_switchdev_event_work *switchdev_work)
+{
+	struct dsa_switch *ds = switchdev_work->ds;
+	struct switchdev_notifier_fdb_info info;
+	struct dsa_port *dp;
+
+	if (!dsa_is_user_port(ds, switchdev_work->port))
+		return;
+
+	info.addr = switchdev_work->addr;
+	info.vid = switchdev_work->vid;
+	info.offloaded = true;
+	dp = dsa_to_port(ds, switchdev_work->port);
+	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
+				 dp->slave, &info.info, NULL);
+}
+
+#define work_to_ctx(w) \
+	container_of((w), struct dsa_switchdev_event_work, work)
+
+static void dsa_slave_switchdev_event_work(struct work_struct *work)
+{
+	struct dsa_switchdev_event_work *switchdev_work = work_to_ctx(work);
+	struct dsa_switch *ds = switchdev_work->ds;
+	struct dsa_port *dp;
+	int err;
+
+	dp = dsa_to_port(ds, switchdev_work->port);
+
+	rtnl_lock();
+	switch (switchdev_work->event) {
+	case DSA_EVENT_FDB_ADD:
+		if (switchdev_work->host_addr)
+			err = dsa_host_fdb_add(dp, switchdev_work->addr,
+					       switchdev_work->vid);
+		else
+			err = dsa_port_fdb_add(dp, switchdev_work->addr,
+					       switchdev_work->vid);
+		if (err) {
+			dev_err(ds->dev,
+				"port %d failed to add %pM vid %d to fdb: %d\n",
+				dp->index, switchdev_work->addr,
+				switchdev_work->vid, err);
+			break;
+		}
+		dsa_fdb_offload_notify(switchdev_work);
+		break;
+
+	case DSA_EVENT_FDB_DEL:
+		if (switchdev_work->host_addr)
+			err = dsa_host_fdb_del(dp, switchdev_work->addr,
+					       switchdev_work->vid);
+		else
+			err = dsa_port_fdb_del(dp, switchdev_work->addr,
+					       switchdev_work->vid);
+		if (err) {
+			dev_err(ds->dev,
+				"port %d failed to delete %pM vid %d from fdb: %d\n",
+				dp->index, switchdev_work->addr,
+				switchdev_work->vid, err);
+		}
+
+		break;
+
+	case DSA_EVENT_MDB_ADD: {
+		struct switchdev_obj_port_mdb mdb;
+
+		ether_addr_copy(mdb.addr, switchdev_work->addr);
+		mdb.vid = switchdev_work->vid;
+
+		if (switchdev_work->host_addr)
+			err = dsa_host_mdb_add(dp, &mdb);
+		else
+			err = dsa_port_mdb_add(dp, &mdb);
+		if (err) {
+			dev_err(ds->dev,
+				"port %d failed to add %pM vid %d to mdb: %d\n",
+				dp->index, mdb.addr, mdb.vid, err);
+			break;
+		}
+		dsa_fdb_offload_notify(switchdev_work);
+		break;
+	}
+	case DSA_EVENT_MDB_DEL: {
+		struct switchdev_obj_port_mdb mdb;
+
+		ether_addr_copy(mdb.addr, switchdev_work->addr);
+		mdb.vid = switchdev_work->vid;
+
+		if (switchdev_work->host_addr)
+			err = dsa_host_mdb_del(dp, &mdb);
+		else
+			err = dsa_port_mdb_del(dp, &mdb);
+		if (err) {
+			dev_err(ds->dev,
+				"port %d failed to delete %pM vid %d from mdb: %d\n",
+				dp->index, mdb.addr, mdb.vid, err);
+		}
+		break;
+	}
+	default:
+		break;
+	}
+	rtnl_unlock();
+
+	kfree(switchdev_work);
+	dev_put(dp->slave);
+}
+
+
+static int dsa_slave_schedule_switchdev_work(struct net_device *dev,
+					     enum dsa_switchdev_event event,
+					     const unsigned char *addr, u16 vid,
+					     bool host_addr)
+{
+	struct dsa_switchdev_event_work *switchdev_work;
+	struct dsa_port *dp = dsa_slave_to_port(dev);
+
+	if (!dp->ds->ops->port_fdb_add || !dp->ds->ops->port_fdb_del)
+		return -EOPNOTSUPP;
+
+	switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
+	if (!switchdev_work)
+		return -ENOMEM;
+
+	INIT_WORK(&switchdev_work->work, dsa_slave_switchdev_event_work);
+	switchdev_work->ds = dp->ds;
+	switchdev_work->port = dp->index;
+	switchdev_work->event = event;
+
+	ether_addr_copy(switchdev_work->addr, addr);
+	switchdev_work->vid = vid;
+	switchdev_work->host_addr = host_addr;
+
+	/* Hold a reference on the slave for dsa_fdb_offload_notify */
+	dev_hold(dev);
+	dsa_schedule_work(&switchdev_work->work);
+
+	return 0;
+}
+
+static int dsa_slave_sync_uc(struct net_device *dev,
+			     const unsigned char *addr)
+{
+	int err;
+
+	err = dsa_slave_schedule_switchdev_work(dev, DSA_EVENT_FDB_ADD,
+						addr, 0, true);
+	if (err == -EOPNOTSUPP) {
+		struct dsa_port *dp = dsa_slave_to_port(dev);
+
+		dev_uc_add(dp->cpu_dp->master, addr);
+
+		return 0;
+	}
+
+	return err;
+}
+
+static int dsa_slave_unsync_uc(struct net_device *dev,
+			       const unsigned char *addr)
+{
+	int err;
+
+	err = dsa_slave_schedule_switchdev_work(dev, DSA_EVENT_FDB_DEL,
+						addr, 0, true);
+	if (err == -EOPNOTSUPP) {
+		struct dsa_port *dp = dsa_slave_to_port(dev);
+
+		dev_uc_del(dp->cpu_dp->master, addr);
+
+		return 0;
+	}
+
+	return err;
+}
+
+static int dsa_slave_sync_mc(struct net_device *dev,
+			     const unsigned char *addr)
+{
+	int err;
+
+	err = dsa_slave_schedule_switchdev_work(dev, DSA_EVENT_MDB_ADD,
+						addr, 0, true);
+	if (err == -EOPNOTSUPP) {
+		struct dsa_port *dp = dsa_slave_to_port(dev);
+
+		dev_mc_add(dp->cpu_dp->master, addr);
+
+		return 0;
+	}
+
+	return err;
+}
+
+static int dsa_slave_unsync_mc(struct net_device *dev,
+			       const unsigned char *addr)
+{
+	int err;
+
+	err = dsa_slave_schedule_switchdev_work(dev, DSA_EVENT_MDB_DEL,
+						addr, 0, true);
+	if (err == -EOPNOTSUPP) {
+		struct dsa_port *dp = dsa_slave_to_port(dev);
+
+		dev_mc_del(dp->cpu_dp->master, addr);
+
+		return 0;
+	}
+
+	return err;
+}
+
 /* slave mii_bus handling ***************************************************/
 static int dsa_slave_phy_read(struct mii_bus *bus, int addr, int reg)
 {
@@ -263,8 +477,9 @@ static int dsa_slave_close(struct net_device *dev)
 
 	dsa_port_disable_rt(dp);
 
-	dev_mc_unsync(master, dev);
-	dev_uc_unsync(master, dev);
+	__dev_uc_sync(dev, dsa_slave_sync_uc, dsa_slave_unsync_uc);
+	__dev_mc_sync(dev, dsa_slave_sync_mc, dsa_slave_unsync_mc);
+
 	if (dev->flags & IFF_ALLMULTI)
 		dev_set_allmulti(master, -1);
 	if (dev->flags & IFF_PROMISC)
@@ -290,10 +505,8 @@ static void dsa_slave_change_rx_flags(struct net_device *dev, int change)
 
 static void dsa_slave_set_rx_mode(struct net_device *dev)
 {
-	struct net_device *master = dsa_slave_to_master(dev);
-
-	dev_mc_sync(master, dev);
-	dev_uc_sync(master, dev);
+	__dev_uc_sync(dev, dsa_slave_sync_uc, dsa_slave_unsync_uc);
+	__dev_mc_sync(dev, dsa_slave_sync_mc, dsa_slave_unsync_mc);
 }
 
 static int dsa_slave_set_mac_address(struct net_device *dev, void *a)
@@ -1970,6 +2183,8 @@ int dsa_slave_create(struct dsa_port *port)
 	else
 		eth_hw_addr_inherit(slave_dev, master);
 	slave_dev->priv_flags |= IFF_NO_QUEUE;
+	if (ds->ops->port_fdb_add && ds->ops->port_fdb_del)
+		slave_dev->priv_flags |= IFF_UNICAST_FLT;
 	slave_dev->netdev_ops = &dsa_slave_netdev_ops;
 	if (ds->ops->port_max_mtu)
 		slave_dev->max_mtu = ds->ops->port_max_mtu(ds, port->index);
@@ -2290,75 +2505,6 @@ static int dsa_slave_netdevice_event(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
-static void
-dsa_fdb_offload_notify(struct dsa_switchdev_event_work *switchdev_work)
-{
-	struct dsa_switch *ds = switchdev_work->ds;
-	struct switchdev_notifier_fdb_info info;
-	struct dsa_port *dp;
-
-	if (!dsa_is_user_port(ds, switchdev_work->port))
-		return;
-
-	info.addr = switchdev_work->addr;
-	info.vid = switchdev_work->vid;
-	info.offloaded = true;
-	dp = dsa_to_port(ds, switchdev_work->port);
-	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
-				 dp->slave, &info.info, NULL);
-}
-
-static void dsa_slave_switchdev_event_work(struct work_struct *work)
-{
-	struct dsa_switchdev_event_work *switchdev_work =
-		container_of(work, struct dsa_switchdev_event_work, work);
-	struct dsa_switch *ds = switchdev_work->ds;
-	struct dsa_port *dp;
-	int err;
-
-	dp = dsa_to_port(ds, switchdev_work->port);
-
-	rtnl_lock();
-	switch (switchdev_work->event) {
-	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-		if (switchdev_work->host_addr)
-			err = dsa_host_fdb_add(dp, switchdev_work->addr,
-					       switchdev_work->vid);
-		else
-			err = dsa_port_fdb_add(dp, switchdev_work->addr,
-					       switchdev_work->vid);
-		if (err) {
-			dev_err(ds->dev,
-				"port %d failed to add %pM vid %d to fdb: %d\n",
-				dp->index, switchdev_work->addr,
-				switchdev_work->vid, err);
-			break;
-		}
-		dsa_fdb_offload_notify(switchdev_work);
-		break;
-
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		if (switchdev_work->host_addr)
-			err = dsa_host_fdb_del(dp, switchdev_work->addr,
-					       switchdev_work->vid);
-		else
-			err = dsa_port_fdb_del(dp, switchdev_work->addr,
-					       switchdev_work->vid);
-		if (err) {
-			dev_err(ds->dev,
-				"port %d failed to delete %pM vid %d from fdb: %d\n",
-				dp->index, switchdev_work->addr,
-				switchdev_work->vid, err);
-		}
-
-		break;
-	}
-	rtnl_unlock();
-
-	kfree(switchdev_work);
-	dev_put(dp->slave);
-}
-
 static int dsa_lower_dev_walk(struct net_device *lower_dev,
 			      struct netdev_nested_priv *priv)
 {
@@ -2387,7 +2533,7 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 {
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
 	const struct switchdev_notifier_fdb_info *fdb_info;
-	struct dsa_switchdev_event_work *switchdev_work;
+	enum dsa_switchdev_event dsa_event;
 	bool host_addr = false;
 	struct dsa_port *dp;
 	int err;
@@ -2441,28 +2587,19 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 				return NOTIFY_DONE;
 		}
 
-		if (!dp->ds->ops->port_fdb_add || !dp->ds->ops->port_fdb_del)
-			return NOTIFY_DONE;
-
-		switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
-		if (!switchdev_work)
-			return NOTIFY_BAD;
-
-		INIT_WORK(&switchdev_work->work,
-			  dsa_slave_switchdev_event_work);
-		switchdev_work->ds = dp->ds;
-		switchdev_work->port = dp->index;
-		switchdev_work->event = event;
+		if (event == SWITCHDEV_FDB_ADD_TO_DEVICE)
+			dsa_event = DSA_EVENT_FDB_ADD;
+		else
+			dsa_event = DSA_EVENT_FDB_DEL;
 
-		ether_addr_copy(switchdev_work->addr,
-				fdb_info->addr);
-		switchdev_work->vid = fdb_info->vid;
-		switchdev_work->host_addr = host_addr;
+		err = dsa_slave_schedule_switchdev_work(dp->slave, dsa_event,
+							fdb_info->addr,
+							fdb_info->vid,
+							host_addr);
+		if (err == -EOPNOTSUPP)
+			return NOTIFY_OK;
 
-		/* Hold a reference on the slave for dsa_fdb_offload_notify */
-		dev_hold(dev);
-		dsa_schedule_work(&switchdev_work->work);
-		break;
+		return notifier_from_errno(err);
 	default:
 		return NOTIFY_DONE;
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 07/17] net: bridge: switchdev: refactor br_switchdev_fdb_notify
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (5 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 06/17] net: dsa: add addresses obtained from RX filtering to host addresses Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 08/17] net: bridge: switchdev: include local flag in FDB notifications Vladimir Oltean
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Tobias Waldekranz <tobias@waldekranz.com>

Instead of having to add more and more arguments to
br_switchdev_fdb_call_notifiers, get rid of it and build the info
struct directly in br_switchdev_fdb_notify.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_switchdev.c | 41 +++++++++++----------------------------
 1 file changed, 11 insertions(+), 30 deletions(-)

diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index b89503832fcc..1386677bdaf8 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -107,46 +107,27 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p,
 	return 0;
 }
 
-static void
-br_switchdev_fdb_call_notifiers(bool adding, const unsigned char *mac,
-				u16 vid, struct net_device *dev,
-				bool added_by_user, bool offloaded)
-{
-	struct switchdev_notifier_fdb_info info;
-	unsigned long notifier_type;
-
-	info.addr = mac;
-	info.vid = vid;
-	info.added_by_user = added_by_user;
-	info.offloaded = offloaded;
-	notifier_type = adding ? SWITCHDEV_FDB_ADD_TO_DEVICE : SWITCHDEV_FDB_DEL_TO_DEVICE;
-	call_switchdev_notifiers(notifier_type, dev, &info.info, NULL);
-}
-
 void
 br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type)
 {
+	struct switchdev_notifier_fdb_info info = {
+		.addr = fdb->key.addr.addr,
+		.vid = fdb->key.vlan_id,
+		.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags),
+		.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags),
+	};
+
 	if (!fdb->dst)
 		return;
 
 	switch (type) {
 	case RTM_DELNEIGH:
-		br_switchdev_fdb_call_notifiers(false, fdb->key.addr.addr,
-						fdb->key.vlan_id,
-						fdb->dst->dev,
-						test_bit(BR_FDB_ADDED_BY_USER,
-							 &fdb->flags),
-						test_bit(BR_FDB_OFFLOADED,
-							 &fdb->flags));
+		call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_DEVICE,
+					 fdb->dst->dev, &info.info, NULL);
 		break;
 	case RTM_NEWNEIGH:
-		br_switchdev_fdb_call_notifiers(true, fdb->key.addr.addr,
-						fdb->key.vlan_id,
-						fdb->dst->dev,
-						test_bit(BR_FDB_ADDED_BY_USER,
-							 &fdb->flags),
-						test_bit(BR_FDB_OFFLOADED,
-							 &fdb->flags));
+		call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_DEVICE,
+					 fdb->dst->dev, &info.info, NULL);
 		break;
 	}
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 08/17] net: bridge: switchdev: include local flag in FDB notifications
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (6 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 07/17] net: bridge: switchdev: refactor br_switchdev_fdb_notify Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 09/17] net: bridge: switchdev: send FDB notifications for host addresses Vladimir Oltean
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

As explained in the discussion on the initial version of this patch:
https://lore.kernel.org/netdev/20210117193009.io3nungdwuzmo5f7@skbuf/

the switchdev notifiers for FDB entries managed to have a zero-day bug.
The bridge would not say that this entry is local:

ip link add br0 type bridge
ip link set swp0 master br0
bridge fdb add dev swp0 00:01:02:03:04:05 master local

and the switchdev driver would be more than happy to offload it as a
normal static FDB entry. This is despite the fact that 'local' and
non-'local' entries have completely opposite directions: a local entry
is locally terminated and not forwarded, whereas a static entry is
forwarded and not locally terminated. So, for example, DSA would install
this entry on swp0 instead of installing it on the CPU port as it should.

There is an even sadder part, which is that the 'local' flag is implicit
if 'static' is not specified, meaning that this command produces the
same result of adding a 'local' entry:

bridge fdb add dev swp0 00:01:02:03:04:05 master

So this patch is a bugfix which does what should have been done from day
one: make the bridge inform switchdev that it's a local entry, and have
all drivers ignore local entries (since none of them currently has any
logic to deal with host entries).

At least we've updated the man pages for 'bridge' now, and we're pretty
explicit that the commands above were broken and should have never worked:
https://patchwork.kernel.org/project/netdevbpf/cover/20210211104502.2081443-1-olteanv@gmail.com/

But we don't do anything to even attempt to keep backwards compatibility
with the broken behavior. If we did that, it would be pretty much game
over for any attempt to really deal with host entries in switchdev drivers.

Co-developed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/ethernet/marvell/prestera/prestera_switchdev.c | 2 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c   | 5 +++--
 drivers/net/ethernet/rocker/rocker_main.c                  | 4 ++--
 drivers/net/ethernet/ti/am65-cpsw-switchdev.c              | 4 ++--
 drivers/net/ethernet/ti/cpsw_switchdev.c                   | 4 ++--
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c                    | 4 ++--
 include/net/switchdev.h                                    | 1 +
 net/bridge/br_switchdev.c                                  | 1 +
 net/dsa/slave.c                                            | 2 +-
 9 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
index 49e052273f30..cb564890a3dc 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
@@ -798,7 +798,7 @@ static void prestera_fdb_event_work(struct work_struct *work)
 	switch (swdev_work->event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 		fdb_info = &swdev_work->fdb_info;
-		if (!fdb_info->added_by_user)
+		if (!fdb_info->added_by_user || fdb_info->is_local)
 			break;
 
 		err = prestera_port_fdb_set(port, fdb_info, true);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 23b7e8d6386b..9aadc29ad777 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -2865,7 +2865,8 @@ mlxsw_sp_switchdev_bridge_nve_fdb_event(struct mlxsw_sp_switchdev_event_work *
 		return;
 
 	if (switchdev_work->event == SWITCHDEV_FDB_ADD_TO_DEVICE &&
-	    !switchdev_work->fdb_info.added_by_user)
+	    (!switchdev_work->fdb_info.added_by_user ||
+	     switchdev_work->fdb_info.is_local))
 		return;
 
 	if (!netif_running(dev))
@@ -2920,7 +2921,7 @@ static void mlxsw_sp_switchdev_bridge_fdb_event_work(struct work_struct *work)
 	switch (switchdev_work->event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 		fdb_info = &switchdev_work->fdb_info;
-		if (!fdb_info->added_by_user)
+		if (!fdb_info->added_by_user || fdb_info->is_local)
 			break;
 		err = mlxsw_sp_port_fdb_set(mlxsw_sp_port, fdb_info, true);
 		if (err)
diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index 3473d296b2e2..a46633606cae 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -2736,7 +2736,7 @@ static void rocker_switchdev_event_work(struct work_struct *work)
 	switch (switchdev_work->event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 		fdb_info = &switchdev_work->fdb_info;
-		if (!fdb_info->added_by_user)
+		if (!fdb_info->added_by_user || fdb_info->is_local)
 			break;
 		err = rocker_world_port_fdb_add(rocker_port, fdb_info);
 		if (err) {
@@ -2747,7 +2747,7 @@ static void rocker_switchdev_event_work(struct work_struct *work)
 		break;
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
 		fdb_info = &switchdev_work->fdb_info;
-		if (!fdb_info->added_by_user)
+		if (!fdb_info->added_by_user || fdb_info->is_local)
 			break;
 		err = rocker_world_port_fdb_del(rocker_port, fdb_info);
 		if (err)
diff --git a/drivers/net/ethernet/ti/am65-cpsw-switchdev.c b/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
index d93ffd8a08b0..23cfb91e9c4d 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
@@ -385,7 +385,7 @@ static void am65_cpsw_switchdev_event_work(struct work_struct *work)
 			   fdb->addr, fdb->vid, fdb->added_by_user,
 			   fdb->offloaded, port_id);
 
-		if (!fdb->added_by_user)
+		if (!fdb->added_by_user || fdb->is_local)
 			break;
 		if (memcmp(port->slave.mac_addr, (u8 *)fdb->addr, ETH_ALEN) == 0)
 			port_id = HOST_PORT_NUM;
@@ -401,7 +401,7 @@ static void am65_cpsw_switchdev_event_work(struct work_struct *work)
 			   fdb->addr, fdb->vid, fdb->added_by_user,
 			   fdb->offloaded, port_id);
 
-		if (!fdb->added_by_user)
+		if (!fdb->added_by_user || fdb->is_local)
 			break;
 		if (memcmp(port->slave.mac_addr, (u8 *)fdb->addr, ETH_ALEN) == 0)
 			port_id = HOST_PORT_NUM;
diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.c b/drivers/net/ethernet/ti/cpsw_switchdev.c
index a72bb570756f..05a64fb7a04f 100644
--- a/drivers/net/ethernet/ti/cpsw_switchdev.c
+++ b/drivers/net/ethernet/ti/cpsw_switchdev.c
@@ -395,7 +395,7 @@ static void cpsw_switchdev_event_work(struct work_struct *work)
 			fdb->addr, fdb->vid, fdb->added_by_user,
 			fdb->offloaded, port);
 
-		if (!fdb->added_by_user)
+		if (!fdb->added_by_user || fdb->is_local)
 			break;
 		if (memcmp(priv->mac_addr, (u8 *)fdb->addr, ETH_ALEN) == 0)
 			port = HOST_PORT_NUM;
@@ -411,7 +411,7 @@ static void cpsw_switchdev_event_work(struct work_struct *work)
 			fdb->addr, fdb->vid, fdb->added_by_user,
 			fdb->offloaded, port);
 
-		if (!fdb->added_by_user)
+		if (!fdb->added_by_user || fdb->is_local)
 			break;
 		if (memcmp(priv->mac_addr, (u8 *)fdb->addr, ETH_ALEN) == 0)
 			port = HOST_PORT_NUM;
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index 703055e063ff..aad212b9b97b 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -1298,7 +1298,7 @@ static void dpaa2_switch_event_work(struct work_struct *work)
 
 	switch (switchdev_work->event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-		if (!fdb_info->added_by_user)
+		if (!fdb_info->added_by_user || fdb_info->is_local)
 			break;
 		if (is_unicast_ether_addr(fdb_info->addr))
 			err = dpaa2_switch_port_fdb_add_uc(netdev_priv(dev),
@@ -1313,7 +1313,7 @@ static void dpaa2_switch_event_work(struct work_struct *work)
 					 &fdb_info->info, NULL);
 		break;
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		if (!fdb_info->added_by_user)
+		if (!fdb_info->added_by_user || fdb_info->is_local)
 			break;
 		if (is_unicast_ether_addr(fdb_info->addr))
 			dpaa2_switch_port_fdb_del_uc(netdev_priv(dev), fdb_info->addr);
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index b7fc7d0f54e2..ca7223a79135 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -208,6 +208,7 @@ struct switchdev_notifier_fdb_info {
 	const unsigned char *addr;
 	u16 vid;
 	u8 added_by_user:1,
+	   is_local:1,
 	   offloaded:1;
 };
 
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 1386677bdaf8..a5e601e41cb9 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -114,6 +114,7 @@ br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type)
 		.addr = fdb->key.addr.addr,
 		.vid = fdb->key.vlan_id,
 		.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags),
+		.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags),
 		.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags),
 	};
 
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 65b3c1166fe7..c4db870b48e5 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2549,7 +2549,7 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 		fdb_info = ptr;
 
 		if (dsa_slave_dev_check(dev)) {
-			if (!fdb_info->added_by_user)
+			if (!fdb_info->added_by_user || fdb_info->is_local)
 				return NOTIFY_OK;
 
 			dp = dsa_slave_to_port(dev);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 09/17] net: bridge: switchdev: send FDB notifications for host addresses
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (7 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 08/17] net: bridge: switchdev: include local flag in FDB notifications Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 10/17] net: dsa: include bridge addresses which are local in the host fdb list Vladimir Oltean
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Tobias Waldekranz <tobias@waldekranz.com>

Treat addresses added to the bridge itself in the same way as regular
ports and send out a notification so that drivers may sync it down to
the hardware FDB.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_fdb.c       |  4 ++--
 net/bridge/br_private.h   |  7 ++++---
 net/bridge/br_switchdev.c | 11 +++++------
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index b7490237f3fc..1d54ae0f58fb 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -602,7 +602,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 			/* fastpath: update of existing entry */
 			if (unlikely(source != fdb->dst &&
 				     !test_bit(BR_FDB_STICKY, &fdb->flags))) {
-				br_switchdev_fdb_notify(fdb, RTM_DELNEIGH);
+				br_switchdev_fdb_notify(br, fdb, RTM_DELNEIGH);
 				fdb->dst = source;
 				fdb_modified = true;
 				/* Take over HW learned entry */
@@ -735,7 +735,7 @@ static void fdb_notify(struct net_bridge *br,
 	int err = -ENOBUFS;
 
 	if (swdev_notify)
-		br_switchdev_fdb_notify(fdb, type);
+		br_switchdev_fdb_notify(br, fdb, type);
 
 	skb = nlmsg_new(fdb_nlmsg_size(), GFP_ATOMIC);
 	if (skb == NULL)
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index d7d167e10b70..4a262dc55e6b 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -1581,8 +1581,8 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p,
 			       unsigned long flags,
 			       unsigned long mask,
 			       struct netlink_ext_ack *extack);
-void br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb,
-			     int type);
+void br_switchdev_fdb_notify(struct net_bridge *br,
+			     const struct net_bridge_fdb_entry *fdb, int type);
 int br_switchdev_port_vlan_add(struct net_device *dev, u16 vid, u16 flags,
 			       struct netlink_ext_ack *extack);
 int br_switchdev_port_vlan_del(struct net_device *dev, u16 vid);
@@ -1629,7 +1629,8 @@ static inline int br_switchdev_port_vlan_del(struct net_device *dev, u16 vid)
 }
 
 static inline void
-br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type)
+br_switchdev_fdb_notify(struct net_bridge *br,
+			const struct net_bridge_fdb_entry *fdb, int type)
 {
 }
 
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index a5e601e41cb9..9a707da79dfe 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -108,7 +108,8 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p,
 }
 
 void
-br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type)
+br_switchdev_fdb_notify(struct net_bridge *br,
+			const struct net_bridge_fdb_entry *fdb, int type)
 {
 	struct switchdev_notifier_fdb_info info = {
 		.addr = fdb->key.addr.addr,
@@ -117,18 +118,16 @@ br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type)
 		.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags),
 		.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags),
 	};
-
-	if (!fdb->dst)
-		return;
+	struct net_device *dev = fdb->dst ? fdb->dst->dev : br->dev;
 
 	switch (type) {
 	case RTM_DELNEIGH:
 		call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_DEVICE,
-					 fdb->dst->dev, &info.info, NULL);
+					 dev, &info.info, NULL);
 		break;
 	case RTM_NEWNEIGH:
 		call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_DEVICE,
-					 fdb->dst->dev, &info.info, NULL);
+					 dev, &info.info, NULL);
 		break;
 	}
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 10/17] net: dsa: include bridge addresses which are local in the host fdb list
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (8 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 09/17] net: bridge: switchdev: send FDB notifications for host addresses Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 11/17] net: dsa: include fdb entries pointing to bridge " Vladimir Oltean
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Tobias Waldekranz <tobias@waldekranz.com>

The bridge automatically creates local (not forwarded) fdb entries
pointing towards physical ports with their interface MAC addresses.
For switchdev, the significance of these fdb entries is the exact
opposite of that of non-local entries: instead of sending these frame
outwards, we must send them inwards (towards the host).

NOTE: The bridge's own MAC address is also "local". If that address is
not shared with any port, the bridge's MAC is not be added by this
functionality - but the following commit takes care of that case.

NOTE 2: We mark these addresses as host-filtered regardless of the value
of ds->assisted_learning_on_cpu_port. This is because, as opposed to the
speculative logic done for dynamic address learning on foreign
interfaces, the local FDB entries are rather fixed, so there isn't any
risk of them migrating from one bridge port to another.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/dsa/slave.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index c4db870b48e5..8d4cd27cc79f 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2549,10 +2549,12 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 		fdb_info = ptr;
 
 		if (dsa_slave_dev_check(dev)) {
-			if (!fdb_info->added_by_user || fdb_info->is_local)
-				return NOTIFY_OK;
-
 			dp = dsa_slave_to_port(dev);
+
+			if (fdb_info->is_local)
+				host_addr = true;
+			else if (!fdb_info->added_by_user)
+				return NOTIFY_OK;
 		} else {
 			/* Snoop addresses learnt on foreign interfaces
 			 * bridged with us, for switches that don't
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 11/17] net: dsa: include fdb entries pointing to bridge in the host fdb list
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (9 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 10/17] net: dsa: include bridge addresses which are local in the host fdb list Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 12/17] net: dsa: sync static FDB entries on foreign interfaces to hardware Vladimir Oltean
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

The bridge supports a legacy way of adding local (non-forwarded) FDB
entries, which works on an individual port basis:

bridge fdb add dev swp0 00:01:02:03:04:05 master local

As well as a new way, added by Roopa Prabhu in commit 3741873b4f73
("bridge: allow adding of fdb entries pointing to the bridge device"):

bridge fdb add dev br0 00:01:02:03:04:05 self local

The two commands are functionally equivalent, except that the first one
produces an entry with fdb->dst == swp0, and the other an entry with
fdb->dst == NULL. The confusing part, though, is that even if fdb->dst
is swp0 for the 'local on port' entry, that destination is not used.

Nonetheless, the idea is that the bridge has reference counting for
local entries, and local entries pointing towards the bridge are still
'as local' as local entries for a port.

The bridge adds the MAC addresses of the interfaces automatically as
FDB entries with is_local=1. For the MAC address of the ports, fdb->dst
will be equal to the port, and for the MAC address of the bridge,
fdb->dst will point towards the bridge (i.e. be NULL). Therefore, if the
MAC address of the bridge is not inherited from either of the physical
ports, then we must explicitly catch local FDB entries emitted towards
the br0, otherwise we'll miss the MAC address of the bridge (and, of
course, any entry with 'bridge add dev br0 ... self local').

Co-developed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/dsa/slave.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 8d4cd27cc79f..425b3223b7d1 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2563,7 +2563,11 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 			struct net_device *br_dev;
 			struct dsa_slave_priv *p;
 
-			br_dev = netdev_master_upper_dev_get_rcu(dev);
+			if (netif_is_bridge_master(dev))
+				br_dev = dev;
+			else
+				br_dev = netdev_master_upper_dev_get_rcu(dev);
+
 			if (!br_dev)
 				return NOTIFY_DONE;
 
@@ -2584,8 +2588,13 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 			 * LAG we don't want to send traffic to the CPU, the
 			 * other ports bridged with the LAG should be able to
 			 * autonomously forward towards it.
+			 * On the other hand, if the address is local
+			 * (therefore not learned) then we want to trap it to
+			 * the CPU regardless of whether the interface it
+			 * belongs to is offloaded or not.
 			 */
-			if (dsa_tree_offloads_netdev(dp->ds->dst, dev))
+			if (dsa_tree_offloads_netdev(dp->ds->dst, dev) &&
+			    !fdb_info->is_local)
 				return NOTIFY_DONE;
 		}
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 12/17] net: dsa: sync static FDB entries on foreign interfaces to hardware
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (10 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 11/17] net: dsa: include fdb entries pointing to bridge " Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 13/17] net: dsa: mv88e6xxx: Request assisted learning on CPU port Vladimir Oltean
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Tobias Waldekranz <tobias@waldekranz.com>

Reuse the "assisted_learning_on_cpu_port" functionality to always add
entries for user-configured entries on foreign interfaces, even if
assisted_learning_on_cpu_port is not enabled. E.g. in this situation:

   br0
   / \
swp0 dummy0

$ bridge fdb add 02:00:de:ad:00:01 dev dummy0 vlan 1 master static

Results in DSA adding an entry in the hardware FDB, pointing this
address towards the CPU port.

The same is true for entries added to the bridge itself, e.g:

$ bridge fdb add 02:00:de:ad:00:01 dev br0 vlan 1 self local

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/dsa/slave.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 425b3223b7d1..a32875d3dc5f 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2556,9 +2556,12 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 			else if (!fdb_info->added_by_user)
 				return NOTIFY_OK;
 		} else {
-			/* Snoop addresses learnt on foreign interfaces
-			 * bridged with us, for switches that don't
-			 * automatically learn SA from CPU-injected traffic
+			/* Snoop addresses added to foreign interfaces
+			 * bridged with us, or the bridge
+			 * itself. Dynamically learned addresses can
+			 * also be added for switches that don't
+			 * automatically learn SA from CPU-injected
+			 * traffic.
 			 */
 			struct net_device *br_dev;
 			struct dsa_slave_priv *p;
@@ -2581,7 +2584,8 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 			dp = p->dp;
 			host_addr = true;
 
-			if (!dp->ds->assisted_learning_on_cpu_port)
+			if (!fdb_info->added_by_user &&
+			    !dp->ds->assisted_learning_on_cpu_port)
 				return NOTIFY_DONE;
 
 			/* When the bridge learns an address on an offloaded
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 13/17] net: dsa: mv88e6xxx: Request assisted learning on CPU port
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (11 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 12/17] net: dsa: sync static FDB entries on foreign interfaces to hardware Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 14/17] net: dsa: replay port and host-joined mdb entries when joining the bridge Vladimir Oltean
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Tobias Waldekranz <tobias@waldekranz.com>

While the hardware is capable of performing learning on the CPU port,
it requires alot of additions to the bridge's forwarding path in order
to handle multi-destination traffic correctly.

Until that is in place, opt for the next best thing and let DSA sync
the relevant addresses down to the hardware FDB.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 903d619e08ed..e25bfcde8324 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -5818,6 +5818,7 @@ static int mv88e6xxx_register_switch(struct mv88e6xxx_chip *chip)
 	ds->ops = &mv88e6xxx_switch_ops;
 	ds->ageing_time_min = chip->info->age_time_coeff;
 	ds->ageing_time_max = chip->info->age_time_coeff * U8_MAX;
+	ds->assisted_learning_on_cpu_port = true;
 
 	/* Some chips support up to 32, but that requires enabling the
 	 * 5-bit port mode, which we do not support. 640k^W16 ought to
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 14/17] net: dsa: replay port and host-joined mdb entries when joining the bridge
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (12 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 13/17] net: dsa: mv88e6xxx: Request assisted learning on CPU port Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 15/17] net: dsa: replay port and local fdb " Vladimir Oltean
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

I have udhcpcd in my system and this is configured to bring interfaces
up as soon as they are created.

I create a bridge as follows:

ip link add br0 type bridge

As soon as I create the bridge and udhcpcd brings it up, I have some
other crap (avahi) that starts sending some random IPv6 packets to
advertise some local services, and from there, the br0 bridge joins the
following IPv6 groups:

33:33:ff:6d:c1:9c vid 0
33:33:00:00:00:6a vid 0
33:33:00:00:00:fb vid 0

br_dev_xmit
-> br_multicast_rcv
   -> br_ip6_multicast_add_group
      -> __br_multicast_add_group
         -> br_multicast_host_join
            -> br_mdb_notify

This is all fine, but inside br_mdb_notify we have br_mdb_switchdev_host
hooked up, and switchdev will attempt to offload the host joined groups
to an empty list of ports. Of course nobody offloads them.

Then when we add a port to br0:

ip link set swp0 master br0

the bridge doesn't replay the host-joined MDB entries from br_add_if,
and eventually the host joined addresses expire, and a switchdev
notification for deleting it is emitted, but surprise, the original
addition was already completely missed.

The strategy to address this problem is to replay the MDB entries (both
the port ones and the host joined ones) when the new port joins the
bridge, similar to what vxlan_fdb_replay does (in that case, its FDB can
be populated and only then attached to a bridge that you offload).
However there are 2 possibilities: the addresses can be 'pushed' by the
bridge into the port, or the port can 'pull' them from the bridge.

Considering that in the general case, the new port can be really late to
the party, and there may have been many other switchdev ports that
already received the initial notification, we would like to avoid
delivering duplicate events to them, since they might misbehave. And
currently, the bridge calls the entire switchdev notifier chain, whereas
for replaying it should just call the notifier block of the new guy.
But the bridge doesn't know what is the new guy's notifier block, it
just knows where the switchdev notifier chain is. So for simplification,
we make this a driver-initiated pull for now, and the notifier block is
passed as an argument.

To emulate the calling context for mdb objects (deferred and put on the
blocking notifier chain), we must iterate under RCU protection through
the bridge's mdb entries, queue them, and only call them once we're out
of the RCU read-side critical section.

Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/linux/if_bridge.h |   8 +++
 include/net/switchdev.h   |   1 +
 net/bridge/br_mdb.c       | 117 ++++++++++++++++++++++++++++++++++++++
 net/dsa/slave.c           |  17 +++++-
 4 files changed, 141 insertions(+), 2 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index b979005ea39c..2f0e5713bf39 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -69,6 +69,8 @@ bool br_multicast_has_querier_anywhere(struct net_device *dev, int proto);
 bool br_multicast_has_querier_adjacent(struct net_device *dev, int proto);
 bool br_multicast_enabled(const struct net_device *dev);
 bool br_multicast_router(const struct net_device *dev);
+int br_mdb_replay(struct net_device *br_dev, struct net_device *dev,
+		  struct notifier_block *nb);
 #else
 static inline int br_multicast_list_adjacent(struct net_device *dev,
 					     struct list_head *br_ip_list)
@@ -93,6 +95,12 @@ static inline bool br_multicast_router(const struct net_device *dev)
 {
 	return false;
 }
+static inline int br_mdb_replay(struct net_device *br_dev,
+				struct net_device *dev,
+				struct notifier_block *nb)
+{
+	return -EINVAL;
+}
 #endif
 
 #if IS_ENABLED(CONFIG_BRIDGE) && IS_ENABLED(CONFIG_BRIDGE_VLAN_FILTERING)
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index ca7223a79135..f1a5a9a3634d 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -68,6 +68,7 @@ enum switchdev_obj_id {
 };
 
 struct switchdev_obj {
+	struct list_head list;
 	struct net_device *orig_dev;
 	enum switchdev_obj_id id;
 	u32 flags;
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 8846c5bcd075..170353510c35 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -506,6 +506,123 @@ static void br_mdb_complete(struct net_device *dev, int err, void *priv)
 	kfree(priv);
 }
 
+static int br_mdb_replay_one(struct notifier_block *nb, struct net_device *dev,
+			     struct switchdev_obj_port_mdb *mdb)
+{
+	struct switchdev_notifier_port_obj_info obj_info = {
+		.info = {
+			.dev = dev,
+		},
+		.obj = &mdb->obj,
+	};
+	int err;
+
+	err = nb->notifier_call(nb, SWITCHDEV_PORT_OBJ_ADD, &obj_info);
+	return notifier_to_errno(err);
+}
+
+static int br_mdb_queue_one(struct list_head *mdb_list,
+			    enum switchdev_obj_id id,
+			    struct net_bridge_mdb_entry *mp,
+			    struct net_device *orig_dev)
+{
+	struct switchdev_obj_port_mdb *mdb;
+
+	mdb = kzalloc(sizeof(*mdb), GFP_ATOMIC);
+	if (!mdb)
+		return -ENOMEM;
+
+	mdb->obj.id = id;
+	mdb->obj.orig_dev = orig_dev;
+	mdb->vid = mp->addr.vid;
+
+	if (mp->addr.proto == htons(ETH_P_IP))
+		ip_eth_mc_map(mp->addr.dst.ip4, mdb->addr);
+#if IS_ENABLED(CONFIG_IPV6)
+	else if (mp->addr.proto == htons(ETH_P_IPV6))
+		ipv6_eth_mc_map(&mp->addr.dst.ip6, mdb->addr);
+#endif
+	else
+		ether_addr_copy(mdb->addr, mp->addr.dst.mac_addr);
+
+	list_add_tail(&mdb->obj.list, mdb_list);
+
+	return 0;
+}
+
+int br_mdb_replay(struct net_device *br_dev, struct net_device *dev,
+		  struct notifier_block *nb)
+{
+	struct net_bridge_mdb_entry *mp;
+	struct switchdev_obj *obj, *tmp;
+	struct list_head mdb_list;
+	struct net_bridge *br;
+	int err = 0;
+
+	ASSERT_RTNL();
+
+	INIT_LIST_HEAD(&mdb_list);
+
+	if (!netif_is_bridge_master(br_dev))
+		return -EINVAL;
+
+	if (!netif_is_bridge_port(dev))
+		return -EINVAL;
+
+	br = netdev_priv(br_dev);
+
+	if (!br_opt_get(br, BROPT_MULTICAST_ENABLED))
+		return 0;
+
+	rcu_read_lock();
+
+	hlist_for_each_entry_rcu(mp, &br->mdb_list, mdb_node) {
+		struct net_bridge_port_group __rcu **pp;
+		struct net_bridge_port_group *p;
+
+		if (mp->host_joined) {
+			err = br_mdb_queue_one(&mdb_list,
+					       SWITCHDEV_OBJ_ID_HOST_MDB,
+					       mp, br_dev);
+			if (err) {
+				rcu_read_unlock();
+				goto out_free_mdb;
+			}
+		}
+
+		for (pp = &mp->ports; (p = rcu_dereference(*pp)) != NULL;
+		     pp = &p->next) {
+			if (p->key.port->dev != dev)
+				continue;
+
+			err = br_mdb_queue_one(&mdb_list,
+					       SWITCHDEV_OBJ_ID_PORT_MDB,
+					       mp, dev);
+			if (err) {
+				rcu_read_unlock();
+				goto out_free_mdb;
+			}
+		}
+	}
+
+	rcu_read_unlock();
+
+	list_for_each_entry(obj, &mdb_list, list) {
+		err = br_mdb_replay_one(nb, dev, SWITCHDEV_OBJ_PORT_MDB(obj));
+		if (err)
+			goto out_free_mdb;
+	}
+
+out_free_mdb:
+	list_for_each_entry_safe(obj, tmp, &mdb_list, list) {
+		list_del(&obj->list);
+		kfree(SWITCHDEV_OBJ_PORT_MDB(obj));
+	}
+
+	return err;
+}
+EXPORT_SYMBOL(br_mdb_replay);
+
 static void br_mdb_switchdev_host_port(struct net_device *dev,
 				       struct net_device *lower_dev,
 				       struct net_bridge_mdb_entry *mp,
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index a32875d3dc5f..10b4a0f72dcb 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2290,6 +2290,9 @@ bool dsa_slave_dev_check(const struct net_device *dev)
 }
 EXPORT_SYMBOL_GPL(dsa_slave_dev_check);
 
+/* Circular reference */
+static struct notifier_block dsa_slave_switchdev_blocking_notifier;
+
 static int dsa_slave_changeupper(struct net_device *dev,
 				 struct netdev_notifier_changeupper_info *info)
 {
@@ -2297,10 +2300,15 @@ static int dsa_slave_changeupper(struct net_device *dev,
 	int err = NOTIFY_DONE;
 
 	if (netif_is_bridge_master(info->upper_dev)) {
+		struct net_device *bridge_dev = info->upper_dev;
+
 		if (info->linking) {
-			err = dsa_port_bridge_join(dp, info->upper_dev);
-			if (!err)
+			err = dsa_port_bridge_join(dp, bridge_dev);
+			if (!err) {
 				dsa_bridge_mtu_normalization(dp);
+				br_mdb_replay(bridge_dev, dev,
+					      &dsa_slave_switchdev_blocking_notifier);
+			}
 			err = notifier_from_errno(err);
 		} else {
 			dsa_port_bridge_leave(dp, info->upper_dev);
@@ -2361,6 +2369,11 @@ dsa_slave_lag_changeupper(struct net_device *dev,
 			break;
 	}
 
+	if (netif_is_bridge_master(info->upper_dev) && !err) {
+		br_mdb_replay(info->upper_dev, dev,
+			      &dsa_slave_switchdev_blocking_notifier);
+	}
+
 	return err;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 15/17] net: dsa: replay port and local fdb entries when joining the bridge
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (13 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 14/17] net: dsa: replay port and host-joined mdb entries when joining the bridge Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-26 12:23   ` Tobias Waldekranz
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 16/17] net: bridge: switchdev: let drivers inform which bridge ports are offloaded Vladimir Oltean
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 17/17] net: bridge: offloaded ports are always promiscuous Vladimir Oltean
  16 siblings, 1 reply; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

When a DSA port joins a LAG that already had an FDB entry pointing to it:

ip link set bond0 master br0
bridge fdb add dev bond0 00:01:02:03:04:05 master static
ip link set swp0 master bond0

the DSA port will have no idea that this FDB entry is there, because it
missed the switchdev event emitted at its creation.

Ido Schimmel pointed this out during a discussion about challenges with
switchdev offloading of stacked interfaces between the physical port and
the bridge, and recommended to just catch that condition and deny the
CHANGEUPPER event:
https://lore.kernel.org/netdev/20210210105949.GB287766@shredder.lan/

But in fact, we might need to deal with the hard thing anyway, which is
to replay all FDB addresses relevant to this port, because it isn't just
static FDB entries, but also local addresses (ones that are not
forwarded but terminated by the bridge). There, we can't just say 'oh
yeah, there was an upper already so I'm not joining that'.

So, similar to the logic for replaying MDB entries, add a function that
must be called by individual switchdev drivers and replays local FDB
entries as well as ones pointing towards a bridge port. This time, we
use the atomic switchdev notifier block, since that's what FDB entries
expect for some reason.

Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/linux/if_bridge.h | 10 ++++++++
 include/net/switchdev.h   |  1 +
 net/bridge/br_fdb.c       | 53 +++++++++++++++++++++++++++++++++++++++
 net/dsa/slave.c           |  7 +++++-
 4 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 2f0e5713bf39..2a90ac638b06 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -144,6 +144,8 @@ struct net_device *br_fdb_find_port(const struct net_device *br_dev,
 				    __u16 vid);
 void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
 bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
+int br_fdb_replay(struct net_device *br_dev, struct net_device *dev,
+		  struct notifier_block *nb);
 #else
 static inline struct net_device *
 br_fdb_find_port(const struct net_device *br_dev,
@@ -162,6 +164,14 @@ br_port_flag_is_set(const struct net_device *dev, unsigned long flag)
 {
 	return false;
 }
+
+static inline int br_fdb_replay(struct net_device *br_dev,
+				struct net_device *dev,
+				struct notifier_block *nb)
+{
+	return -EINVAL;
+}
+
 #endif
 
 #endif
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index f1a5a9a3634d..5b63dfd444c6 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -206,6 +206,7 @@ struct switchdev_notifier_info {
 
 struct switchdev_notifier_fdb_info {
 	struct switchdev_notifier_info info; /* must be first */
+	struct list_head list;
 	const unsigned char *addr;
 	u16 vid;
 	u8 added_by_user:1,
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 1d54ae0f58fb..9eb776503b02 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -726,6 +726,59 @@ static inline size_t fdb_nlmsg_size(void)
 		+ nla_total_size(sizeof(u8)); /* NFEA_ACTIVITY_NOTIFY */
 }
 
+static int br_fdb_replay_one(struct notifier_block *nb,
+			     struct net_bridge_fdb_entry *fdb,
+			     struct net_device *dev)
+{
+	struct switchdev_notifier_fdb_info item;
+	int err;
+
+	item.addr = fdb->key.addr.addr;
+	item.vid = fdb->key.vlan_id;
+	item.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
+	item.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
+	item.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
+	item.info.dev = dev;
+
+	err = nb->notifier_call(nb, SWITCHDEV_FDB_ADD_TO_DEVICE, &item);
+	return notifier_to_errno(err);
+}
+
+int br_fdb_replay(struct net_device *br_dev, struct net_device *dev,
+		  struct notifier_block *nb)
+{
+	struct net_bridge_fdb_entry *fdb;
+	struct net_bridge *br;
+	int err = 0;
+
+	if (!netif_is_bridge_master(br_dev))
+		return -EINVAL;
+
+	if (!netif_is_bridge_port(dev))
+		return -EINVAL;
+
+	br = netdev_priv(br_dev);
+
+	rcu_read_lock();
+
+	hlist_for_each_entry_rcu(fdb, &br->fdb_list, fdb_node) {
+		struct net_device *dst_dev;
+
+		dst_dev = fdb->dst ? fdb->dst->dev : br->dev;
+		if (dst_dev != br_dev && dst_dev != dev)
+			continue;
+
+		err = br_fdb_replay_one(nb, fdb, dst_dev);
+		if (err)
+			break;
+	}
+
+	rcu_read_unlock();
+
+	return err;
+}
+EXPORT_SYMBOL(br_fdb_replay);
+
 static void fdb_notify(struct net_bridge *br,
 		       const struct net_bridge_fdb_entry *fdb, int type,
 		       bool swdev_notify)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 10b4a0f72dcb..5fa5737e622c 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2290,7 +2290,8 @@ bool dsa_slave_dev_check(const struct net_device *dev)
 }
 EXPORT_SYMBOL_GPL(dsa_slave_dev_check);
 
-/* Circular reference */
+/* Circular references */
+static struct notifier_block dsa_slave_switchdev_notifier;
 static struct notifier_block dsa_slave_switchdev_blocking_notifier;
 
 static int dsa_slave_changeupper(struct net_device *dev,
@@ -2306,6 +2307,8 @@ static int dsa_slave_changeupper(struct net_device *dev,
 			err = dsa_port_bridge_join(dp, bridge_dev);
 			if (!err) {
 				dsa_bridge_mtu_normalization(dp);
+				br_fdb_replay(bridge_dev, dev,
+					      &dsa_slave_switchdev_notifier);
 				br_mdb_replay(bridge_dev, dev,
 					      &dsa_slave_switchdev_blocking_notifier);
 			}
@@ -2370,6 +2373,8 @@ dsa_slave_lag_changeupper(struct net_device *dev,
 	}
 
 	if (netif_is_bridge_master(info->upper_dev) && !err) {
+		br_fdb_replay(info->upper_dev, dev,
+			      &dsa_slave_switchdev_notifier);
 		br_mdb_replay(info->upper_dev, dev,
 			      &dsa_slave_switchdev_blocking_notifier);
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 16/17] net: bridge: switchdev: let drivers inform which bridge ports are offloaded
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (14 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 15/17] net: dsa: replay port and local fdb " Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 14:25   ` kernel test robot
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 17/17] net: bridge: offloaded ports are always promiscuous Vladimir Oltean
  16 siblings, 1 reply; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

On reception of an skb, the bridge checks if it was marked as 'already
forwarded in hardware' (checks if skb->offload_fwd_mark == 1), and if it
is, it puts a mark of its own on that skb, with the switchdev mark of
the ingress port. Then during forwarding, it enforces that the egress
port must have a different switchdev mark than the ingress one (this is
done in nbp_switchdev_allowed_egress).

Non-switchdev drivers don't report any physical switch id (neither
through devlink nor .ndo_get_port_parent_id), therefore the bridge
assigns them a switchdev mark of 0, and packets coming from them will
always have skb->offload_fwd_mark = 0. So there aren't any restrictions.

Problems appear due to the fact that DSA would like to perform software
fallback for bonding and team interfaces that the physical switch cannot
offload.

         +-- br0 -+
        /   / |    \
       /   /  |     \
      /   /   |      \
     /   /    |       \
    /   /     |        \
   /    |     |       bond0
  /     |     |      /    \
 swp0  swp1  swp2  swp3  swp4

There, it is desirable that the presence of swp3 and swp4 under a
non-offloaded LAG does not preclude us from doing hardware bridging
beteen swp0, swp1 and swp2. The bandwidth of the CPU is often times high
enough that software bridging between {swp0,swp1,swp2} and bond0 is not
impractical.

But this creates an impossible paradox given the current way in which
port switchdev marks are assigned. When the driver receives a packet
from swp0 (say, due to flooding), it must set skb->offload_fwd_mark to
something.

- If we set it to 0, then the bridge will forward it towards swp1, swp2
  and bond0. But the switch has already forwarded it towards swp1 and
  swp2 (not to bond0, remember, that isn't offloaded, so as far as the
  switch is concerned, ports swp3 and swp4 are not looking up the FDB,
  and the entire bond0 is a destination that is strictly behind the
  CPU). But we don't want duplicated traffic towards swp1 and swp2, so
  it's not ok to set skb->offload_fwd_mark = 0.

- If we set it to 1, then the bridge will not forward the skb towards
  the ports with the same switchdev mark, i.e. not to swp1, swp2 and
  bond0. Towards swp1 and swp2 that's ok, but towards bond0? It should
  have forwarded the skb there.

So the real issue is that bond0 will be assigned the same switchdev mark
as {swp0,swp1,swp2}, because the function that assigns switchdev marks
to bridge ports, nbp_switchdev_mark_set, recurses through bond0's lower
interfaces until it finds something that implements devlink.

A solution is to give the bridge explicit hints as to what switchdev
mark it should use for each port.

Currently, the bridging offload is very 'silent': a driver registers a
netdevice notifier, which is put on the netns's notifier chain, and
which sniffs around for NETDEV_CHANGEUPPER events where the upper is a
bridge, and the lower is an interface it knows about (one registered by
this driver, normally). Then, from within that notifier, it does a bunch
of stuff behind the bridge's back, without the bridge necessarily
knowing that there's somebody offloading that port. It looks like this:

     ip link set swp0 master br0
                  |
                  v
   bridge calls netdev_master_upper_dev_link
                  |
                  v
        call_netdevice_notifiers
                  |
                  v
       dsa_slave_netdevice_event
                  |
                  v
        oh, hey! it's for me!
                  |
                  v
           .port_bridge_join

What we do to solve the conundrum is to be less silent, and emit a
notification back. Something like this:

     ip link set swp0 master br0
                  |
                  v
   bridge calls netdev_master_upper_dev_link
                  |
                  v                    bridge: Aye! I'll use this
        call_netdevice_notifiers           ^  ppid as the
                  |                        |  switchdev mark for
                  v                        |  this port, and zero
       dsa_slave_netdevice_event           |  if I got nothing.
                  |                        |
                  v                        |
        oh, hey! it's for me!              |
                  |                        |
                  v                        |
           .port_bridge_join               |
                  |                        |
                  +------------------------+
      call_switchdev_notifiers(swp0, SWITCHDEV_BRPORT_OFFLOADED, ppid)

Then stacked interfaces (like bond0 on top of swp3/swp4) would be
treated differently in DSA, depending on whether we can or cannot
offload them.

The offload case:

    ip link set bond0 master br0
                  |
                  v
   bridge calls netdev_master_upper_dev_link
                  |
                  v                    bridge: Aye! I'll use this
        call_netdevice_notifiers           ^  ppid as the
                  |                        |  switchdev mark for
                  v                        |        bond0.
       dsa_slave_netdevice_event           | Coincidentally (or not),
                  |                        | bond0 and swp0, swp1, swp2
                  v                        | all have the same switchdev
        hmm, it's not quite for me,        | mark now, since the ASIC
         but my driver has already         | is able to forward towards
           called .port_lag_join           | all these ports in hw.
          for it, because I have           |
      a port with dp->lag_dev == bond0.    |
                  |                        |
                  v                        |
           .port_bridge_join               |
           for swp3 and swp4               |
                  |                        |
                  +------------------------+
      call_switchdev_notifiers(bond0, SWITCHDEV_BRPORT_OFFLOADED, ppid)

And the non-offload case:

    ip link set bond0 master br0
                  |
                  v
   bridge calls netdev_master_upper_dev_link
                  |
                  v                    bridge waiting:
        call_netdevice_notifiers           ^  huh, no SWITCHDEV_BRPORT_OFFLOADED
                  |                        |  event, okay, I'll use a switchdev
                  v                        |  mark of zero for this one.
       dsa_slave_netdevice_event           :  Then packets received on swp0 will
                  |                        :  not be forwarded towards swp1, but
                  v                        :  they will towards bond0.
         it's not for me, but
       bond0 is an upper of swp3
      and swp4, but their dp->lag_dev
       is NULL because they couldn't
            offload it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 .../marvell/prestera/prestera_switchdev.c     |  2 +
 .../mellanox/mlxsw/spectrum_switchdev.c       |  2 +
 drivers/net/ethernet/mscc/ocelot_net.c        | 43 +++++++++++++++----
 drivers/net/ethernet/rocker/rocker_ofdpa.c    |  2 +
 drivers/net/ethernet/ti/am65-cpsw-nuss.c      |  2 +
 drivers/net/ethernet/ti/cpsw_new.c            |  1 +
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c       |  2 +
 include/net/switchdev.h                       | 16 +++++++
 net/bridge/br.c                               |  8 +++-
 net/bridge/br_if.c                            | 11 ++---
 net/bridge/br_private.h                       |  7 +--
 net/bridge/br_switchdev.c                     | 27 +++++-------
 net/dsa/slave.c                               | 20 +++++----
 net/switchdev/switchdev.c                     | 18 ++++++++
 14 files changed, 116 insertions(+), 45 deletions(-)

diff --git a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
index cb564890a3dc..10a6242b7ac1 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
@@ -443,6 +443,8 @@ static int prestera_port_bridge_join(struct prestera_port *port,
 		goto err_brport_create;
 	}
 
+	switchdev_bridge_port_offload_notify(port->dev);
+
 	if (bridge->vlan_enabled)
 		return 0;
 
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 9aadc29ad777..241276dbe876 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -2326,6 +2326,8 @@ int mlxsw_sp_port_bridge_join(struct mlxsw_sp_port *mlxsw_sp_port,
 	if (err)
 		goto err_port_join;
 
+	switchdev_bridge_port_offload_notify(brport_dev);
+
 	return 0;
 
 err_port_join:
diff --git a/drivers/net/ethernet/mscc/ocelot_net.c b/drivers/net/ethernet/mscc/ocelot_net.c
index 12cb6867a2d0..b8be69ade1bd 100644
--- a/drivers/net/ethernet/mscc/ocelot_net.c
+++ b/drivers/net/ethernet/mscc/ocelot_net.c
@@ -1111,10 +1111,14 @@ static int ocelot_port_obj_del(struct net_device *dev,
 	return ret;
 }
 
-static int ocelot_netdevice_bridge_join(struct ocelot *ocelot, int port,
+static int ocelot_netdevice_bridge_join(struct net_device *dev,
 					struct net_device *bridge)
 {
+	struct ocelot_port_private *priv = netdev_priv(dev);
+	struct ocelot_port *ocelot_port = &priv->port;
+	struct ocelot *ocelot = ocelot_port->ocelot;
 	struct switchdev_brport_flags flags;
+	int port = priv->chip_port;
 	int err;
 
 	flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
@@ -1124,15 +1128,20 @@ static int ocelot_netdevice_bridge_join(struct ocelot *ocelot, int port,
 	if (err)
 		return err;
 
+	switchdev_bridge_port_offload_notify(dev);
 	ocelot_port_bridge_flags(ocelot, port, flags);
 
 	return 0;
 }
 
-static int ocelot_netdevice_bridge_leave(struct ocelot *ocelot, int port,
+static int ocelot_netdevice_bridge_leave(struct net_device *dev,
 					 struct net_device *bridge)
 {
+	struct ocelot_port_private *priv = netdev_priv(dev);
+	struct ocelot_port *ocelot_port = &priv->port;
+	struct ocelot *ocelot = ocelot_port->ocelot;
 	struct switchdev_brport_flags flags;
+	int port = priv->chip_port;
 	int err;
 
 	flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
@@ -1146,7 +1155,8 @@ static int ocelot_netdevice_bridge_leave(struct ocelot *ocelot, int port,
 }
 
 static int ocelot_netdevice_changeupper(struct net_device *dev,
-					struct netdev_notifier_changeupper_info *info)
+					struct netdev_notifier_changeupper_info *info,
+					bool notify)
 {
 	struct ocelot_port_private *priv = netdev_priv(dev);
 	struct ocelot_port *ocelot_port = &priv->port;
@@ -1156,11 +1166,11 @@ static int ocelot_netdevice_changeupper(struct net_device *dev,
 
 	if (netif_is_bridge_master(info->upper_dev)) {
 		if (info->linking) {
-			err = ocelot_netdevice_bridge_join(ocelot, port,
-							   info->upper_dev);
+			err = ocelot_netdevice_bridge_join(dev, info->upper_dev);
+			if (!err && notify)
+				switchdev_bridge_port_offload_notify(dev);
 		} else {
-			err = ocelot_netdevice_bridge_leave(ocelot, port,
-							    info->upper_dev);
+			err = ocelot_netdevice_bridge_leave(dev, info->upper_dev);
 		}
 	}
 	if (netif_is_lag_master(info->upper_dev)) {
@@ -1182,6 +1192,12 @@ static int ocelot_netdevice_changeupper(struct net_device *dev,
 	return notifier_from_errno(err);
 }
 
+/* Treat CHANGEUPPER events on an offloaded LAG as individual CHANGEUPPER
+ * events for the lower physical ports of the LAG.
+ * If the LAG upper isn't offloaded, ignore its CHANGEUPPER events.
+ * In case the LAG joined a bridge, notify that we are offloading it and can do
+ * forwarding in hardware towards it.
+ */
 static int
 ocelot_netdevice_lag_changeupper(struct net_device *dev,
 				 struct netdev_notifier_changeupper_info *info)
@@ -1191,11 +1207,20 @@ ocelot_netdevice_lag_changeupper(struct net_device *dev,
 	int err = NOTIFY_DONE;
 
 	netdev_for_each_lower_dev(dev, lower, iter) {
-		err = ocelot_netdevice_changeupper(lower, info);
+		struct ocelot_port_private *priv = netdev_priv(lower);
+		struct ocelot_port *ocelot_port = &priv->port;
+
+		if (ocelot_port->bond != dev)
+			return NOTIFY_OK;
+
+		err = ocelot_netdevice_changeupper(lower, info, false);
 		if (err)
 			return notifier_from_errno(err);
 	}
 
+	if (info->linking && netif_is_bridge_master(info->upper_dev))
+		switchdev_bridge_port_offload_notify(dev);
+
 	return NOTIFY_DONE;
 }
 
@@ -1230,7 +1255,7 @@ static int ocelot_netdevice_event(struct notifier_block *unused,
 		struct netdev_notifier_changeupper_info *info = ptr;
 
 		if (ocelot_netdevice_dev_check(dev))
-			return ocelot_netdevice_changeupper(dev, info);
+			return ocelot_netdevice_changeupper(dev, info, true);
 
 		if (netif_is_lag_master(dev))
 			return ocelot_netdevice_lag_changeupper(dev, info);
diff --git a/drivers/net/ethernet/rocker/rocker_ofdpa.c b/drivers/net/ethernet/rocker/rocker_ofdpa.c
index 967a634ee9ac..f57c26a24924 100644
--- a/drivers/net/ethernet/rocker/rocker_ofdpa.c
+++ b/drivers/net/ethernet/rocker/rocker_ofdpa.c
@@ -2592,6 +2592,8 @@ static int ofdpa_port_bridge_join(struct ofdpa_port *ofdpa_port,
 
 	ofdpa_port->bridge_dev = bridge;
 
+	switchdev_bridge_port_offload_notify(ofdpa_port->dev);
+
 	return ofdpa_port_vlan_add(ofdpa_port, OFDPA_UNTAGGED_VID, 0);
 }
 
diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
index 638d7b03be4b..7bb1b7b83031 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
@@ -27,6 +27,7 @@
 #include <linux/sys_soc.h>
 #include <linux/dma/ti-cppi5.h>
 #include <linux/dma/k3-udma-glue.h>
+#include <net/switchdev.h>
 
 #include "cpsw_ale.h"
 #include "cpsw_sl.h"
@@ -2096,6 +2097,7 @@ static int am65_cpsw_netdevice_port_link(struct net_device *ndev, struct net_dev
 	common->br_members |= BIT(priv->port->port_id);
 
 	am65_cpsw_port_offload_fwd_mark_update(common);
+	switchdev_bridge_port_offload_notify(ndev);
 
 	return NOTIFY_DONE;
 }
diff --git a/drivers/net/ethernet/ti/cpsw_new.c b/drivers/net/ethernet/ti/cpsw_new.c
index 58a64313ac00..e246010db5be 100644
--- a/drivers/net/ethernet/ti/cpsw_new.c
+++ b/drivers/net/ethernet/ti/cpsw_new.c
@@ -1522,6 +1522,7 @@ static int cpsw_netdevice_port_link(struct net_device *ndev,
 	cpsw->br_members |= BIT(priv->emac_port);
 
 	cpsw_port_offload_fwd_mark_update(cpsw);
+	switchdev_bridge_port_offload_notify(ndev);
 
 	return NOTIFY_DONE;
 }
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index aad212b9b97b..535982be5605 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -1237,6 +1237,8 @@ static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
 	if (!err)
 		port_priv->bridge_dev = upper_dev;
 
+	switchdev_bridge_port_offload_notify(netdev);
+
 	return err;
 }
 
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 5b63dfd444c6..a3b4cf6ade6c 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -197,6 +197,8 @@ enum switchdev_notifier_type {
 	SWITCHDEV_VXLAN_FDB_ADD_TO_DEVICE,
 	SWITCHDEV_VXLAN_FDB_DEL_TO_DEVICE,
 	SWITCHDEV_VXLAN_FDB_OFFLOADED,
+
+	SWITCHDEV_BRPORT_OFFLOADED,
 };
 
 struct switchdev_notifier_info {
@@ -204,6 +206,11 @@ struct switchdev_notifier_info {
 	struct netlink_ext_ack *extack;
 };
 
+struct switchdev_notifier_brport_info {
+	struct switchdev_notifier_info info; /* must be first */
+	struct netdev_phys_item_id ppid;
+};
+
 struct switchdev_notifier_fdb_info {
 	struct switchdev_notifier_info info; /* must be first */
 	struct list_head list;
@@ -284,6 +291,9 @@ int switchdev_handle_port_attr_set(struct net_device *dev,
 			int (*set_cb)(struct net_device *dev,
 				      const struct switchdev_attr *attr,
 				      struct netlink_ext_ack *extack));
+
+int switchdev_bridge_port_offload_notify(struct net_device *dev);
+
 #else
 
 static inline void switchdev_deferred_process(void)
@@ -380,6 +390,12 @@ switchdev_handle_port_attr_set(struct net_device *dev,
 {
 	return 0;
 }
+
+static inline int switchdev_bridge_port_offload_notify(struct net_device *dev)
+{
+	return 0;
+}
+
 #endif
 
 #endif /* _LINUX_SWITCHDEV_H_ */
diff --git a/net/bridge/br.c b/net/bridge/br.c
index ef743f94254d..72dcd0bc462a 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -151,9 +151,10 @@ static int br_switchdev_event(struct notifier_block *unused,
 			      unsigned long event, void *ptr)
 {
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+	struct switchdev_notifier_brport_info *brport_info;
+	struct switchdev_notifier_fdb_info *fdb_info;
 	struct net_bridge_port *p;
 	struct net_bridge *br;
-	struct switchdev_notifier_fdb_info *fdb_info;
 	int err = NOTIFY_DONE;
 
 	p = br_port_get_rtnl_rcu(dev);
@@ -191,6 +192,11 @@ static int br_switchdev_event(struct notifier_block *unused,
 		/* Don't delete static entries */
 		br_fdb_delete_by_port(br, p, fdb_info->vid, 0);
 		break;
+	case SWITCHDEV_BRPORT_OFFLOADED:
+		brport_info = ptr;
+		p->ppid = brport_info->ppid;
+		p->offloaded = true;
+		break;
 	}
 
 out:
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index f7d2f472ae24..680fc3bed549 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -643,9 +643,7 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
 	if (err)
 		goto err5;
 
-	err = nbp_switchdev_mark_set(p);
-	if (err)
-		goto err6;
+	nbp_switchdev_mark_set(p);
 
 	dev_disable_lro(dev);
 
@@ -671,13 +669,13 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
 		 */
 		err = dev_pre_changeaddr_notify(br->dev, dev->dev_addr, extack);
 		if (err)
-			goto err7;
+			goto err6;
 	}
 
 	err = nbp_vlan_init(p, extack);
 	if (err) {
 		netdev_err(dev, "failed to initialize vlan filtering on this port\n");
-		goto err7;
+		goto err6;
 	}
 
 	spin_lock_bh(&br->lock);
@@ -700,11 +698,10 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
 
 	return 0;
 
-err7:
+err6:
 	list_del_rcu(&p->list);
 	br_fdb_delete_by_port(br, p, 0, 1);
 	nbp_update_port_count(br);
-err6:
 	netdev_upper_dev_unlink(dev, br->dev);
 err5:
 	dev->priv_flags &= ~IFF_BRIDGE_PORT;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 4a262dc55e6b..484e23e8bb9d 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -328,6 +328,8 @@ struct net_bridge_port {
 #endif
 #ifdef CONFIG_NET_SWITCHDEV
 	int				offload_fwd_mark;
+	bool				offloaded;
+	struct netdev_phys_item_id	ppid;
 #endif
 	u16				group_fwd_mask;
 	u16				backup_redirected_cnt;
@@ -1572,7 +1574,7 @@ static inline void br_sysfs_delbr(struct net_device *dev) { return; }
 
 /* br_switchdev.c */
 #ifdef CONFIG_NET_SWITCHDEV
-int nbp_switchdev_mark_set(struct net_bridge_port *p);
+void nbp_switchdev_mark_set(struct net_bridge_port *p);
 void nbp_switchdev_frame_mark(const struct net_bridge_port *p,
 			      struct sk_buff *skb);
 bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p,
@@ -1592,9 +1594,8 @@ static inline void br_switchdev_frame_unmark(struct sk_buff *skb)
 	skb->offload_fwd_mark = 0;
 }
 #else
-static inline int nbp_switchdev_mark_set(struct net_bridge_port *p)
+static inline void nbp_switchdev_mark_set(struct net_bridge_port *p)
 {
-	return 0;
 }
 
 static inline void nbp_switchdev_frame_mark(const struct net_bridge_port *p,
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 9a707da79dfe..c700da5c71e0 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -8,36 +8,29 @@
 
 #include "br_private.h"
 
-static int br_switchdev_mark_get(struct net_bridge *br, struct net_device *dev)
+static int br_switchdev_mark_get(struct net_bridge *br,
+				 struct net_bridge_port *new_nbp)
 {
 	struct net_bridge_port *p;
 
 	/* dev is yet to be added to the port list. */
 	list_for_each_entry(p, &br->port_list, list) {
-		if (netdev_port_same_parent_id(dev, p->dev))
+		if (!p->offloaded)
+			continue;
+
+		if (netdev_phys_item_id_same(&p->ppid, &new_nbp->ppid))
 			return p->offload_fwd_mark;
 	}
 
 	return ++br->offload_fwd_mark;
 }
 
-int nbp_switchdev_mark_set(struct net_bridge_port *p)
+void nbp_switchdev_mark_set(struct net_bridge_port *p)
 {
-	struct netdev_phys_item_id ppid = { };
-	int err;
-
-	ASSERT_RTNL();
+	if (!p->offloaded)
+		return;
 
-	err = dev_get_port_parent_id(p->dev, &ppid, true);
-	if (err) {
-		if (err == -EOPNOTSUPP)
-			return 0;
-		return err;
-	}
-
-	p->offload_fwd_mark = br_switchdev_mark_get(p->br, p->dev);
-
-	return 0;
+	p->offload_fwd_mark = br_switchdev_mark_get(p->br, p);
 }
 
 void nbp_switchdev_frame_mark(const struct net_bridge_port *p,
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 5fa5737e622c..bbb7846d6022 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2295,7 +2295,8 @@ static struct notifier_block dsa_slave_switchdev_notifier;
 static struct notifier_block dsa_slave_switchdev_blocking_notifier;
 
 static int dsa_slave_changeupper(struct net_device *dev,
-				 struct netdev_notifier_changeupper_info *info)
+				 struct netdev_notifier_changeupper_info *info,
+				 bool notify)
 {
 	struct dsa_port *dp = dsa_slave_to_port(dev);
 	int err = NOTIFY_DONE;
@@ -2305,6 +2306,8 @@ static int dsa_slave_changeupper(struct net_device *dev,
 
 		if (info->linking) {
 			err = dsa_port_bridge_join(dp, bridge_dev);
+			if (!err && notify)
+				switchdev_bridge_port_offload_notify(dev);
 			if (!err) {
 				dsa_bridge_mtu_normalization(dp);
 				br_fdb_replay(bridge_dev, dev,
@@ -2364,22 +2367,23 @@ dsa_slave_lag_changeupper(struct net_device *dev,
 
 		dp = dsa_slave_to_port(lower);
 		if (!dp->lag_dev)
-			/* Software LAG */
-			continue;
+			/* Software LAG, ignore all its CHANGEUPPER events */
+			return NOTIFY_DONE;
 
-		err = dsa_slave_changeupper(lower, info);
+		err = dsa_slave_changeupper(lower, info, false);
 		if (notifier_to_errno(err))
-			break;
+			return err;
 	}
 
-	if (netif_is_bridge_master(info->upper_dev) && !err) {
+	if (info->linking && netif_is_bridge_master(info->upper_dev)) {
+		switchdev_bridge_port_offload_notify(dev);
 		br_fdb_replay(info->upper_dev, dev,
 			      &dsa_slave_switchdev_notifier);
 		br_mdb_replay(info->upper_dev, dev,
 			      &dsa_slave_switchdev_blocking_notifier);
 	}
 
-	return err;
+	return 0;
 }
 
 static int
@@ -2475,7 +2479,7 @@ static int dsa_slave_netdevice_event(struct notifier_block *nb,
 	}
 	case NETDEV_CHANGEUPPER:
 		if (dsa_slave_dev_check(dev))
-			return dsa_slave_changeupper(dev, ptr);
+			return dsa_slave_changeupper(dev, ptr, true);
 
 		if (netif_is_lag_master(dev))
 			return dsa_slave_lag_changeupper(dev, ptr);
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 89a36db47ab4..7f48effc1ffb 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -546,3 +546,21 @@ int switchdev_handle_port_attr_set(struct net_device *dev,
 	return err;
 }
 EXPORT_SYMBOL_GPL(switchdev_handle_port_attr_set);
+
+/* Let the bridge know that this port is offloaded, so that it can use the
+ * port parent id obtained by recursion to determine the bridge port's
+ * switchdev mark.
+ */
+int switchdev_bridge_port_offload_notify(struct net_device *dev)
+{
+	struct switchdev_notifier_brport_info info;
+	int err;
+
+	err = dev_get_port_parent_id(dev, &info.ppid, true);
+	if (err)
+		return err;
+
+	return call_switchdev_notifiers(SWITCHDEV_BRPORT_OFFLOADED, dev,
+					&info.info, NULL);
+}
+EXPORT_SYMBOL(switchdev_bridge_port_offload_notify);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v2 net-next 17/17] net: bridge: offloaded ports are always promiscuous
  2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
                   ` (15 preceding siblings ...)
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 16/17] net: bridge: switchdev: let drivers inform which bridge ports are offloaded Vladimir Oltean
@ 2021-02-24 11:43 ` Vladimir Oltean
  2021-02-24 15:21   ` kernel test robot
  16 siblings, 1 reply; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-24 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, Tobias Waldekranz,
	George McCollister, Vlad Yasevich, Roopa Prabhu,
	Nikolay Aleksandrov

From: Vladimir Oltean <vladimir.oltean@nxp.com>

Automatic bridge ports are ones with source address learning or unicast
flooding enabled (corollary: non-automatic ports have learning and
flooding disabled).

The bridge driver has an optimization which says that if all ports are
non-automatic, they don't need to operate in promiscuous mode (i.e. they
don't need to receive all packets). Instead, if a non-automatic port
supports unicast filtering, it can be made to receive only the packets
which have a static FDB entry towards another port in the same forwarding
domain. The logic is that, if a packet is received and does not have a
static FDB entry for routing, it would be dropped anyway because all the
other ports have flooding disabled. So it makes sense to not accept the
packet on RX in the first place.

When a non-automatic port switches to non-promiscuous mode, the static
FDB entries towards the other bridge ports are synced to the RX filtering
list of this ingress port using dev_uc_add.

However, this optimization doesn't bring any benefit for switchdev ports
that offload the bridge. Their hardware data path is promiscuous by its
very definition, i.e. they accept packets regardless of destination MAC
address, because they need to forward them towards the correct station.

Not only is the optimization not necessary, it is also detrimential.
The promise of switchdev is that it looks just like a regular interface
to user space, and it offers extra offloading functionality for stacked
virtual interfaces that can benefit from it. Therefore, it is imaginable
that a switchdev driver might want to implement IFF_UNICAST_FLT too.
When not offloading the bridge, a switchdev interface should really be
indistinguishable from a normal port from user space's perspective,
which means that addresses installed with dev_uc_add and dev_mc_add
should be accepted, and those which aren't should be dropped.

So a switchdev driver might implement dev_uc_add and dev_mc_add by
extracting these addresses from the hardware data path and delivering
them to the CPU, and drop the rest by disabling flooding to the CPU,
and this makes perfect sense when there is no bridge involved on top.

However, things get complicated when the bridge ports are non-automatic
and enter non-promiscuous mode. The bridge will then panic 'oh no, I
need to do something in order for my packets to not get dropped', and
will do the dev_uc_add procedure mentioned above. This will result in
the undesirable behavior that the switchdev driver will extract those
MAC addresses to the CPU, when in fact all that the bridge wanted was
for the packets to not be dropped.

To avoid this situation, the switchdev driver would need to conditionally
accept an address added through dev_uc_add and extract it to the CPU
only if it wasn't added by the bridge, which is both complicated,
strange and counterproductive. It is already unfortunate enough that the
bridge uses its own notification mechanisms for addresses which need to
be extracted (SWITCHDEV_OBJ_ID_HOST_MDB, SWITCHDEV_FDB_ADD_TO_DEVICE
with dev=br0). It shouldn't monopolize the switchdev driver's
functionality and instead it should allow it to offer its services to
other layers which are unaware of switchdev.

So this patch's premise is that the bridge, which is fully aware of
switchdev anyway, is the one that needs to compromise, and must not do
something which isn't needed if switchdev is being used to offload a
port. This way, dev_uc_add and dev_mc_add can be used as a valid mechanism
for address filtering towards the CPU requested by switchdev-unaware
layers ('towards the CPU' because switchdev-unaware will not benefit
from the hardware offload datapath anyway, that's effectively the only
destination which is relevant for them).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_if.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 680fc3bed549..fc32066bbfdc 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -111,9 +111,13 @@ static void br_port_clear_promisc(struct net_bridge_port *p)
 	/* Check if the port is already non-promisc or if it doesn't
 	 * support UNICAST filtering.  Without unicast filtering support
 	 * we'll end up re-enabling promisc mode anyway, so just check for
-	 * it here.
+	 * it here. Also, a switchdev offloading this port needs to be
+	 * promiscuous by definition, so don't even attempt to get it out of
+	 * promiscuous mode or sync unicast FDB entries to it, since that is
+	 * pointless and not necessary.
 	 */
-	if (!br_promisc_port(p) || !(p->dev->priv_flags & IFF_UNICAST_FLT))
+	if (!br_promisc_port(p) || !(p->dev->priv_flags & IFF_UNICAST_FLT) ||
+	    p->offloaded)
 		return;
 
 	/* Since we'll be clearing the promisc mode, program the port
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 16/17] net: bridge: switchdev: let drivers inform which bridge ports are offloaded
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 16/17] net: bridge: switchdev: let drivers inform which bridge ports are offloaded Vladimir Oltean
@ 2021-02-24 14:25   ` kernel test robot
  0 siblings, 0 replies; 34+ messages in thread
From: kernel test robot @ 2021-02-24 14:25 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3581 bytes --]

Hi Vladimir,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Vladimir-Oltean/RX-filtering-in-DSA/20210224-195147
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git d310ec03a34e92a77302edb804f7d68ee4f01ba0
config: openrisc-randconfig-p002-20210223 (attached as .config)
compiler: or1k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/292ea1a7fc1b03aebab5571439861ac396d15687
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Vladimir-Oltean/RX-filtering-in-DSA/20210224-195147
        git checkout 292ea1a7fc1b03aebab5571439861ac396d15687
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=openrisc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   net/bridge/br.c: In function 'br_switchdev_event':
>> net/bridge/br.c:197:4: error: 'struct net_bridge_port' has no member named 'ppid'
     197 |   p->ppid = brport_info->ppid;
         |    ^~
>> net/bridge/br.c:198:4: error: 'struct net_bridge_port' has no member named 'offloaded'
     198 |   p->offloaded = true;
         |    ^~


vim +197 net/bridge/br.c

   148	
   149	/* called with RTNL or RCU */
   150	static int br_switchdev_event(struct notifier_block *unused,
   151				      unsigned long event, void *ptr)
   152	{
   153		struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
   154		struct switchdev_notifier_brport_info *brport_info;
   155		struct switchdev_notifier_fdb_info *fdb_info;
   156		struct net_bridge_port *p;
   157		struct net_bridge *br;
   158		int err = NOTIFY_DONE;
   159	
   160		p = br_port_get_rtnl_rcu(dev);
   161		if (!p)
   162			goto out;
   163	
   164		br = p->br;
   165	
   166		switch (event) {
   167		case SWITCHDEV_FDB_ADD_TO_BRIDGE:
   168			fdb_info = ptr;
   169			err = br_fdb_external_learn_add(br, p, fdb_info->addr,
   170							fdb_info->vid, false);
   171			if (err) {
   172				err = notifier_from_errno(err);
   173				break;
   174			}
   175			br_fdb_offloaded_set(br, p, fdb_info->addr,
   176					     fdb_info->vid, true);
   177			break;
   178		case SWITCHDEV_FDB_DEL_TO_BRIDGE:
   179			fdb_info = ptr;
   180			err = br_fdb_external_learn_del(br, p, fdb_info->addr,
   181							fdb_info->vid, false);
   182			if (err)
   183				err = notifier_from_errno(err);
   184			break;
   185		case SWITCHDEV_FDB_OFFLOADED:
   186			fdb_info = ptr;
   187			br_fdb_offloaded_set(br, p, fdb_info->addr,
   188					     fdb_info->vid, fdb_info->offloaded);
   189			break;
   190		case SWITCHDEV_FDB_FLUSH_TO_BRIDGE:
   191			fdb_info = ptr;
   192			/* Don't delete static entries */
   193			br_fdb_delete_by_port(br, p, fdb_info->vid, 0);
   194			break;
   195		case SWITCHDEV_BRPORT_OFFLOADED:
   196			brport_info = ptr;
 > 197			p->ppid = brport_info->ppid;
 > 198			p->offloaded = true;
   199			break;
   200		}
   201	
   202	out:
   203		return err;
   204	}
   205	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 22478 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 17/17] net: bridge: offloaded ports are always promiscuous
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 17/17] net: bridge: offloaded ports are always promiscuous Vladimir Oltean
@ 2021-02-24 15:21   ` kernel test robot
  0 siblings, 0 replies; 34+ messages in thread
From: kernel test robot @ 2021-02-24 15:21 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2680 bytes --]

Hi Vladimir,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Vladimir-Oltean/RX-filtering-in-DSA/20210224-195147
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git d310ec03a34e92a77302edb804f7d68ee4f01ba0
config: openrisc-randconfig-p002-20210223 (attached as .config)
compiler: or1k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/57f6434fb1b797f12dc1b614e68dba69054caee2
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Vladimir-Oltean/RX-filtering-in-DSA/20210224-195147
        git checkout 57f6434fb1b797f12dc1b614e68dba69054caee2
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=openrisc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   net/bridge/br_if.c: In function 'br_port_clear_promisc':
>> net/bridge/br_if.c:120:7: error: 'struct net_bridge_port' has no member named 'offloaded'
     120 |      p->offloaded)
         |       ^~


vim +120 net/bridge/br_if.c

   106	
   107	static void br_port_clear_promisc(struct net_bridge_port *p)
   108	{
   109		int err;
   110	
   111		/* Check if the port is already non-promisc or if it doesn't
   112		 * support UNICAST filtering.  Without unicast filtering support
   113		 * we'll end up re-enabling promisc mode anyway, so just check for
   114		 * it here. Also, a switchdev offloading this port needs to be
   115		 * promiscuous by definition, so don't even attempt to get it out of
   116		 * promiscuous mode or sync unicast FDB entries to it, since that is
   117		 * pointless and not necessary.
   118		 */
   119		if (!br_promisc_port(p) || !(p->dev->priv_flags & IFF_UNICAST_FLT) ||
 > 120		    p->offloaded)
   121			return;
   122	
   123		/* Since we'll be clearing the promisc mode, program the port
   124		 * first so that we don't have interruption in traffic.
   125		 */
   126		err = br_fdb_sync_static(p->br, p);
   127		if (err)
   128			return;
   129	
   130		dev_set_promiscuity(p->dev, -1);
   131		p->flags &= ~BR_PROMISC;
   132	}
   133	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 22478 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 01/17] net: dsa: reference count the host mdb addresses
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 01/17] net: dsa: reference count the host mdb addresses Vladimir Oltean
@ 2021-02-26  9:20   ` Tobias Waldekranz
  0 siblings, 0 replies; 34+ messages in thread
From: Tobias Waldekranz @ 2021-02-26  9:20 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, George McCollister, Vlad Yasevich,
	Roopa Prabhu, Nikolay Aleksandrov

On Wed, Feb 24, 2021 at 13:43, Vladimir Oltean <olteanv@gmail.com> wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
>
> Currently any DSA switch that is strict when implementing the mdb
> operations prints these benign errors after the addresses expire, with
> at least 2 ports bridged:
>
> [  286.013814] mscc_felix 0000:00:00.5 swp3: failed (err=-2) to del object (id=3)
>
> The reason has to do with this piece of code:
>
> 	netdev_for_each_lower_dev(dev, lower_dev, iter)
> 		br_mdb_switchdev_host_port(dev, lower_dev, mp, type);
>
> called from:
>
> br_multicast_group_expired
> -> br_multicast_host_leave
>    -> br_mdb_notify
>       -> br_mdb_switchdev_host
>
> Basically, the bridge code is correct.

How about "the bridge code is not wrong"? Is there any reason why we
could not get rid of the duplicated messages at the source
(br_mdb_switchdev_host)?

The forward offloading we have talked about before requires that the
bridge-internal port OFMs are bounded to some low value
(e.g. BITS_PER_LONG) such that they can be tracked in a small
bitfield. I have a patch that does this, it is tiny.

With that in place, imagine a `br_switchdev_for_each_distinct_dev`
helper that would wrap `netdev_for_each_lower_dev`, keeping track of
OFMs it had already visited.

For all host related switchdev calls, we should be able to use that
instead of the full iterator to only contact each switchdev driver once.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 06/17] net: dsa: add addresses obtained from RX filtering to host addresses
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 06/17] net: dsa: add addresses obtained from RX filtering to host addresses Vladimir Oltean
@ 2021-02-26 10:59   ` Tobias Waldekranz
  2021-02-26 13:28     ` Vladimir Oltean
  0 siblings, 1 reply; 34+ messages in thread
From: Tobias Waldekranz @ 2021-02-26 10:59 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, George McCollister, Vlad Yasevich,
	Roopa Prabhu, Nikolay Aleksandrov

On Wed, Feb 24, 2021 at 13:43, Vladimir Oltean <olteanv@gmail.com> wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
>
> In case we have ptp4l running on a bridged DSA switch interface, the PTP
> traffic is classified as link-local (in the default profile, the MAC
> addresses are 01:1b:19:00:00:00 and 01:80:c2:00:00:0e), which means it
> isn't the responsibility of the bridge to make sure it gets trapped to
> the CPU.
>
> The solution is to implement the standard callbacks for dev_uc_add and
> dev_mc_add, and behave just like any other network interface: ensure
> that the user space program can see those packets.

So presumably the application would use PACKET_ADD_MEMBERSHIP to set
this up?

This is a really elegant way of solving this problem I think!

One problem I see is that this will not result in packets getting
trapped to the CPU, rather they will simply be forwarded.  I.e. with
this patch applied, once ptp4l adds the groups it is interested in, my
HW FDB will look like this:

ADDR                VID  DST   TYPE
01:1b:19:00:00:00     0  cpu0  static
01:80:c2:00:00:0e     0  cpu0  static

But this will not allow these groups to ingress on (STP) blocked
ports. AFAIK, PTP (certainly LLDP which also uses the latter group)
should be able to do that.

For mv88e6xxx (but I think this applies to most switches), there are
roughly three ways a given multicast group can reach the CPU:

1. Trap: Packet is unconditionally redirected to the CPU, independent
   of things like 802.1X or STP state on the ingressing port.
2. Mirror: Send a copy of packets that pass all other ingress policy to
   the CPU.
3. Forward: Forward packets that pass all other ingress policy to the
   CPU.

Entries are now added as "Forward", which means that the group will no
longer reach the other local ports. But the command from the application
is "I want to see these packets", it says nothing about preventing the
group from being forwarded. So I think the default ought to be
"Mirror". Additionally, we probably need some way of specifying "Trap"
to those applications that need it. E.g. ptp4l could specify
PACKET_MR_MULTICAST_TRAP in mr_action or something if it does not want
the bridge (or the switch) to forward it.

If "Forward" is desired, the existing "bridge mdb" interface seems like
the proper one, since it also affects other ports.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 15/17] net: dsa: replay port and local fdb entries when joining the bridge
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 15/17] net: dsa: replay port and local fdb " Vladimir Oltean
@ 2021-02-26 12:23   ` Tobias Waldekranz
  2021-02-26 18:08     ` Vladimir Oltean
  0 siblings, 1 reply; 34+ messages in thread
From: Tobias Waldekranz @ 2021-02-26 12:23 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, Jiri Pirko,
	Ido Schimmel, DENG Qingfang, George McCollister, Vlad Yasevich,
	Roopa Prabhu, Nikolay Aleksandrov

On Wed, Feb 24, 2021 at 13:43, Vladimir Oltean <olteanv@gmail.com> wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
>
> When a DSA port joins a LAG that already had an FDB entry pointing to it:
>
> ip link set bond0 master br0
> bridge fdb add dev bond0 00:01:02:03:04:05 master static
> ip link set swp0 master bond0
>
> the DSA port will have no idea that this FDB entry is there, because it
> missed the switchdev event emitted at its creation.
>
> Ido Schimmel pointed this out during a discussion about challenges with
> switchdev offloading of stacked interfaces between the physical port and
> the bridge, and recommended to just catch that condition and deny the
> CHANGEUPPER event:
> https://lore.kernel.org/netdev/20210210105949.GB287766@shredder.lan/
>
> But in fact, we might need to deal with the hard thing anyway, which is
> to replay all FDB addresses relevant to this port, because it isn't just
> static FDB entries, but also local addresses (ones that are not
> forwarded but terminated by the bridge). There, we can't just say 'oh
> yeah, there was an upper already so I'm not joining that'.
>
> So, similar to the logic for replaying MDB entries, add a function that
> must be called by individual switchdev drivers and replays local FDB
> entries as well as ones pointing towards a bridge port. This time, we
> use the atomic switchdev notifier block, since that's what FDB entries
> expect for some reason.
>
> Reported-by: Ido Schimmel <idosch@idosch.org>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>  include/linux/if_bridge.h | 10 ++++++++
>  include/net/switchdev.h   |  1 +
>  net/bridge/br_fdb.c       | 53 +++++++++++++++++++++++++++++++++++++++
>  net/dsa/slave.c           |  7 +++++-
>  4 files changed, 70 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> index 2f0e5713bf39..2a90ac638b06 100644
> --- a/include/linux/if_bridge.h
> +++ b/include/linux/if_bridge.h
> @@ -144,6 +144,8 @@ struct net_device *br_fdb_find_port(const struct net_device *br_dev,
>  				    __u16 vid);
>  void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
>  bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
> +int br_fdb_replay(struct net_device *br_dev, struct net_device *dev,
> +		  struct notifier_block *nb);
>  #else
>  static inline struct net_device *
>  br_fdb_find_port(const struct net_device *br_dev,
> @@ -162,6 +164,14 @@ br_port_flag_is_set(const struct net_device *dev, unsigned long flag)
>  {
>  	return false;
>  }
> +
> +static inline int br_fdb_replay(struct net_device *br_dev,
> +				struct net_device *dev,
> +				struct notifier_block *nb)
> +{
> +	return -EINVAL;
> +}
> +
>  #endif
>  
>  #endif
> diff --git a/include/net/switchdev.h b/include/net/switchdev.h
> index f1a5a9a3634d..5b63dfd444c6 100644
> --- a/include/net/switchdev.h
> +++ b/include/net/switchdev.h
> @@ -206,6 +206,7 @@ struct switchdev_notifier_info {
>  
>  struct switchdev_notifier_fdb_info {
>  	struct switchdev_notifier_info info; /* must be first */
> +	struct list_head list;
>  	const unsigned char *addr;
>  	u16 vid;
>  	u8 added_by_user:1,
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index 1d54ae0f58fb..9eb776503b02 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -726,6 +726,59 @@ static inline size_t fdb_nlmsg_size(void)
>  		+ nla_total_size(sizeof(u8)); /* NFEA_ACTIVITY_NOTIFY */
>  }
>  
> +static int br_fdb_replay_one(struct notifier_block *nb,
> +			     struct net_bridge_fdb_entry *fdb,
> +			     struct net_device *dev)
> +{
> +	struct switchdev_notifier_fdb_info item;
> +	int err;
> +
> +	item.addr = fdb->key.addr.addr;
> +	item.vid = fdb->key.vlan_id;
> +	item.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
> +	item.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
> +	item.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
> +	item.info.dev = dev;
> +
> +	err = nb->notifier_call(nb, SWITCHDEV_FDB_ADD_TO_DEVICE, &item);
> +	return notifier_to_errno(err);
> +}
> +
> +int br_fdb_replay(struct net_device *br_dev, struct net_device *dev,
> +		  struct notifier_block *nb)
> +{
> +	struct net_bridge_fdb_entry *fdb;
> +	struct net_bridge *br;
> +	int err = 0;
> +
> +	if (!netif_is_bridge_master(br_dev))
> +		return -EINVAL;
> +
> +	if (!netif_is_bridge_port(dev))
> +		return -EINVAL;
> +
> +	br = netdev_priv(br_dev);
> +
> +	rcu_read_lock();
> +
> +	hlist_for_each_entry_rcu(fdb, &br->fdb_list, fdb_node) {
> +		struct net_device *dst_dev;
> +
> +		dst_dev = fdb->dst ? fdb->dst->dev : br->dev;
> +		if (dst_dev != br_dev && dst_dev != dev)
> +			continue;
> +
> +		err = br_fdb_replay_one(nb, fdb, dst_dev);
> +		if (err)
> +			break;
> +	}
> +
> +	rcu_read_unlock();
> +
> +	return err;
> +}
> +EXPORT_SYMBOL(br_fdb_replay);
> +
>  static void fdb_notify(struct net_bridge *br,
>  		       const struct net_bridge_fdb_entry *fdb, int type,
>  		       bool swdev_notify)
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 10b4a0f72dcb..5fa5737e622c 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -2290,7 +2290,8 @@ bool dsa_slave_dev_check(const struct net_device *dev)
>  }
>  EXPORT_SYMBOL_GPL(dsa_slave_dev_check);
>  
> -/* Circular reference */
> +/* Circular references */
> +static struct notifier_block dsa_slave_switchdev_notifier;
>  static struct notifier_block dsa_slave_switchdev_blocking_notifier;
>  
>  static int dsa_slave_changeupper(struct net_device *dev,
> @@ -2306,6 +2307,8 @@ static int dsa_slave_changeupper(struct net_device *dev,
>  			err = dsa_port_bridge_join(dp, bridge_dev);
>  			if (!err) {
>  				dsa_bridge_mtu_normalization(dp);
> +				br_fdb_replay(bridge_dev, dev,
> +					      &dsa_slave_switchdev_notifier);
>  				br_mdb_replay(bridge_dev, dev,
>  					      &dsa_slave_switchdev_blocking_notifier);

If VLAN filtering is enabled, we would also have to replay that. Port
attributes also, right?

I like the pull model, because it saves the bridge from doing lots of
dumpster diving. However, should there be a single `bridge_replay` that
takes care of everything?

Rather than this kit-car approarch which outsources ordering etc to each
switchdev driver, you issue a single call saying: "bring me up to
speed". It seems right that that knowledge should reside in the bridge
since it was the one who sent the original events that are being
replayed.

>  			}
> @@ -2370,6 +2373,8 @@ dsa_slave_lag_changeupper(struct net_device *dev,
>  	}
>  
>  	if (netif_is_bridge_master(info->upper_dev) && !err) {
> +		br_fdb_replay(info->upper_dev, dev,
> +			      &dsa_slave_switchdev_notifier);
>  		br_mdb_replay(info->upper_dev, dev,
>  			      &dsa_slave_switchdev_blocking_notifier);
>  	}
> -- 
> 2.25.1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 06/17] net: dsa: add addresses obtained from RX filtering to host addresses
  2021-02-26 10:59   ` Tobias Waldekranz
@ 2021-02-26 13:28     ` Vladimir Oltean
  2021-02-26 22:44       ` Tobias Waldekranz
  0 siblings, 1 reply; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-26 13:28 UTC (permalink / raw)
  To: Tobias Waldekranz
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Jiri Pirko, Ido Schimmel, DENG Qingfang, George McCollister,
	Vlad Yasevich, Roopa Prabhu, Nikolay Aleksandrov

On Fri, Feb 26, 2021 at 11:59:36AM +0100, Tobias Waldekranz wrote:
> On Wed, Feb 24, 2021 at 13:43, Vladimir Oltean <olteanv@gmail.com> wrote:
> > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> >
> > In case we have ptp4l running on a bridged DSA switch interface, the PTP
> > traffic is classified as link-local (in the default profile, the MAC
> > addresses are 01:1b:19:00:00:00 and 01:80:c2:00:00:0e), which means it
> > isn't the responsibility of the bridge to make sure it gets trapped to
> > the CPU.
> >
> > The solution is to implement the standard callbacks for dev_uc_add and
> > dev_mc_add, and behave just like any other network interface: ensure
> > that the user space program can see those packets.
>
> So presumably the application would use PACKET_ADD_MEMBERSHIP to set
> this up?
>
> This is a really elegant way of solving this problem I think!

Yes, using the unmodified *_ADD_MEMBERSHIP UAPI was the intention.
If that is not possible, the whole idea kinda loses its appeal and we'd
be better off starting from scratch and figuring out how we'd prefer for
user space to request (exclusive) address membership.

> One problem I see is that this will not result in packets getting
> trapped to the CPU, rather they will simply be forwarded.  I.e. with
> this patch applied, once ptp4l adds the groups it is interested in, my
> HW FDB will look like this:
>
> ADDR                VID  DST   TYPE
> 01:1b:19:00:00:00     0  cpu0  static
> 01:80:c2:00:00:0e     0  cpu0  static
>
> But this will not allow these groups to ingress on (STP) blocked
> ports. AFAIK, PTP (certainly LLDP which also uses the latter group)
> should be able to do that.
>
> For mv88e6xxx (but I think this applies to most switches), there are
> roughly three ways a given multicast group can reach the CPU:
>
> 1. Trap: Packet is unconditionally redirected to the CPU, independent
>    of things like 802.1X or STP state on the ingressing port.
> 2. Mirror: Send a copy of packets that pass all other ingress policy to
>    the CPU.
> 3. Forward: Forward packets that pass all other ingress policy to the
>    CPU.
>
> Entries are now added as "Forward", which means that the group will no
> longer reach the other local ports. But the command from the application
> is "I want to see these packets", it says nothing about preventing the
> group from being forwarded. So I think the default ought to be
> "Mirror". Additionally, we probably need some way of specifying "Trap"
> to those applications that need it. E.g. ptp4l could specify
> PACKET_MR_MULTICAST_TRAP in mr_action or something if it does not want
> the bridge (or the switch) to forward it.
>
> If "Forward" is desired, the existing "bridge mdb" interface seems like
> the proper one, since it also affects other ports.

I'm not sure I understand your exact requirement.

Let me try to quote from IEEE 802.1Q-2018 clause 8.13.9 "Points of
attachment and connectivity for Higher Layer Entities". For context,
this talks about about Higher Layer Entities, aka applications such as
STP, MRP, ISIS-SPB, whose world view is as though they are connected
directly to the LAN that the bridge port is connected to, i.e. they
bypass the MAC relay (forwarding) function.

The spec says:

  Controls placed in the forwarding path have no effect on the ability
  of a Higher Layer Entity to transmit and receive frames to or from a
  given LAN using a direct attachment to that LAN (e.g., from entity A
  to LAN A); they only affect the path taken by any indirect
  transmission or reception (e.g., from entity A to or from LAN B).

Then there's this drawing:

 +-----------------------+                         +-----------------------+
 | Higher Layer Entity A |                         | Higher Layer Entity B |
 +-----------------------+                         +-----------------------+
            |                                                   |
            |     +---------------------------------------+     |
            |     |             MAC Relay Entity          |     |
            |     |                                       |     |
            |     |   Port   Filtering Database   Port    |     |
            |     |  State       Information     State    |     |
            |     |                                       |     |
            |     |     /             /             /     |     |
            +-----|---x/  x---------x/  x---------x/  x---+-----+
            |     |                                       |     |
            |     +---------------------------------------+     |
            |                                                   |
 +----------------------------+                      +----------------------------+
 |          |                 |                      |          |                 |
 |          x                 |                      |          x                 |
 |    MAC    /                |                      |    MAC    /                |
 |  Entity  / MAC_Operational |                      |  Entity  / MAC_Operational |
 |          x                 |                      |          x                 |
 |          |                 |                      |          |                 |
 +----------------------------+                      +----------------------------+
            |                                                   |
            |                                                   |
            |                                                   |
   ----------------------------                        ----------------------------
  /       LAN A               /                       /       LAN B               /
 /                           /                       /                           /
/---------------------------/                       /---------------------------/

Figure 8-20: Effect of control information on the forwarding path

A one phrase conclusion from the above is that a Higher Layer Entity
should be able to accept packets from the bridge port regardless of its
STP state, as long as the MAC is operational.

In my world view, anything sent and received through the swpN DSA
interfaces, and therefore by injection/extraction through the CPU port,
represents a Higher Layer Entity.

So the fact that your Marvell switch treats the CPU port as just another
destination for forwarding should be hidden away from the externally
observable behavior. DSA does not really support the switched endpoint
use case, we should deny attaching IP interfaces to it.

But to your point: we are currently using the .port_fdb_add and
.port_mdb_add API in DSA to install the host-filtered addresses. This is
true, and potentially incorrect, since indeed it assumes that the
underlying mechanism to trap these addresses to the CPU is through the
FDB/MDB, which will violate the expectation of Higher Layer Entities
since it goes through the forwarding process. I think we could add a new
set of APIs in DSA, something like .host_uc_add and .host_mc_add. Then,
drivers that can't do any better can go ahead and internally call their
.port_fdb_add and .port_mdb_add. Question: If we add .host_uc_add and
.host_mc_add, should these APIs be per front port? And if they should,
should we leave the switch driver the task of reference counting them?


As to why does IEEE 1588 use 01-80-C2-00-00-0E for peer delay in its
default profile for the L2 transport, I could only find this (quoting
clause "F.3 Multicast MAC Addresses"):

  To ensure peer delay measurements on ports blocked by (Rapid/Multiple)
  Spanning Tree Protocols, a reserved address, 01-80-C2-00-00-0E, shall be
  used as a Destination MAC Address for PTP peer delay mechanism messages.

Basically, unlike end-to-end delay measurements, peer delay is by
definition point-to-point, which in an L2 network mean link-local.
So if your LAN uses the peer delay measurement protocol, then all your
switches must have some sort of PTP awareness. There are 2 cases really:
- You are a Peer-to-peer Transparent Clock, so you must speak the Peer
  Delay protocol yourself. Therefore you must have a Higher Layer Entity
  that consumes the PTP PDUs. So for this case, it doesn't really make
  any difference that the reserved bridge MAC address range is used.
- You are an End-to-end Transparent Clock. These are an oddity of the
  1588 standard which do not speak the Peer Delay protocol, but are
  allowed to forward Peer Delay messages, and update the correctionField
  of those Peer Delay messages that are Events (contain timestamps).
  I would think that this is where having a reserved address for the
  Peer Delay messages would make a difference.


Then, there seems to be the second part of your request, that of address
exclusivity. I think that for addresses that should explicitly be
removed from the hardware data path, the tc-trap action was created for
that exact purpose. However, the question really becomes: does PTP care
enough to know it's running on top of a switchdev interface, so that it
should claim exclusive ownership of that multicast group, or should it
just care enough to say "hey, I'm here, I'm interested in it too"?

For one thing, the 01-80-c2-00-00-0e multicast group is shared, even
IEEE 1588 says so:

  NOTE 2: At its July 17−20, 2006 meeting the IEEE 802.1 Working Group
  approved a motion that included the following text: "re-designate the
  reserved address currently identified for use by 802.1AB as an address
  that can be used by protocols that require the scope of the address to
  be limited to an individual LAN" and "The reserved multicast address
  that IEEE 1588 should use is 01-80-C2-00-00-0E." This address is not
  reserved exclusively for PTP, but rather it is a shared address.

and the bridge driver also supports this odd attribute:

  group_fwd_mask MASK - set the group forward mask. This is the bitmask
  that is applied to decide whether to forward incoming frames destined
  to link-local addresses, ie addresses of the form 01:80:C2:00:00:0X
  (defaults to 0, ie the bridge does not forward any link-local frames).

which by the way seems to be in blatant contradiction of clause
8.13.4 Reserved MAC addresses: "Any frame with a destination address
that is a reserved MAC address shall not be forwarded by a Bridge".

So if ptp4l was to install a trap action by itself, it would potentially
interact in unexpected ways with other uses of that address on the same
system.

Thoughts?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 15/17] net: dsa: replay port and local fdb entries when joining the bridge
  2021-02-26 12:23   ` Tobias Waldekranz
@ 2021-02-26 18:08     ` Vladimir Oltean
  0 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2021-02-26 18:08 UTC (permalink / raw)
  To: Tobias Waldekranz
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Jiri Pirko, Ido Schimmel, DENG Qingfang, George McCollister,
	Vlad Yasevich, Roopa Prabhu, Nikolay Aleksandrov

On Fri, Feb 26, 2021 at 01:23:23PM +0100, Tobias Waldekranz wrote:
> If VLAN filtering is enabled, we would also have to replay that. Port
> attributes also, right?
> 
> I like the pull model, because it saves the bridge from doing lots of
> dumpster diving. However, should there be a single `bridge_replay` that
> takes care of everything?
> 
> Rather than this kit-car approarch which outsources ordering etc to each
> switchdev driver, you issue a single call saying: "bring me up to
> speed". It seems right that that knowledge should reside in the bridge
> since it was the one who sent the original events that are being
> replayed.

Yes, in the non-RFC version I'm going to do that.
I'm also thinking I could just pass the blocking and atomic switchdev
notifiers as an argument to the switchdev_bridge_port_offload_notify()
call, such that the drivers need to do one thing and one thing only.

For the purposes of this RFC I just wanted to have something that works
for address filtering.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 06/17] net: dsa: add addresses obtained from RX filtering to host addresses
  2021-02-26 13:28     ` Vladimir Oltean
@ 2021-02-26 22:44       ` Tobias Waldekranz
  0 siblings, 0 replies; 34+ messages in thread
From: Tobias Waldekranz @ 2021-02-26 22:44 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Jiri Pirko, Ido Schimmel, DENG Qingfang, George McCollister,
	Vlad Yasevich, Roopa Prabhu, Nikolay Aleksandrov

On Fri, Feb 26, 2021 at 15:28, Vladimir Oltean <olteanv@gmail.com> wrote:
> On Fri, Feb 26, 2021 at 11:59:36AM +0100, Tobias Waldekranz wrote:
>> On Wed, Feb 24, 2021 at 13:43, Vladimir Oltean <olteanv@gmail.com> wrote:
>> > From: Vladimir Oltean <vladimir.oltean@nxp.com>
>> >
>> > In case we have ptp4l running on a bridged DSA switch interface, the PTP
>> > traffic is classified as link-local (in the default profile, the MAC
>> > addresses are 01:1b:19:00:00:00 and 01:80:c2:00:00:0e), which means it
>> > isn't the responsibility of the bridge to make sure it gets trapped to
>> > the CPU.
>> >
>> > The solution is to implement the standard callbacks for dev_uc_add and
>> > dev_mc_add, and behave just like any other network interface: ensure
>> > that the user space program can see those packets.
>>
>> So presumably the application would use PACKET_ADD_MEMBERSHIP to set
>> this up?
>>
>> This is a really elegant way of solving this problem I think!
>
> Yes, using the unmodified *_ADD_MEMBERSHIP UAPI was the intention.
> If that is not possible, the whole idea kinda loses its appeal and we'd
> be better off starting from scratch and figuring out how we'd prefer for
> user space to request (exclusive) address membership.
>
>> One problem I see is that this will not result in packets getting
>> trapped to the CPU, rather they will simply be forwarded.  I.e. with
>> this patch applied, once ptp4l adds the groups it is interested in, my
>> HW FDB will look like this:
>>
>> ADDR                VID  DST   TYPE
>> 01:1b:19:00:00:00     0  cpu0  static
>> 01:80:c2:00:00:0e     0  cpu0  static
>>
>> But this will not allow these groups to ingress on (STP) blocked
>> ports. AFAIK, PTP (certainly LLDP which also uses the latter group)
>> should be able to do that.
>>
>> For mv88e6xxx (but I think this applies to most switches), there are
>> roughly three ways a given multicast group can reach the CPU:
>>
>> 1. Trap: Packet is unconditionally redirected to the CPU, independent
>>    of things like 802.1X or STP state on the ingressing port.
>> 2. Mirror: Send a copy of packets that pass all other ingress policy to
>>    the CPU.
>> 3. Forward: Forward packets that pass all other ingress policy to the
>>    CPU.
>>
>> Entries are now added as "Forward", which means that the group will no
>> longer reach the other local ports. But the command from the application
>> is "I want to see these packets", it says nothing about preventing the
>> group from being forwarded. So I think the default ought to be
>> "Mirror". Additionally, we probably need some way of specifying "Trap"
>> to those applications that need it. E.g. ptp4l could specify
>> PACKET_MR_MULTICAST_TRAP in mr_action or something if it does not want
>> the bridge (or the switch) to forward it.
>>
>> If "Forward" is desired, the existing "bridge mdb" interface seems like
>> the proper one, since it also affects other ports.
>
> I'm not sure I understand your exact requirement.
>
> Let me try to quote from IEEE 802.1Q-2018 clause 8.13.9 "Points of
> attachment and connectivity for Higher Layer Entities". For context,
> this talks about about Higher Layer Entities, aka applications such as
> STP, MRP, ISIS-SPB, whose world view is as though they are connected
> directly to the LAN that the bridge port is connected to, i.e. they
> bypass the MAC relay (forwarding) function.
>
> The spec says:
>
>   Controls placed in the forwarding path have no effect on the ability
>   of a Higher Layer Entity to transmit and receive frames to or from a
>   given LAN using a direct attachment to that LAN (e.g., from entity A
>   to LAN A); they only affect the path taken by any indirect
>   transmission or reception (e.g., from entity A to or from LAN B).
>
> Then there's this drawing:
>
>  +-----------------------+                         +-----------------------+
>  | Higher Layer Entity A |                         | Higher Layer Entity B |
>  +-----------------------+                         +-----------------------+
>             |                                                   |
>             |     +---------------------------------------+     |
>             |     |             MAC Relay Entity          |     |
>             |     |                                       |     |
>             |     |   Port   Filtering Database   Port    |     |
>             |     |  State       Information     State    |     |
>             |     |                                       |     |
>             |     |     /             /             /     |     |
>             +-----|---x/  x---------x/  x---------x/  x---+-----+
>             |     |                                       |     |
>             |     +---------------------------------------+     |
>             |                                                   |
>  +----------------------------+                      +----------------------------+
>  |          |                 |                      |          |                 |
>  |          x                 |                      |          x                 |
>  |    MAC    /                |                      |    MAC    /                |
>  |  Entity  / MAC_Operational |                      |  Entity  / MAC_Operational |
>  |          x                 |                      |          x                 |
>  |          |                 |                      |          |                 |
>  +----------------------------+                      +----------------------------+
>             |                                                   |
>             |                                                   |
>             |                                                   |
>    ----------------------------                        ----------------------------
>   /       LAN A               /                       /       LAN B               /
>  /                           /                       /                           /
> /---------------------------/                       /---------------------------/
>
> Figure 8-20: Effect of control information on the forwarding path
>
> A one phrase conclusion from the above is that a Higher Layer Entity
> should be able to accept packets from the bridge port regardless of its
> STP state, as long as the MAC is operational.
>
> In my world view, anything sent and received through the swpN DSA
> interfaces, and therefore by injection/extraction through the CPU port,
> represents a Higher Layer Entity.

Yes, I agree.

> So the fact that your Marvell switch treats the CPU port as just another
> destination for forwarding should be hidden away from the externally
> observable behavior. DSA does not really support the switched endpoint
> use case, we should deny attaching IP interfaces to it.
>
> But to your point: we are currently using the .port_fdb_add and
> .port_mdb_add API in DSA to install the host-filtered addresses. This is
> true, and potentially incorrect, since indeed it assumes that the
> underlying mechanism to trap these addresses to the CPU is through the
> FDB/MDB, which will violate the expectation of Higher Layer Entities
> since it goes through the forwarding process.

Well, at least the way it is implemented now. E.g. mv88e6xxx can setup
mirrors/traps via the FDB (ATU), I imagine that many smaller devices
reuse their FDBs for this purpose. So it might be possible to just
extend the API a bit to signal the type of entry required.

> I think we could add a new
> set of APIs in DSA, something like .host_uc_add and .host_mc_add. Then,
> drivers that can't do any better can go ahead and internally call their
> .port_fdb_add and .port_mdb_add.

That might be an ever better way of managing it. That would also make it
easier to design in the difference between trap/mirror from the start.

> Question: If we add .host_uc_add and
> .host_mc_add, should these APIs be per front port?

I think they would have to be per port in order to work in the
standalone case. You need some context to about which FID (or
equivalent) to tie the entry to.

> And if they should,
> should we leave the switch driver the task of reference counting them?

Well, I know you do not like the DSA layer "trying to be helpful", but I
can not imagine a case where some hardware would be interested in having
N callbacks for the same address when N ports are attached to the same
bridge. So I think it would be a service that DSA can provide.

>
> As to why does IEEE 1588 use 01-80-C2-00-00-0E for peer delay in its
> default profile for the L2 transport, I could only find this (quoting
> clause "F.3 Multicast MAC Addresses"):
>
>   To ensure peer delay measurements on ports blocked by (Rapid/Multiple)
>   Spanning Tree Protocols, a reserved address, 01-80-C2-00-00-0E, shall be
>   used as a Destination MAC Address for PTP peer delay mechanism messages.

Yeah that makes sense. You want to make sure that you are already synced
with your neighbor when a topology change occurs.

> Basically, unlike end-to-end delay measurements, peer delay is by
> definition point-to-point, which in an L2 network mean link-local.
> So if your LAN uses the peer delay measurement protocol, then all your
> switches must have some sort of PTP awareness. There are 2 cases really:
> - You are a Peer-to-peer Transparent Clock, so you must speak the Peer
>   Delay protocol yourself. Therefore you must have a Higher Layer Entity
>   that consumes the PTP PDUs. So for this case, it doesn't really make
>   any difference that the reserved bridge MAC address range is used.
> - You are an End-to-end Transparent Clock. These are an oddity of the
>   1588 standard which do not speak the Peer Delay protocol, but are
>   allowed to forward Peer Delay messages, and update the correctionField
>   of those Peer Delay messages that are Events (contain timestamps).
>   I would think that this is where having a reserved address for the
>   Peer Delay messages would make a difference.
>
>
> Then, there seems to be the second part of your request, that of address
> exclusivity. I think that for addresses that should explicitly be
> removed from the hardware data path, the tc-trap action was created for
> that exact purpose.

Maybe. Tc-trap is a weird creature in that it seems to violate the rule
that in order for an offload to be accepted into the kernel, a software
implementation needs to be in place first. The exception, as I
understand it, is for things that do not make sense in software
(e.g. cut-through switching).

Trapping a packet, blocking bridging, certainly seems like something
that can be done in software. The fact that this was not done means that
it is hard to add it now without potentially breaking existing
use-cases.

Now when a tc trap filter is added, the switch will trap it to the CPU,
the switchdev driver will _not_ set OFM. The result is that the frame is
software forwarded by the bridge. Even if you hack the driver to pretend
that it was HW forwarded (setting OFM), that will not stop the bridge
from forwarding it to foreign interfaces.

One idea could be to implement the software version of "trap" to mean
that sch_handle_ingress could return a signal to
__netif_receive_skb_core to skip any rx handlers and only consider
device specific protocol handlers. A kind of "RX_HANDLER_EXACT with a
twist". Again, this is hard to add after the fact, but you should at
least be able to get the old behavior with "skip_sw".

> However, the question really becomes: does PTP care
> enough to know it's running on top of a switchdev interface, so that it
> should claim exclusive ownership of that multicast group, or should it
> just care enough to say "hey, I'm here, I'm interested in it too"?

Ideally it would not know about switchdev, but maybe it should care
whether it is running on top of a bridge, since that has implications
for how PTP frames should be forwarded (or not).

> For one thing, the 01-80-c2-00-00-0e multicast group is shared, even
> IEEE 1588 says so:
>
>   NOTE 2: At its July 17−20, 2006 meeting the IEEE 802.1 Working Group
>   approved a motion that included the following text: "re-designate the
>   reserved address currently identified for use by 802.1AB as an address
>   that can be used by protocols that require the scope of the address to
>   be limited to an individual LAN" and "The reserved multicast address
>   that IEEE 1588 should use is 01-80-C2-00-00-0E." This address is not
>   reserved exclusively for PTP, but rather it is a shared address.
>
> and the bridge driver also supports this odd attribute:
>
>   group_fwd_mask MASK - set the group forward mask. This is the bitmask
>   that is applied to decide whether to forward incoming frames destined
>   to link-local addresses, ie addresses of the form 01:80:C2:00:00:0X
>   (defaults to 0, ie the bridge does not forward any link-local frames).
>
> which by the way seems to be in blatant contradiction of clause
> 8.13.4 Reserved MAC addresses: "Any frame with a destination address
> that is a reserved MAC address shall not be forwarded by a Bridge".

Well, the set of addresses that are reserved vary based on the type of
bridge (Tables 8-1, 8-2, and 8-3). So this setting allows you to choose
the applicable set.

Additionally, there are cases where you want to emulate a $10 switch (or
even hub) which does _no_ kind of filtering at all.

> So if ptp4l was to install a trap action by itself, it would potentially
> interact in unexpected ways with other uses of that address on the same
> system.

Yeah that is a problem. The easy way out is probably "that is the
admin's problem". A middle way could be to warn/abort if the trap is not
in place. You could also go all the way and reference count filters I
suppose.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device
  2021-02-24 11:43 ` [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device Vladimir Oltean
@ 2021-03-01 15:22   ` Ido Schimmel
  2022-02-22 11:21     ` Vladimir Oltean
  0 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2021-03-01 15:22 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Jiri Pirko, DENG Qingfang, Tobias Waldekranz, George McCollister,
	Vlad Yasevich, Roopa Prabhu, Nikolay Aleksandrov

On Wed, Feb 24, 2021 at 01:43:38PM +0200, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> The bridge device currently goes into promiscuous mode when it has an
> upper with a different MAC address than itself. But it could do better:
> it could sync the MAC addresses of its uppers to the software FDB, as
> local entries pointing to the bridge itself. This is compatible with
> switchdev, since drivers are now instructed to trap these MAC addresses
> to the CPU.
> 
> Note that the dev_uc_add API does not propagate VLAN ID, so this only
> works for VLAN-unaware bridges.

IOW, it breaks VLAN-aware bridges...

I understand that you do not want to track bridge uppers, but once you
look beyond L2 you will need to do it anyway.

Currently, you only care about getting packets with specific DMACs to
the CPU. With L3 offload you will need to send these packets to your
router block instead and track other attributes of these uppers such as
their MTU so that the hardware will know to generate MTU exceptions. In
addition, the hardware needs to know the MAC addresses of these uppers
so that it will rewrite the SMAC of forwarded packets.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device
  2021-03-01 15:22   ` Ido Schimmel
@ 2022-02-22 11:21     ` Vladimir Oltean
  2022-02-22 16:54       ` Ido Schimmel
  0 siblings, 1 reply; 34+ messages in thread
From: Vladimir Oltean @ 2022-02-22 11:21 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Jiri Pirko, DENG Qingfang, Tobias Waldekranz, George McCollister,
	Vlad Yasevich, Roopa Prabhu, Nikolay Aleksandrov

Hi Ido,

On Mon, 1 Mar 2021 at 17:22, Ido Schimmel <idosch@idosch.org> wrote:
>
> On Wed, Feb 24, 2021 at 01:43:38PM +0200, Vladimir Oltean wrote:
> > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> >
> > The bridge device currently goes into promiscuous mode when it has an
> > upper with a different MAC address than itself. But it could do better:
> > it could sync the MAC addresses of its uppers to the software FDB, as
> > local entries pointing to the bridge itself. This is compatible with
> > switchdev, since drivers are now instructed to trap these MAC addresses
> > to the CPU.
> >
> > Note that the dev_uc_add API does not propagate VLAN ID, so this only
> > works for VLAN-unaware bridges.
>
> IOW, it breaks VLAN-aware bridges...
>
> I understand that you do not want to track bridge uppers, but once you
> look beyond L2 you will need to do it anyway.
>
> Currently, you only care about getting packets with specific DMACs to
> the CPU. With L3 offload you will need to send these packets to your
> router block instead and track other attributes of these uppers such as
> their MTU so that the hardware will know to generate MTU exceptions. In
> addition, the hardware needs to know the MAC addresses of these uppers
> so that it will rewrite the SMAC of forwarded packets.

Ok, let's say I want to track bridge uppers. How can I track the changes to
those interfaces' secondary addresses, in a way that keeps the association
with their VLAN ID, if those uppers are VLAN interfaces?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device
  2022-02-22 11:21     ` Vladimir Oltean
@ 2022-02-22 16:54       ` Ido Schimmel
  2022-02-22 17:18         ` Vladimir Oltean
  0 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2022-02-22 16:54 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Jiri Pirko, DENG Qingfang, Tobias Waldekranz, George McCollister,
	Vlad Yasevich, Roopa Prabhu, Nikolay Aleksandrov

On Tue, Feb 22, 2022 at 01:21:53PM +0200, Vladimir Oltean wrote:
> Hi Ido,
> 
> On Mon, 1 Mar 2021 at 17:22, Ido Schimmel <idosch@idosch.org> wrote:
> >
> > On Wed, Feb 24, 2021 at 01:43:38PM +0200, Vladimir Oltean wrote:
> > > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > >
> > > The bridge device currently goes into promiscuous mode when it has an
> > > upper with a different MAC address than itself. But it could do better:
> > > it could sync the MAC addresses of its uppers to the software FDB, as
> > > local entries pointing to the bridge itself. This is compatible with
> > > switchdev, since drivers are now instructed to trap these MAC addresses
> > > to the CPU.
> > >
> > > Note that the dev_uc_add API does not propagate VLAN ID, so this only
> > > works for VLAN-unaware bridges.
> >
> > IOW, it breaks VLAN-aware bridges...
> >
> > I understand that you do not want to track bridge uppers, but once you
> > look beyond L2 you will need to do it anyway.
> >
> > Currently, you only care about getting packets with specific DMACs to
> > the CPU. With L3 offload you will need to send these packets to your
> > router block instead and track other attributes of these uppers such as
> > their MTU so that the hardware will know to generate MTU exceptions. In
> > addition, the hardware needs to know the MAC addresses of these uppers
> > so that it will rewrite the SMAC of forwarded packets.
> 
> Ok, let's say I want to track bridge uppers. How can I track the changes to
> those interfaces' secondary addresses, in a way that keeps the association
> with their VLAN ID, if those uppers are VLAN interfaces?

Hi,

I'm not sure what you mean by "secondary addresses", but the canonical
way that I'm familiar with of adding MAC addresses to a netdev is to use
macvlan uppers. For example:

# ip link add name br0 up type bridge vlan_filtering 1
# ip link add link br0 name br0.10 type vlan id 10
# ip link add link br0.10 name br0.10-v address 00:11:22:33:44:55 type macvlan mode private

keepalived uses it in VRRP virtual MAC mode (for example):
https://github.com/acassen/keepalived/blob/master/doc/NOTE_vrrp_vmac.txt

In the software data path, this will result in br0 transitioning to
promisc mode and passing all the packets to upper devices that will
filter them.

In the hardware data path, you can apply promisc mode by flooding to
your CPU port (I believe this is what you are trying to avoid) or
install an FDB entry <00:11:22:33:44:55,10> that points to your CPU
port.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device
  2022-02-22 16:54       ` Ido Schimmel
@ 2022-02-22 17:18         ` Vladimir Oltean
  2022-02-24 13:22           ` Ido Schimmel
  0 siblings, 1 reply; 34+ messages in thread
From: Vladimir Oltean @ 2022-02-22 17:18 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Jiri Pirko, DENG Qingfang, Tobias Waldekranz, George McCollister,
	Vlad Yasevich, Roopa Prabhu, Nikolay Aleksandrov

On Tue, Feb 22, 2022 at 06:54:13PM +0200, Ido Schimmel wrote:
> On Tue, Feb 22, 2022 at 01:21:53PM +0200, Vladimir Oltean wrote:
> > Hi Ido,
> > 
> > On Mon, 1 Mar 2021 at 17:22, Ido Schimmel <idosch@idosch.org> wrote:
> > >
> > > On Wed, Feb 24, 2021 at 01:43:38PM +0200, Vladimir Oltean wrote:
> > > > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > > >
> > > > The bridge device currently goes into promiscuous mode when it has an
> > > > upper with a different MAC address than itself. But it could do better:
> > > > it could sync the MAC addresses of its uppers to the software FDB, as
> > > > local entries pointing to the bridge itself. This is compatible with
> > > > switchdev, since drivers are now instructed to trap these MAC addresses
> > > > to the CPU.
> > > >
> > > > Note that the dev_uc_add API does not propagate VLAN ID, so this only
> > > > works for VLAN-unaware bridges.
> > >
> > > IOW, it breaks VLAN-aware bridges...
> > >
> > > I understand that you do not want to track bridge uppers, but once you
> > > look beyond L2 you will need to do it anyway.
> > >
> > > Currently, you only care about getting packets with specific DMACs to
> > > the CPU. With L3 offload you will need to send these packets to your
> > > router block instead and track other attributes of these uppers such as
> > > their MTU so that the hardware will know to generate MTU exceptions. In
> > > addition, the hardware needs to know the MAC addresses of these uppers
> > > so that it will rewrite the SMAC of forwarded packets.
> > 
> > Ok, let's say I want to track bridge uppers. How can I track the changes to
> > those interfaces' secondary addresses, in a way that keeps the association
> > with their VLAN ID, if those uppers are VLAN interfaces?
> 
> Hi,
> 
> I'm not sure what you mean by "secondary addresses", but the canonical
> way that I'm familiar with of adding MAC addresses to a netdev is to use
> macvlan uppers. For example:
> 
> # ip link add name br0 up type bridge vlan_filtering 1
> # ip link add link br0 name br0.10 type vlan id 10
> # ip link add link br0.10 name br0.10-v address 00:11:22:33:44:55 type macvlan mode private
> 
> keepalived uses it in VRRP virtual MAC mode (for example):
> https://github.com/acassen/keepalived/blob/master/doc/NOTE_vrrp_vmac.txt
> 
> In the software data path, this will result in br0 transitioning to
> promisc mode and passing all the packets to upper devices that will
> filter them.
> 
> In the hardware data path, you can apply promisc mode by flooding to
> your CPU port (I believe this is what you are trying to avoid) or
> install an FDB entry <00:11:22:33:44:55,10> that points to your CPU
> port.

Maybe the terminology is not the best, but by secondary addresses I mean
struct net_device :: uc and mc. To my knowledge, the MAC address of
vlan/macvlan uppers is not the only way in which these lists can be
populated. There is also AF_PACKET UAPI for PACKET_MR_MULTICAST and
PACKET_MR_UNICAST, and this ends up calling dev_mc_add() and
dev_uc_add(). User space may use this API to add a secondary address to
a VLAN upper interface of a bridge.

The question was how can the bridge get notified of changes to those 2
lists of its upper interfaces?

If it monitors NETDEV_CHANGEUPPER it has access to those lists only when
an upper joins or leaves.
If it monitors NETDEV_CHANGEADDR, it gets notified only to changes on
the primary addresses of the uppers (dev_addr and dev_addrs).
If it implements ndo_set_rx_mode (this patch), it has all the addresses
synced to it, but they lack a VLAN ID, because every address lacks
further information about which device added it.

If there's logic in the mlxsw driver that does this, unfortunately I
haven't found it.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device
  2022-02-22 17:18         ` Vladimir Oltean
@ 2022-02-24 13:22           ` Ido Schimmel
  2022-02-24 13:52             ` Vladimir Oltean
  0 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2022-02-24 13:22 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Jiri Pirko, DENG Qingfang, Tobias Waldekranz, George McCollister,
	Vlad Yasevich, Roopa Prabhu, Nikolay Aleksandrov

On Tue, Feb 22, 2022 at 07:18:10PM +0200, Vladimir Oltean wrote:
> On Tue, Feb 22, 2022 at 06:54:13PM +0200, Ido Schimmel wrote:
> > On Tue, Feb 22, 2022 at 01:21:53PM +0200, Vladimir Oltean wrote:
> > > Hi Ido,
> > > 
> > > On Mon, 1 Mar 2021 at 17:22, Ido Schimmel <idosch@idosch.org> wrote:
> > > >
> > > > On Wed, Feb 24, 2021 at 01:43:38PM +0200, Vladimir Oltean wrote:
> > > > > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > > > >
> > > > > The bridge device currently goes into promiscuous mode when it has an
> > > > > upper with a different MAC address than itself. But it could do better:
> > > > > it could sync the MAC addresses of its uppers to the software FDB, as
> > > > > local entries pointing to the bridge itself. This is compatible with
> > > > > switchdev, since drivers are now instructed to trap these MAC addresses
> > > > > to the CPU.
> > > > >
> > > > > Note that the dev_uc_add API does not propagate VLAN ID, so this only
> > > > > works for VLAN-unaware bridges.
> > > >
> > > > IOW, it breaks VLAN-aware bridges...
> > > >
> > > > I understand that you do not want to track bridge uppers, but once you
> > > > look beyond L2 you will need to do it anyway.
> > > >
> > > > Currently, you only care about getting packets with specific DMACs to
> > > > the CPU. With L3 offload you will need to send these packets to your
> > > > router block instead and track other attributes of these uppers such as
> > > > their MTU so that the hardware will know to generate MTU exceptions. In
> > > > addition, the hardware needs to know the MAC addresses of these uppers
> > > > so that it will rewrite the SMAC of forwarded packets.
> > > 
> > > Ok, let's say I want to track bridge uppers. How can I track the changes to
> > > those interfaces' secondary addresses, in a way that keeps the association
> > > with their VLAN ID, if those uppers are VLAN interfaces?
> > 
> > Hi,
> > 
> > I'm not sure what you mean by "secondary addresses", but the canonical
> > way that I'm familiar with of adding MAC addresses to a netdev is to use
> > macvlan uppers. For example:
> > 
> > # ip link add name br0 up type bridge vlan_filtering 1
> > # ip link add link br0 name br0.10 type vlan id 10
> > # ip link add link br0.10 name br0.10-v address 00:11:22:33:44:55 type macvlan mode private
> > 
> > keepalived uses it in VRRP virtual MAC mode (for example):
> > https://github.com/acassen/keepalived/blob/master/doc/NOTE_vrrp_vmac.txt
> > 
> > In the software data path, this will result in br0 transitioning to
> > promisc mode and passing all the packets to upper devices that will
> > filter them.
> > 
> > In the hardware data path, you can apply promisc mode by flooding to
> > your CPU port (I believe this is what you are trying to avoid) or
> > install an FDB entry <00:11:22:33:44:55,10> that points to your CPU
> > port.
> 
> Maybe the terminology is not the best, but by secondary addresses I mean
> struct net_device :: uc and mc. To my knowledge, the MAC address of
> vlan/macvlan uppers is not the only way in which these lists can be
> populated. There is also AF_PACKET UAPI for PACKET_MR_MULTICAST and
> PACKET_MR_UNICAST, and this ends up calling dev_mc_add() and
> dev_uc_add(). User space may use this API to add a secondary address to
> a VLAN upper interface of a bridge.

OK, I see the problem... So you want the bridge to support
'IFF_UNICAST_FLT' by installing local FDB entries? I see two potential
problems:

1. For VLAN-unaware bridges this is trivial as VLAN information is of no
use. For VLAN-aware bridges we either need to communicate VLAN
information from upper layers or install a local FDB entry per each
configured VLAN (wasteful...). Note that VLAN information will not
always be available (in PACKET_MR_UNICAST, for example), in which case a
local FDB entry will need to be configured per each existing VLAN in
order to maintain existing behavior. Which lead to me think about the
second problem...

2. The bigger problem that I see is that if the bridge starts supporting
'IFF_UNICAST_FLT' by installing local FDB entries, then packets that
were previously locally received and flooded will only be locally
received. Only locally receiving them makes sense, but I don't know what
will break if we change the existing behavior... Maybe this needs to be
guarded by a new bridge option?

> 
> The question was how can the bridge get notified of changes to those 2
> lists of its upper interfaces?
> 
> If it monitors NETDEV_CHANGEUPPER it has access to those lists only when
> an upper joins or leaves.
> If it monitors NETDEV_CHANGEADDR, it gets notified only to changes on
> the primary addresses of the uppers (dev_addr and dev_addrs).
> If it implements ndo_set_rx_mode (this patch), it has all the addresses
> synced to it, but they lack a VLAN ID, because every address lacks
> further information about which device added it.
> 
> If there's logic in the mlxsw driver that does this, unfortunately I
> haven't found it.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device
  2022-02-24 13:22           ` Ido Schimmel
@ 2022-02-24 13:52             ` Vladimir Oltean
  2022-03-01 16:20               ` Ido Schimmel
  0 siblings, 1 reply; 34+ messages in thread
From: Vladimir Oltean @ 2022-02-24 13:52 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Jiri Pirko, DENG Qingfang, Tobias Waldekranz, George McCollister,
	Vlad Yasevich, Roopa Prabhu, Nikolay Aleksandrov

On Thu, Feb 24, 2022 at 03:22:31PM +0200, Ido Schimmel wrote:
> On Tue, Feb 22, 2022 at 07:18:10PM +0200, Vladimir Oltean wrote:
> > On Tue, Feb 22, 2022 at 06:54:13PM +0200, Ido Schimmel wrote:
> > > On Tue, Feb 22, 2022 at 01:21:53PM +0200, Vladimir Oltean wrote:
> > > > Hi Ido,
> > > > 
> > > > On Mon, 1 Mar 2021 at 17:22, Ido Schimmel <idosch@idosch.org> wrote:
> > > > >
> > > > > On Wed, Feb 24, 2021 at 01:43:38PM +0200, Vladimir Oltean wrote:
> > > > > > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > > > > >
> > > > > > The bridge device currently goes into promiscuous mode when it has an
> > > > > > upper with a different MAC address than itself. But it could do better:
> > > > > > it could sync the MAC addresses of its uppers to the software FDB, as
> > > > > > local entries pointing to the bridge itself. This is compatible with
> > > > > > switchdev, since drivers are now instructed to trap these MAC addresses
> > > > > > to the CPU.
> > > > > >
> > > > > > Note that the dev_uc_add API does not propagate VLAN ID, so this only
> > > > > > works for VLAN-unaware bridges.
> > > > >
> > > > > IOW, it breaks VLAN-aware bridges...
> > > > >
> > > > > I understand that you do not want to track bridge uppers, but once you
> > > > > look beyond L2 you will need to do it anyway.
> > > > >
> > > > > Currently, you only care about getting packets with specific DMACs to
> > > > > the CPU. With L3 offload you will need to send these packets to your
> > > > > router block instead and track other attributes of these uppers such as
> > > > > their MTU so that the hardware will know to generate MTU exceptions. In
> > > > > addition, the hardware needs to know the MAC addresses of these uppers
> > > > > so that it will rewrite the SMAC of forwarded packets.
> > > > 
> > > > Ok, let's say I want to track bridge uppers. How can I track the changes to
> > > > those interfaces' secondary addresses, in a way that keeps the association
> > > > with their VLAN ID, if those uppers are VLAN interfaces?
> > > 
> > > Hi,
> > > 
> > > I'm not sure what you mean by "secondary addresses", but the canonical
> > > way that I'm familiar with of adding MAC addresses to a netdev is to use
> > > macvlan uppers. For example:
> > > 
> > > # ip link add name br0 up type bridge vlan_filtering 1
> > > # ip link add link br0 name br0.10 type vlan id 10
> > > # ip link add link br0.10 name br0.10-v address 00:11:22:33:44:55 type macvlan mode private
> > > 
> > > keepalived uses it in VRRP virtual MAC mode (for example):
> > > https://github.com/acassen/keepalived/blob/master/doc/NOTE_vrrp_vmac.txt
> > > 
> > > In the software data path, this will result in br0 transitioning to
> > > promisc mode and passing all the packets to upper devices that will
> > > filter them.
> > > 
> > > In the hardware data path, you can apply promisc mode by flooding to
> > > your CPU port (I believe this is what you are trying to avoid) or
> > > install an FDB entry <00:11:22:33:44:55,10> that points to your CPU
> > > port.
> > 
> > Maybe the terminology is not the best, but by secondary addresses I mean
> > struct net_device :: uc and mc. To my knowledge, the MAC address of
> > vlan/macvlan uppers is not the only way in which these lists can be
> > populated. There is also AF_PACKET UAPI for PACKET_MR_MULTICAST and
> > PACKET_MR_UNICAST, and this ends up calling dev_mc_add() and
> > dev_uc_add(). User space may use this API to add a secondary address to
> > a VLAN upper interface of a bridge.
> 
> OK, I see the problem... So you want the bridge to support
> 'IFF_UNICAST_FLT' by installing local FDB entries? I see two potential
> problems:
> 
> 1. For VLAN-unaware bridges this is trivial as VLAN information is of no
> use. For VLAN-aware bridges we either need to communicate VLAN
> information from upper layers or install a local FDB entry per each
> configured VLAN (wasteful...). Note that VLAN information will not
> always be available (in PACKET_MR_UNICAST, for example), in which case a
> local FDB entry will need to be configured per each existing VLAN in
> order to maintain existing behavior. Which lead to me think about the
> second problem...
>
> 2. The bigger problem that I see is that if the bridge starts supporting
> 'IFF_UNICAST_FLT' by installing local FDB entries, then packets that
> were previously locally received and flooded will only be locally
> received. Only locally receiving them makes sense, but I don't know what
> will break if we change the existing behavior... Maybe this needs to be
> guarded by a new bridge option?

I think it boils down to whether PACKET_MR_UNICAST on br0 is equivalent to
'bridge fdb add dev br0 self permanent' or not. Theoretically, the
former means "if a packet enters the local termination path of br0,
don't drop it", while the other means "direct this MAC DA only towards
the local termination path of br0". I.o.w. the difference between "copy
to CPU" and "trap to CPU".

If we agree they aren't equivalent, and we also agree that a macvlan on
top of a bridge wants "trap to CPU" instead of "copy to CPU", I think
the only logical conclusion is that the communication mechanism between
the bridge and the macvlan that we're looking for doesn't exist -
dev_uc_add() does something slightly different.

Which is why I want to better understand your idea of having the bridge
track upper interfaces.

Essentially, it isn't the bridge local FDB entries that I have a problem with.
"Locally terminated packets that are also flooded on other bridge ports"
is a problem that DSA users have tried to get rid of for years, I didn't
hear a single complaint after we started fixing that. To me, a bridge
VLAN is by definition an L2 broadcast domain and MAC addresses should be
unique. I can't imagine what would break if we'd make the bridge deliver
the packets only to their known destination.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device
  2022-02-24 13:52             ` Vladimir Oltean
@ 2022-03-01 16:20               ` Ido Schimmel
  2022-03-02 11:17                 ` Vladimir Oltean
  0 siblings, 1 reply; 34+ messages in thread
From: Ido Schimmel @ 2022-03-01 16:20 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Jiri Pirko, DENG Qingfang, Tobias Waldekranz, George McCollister,
	Vlad Yasevich, Roopa Prabhu, Nikolay Aleksandrov

On Thu, Feb 24, 2022 at 03:52:41PM +0200, Vladimir Oltean wrote:
> On Thu, Feb 24, 2022 at 03:22:31PM +0200, Ido Schimmel wrote:
> > On Tue, Feb 22, 2022 at 07:18:10PM +0200, Vladimir Oltean wrote:
> > > On Tue, Feb 22, 2022 at 06:54:13PM +0200, Ido Schimmel wrote:
> > > > On Tue, Feb 22, 2022 at 01:21:53PM +0200, Vladimir Oltean wrote:
> > > > > Hi Ido,
> > > > > 
> > > > > On Mon, 1 Mar 2021 at 17:22, Ido Schimmel <idosch@idosch.org> wrote:
> > > > > >
> > > > > > On Wed, Feb 24, 2021 at 01:43:38PM +0200, Vladimir Oltean wrote:
> > > > > > > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > > > > > >
> > > > > > > The bridge device currently goes into promiscuous mode when it has an
> > > > > > > upper with a different MAC address than itself. But it could do better:
> > > > > > > it could sync the MAC addresses of its uppers to the software FDB, as
> > > > > > > local entries pointing to the bridge itself. This is compatible with
> > > > > > > switchdev, since drivers are now instructed to trap these MAC addresses
> > > > > > > to the CPU.
> > > > > > >
> > > > > > > Note that the dev_uc_add API does not propagate VLAN ID, so this only
> > > > > > > works for VLAN-unaware bridges.
> > > > > >
> > > > > > IOW, it breaks VLAN-aware bridges...
> > > > > >
> > > > > > I understand that you do not want to track bridge uppers, but once you
> > > > > > look beyond L2 you will need to do it anyway.
> > > > > >
> > > > > > Currently, you only care about getting packets with specific DMACs to
> > > > > > the CPU. With L3 offload you will need to send these packets to your
> > > > > > router block instead and track other attributes of these uppers such as
> > > > > > their MTU so that the hardware will know to generate MTU exceptions. In
> > > > > > addition, the hardware needs to know the MAC addresses of these uppers
> > > > > > so that it will rewrite the SMAC of forwarded packets.
> > > > > 
> > > > > Ok, let's say I want to track bridge uppers. How can I track the changes to
> > > > > those interfaces' secondary addresses, in a way that keeps the association
> > > > > with their VLAN ID, if those uppers are VLAN interfaces?
> > > > 
> > > > Hi,
> > > > 
> > > > I'm not sure what you mean by "secondary addresses", but the canonical
> > > > way that I'm familiar with of adding MAC addresses to a netdev is to use
> > > > macvlan uppers. For example:
> > > > 
> > > > # ip link add name br0 up type bridge vlan_filtering 1
> > > > # ip link add link br0 name br0.10 type vlan id 10
> > > > # ip link add link br0.10 name br0.10-v address 00:11:22:33:44:55 type macvlan mode private
> > > > 
> > > > keepalived uses it in VRRP virtual MAC mode (for example):
> > > > https://github.com/acassen/keepalived/blob/master/doc/NOTE_vrrp_vmac.txt
> > > > 
> > > > In the software data path, this will result in br0 transitioning to
> > > > promisc mode and passing all the packets to upper devices that will
> > > > filter them.
> > > > 
> > > > In the hardware data path, you can apply promisc mode by flooding to
> > > > your CPU port (I believe this is what you are trying to avoid) or
> > > > install an FDB entry <00:11:22:33:44:55,10> that points to your CPU
> > > > port.
> > > 
> > > Maybe the terminology is not the best, but by secondary addresses I mean
> > > struct net_device :: uc and mc. To my knowledge, the MAC address of
> > > vlan/macvlan uppers is not the only way in which these lists can be
> > > populated. There is also AF_PACKET UAPI for PACKET_MR_MULTICAST and
> > > PACKET_MR_UNICAST, and this ends up calling dev_mc_add() and
> > > dev_uc_add(). User space may use this API to add a secondary address to
> > > a VLAN upper interface of a bridge.
> > 
> > OK, I see the problem... So you want the bridge to support
> > 'IFF_UNICAST_FLT' by installing local FDB entries? I see two potential
> > problems:
> > 
> > 1. For VLAN-unaware bridges this is trivial as VLAN information is of no
> > use. For VLAN-aware bridges we either need to communicate VLAN
> > information from upper layers or install a local FDB entry per each
> > configured VLAN (wasteful...). Note that VLAN information will not
> > always be available (in PACKET_MR_UNICAST, for example), in which case a
> > local FDB entry will need to be configured per each existing VLAN in
> > order to maintain existing behavior. Which lead to me think about the
> > second problem...
> >
> > 2. The bigger problem that I see is that if the bridge starts supporting
> > 'IFF_UNICAST_FLT' by installing local FDB entries, then packets that
> > were previously locally received and flooded will only be locally
> > received. Only locally receiving them makes sense, but I don't know what
> > will break if we change the existing behavior... Maybe this needs to be
> > guarded by a new bridge option?
> 
> I think it boils down to whether PACKET_MR_UNICAST on br0 is equivalent to
> 'bridge fdb add dev br0 self permanent' or not. Theoretically, the
> former means "if a packet enters the local termination path of br0,
> don't drop it", 

Trying to understand the first part of the sentence, are you saying that
if user space decides to use this interface, then it is up to it to
ensure that packets with the given unicast address are terminated on the
bridge? That is, it is up to user space to install the necessary
permanent FDB record? I think that is fair, it is just that right now
this operation does something else and causes all the packets forwarded
via the bridge to be locally terminated. Most of them will then be
dropped by upper layers. I don't think this was the author's intention,
it seems like an unfortunate side effect of current implementation. This
behavior is even more ridiculous when you take hardware offload into
account, as usually the CPU is unable to handle all these packets.

> while the other means "direct this MAC DA only towards
> the local termination path of br0".

This I agree with.

> I.o.w. the difference between "copy to CPU" and "trap to CPU".
> 
> If we agree they aren't equivalent, and we also agree that a macvlan on
> top of a bridge wants "trap to CPU" instead of "copy to CPU", I think
> the only logical conclusion is that the communication mechanism between
> the bridge and the macvlan that we're looking for doesn't exist -
> dev_uc_add() does something slightly different.
> 
> Which is why I want to better understand your idea of having the bridge
> track upper interfaces.

In my case these upper interfaces are actually router interfaces and I'm
interested in their MAC (in addition to other attributes) to know which
FDB entry to program towards the router port (your CPU port) on ingress
and which SA to use on egress (the hardware has limitations on SAs).

I'm pretty sure bridge maintainers will not agree to have this code in
the bridge driver in which case you can implement this in DSA. Should be
quite simple as I guess most configurations use VLANs/MACVLANs uppers.

> 
> Essentially, it isn't the bridge local FDB entries that I have a problem with.
> "Locally terminated packets that are also flooded on other bridge ports"
> is a problem that DSA users have tried to get rid of for years, I didn't
> hear a single complaint after we started fixing that. To me, a bridge
> VLAN is by definition an L2 broadcast domain and MAC addresses should be
> unique. I can't imagine what would break if we'd make the bridge deliver
> the packets only to their known destination.

I agree, can't come up with an example

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device
  2022-03-01 16:20               ` Ido Schimmel
@ 2022-03-02 11:17                 ` Vladimir Oltean
  0 siblings, 0 replies; 34+ messages in thread
From: Vladimir Oltean @ 2022-03-02 11:17 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Jiri Pirko, DENG Qingfang, Tobias Waldekranz, George McCollister,
	Vlad Yasevich, Roopa Prabhu, Nikolay Aleksandrov

On Tue, Mar 01, 2022 at 06:20:39PM +0200, Ido Schimmel wrote:
> > > OK, I see the problem... So you want the bridge to support
> > > 'IFF_UNICAST_FLT' by installing local FDB entries? I see two potential
> > > problems:
> > > 
> > > 1. For VLAN-unaware bridges this is trivial as VLAN information is of no
> > > use. For VLAN-aware bridges we either need to communicate VLAN
> > > information from upper layers or install a local FDB entry per each
> > > configured VLAN (wasteful...). Note that VLAN information will not
> > > always be available (in PACKET_MR_UNICAST, for example), in which case a
> > > local FDB entry will need to be configured per each existing VLAN in
> > > order to maintain existing behavior. Which lead to me think about the
> > > second problem...
> > >
> > > 2. The bigger problem that I see is that if the bridge starts supporting
> > > 'IFF_UNICAST_FLT' by installing local FDB entries, then packets that
> > > were previously locally received and flooded will only be locally
> > > received. Only locally receiving them makes sense, but I don't know what
> > > will break if we change the existing behavior... Maybe this needs to be
> > > guarded by a new bridge option?
> > 
> > I think it boils down to whether PACKET_MR_UNICAST on br0 is equivalent to
> > 'bridge fdb add dev br0 self permanent' or not. Theoretically, the
> > former means "if a packet enters the local termination path of br0,
> > don't drop it", 
> 
> Trying to understand the first part of the sentence, are you saying that
> if user space decides to use this interface, then it is up to it to
> ensure that packets with the given unicast address are terminated on the
> bridge? That is, it is up to user space to install the necessary
> permanent FDB record?

This first part of the sentence is just wondering whether it is even
sane to make the bridge driver essentially provide an implementation for
PACKET_MR_UNICAST, and translate that into a local FDB entry which means
something else. User space can already install a local FDB entry with
the MAC address of the upper interface, and this will behave closer to
what is expected.

If the bridge ever implements the support for PACKET_MR_UNICAST, a new
FDB entry flag is probably needed, for local reception. If the MAC
address added with PACKET_MR_UNICAST is new, the bridge would create an
entry with fdb->dst = NULL. If it already exists, it would keep the
existing fdb->dst and just mark the local reception flag as true.
This is to comply with the "copy to CPU" semantics instead of altering
the forwarding destination. I'm not sure whether there are real use
cases beyond just complying to expected semantics.

> I think that is fair, it is just that right now this operation does
> something else and causes all the packets forwarded via the bridge to
> be locally terminated. Most of them will then be dropped by upper
> layers. I don't think this was the author's intention, it seems like
> an unfortunate side effect of current implementation.

Do you mean here that the "something else" is to turn on promiscuous
mode for the bridge, and this makes local_rcv = true for every packet in
br_handle_frame_finish?

Yes, that is a problem. The dev_uc_add() calls will keep the bridge's
promiscuity at 1, with no way to turn it back to 0 from user space.
To get rid of this we'd need to declare IFF_UNICAST_FLT at the very
least.

> This behavior is even more ridiculous when you take hardware offload
> into account, as usually the CPU is unable to handle all these
> packets.

If we keep the analogy that a PACKET_MR_UNICAST means "copy MAC address
X to CPU", then IFF_PROMISC means "copy all packets to CPU", no?

So I wouldn't say the behavior is even more ridiculous, it is just as
ridiculous, just on a different level. And maybe not even "ridiculous",
just "highly sub-optimal". Ridiculous would be to not comply to the
expected behavior.

> > while the other means "direct this MAC DA only towards
> > the local termination path of br0".
> 
> This I agree with.
> 
> > I.o.w. the difference between "copy to CPU" and "trap to CPU".
> > 
> > If we agree they aren't equivalent, and we also agree that a macvlan on
> > top of a bridge wants "trap to CPU" instead of "copy to CPU", I think
> > the only logical conclusion is that the communication mechanism between
> > the bridge and the macvlan that we're looking for doesn't exist -
> > dev_uc_add() does something slightly different.
> > 
> > Which is why I want to better understand your idea of having the bridge
> > track upper interfaces.
> 
> In my case these upper interfaces are actually router interfaces and I'm
> interested in their MAC (in addition to other attributes) to know which
> FDB entry to program towards the router port (your CPU port) on ingress
> and which SA to use on egress (the hardware has limitations on SAs).
> 
> I'm pretty sure bridge maintainers will not agree to have this code in
> the bridge driver in which case you can implement this in DSA. Should be
> quite simple as I guess most configurations use VLANs/MACVLANs uppers.

Yes, but this will record only the dev_addr of those upper interfaces.
It would not be fully compliant with what user space can ask for.

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2022-03-02 11:19 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-24 11:43 [RFC PATCH v2 net-next 00/17] RX filtering in DSA Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 01/17] net: dsa: reference count the host mdb addresses Vladimir Oltean
2021-02-26  9:20   ` Tobias Waldekranz
2021-02-24 11:43 ` [RFC PATCH v2 net-next 02/17] net: dsa: reference count the host fdb addresses Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 03/17] net: dsa: install the host MDB and FDB entries in the master's RX filter Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 04/17] net: dsa: install the port MAC addresses as host fdb entries Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 05/17] net: bridge: implement unicast filtering for the bridge device Vladimir Oltean
2021-03-01 15:22   ` Ido Schimmel
2022-02-22 11:21     ` Vladimir Oltean
2022-02-22 16:54       ` Ido Schimmel
2022-02-22 17:18         ` Vladimir Oltean
2022-02-24 13:22           ` Ido Schimmel
2022-02-24 13:52             ` Vladimir Oltean
2022-03-01 16:20               ` Ido Schimmel
2022-03-02 11:17                 ` Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 06/17] net: dsa: add addresses obtained from RX filtering to host addresses Vladimir Oltean
2021-02-26 10:59   ` Tobias Waldekranz
2021-02-26 13:28     ` Vladimir Oltean
2021-02-26 22:44       ` Tobias Waldekranz
2021-02-24 11:43 ` [RFC PATCH v2 net-next 07/17] net: bridge: switchdev: refactor br_switchdev_fdb_notify Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 08/17] net: bridge: switchdev: include local flag in FDB notifications Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 09/17] net: bridge: switchdev: send FDB notifications for host addresses Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 10/17] net: dsa: include bridge addresses which are local in the host fdb list Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 11/17] net: dsa: include fdb entries pointing to bridge " Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 12/17] net: dsa: sync static FDB entries on foreign interfaces to hardware Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 13/17] net: dsa: mv88e6xxx: Request assisted learning on CPU port Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 14/17] net: dsa: replay port and host-joined mdb entries when joining the bridge Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 15/17] net: dsa: replay port and local fdb " Vladimir Oltean
2021-02-26 12:23   ` Tobias Waldekranz
2021-02-26 18:08     ` Vladimir Oltean
2021-02-24 11:43 ` [RFC PATCH v2 net-next 16/17] net: bridge: switchdev: let drivers inform which bridge ports are offloaded Vladimir Oltean
2021-02-24 14:25   ` kernel test robot
2021-02-24 11:43 ` [RFC PATCH v2 net-next 17/17] net: bridge: offloaded ports are always promiscuous Vladimir Oltean
2021-02-24 15:21   ` kernel test robot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.