All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 net-next 0/5] bridge: dsa: switchdev: mv88e6xxx: Implement bridge flood flags
@ 2022-03-17  6:50 Mattias Forsblad
  2022-03-17  6:50 ` [PATCH 1/5] switchdev: Add local_receive attribute Mattias Forsblad
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Mattias Forsblad @ 2022-03-17  6:50 UTC (permalink / raw)
  To: netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz,
	Mattias Forsblad

Greetings,

This series implements new bridge flood flags
{flood,mcast_flood,bcast_flood}
and HW offloading for Marvell mv88e6xxx.

When using a non-VLAN filtering bridge we want to be able to limit
traffic to the CPU port to lessen the CPU load. This is specially
important when we have disabled learning on user ports.

A sample configuration could be something like this:

       br0
      /   \
   swp0   swp1

ip link add dev br0 type bridge stp_state 0 vlan_filtering 0
ip link set swp0 master br0
ip link set swp1 master br0
ip link set swp0 type bridge_slave learning off
ip link set swp1 type bridge_slave learning off
ip link set swp0 up
ip link set swp1 up
ip link set br0 type bridge flood 0 mcast_flood 0 bcast_flood 0
ip link set br0 up

To further explain the reasoning for this please refer to post by
Tobias Waldekranz:
https://lore.kernel.org/netdev/87ilsxo052.fsf@waldekranz.com/

The first part(1,2) of the series implements the flags for the SW bridge
and the second part(3) the DSA infrastructure. Part (4) implements
offloading of this flag to HW for mv88e6xxx, which uses the
port vlan table to restrict the ingress from user ports
to the CPU port when all of the flag is cleared. Part (5) adds
selftests for these flags.

v2 -> v3:
  - Fixed compile warnings (Jakub Kicinski, lkp@intel.com)
  
v1 -> v2:
  - Split patch series in a more consistent way (Ido Shimmel)
  - Drop sysfs implementation (Ido, Nikolay Aleksandrov)
  - Change to use the boolopt API (Nikolay)
  - Drop ioctl implementation (Nikolay)
  - Split and rename local_receive to match bridge_slave
    {flood,mcast_flood,bcast_flood} (Ido)
  - Only handle the flags at apropiate places in the hot-path (Ido)
  - Add selftest (Ido)
  
Mattias Forsblad (5):
  switchdev: Add local_receive attribute
  net: bridge: Implement bridge flood flag
  dsa: Handle the flood flag in the DSA layer.
  mv88e6xxx: Offload the flood flag
  selftest: Add bridge flood flag tests

 drivers/net/dsa/mv88e6xxx/chip.c              |  45 ++++-
 include/linux/if_bridge.h                     |   6 +
 include/net/dsa.h                             |   7 +
 include/net/switchdev.h                       |   1 +
 include/uapi/linux/if_bridge.h                |   9 +-
 net/bridge/br.c                               |  46 +++++
 net/bridge/br_device.c                        |   3 +
 net/bridge/br_input.c                         |  23 ++-
 net/bridge/br_private.h                       |   4 +
 net/dsa/dsa_priv.h                            |   2 +
 net/dsa/slave.c                               |  18 ++
 .../testing/selftests/net/forwarding/Makefile |   1 +
 .../selftests/net/forwarding/bridge_flood.sh  | 169 ++++++++++++++++++
 tools/testing/selftests/net/forwarding/lib.sh |   8 +
 14 files changed, 335 insertions(+), 7 deletions(-)
 create mode 100755 tools/testing/selftests/net/forwarding/bridge_flood.sh

-- 
2.25.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/5] switchdev: Add local_receive attribute
  2022-03-17  6:50 [PATCH v3 net-next 0/5] bridge: dsa: switchdev: mv88e6xxx: Implement bridge flood flags Mattias Forsblad
@ 2022-03-17  6:50 ` Mattias Forsblad
  2022-03-17  6:50 ` [PATCH 2/5] net: bridge: Implement bridge flood flag Mattias Forsblad
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Mattias Forsblad @ 2022-03-17  6:50 UTC (permalink / raw)
  To: netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz,
	Mattias Forsblad

This commit adds the local_receive switchdev attribute in preparation
for bridge usage.

Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
---
 include/net/switchdev.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 3e424d40fae3..f4c1671c2561 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -28,6 +28,7 @@ enum switchdev_attr_id {
 	SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED,
 	SWITCHDEV_ATTR_ID_BRIDGE_MROUTER,
 	SWITCHDEV_ATTR_ID_MRP_PORT_ROLE,
+	SWITCHDEV_ATTR_ID_BRIDGE_FLOOD,
 };
 
 struct switchdev_brport_flags {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17  6:50 [PATCH v3 net-next 0/5] bridge: dsa: switchdev: mv88e6xxx: Implement bridge flood flags Mattias Forsblad
  2022-03-17  6:50 ` [PATCH 1/5] switchdev: Add local_receive attribute Mattias Forsblad
@ 2022-03-17  6:50 ` Mattias Forsblad
  2022-03-17  9:07   ` Joachim Wiberg
  2022-03-17 10:11   ` Nikolay Aleksandrov
  2022-03-17  6:50 ` [PATCH 3/5] dsa: Handle the flood flag in the DSA layer Mattias Forsblad
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 23+ messages in thread
From: Mattias Forsblad @ 2022-03-17  6:50 UTC (permalink / raw)
  To: netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz,
	Mattias Forsblad

This patch implements the bridge flood flags. There are three different
flags matching unicast, multicast and broadcast. When the corresponding
flag is cleared packets received on bridge ports will not be flooded
towards the bridge.
This makes is possible to only forward selected traffic between the
port members of the bridge.

Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
---
 include/linux/if_bridge.h      |  6 +++++
 include/uapi/linux/if_bridge.h |  9 ++++++-
 net/bridge/br.c                | 45 ++++++++++++++++++++++++++++++++++
 net/bridge/br_device.c         |  3 +++
 net/bridge/br_input.c          | 23 ++++++++++++++---
 net/bridge/br_private.h        |  4 +++
 6 files changed, 85 insertions(+), 5 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 3aae023a9353..fa8e000a6fb9 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -157,6 +157,7 @@ static inline int br_vlan_get_info_rcu(const struct net_device *dev, u16 vid,
 struct net_device *br_fdb_find_port(const struct net_device *br_dev,
 				    const unsigned char *addr,
 				    __u16 vid);
+bool br_flood_enabled(const struct net_device *dev);
 void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
 bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
 u8 br_port_get_stp_state(const struct net_device *dev);
@@ -170,6 +171,11 @@ br_fdb_find_port(const struct net_device *br_dev,
 	return NULL;
 }
 
+static inline bool br_flood_enabled(const struct net_device *dev)
+{
+	return true;
+}
+
 static inline void br_fdb_clear_offload(const struct net_device *dev, u16 vid)
 {
 }
diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index 2711c3522010..765ed70c9b28 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -72,6 +72,7 @@ struct __bridge_info {
 	__u32 tcn_timer_value;
 	__u32 topology_change_timer_value;
 	__u32 gc_timer_value;
+	__u8 flood;
 };
 
 struct __port_info {
@@ -752,13 +753,19 @@ struct br_mcast_stats {
 /* bridge boolean options
  * BR_BOOLOPT_NO_LL_LEARN - disable learning from link-local packets
  * BR_BOOLOPT_MCAST_VLAN_SNOOPING - control vlan multicast snooping
+ * BR_BOOLOPT_FLOOD - control bridge flood flag
+ * BR_BOOLOPT_MCAST_FLOOD - control bridge multicast flood flag
+ * BR_BOOLOPT_BCAST_FLOOD - control bridge broadcast flood flag
  *
  * IMPORTANT: if adding a new option do not forget to handle
- *            it in br_boolopt_toggle/get and bridge sysfs
+ *            it in br_boolopt_toggle/get
  */
 enum br_boolopt_id {
 	BR_BOOLOPT_NO_LL_LEARN,
 	BR_BOOLOPT_MCAST_VLAN_SNOOPING,
+	BR_BOOLOPT_FLOOD,
+	BR_BOOLOPT_MCAST_FLOOD,
+	BR_BOOLOPT_BCAST_FLOOD,
 	BR_BOOLOPT_MAX
 };
 
diff --git a/net/bridge/br.c b/net/bridge/br.c
index b1dea3febeea..63a17bed6c63 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -265,6 +265,11 @@ int br_boolopt_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on,
 	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
 		err = br_multicast_toggle_vlan_snooping(br, on, extack);
 		break;
+	case BR_BOOLOPT_FLOOD:
+	case BR_BOOLOPT_MCAST_FLOOD:
+	case BR_BOOLOPT_BCAST_FLOOD:
+		err = br_flood_toggle(br, opt, on);
+		break;
 	default:
 		/* shouldn't be called with unsupported options */
 		WARN_ON(1);
@@ -281,6 +286,12 @@ int br_boolopt_get(const struct net_bridge *br, enum br_boolopt_id opt)
 		return br_opt_get(br, BROPT_NO_LL_LEARN);
 	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
 		return br_opt_get(br, BROPT_MCAST_VLAN_SNOOPING_ENABLED);
+	case BR_BOOLOPT_FLOOD:
+		return br_opt_get(br, BROPT_FLOOD);
+	case BR_BOOLOPT_MCAST_FLOOD:
+		return br_opt_get(br, BROPT_MCAST_FLOOD);
+	case BR_BOOLOPT_BCAST_FLOOD:
+		return br_opt_get(br, BROPT_BCAST_FLOOD);
 	default:
 		/* shouldn't be called with unsupported options */
 		WARN_ON(1);
@@ -325,6 +336,40 @@ void br_boolopt_multi_get(const struct net_bridge *br,
 	bm->optmask = GENMASK((BR_BOOLOPT_MAX - 1), 0);
 }
 
+int br_flood_toggle(struct net_bridge *br, enum br_boolopt_id opt,
+		    bool on)
+{
+	struct switchdev_attr attr = {
+		.orig_dev = br->dev,
+		.id = SWITCHDEV_ATTR_ID_BRIDGE_FLOOD,
+		.flags = SWITCHDEV_F_DEFER,
+	};
+	enum net_bridge_opts bropt;
+	int ret;
+
+	switch (opt) {
+	case BR_BOOLOPT_FLOOD:
+		bropt = BROPT_FLOOD;
+		break;
+	case BR_BOOLOPT_MCAST_FLOOD:
+		bropt = BROPT_MCAST_FLOOD;
+		break;
+	case BR_BOOLOPT_BCAST_FLOOD:
+		bropt = BROPT_BCAST_FLOOD;
+		break;
+	default:
+		WARN_ON(1);
+		return -EINVAL;
+	}
+	br_opt_toggle(br, bropt, on);
+
+	attr.u.brport_flags.mask = BIT(bropt);
+	attr.u.brport_flags.val = on << bropt;
+	ret = switchdev_port_attr_set(br->dev, &attr, NULL);
+
+	return ret;
+}
+
 /* private bridge options, controlled by the kernel */
 void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on)
 {
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 8d6bab244c4a..fafaef9d4b3a 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -524,6 +524,9 @@ void br_dev_setup(struct net_device *dev)
 	br->bridge_hello_time = br->hello_time = 2 * HZ;
 	br->bridge_forward_delay = br->forward_delay = 15 * HZ;
 	br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME;
+	br_opt_toggle(br, BROPT_FLOOD, true);
+	br_opt_toggle(br, BROPT_MCAST_FLOOD, true);
+	br_opt_toggle(br, BROPT_BCAST_FLOOD, true);
 	dev->max_mtu = ETH_MAX_MTU;
 
 	br_netfilter_rtable_init(br);
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index e0c13fcc50ed..fcb0757bfdcc 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 		/* by definition the broadcast is also a multicast address */
 		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
 			pkt_type = BR_PKT_BROADCAST;
-			local_rcv = true;
+			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
 		} else {
 			pkt_type = BR_PKT_MULTICAST;
-			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
-				goto drop;
+			if (br_opt_get(br, BROPT_MCAST_FLOOD))
+				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
+					goto drop;
 		}
 	}
 
@@ -155,9 +156,13 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 			local_rcv = true;
 			br->dev->stats.multicast++;
 		}
+		if (!br_opt_get(br, BROPT_MCAST_FLOOD))
+			local_rcv = false;
 		break;
 	case BR_PKT_UNICAST:
 		dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);
+		if (!br_opt_get(br, BROPT_FLOOD))
+			local_rcv = false;
 		break;
 	default:
 		break;
@@ -166,7 +171,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 	if (dst) {
 		unsigned long now = jiffies;
 
-		if (test_bit(BR_FDB_LOCAL, &dst->flags))
+		if (test_bit(BR_FDB_LOCAL, &dst->flags) && local_rcv)
 			return br_pass_frame_up(skb);
 
 		if (now != dst->used)
@@ -190,6 +195,16 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 }
 EXPORT_SYMBOL_GPL(br_handle_frame_finish);
 
+bool br_flood_enabled(const struct net_device *dev)
+{
+	struct net_bridge *br = netdev_priv(dev);
+
+	return !!(br_opt_get(br, BROPT_FLOOD) ||
+		   br_opt_get(br, BROPT_MCAST_FLOOD) ||
+		   br_opt_get(br, BROPT_BCAST_FLOOD));
+}
+EXPORT_SYMBOL_GPL(br_flood_enabled);
+
 static void __br_handle_local_finish(struct sk_buff *skb)
 {
 	struct net_bridge_port *p = br_port_get_rcu(skb->dev);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 48bc61ebc211..cf88dce0b92b 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -445,6 +445,9 @@ enum net_bridge_opts {
 	BROPT_NO_LL_LEARN,
 	BROPT_VLAN_BRIDGE_BINDING,
 	BROPT_MCAST_VLAN_SNOOPING_ENABLED,
+	BROPT_FLOOD,
+	BROPT_MCAST_FLOOD,
+	BROPT_BCAST_FLOOD,
 };
 
 struct net_bridge {
@@ -720,6 +723,7 @@ int br_boolopt_multi_toggle(struct net_bridge *br,
 void br_boolopt_multi_get(const struct net_bridge *br,
 			  struct br_boolopt_multi *bm);
 void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on);
+int br_flood_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on);
 
 /* br_device.c */
 void br_dev_setup(struct net_device *dev);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/5] dsa: Handle the flood flag in the DSA layer.
  2022-03-17  6:50 [PATCH v3 net-next 0/5] bridge: dsa: switchdev: mv88e6xxx: Implement bridge flood flags Mattias Forsblad
  2022-03-17  6:50 ` [PATCH 1/5] switchdev: Add local_receive attribute Mattias Forsblad
  2022-03-17  6:50 ` [PATCH 2/5] net: bridge: Implement bridge flood flag Mattias Forsblad
@ 2022-03-17  6:50 ` Mattias Forsblad
  2022-03-17  6:50 ` [PATCH 4/5] mv88e6xxx: Offload the flood flag Mattias Forsblad
  2022-03-17  6:50 ` [PATCH 5/5] selftest: Add bridge flood flag tests Mattias Forsblad
  4 siblings, 0 replies; 23+ messages in thread
From: Mattias Forsblad @ 2022-03-17  6:50 UTC (permalink / raw)
  To: netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz,
	Mattias Forsblad

Add infrastructure to be able to handle the flood
flag in the DSA layer.

Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
---
 include/net/dsa.h  |  7 +++++++
 net/dsa/dsa_priv.h |  2 ++
 net/dsa/slave.c    | 18 ++++++++++++++++++
 3 files changed, 27 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 9bfe984fcdbf..fcb47dc832e1 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -939,6 +939,13 @@ struct dsa_switch_ops {
 	void	(*get_regs)(struct dsa_switch *ds, int port,
 			    struct ethtool_regs *regs, void *p);
 
+	/*
+	 * Local receive
+	 */
+	int	(*set_flood)(struct dsa_switch *ds, int port,
+				     struct net_device *bridge, unsigned long mask,
+				     unsigned long val);
+
 	/*
 	 * Upper device tracking.
 	 */
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index f20bdd8ea0a8..ca3ea320c8eb 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -234,6 +234,8 @@ int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
 			    struct netlink_ext_ack *extack);
 bool dsa_port_skip_vlan_configuration(struct dsa_port *dp);
 int dsa_port_ageing_time(struct dsa_port *dp, clock_t ageing_clock);
+int dsa_port_set_flood(struct dsa_port *dp, struct net_device *br, unsigned long mask,
+		       unsigned long val);
 int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu,
 			bool targeted_match);
 int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index d24b6bf845c1..f3d780e2b42b 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -458,6 +458,13 @@ static int dsa_slave_port_attr_set(struct net_device *dev, const void *ctx,
 		ret = dsa_port_vlan_filtering(dp, attr->u.vlan_filtering,
 					      extack);
 		break;
+	case SWITCHDEV_ATTR_ID_BRIDGE_FLOOD:
+		if (!dsa_port_offloads_bridge_dev(dp, attr->orig_dev))
+			return -EOPNOTSUPP;
+
+		ret = dsa_port_set_flood(dp, attr->orig_dev, attr->u.brport_flags.mask,
+					 attr->u.brport_flags.val);
+		break;
 	case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
 		if (!dsa_port_offloads_bridge_dev(dp, attr->orig_dev))
 			return -EOPNOTSUPP;
@@ -834,6 +841,17 @@ dsa_slave_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *_p)
 		ds->ops->get_regs(ds, dp->index, regs, _p);
 }
 
+int dsa_port_set_flood(struct dsa_port *dp, struct net_device *br, unsigned long mask,
+		       unsigned long val)
+{
+	struct dsa_switch *ds = dp->ds;
+
+	if (ds->ops->set_flood)
+		return ds->ops->set_flood(ds, dp->index, br, mask, val);
+
+	return 0;
+}
+
 static int dsa_slave_nway_reset(struct net_device *dev)
 {
 	struct dsa_port *dp = dsa_slave_to_port(dev);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/5] mv88e6xxx: Offload the flood flag
  2022-03-17  6:50 [PATCH v3 net-next 0/5] bridge: dsa: switchdev: mv88e6xxx: Implement bridge flood flags Mattias Forsblad
                   ` (2 preceding siblings ...)
  2022-03-17  6:50 ` [PATCH 3/5] dsa: Handle the flood flag in the DSA layer Mattias Forsblad
@ 2022-03-17  6:50 ` Mattias Forsblad
  2022-03-17 13:05   ` Vladimir Oltean
  2022-03-17  6:50 ` [PATCH 5/5] selftest: Add bridge flood flag tests Mattias Forsblad
  4 siblings, 1 reply; 23+ messages in thread
From: Mattias Forsblad @ 2022-03-17  6:50 UTC (permalink / raw)
  To: netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz,
	Mattias Forsblad

Use the port vlan table to restrict ingressing traffic to the
CPU port if the flood flags are cleared.

Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 45 ++++++++++++++++++++++++++++++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 84b90fc36c58..39347a05c3a5 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -1384,6 +1384,7 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
 	struct dsa_switch *ds = chip->ds;
 	struct dsa_switch_tree *dst = ds->dst;
 	struct dsa_port *dp, *other_dp;
+	bool flood = true;
 	bool found = false;
 	u16 pvlan;
 
@@ -1425,6 +1426,9 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
 
 	pvlan = 0;
 
+	if (dp->bridge)
+		flood = br_flood_enabled(dp->bridge->dev);
+
 	/* Frames from standalone user ports can only egress on the
 	 * upstream port.
 	 */
@@ -1433,10 +1437,11 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
 
 	/* Frames from bridged user ports can egress any local DSA
 	 * links and CPU ports, as well as any local member of their
-	 * bridge group.
+	 * as well as any local member of their bridge group. However, CPU ports
+	 * are omitted if flood is cleared.
 	 */
 	dsa_switch_for_each_port(other_dp, ds)
-		if (other_dp->type == DSA_PORT_TYPE_CPU ||
+		if ((other_dp->type == DSA_PORT_TYPE_CPU && flood) ||
 		    other_dp->type == DSA_PORT_TYPE_DSA ||
 		    dsa_port_bridge_same(dp, other_dp))
 			pvlan |= BIT(other_dp->index);
@@ -2718,6 +2723,41 @@ static void mv88e6xxx_crosschip_bridge_leave(struct dsa_switch *ds,
 	mv88e6xxx_reg_unlock(chip);
 }
 
+static int mv88e6xxx_set_flood(struct dsa_switch *ds, int port, struct net_device *br,
+			       unsigned long mask, unsigned long val)
+{
+	struct mv88e6xxx_chip *chip = ds->priv;
+	struct dsa_bridge *bridge;
+	struct dsa_port *dp;
+	bool found = false;
+	int err;
+
+	if (!netif_is_bridge_master(br))
+		return 0;
+
+	list_for_each_entry(dp, &ds->dst->ports, list) {
+		if (dp->ds == ds && dp->index == port) {
+			found = true;
+			break;
+		}
+	}
+
+	if (!found)
+		return 0;
+
+	bridge = dp->bridge;
+	if (!bridge)
+		return 0;
+
+	mv88e6xxx_reg_lock(chip);
+
+	err = mv88e6xxx_bridge_map(chip, *bridge);
+
+	mv88e6xxx_reg_unlock(chip);
+
+	return err;
+}
+
 static int mv88e6xxx_software_reset(struct mv88e6xxx_chip *chip)
 {
 	if (chip->info->ops->reset)
@@ -6478,6 +6518,7 @@ static const struct dsa_switch_ops mv88e6xxx_switch_ops = {
 	.set_eeprom		= mv88e6xxx_set_eeprom,
 	.get_regs_len		= mv88e6xxx_get_regs_len,
 	.get_regs		= mv88e6xxx_get_regs,
+	.set_flood		= mv88e6xxx_set_flood,
 	.get_rxnfc		= mv88e6xxx_get_rxnfc,
 	.set_rxnfc		= mv88e6xxx_set_rxnfc,
 	.set_ageing_time	= mv88e6xxx_set_ageing_time,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/5] selftest: Add bridge flood flag tests
  2022-03-17  6:50 [PATCH v3 net-next 0/5] bridge: dsa: switchdev: mv88e6xxx: Implement bridge flood flags Mattias Forsblad
                   ` (3 preceding siblings ...)
  2022-03-17  6:50 ` [PATCH 4/5] mv88e6xxx: Offload the flood flag Mattias Forsblad
@ 2022-03-17  6:50 ` Mattias Forsblad
  4 siblings, 0 replies; 23+ messages in thread
From: Mattias Forsblad @ 2022-03-17  6:50 UTC (permalink / raw)
  To: netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz,
	Mattias Forsblad

Add test to check that the bridge flood flags works correctly.
When the bridge flag {flood,mcast_flood,bcast_flood} are cleared
no packets of the corresponding type should be flooded to the
bridge.

Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
---
 .../testing/selftests/net/forwarding/Makefile |   1 +
 .../selftests/net/forwarding/bridge_flood.sh  | 169 ++++++++++++++++++
 tools/testing/selftests/net/forwarding/lib.sh |   8 +
 3 files changed, 178 insertions(+)
 create mode 100755 tools/testing/selftests/net/forwarding/bridge_flood.sh

diff --git a/tools/testing/selftests/net/forwarding/Makefile b/tools/testing/selftests/net/forwarding/Makefile
index 8fa97ae9af9e..24ca6a333edd 100644
--- a/tools/testing/selftests/net/forwarding/Makefile
+++ b/tools/testing/selftests/net/forwarding/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0+ OR MIT
 
 TEST_PROGS = bridge_igmp.sh \
+	bridge_flood.sh \
 	bridge_locked_port.sh \
 	bridge_port_isolation.sh \
 	bridge_sticky_fdb.sh \
diff --git a/tools/testing/selftests/net/forwarding/bridge_flood.sh b/tools/testing/selftests/net/forwarding/bridge_flood.sh
new file mode 100755
index 000000000000..ea3e7da139aa
--- /dev/null
+++ b/tools/testing/selftests/net/forwarding/bridge_flood.sh
@@ -0,0 +1,169 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+ALL_TESTS="ping_test bridge_flood"
+NUM_NETIFS=4
+CHECK_TC="no"
+source lib.sh
+bridge=br3
+
+h1_create()
+{
+	simple_if_init $h1 192.0.2.1/24 2001:db8:1::1/64
+}
+
+h1_destroy()
+{
+	simple_if_fini $h1 192.0.2.1/24 2001:db8:1::1/64
+}
+
+h2_create()
+{
+	simple_if_init $h2 192.0.2.2/24 2001:db8:1::2/64
+}
+
+h2_destroy()
+{
+	simple_if_fini $h2 192.0.2.2/24 2001:db8:1::2/64
+}
+
+switch_create()
+{
+	ip link add dev $bridge type bridge
+
+	ip link set dev $swp1 master $bridge
+	ip link set dev $swp2 master $bridge
+	ip link set dev $swp1 type bridge_slave learning off
+	ip link set dev $swp2 type bridge_slave learning off
+
+	ip link set dev $bridge type bridge flood 0 mcast_flood 0 bcast_flood 0
+	check_err $? "Can't set bridge flooding off on $bridge"
+
+	ip link set dev $bridge up
+	ip link set dev $bridge promisc on
+	ip link set dev $swp1 up
+	ip link set dev $swp2 up
+}
+
+switch_destroy()
+{
+	ip link set dev $swp2 down
+	ip link set dev $swp1 down
+
+	ip link del dev $bridge
+}
+
+setup_prepare()
+{
+	h1=${NETIFS[p1]}
+	swp1=${NETIFS[p2]}
+
+	swp2=${NETIFS[p3]}
+	h2=${NETIFS[p4]}
+
+	vrf_prepare
+
+	h1_create
+	h2_create
+
+	switch_create
+}
+
+ping_test()
+{
+	echo "Check connectivity /w ping"
+	ping_do $h1 192.0.2.2
+	check_err $? "ping fail"
+	log_test "ping test"
+}
+
+cleanup()
+{
+	pre_cleanup
+
+	switch_destroy
+
+	h2_destroy
+	h1_destroy
+
+	vrf_cleanup
+}
+
+bridge_flood_test_do()
+{
+	local should_flood=$1
+	local mac=$2
+	local ip=$3
+	local host1_if=$4
+	local err=0
+	local vrf_name
+
+
+	# Add an ACL on `host2_if` which will tell us whether the packet
+	# was flooded to it or not.
+	tc qdisc add dev $bridge ingress
+	tc filter add dev $bridge ingress protocol ip pref 1 handle 101 \
+		flower dst_mac $mac action drop
+
+	vrf_name=$(master_name_get $host1_if)
+	ip vrf exec $vrf_name \
+		$MZ $host1_if -c 1 -p 64 -b $mac -B $ip -t ip -q
+	sleep 1
+
+	tc -j -s filter show dev $bridge ingress \
+		| jq -e ".[] | select(.options.handle == 101) \
+		| select(.options.actions[0].stats.packets == 1)" &> /dev/null
+	if [[ $? -ne 0 && $should_flood == "true" || \
+	      $? -eq 0 && $should_flood == "false" ]]; then
+		err=1
+	fi
+
+	tc filter del dev $bridge ingress protocol ip pref 1 handle 101 flower
+	tc qdisc del dev $bridge ingress
+
+	return $err
+}
+
+bridge_flood_test()
+{
+	local mac=$1
+	local ip=$2
+	local flag=$3
+
+	RET=0
+
+	ip link set dev $bridge type bridge $flag 0
+
+	bridge_flood_test_do false $mac $ip $h1 $bridge
+	check_err $? "Packet flooded when should not"
+	log_test "Bridge test flag $flag disabled"
+
+	ip link set dev $bridge type bridge $flag 1
+
+	bridge_flood_test_do true $mac $ip $h1 $bridge
+	check_err $? "Packet was not flooded when should"
+
+	log_test "Bridge test flag $flag enabled"
+}
+
+bridge_flood()
+{
+	RET=0
+
+	check_bridge_flood_support $bridge || return 0
+
+	bridge_flood_test de:ad:be:ef:13:37 192.0.2.100 flood
+
+	bridge_flood_test 01:00:5e:00:00:01 239.0.0.1 mcast_flood
+
+	bridge_flood_test ff:ff:ff:ff:ff:ff 192.0.2.100 bcast_flood
+}
+
+trap cleanup EXIT
+
+setup_prepare
+setup_wait
+
+tests_run
+
+exit $EXIT_STATUS
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 664b9ecaf228..12e69837374e 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -134,6 +134,14 @@ check_locked_port_support()
 	fi
 }
 
+check_bridge_flood_support()
+{
+	if ! ip -d link show dev $1 | grep -q " flood"; then
+		echo "SKIP: iproute2 too old; Bridge flood feature not supported."
+		return $ksft_skip
+	fi
+}
+
 if [[ "$(id -u)" -ne 0 ]]; then
 	echo "SKIP: need root privileges"
 	exit $ksft_skip
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17  6:50 ` [PATCH 2/5] net: bridge: Implement bridge flood flag Mattias Forsblad
@ 2022-03-17  9:07   ` Joachim Wiberg
  2022-03-17 10:30     ` Nikolay Aleksandrov
  2022-03-17 11:39     ` Mattias Forsblad
  2022-03-17 10:11   ` Nikolay Aleksandrov
  1 sibling, 2 replies; 23+ messages in thread
From: Joachim Wiberg @ 2022-03-17  9:07 UTC (permalink / raw)
  To: Mattias Forsblad, netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz,
	Mattias Forsblad

On Thu, Mar 17, 2022 at 07:50, Mattias Forsblad <mattias.forsblad@gmail.com> wrote:
> This patch implements the bridge flood flags. There are three different
> flags matching unicast, multicast and broadcast. When the corresponding
> flag is cleared packets received on bridge ports will not be flooded
> towards the bridge.

If I've not completely misunderstood things, I believe the flood and
mcast_flood flags operate on unknown unicast and multicast.  With that
in mind I think the hot path in br_input.c needs a bit more eyes.  I'll
add my own comments below.

Happy incident I saw this patch set, I have a very similar one for these
flags to the bridge itself, with the intent to improve handling of all
classes of multicast to/from the bridge itself.

> [snip]
> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> index e0c13fcc50ed..fcb0757bfdcc 100644
> --- a/net/bridge/br_input.c
> +++ b/net/bridge/br_input.c
> @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>  		/* by definition the broadcast is also a multicast address */
>  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
>  			pkt_type = BR_PKT_BROADCAST;
> -			local_rcv = true;
> +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);

Minor comment, I believe the preferred style is more like this:

	if (br_opt_get(br, BROPT_BCAST_FLOOD))
        	local_rcv = true;

>  		} else {
>  			pkt_type = BR_PKT_MULTICAST;
> -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> -				goto drop;
> +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
> +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> +					goto drop;

Since the BROPT_MCAST_FLOOD flag should only control uknown multicast,
we cannot bypass the call to br_multicast_rcv(), which helps with the
classifcation.  E.g., we want IGMP/MLD reports to be forwarded to all
router ports, while the mdb lookup (below) is what an tell us if we
have uknown multicast and there we can check the BROPT_MCAST_FLOOD
flag for the bridge itself.

>  		}
>  	}
>  
> @@ -155,9 +156,13 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>  			local_rcv = true;
>  			br->dev->stats.multicast++;
>  		}
> +		if (!br_opt_get(br, BROPT_MCAST_FLOOD))
> +			local_rcv = false;

We should never set local_rcv to false, only ever use constructs that
set it to true.  Here the PROMISC flag (above) condition would be
negated, which would be a regression.

Instead, for multicast I believe we should ensure that we reach the
else statement for unknown IP multicast, preventing mcast_hit from
being set, and instead flood unknown multicast using br_flood().

This is a bigger change that involves:

  1) dropping the mrouters_only skb flag for unknown multicast,
     keeping it only for IGMP/MLD reports
  2) extending br_flood() slightly to flood unknown multicast
     also to mcast_router ports

As I mentioned above, I have some patches, including selftests, for
forwarding known/unknown multicast using the mdb and mcast_flood +
mcast_router flags.  Maybe we should combine efforts here somehow?

>  		break;
>  	case BR_PKT_UNICAST:
>  		dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);
> +		if (!br_opt_get(br, BROPT_FLOOD))
> +			local_rcv = false;

Again, never set it to false, invert the check instead, like this:

		if (!dst && br_opt_get(br, BROPT_FLOOD))
			local_rcv = true;

>  		break;
>  	default:
>  		break;
> @@ -166,7 +171,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>  	if (dst) {
>  		unsigned long now = jiffies;
>  
> -		if (test_bit(BR_FDB_LOCAL, &dst->flags))
> +		if (test_bit(BR_FDB_LOCAL, &dst->flags) && local_rcv)
>  			return br_pass_frame_up(skb);

I believe this would break both the flooding of unknown multicast and
the PROMISC case.  Down here we are broadcast or known/unknown multicast
land, so the local_rcv flag should be sufficient.

>  		if (now != dst->used)
> @@ -190,6 +195,16 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>  }
>  EXPORT_SYMBOL_GPL(br_handle_frame_finish);
>  
> +bool br_flood_enabled(const struct net_device *dev)
> +{
> +	struct net_bridge *br = netdev_priv(dev);
> +
> +	return !!(br_opt_get(br, BROPT_FLOOD) ||
> +		   br_opt_get(br, BROPT_MCAST_FLOOD) ||
> +		   br_opt_get(br, BROPT_BCAST_FLOOD));

Minor nit, don't know what the rest of the list feels about this, but
maybe the BROPT_FLOOD option should be renamed to BR_UCAST_FLOOD or
BR_UNICAST_FLOOD?

Best regards
 /Joachim

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17  6:50 ` [PATCH 2/5] net: bridge: Implement bridge flood flag Mattias Forsblad
  2022-03-17  9:07   ` Joachim Wiberg
@ 2022-03-17 10:11   ` Nikolay Aleksandrov
  2022-03-17 10:32     ` Nikolay Aleksandrov
  2022-03-17 11:15     ` Vladimir Oltean
  1 sibling, 2 replies; 23+ messages in thread
From: Nikolay Aleksandrov @ 2022-03-17 10:11 UTC (permalink / raw)
  To: Mattias Forsblad, netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz, Joachim Wiberg

On 17/03/2022 08:50, Mattias Forsblad wrote:
> This patch implements the bridge flood flags. There are three different
> flags matching unicast, multicast and broadcast. When the corresponding
> flag is cleared packets received on bridge ports will not be flooded
> towards the bridge.
> This makes is possible to only forward selected traffic between the
> port members of the bridge.
> 
> Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
> ---
>  include/linux/if_bridge.h      |  6 +++++
>  include/uapi/linux/if_bridge.h |  9 ++++++-
>  net/bridge/br.c                | 45 ++++++++++++++++++++++++++++++++++
>  net/bridge/br_device.c         |  3 +++
>  net/bridge/br_input.c          | 23 ++++++++++++++---
>  net/bridge/br_private.h        |  4 +++
>  6 files changed, 85 insertions(+), 5 deletions(-)
> 
Please always CC bridge maintainers for bridge patches.

I almost missed this one. I'll add my reply to Joachim's notes
which are pretty spot on.

> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> index 3aae023a9353..fa8e000a6fb9 100644
> --- a/include/linux/if_bridge.h
> +++ b/include/linux/if_bridge.h
> @@ -157,6 +157,7 @@ static inline int br_vlan_get_info_rcu(const struct net_device *dev, u16 vid,
>  struct net_device *br_fdb_find_port(const struct net_device *br_dev,
>  				    const unsigned char *addr,
>  				    __u16 vid);
> +bool br_flood_enabled(const struct net_device *dev);
>  void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
>  bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
>  u8 br_port_get_stp_state(const struct net_device *dev);
> @@ -170,6 +171,11 @@ br_fdb_find_port(const struct net_device *br_dev,
>  	return NULL;
>  }
>  
> +static inline bool br_flood_enabled(const struct net_device *dev)
> +{
> +	return true;
> +}
> +
>  static inline void br_fdb_clear_offload(const struct net_device *dev, u16 vid)
>  {
>  }
> diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
> index 2711c3522010..765ed70c9b28 100644
> --- a/include/uapi/linux/if_bridge.h
> +++ b/include/uapi/linux/if_bridge.h
> @@ -72,6 +72,7 @@ struct __bridge_info {
>  	__u32 tcn_timer_value;
>  	__u32 topology_change_timer_value;
>  	__u32 gc_timer_value;
> +	__u8 flood;
>  };
>  
>  struct __port_info {
> @@ -752,13 +753,19 @@ struct br_mcast_stats {
>  /* bridge boolean options
>   * BR_BOOLOPT_NO_LL_LEARN - disable learning from link-local packets
>   * BR_BOOLOPT_MCAST_VLAN_SNOOPING - control vlan multicast snooping
> + * BR_BOOLOPT_FLOOD - control bridge flood flag
> + * BR_BOOLOPT_MCAST_FLOOD - control bridge multicast flood flag
> + * BR_BOOLOPT_BCAST_FLOOD - control bridge broadcast flood flag
>   *
>   * IMPORTANT: if adding a new option do not forget to handle
> - *            it in br_boolopt_toggle/get and bridge sysfs
> + *            it in br_boolopt_toggle/get
>   */
>  enum br_boolopt_id {
>  	BR_BOOLOPT_NO_LL_LEARN,
>  	BR_BOOLOPT_MCAST_VLAN_SNOOPING,
> +	BR_BOOLOPT_FLOOD,
> +	BR_BOOLOPT_MCAST_FLOOD,
> +	BR_BOOLOPT_BCAST_FLOOD,
>  	BR_BOOLOPT_MAX
>  };
>  
> diff --git a/net/bridge/br.c b/net/bridge/br.c
> index b1dea3febeea..63a17bed6c63 100644
> --- a/net/bridge/br.c
> +++ b/net/bridge/br.c
> @@ -265,6 +265,11 @@ int br_boolopt_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on,
>  	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
>  		err = br_multicast_toggle_vlan_snooping(br, on, extack);
>  		break;
> +	case BR_BOOLOPT_FLOOD:
> +	case BR_BOOLOPT_MCAST_FLOOD:
> +	case BR_BOOLOPT_BCAST_FLOOD:
> +		err = br_flood_toggle(br, opt, on);
> +		break;
>  	default:
>  		/* shouldn't be called with unsupported options */
>  		WARN_ON(1);
> @@ -281,6 +286,12 @@ int br_boolopt_get(const struct net_bridge *br, enum br_boolopt_id opt)
>  		return br_opt_get(br, BROPT_NO_LL_LEARN);
>  	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
>  		return br_opt_get(br, BROPT_MCAST_VLAN_SNOOPING_ENABLED);
> +	case BR_BOOLOPT_FLOOD:
> +		return br_opt_get(br, BROPT_FLOOD);
> +	case BR_BOOLOPT_MCAST_FLOOD:
> +		return br_opt_get(br, BROPT_MCAST_FLOOD);
> +	case BR_BOOLOPT_BCAST_FLOOD:
> +		return br_opt_get(br, BROPT_BCAST_FLOOD);
>  	default:
>  		/* shouldn't be called with unsupported options */
>  		WARN_ON(1);
> @@ -325,6 +336,40 @@ void br_boolopt_multi_get(const struct net_bridge *br,
>  	bm->optmask = GENMASK((BR_BOOLOPT_MAX - 1), 0);
>  }
>  
> +int br_flood_toggle(struct net_bridge *br, enum br_boolopt_id opt,
> +		    bool on)
> +{
> +	struct switchdev_attr attr = {
> +		.orig_dev = br->dev,
> +		.id = SWITCHDEV_ATTR_ID_BRIDGE_FLOOD,
> +		.flags = SWITCHDEV_F_DEFER,
> +	};
> +	enum net_bridge_opts bropt;
> +	int ret;
> +
> +	switch (opt) {
> +	case BR_BOOLOPT_FLOOD:
> +		bropt = BROPT_FLOOD;
> +		break;
> +	case BR_BOOLOPT_MCAST_FLOOD:
> +		bropt = BROPT_MCAST_FLOOD;
> +		break;
> +	case BR_BOOLOPT_BCAST_FLOOD:
> +		bropt = BROPT_BCAST_FLOOD;
> +		break;
> +	default:
> +		WARN_ON(1);
> +		return -EINVAL;
> +	}
> +	br_opt_toggle(br, bropt, on);
> +
> +	attr.u.brport_flags.mask = BIT(bropt);
> +	attr.u.brport_flags.val = on << bropt;
> +	ret = switchdev_port_attr_set(br->dev, &attr, NULL);
> +
> +	return ret;
> +}
> +
>  /* private bridge options, controlled by the kernel */
>  void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on)
>  {
> diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
> index 8d6bab244c4a..fafaef9d4b3a 100644
> --- a/net/bridge/br_device.c
> +++ b/net/bridge/br_device.c
> @@ -524,6 +524,9 @@ void br_dev_setup(struct net_device *dev)
>  	br->bridge_hello_time = br->hello_time = 2 * HZ;
>  	br->bridge_forward_delay = br->forward_delay = 15 * HZ;
>  	br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME;
> +	br_opt_toggle(br, BROPT_FLOOD, true);
> +	br_opt_toggle(br, BROPT_MCAST_FLOOD, true);
> +	br_opt_toggle(br, BROPT_BCAST_FLOOD, true);
>  	dev->max_mtu = ETH_MAX_MTU;
>  
>  	br_netfilter_rtable_init(br);
> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> index e0c13fcc50ed..fcb0757bfdcc 100644
> --- a/net/bridge/br_input.c
> +++ b/net/bridge/br_input.c
> @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>  		/* by definition the broadcast is also a multicast address */
>  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
>  			pkt_type = BR_PKT_BROADCAST;
> -			local_rcv = true;
> +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
>  		} else {
>  			pkt_type = BR_PKT_MULTICAST;
> -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> -				goto drop;
> +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
> +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> +					goto drop;
>  		}
>  	}
>  
> @@ -155,9 +156,13 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>  			local_rcv = true;
>  			br->dev->stats.multicast++;
>  		}
> +		if (!br_opt_get(br, BROPT_MCAST_FLOOD))
> +			local_rcv = false;
>  		break;
>  	case BR_PKT_UNICAST:
>  		dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);
> +		if (!br_opt_get(br, BROPT_FLOOD))
> +			local_rcv = false;
>  		break;
>  	default:
>  		break;
> @@ -166,7 +171,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>  	if (dst) {
>  		unsigned long now = jiffies;
>  
> -		if (test_bit(BR_FDB_LOCAL, &dst->flags))
> +		if (test_bit(BR_FDB_LOCAL, &dst->flags) && local_rcv)
>  			return br_pass_frame_up(skb);
>  
>  		if (now != dst->used)
> @@ -190,6 +195,16 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>  }
>  EXPORT_SYMBOL_GPL(br_handle_frame_finish);
>  
> +bool br_flood_enabled(const struct net_device *dev)
> +{
> +	struct net_bridge *br = netdev_priv(dev);
> +
> +	return !!(br_opt_get(br, BROPT_FLOOD) ||
> +		   br_opt_get(br, BROPT_MCAST_FLOOD) ||
> +		   br_opt_get(br, BROPT_BCAST_FLOOD));
> +}
> +EXPORT_SYMBOL_GPL(br_flood_enabled);
> +
>  static void __br_handle_local_finish(struct sk_buff *skb)
>  {
>  	struct net_bridge_port *p = br_port_get_rcu(skb->dev);
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index 48bc61ebc211..cf88dce0b92b 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -445,6 +445,9 @@ enum net_bridge_opts {
>  	BROPT_NO_LL_LEARN,
>  	BROPT_VLAN_BRIDGE_BINDING,
>  	BROPT_MCAST_VLAN_SNOOPING_ENABLED,
> +	BROPT_FLOOD,
> +	BROPT_MCAST_FLOOD,
> +	BROPT_BCAST_FLOOD,
>  };
>  
>  struct net_bridge {
> @@ -720,6 +723,7 @@ int br_boolopt_multi_toggle(struct net_bridge *br,
>  void br_boolopt_multi_get(const struct net_bridge *br,
>  			  struct br_boolopt_multi *bm);
>  void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on);
> +int br_flood_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on);
>  
>  /* br_device.c */
>  void br_dev_setup(struct net_device *dev);


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17  9:07   ` Joachim Wiberg
@ 2022-03-17 10:30     ` Nikolay Aleksandrov
  2022-03-17 11:39     ` Mattias Forsblad
  1 sibling, 0 replies; 23+ messages in thread
From: Nikolay Aleksandrov @ 2022-03-17 10:30 UTC (permalink / raw)
  To: Joachim Wiberg, Mattias Forsblad, netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz

On 17/03/2022 11:07, Joachim Wiberg wrote:
> On Thu, Mar 17, 2022 at 07:50, Mattias Forsblad <mattias.forsblad@gmail.com> wrote:
>> This patch implements the bridge flood flags. There are three different
>> flags matching unicast, multicast and broadcast. When the corresponding
>> flag is cleared packets received on bridge ports will not be flooded
>> towards the bridge.
> 
> If I've not completely misunderstood things, I believe the flood and
> mcast_flood flags operate on unknown unicast and multicast.  With that
> in mind I think the hot path in br_input.c needs a bit more eyes.  I'll
> add my own comments below.
> 
> Happy incident I saw this patch set, I have a very similar one for these
> flags to the bridge itself, with the intent to improve handling of all
> classes of multicast to/from the bridge itself.

+1

I'll add my comments below, yours are pretty spot on. I have one more
that's for the snipped part, I'll send that separately. :)

First please split this into 3 separate patches - one for each flag.
That would make reviewing much easier.

> >> [snip]
>> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
>> index e0c13fcc50ed..fcb0757bfdcc 100644
>> --- a/net/bridge/br_input.c
>> +++ b/net/bridge/br_input.c
>> @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>>  		/* by definition the broadcast is also a multicast address */
>>  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
>>  			pkt_type = BR_PKT_BROADCAST;
>> -			local_rcv = true;
>> +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
> 
> Minor comment, I believe the preferred style is more like this:
> 
> 	if (br_opt_get(br, BROPT_BCAST_FLOOD))
>         	local_rcv = true;
> 

ack

>>  		} else {
>>  			pkt_type = BR_PKT_MULTICAST;
>> -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>> -				goto drop;
>> +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
>> +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>> +					goto drop;
> 
> Since the BROPT_MCAST_FLOOD flag should only control uknown multicast,
> we cannot bypass the call to br_multicast_rcv(), which helps with the
> classifcation.  E.g., we want IGMP/MLD reports to be forwarded to all
> router ports, while the mdb lookup (below) is what an tell us if we
> have uknown multicast and there we can check the BROPT_MCAST_FLOOD
> flag for the bridge itself.
> 

+1

>>  		}
>>  	}
>>  
>> @@ -155,9 +156,13 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>>  			local_rcv = true;
>>  			br->dev->stats.multicast++;
>>  		}
>> +		if (!br_opt_get(br, BROPT_MCAST_FLOOD))
>> +			local_rcv = false;
> 
> We should never set local_rcv to false, only ever use constructs that
> set it to true.  Here the PROMISC flag (above) condition would be
> negated, which would be a regression.
> 
> Instead, for multicast I believe we should ensure that we reach the
> else statement for unknown IP multicast, preventing mcast_hit from
> being set, and instead flood unknown multicast using br_flood().
> 
> This is a bigger change that involves:
> 
>   1) dropping the mrouters_only skb flag for unknown multicast,
>      keeping it only for IGMP/MLD reports
>   2) extending br_flood() slightly to flood unknown multicast
>      also to mcast_router ports
> 
> As I mentioned above, I have some patches, including selftests, for
> forwarding known/unknown multicast using the mdb and mcast_flood +
> mcast_router flags.  Maybe we should combine efforts here somehow?
> 

Ack, sounds good! Please coordinate that between yourselves, if the mcast
flood flag will be dropped from this set or if Joachim's will be integrated.

>>  		break;
>>  	case BR_PKT_UNICAST:
>>  		dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);
>> +		if (!br_opt_get(br, BROPT_FLOOD))
>> +			local_rcv = false;
> 
> Again, never set it to false, invert the check instead, like this:
> 
> 		if (!dst && br_opt_get(br, BROPT_FLOOD))
> 			local_rcv = true;
> 
>>  		break;
>>  	default:
>>  		break;
>> @@ -166,7 +171,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>>  	if (dst) {
>>  		unsigned long now = jiffies;
>>  
>> -		if (test_bit(BR_FDB_LOCAL, &dst->flags))
>> +		if (test_bit(BR_FDB_LOCAL, &dst->flags) && local_rcv)
>>  			return br_pass_frame_up(skb);
> 
> I believe this would break both the flooding of unknown multicast and
> the PROMISC case.  Down here we are broadcast or known/unknown multicast
> land, so the local_rcv flag should be sufficient.
> 
>>  		if (now != dst->used)
>> @@ -190,6 +195,16 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>>  }
>>  EXPORT_SYMBOL_GPL(br_handle_frame_finish);
>>  
>> +bool br_flood_enabled(const struct net_device *dev)
>> +{
>> +	struct net_bridge *br = netdev_priv(dev);

const

>> +
>> +	return !!(br_opt_get(br, BROPT_FLOOD) ||
>> +		   br_opt_get(br, BROPT_MCAST_FLOOD) ||
>> +		   br_opt_get(br, BROPT_BCAST_FLOOD));
> 
> Minor nit, don't know what the rest of the list feels about this, but
> maybe the BROPT_FLOOD option should be renamed to BR_UCAST_FLOOD or
> BR_UNICAST_FLOOD?
> 

Exactly, very good suggestion. Unfortunately we already have BR_FLOOD and can't
do anything about it, but we shouldn't follow that bad example.

> Best regards
>  /Joachim

Cheers,
 Nik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17 10:11   ` Nikolay Aleksandrov
@ 2022-03-17 10:32     ` Nikolay Aleksandrov
  2022-03-17 11:15     ` Vladimir Oltean
  1 sibling, 0 replies; 23+ messages in thread
From: Nikolay Aleksandrov @ 2022-03-17 10:32 UTC (permalink / raw)
  To: Mattias Forsblad, netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz, Joachim Wiberg

On 17/03/2022 12:11, Nikolay Aleksandrov wrote:
> On 17/03/2022 08:50, Mattias Forsblad wrote:
>> This patch implements the bridge flood flags. There are three different
>> flags matching unicast, multicast and broadcast. When the corresponding
>> flag is cleared packets received on bridge ports will not be flooded
>> towards the bridge.
>> This makes is possible to only forward selected traffic between the
>> port members of the bridge.
>>
>> Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
>> ---
>>  include/linux/if_bridge.h      |  6 +++++
>>  include/uapi/linux/if_bridge.h |  9 ++++++-
>>  net/bridge/br.c                | 45 ++++++++++++++++++++++++++++++++++
>>  net/bridge/br_device.c         |  3 +++
>>  net/bridge/br_input.c          | 23 ++++++++++++++---
>>  net/bridge/br_private.h        |  4 +++
>>  6 files changed, 85 insertions(+), 5 deletions(-)
>>
> Please always CC bridge maintainers for bridge patches.
> 
> I almost missed this one. I'll add my reply to Joachim's notes
> which are pretty spot on.
> 
>> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
>> index 3aae023a9353..fa8e000a6fb9 100644
>> --- a/include/linux/if_bridge.h
>> +++ b/include/linux/if_bridge.h
>> @@ -157,6 +157,7 @@ static inline int br_vlan_get_info_rcu(const struct net_device *dev, u16 vid,
>>  struct net_device *br_fdb_find_port(const struct net_device *br_dev,
>>  				    const unsigned char *addr,
>>  				    __u16 vid);
>> +bool br_flood_enabled(const struct net_device *dev);
>>  void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
>>  bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
>>  u8 br_port_get_stp_state(const struct net_device *dev);
>> @@ -170,6 +171,11 @@ br_fdb_find_port(const struct net_device *br_dev,
>>  	return NULL;
>>  }
>>  
>> +static inline bool br_flood_enabled(const struct net_device *dev)
>> +{
>> +	return true;
>> +}
>> +
>>  static inline void br_fdb_clear_offload(const struct net_device *dev, u16 vid)
>>  {
>>  }
>> diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
>> index 2711c3522010..765ed70c9b28 100644
>> --- a/include/uapi/linux/if_bridge.h
>> +++ b/include/uapi/linux/if_bridge.h
>> @@ -72,6 +72,7 @@ struct __bridge_info {
>>  	__u32 tcn_timer_value;
>>  	__u32 topology_change_timer_value;
>>  	__u32 gc_timer_value;
>> +	__u8 flood;
>>  };

Replying to myself as this part was snipped from Joachim's comments.
You cannot change uapi structures.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17 10:11   ` Nikolay Aleksandrov
  2022-03-17 10:32     ` Nikolay Aleksandrov
@ 2022-03-17 11:15     ` Vladimir Oltean
  2022-03-17 11:57       ` Vladimir Oltean
  1 sibling, 1 reply; 23+ messages in thread
From: Vladimir Oltean @ 2022-03-17 11:15 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Mattias Forsblad, netdev, David S . Miller, Jakub Kicinski,
	Andrew Lunn, Florian Fainelli, Vivien Didelot, Roopa Prabhu,
	Tobias Waldekranz, Joachim Wiberg

On Thu, Mar 17, 2022 at 12:11:40PM +0200, Nikolay Aleksandrov wrote:
> On 17/03/2022 08:50, Mattias Forsblad wrote:
> > This patch implements the bridge flood flags. There are three different
> > flags matching unicast, multicast and broadcast. When the corresponding
> > flag is cleared packets received on bridge ports will not be flooded
> > towards the bridge.
> > This makes is possible to only forward selected traffic between the
> > port members of the bridge.
> > 
> > Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
> > ---
> >  include/linux/if_bridge.h      |  6 +++++
> >  include/uapi/linux/if_bridge.h |  9 ++++++-
> >  net/bridge/br.c                | 45 ++++++++++++++++++++++++++++++++++
> >  net/bridge/br_device.c         |  3 +++
> >  net/bridge/br_input.c          | 23 ++++++++++++++---
> >  net/bridge/br_private.h        |  4 +++
> >  6 files changed, 85 insertions(+), 5 deletions(-)
> > 
> Please always CC bridge maintainers for bridge patches.
> I almost missed this one. I'll add my reply to Joachim's notes
> which are pretty spot on.

And DSA maintainers for DSA patches ;) I was aimlessly scrolling through
patchwork when I happened to notice these, and the series is already at v3.

As a matter of fact, I downloaded these patches from the mailing list
with the intention of giving them a spin on mv88e6xxx to see what
they're about, and to my surprise, this particular patch (I haven't even
reached the offloading part) breaks DHCP on my bridge, so it can no
longer get an IP address. I haven't toggled any bridge flag through
netlink, just booted the board with systemd-networkd. The same thing
happens with my LS1028A board. Further investigation to come, but this
isn't off to a good start, I'm afraid...

> 
> > diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> > index 3aae023a9353..fa8e000a6fb9 100644
> > --- a/include/linux/if_bridge.h
> > +++ b/include/linux/if_bridge.h
> > @@ -157,6 +157,7 @@ static inline int br_vlan_get_info_rcu(const struct net_device *dev, u16 vid,
> >  struct net_device *br_fdb_find_port(const struct net_device *br_dev,
> >  				    const unsigned char *addr,
> >  				    __u16 vid);
> > +bool br_flood_enabled(const struct net_device *dev);
> >  void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
> >  bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
> >  u8 br_port_get_stp_state(const struct net_device *dev);
> > @@ -170,6 +171,11 @@ br_fdb_find_port(const struct net_device *br_dev,
> >  	return NULL;
> >  }
> >  
> > +static inline bool br_flood_enabled(const struct net_device *dev)
> > +{
> > +	return true;
> > +}
> > +
> >  static inline void br_fdb_clear_offload(const struct net_device *dev, u16 vid)
> >  {
> >  }
> > diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
> > index 2711c3522010..765ed70c9b28 100644
> > --- a/include/uapi/linux/if_bridge.h
> > +++ b/include/uapi/linux/if_bridge.h
> > @@ -72,6 +72,7 @@ struct __bridge_info {
> >  	__u32 tcn_timer_value;
> >  	__u32 topology_change_timer_value;
> >  	__u32 gc_timer_value;
> > +	__u8 flood;
> >  };
> >  
> >  struct __port_info {
> > @@ -752,13 +753,19 @@ struct br_mcast_stats {
> >  /* bridge boolean options
> >   * BR_BOOLOPT_NO_LL_LEARN - disable learning from link-local packets
> >   * BR_BOOLOPT_MCAST_VLAN_SNOOPING - control vlan multicast snooping
> > + * BR_BOOLOPT_FLOOD - control bridge flood flag
> > + * BR_BOOLOPT_MCAST_FLOOD - control bridge multicast flood flag
> > + * BR_BOOLOPT_BCAST_FLOOD - control bridge broadcast flood flag
> >   *
> >   * IMPORTANT: if adding a new option do not forget to handle
> > - *            it in br_boolopt_toggle/get and bridge sysfs
> > + *            it in br_boolopt_toggle/get
> >   */
> >  enum br_boolopt_id {
> >  	BR_BOOLOPT_NO_LL_LEARN,
> >  	BR_BOOLOPT_MCAST_VLAN_SNOOPING,
> > +	BR_BOOLOPT_FLOOD,
> > +	BR_BOOLOPT_MCAST_FLOOD,
> > +	BR_BOOLOPT_BCAST_FLOOD,
> >  	BR_BOOLOPT_MAX
> >  };
> >  
> > diff --git a/net/bridge/br.c b/net/bridge/br.c
> > index b1dea3febeea..63a17bed6c63 100644
> > --- a/net/bridge/br.c
> > +++ b/net/bridge/br.c
> > @@ -265,6 +265,11 @@ int br_boolopt_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on,
> >  	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
> >  		err = br_multicast_toggle_vlan_snooping(br, on, extack);
> >  		break;
> > +	case BR_BOOLOPT_FLOOD:
> > +	case BR_BOOLOPT_MCAST_FLOOD:
> > +	case BR_BOOLOPT_BCAST_FLOOD:
> > +		err = br_flood_toggle(br, opt, on);
> > +		break;
> >  	default:
> >  		/* shouldn't be called with unsupported options */
> >  		WARN_ON(1);
> > @@ -281,6 +286,12 @@ int br_boolopt_get(const struct net_bridge *br, enum br_boolopt_id opt)
> >  		return br_opt_get(br, BROPT_NO_LL_LEARN);
> >  	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
> >  		return br_opt_get(br, BROPT_MCAST_VLAN_SNOOPING_ENABLED);
> > +	case BR_BOOLOPT_FLOOD:
> > +		return br_opt_get(br, BROPT_FLOOD);
> > +	case BR_BOOLOPT_MCAST_FLOOD:
> > +		return br_opt_get(br, BROPT_MCAST_FLOOD);
> > +	case BR_BOOLOPT_BCAST_FLOOD:
> > +		return br_opt_get(br, BROPT_BCAST_FLOOD);
> >  	default:
> >  		/* shouldn't be called with unsupported options */
> >  		WARN_ON(1);
> > @@ -325,6 +336,40 @@ void br_boolopt_multi_get(const struct net_bridge *br,
> >  	bm->optmask = GENMASK((BR_BOOLOPT_MAX - 1), 0);
> >  }
> >  
> > +int br_flood_toggle(struct net_bridge *br, enum br_boolopt_id opt,
> > +		    bool on)
> > +{
> > +	struct switchdev_attr attr = {
> > +		.orig_dev = br->dev,
> > +		.id = SWITCHDEV_ATTR_ID_BRIDGE_FLOOD,
> > +		.flags = SWITCHDEV_F_DEFER,
> > +	};
> > +	enum net_bridge_opts bropt;
> > +	int ret;
> > +
> > +	switch (opt) {
> > +	case BR_BOOLOPT_FLOOD:
> > +		bropt = BROPT_FLOOD;
> > +		break;
> > +	case BR_BOOLOPT_MCAST_FLOOD:
> > +		bropt = BROPT_MCAST_FLOOD;
> > +		break;
> > +	case BR_BOOLOPT_BCAST_FLOOD:
> > +		bropt = BROPT_BCAST_FLOOD;
> > +		break;
> > +	default:
> > +		WARN_ON(1);
> > +		return -EINVAL;
> > +	}
> > +	br_opt_toggle(br, bropt, on);
> > +
> > +	attr.u.brport_flags.mask = BIT(bropt);
> > +	attr.u.brport_flags.val = on << bropt;
> > +	ret = switchdev_port_attr_set(br->dev, &attr, NULL);
> > +
> > +	return ret;
> > +}
> > +
> >  /* private bridge options, controlled by the kernel */
> >  void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on)
> >  {
> > diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
> > index 8d6bab244c4a..fafaef9d4b3a 100644
> > --- a/net/bridge/br_device.c
> > +++ b/net/bridge/br_device.c
> > @@ -524,6 +524,9 @@ void br_dev_setup(struct net_device *dev)
> >  	br->bridge_hello_time = br->hello_time = 2 * HZ;
> >  	br->bridge_forward_delay = br->forward_delay = 15 * HZ;
> >  	br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME;
> > +	br_opt_toggle(br, BROPT_FLOOD, true);
> > +	br_opt_toggle(br, BROPT_MCAST_FLOOD, true);
> > +	br_opt_toggle(br, BROPT_BCAST_FLOOD, true);
> >  	dev->max_mtu = ETH_MAX_MTU;
> >  
> >  	br_netfilter_rtable_init(br);
> > diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> > index e0c13fcc50ed..fcb0757bfdcc 100644
> > --- a/net/bridge/br_input.c
> > +++ b/net/bridge/br_input.c
> > @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> >  		/* by definition the broadcast is also a multicast address */
> >  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
> >  			pkt_type = BR_PKT_BROADCAST;
> > -			local_rcv = true;
> > +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
> >  		} else {
> >  			pkt_type = BR_PKT_MULTICAST;
> > -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> > -				goto drop;
> > +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
> > +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> > +					goto drop;
> >  		}
> >  	}
> >  
> > @@ -155,9 +156,13 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> >  			local_rcv = true;
> >  			br->dev->stats.multicast++;
> >  		}
> > +		if (!br_opt_get(br, BROPT_MCAST_FLOOD))
> > +			local_rcv = false;
> >  		break;
> >  	case BR_PKT_UNICAST:
> >  		dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);
> > +		if (!br_opt_get(br, BROPT_FLOOD))
> > +			local_rcv = false;
> >  		break;
> >  	default:
> >  		break;
> > @@ -166,7 +171,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> >  	if (dst) {
> >  		unsigned long now = jiffies;
> >  
> > -		if (test_bit(BR_FDB_LOCAL, &dst->flags))
> > +		if (test_bit(BR_FDB_LOCAL, &dst->flags) && local_rcv)
> >  			return br_pass_frame_up(skb);
> >  
> >  		if (now != dst->used)
> > @@ -190,6 +195,16 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> >  }
> >  EXPORT_SYMBOL_GPL(br_handle_frame_finish);
> >  
> > +bool br_flood_enabled(const struct net_device *dev)
> > +{
> > +	struct net_bridge *br = netdev_priv(dev);
> > +
> > +	return !!(br_opt_get(br, BROPT_FLOOD) ||
> > +		   br_opt_get(br, BROPT_MCAST_FLOOD) ||
> > +		   br_opt_get(br, BROPT_BCAST_FLOOD));
> > +}
> > +EXPORT_SYMBOL_GPL(br_flood_enabled);
> > +
> >  static void __br_handle_local_finish(struct sk_buff *skb)
> >  {
> >  	struct net_bridge_port *p = br_port_get_rcu(skb->dev);
> > diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> > index 48bc61ebc211..cf88dce0b92b 100644
> > --- a/net/bridge/br_private.h
> > +++ b/net/bridge/br_private.h
> > @@ -445,6 +445,9 @@ enum net_bridge_opts {
> >  	BROPT_NO_LL_LEARN,
> >  	BROPT_VLAN_BRIDGE_BINDING,
> >  	BROPT_MCAST_VLAN_SNOOPING_ENABLED,
> > +	BROPT_FLOOD,
> > +	BROPT_MCAST_FLOOD,
> > +	BROPT_BCAST_FLOOD,
> >  };
> >  
> >  struct net_bridge {
> > @@ -720,6 +723,7 @@ int br_boolopt_multi_toggle(struct net_bridge *br,
> >  void br_boolopt_multi_get(const struct net_bridge *br,
> >  			  struct br_boolopt_multi *bm);
> >  void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on);
> > +int br_flood_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on);
> >  
> >  /* br_device.c */
> >  void br_dev_setup(struct net_device *dev);
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17  9:07   ` Joachim Wiberg
  2022-03-17 10:30     ` Nikolay Aleksandrov
@ 2022-03-17 11:39     ` Mattias Forsblad
  2022-03-17 11:42       ` Nikolay Aleksandrov
  1 sibling, 1 reply; 23+ messages in thread
From: Mattias Forsblad @ 2022-03-17 11:39 UTC (permalink / raw)
  To: Joachim Wiberg, netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz,
	Nikolay Aleksandrov, Vladimir Oltean

On 2022-03-17 10:07, Joachim Wiberg wrote:
> On Thu, Mar 17, 2022 at 07:50, Mattias Forsblad <mattias.forsblad@gmail.com> wrote:
>> This patch implements the bridge flood flags. There are three different
>> flags matching unicast, multicast and broadcast. When the corresponding
>> flag is cleared packets received on bridge ports will not be flooded
>> towards the bridge.
> 
> If I've not completely misunderstood things, I believe the flood and
> mcast_flood flags operate on unknown unicast and multicast.  With that
> in mind I think the hot path in br_input.c needs a bit more eyes.  I'll
> add my own comments below.
> 
> Happy incident I saw this patch set, I have a very similar one for these
> flags to the bridge itself, with the intent to improve handling of all
> classes of multicast to/from the bridge itself.
> 
>> [snip]
>> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
>> index e0c13fcc50ed..fcb0757bfdcc 100644
>> --- a/net/bridge/br_input.c
>> +++ b/net/bridge/br_input.c
>> @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>>  		/* by definition the broadcast is also a multicast address */
>>  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
>>  			pkt_type = BR_PKT_BROADCAST;
>> -			local_rcv = true;
>> +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
> 
> Minor comment, I believe the preferred style is more like this:
> 
> 	if (br_opt_get(br, BROPT_BCAST_FLOOD))
>         	local_rcv = true;
> 
>>  		} else {
>>  			pkt_type = BR_PKT_MULTICAST;
>> -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>> -				goto drop;
>> +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
>> +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>> +					goto drop;
> 
> Since the BROPT_MCAST_FLOOD flag should only control uknown multicast,
> we cannot bypass the call to br_multicast_rcv(), which helps with the
> classifcation.  E.g., we want IGMP/MLD reports to be forwarded to all
> router ports, while the mdb lookup (below) is what an tell us if we
> have uknown multicast and there we can check the BROPT_MCAST_FLOOD
> flag for the bridge itself.

The original flag was name was local_receive to separate it from being
mistaken for the flood unknown flags. However the comment I've got was
to align it with the existing (port) flags. These flags have nothing to do with
the port flood unknown flags. Imagine the setup below:

           vlan1
             |
            br0             br1
           /   \           /   \
         swp1 swp2       swp3 swp4

We want to have swp1/2 as member of a normal vlan filtering bridge br0 /w learning on. 
On br1 we want to just forward packets between swp3/4 and disable learning. 
Additional we don't want this traffic to impact the CPU. 
If we disable learning on swp3/4 all traffic will be unknown and if we also 
have flood unknown on the CPU-port because of requirements for br0 it will
impact the traffic to br1. Thus we want to restrict traffic between swp3/4<->CPU port
with the help of the PVT.

/Mattias

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17 11:39     ` Mattias Forsblad
@ 2022-03-17 11:42       ` Nikolay Aleksandrov
  2022-03-17 12:26         ` Ido Schimmel
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolay Aleksandrov @ 2022-03-17 11:42 UTC (permalink / raw)
  To: Mattias Forsblad, Joachim Wiberg, netdev
  Cc: David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Tobias Waldekranz, Vladimir Oltean

On 17/03/2022 13:39, Mattias Forsblad wrote:
> On 2022-03-17 10:07, Joachim Wiberg wrote:
>> On Thu, Mar 17, 2022 at 07:50, Mattias Forsblad <mattias.forsblad@gmail.com> wrote:
>>> This patch implements the bridge flood flags. There are three different
>>> flags matching unicast, multicast and broadcast. When the corresponding
>>> flag is cleared packets received on bridge ports will not be flooded
>>> towards the bridge.
>>
>> If I've not completely misunderstood things, I believe the flood and
>> mcast_flood flags operate on unknown unicast and multicast.  With that
>> in mind I think the hot path in br_input.c needs a bit more eyes.  I'll
>> add my own comments below.
>>
>> Happy incident I saw this patch set, I have a very similar one for these
>> flags to the bridge itself, with the intent to improve handling of all
>> classes of multicast to/from the bridge itself.
>>
>>> [snip]
>>> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
>>> index e0c13fcc50ed..fcb0757bfdcc 100644
>>> --- a/net/bridge/br_input.c
>>> +++ b/net/bridge/br_input.c
>>> @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>>>  		/* by definition the broadcast is also a multicast address */
>>>  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
>>>  			pkt_type = BR_PKT_BROADCAST;
>>> -			local_rcv = true;
>>> +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
>>
>> Minor comment, I believe the preferred style is more like this:
>>
>> 	if (br_opt_get(br, BROPT_BCAST_FLOOD))
>>         	local_rcv = true;
>>
>>>  		} else {
>>>  			pkt_type = BR_PKT_MULTICAST;
>>> -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>>> -				goto drop;
>>> +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
>>> +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>>> +					goto drop;
>>
>> Since the BROPT_MCAST_FLOOD flag should only control uknown multicast,
>> we cannot bypass the call to br_multicast_rcv(), which helps with the
>> classifcation.  E.g., we want IGMP/MLD reports to be forwarded to all
>> router ports, while the mdb lookup (below) is what an tell us if we
>> have uknown multicast and there we can check the BROPT_MCAST_FLOOD
>> flag for the bridge itself.
> 
> The original flag was name was local_receive to separate it from being
> mistaken for the flood unknown flags. However the comment I've got was
> to align it with the existing (port) flags. These flags have nothing to do with
> the port flood unknown flags. Imagine the setup below:
> 
>            vlan1
>              |
>             br0             br1
>            /   \           /   \
>          swp1 swp2       swp3 swp4
> 
> We want to have swp1/2 as member of a normal vlan filtering bridge br0 /w learning on. 
> On br1 we want to just forward packets between swp3/4 and disable learning. 
> Additional we don't want this traffic to impact the CPU. 
> If we disable learning on swp3/4 all traffic will be unknown and if we also 
> have flood unknown on the CPU-port because of requirements for br0 it will
> impact the traffic to br1. Thus we want to restrict traffic between swp3/4<->CPU port
> with the help of the PVT.
> 
> /Mattias

The feedback was correct and we all assumed unknown traffic control.
If you don't want any local receive then use filtering rules. Don't add unnecessary flags.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17 11:15     ` Vladimir Oltean
@ 2022-03-17 11:57       ` Vladimir Oltean
  2022-03-17 11:59         ` Vladimir Oltean
  0 siblings, 1 reply; 23+ messages in thread
From: Vladimir Oltean @ 2022-03-17 11:57 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Mattias Forsblad, netdev, David S . Miller, Jakub Kicinski,
	Andrew Lunn, Florian Fainelli, Vivien Didelot, Roopa Prabhu,
	Tobias Waldekranz, Joachim Wiberg

On Thu, Mar 17, 2022 at 01:15:45PM +0200, Vladimir Oltean wrote:
> On Thu, Mar 17, 2022 at 12:11:40PM +0200, Nikolay Aleksandrov wrote:
> > On 17/03/2022 08:50, Mattias Forsblad wrote:
> > > This patch implements the bridge flood flags. There are three different
> > > flags matching unicast, multicast and broadcast. When the corresponding
> > > flag is cleared packets received on bridge ports will not be flooded
> > > towards the bridge.
> > > This makes is possible to only forward selected traffic between the
> > > port members of the bridge.
> > > 
> > > Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
> > > ---
> > >  include/linux/if_bridge.h      |  6 +++++
> > >  include/uapi/linux/if_bridge.h |  9 ++++++-
> > >  net/bridge/br.c                | 45 ++++++++++++++++++++++++++++++++++
> > >  net/bridge/br_device.c         |  3 +++
> > >  net/bridge/br_input.c          | 23 ++++++++++++++---
> > >  net/bridge/br_private.h        |  4 +++
> > >  6 files changed, 85 insertions(+), 5 deletions(-)
> > > 
> > Please always CC bridge maintainers for bridge patches.
> > I almost missed this one. I'll add my reply to Joachim's notes
> > which are pretty spot on.
> 
> And DSA maintainers for DSA patches ;) I was aimlessly scrolling through
> patchwork when I happened to notice these, and the series is already at v3.
> 
> As a matter of fact, I downloaded these patches from the mailing list
> with the intention of giving them a spin on mv88e6xxx to see what
> they're about, and to my surprise, this particular patch (I haven't even
> reached the offloading part) breaks DHCP on my bridge, so it can no
> longer get an IP address. I haven't toggled any bridge flag through
> netlink, just booted the board with systemd-networkd. The same thing
> happens with my LS1028A board. Further investigation to come, but this
> isn't off to a good start, I'm afraid...
> 
> > 
> > > diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> > > index 3aae023a9353..fa8e000a6fb9 100644
> > > --- a/include/linux/if_bridge.h
> > > +++ b/include/linux/if_bridge.h
> > > @@ -157,6 +157,7 @@ static inline int br_vlan_get_info_rcu(const struct net_device *dev, u16 vid,
> > >  struct net_device *br_fdb_find_port(const struct net_device *br_dev,
> > >  				    const unsigned char *addr,
> > >  				    __u16 vid);
> > > +bool br_flood_enabled(const struct net_device *dev);
> > >  void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
> > >  bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
> > >  u8 br_port_get_stp_state(const struct net_device *dev);
> > > @@ -170,6 +171,11 @@ br_fdb_find_port(const struct net_device *br_dev,
> > >  	return NULL;
> > >  }
> > >  
> > > +static inline bool br_flood_enabled(const struct net_device *dev)
> > > +{
> > > +	return true;
> > > +}
> > > +
> > >  static inline void br_fdb_clear_offload(const struct net_device *dev, u16 vid)
> > >  {
> > >  }
> > > diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
> > > index 2711c3522010..765ed70c9b28 100644
> > > --- a/include/uapi/linux/if_bridge.h
> > > +++ b/include/uapi/linux/if_bridge.h
> > > @@ -72,6 +72,7 @@ struct __bridge_info {
> > >  	__u32 tcn_timer_value;
> > >  	__u32 topology_change_timer_value;
> > >  	__u32 gc_timer_value;
> > > +	__u8 flood;
> > >  };
> > >  
> > >  struct __port_info {
> > > @@ -752,13 +753,19 @@ struct br_mcast_stats {
> > >  /* bridge boolean options
> > >   * BR_BOOLOPT_NO_LL_LEARN - disable learning from link-local packets
> > >   * BR_BOOLOPT_MCAST_VLAN_SNOOPING - control vlan multicast snooping
> > > + * BR_BOOLOPT_FLOOD - control bridge flood flag
> > > + * BR_BOOLOPT_MCAST_FLOOD - control bridge multicast flood flag
> > > + * BR_BOOLOPT_BCAST_FLOOD - control bridge broadcast flood flag
> > >   *
> > >   * IMPORTANT: if adding a new option do not forget to handle
> > > - *            it in br_boolopt_toggle/get and bridge sysfs
> > > + *            it in br_boolopt_toggle/get
> > >   */
> > >  enum br_boolopt_id {
> > >  	BR_BOOLOPT_NO_LL_LEARN,
> > >  	BR_BOOLOPT_MCAST_VLAN_SNOOPING,
> > > +	BR_BOOLOPT_FLOOD,
> > > +	BR_BOOLOPT_MCAST_FLOOD,
> > > +	BR_BOOLOPT_BCAST_FLOOD,
> > >  	BR_BOOLOPT_MAX
> > >  };
> > >  
> > > diff --git a/net/bridge/br.c b/net/bridge/br.c
> > > index b1dea3febeea..63a17bed6c63 100644
> > > --- a/net/bridge/br.c
> > > +++ b/net/bridge/br.c
> > > @@ -265,6 +265,11 @@ int br_boolopt_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on,
> > >  	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
> > >  		err = br_multicast_toggle_vlan_snooping(br, on, extack);
> > >  		break;
> > > +	case BR_BOOLOPT_FLOOD:
> > > +	case BR_BOOLOPT_MCAST_FLOOD:
> > > +	case BR_BOOLOPT_BCAST_FLOOD:
> > > +		err = br_flood_toggle(br, opt, on);
> > > +		break;
> > >  	default:
> > >  		/* shouldn't be called with unsupported options */
> > >  		WARN_ON(1);
> > > @@ -281,6 +286,12 @@ int br_boolopt_get(const struct net_bridge *br, enum br_boolopt_id opt)
> > >  		return br_opt_get(br, BROPT_NO_LL_LEARN);
> > >  	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
> > >  		return br_opt_get(br, BROPT_MCAST_VLAN_SNOOPING_ENABLED);
> > > +	case BR_BOOLOPT_FLOOD:
> > > +		return br_opt_get(br, BROPT_FLOOD);
> > > +	case BR_BOOLOPT_MCAST_FLOOD:
> > > +		return br_opt_get(br, BROPT_MCAST_FLOOD);
> > > +	case BR_BOOLOPT_BCAST_FLOOD:
> > > +		return br_opt_get(br, BROPT_BCAST_FLOOD);
> > >  	default:
> > >  		/* shouldn't be called with unsupported options */
> > >  		WARN_ON(1);
> > > @@ -325,6 +336,40 @@ void br_boolopt_multi_get(const struct net_bridge *br,
> > >  	bm->optmask = GENMASK((BR_BOOLOPT_MAX - 1), 0);
> > >  }
> > >  
> > > +int br_flood_toggle(struct net_bridge *br, enum br_boolopt_id opt,
> > > +		    bool on)
> > > +{
> > > +	struct switchdev_attr attr = {
> > > +		.orig_dev = br->dev,
> > > +		.id = SWITCHDEV_ATTR_ID_BRIDGE_FLOOD,
> > > +		.flags = SWITCHDEV_F_DEFER,
> > > +	};
> > > +	enum net_bridge_opts bropt;
> > > +	int ret;
> > > +
> > > +	switch (opt) {
> > > +	case BR_BOOLOPT_FLOOD:
> > > +		bropt = BROPT_FLOOD;
> > > +		break;
> > > +	case BR_BOOLOPT_MCAST_FLOOD:
> > > +		bropt = BROPT_MCAST_FLOOD;
> > > +		break;
> > > +	case BR_BOOLOPT_BCAST_FLOOD:
> > > +		bropt = BROPT_BCAST_FLOOD;
> > > +		break;
> > > +	default:
> > > +		WARN_ON(1);
> > > +		return -EINVAL;
> > > +	}
> > > +	br_opt_toggle(br, bropt, on);
> > > +
> > > +	attr.u.brport_flags.mask = BIT(bropt);
> > > +	attr.u.brport_flags.val = on << bropt;
> > > +	ret = switchdev_port_attr_set(br->dev, &attr, NULL);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  /* private bridge options, controlled by the kernel */
> > >  void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on)
> > >  {
> > > diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
> > > index 8d6bab244c4a..fafaef9d4b3a 100644
> > > --- a/net/bridge/br_device.c
> > > +++ b/net/bridge/br_device.c
> > > @@ -524,6 +524,9 @@ void br_dev_setup(struct net_device *dev)
> > >  	br->bridge_hello_time = br->hello_time = 2 * HZ;
> > >  	br->bridge_forward_delay = br->forward_delay = 15 * HZ;
> > >  	br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME;
> > > +	br_opt_toggle(br, BROPT_FLOOD, true);
> > > +	br_opt_toggle(br, BROPT_MCAST_FLOOD, true);
> > > +	br_opt_toggle(br, BROPT_BCAST_FLOOD, true);
> > >  	dev->max_mtu = ETH_MAX_MTU;
> > >  
> > >  	br_netfilter_rtable_init(br);
> > > diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> > > index e0c13fcc50ed..fcb0757bfdcc 100644
> > > --- a/net/bridge/br_input.c
> > > +++ b/net/bridge/br_input.c
> > > @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> > >  		/* by definition the broadcast is also a multicast address */
> > >  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
> > >  			pkt_type = BR_PKT_BROADCAST;
> > > -			local_rcv = true;
> > > +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
> > >  		} else {
> > >  			pkt_type = BR_PKT_MULTICAST;
> > > -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> > > -				goto drop;
> > > +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
> > > +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> > > +					goto drop;
> > >  		}
> > >  	}
> > >  
> > > @@ -155,9 +156,13 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> > >  			local_rcv = true;
> > >  			br->dev->stats.multicast++;
> > >  		}
> > > +		if (!br_opt_get(br, BROPT_MCAST_FLOOD))
> > > +			local_rcv = false;
> > >  		break;
> > >  	case BR_PKT_UNICAST:
> > >  		dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);
> > > +		if (!br_opt_get(br, BROPT_FLOOD))
> > > +			local_rcv = false;
> > >  		break;
> > >  	default:
> > >  		break;
> > > @@ -166,7 +171,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> > >  	if (dst) {
> > >  		unsigned long now = jiffies;
> > >  
> > > -		if (test_bit(BR_FDB_LOCAL, &dst->flags))
> > > +		if (test_bit(BR_FDB_LOCAL, &dst->flags) && local_rcv)

So this is the line that breaks local termination. Could you explain
the reasoning for this change? For unicast packets matching a local FDB
entry, local_rcv used to be irrelevant (and wasn't even set to true,
unless the bridge device was promiscuous).

> > >  			return br_pass_frame_up(skb);
> > >  
> > >  		if (now != dst->used)
> > > @@ -190,6 +195,16 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> > >  }
> > >  EXPORT_SYMBOL_GPL(br_handle_frame_finish);
> > >  
> > > +bool br_flood_enabled(const struct net_device *dev)
> > > +{
> > > +	struct net_bridge *br = netdev_priv(dev);
> > > +
> > > +	return !!(br_opt_get(br, BROPT_FLOOD) ||
> > > +		   br_opt_get(br, BROPT_MCAST_FLOOD) ||
> > > +		   br_opt_get(br, BROPT_BCAST_FLOOD));
> > > +}
> > > +EXPORT_SYMBOL_GPL(br_flood_enabled);
> > > +
> > >  static void __br_handle_local_finish(struct sk_buff *skb)
> > >  {
> > >  	struct net_bridge_port *p = br_port_get_rcu(skb->dev);
> > > diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> > > index 48bc61ebc211..cf88dce0b92b 100644
> > > --- a/net/bridge/br_private.h
> > > +++ b/net/bridge/br_private.h
> > > @@ -445,6 +445,9 @@ enum net_bridge_opts {
> > >  	BROPT_NO_LL_LEARN,
> > >  	BROPT_VLAN_BRIDGE_BINDING,
> > >  	BROPT_MCAST_VLAN_SNOOPING_ENABLED,
> > > +	BROPT_FLOOD,
> > > +	BROPT_MCAST_FLOOD,
> > > +	BROPT_BCAST_FLOOD,
> > >  };
> > >  
> > >  struct net_bridge {
> > > @@ -720,6 +723,7 @@ int br_boolopt_multi_toggle(struct net_bridge *br,
> > >  void br_boolopt_multi_get(const struct net_bridge *br,
> > >  			  struct br_boolopt_multi *bm);
> > >  void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on);
> > > +int br_flood_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on);
> > >  
> > >  /* br_device.c */
> > >  void br_dev_setup(struct net_device *dev);
> > 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17 11:57       ` Vladimir Oltean
@ 2022-03-17 11:59         ` Vladimir Oltean
  2022-03-17 12:08           ` Mattias Forsblad
  0 siblings, 1 reply; 23+ messages in thread
From: Vladimir Oltean @ 2022-03-17 11:59 UTC (permalink / raw)
  To: Mattias Forsblad
  Cc: Nikolay Aleksandrov, netdev, David S . Miller, Jakub Kicinski,
	Andrew Lunn, Florian Fainelli, Vivien Didelot, Roopa Prabhu,
	Tobias Waldekranz, Joachim Wiberg

On Thu, Mar 17, 2022 at 01:57:03PM +0200, Vladimir Oltean wrote:
> On Thu, Mar 17, 2022 at 01:15:45PM +0200, Vladimir Oltean wrote:
> > On Thu, Mar 17, 2022 at 12:11:40PM +0200, Nikolay Aleksandrov wrote:
> > > On 17/03/2022 08:50, Mattias Forsblad wrote:
> > > > This patch implements the bridge flood flags. There are three different
> > > > flags matching unicast, multicast and broadcast. When the corresponding
> > > > flag is cleared packets received on bridge ports will not be flooded
> > > > towards the bridge.
> > > > This makes is possible to only forward selected traffic between the
> > > > port members of the bridge.
> > > > 
> > > > Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
> > > > ---
> > > >  include/linux/if_bridge.h      |  6 +++++
> > > >  include/uapi/linux/if_bridge.h |  9 ++++++-
> > > >  net/bridge/br.c                | 45 ++++++++++++++++++++++++++++++++++
> > > >  net/bridge/br_device.c         |  3 +++
> > > >  net/bridge/br_input.c          | 23 ++++++++++++++---
> > > >  net/bridge/br_private.h        |  4 +++
> > > >  6 files changed, 85 insertions(+), 5 deletions(-)
> > > > 
> > > Please always CC bridge maintainers for bridge patches.
> > > I almost missed this one. I'll add my reply to Joachim's notes
> > > which are pretty spot on.
> > 
> > And DSA maintainers for DSA patches ;) I was aimlessly scrolling through
> > patchwork when I happened to notice these, and the series is already at v3.
> > 
> > As a matter of fact, I downloaded these patches from the mailing list
> > with the intention of giving them a spin on mv88e6xxx to see what
> > they're about, and to my surprise, this particular patch (I haven't even
> > reached the offloading part) breaks DHCP on my bridge, so it can no
> > longer get an IP address. I haven't toggled any bridge flag through
> > netlink, just booted the board with systemd-networkd. The same thing
> > happens with my LS1028A board. Further investigation to come, but this
> > isn't off to a good start, I'm afraid...
> > 
> > > 
> > > > diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> > > > index 3aae023a9353..fa8e000a6fb9 100644
> > > > --- a/include/linux/if_bridge.h
> > > > +++ b/include/linux/if_bridge.h
> > > > @@ -157,6 +157,7 @@ static inline int br_vlan_get_info_rcu(const struct net_device *dev, u16 vid,
> > > >  struct net_device *br_fdb_find_port(const struct net_device *br_dev,
> > > >  				    const unsigned char *addr,
> > > >  				    __u16 vid);
> > > > +bool br_flood_enabled(const struct net_device *dev);
> > > >  void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
> > > >  bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
> > > >  u8 br_port_get_stp_state(const struct net_device *dev);
> > > > @@ -170,6 +171,11 @@ br_fdb_find_port(const struct net_device *br_dev,
> > > >  	return NULL;
> > > >  }
> > > >  
> > > > +static inline bool br_flood_enabled(const struct net_device *dev)
> > > > +{
> > > > +	return true;
> > > > +}
> > > > +
> > > >  static inline void br_fdb_clear_offload(const struct net_device *dev, u16 vid)
> > > >  {
> > > >  }
> > > > diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
> > > > index 2711c3522010..765ed70c9b28 100644
> > > > --- a/include/uapi/linux/if_bridge.h
> > > > +++ b/include/uapi/linux/if_bridge.h
> > > > @@ -72,6 +72,7 @@ struct __bridge_info {
> > > >  	__u32 tcn_timer_value;
> > > >  	__u32 topology_change_timer_value;
> > > >  	__u32 gc_timer_value;
> > > > +	__u8 flood;
> > > >  };
> > > >  
> > > >  struct __port_info {
> > > > @@ -752,13 +753,19 @@ struct br_mcast_stats {
> > > >  /* bridge boolean options
> > > >   * BR_BOOLOPT_NO_LL_LEARN - disable learning from link-local packets
> > > >   * BR_BOOLOPT_MCAST_VLAN_SNOOPING - control vlan multicast snooping
> > > > + * BR_BOOLOPT_FLOOD - control bridge flood flag
> > > > + * BR_BOOLOPT_MCAST_FLOOD - control bridge multicast flood flag
> > > > + * BR_BOOLOPT_BCAST_FLOOD - control bridge broadcast flood flag
> > > >   *
> > > >   * IMPORTANT: if adding a new option do not forget to handle
> > > > - *            it in br_boolopt_toggle/get and bridge sysfs
> > > > + *            it in br_boolopt_toggle/get
> > > >   */
> > > >  enum br_boolopt_id {
> > > >  	BR_BOOLOPT_NO_LL_LEARN,
> > > >  	BR_BOOLOPT_MCAST_VLAN_SNOOPING,
> > > > +	BR_BOOLOPT_FLOOD,
> > > > +	BR_BOOLOPT_MCAST_FLOOD,
> > > > +	BR_BOOLOPT_BCAST_FLOOD,
> > > >  	BR_BOOLOPT_MAX
> > > >  };
> > > >  
> > > > diff --git a/net/bridge/br.c b/net/bridge/br.c
> > > > index b1dea3febeea..63a17bed6c63 100644
> > > > --- a/net/bridge/br.c
> > > > +++ b/net/bridge/br.c
> > > > @@ -265,6 +265,11 @@ int br_boolopt_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on,
> > > >  	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
> > > >  		err = br_multicast_toggle_vlan_snooping(br, on, extack);
> > > >  		break;
> > > > +	case BR_BOOLOPT_FLOOD:
> > > > +	case BR_BOOLOPT_MCAST_FLOOD:
> > > > +	case BR_BOOLOPT_BCAST_FLOOD:
> > > > +		err = br_flood_toggle(br, opt, on);
> > > > +		break;
> > > >  	default:
> > > >  		/* shouldn't be called with unsupported options */
> > > >  		WARN_ON(1);
> > > > @@ -281,6 +286,12 @@ int br_boolopt_get(const struct net_bridge *br, enum br_boolopt_id opt)
> > > >  		return br_opt_get(br, BROPT_NO_LL_LEARN);
> > > >  	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
> > > >  		return br_opt_get(br, BROPT_MCAST_VLAN_SNOOPING_ENABLED);
> > > > +	case BR_BOOLOPT_FLOOD:
> > > > +		return br_opt_get(br, BROPT_FLOOD);
> > > > +	case BR_BOOLOPT_MCAST_FLOOD:
> > > > +		return br_opt_get(br, BROPT_MCAST_FLOOD);
> > > > +	case BR_BOOLOPT_BCAST_FLOOD:
> > > > +		return br_opt_get(br, BROPT_BCAST_FLOOD);
> > > >  	default:
> > > >  		/* shouldn't be called with unsupported options */
> > > >  		WARN_ON(1);
> > > > @@ -325,6 +336,40 @@ void br_boolopt_multi_get(const struct net_bridge *br,
> > > >  	bm->optmask = GENMASK((BR_BOOLOPT_MAX - 1), 0);
> > > >  }
> > > >  
> > > > +int br_flood_toggle(struct net_bridge *br, enum br_boolopt_id opt,
> > > > +		    bool on)
> > > > +{
> > > > +	struct switchdev_attr attr = {
> > > > +		.orig_dev = br->dev,
> > > > +		.id = SWITCHDEV_ATTR_ID_BRIDGE_FLOOD,
> > > > +		.flags = SWITCHDEV_F_DEFER,
> > > > +	};
> > > > +	enum net_bridge_opts bropt;
> > > > +	int ret;
> > > > +
> > > > +	switch (opt) {
> > > > +	case BR_BOOLOPT_FLOOD:
> > > > +		bropt = BROPT_FLOOD;
> > > > +		break;
> > > > +	case BR_BOOLOPT_MCAST_FLOOD:
> > > > +		bropt = BROPT_MCAST_FLOOD;
> > > > +		break;
> > > > +	case BR_BOOLOPT_BCAST_FLOOD:
> > > > +		bropt = BROPT_BCAST_FLOOD;
> > > > +		break;
> > > > +	default:
> > > > +		WARN_ON(1);
> > > > +		return -EINVAL;
> > > > +	}
> > > > +	br_opt_toggle(br, bropt, on);
> > > > +
> > > > +	attr.u.brport_flags.mask = BIT(bropt);
> > > > +	attr.u.brport_flags.val = on << bropt;
> > > > +	ret = switchdev_port_attr_set(br->dev, &attr, NULL);
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +
> > > >  /* private bridge options, controlled by the kernel */
> > > >  void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on)
> > > >  {
> > > > diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
> > > > index 8d6bab244c4a..fafaef9d4b3a 100644
> > > > --- a/net/bridge/br_device.c
> > > > +++ b/net/bridge/br_device.c
> > > > @@ -524,6 +524,9 @@ void br_dev_setup(struct net_device *dev)
> > > >  	br->bridge_hello_time = br->hello_time = 2 * HZ;
> > > >  	br->bridge_forward_delay = br->forward_delay = 15 * HZ;
> > > >  	br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME;
> > > > +	br_opt_toggle(br, BROPT_FLOOD, true);
> > > > +	br_opt_toggle(br, BROPT_MCAST_FLOOD, true);
> > > > +	br_opt_toggle(br, BROPT_BCAST_FLOOD, true);
> > > >  	dev->max_mtu = ETH_MAX_MTU;
> > > >  
> > > >  	br_netfilter_rtable_init(br);
> > > > diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> > > > index e0c13fcc50ed..fcb0757bfdcc 100644
> > > > --- a/net/bridge/br_input.c
> > > > +++ b/net/bridge/br_input.c
> > > > @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> > > >  		/* by definition the broadcast is also a multicast address */
> > > >  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
> > > >  			pkt_type = BR_PKT_BROADCAST;
> > > > -			local_rcv = true;
> > > > +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
> > > >  		} else {
> > > >  			pkt_type = BR_PKT_MULTICAST;
> > > > -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> > > > -				goto drop;
> > > > +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
> > > > +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> > > > +					goto drop;
> > > >  		}
> > > >  	}
> > > >  
> > > > @@ -155,9 +156,13 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> > > >  			local_rcv = true;
> > > >  			br->dev->stats.multicast++;
> > > >  		}
> > > > +		if (!br_opt_get(br, BROPT_MCAST_FLOOD))
> > > > +			local_rcv = false;
> > > >  		break;
> > > >  	case BR_PKT_UNICAST:
> > > >  		dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);
> > > > +		if (!br_opt_get(br, BROPT_FLOOD))
> > > > +			local_rcv = false;
> > > >  		break;
> > > >  	default:
> > > >  		break;
> > > > @@ -166,7 +171,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> > > >  	if (dst) {
> > > >  		unsigned long now = jiffies;
> > > >  
> > > > -		if (test_bit(BR_FDB_LOCAL, &dst->flags))
> > > > +		if (test_bit(BR_FDB_LOCAL, &dst->flags) && local_rcv)
> 
> So this is the line that breaks local termination. Could you explain
> the reasoning for this change? For unicast packets matching a local FDB
> entry, local_rcv used to be irrelevant (and wasn't even set to true,
> unless the bridge device was promiscuous).

Sorry, it wasn't obvious from the To: field, but the question was
targeted to Mattias and not to Nikolay.

> > > >  			return br_pass_frame_up(skb);
> > > >  
> > > >  		if (now != dst->used)
> > > > @@ -190,6 +195,16 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(br_handle_frame_finish);
> > > >  
> > > > +bool br_flood_enabled(const struct net_device *dev)
> > > > +{
> > > > +	struct net_bridge *br = netdev_priv(dev);
> > > > +
> > > > +	return !!(br_opt_get(br, BROPT_FLOOD) ||
> > > > +		   br_opt_get(br, BROPT_MCAST_FLOOD) ||
> > > > +		   br_opt_get(br, BROPT_BCAST_FLOOD));
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(br_flood_enabled);
> > > > +
> > > >  static void __br_handle_local_finish(struct sk_buff *skb)
> > > >  {
> > > >  	struct net_bridge_port *p = br_port_get_rcu(skb->dev);
> > > > diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> > > > index 48bc61ebc211..cf88dce0b92b 100644
> > > > --- a/net/bridge/br_private.h
> > > > +++ b/net/bridge/br_private.h
> > > > @@ -445,6 +445,9 @@ enum net_bridge_opts {
> > > >  	BROPT_NO_LL_LEARN,
> > > >  	BROPT_VLAN_BRIDGE_BINDING,
> > > >  	BROPT_MCAST_VLAN_SNOOPING_ENABLED,
> > > > +	BROPT_FLOOD,
> > > > +	BROPT_MCAST_FLOOD,
> > > > +	BROPT_BCAST_FLOOD,
> > > >  };
> > > >  
> > > >  struct net_bridge {
> > > > @@ -720,6 +723,7 @@ int br_boolopt_multi_toggle(struct net_bridge *br,
> > > >  void br_boolopt_multi_get(const struct net_bridge *br,
> > > >  			  struct br_boolopt_multi *bm);
> > > >  void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on);
> > > > +int br_flood_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on);
> > > >  
> > > >  /* br_device.c */
> > > >  void br_dev_setup(struct net_device *dev);
> > > 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17 11:59         ` Vladimir Oltean
@ 2022-03-17 12:08           ` Mattias Forsblad
  0 siblings, 0 replies; 23+ messages in thread
From: Mattias Forsblad @ 2022-03-17 12:08 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Nikolay Aleksandrov, netdev, David S . Miller, Jakub Kicinski,
	Andrew Lunn, Florian Fainelli, Vivien Didelot, Roopa Prabhu,
	Tobias Waldekranz, Joachim Wiberg

On 2022-03-17 12:59, Vladimir Oltean wrote:
> On Thu, Mar 17, 2022 at 01:57:03PM +0200, Vladimir Oltean wrote:
>> On Thu, Mar 17, 2022 at 01:15:45PM +0200, Vladimir Oltean wrote:
>>> On Thu, Mar 17, 2022 at 12:11:40PM +0200, Nikolay Aleksandrov wrote:
>>>> On 17/03/2022 08:50, Mattias Forsblad wrote:
>>>>> This patch implements the bridge flood flags. There are three different
>>>>> flags matching unicast, multicast and broadcast. When the corresponding
>>>>> flag is cleared packets received on bridge ports will not be flooded
>>>>> towards the bridge.
>>>>> This makes is possible to only forward selected traffic between the
>>>>> port members of the bridge.
>>>>>
>>>>> Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
>>>>> ---
>>>>>  include/linux/if_bridge.h      |  6 +++++
>>>>>  include/uapi/linux/if_bridge.h |  9 ++++++-
>>>>>  net/bridge/br.c                | 45 ++++++++++++++++++++++++++++++++++
>>>>>  net/bridge/br_device.c         |  3 +++
>>>>>  net/bridge/br_input.c          | 23 ++++++++++++++---
>>>>>  net/bridge/br_private.h        |  4 +++
>>>>>  6 files changed, 85 insertions(+), 5 deletions(-)
>>>>>
>>>> Please always CC bridge maintainers for bridge patches.
>>>> I almost missed this one. I'll add my reply to Joachim's notes
>>>> which are pretty spot on.
>>>
>>> And DSA maintainers for DSA patches ;) I was aimlessly scrolling through
>>> patchwork when I happened to notice these, and the series is already at v3.
>>>
>>> As a matter of fact, I downloaded these patches from the mailing list
>>> with the intention of giving them a spin on mv88e6xxx to see what
>>> they're about, and to my surprise, this particular patch (I haven't even
>>> reached the offloading part) breaks DHCP on my bridge, so it can no
>>> longer get an IP address. I haven't toggled any bridge flag through
>>> netlink, just booted the board with systemd-networkd. The same thing
>>> happens with my LS1028A board. Further investigation to come, but this
>>> isn't off to a good start, I'm afraid...
>>>
>>>>
>>>>> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
>>>>> index 3aae023a9353..fa8e000a6fb9 100644
>>>>> --- a/include/linux/if_bridge.h
>>>>> +++ b/include/linux/if_bridge.h
>>>>> @@ -157,6 +157,7 @@ static inline int br_vlan_get_info_rcu(const struct net_device *dev, u16 vid,
>>>>>  struct net_device *br_fdb_find_port(const struct net_device *br_dev,
>>>>>  				    const unsigned char *addr,
>>>>>  				    __u16 vid);
>>>>> +bool br_flood_enabled(const struct net_device *dev);
>>>>>  void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
>>>>>  bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
>>>>>  u8 br_port_get_stp_state(const struct net_device *dev);
>>>>> @@ -170,6 +171,11 @@ br_fdb_find_port(const struct net_device *br_dev,
>>>>>  	return NULL;
>>>>>  }
>>>>>  
>>>>> +static inline bool br_flood_enabled(const struct net_device *dev)
>>>>> +{
>>>>> +	return true;
>>>>> +}
>>>>> +
>>>>>  static inline void br_fdb_clear_offload(const struct net_device *dev, u16 vid)
>>>>>  {
>>>>>  }
>>>>> diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
>>>>> index 2711c3522010..765ed70c9b28 100644
>>>>> --- a/include/uapi/linux/if_bridge.h
>>>>> +++ b/include/uapi/linux/if_bridge.h
>>>>> @@ -72,6 +72,7 @@ struct __bridge_info {
>>>>>  	__u32 tcn_timer_value;
>>>>>  	__u32 topology_change_timer_value;
>>>>>  	__u32 gc_timer_value;
>>>>> +	__u8 flood;
>>>>>  };
>>>>>  
>>>>>  struct __port_info {
>>>>> @@ -752,13 +753,19 @@ struct br_mcast_stats {
>>>>>  /* bridge boolean options
>>>>>   * BR_BOOLOPT_NO_LL_LEARN - disable learning from link-local packets
>>>>>   * BR_BOOLOPT_MCAST_VLAN_SNOOPING - control vlan multicast snooping
>>>>> + * BR_BOOLOPT_FLOOD - control bridge flood flag
>>>>> + * BR_BOOLOPT_MCAST_FLOOD - control bridge multicast flood flag
>>>>> + * BR_BOOLOPT_BCAST_FLOOD - control bridge broadcast flood flag
>>>>>   *
>>>>>   * IMPORTANT: if adding a new option do not forget to handle
>>>>> - *            it in br_boolopt_toggle/get and bridge sysfs
>>>>> + *            it in br_boolopt_toggle/get
>>>>>   */
>>>>>  enum br_boolopt_id {
>>>>>  	BR_BOOLOPT_NO_LL_LEARN,
>>>>>  	BR_BOOLOPT_MCAST_VLAN_SNOOPING,
>>>>> +	BR_BOOLOPT_FLOOD,
>>>>> +	BR_BOOLOPT_MCAST_FLOOD,
>>>>> +	BR_BOOLOPT_BCAST_FLOOD,
>>>>>  	BR_BOOLOPT_MAX
>>>>>  };
>>>>>  
>>>>> diff --git a/net/bridge/br.c b/net/bridge/br.c
>>>>> index b1dea3febeea..63a17bed6c63 100644
>>>>> --- a/net/bridge/br.c
>>>>> +++ b/net/bridge/br.c
>>>>> @@ -265,6 +265,11 @@ int br_boolopt_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on,
>>>>>  	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
>>>>>  		err = br_multicast_toggle_vlan_snooping(br, on, extack);
>>>>>  		break;
>>>>> +	case BR_BOOLOPT_FLOOD:
>>>>> +	case BR_BOOLOPT_MCAST_FLOOD:
>>>>> +	case BR_BOOLOPT_BCAST_FLOOD:
>>>>> +		err = br_flood_toggle(br, opt, on);
>>>>> +		break;
>>>>>  	default:
>>>>>  		/* shouldn't be called with unsupported options */
>>>>>  		WARN_ON(1);
>>>>> @@ -281,6 +286,12 @@ int br_boolopt_get(const struct net_bridge *br, enum br_boolopt_id opt)
>>>>>  		return br_opt_get(br, BROPT_NO_LL_LEARN);
>>>>>  	case BR_BOOLOPT_MCAST_VLAN_SNOOPING:
>>>>>  		return br_opt_get(br, BROPT_MCAST_VLAN_SNOOPING_ENABLED);
>>>>> +	case BR_BOOLOPT_FLOOD:
>>>>> +		return br_opt_get(br, BROPT_FLOOD);
>>>>> +	case BR_BOOLOPT_MCAST_FLOOD:
>>>>> +		return br_opt_get(br, BROPT_MCAST_FLOOD);
>>>>> +	case BR_BOOLOPT_BCAST_FLOOD:
>>>>> +		return br_opt_get(br, BROPT_BCAST_FLOOD);
>>>>>  	default:
>>>>>  		/* shouldn't be called with unsupported options */
>>>>>  		WARN_ON(1);
>>>>> @@ -325,6 +336,40 @@ void br_boolopt_multi_get(const struct net_bridge *br,
>>>>>  	bm->optmask = GENMASK((BR_BOOLOPT_MAX - 1), 0);
>>>>>  }
>>>>>  
>>>>> +int br_flood_toggle(struct net_bridge *br, enum br_boolopt_id opt,
>>>>> +		    bool on)
>>>>> +{
>>>>> +	struct switchdev_attr attr = {
>>>>> +		.orig_dev = br->dev,
>>>>> +		.id = SWITCHDEV_ATTR_ID_BRIDGE_FLOOD,
>>>>> +		.flags = SWITCHDEV_F_DEFER,
>>>>> +	};
>>>>> +	enum net_bridge_opts bropt;
>>>>> +	int ret;
>>>>> +
>>>>> +	switch (opt) {
>>>>> +	case BR_BOOLOPT_FLOOD:
>>>>> +		bropt = BROPT_FLOOD;
>>>>> +		break;
>>>>> +	case BR_BOOLOPT_MCAST_FLOOD:
>>>>> +		bropt = BROPT_MCAST_FLOOD;
>>>>> +		break;
>>>>> +	case BR_BOOLOPT_BCAST_FLOOD:
>>>>> +		bropt = BROPT_BCAST_FLOOD;
>>>>> +		break;
>>>>> +	default:
>>>>> +		WARN_ON(1);
>>>>> +		return -EINVAL;
>>>>> +	}
>>>>> +	br_opt_toggle(br, bropt, on);
>>>>> +
>>>>> +	attr.u.brport_flags.mask = BIT(bropt);
>>>>> +	attr.u.brport_flags.val = on << bropt;
>>>>> +	ret = switchdev_port_attr_set(br->dev, &attr, NULL);
>>>>> +
>>>>> +	return ret;
>>>>> +}
>>>>> +
>>>>>  /* private bridge options, controlled by the kernel */
>>>>>  void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on)
>>>>>  {
>>>>> diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
>>>>> index 8d6bab244c4a..fafaef9d4b3a 100644
>>>>> --- a/net/bridge/br_device.c
>>>>> +++ b/net/bridge/br_device.c
>>>>> @@ -524,6 +524,9 @@ void br_dev_setup(struct net_device *dev)
>>>>>  	br->bridge_hello_time = br->hello_time = 2 * HZ;
>>>>>  	br->bridge_forward_delay = br->forward_delay = 15 * HZ;
>>>>>  	br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME;
>>>>> +	br_opt_toggle(br, BROPT_FLOOD, true);
>>>>> +	br_opt_toggle(br, BROPT_MCAST_FLOOD, true);
>>>>> +	br_opt_toggle(br, BROPT_BCAST_FLOOD, true);
>>>>>  	dev->max_mtu = ETH_MAX_MTU;
>>>>>  
>>>>>  	br_netfilter_rtable_init(br);
>>>>> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
>>>>> index e0c13fcc50ed..fcb0757bfdcc 100644
>>>>> --- a/net/bridge/br_input.c
>>>>> +++ b/net/bridge/br_input.c
>>>>> @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>>>>>  		/* by definition the broadcast is also a multicast address */
>>>>>  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
>>>>>  			pkt_type = BR_PKT_BROADCAST;
>>>>> -			local_rcv = true;
>>>>> +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
>>>>>  		} else {
>>>>>  			pkt_type = BR_PKT_MULTICAST;
>>>>> -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>>>>> -				goto drop;
>>>>> +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
>>>>> +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>>>>> +					goto drop;
>>>>>  		}
>>>>>  	}
>>>>>  
>>>>> @@ -155,9 +156,13 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>>>>>  			local_rcv = true;
>>>>>  			br->dev->stats.multicast++;
>>>>>  		}
>>>>> +		if (!br_opt_get(br, BROPT_MCAST_FLOOD))
>>>>> +			local_rcv = false;
>>>>>  		break;
>>>>>  	case BR_PKT_UNICAST:
>>>>>  		dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);
>>>>> +		if (!br_opt_get(br, BROPT_FLOOD))
>>>>> +			local_rcv = false;
>>>>>  		break;
>>>>>  	default:
>>>>>  		break;
>>>>> @@ -166,7 +171,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>>>>>  	if (dst) {
>>>>>  		unsigned long now = jiffies;
>>>>>  
>>>>> -		if (test_bit(BR_FDB_LOCAL, &dst->flags))
>>>>> +		if (test_bit(BR_FDB_LOCAL, &dst->flags) && local_rcv)
>>
>> So this is the line that breaks local termination. Could you explain
>> the reasoning for this change? For unicast packets matching a local FDB
>> entry, local_rcv used to be irrelevant (and wasn't even set to true,
>> unless the bridge device was promiscuous).
> 
> Sorry, it wasn't obvious from the To: field, but the question was
> targeted to Mattias and not to Nikolay.
> 

It might as simple be that I've misunderstood that flow.

/Mattias

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17 11:42       ` Nikolay Aleksandrov
@ 2022-03-17 12:26         ` Ido Schimmel
  2022-03-17 13:34           ` Tobias Waldekranz
  0 siblings, 1 reply; 23+ messages in thread
From: Ido Schimmel @ 2022-03-17 12:26 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Mattias Forsblad, Joachim Wiberg, netdev, David S . Miller,
	Jakub Kicinski, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Roopa Prabhu, Tobias Waldekranz, Vladimir Oltean

On Thu, Mar 17, 2022 at 01:42:55PM +0200, Nikolay Aleksandrov wrote:
> On 17/03/2022 13:39, Mattias Forsblad wrote:
> > On 2022-03-17 10:07, Joachim Wiberg wrote:
> >> On Thu, Mar 17, 2022 at 07:50, Mattias Forsblad <mattias.forsblad@gmail.com> wrote:
> >>> This patch implements the bridge flood flags. There are three different
> >>> flags matching unicast, multicast and broadcast. When the corresponding
> >>> flag is cleared packets received on bridge ports will not be flooded
> >>> towards the bridge.
> >>
> >> If I've not completely misunderstood things, I believe the flood and
> >> mcast_flood flags operate on unknown unicast and multicast.  With that
> >> in mind I think the hot path in br_input.c needs a bit more eyes.  I'll
> >> add my own comments below.
> >>
> >> Happy incident I saw this patch set, I have a very similar one for these
> >> flags to the bridge itself, with the intent to improve handling of all
> >> classes of multicast to/from the bridge itself.
> >>
> >>> [snip]
> >>> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> >>> index e0c13fcc50ed..fcb0757bfdcc 100644
> >>> --- a/net/bridge/br_input.c
> >>> +++ b/net/bridge/br_input.c
> >>> @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> >>>  		/* by definition the broadcast is also a multicast address */
> >>>  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
> >>>  			pkt_type = BR_PKT_BROADCAST;
> >>> -			local_rcv = true;
> >>> +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
> >>
> >> Minor comment, I believe the preferred style is more like this:
> >>
> >> 	if (br_opt_get(br, BROPT_BCAST_FLOOD))
> >>         	local_rcv = true;
> >>
> >>>  		} else {
> >>>  			pkt_type = BR_PKT_MULTICAST;
> >>> -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> >>> -				goto drop;
> >>> +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
> >>> +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> >>> +					goto drop;
> >>
> >> Since the BROPT_MCAST_FLOOD flag should only control uknown multicast,
> >> we cannot bypass the call to br_multicast_rcv(), which helps with the
> >> classifcation.  E.g., we want IGMP/MLD reports to be forwarded to all
> >> router ports, while the mdb lookup (below) is what an tell us if we
> >> have uknown multicast and there we can check the BROPT_MCAST_FLOOD
> >> flag for the bridge itself.
> > 
> > The original flag was name was local_receive to separate it from being
> > mistaken for the flood unknown flags. However the comment I've got was
> > to align it with the existing (port) flags. These flags have nothing to do with
> > the port flood unknown flags. Imagine the setup below:
> > 
> >            vlan1
> >              |
> >             br0             br1
> >            /   \           /   \
> >          swp1 swp2       swp3 swp4
> > 
> > We want to have swp1/2 as member of a normal vlan filtering bridge br0 /w learning on. 
> > On br1 we want to just forward packets between swp3/4 and disable learning. 
> > Additional we don't want this traffic to impact the CPU. 
> > If we disable learning on swp3/4 all traffic will be unknown and if we also 
> > have flood unknown on the CPU-port because of requirements for br0 it will
> > impact the traffic to br1. Thus we want to restrict traffic between swp3/4<->CPU port
> > with the help of the PVT.
> > 
> > /Mattias
> 
> The feedback was correct and we all assumed unknown traffic control.
> If you don't want any local receive then use filtering rules. Don't add unnecessary flags.

Yep. Very easy with tc:

# tc qdisc add dev br1 clsact
# tc filter add dev br1 ingress pref 1 proto all matchall action drop

This can be fully implemented inside the relevant device driver, no
changes needed in the bridge driver.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/5] mv88e6xxx: Offload the flood flag
  2022-03-17  6:50 ` [PATCH 4/5] mv88e6xxx: Offload the flood flag Mattias Forsblad
@ 2022-03-17 13:05   ` Vladimir Oltean
  2022-03-31  8:11     ` Tobias Waldekranz
  0 siblings, 1 reply; 23+ messages in thread
From: Vladimir Oltean @ 2022-03-17 13:05 UTC (permalink / raw)
  To: Mattias Forsblad
  Cc: netdev, David S . Miller, Jakub Kicinski, Andrew Lunn,
	Florian Fainelli, Vivien Didelot, Roopa Prabhu,
	Tobias Waldekranz, Nikolay Aleksandrov, Joachim Wiberg,
	Ido Schimmel, Allan W. Nielsen, UNGLinuxDriver

On Thu, Mar 17, 2022 at 07:50:30AM +0100, Mattias Forsblad wrote:
> Use the port vlan table to restrict ingressing traffic to the
> CPU port if the flood flags are cleared.
> 
> Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
> ---

There is a grave mismatch between what this patch says it does and what
it really does. (=> NACK)

Doing some interpolation from previous commit descriptions, the
intention is to disable flooding from a given port towards the CPU
(which, I mean, is fair enough as a goal).

But:
(a) mv88e6xxx_port_vlan() disables _forwarding_ from port A to port B.
    So this affects not only unknown traffic (the one which is flooded),
    but all traffic
(b) even if br_flood_enabled() is false (meaning that the bridge device
    doesn't want to locally process flooded packets), there is no
    equality sign between this and disabling flooding on the CPU port.
    If the DSA switch is bridged with a foreign (non-DSA) interface, be
    it a tap, a Wi-Fi AP, or a plain Ethernet port, then from the
    switch's perspective, this is no different from a local termination
    flow (packets need to be forwarded to the CPU). Yet from the
    bridge's perspective, it is a forwarding and not a termination flow.
    So you can't _just_ disable CPU flooding/forwarding when the bridge
    doesn't want to locally terminate traffic.

Regarding (b), I've CC'ed Allan Nielsen who held this presentation a few
years ago, and some ideas were able to be materialized in the meantime:
https://www.youtube.com/watch?v=B1HhxEcU7Jg

Regarding (a), have you seen the new dsa_port_manage_cpu_flood() from
the DSA unicast filtering patch series?
https://patchwork.kernel.org/project/netdevbpf/patch/20220302191417.1288145-6-vladimir.oltean@nxp.com/
It is incomplete work in the sense that

(1) it disables CPU flooding only if there isn't any port with IFF_PROMISC,
    but the bridge puts all ports in promiscuous mode. I think we can
    win that battle here, and convince bridge/switchdev maintainers to
    not put offloaded bridge ports (those that call switchdev_bridge_port_offload)
    in promiscuous mode, since it serves no purpose and that actively
    bothers us. At least the way DSA sees this is that unicast filtering
    and promiscuous mode deal with standalone mode. The forwarding plane
    is effectively a different address database and there is no direct
    equivalent to promiscuity there.

(2) Right now DSA calls ->port_bridge_flags() from dsa_port_manage_cpu_flood(),
    i.e. it treats CPU flooding as a purely per-port-egress setting.
    But once I manage to straighten some kinks in DSA's unicast
    filtering support for switches with ds->vlan_filtering_is_global (in
    other words, make sja1105 eligible for unicast filtering), I pretty
    much plan to change this by making DSA ask the driver to manage CPU
    flooding per user port - leaving this code path as just a fallback.

As baroque as I consider the sja1105 hardware to be, I'm surprised it
has a feature which mv88e6xxx doesn't seem to - which is having flood
controls per {ingress port, egress port} pair. So we'll have to
improvise here.

Could you tell me - ok, you remove the CPU port from the port VLAN map -
but if you install host FDB entries as ACL entries (so as to make the
switch generate a TO_CPU packet instead of a FORWARD packet), doesn't
the switch in fact send packets to the CPU even in lack of the CPU
port's membership in the port VLAN table for the bridge port?

If I'm right and it does, then I do see a path forward for this, with
zero user space additions, and working by default. We make the bridge
stop uselessly making offloaded DSA bridge ports promiscuous, then we
make DSA manage CPU flooding by itself - taking promiscuity into account
but also foreign interfaces joining/leaving. Then we make host addresses
be delivered by mv88e6xxx to the CPU as trapped and not forwarded, then
from new the DSA ->port_set_cpu_flood() callback we remove the CPU port
from the port VLAN table.

What do you think?

>  drivers/net/dsa/mv88e6xxx/chip.c | 45 ++++++++++++++++++++++++++++++--
>  1 file changed, 43 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 84b90fc36c58..39347a05c3a5 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -1384,6 +1384,7 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
>  	struct dsa_switch *ds = chip->ds;
>  	struct dsa_switch_tree *dst = ds->dst;
>  	struct dsa_port *dp, *other_dp;
> +	bool flood = true;
>  	bool found = false;
>  	u16 pvlan;
>  
> @@ -1425,6 +1426,9 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
>  
>  	pvlan = 0;
>  
> +	if (dp->bridge)
> +		flood = br_flood_enabled(dp->bridge->dev);
> +
>  	/* Frames from standalone user ports can only egress on the
>  	 * upstream port.
>  	 */
> @@ -1433,10 +1437,11 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
>  
>  	/* Frames from bridged user ports can egress any local DSA
>  	 * links and CPU ports, as well as any local member of their
> -	 * bridge group.
> +	 * as well as any local member of their bridge group. However, CPU ports
> +	 * are omitted if flood is cleared.
>  	 */
>  	dsa_switch_for_each_port(other_dp, ds)
> -		if (other_dp->type == DSA_PORT_TYPE_CPU ||
> +		if ((other_dp->type == DSA_PORT_TYPE_CPU && flood) ||
>  		    other_dp->type == DSA_PORT_TYPE_DSA ||
>  		    dsa_port_bridge_same(dp, other_dp))
>  			pvlan |= BIT(other_dp->index);
> @@ -2718,6 +2723,41 @@ static void mv88e6xxx_crosschip_bridge_leave(struct dsa_switch *ds,
>  	mv88e6xxx_reg_unlock(chip);
>  }
>  
> +static int mv88e6xxx_set_flood(struct dsa_switch *ds, int port, struct net_device *br,
> +			       unsigned long mask, unsigned long val)
> +{
> +	struct mv88e6xxx_chip *chip = ds->priv;
> +	struct dsa_bridge *bridge;
> +	struct dsa_port *dp;
> +	bool found = false;
> +	int err;
> +
> +	if (!netif_is_bridge_master(br))
> +		return 0;
> +
> +	list_for_each_entry(dp, &ds->dst->ports, list) {
> +		if (dp->ds == ds && dp->index == port) {
> +			found = true;
> +			break;
> +		}
> +	}
> +
> +	if (!found)
> +		return 0;
> +
> +	bridge = dp->bridge;
> +	if (!bridge)
> +		return 0;
> +
> +	mv88e6xxx_reg_lock(chip);
> +
> +	err = mv88e6xxx_bridge_map(chip, *bridge);
> +
> +	mv88e6xxx_reg_unlock(chip);
> +
> +	return err;
> +}
> +
>  static int mv88e6xxx_software_reset(struct mv88e6xxx_chip *chip)
>  {
>  	if (chip->info->ops->reset)
> @@ -6478,6 +6518,7 @@ static const struct dsa_switch_ops mv88e6xxx_switch_ops = {
>  	.set_eeprom		= mv88e6xxx_set_eeprom,
>  	.get_regs_len		= mv88e6xxx_get_regs_len,
>  	.get_regs		= mv88e6xxx_get_regs,
> +	.set_flood		= mv88e6xxx_set_flood,
>  	.get_rxnfc		= mv88e6xxx_get_rxnfc,
>  	.set_rxnfc		= mv88e6xxx_set_rxnfc,
>  	.set_ageing_time	= mv88e6xxx_set_ageing_time,
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17 12:26         ` Ido Schimmel
@ 2022-03-17 13:34           ` Tobias Waldekranz
  2022-03-17 14:10             ` Ido Schimmel
  0 siblings, 1 reply; 23+ messages in thread
From: Tobias Waldekranz @ 2022-03-17 13:34 UTC (permalink / raw)
  To: Ido Schimmel, Nikolay Aleksandrov
  Cc: Mattias Forsblad, Joachim Wiberg, netdev, David S . Miller,
	Jakub Kicinski, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Roopa Prabhu, Vladimir Oltean

On Thu, Mar 17, 2022 at 14:26, Ido Schimmel <idosch@idosch.org> wrote:
> On Thu, Mar 17, 2022 at 01:42:55PM +0200, Nikolay Aleksandrov wrote:
>> On 17/03/2022 13:39, Mattias Forsblad wrote:
>> > On 2022-03-17 10:07, Joachim Wiberg wrote:
>> >> On Thu, Mar 17, 2022 at 07:50, Mattias Forsblad <mattias.forsblad@gmail.com> wrote:
>> >>> This patch implements the bridge flood flags. There are three different
>> >>> flags matching unicast, multicast and broadcast. When the corresponding
>> >>> flag is cleared packets received on bridge ports will not be flooded
>> >>> towards the bridge.
>> >>
>> >> If I've not completely misunderstood things, I believe the flood and
>> >> mcast_flood flags operate on unknown unicast and multicast.  With that
>> >> in mind I think the hot path in br_input.c needs a bit more eyes.  I'll
>> >> add my own comments below.
>> >>
>> >> Happy incident I saw this patch set, I have a very similar one for these
>> >> flags to the bridge itself, with the intent to improve handling of all
>> >> classes of multicast to/from the bridge itself.
>> >>
>> >>> [snip]
>> >>> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
>> >>> index e0c13fcc50ed..fcb0757bfdcc 100644
>> >>> --- a/net/bridge/br_input.c
>> >>> +++ b/net/bridge/br_input.c
>> >>> @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>> >>>  		/* by definition the broadcast is also a multicast address */
>> >>>  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
>> >>>  			pkt_type = BR_PKT_BROADCAST;
>> >>> -			local_rcv = true;
>> >>> +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
>> >>
>> >> Minor comment, I believe the preferred style is more like this:
>> >>
>> >> 	if (br_opt_get(br, BROPT_BCAST_FLOOD))
>> >>         	local_rcv = true;
>> >>
>> >>>  		} else {
>> >>>  			pkt_type = BR_PKT_MULTICAST;
>> >>> -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>> >>> -				goto drop;
>> >>> +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
>> >>> +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>> >>> +					goto drop;
>> >>
>> >> Since the BROPT_MCAST_FLOOD flag should only control uknown multicast,
>> >> we cannot bypass the call to br_multicast_rcv(), which helps with the
>> >> classifcation.  E.g., we want IGMP/MLD reports to be forwarded to all
>> >> router ports, while the mdb lookup (below) is what an tell us if we
>> >> have uknown multicast and there we can check the BROPT_MCAST_FLOOD
>> >> flag for the bridge itself.
>> > 
>> > The original flag was name was local_receive to separate it from being
>> > mistaken for the flood unknown flags. However the comment I've got was
>> > to align it with the existing (port) flags. These flags have nothing to do with
>> > the port flood unknown flags. Imagine the setup below:
>> > 
>> >            vlan1
>> >              |
>> >             br0             br1
>> >            /   \           /   \
>> >          swp1 swp2       swp3 swp4
>> > 
>> > We want to have swp1/2 as member of a normal vlan filtering bridge br0 /w learning on. 
>> > On br1 we want to just forward packets between swp3/4 and disable learning. 
>> > Additional we don't want this traffic to impact the CPU. 
>> > If we disable learning on swp3/4 all traffic will be unknown and if we also 
>> > have flood unknown on the CPU-port because of requirements for br0 it will
>> > impact the traffic to br1. Thus we want to restrict traffic between swp3/4<->CPU port
>> > with the help of the PVT.
>> > 
>> > /Mattias
>> 
>> The feedback was correct and we all assumed unknown traffic control.
>> If you don't want any local receive then use filtering rules. Don't add unnecessary flags.
>
> Yep. Very easy with tc:
>
> # tc qdisc add dev br1 clsact
> # tc filter add dev br1 ingress pref 1 proto all matchall action drop
>
> This can be fully implemented inside the relevant device driver, no
> changes needed in the bridge driver.

Interesting. Are you saying that a switchdev driver can offload rules
for a netdev that it does not directly control (e.g. bridge that it is
connected to)? It thought that you had to bind the relevant ports to the
same block to approximate that. If so, are any drivers implementing
this? I did a quick scan of mlxsw, but could not find anything obvious.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17 13:34           ` Tobias Waldekranz
@ 2022-03-17 14:10             ` Ido Schimmel
  2022-03-17 16:02               ` Tobias Waldekranz
  0 siblings, 1 reply; 23+ messages in thread
From: Ido Schimmel @ 2022-03-17 14:10 UTC (permalink / raw)
  To: Tobias Waldekranz
  Cc: Nikolay Aleksandrov, Mattias Forsblad, Joachim Wiberg, netdev,
	David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Vladimir Oltean

On Thu, Mar 17, 2022 at 02:34:38PM +0100, Tobias Waldekranz wrote:
> On Thu, Mar 17, 2022 at 14:26, Ido Schimmel <idosch@idosch.org> wrote:
> > On Thu, Mar 17, 2022 at 01:42:55PM +0200, Nikolay Aleksandrov wrote:
> >> On 17/03/2022 13:39, Mattias Forsblad wrote:
> >> > On 2022-03-17 10:07, Joachim Wiberg wrote:
> >> >> On Thu, Mar 17, 2022 at 07:50, Mattias Forsblad <mattias.forsblad@gmail.com> wrote:
> >> >>> This patch implements the bridge flood flags. There are three different
> >> >>> flags matching unicast, multicast and broadcast. When the corresponding
> >> >>> flag is cleared packets received on bridge ports will not be flooded
> >> >>> towards the bridge.
> >> >>
> >> >> If I've not completely misunderstood things, I believe the flood and
> >> >> mcast_flood flags operate on unknown unicast and multicast.  With that
> >> >> in mind I think the hot path in br_input.c needs a bit more eyes.  I'll
> >> >> add my own comments below.
> >> >>
> >> >> Happy incident I saw this patch set, I have a very similar one for these
> >> >> flags to the bridge itself, with the intent to improve handling of all
> >> >> classes of multicast to/from the bridge itself.
> >> >>
> >> >>> [snip]
> >> >>> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> >> >>> index e0c13fcc50ed..fcb0757bfdcc 100644
> >> >>> --- a/net/bridge/br_input.c
> >> >>> +++ b/net/bridge/br_input.c
> >> >>> @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
> >> >>>  		/* by definition the broadcast is also a multicast address */
> >> >>>  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
> >> >>>  			pkt_type = BR_PKT_BROADCAST;
> >> >>> -			local_rcv = true;
> >> >>> +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
> >> >>
> >> >> Minor comment, I believe the preferred style is more like this:
> >> >>
> >> >> 	if (br_opt_get(br, BROPT_BCAST_FLOOD))
> >> >>         	local_rcv = true;
> >> >>
> >> >>>  		} else {
> >> >>>  			pkt_type = BR_PKT_MULTICAST;
> >> >>> -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> >> >>> -				goto drop;
> >> >>> +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
> >> >>> +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
> >> >>> +					goto drop;
> >> >>
> >> >> Since the BROPT_MCAST_FLOOD flag should only control uknown multicast,
> >> >> we cannot bypass the call to br_multicast_rcv(), which helps with the
> >> >> classifcation.  E.g., we want IGMP/MLD reports to be forwarded to all
> >> >> router ports, while the mdb lookup (below) is what an tell us if we
> >> >> have uknown multicast and there we can check the BROPT_MCAST_FLOOD
> >> >> flag for the bridge itself.
> >> > 
> >> > The original flag was name was local_receive to separate it from being
> >> > mistaken for the flood unknown flags. However the comment I've got was
> >> > to align it with the existing (port) flags. These flags have nothing to do with
> >> > the port flood unknown flags. Imagine the setup below:
> >> > 
> >> >            vlan1
> >> >              |
> >> >             br0             br1
> >> >            /   \           /   \
> >> >          swp1 swp2       swp3 swp4
> >> > 
> >> > We want to have swp1/2 as member of a normal vlan filtering bridge br0 /w learning on. 
> >> > On br1 we want to just forward packets between swp3/4 and disable learning. 
> >> > Additional we don't want this traffic to impact the CPU. 
> >> > If we disable learning on swp3/4 all traffic will be unknown and if we also 
> >> > have flood unknown on the CPU-port because of requirements for br0 it will
> >> > impact the traffic to br1. Thus we want to restrict traffic between swp3/4<->CPU port
> >> > with the help of the PVT.
> >> > 
> >> > /Mattias
> >> 
> >> The feedback was correct and we all assumed unknown traffic control.
> >> If you don't want any local receive then use filtering rules. Don't add unnecessary flags.
> >
> > Yep. Very easy with tc:
> >
> > # tc qdisc add dev br1 clsact
> > # tc filter add dev br1 ingress pref 1 proto all matchall action drop
> >
> > This can be fully implemented inside the relevant device driver, no
> > changes needed in the bridge driver.
> 
> Interesting. Are you saying that a switchdev driver can offload rules
> for a netdev that it does not directly control (e.g. bridge that it is
> connected to)? It thought that you had to bind the relevant ports to the
> same block to approximate that. If so, are any drivers implementing
> this? I did a quick scan of mlxsw, but could not find anything obvious.

Yes, currently mlxsw only supports filters configured on physical
netdevs, but the HW can support more advanced configurations such as
filters on a bridge device (or a VLAN upper). These would be translated
to ACLs configured on the ingress/egress router interface (RIF) backing
the netdev. NIC drivers support much more advanced tc offloads due to
the prevalent usage of OVS in this space, so might be better to check
them instead.

TBH, I never looked too deeply into this, but assumed it's supported via
the indirect tc block infra. See commit 7f76fa36754b ("net: sched:
register callbacks for indirect tc block binds") for more details.

Even if I got it wrong, it would be beneficial to extend the tc offload
infra rather than patching the bridge driver to achieve a functionality
that is already supported in the SW data path via tc.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/5] net: bridge: Implement bridge flood flag
  2022-03-17 14:10             ` Ido Schimmel
@ 2022-03-17 16:02               ` Tobias Waldekranz
  0 siblings, 0 replies; 23+ messages in thread
From: Tobias Waldekranz @ 2022-03-17 16:02 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Nikolay Aleksandrov, Mattias Forsblad, Joachim Wiberg, netdev,
	David S . Miller, Jakub Kicinski, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Roopa Prabhu, Vladimir Oltean

On Thu, Mar 17, 2022 at 16:10, Ido Schimmel <idosch@idosch.org> wrote:
> On Thu, Mar 17, 2022 at 02:34:38PM +0100, Tobias Waldekranz wrote:
>> On Thu, Mar 17, 2022 at 14:26, Ido Schimmel <idosch@idosch.org> wrote:
>> > On Thu, Mar 17, 2022 at 01:42:55PM +0200, Nikolay Aleksandrov wrote:
>> >> On 17/03/2022 13:39, Mattias Forsblad wrote:
>> >> > On 2022-03-17 10:07, Joachim Wiberg wrote:
>> >> >> On Thu, Mar 17, 2022 at 07:50, Mattias Forsblad <mattias.forsblad@gmail.com> wrote:
>> >> >>> This patch implements the bridge flood flags. There are three different
>> >> >>> flags matching unicast, multicast and broadcast. When the corresponding
>> >> >>> flag is cleared packets received on bridge ports will not be flooded
>> >> >>> towards the bridge.
>> >> >>
>> >> >> If I've not completely misunderstood things, I believe the flood and
>> >> >> mcast_flood flags operate on unknown unicast and multicast.  With that
>> >> >> in mind I think the hot path in br_input.c needs a bit more eyes.  I'll
>> >> >> add my own comments below.
>> >> >>
>> >> >> Happy incident I saw this patch set, I have a very similar one for these
>> >> >> flags to the bridge itself, with the intent to improve handling of all
>> >> >> classes of multicast to/from the bridge itself.
>> >> >>
>> >> >>> [snip]
>> >> >>> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
>> >> >>> index e0c13fcc50ed..fcb0757bfdcc 100644
>> >> >>> --- a/net/bridge/br_input.c
>> >> >>> +++ b/net/bridge/br_input.c
>> >> >>> @@ -109,11 +109,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>> >> >>>  		/* by definition the broadcast is also a multicast address */
>> >> >>>  		if (is_broadcast_ether_addr(eth_hdr(skb)->h_dest)) {
>> >> >>>  			pkt_type = BR_PKT_BROADCAST;
>> >> >>> -			local_rcv = true;
>> >> >>> +			local_rcv = true && br_opt_get(br, BROPT_BCAST_FLOOD);
>> >> >>
>> >> >> Minor comment, I believe the preferred style is more like this:
>> >> >>
>> >> >> 	if (br_opt_get(br, BROPT_BCAST_FLOOD))
>> >> >>         	local_rcv = true;
>> >> >>
>> >> >>>  		} else {
>> >> >>>  			pkt_type = BR_PKT_MULTICAST;
>> >> >>> -			if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>> >> >>> -				goto drop;
>> >> >>> +			if (br_opt_get(br, BROPT_MCAST_FLOOD))
>> >> >>> +				if (br_multicast_rcv(&brmctx, &pmctx, vlan, skb, vid))
>> >> >>> +					goto drop;
>> >> >>
>> >> >> Since the BROPT_MCAST_FLOOD flag should only control uknown multicast,
>> >> >> we cannot bypass the call to br_multicast_rcv(), which helps with the
>> >> >> classifcation.  E.g., we want IGMP/MLD reports to be forwarded to all
>> >> >> router ports, while the mdb lookup (below) is what an tell us if we
>> >> >> have uknown multicast and there we can check the BROPT_MCAST_FLOOD
>> >> >> flag for the bridge itself.
>> >> > 
>> >> > The original flag was name was local_receive to separate it from being
>> >> > mistaken for the flood unknown flags. However the comment I've got was
>> >> > to align it with the existing (port) flags. These flags have nothing to do with
>> >> > the port flood unknown flags. Imagine the setup below:
>> >> > 
>> >> >            vlan1
>> >> >              |
>> >> >             br0             br1
>> >> >            /   \           /   \
>> >> >          swp1 swp2       swp3 swp4
>> >> > 
>> >> > We want to have swp1/2 as member of a normal vlan filtering bridge br0 /w learning on. 
>> >> > On br1 we want to just forward packets between swp3/4 and disable learning. 
>> >> > Additional we don't want this traffic to impact the CPU. 
>> >> > If we disable learning on swp3/4 all traffic will be unknown and if we also 
>> >> > have flood unknown on the CPU-port because of requirements for br0 it will
>> >> > impact the traffic to br1. Thus we want to restrict traffic between swp3/4<->CPU port
>> >> > with the help of the PVT.
>> >> > 
>> >> > /Mattias
>> >> 
>> >> The feedback was correct and we all assumed unknown traffic control.
>> >> If you don't want any local receive then use filtering rules. Don't add unnecessary flags.
>> >
>> > Yep. Very easy with tc:
>> >
>> > # tc qdisc add dev br1 clsact
>> > # tc filter add dev br1 ingress pref 1 proto all matchall action drop
>> >
>> > This can be fully implemented inside the relevant device driver, no
>> > changes needed in the bridge driver.
>> 
>> Interesting. Are you saying that a switchdev driver can offload rules
>> for a netdev that it does not directly control (e.g. bridge that it is
>> connected to)? It thought that you had to bind the relevant ports to the
>> same block to approximate that. If so, are any drivers implementing
>> this? I did a quick scan of mlxsw, but could not find anything obvious.
>
> Yes, currently mlxsw only supports filters configured on physical
> netdevs, but the HW can support more advanced configurations such as
> filters on a bridge device (or a VLAN upper). These would be translated
> to ACLs configured on the ingress/egress router interface (RIF) backing
> the netdev. NIC drivers support much more advanced tc offloads due to
> the prevalent usage of OVS in this space, so might be better to check
> them instead.
>
> TBH, I never looked too deeply into this, but assumed it's supported via
> the indirect tc block infra. See commit 7f76fa36754b ("net: sched:
> register callbacks for indirect tc block binds") for more details.

Thanks for this pointer! That does indeed seem possible.

For others who may be interested, the API seems to have moved a bit, but
there are a few implementations. The current entry point seems to be:

flow_indr_dev_register

> Even if I got it wrong, it would be beneficial to extend the tc offload
> infra rather than patching the bridge driver to achieve a functionality
> that is already supported in the SW data path via tc.

Agreed! I just had no idea that it was already possible.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/5] mv88e6xxx: Offload the flood flag
  2022-03-17 13:05   ` Vladimir Oltean
@ 2022-03-31  8:11     ` Tobias Waldekranz
  2022-03-31 15:00       ` Vladimir Oltean
  0 siblings, 1 reply; 23+ messages in thread
From: Tobias Waldekranz @ 2022-03-31  8:11 UTC (permalink / raw)
  To: Vladimir Oltean, Mattias Forsblad
  Cc: netdev, David S . Miller, Jakub Kicinski, Andrew Lunn,
	Florian Fainelli, Vivien Didelot, Roopa Prabhu,
	Nikolay Aleksandrov, Joachim Wiberg, Ido Schimmel,
	Allan W. Nielsen, UNGLinuxDriver

On Thu, Mar 17, 2022 at 15:05, Vladimir Oltean <olteanv@gmail.com> wrote:
> On Thu, Mar 17, 2022 at 07:50:30AM +0100, Mattias Forsblad wrote:
>> Use the port vlan table to restrict ingressing traffic to the
>> CPU port if the flood flags are cleared.
>> 
>> Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
>> ---
>
> There is a grave mismatch between what this patch says it does and what
> it really does. (=> NACK)
>
> Doing some interpolation from previous commit descriptions, the
> intention is to disable flooding from a given port towards the CPU
> (which, I mean, is fair enough as a goal).
>
> But:
> (a) mv88e6xxx_port_vlan() disables _forwarding_ from port A to port B.
>     So this affects not only unknown traffic (the one which is flooded),
>     but all traffic
> (b) even if br_flood_enabled() is false (meaning that the bridge device
>     doesn't want to locally process flooded packets), there is no
>     equality sign between this and disabling flooding on the CPU port.
>     If the DSA switch is bridged with a foreign (non-DSA) interface, be
>     it a tap, a Wi-Fi AP, or a plain Ethernet port, then from the
>     switch's perspective, this is no different from a local termination
>     flow (packets need to be forwarded to the CPU). Yet from the
>     bridge's perspective, it is a forwarding and not a termination flow.
>     So you can't _just_ disable CPU flooding/forwarding when the bridge
>     doesn't want to locally terminate traffic.
>
> Regarding (b), I've CC'ed Allan Nielsen who held this presentation a few
> years ago, and some ideas were able to be materialized in the meantime:
> https://www.youtube.com/watch?v=B1HhxEcU7Jg
>
> Regarding (a), have you seen the new dsa_port_manage_cpu_flood() from
> the DSA unicast filtering patch series?
> https://patchwork.kernel.org/project/netdevbpf/patch/20220302191417.1288145-6-vladimir.oltean@nxp.com/
> It is incomplete work in the sense that
>
> (1) it disables CPU flooding only if there isn't any port with IFF_PROMISC,
>     but the bridge puts all ports in promiscuous mode. I think we can
>     win that battle here, and convince bridge/switchdev maintainers to
>     not put offloaded bridge ports (those that call switchdev_bridge_port_offload)
>     in promiscuous mode, since it serves no purpose and that actively
>     bothers us. At least the way DSA sees this is that unicast filtering
>     and promiscuous mode deal with standalone mode. The forwarding plane
>     is effectively a different address database and there is no direct
>     equivalent to promiscuity there.
>
> (2) Right now DSA calls ->port_bridge_flags() from dsa_port_manage_cpu_flood(),
>     i.e. it treats CPU flooding as a purely per-port-egress setting.
>     But once I manage to straighten some kinks in DSA's unicast
>     filtering support for switches with ds->vlan_filtering_is_global (in
>     other words, make sja1105 eligible for unicast filtering), I pretty
>     much plan to change this by making DSA ask the driver to manage CPU
>     flooding per user port - leaving this code path as just a fallback.
>
> As baroque as I consider the sja1105 hardware to be, I'm surprised it
> has a feature which mv88e6xxx doesn't seem to - which is having flood
> controls per {ingress port, egress port} pair. So we'll have to
> improvise here.
>
> Could you tell me - ok, you remove the CPU port from the port VLAN map -
> but if you install host FDB entries as ACL entries (so as to make the
> switch generate a TO_CPU packet instead of a FORWARD packet), doesn't
> the switch in fact send packets to the CPU even in lack of the CPU
> port's membership in the port VLAN table for the bridge port?
>
> If I'm right and it does, then I do see a path forward for this, with
> zero user space additions, and working by default. We make the bridge
> stop uselessly making offloaded DSA bridge ports promiscuous, then we
> make DSA manage CPU flooding by itself - taking promiscuity into account
> but also foreign interfaces joining/leaving. Then we make host addresses
> be delivered by mv88e6xxx to the CPU as trapped and not forwarded, then
> from new the DSA ->port_set_cpu_flood() callback we remove the CPU port
> from the port VLAN table.
>
> What do you think?

It's an interesting idea. For unicast entries you could maybe get away
with it.  Though, it would mean that we would be limited to assisted CPU
learning, since there is no way for the switch to autonomously generate
ACL entries ("Policy entries" in ATU parlance). By extension, this also
means that the Learn2All functionality goes out the window for multichip
setups for addresses associated with the CPU.

For multicast though, I'm not sure that it would work in a multichip
system. As you say a policy entry will be sent with a TO_CPU tag, the
problem is that I think that applies to all DSA ports. So in this
system:

  CPU
   |
.--0--.   .-----.
| sw0 3---0 sw1 |
'-1-2-'   '-1-2-'

If we have a multicast group with subscribers behind sw0p{0,2} and
sw1p2, we need the following ATU entries:

sw0:
da:01:00:5e:01:02:03 vid:0 state:policy dpv:0,2,3

sw1:
da:01:00:5e:01:02:03 vid:0 state:policy dpv:0,2

When this group ingresses on sw0p1, I suspect it will egress
sw0p{0,2,3}, but on ingress at sw1p0 the frame will be dropped since it
will contain a TO_CPU tag (and sw1's CPU port is the ingress port).

Similarly, when this group ingresses on sw1p1, it will egress sw1p{0,2},
but since it is tagged with TO_CPU on ingress to sw0p3, it won't reach
sw0p2.

>>  drivers/net/dsa/mv88e6xxx/chip.c | 45 ++++++++++++++++++++++++++++++--
>>  1 file changed, 43 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
>> index 84b90fc36c58..39347a05c3a5 100644
>> --- a/drivers/net/dsa/mv88e6xxx/chip.c
>> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
>> @@ -1384,6 +1384,7 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
>>  	struct dsa_switch *ds = chip->ds;
>>  	struct dsa_switch_tree *dst = ds->dst;
>>  	struct dsa_port *dp, *other_dp;
>> +	bool flood = true;
>>  	bool found = false;
>>  	u16 pvlan;
>>  
>> @@ -1425,6 +1426,9 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
>>  
>>  	pvlan = 0;
>>  
>> +	if (dp->bridge)
>> +		flood = br_flood_enabled(dp->bridge->dev);
>> +
>>  	/* Frames from standalone user ports can only egress on the
>>  	 * upstream port.
>>  	 */
>> @@ -1433,10 +1437,11 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
>>  
>>  	/* Frames from bridged user ports can egress any local DSA
>>  	 * links and CPU ports, as well as any local member of their
>> -	 * bridge group.
>> +	 * as well as any local member of their bridge group. However, CPU ports
>> +	 * are omitted if flood is cleared.
>>  	 */
>>  	dsa_switch_for_each_port(other_dp, ds)
>> -		if (other_dp->type == DSA_PORT_TYPE_CPU ||
>> +		if ((other_dp->type == DSA_PORT_TYPE_CPU && flood) ||
>>  		    other_dp->type == DSA_PORT_TYPE_DSA ||
>>  		    dsa_port_bridge_same(dp, other_dp))
>>  			pvlan |= BIT(other_dp->index);
>> @@ -2718,6 +2723,41 @@ static void mv88e6xxx_crosschip_bridge_leave(struct dsa_switch *ds,
>>  	mv88e6xxx_reg_unlock(chip);
>>  }
>>  
>> +static int mv88e6xxx_set_flood(struct dsa_switch *ds, int port, struct net_device *br,
>> +			       unsigned long mask, unsigned long val)
>> +{
>> +	struct mv88e6xxx_chip *chip = ds->priv;
>> +	struct dsa_bridge *bridge;
>> +	struct dsa_port *dp;
>> +	bool found = false;
>> +	int err;
>> +
>> +	if (!netif_is_bridge_master(br))
>> +		return 0;
>> +
>> +	list_for_each_entry(dp, &ds->dst->ports, list) {
>> +		if (dp->ds == ds && dp->index == port) {
>> +			found = true;
>> +			break;
>> +		}
>> +	}
>> +
>> +	if (!found)
>> +		return 0;
>> +
>> +	bridge = dp->bridge;
>> +	if (!bridge)
>> +		return 0;
>> +
>> +	mv88e6xxx_reg_lock(chip);
>> +
>> +	err = mv88e6xxx_bridge_map(chip, *bridge);
>> +
>> +	mv88e6xxx_reg_unlock(chip);
>> +
>> +	return err;
>> +}
>> +
>>  static int mv88e6xxx_software_reset(struct mv88e6xxx_chip *chip)
>>  {
>>  	if (chip->info->ops->reset)
>> @@ -6478,6 +6518,7 @@ static const struct dsa_switch_ops mv88e6xxx_switch_ops = {
>>  	.set_eeprom		= mv88e6xxx_set_eeprom,
>>  	.get_regs_len		= mv88e6xxx_get_regs_len,
>>  	.get_regs		= mv88e6xxx_get_regs,
>> +	.set_flood		= mv88e6xxx_set_flood,
>>  	.get_rxnfc		= mv88e6xxx_get_rxnfc,
>>  	.set_rxnfc		= mv88e6xxx_set_rxnfc,
>>  	.set_ageing_time	= mv88e6xxx_set_ageing_time,
>> -- 
>> 2.25.1
>> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/5] mv88e6xxx: Offload the flood flag
  2022-03-31  8:11     ` Tobias Waldekranz
@ 2022-03-31 15:00       ` Vladimir Oltean
  0 siblings, 0 replies; 23+ messages in thread
From: Vladimir Oltean @ 2022-03-31 15:00 UTC (permalink / raw)
  To: Tobias Waldekranz
  Cc: Mattias Forsblad, netdev, David S . Miller, Jakub Kicinski,
	Andrew Lunn, Florian Fainelli, Vivien Didelot, Roopa Prabhu,
	Nikolay Aleksandrov, Joachim Wiberg, Ido Schimmel,
	Allan W. Nielsen, UNGLinuxDriver

On Thu, Mar 31, 2022 at 10:11:58AM +0200, Tobias Waldekranz wrote:
> On Thu, Mar 17, 2022 at 15:05, Vladimir Oltean <olteanv@gmail.com> wrote:
> > On Thu, Mar 17, 2022 at 07:50:30AM +0100, Mattias Forsblad wrote:
> >> Use the port vlan table to restrict ingressing traffic to the
> >> CPU port if the flood flags are cleared.
> >> 
> >> Signed-off-by: Mattias Forsblad <mattias.forsblad@gmail.com>
> >> ---
> >
> > There is a grave mismatch between what this patch says it does and what
> > it really does. (=> NACK)
> >
> > Doing some interpolation from previous commit descriptions, the
> > intention is to disable flooding from a given port towards the CPU
> > (which, I mean, is fair enough as a goal).
> >
> > But:
> > (a) mv88e6xxx_port_vlan() disables _forwarding_ from port A to port B.
> >     So this affects not only unknown traffic (the one which is flooded),
> >     but all traffic
> > (b) even if br_flood_enabled() is false (meaning that the bridge device
> >     doesn't want to locally process flooded packets), there is no
> >     equality sign between this and disabling flooding on the CPU port.
> >     If the DSA switch is bridged with a foreign (non-DSA) interface, be
> >     it a tap, a Wi-Fi AP, or a plain Ethernet port, then from the
> >     switch's perspective, this is no different from a local termination
> >     flow (packets need to be forwarded to the CPU). Yet from the
> >     bridge's perspective, it is a forwarding and not a termination flow.
> >     So you can't _just_ disable CPU flooding/forwarding when the bridge
> >     doesn't want to locally terminate traffic.
> >
> > Regarding (b), I've CC'ed Allan Nielsen who held this presentation a few
> > years ago, and some ideas were able to be materialized in the meantime:
> > https://www.youtube.com/watch?v=B1HhxEcU7Jg
> >
> > Regarding (a), have you seen the new dsa_port_manage_cpu_flood() from
> > the DSA unicast filtering patch series?
> > https://patchwork.kernel.org/project/netdevbpf/patch/20220302191417.1288145-6-vladimir.oltean@nxp.com/
> > It is incomplete work in the sense that
> >
> > (1) it disables CPU flooding only if there isn't any port with IFF_PROMISC,
> >     but the bridge puts all ports in promiscuous mode. I think we can
> >     win that battle here, and convince bridge/switchdev maintainers to
> >     not put offloaded bridge ports (those that call switchdev_bridge_port_offload)
> >     in promiscuous mode, since it serves no purpose and that actively
> >     bothers us. At least the way DSA sees this is that unicast filtering
> >     and promiscuous mode deal with standalone mode. The forwarding plane
> >     is effectively a different address database and there is no direct
> >     equivalent to promiscuity there.
> >
> > (2) Right now DSA calls ->port_bridge_flags() from dsa_port_manage_cpu_flood(),
> >     i.e. it treats CPU flooding as a purely per-port-egress setting.
> >     But once I manage to straighten some kinks in DSA's unicast
> >     filtering support for switches with ds->vlan_filtering_is_global (in
> >     other words, make sja1105 eligible for unicast filtering), I pretty
> >     much plan to change this by making DSA ask the driver to manage CPU
> >     flooding per user port - leaving this code path as just a fallback.
> >
> > As baroque as I consider the sja1105 hardware to be, I'm surprised it
> > has a feature which mv88e6xxx doesn't seem to - which is having flood
> > controls per {ingress port, egress port} pair. So we'll have to
> > improvise here.
> >
> > Could you tell me - ok, you remove the CPU port from the port VLAN map -
> > but if you install host FDB entries as ACL entries (so as to make the
> > switch generate a TO_CPU packet instead of a FORWARD packet), doesn't
> > the switch in fact send packets to the CPU even in lack of the CPU
> > port's membership in the port VLAN table for the bridge port?
> >
> > If I'm right and it does, then I do see a path forward for this, with
> > zero user space additions, and working by default. We make the bridge
> > stop uselessly making offloaded DSA bridge ports promiscuous, then we
> > make DSA manage CPU flooding by itself - taking promiscuity into account
> > but also foreign interfaces joining/leaving. Then we make host addresses
> > be delivered by mv88e6xxx to the CPU as trapped and not forwarded, then
> > from new the DSA ->port_set_cpu_flood() callback we remove the CPU port
> > from the port VLAN table.
> >
> > What do you think?
> 
> It's an interesting idea. For unicast entries you could maybe get away
> with it.  Though, it would mean that we would be limited to assisted CPU
> learning, since there is no way for the switch to autonomously generate
> ACL entries ("Policy entries" in ATU parlance). By extension, this also
> means that the Learn2All functionality goes out the window for multichip
> setups for addresses associated with the CPU.

Yes, but on the other hand, dsa_port_bridge_host_fdb_add() does emit an
actual cross-chip notifier, which means that all switches in the tree do
get notified of the addresses added through assisted learning. More work
though, which I can probably agree with.

> For multicast though, I'm not sure that it would work in a multichip
> system. As you say a policy entry will be sent with a TO_CPU tag, the
> problem is that I think that applies to all DSA ports. So in this
> system:
> 
>   CPU
>    |
> .--0--.   .-----.
> | sw0 3---0 sw1 |
> '-1-2-'   '-1-2-'
> 
> If we have a multicast group with subscribers behind sw0p{0,2} and
> sw1p2, we need the following ATU entries:
> 
> sw0:
> da:01:00:5e:01:02:03 vid:0 state:policy dpv:0,2,3
> 
> sw1:
> da:01:00:5e:01:02:03 vid:0 state:policy dpv:0,2
> 
> When this group ingresses on sw0p1, I suspect it will egress
> sw0p{0,2,3}, but on ingress at sw1p0 the frame will be dropped since it
> will contain a TO_CPU tag (and sw1's CPU port is the ingress port).
> 
> Similarly, when this group ingresses on sw1p1, it will egress sw1p{0,2},
> but since it is tagged with TO_CPU on ingress to sw0p3, it won't reach
> sw0p2.

I think the idea for cross-chip multicast to work is to mark the
relevant addresses as Policy Mirrors, and make the upstream port be the
mirror port. I guess the problem will be that this will block the user
from installing mirroring rules using tc-mirred towards other ports
than the upstream port. Some other switches have a dedicated 'copy to CPU'
action, and I guess this is the reason, to not clobber your mirror port,
and I thought Marvell would too, but it looks like they don't.

In any case, implementing things this way certainly doesn't appear to be
the most straightforward way to achieve what you want, it was just worth
considering for its pros and cons.

The fact that Mattias sent a new RFC taking foreign interfaces into
account means to me that for your particular use case, you don't intend
to do software bridging, otherwise that would have been a deal breaker,
right? Or is it rather that you _may_ need software bridging, but not
for the bridge with local termination cut off, but rather for other
bridges spanning the switch? The reason I'm asking is for me to know
whether it's worth pursuing the idea of turning off CPU egress flooding,
in the most ideal of circumstances: all ports are either
(a) standalone but not promiscuous
(b) under a bridge device that isn't promiscuous and has no non-DSA bridge ports

If you really, really, really need to cut off the whole forwarding
towards the CPU and not just flooding, just because forwarding is
per-ingress-port and flooding is common to all ingress ports, then I
guess I might need to look more seriously at the option of monitoring
ingress drop tc rules on the bridge device, as implemented by Mattias,
and see what else it needs to work well.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2022-03-31 15:00 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-17  6:50 [PATCH v3 net-next 0/5] bridge: dsa: switchdev: mv88e6xxx: Implement bridge flood flags Mattias Forsblad
2022-03-17  6:50 ` [PATCH 1/5] switchdev: Add local_receive attribute Mattias Forsblad
2022-03-17  6:50 ` [PATCH 2/5] net: bridge: Implement bridge flood flag Mattias Forsblad
2022-03-17  9:07   ` Joachim Wiberg
2022-03-17 10:30     ` Nikolay Aleksandrov
2022-03-17 11:39     ` Mattias Forsblad
2022-03-17 11:42       ` Nikolay Aleksandrov
2022-03-17 12:26         ` Ido Schimmel
2022-03-17 13:34           ` Tobias Waldekranz
2022-03-17 14:10             ` Ido Schimmel
2022-03-17 16:02               ` Tobias Waldekranz
2022-03-17 10:11   ` Nikolay Aleksandrov
2022-03-17 10:32     ` Nikolay Aleksandrov
2022-03-17 11:15     ` Vladimir Oltean
2022-03-17 11:57       ` Vladimir Oltean
2022-03-17 11:59         ` Vladimir Oltean
2022-03-17 12:08           ` Mattias Forsblad
2022-03-17  6:50 ` [PATCH 3/5] dsa: Handle the flood flag in the DSA layer Mattias Forsblad
2022-03-17  6:50 ` [PATCH 4/5] mv88e6xxx: Offload the flood flag Mattias Forsblad
2022-03-17 13:05   ` Vladimir Oltean
2022-03-31  8:11     ` Tobias Waldekranz
2022-03-31 15:00       ` Vladimir Oltean
2022-03-17  6:50 ` [PATCH 5/5] selftest: Add bridge flood flag tests Mattias Forsblad

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.