[PATCH V3 net-next 0/4] Extend locked port feature with FDB locked flag (MAC-Auth/MAB)

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH V3 net-next 0/4] Extend locked port feature with FDB locked flag (MAC-Auth/MAB)
@ 2022-05-24 15:21 ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 15:21 UTC (permalink / raw)
  To: davem, kuba
  Cc: netdev, Hans Schultz, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

This patch set extends the locked port feature for devices
that are behind a locked port, but do not have the ability to
authorize themselves as a supplicant using IEEE 802.1X.
Such devices can be printers, meters or anything related to
fixed installations. Instead of 802.1X authorization, devices
can get access based on their MAC addresses being whitelisted.

For an authorization daemon to detect that a device is trying
to get access through a locked port, the bridge will add the
MAC address of the device to the FDB with a locked flag to it.
Thus the authorization daemon can catch the FDB add event and
check if the MAC address is in the whitelist and if so replace
the FDB entry without the locked flag enabled, and thus open
the port for the device.

This feature is known as MAC-Auth or MAC Authentication Bypass
(MAB) in Cisco terminology, where the full MAB concept involves
additional Cisco infrastructure for authorization. There is no
real authentication process, as the MAC address of the device
is the only input the authorization daemon, in the general
case, has to base the decision if to unlock the port or not.

With this patch set, an implementation of the offloaded case is
supplied for the mv88e6xxx driver. When a packet ingresses on
a locked port, an ATU miss violation event will occur. When
handling such ATU miss violation interrupts, the MAC address of
the device is added to the FDB with a zero destination port
vector (DPV) and the MAC address is communicated through the
switchdev layer to the bridge, so that a FDB entry with the
locked flag enabled can be added.

Hans Schultz (4):
  net: bridge: add fdb flag to extent locked port feature
  net: switchdev: add support for offloading of fdb locked flag
  net: dsa: mv88e6xxx: mac-auth/MAB implementation
  selftests: forwarding: add test of MAC-Auth Bypass to locked port
    tests

 drivers/net/dsa/mv88e6xxx/Makefile            |   1 +
 drivers/net/dsa/mv88e6xxx/chip.c              |  40 ++-
 drivers/net/dsa/mv88e6xxx/chip.h              |   5 +
 drivers/net/dsa/mv88e6xxx/global1.h           |   1 +
 drivers/net/dsa/mv88e6xxx/global1_atu.c       |  35 ++-
 .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c   | 249 ++++++++++++++++++
 .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h   |  40 +++
 drivers/net/dsa/mv88e6xxx/port.c              |  32 ++-
 drivers/net/dsa/mv88e6xxx/port.h              |   2 +
 include/net/dsa.h                             |   6 +
 include/net/switchdev.h                       |   3 +-
 include/uapi/linux/neighbour.h                |   1 +
 net/bridge/br.c                               |   3 +-
 net/bridge/br_fdb.c                           |  18 +-
 net/bridge/br_if.c                            |   1 +
 net/bridge/br_input.c                         |  11 +-
 net/bridge/br_private.h                       |   9 +-
 .../net/forwarding/bridge_locked_port.sh      |  42 ++-
 18 files changed, 470 insertions(+), 29 deletions(-)
 create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
 create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h

-- 
2.30.2

^ permalink raw reply	[flat|nested] 108+ messages in thread

* [Bridge] [PATCH V3 net-next 0/4] Extend locked port feature with FDB locked flag (MAC-Auth/MAB)
@ 2022-05-24 15:21 ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 15:21 UTC (permalink / raw)
  To: davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, linux-kernel, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Paolo Abeni, Vladimir Oltean,
	Shuah Khan, Vivien Didelot

This patch set extends the locked port feature for devices
that are behind a locked port, but do not have the ability to
authorize themselves as a supplicant using IEEE 802.1X.
Such devices can be printers, meters or anything related to
fixed installations. Instead of 802.1X authorization, devices
can get access based on their MAC addresses being whitelisted.

For an authorization daemon to detect that a device is trying
to get access through a locked port, the bridge will add the
MAC address of the device to the FDB with a locked flag to it.
Thus the authorization daemon can catch the FDB add event and
check if the MAC address is in the whitelist and if so replace
the FDB entry without the locked flag enabled, and thus open
the port for the device.

This feature is known as MAC-Auth or MAC Authentication Bypass
(MAB) in Cisco terminology, where the full MAB concept involves
additional Cisco infrastructure for authorization. There is no
real authentication process, as the MAC address of the device
is the only input the authorization daemon, in the general
case, has to base the decision if to unlock the port or not.

With this patch set, an implementation of the offloaded case is
supplied for the mv88e6xxx driver. When a packet ingresses on
a locked port, an ATU miss violation event will occur. When
handling such ATU miss violation interrupts, the MAC address of
the device is added to the FDB with a zero destination port
vector (DPV) and the MAC address is communicated through the
switchdev layer to the bridge, so that a FDB entry with the
locked flag enabled can be added.

Hans Schultz (4):
  net: bridge: add fdb flag to extent locked port feature
  net: switchdev: add support for offloading of fdb locked flag
  net: dsa: mv88e6xxx: mac-auth/MAB implementation
  selftests: forwarding: add test of MAC-Auth Bypass to locked port
    tests

 drivers/net/dsa/mv88e6xxx/Makefile            |   1 +
 drivers/net/dsa/mv88e6xxx/chip.c              |  40 ++-
 drivers/net/dsa/mv88e6xxx/chip.h              |   5 +
 drivers/net/dsa/mv88e6xxx/global1.h           |   1 +
 drivers/net/dsa/mv88e6xxx/global1_atu.c       |  35 ++-
 .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c   | 249 ++++++++++++++++++
 .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h   |  40 +++
 drivers/net/dsa/mv88e6xxx/port.c              |  32 ++-
 drivers/net/dsa/mv88e6xxx/port.h              |   2 +
 include/net/dsa.h                             |   6 +
 include/net/switchdev.h                       |   3 +-
 include/uapi/linux/neighbour.h                |   1 +
 net/bridge/br.c                               |   3 +-
 net/bridge/br_fdb.c                           |  18 +-
 net/bridge/br_if.c                            |   1 +
 net/bridge/br_input.c                         |  11 +-
 net/bridge/br_private.h                       |   9 +-
 .../net/forwarding/bridge_locked_port.sh      |  42 ++-
 18 files changed, 470 insertions(+), 29 deletions(-)
 create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
 create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h

-- 
2.30.2

^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-24 15:21 ` [Bridge] " Hans Schultz
@ 2022-05-24 15:21   ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 15:21 UTC (permalink / raw)
  To: davem, kuba
  Cc: netdev, Hans Schultz, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

Add an intermediate state for clients behind a locked port to allow for
possible opening of the port for said clients. This feature corresponds
to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The
latter defined by Cisco.
Locked FDB entries will be limited in number, so as to prevent DOS
attacks by spamming the port with random entries. The limit will be
a per port limit as it is a port based feature and that the port flushes
all FDB entries on link down.

Only the kernel can set this FDB entry flag, while userspace can read
the flag and remove it by deleting the FDB entry.

Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
---
 include/uapi/linux/neighbour.h |  1 +
 net/bridge/br_fdb.c            | 11 +++++++++++
 net/bridge/br_if.c             |  1 +
 net/bridge/br_input.c          | 11 ++++++++++-
 net/bridge/br_private.h        |  7 ++++++-
 5 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index 39c565e460c7..76d65b481086 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -53,6 +53,7 @@ enum {
 #define NTF_ROUTER	(1 << 7)
 /* Extended flags under NDA_FLAGS_EXT: */
 #define NTF_EXT_MANAGED	(1 << 0)
+#define NTF_EXT_LOCKED	(1 << 1)
 
 /*
  *	Neighbor Cache Entry States.
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index e7f4fccb6adb..6b83e2d6435d 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -105,6 +105,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
 	struct nda_cacheinfo ci;
 	struct nlmsghdr *nlh;
 	struct ndmsg *ndm;
+	u32 ext_flags = 0;
 
 	nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags);
 	if (nlh == NULL)
@@ -125,11 +126,16 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
 		ndm->ndm_flags |= NTF_EXT_LEARNED;
 	if (test_bit(BR_FDB_STICKY, &fdb->flags))
 		ndm->ndm_flags |= NTF_STICKY;
+	if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags))
+		ext_flags |= NTF_EXT_LOCKED;
 
 	if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr))
 		goto nla_put_failure;
 	if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex))
 		goto nla_put_failure;
+	if (nla_put_u32(skb, NDA_FLAGS_EXT, ext_flags))
+		goto nla_put_failure;
+
 	ci.ndm_used	 = jiffies_to_clock_t(now - fdb->used);
 	ci.ndm_confirmed = 0;
 	ci.ndm_updated	 = jiffies_to_clock_t(now - fdb->updated);
@@ -171,6 +177,7 @@ static inline size_t fdb_nlmsg_size(void)
 	return NLMSG_ALIGN(sizeof(struct ndmsg))
 		+ nla_total_size(ETH_ALEN) /* NDA_LLADDR */
 		+ nla_total_size(sizeof(u32)) /* NDA_MASTER */
+		+ nla_total_size(sizeof(u32)) /* NDA_FLAGS_EXT */
 		+ nla_total_size(sizeof(u16)) /* NDA_VLAN */
 		+ nla_total_size(sizeof(struct nda_cacheinfo))
 		+ nla_total_size(0) /* NDA_FDB_EXT_ATTRS */
@@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
 	if (test_bit(BR_FDB_STATIC, &f->flags))
 		fdb_del_hw_addr(br, f->key.addr.addr);
 
+	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
+		atomic_dec(&f->dst->locked_entry_cnt);
+
 	hlist_del_init_rcu(&f->fdb_node);
 	rhashtable_remove_fast(&br->fdb_hash_tbl, &f->rhnode,
 			       br_fdb_rht_params);
@@ -1086,6 +1096,7 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 		modified = true;
 
 	set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
+	clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
 
 	fdb->used = jiffies;
 	if (modified) {
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 47fcbade7389..0ca04cba5ebe 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -429,6 +429,7 @@ static struct net_bridge_port *new_nbp(struct net_bridge *br,
 	p->priority = 0x8000 >> BR_PORT_BITS;
 	p->port_no = index;
 	p->flags = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
+	p->locked_entry_cnt.counter = 0;
 	br_init_port(p);
 	br_set_state(p, BR_STATE_DISABLED);
 	br_stp_port_timer_init(p);
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 68b3e850bcb9..0280806cf980 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -110,8 +110,17 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 			br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
 
 		if (!fdb_src || READ_ONCE(fdb_src->dst) != p ||
-		    test_bit(BR_FDB_LOCAL, &fdb_src->flags))
+		    test_bit(BR_FDB_LOCAL, &fdb_src->flags) ||
+		    test_bit(BR_FDB_ENTRY_LOCKED, &fdb_src->flags)) {
+			if (!fdb_src && atomic_read(&p->locked_entry_cnt) < BR_LOCKED_ENTRIES_MAX) {
+				unsigned long flags = 0;
+
+				__set_bit(BR_FDB_ENTRY_LOCKED, &flags);
+				br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags);
+				atomic_inc(&p->locked_entry_cnt);
+			}
 			goto drop;
+		}
 	}
 
 	nbp_switchdev_frame_mark(p, skb);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 06e5f6faa431..be17c99efe65 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -31,6 +31,8 @@
 #define BR_MULTICAST_QUERY_INTVL_MIN msecs_to_jiffies(1000)
 #define BR_MULTICAST_STARTUP_QUERY_INTVL_MIN BR_MULTICAST_QUERY_INTVL_MIN
 
+#define BR_LOCKED_ENTRIES_MAX	64
+
 #define BR_HWDOM_MAX BITS_PER_LONG
 
 #define BR_VERSION	"2.3"
@@ -251,7 +253,8 @@ enum {
 	BR_FDB_ADDED_BY_EXT_LEARN,
 	BR_FDB_OFFLOADED,
 	BR_FDB_NOTIFY,
-	BR_FDB_NOTIFY_INACTIVE
+	BR_FDB_NOTIFY_INACTIVE,
+	BR_FDB_ENTRY_LOCKED,
 };
 
 struct net_bridge_fdb_key {
@@ -414,6 +417,8 @@ struct net_bridge_port {
 	u16				backup_redirected_cnt;
 
 	struct bridge_stp_xstats	stp_xstats;
+
+	atomic_t			locked_entry_cnt;
 };
 
 #define kobj_to_brport(obj)	container_of(obj, struct net_bridge_port, kobj)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-24 15:21   ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 15:21 UTC (permalink / raw)
  To: davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, linux-kernel, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Paolo Abeni, Vladimir Oltean,
	Shuah Khan, Vivien Didelot

Add an intermediate state for clients behind a locked port to allow for
possible opening of the port for said clients. This feature corresponds
to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The
latter defined by Cisco.
Locked FDB entries will be limited in number, so as to prevent DOS
attacks by spamming the port with random entries. The limit will be
a per port limit as it is a port based feature and that the port flushes
all FDB entries on link down.

Only the kernel can set this FDB entry flag, while userspace can read
the flag and remove it by deleting the FDB entry.

Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
---
 include/uapi/linux/neighbour.h |  1 +
 net/bridge/br_fdb.c            | 11 +++++++++++
 net/bridge/br_if.c             |  1 +
 net/bridge/br_input.c          | 11 ++++++++++-
 net/bridge/br_private.h        |  7 ++++++-
 5 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index 39c565e460c7..76d65b481086 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -53,6 +53,7 @@ enum {
 #define NTF_ROUTER	(1 << 7)
 /* Extended flags under NDA_FLAGS_EXT: */
 #define NTF_EXT_MANAGED	(1 << 0)
+#define NTF_EXT_LOCKED	(1 << 1)
 
 /*
  *	Neighbor Cache Entry States.
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index e7f4fccb6adb..6b83e2d6435d 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -105,6 +105,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
 	struct nda_cacheinfo ci;
 	struct nlmsghdr *nlh;
 	struct ndmsg *ndm;
+	u32 ext_flags = 0;
 
 	nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags);
 	if (nlh == NULL)
@@ -125,11 +126,16 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
 		ndm->ndm_flags |= NTF_EXT_LEARNED;
 	if (test_bit(BR_FDB_STICKY, &fdb->flags))
 		ndm->ndm_flags |= NTF_STICKY;
+	if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags))
+		ext_flags |= NTF_EXT_LOCKED;
 
 	if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr))
 		goto nla_put_failure;
 	if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex))
 		goto nla_put_failure;
+	if (nla_put_u32(skb, NDA_FLAGS_EXT, ext_flags))
+		goto nla_put_failure;
+
 	ci.ndm_used	 = jiffies_to_clock_t(now - fdb->used);
 	ci.ndm_confirmed = 0;
 	ci.ndm_updated	 = jiffies_to_clock_t(now - fdb->updated);
@@ -171,6 +177,7 @@ static inline size_t fdb_nlmsg_size(void)
 	return NLMSG_ALIGN(sizeof(struct ndmsg))
 		+ nla_total_size(ETH_ALEN) /* NDA_LLADDR */
 		+ nla_total_size(sizeof(u32)) /* NDA_MASTER */
+		+ nla_total_size(sizeof(u32)) /* NDA_FLAGS_EXT */
 		+ nla_total_size(sizeof(u16)) /* NDA_VLAN */
 		+ nla_total_size(sizeof(struct nda_cacheinfo))
 		+ nla_total_size(0) /* NDA_FDB_EXT_ATTRS */
@@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
 	if (test_bit(BR_FDB_STATIC, &f->flags))
 		fdb_del_hw_addr(br, f->key.addr.addr);
 
+	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
+		atomic_dec(&f->dst->locked_entry_cnt);
+
 	hlist_del_init_rcu(&f->fdb_node);
 	rhashtable_remove_fast(&br->fdb_hash_tbl, &f->rhnode,
 			       br_fdb_rht_params);
@@ -1086,6 +1096,7 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 		modified = true;
 
 	set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
+	clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
 
 	fdb->used = jiffies;
 	if (modified) {
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 47fcbade7389..0ca04cba5ebe 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -429,6 +429,7 @@ static struct net_bridge_port *new_nbp(struct net_bridge *br,
 	p->priority = 0x8000 >> BR_PORT_BITS;
 	p->port_no = index;
 	p->flags = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
+	p->locked_entry_cnt.counter = 0;
 	br_init_port(p);
 	br_set_state(p, BR_STATE_DISABLED);
 	br_stp_port_timer_init(p);
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 68b3e850bcb9..0280806cf980 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -110,8 +110,17 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 			br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
 
 		if (!fdb_src || READ_ONCE(fdb_src->dst) != p ||
-		    test_bit(BR_FDB_LOCAL, &fdb_src->flags))
+		    test_bit(BR_FDB_LOCAL, &fdb_src->flags) ||
+		    test_bit(BR_FDB_ENTRY_LOCKED, &fdb_src->flags)) {
+			if (!fdb_src && atomic_read(&p->locked_entry_cnt) < BR_LOCKED_ENTRIES_MAX) {
+				unsigned long flags = 0;
+
+				__set_bit(BR_FDB_ENTRY_LOCKED, &flags);
+				br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags);
+				atomic_inc(&p->locked_entry_cnt);
+			}
 			goto drop;
+		}
 	}
 
 	nbp_switchdev_frame_mark(p, skb);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 06e5f6faa431..be17c99efe65 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -31,6 +31,8 @@
 #define BR_MULTICAST_QUERY_INTVL_MIN msecs_to_jiffies(1000)
 #define BR_MULTICAST_STARTUP_QUERY_INTVL_MIN BR_MULTICAST_QUERY_INTVL_MIN
 
+#define BR_LOCKED_ENTRIES_MAX	64
+
 #define BR_HWDOM_MAX BITS_PER_LONG
 
 #define BR_VERSION	"2.3"
@@ -251,7 +253,8 @@ enum {
 	BR_FDB_ADDED_BY_EXT_LEARN,
 	BR_FDB_OFFLOADED,
 	BR_FDB_NOTIFY,
-	BR_FDB_NOTIFY_INACTIVE
+	BR_FDB_NOTIFY_INACTIVE,
+	BR_FDB_ENTRY_LOCKED,
 };
 
 struct net_bridge_fdb_key {
@@ -414,6 +417,8 @@ struct net_bridge_port {
 	u16				backup_redirected_cnt;
 
 	struct bridge_stp_xstats	stp_xstats;
+
+	atomic_t			locked_entry_cnt;
 };
 
 #define kobj_to_brport(obj)	container_of(obj, struct net_bridge_port, kobj)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 net-next 2/4] net: switchdev: add support for offloading of fdb locked flag
  2022-05-24 15:21 ` [Bridge] " Hans Schultz
@ 2022-05-24 15:21   ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 15:21 UTC (permalink / raw)
  To: davem, kuba
  Cc: netdev, Hans Schultz, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

Used for Mac-auth/MAB feature in the offloaded case.

Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
---
 include/net/dsa.h       | 6 ++++++
 include/net/switchdev.h | 3 ++-
 net/bridge/br.c         | 3 ++-
 net/bridge/br_fdb.c     | 7 +++++--
 net/bridge/br_private.h | 2 +-
 5 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 14f07275852b..a5a843b2d67d 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -330,6 +330,12 @@ struct dsa_port {
 	/* List of VLANs that CPU and DSA ports are members of. */
 	struct mutex		vlans_lock;
 	struct list_head	vlans;
+
+	/* List and maintenance of locked ATU entries */
+	struct mutex		locked_entries_list_lock;
+	struct list_head	atu_locked_entries_list;
+	atomic_t		atu_locked_entry_cnt;
+	struct delayed_work	atu_work;
 };
 
 /* TODO: ideally DSA ports would have a single dp->link_dp member,
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index aa0171d5786d..62f4f7c9c7c2 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -245,7 +245,8 @@ struct switchdev_notifier_fdb_info {
 	u16 vid;
 	u8 added_by_user:1,
 	   is_local:1,
-	   offloaded:1;
+	   offloaded:1,
+	   locked:1;
 };
 
 struct switchdev_notifier_port_obj_info {
diff --git a/net/bridge/br.c b/net/bridge/br.c
index 96e91d69a9a8..12933388a5a4 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,
 	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
 		fdb_info = ptr;
 		err = br_fdb_external_learn_add(br, p, fdb_info->addr,
-						fdb_info->vid, false);
+						fdb_info->vid, false,
+						fdb_info->locked);
 		if (err) {
 			err = notifier_from_errno(err);
 			break;
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 6b83e2d6435d..92469547283a 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1135,7 +1135,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
 					   "FDB entry towards bridge must be permanent");
 			return -EINVAL;
 		}
-		err = br_fdb_external_learn_add(br, p, addr, vid, true);
+		err = br_fdb_external_learn_add(br, p, addr, vid, true, false);
 	} else {
 		spin_lock_bh(&br->hash_lock);
 		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
@@ -1365,7 +1365,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
 
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,
-			      bool swdev_notify)
+			      bool swdev_notify, bool locked)
 {
 	struct net_bridge_fdb_entry *fdb;
 	bool modified = false;
@@ -1385,6 +1385,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 		if (!p)
 			flags |= BIT(BR_FDB_LOCAL);
 
+		if (locked)
+			flags |= BIT(BR_FDB_ENTRY_LOCKED);
+
 		fdb = fdb_create(br, p, addr, vid, flags);
 		if (!fdb) {
 			err = -ENOMEM;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index be17c99efe65..88913e6a59e1 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -815,7 +815,7 @@ int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
 void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,
-			      bool swdev_notify);
+			      bool swdev_notify, bool locked);
 int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,
 			      bool swdev_notify);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [Bridge] [PATCH V3 net-next 2/4] net: switchdev: add support for offloading of fdb locked flag
@ 2022-05-24 15:21   ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 15:21 UTC (permalink / raw)
  To: davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, linux-kernel, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Paolo Abeni, Vladimir Oltean,
	Shuah Khan, Vivien Didelot

Used for Mac-auth/MAB feature in the offloaded case.

Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
---
 include/net/dsa.h       | 6 ++++++
 include/net/switchdev.h | 3 ++-
 net/bridge/br.c         | 3 ++-
 net/bridge/br_fdb.c     | 7 +++++--
 net/bridge/br_private.h | 2 +-
 5 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 14f07275852b..a5a843b2d67d 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -330,6 +330,12 @@ struct dsa_port {
 	/* List of VLANs that CPU and DSA ports are members of. */
 	struct mutex		vlans_lock;
 	struct list_head	vlans;
+
+	/* List and maintenance of locked ATU entries */
+	struct mutex		locked_entries_list_lock;
+	struct list_head	atu_locked_entries_list;
+	atomic_t		atu_locked_entry_cnt;
+	struct delayed_work	atu_work;
 };
 
 /* TODO: ideally DSA ports would have a single dp->link_dp member,
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index aa0171d5786d..62f4f7c9c7c2 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -245,7 +245,8 @@ struct switchdev_notifier_fdb_info {
 	u16 vid;
 	u8 added_by_user:1,
 	   is_local:1,
-	   offloaded:1;
+	   offloaded:1,
+	   locked:1;
 };
 
 struct switchdev_notifier_port_obj_info {
diff --git a/net/bridge/br.c b/net/bridge/br.c
index 96e91d69a9a8..12933388a5a4 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,
 	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
 		fdb_info = ptr;
 		err = br_fdb_external_learn_add(br, p, fdb_info->addr,
-						fdb_info->vid, false);
+						fdb_info->vid, false,
+						fdb_info->locked);
 		if (err) {
 			err = notifier_from_errno(err);
 			break;
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 6b83e2d6435d..92469547283a 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1135,7 +1135,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
 					   "FDB entry towards bridge must be permanent");
 			return -EINVAL;
 		}
-		err = br_fdb_external_learn_add(br, p, addr, vid, true);
+		err = br_fdb_external_learn_add(br, p, addr, vid, true, false);
 	} else {
 		spin_lock_bh(&br->hash_lock);
 		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
@@ -1365,7 +1365,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
 
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,
-			      bool swdev_notify)
+			      bool swdev_notify, bool locked)
 {
 	struct net_bridge_fdb_entry *fdb;
 	bool modified = false;
@@ -1385,6 +1385,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 		if (!p)
 			flags |= BIT(BR_FDB_LOCAL);
 
+		if (locked)
+			flags |= BIT(BR_FDB_ENTRY_LOCKED);
+
 		fdb = fdb_create(br, p, addr, vid, flags);
 		if (!fdb) {
 			err = -ENOMEM;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index be17c99efe65..88913e6a59e1 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -815,7 +815,7 @@ int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
 void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,
-			      bool swdev_notify);
+			      bool swdev_notify, bool locked);
 int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,
 			      bool swdev_notify);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-05-24 15:21 ` [Bridge] " Hans Schultz
@ 2022-05-24 15:21   ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 15:21 UTC (permalink / raw)
  To: davem, kuba
  Cc: netdev, Hans Schultz, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

This implementation for the Marvell mv88e6xxx chip series, is
based on handling ATU miss violations occurring when packets
ingress on a port that is locked. The mac address triggering
the ATU miss violation is communicated through switchdev to
the bridge module, which adds a fdb entry with the fdb locked
flag set. The entry is kept according to the bridges ageing
time, thus simulating a dynamic entry.

Note: The locked port must have learning enabled for the ATU
miss violation to occur.

Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/Makefile            |   1 +
 drivers/net/dsa/mv88e6xxx/chip.c              |  40 ++-
 drivers/net/dsa/mv88e6xxx/chip.h              |   5 +
 drivers/net/dsa/mv88e6xxx/global1.h           |   1 +
 drivers/net/dsa/mv88e6xxx/global1_atu.c       |  35 ++-
 .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c   | 249 ++++++++++++++++++
 .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h   |  40 +++
 drivers/net/dsa/mv88e6xxx/port.c              |  32 ++-
 drivers/net/dsa/mv88e6xxx/port.h              |   2 +
 9 files changed, 389 insertions(+), 16 deletions(-)
 create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
 create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h

diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile
index c8eca2b6f959..3ca57709730d 100644
--- a/drivers/net/dsa/mv88e6xxx/Makefile
+++ b/drivers/net/dsa/mv88e6xxx/Makefile
@@ -15,3 +15,4 @@ mv88e6xxx-objs += port_hidden.o
 mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_PTP) += ptp.o
 mv88e6xxx-objs += serdes.o
 mv88e6xxx-objs += smi.o
+mv88e6xxx-objs += mv88e6xxx_switchdev.o
\ No newline at end of file
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 5d2c57a7c708..f7a16886bee9 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -42,6 +42,7 @@
 #include "ptp.h"
 #include "serdes.h"
 #include "smi.h"
+#include "mv88e6xxx_switchdev.h"
 
 static void assert_reg_lock(struct mv88e6xxx_chip *chip)
 {
@@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
 	if (err)
 		dev_err(chip->dev,
 			"p%d: failed to force MAC link down\n", port);
+	else
+		if (mv88e6xxx_port_is_locked(chip, port, true))
+			mv88e6xxx_atu_locked_entry_flush(ds, port);
 }
 
 static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port,
@@ -1685,6 +1689,9 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port)
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
 
+	if (mv88e6xxx_port_is_locked(chip, port, true))
+		mv88e6xxx_atu_locked_entry_flush(ds, port);
+
 	mv88e6xxx_reg_lock(chip);
 	err = mv88e6xxx_port_fast_age_fid(chip, port, 0);
 	mv88e6xxx_reg_unlock(chip);
@@ -1721,11 +1728,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid,
 	return err;
 }
 
-static int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
-			      int (*cb)(struct mv88e6xxx_chip *chip,
-					const struct mv88e6xxx_vtu_entry *entry,
-					void *priv),
-			      void *priv)
+int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
+		       int (*cb)(struct mv88e6xxx_chip *chip,
+				 const struct mv88e6xxx_vtu_entry *entry,
+				 void *priv),
+		       void *priv)
 {
 	struct mv88e6xxx_vtu_entry entry = {
 		.vid = mv88e6xxx_max_vid(chip),
@@ -2722,9 +2729,12 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
 
+	if (mv88e6xxx_port_is_locked(chip, port, true))
+		mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
+
 	mv88e6xxx_reg_lock(chip);
 	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
-					   MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
+			MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
 	mv88e6xxx_reg_unlock(chip);
 
 	return err;
@@ -2735,12 +2745,17 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
 				  struct dsa_db db)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
+	bool locked_found = false;
 	int err;
 
-	mv88e6xxx_reg_lock(chip);
-	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
-	mv88e6xxx_reg_unlock(chip);
+	if (mv88e6xxx_port_is_locked(chip, port, true))
+		locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
 
+	if (!locked_found) {
+		mv88e6xxx_reg_lock(chip);
+		err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
+		mv88e6xxx_reg_unlock(chip);
+	}
 	return err;
 }
 
@@ -3809,11 +3824,16 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
 
 static int mv88e6xxx_port_setup(struct dsa_switch *ds, int port)
 {
-	return mv88e6xxx_setup_devlink_regions_port(ds, port);
+	int err;
+
+	err = mv88e6xxx_setup_devlink_regions_port(ds, port);
+	mv88e6xxx_init_violation_handler(ds, port);
+	return err;
 }
 
 static void mv88e6xxx_port_teardown(struct dsa_switch *ds, int port)
 {
+	mv88e6xxx_teardown_violation_handler(ds, port);
 	mv88e6xxx_teardown_devlink_regions_port(ds, port);
 }
 
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index 5e03cfe50156..c9a8404a6293 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -803,6 +803,11 @@ static inline void mv88e6xxx_reg_unlock(struct mv88e6xxx_chip *chip)
 	mutex_unlock(&chip->reg_lock);
 }
 
+int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
+		       int (*cb)(struct mv88e6xxx_chip *chip,
+				 const struct mv88e6xxx_vtu_entry *entry,
+				 void *priv),
+		       void *priv);
 int mv88e6xxx_fid_map(struct mv88e6xxx_chip *chip, unsigned long *bitmap);
 
 #endif /* _MV88E6XXX_CHIP_H */
diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h
index 65958b2a0d3a..503fbf216670 100644
--- a/drivers/net/dsa/mv88e6xxx/global1.h
+++ b/drivers/net/dsa/mv88e6xxx/global1.h
@@ -136,6 +136,7 @@
 #define MV88E6XXX_G1_ATU_DATA_TRUNK				0x8000
 #define MV88E6XXX_G1_ATU_DATA_TRUNK_ID_MASK			0x00f0
 #define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_MASK			0x3ff0
+#define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS		0x0000
 #define MV88E6XXX_G1_ATU_DATA_STATE_MASK			0x000f
 #define MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED			0x0000
 #define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_1_OLDEST		0x0001
diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c
index 40bd67a5c8e9..517376271f64 100644
--- a/drivers/net/dsa/mv88e6xxx/global1_atu.c
+++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c
@@ -12,6 +12,8 @@
 
 #include "chip.h"
 #include "global1.h"
+#include "port.h"
+#include "mv88e6xxx_switchdev.h"
 
 /* Offset 0x01: ATU FID Register */
 
@@ -114,6 +116,18 @@ static int mv88e6xxx_g1_atu_op_wait(struct mv88e6xxx_chip *chip)
 	return mv88e6xxx_g1_wait_bit(chip, MV88E6XXX_G1_ATU_OP, bit, 0);
 }
 
+static int mv88e6xxx_g1_read_atu_violation(struct mv88e6xxx_chip *chip)
+{
+	int err;
+
+	err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_ATU_OP,
+				 MV88E6XXX_G1_ATU_OP_BUSY | MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);
+	if (err)
+		return err;
+
+	return mv88e6xxx_g1_atu_op_wait(chip);
+}
+
 static int mv88e6xxx_g1_atu_op(struct mv88e6xxx_chip *chip, u16 fid, u16 op)
 {
 	u16 val;
@@ -356,11 +370,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
 	int spid;
 	int err;
 	u16 val;
+	u16 fid;
 
 	mv88e6xxx_reg_lock(chip);
 
-	err = mv88e6xxx_g1_atu_op(chip, 0,
-				  MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);
+	err = mv88e6xxx_g1_read_atu_violation(chip);
 	if (err)
 		goto out;
 
@@ -368,6 +382,10 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
 	if (err)
 		goto out;
 
+	err = mv88e6xxx_g1_read(chip, MV88E6352_G1_ATU_FID, &fid);
+	if (err)
+		goto out;
+
 	err = mv88e6xxx_g1_atu_data_read(chip, &entry);
 	if (err)
 		goto out;
@@ -382,6 +400,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
 		dev_err_ratelimited(chip->dev,
 				    "ATU age out violation for %pM\n",
 				    entry.mac);
+		err = mv88e6xxx_handle_violation(chip,
+						 chip->ports[spid].port,
+						 &entry,
+						 fid,
+						 MV88E6XXX_G1_ATU_OP_AGE_OUT_VIOLATION);
 	}
 
 	if (val & MV88E6XXX_G1_ATU_OP_MEMBER_VIOLATION) {
@@ -396,6 +419,14 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
 				    "ATU miss violation for %pM portvec %x spid %d\n",
 				    entry.mac, entry.portvec, spid);
 		chip->ports[spid].atu_miss_violation++;
+		if (mv88e6xxx_port_is_locked(chip, chip->ports[spid].port, false))
+			err = mv88e6xxx_handle_violation(chip,
+							 chip->ports[spid].port,
+							 &entry,
+							 fid,
+							 MV88E6XXX_G1_ATU_OP_MISS_VIOLATION);
+		if (err)
+			goto out;
 	}
 
 	if (val & MV88E6XXX_G1_ATU_OP_FULL_VIOLATION) {
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
new file mode 100644
index 000000000000..8436655ceb9a
--- /dev/null
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
@@ -0,0 +1,249 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * mv88e6xxx_switchdev.c
+ *
+ *	Authors:
+ *	Hans J. Schultz		<hans.schultz@westermo.com>
+ *
+ */
+
+#include <net/switchdev.h>
+#include <linux/list.h>
+#include "chip.h"
+#include "global1.h"
+#include "mv88e6xxx_switchdev.h"
+
+static void mv88e6xxx_atu_locked_entry_timer_work(struct atu_locked_entry *ale)
+{
+	struct switchdev_notifier_fdb_info info = {
+		.addr = ale->mac,
+		.vid = ale->vid,
+		.added_by_user = false,
+		.is_local = false,
+		.offloaded = true,
+		.locked = true,
+	};
+	struct mv88e6xxx_atu_entry entry;
+	struct net_device *brport;
+	struct dsa_port *dp;
+
+	entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED;
+	entry.trunk = false;
+	memcpy(&entry.mac, &ale->mac, ETH_ALEN);
+
+	mv88e6xxx_reg_lock(ale->chip);
+	mv88e6xxx_g1_atu_loadpurge(ale->chip, ale->fid, &entry);
+	mv88e6xxx_reg_unlock(ale->chip);
+
+	dp = dsa_to_port(ale->chip->ds, ale->port);
+	brport = dsa_port_to_bridge_port(dp);
+
+	if (brport) {
+		if (!rtnl_is_locked()) {
+			rtnl_lock();
+			call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE,
+						 brport, &info.info, NULL);
+			rtnl_unlock();
+		} else {
+			call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE,
+						 brport, &info.info, NULL);
+		}
+	} else {
+		dev_err(ale->chip->dev, "ERR: No bridge port for dsa port belonging to port %d\n",
+			ale->port);
+	}
+}
+
+static inline void mv88e6xxx_atu_locked_entry_purge(struct atu_locked_entry *ale)
+{
+	mv88e6xxx_atu_locked_entry_timer_work(ale);
+	del_timer(&ale->timer);
+	list_del(&ale->list);
+	kfree(ale);
+}
+
+static void mv88e6xxx_atu_locked_entry_cleanup(struct work_struct *work)
+{
+	struct dsa_port *dp = container_of(work, struct dsa_port, atu_work.work);
+	struct atu_locked_entry *ale, *tmp;
+
+	mutex_lock(&dp->locked_entries_list_lock);
+	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
+		if (ale->timed_out) {
+			mv88e6xxx_atu_locked_entry_purge(ale);
+			atomic_dec(&dp->atu_locked_entry_cnt);
+		}
+	}
+	mutex_unlock(&dp->locked_entries_list_lock);
+
+	mod_delayed_work(system_long_wq, &dp->atu_work, msecs_to_jiffies(100));
+}
+
+static void mv88e6xxx_atu_locked_entry_timer_handler(struct timer_list *t)
+{
+	struct atu_locked_entry *ale = from_timer(ale, t, timer);
+
+	if (ale)
+		ale->timed_out = true;
+}
+
+struct mv88e6xxx_fid_search_ctx {
+	u16 fid_search;
+	u16 vid_found;
+};
+
+static int mv88e6xxx_find_vid_on_matching_fid(struct mv88e6xxx_chip *chip,
+					      const struct mv88e6xxx_vtu_entry *entry,
+					      void *priv)
+{
+	struct mv88e6xxx_fid_search_ctx *ctx = priv;
+
+	if (ctx->fid_search == entry->fid) {
+		ctx->vid_found = entry->vid;
+		return 1;
+	}
+
+	return 0;
+}
+
+int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip,
+			       int port,
+			       struct mv88e6xxx_atu_entry *entry,
+			       u16 fid,
+			       u16 type)
+{
+	struct switchdev_notifier_fdb_info info = {
+		.addr = entry->mac,
+		.vid = 0,
+		.added_by_user = false,
+		.is_local = false,
+		.offloaded = true,
+		.locked = true,
+	};
+	struct atu_locked_entry *locked_entry;
+	struct mv88e6xxx_fid_search_ctx ctx;
+	struct net_device *brport;
+	struct dsa_port *dp;
+	int err;
+
+	ctx.fid_search = fid;
+	err = mv88e6xxx_vtu_walk(chip, mv88e6xxx_find_vid_on_matching_fid, &ctx);
+	if (err < 0)
+		return err;
+	if (err == 1)
+		info.vid = ctx.vid_found;
+	else
+		return -ENODATA;
+
+	dp = dsa_to_port(chip->ds, port);
+	brport = dsa_port_to_bridge_port(dp);
+
+	if (!brport)
+		return -ENODEV;
+
+	switch (type) {
+	case MV88E6XXX_G1_ATU_OP_MISS_VIOLATION:
+		if (atomic_read(&dp->atu_locked_entry_cnt) >= ATU_LOCKED_ENTRIES_MAX) {
+			mv88e6xxx_reg_unlock(chip);
+			return 0;
+		}
+		entry->portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS;
+		entry->state = MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC;
+		entry->trunk = false;
+		err = mv88e6xxx_g1_atu_loadpurge(chip, fid, entry);
+		if (err)
+			goto fail;
+
+		locked_entry = kmalloc(sizeof(*locked_entry), GFP_ATOMIC);
+		if (!locked_entry)
+			return -ENOMEM;
+		timer_setup(&locked_entry->timer, mv88e6xxx_atu_locked_entry_timer_handler, 0);
+		locked_entry->timer.expires = jiffies + dp->ageing_time / 10;
+		locked_entry->chip = chip;
+		locked_entry->port = port;
+		locked_entry->fid = fid;
+		locked_entry->vid = info.vid;
+		locked_entry->timed_out = false;
+		memcpy(&locked_entry->mac, entry->mac, ETH_ALEN);
+
+		mutex_lock(&dp->locked_entries_list_lock);
+		add_timer(&locked_entry->timer);
+		list_add(&locked_entry->list, &dp->atu_locked_entries_list);
+		mutex_unlock(&dp->locked_entries_list_lock);
+		atomic_inc(&dp->atu_locked_entry_cnt);
+
+		mv88e6xxx_reg_unlock(chip);
+
+		rtnl_lock();
+		err = call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE,
+					       brport, &info.info, NULL);
+		break;
+	}
+	rtnl_unlock();
+
+	return err;
+
+fail:
+	mv88e6xxx_reg_unlock(chip);
+	return err;
+}
+
+bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port,
+					   const unsigned char *addr, u16 vid)
+{
+	struct dsa_port *dp = dsa_to_port(ds, port);
+	struct atu_locked_entry *ale, *tmp;
+	bool found = false;
+
+	mutex_lock(&dp->locked_entries_list_lock);
+	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
+		if (!memcmp(&ale->mac, addr, ETH_ALEN)) {
+			if (ale->vid == vid) {
+				mv88e6xxx_atu_locked_entry_purge(ale);
+				atomic_dec(&dp->atu_locked_entry_cnt);
+				found = true;
+				break;
+			}
+		}
+	}
+	mutex_unlock(&dp->locked_entries_list_lock);
+	return found;
+}
+
+void mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port)
+{
+	struct dsa_port *dp = dsa_to_port(ds, port);
+	struct atu_locked_entry *ale, *tmp;
+
+	mutex_lock(&dp->locked_entries_list_lock);
+	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
+		mv88e6xxx_atu_locked_entry_purge(ale);
+		atomic_dec(&dp->atu_locked_entry_cnt);
+	}
+	mutex_unlock(&dp->locked_entries_list_lock);
+
+	if (atomic_read(&dp->atu_locked_entry_cnt) != 0)
+		dev_err(ds->dev,
+			"ERROR: Locked entries count is not zero after flush on port %d\n",
+			port);
+}
+
+void mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port)
+{
+	struct dsa_port *dp = dsa_to_port(ds, port);
+
+	INIT_LIST_HEAD(&dp->atu_locked_entries_list);
+	mutex_init(&dp->locked_entries_list_lock);
+	dp->atu_locked_entry_cnt.counter = 0;
+	INIT_DELAYED_WORK(&dp->atu_work, mv88e6xxx_atu_locked_entry_cleanup);
+	mod_delayed_work(system_long_wq, &dp->atu_work, msecs_to_jiffies(100));
+}
+
+void mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port)
+{
+	struct dsa_port *dp = dsa_to_port(ds, port);
+
+	cancel_delayed_work(&dp->atu_work);
+	mv88e6xxx_atu_locked_entry_flush(ds, port);
+	mutex_destroy(&dp->locked_entries_list_lock);
+}
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h
new file mode 100644
index 000000000000..f0e7abf7c361
--- /dev/null
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * mv88e6xxx_switchdev.h
+ *
+ *	Authors:
+ *	Hans J. Schultz		<hans.schultz@westermo.com>
+ *
+ */
+
+#ifndef DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_
+#define DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_
+
+#include <net/switchdev.h>
+#include "chip.h"
+
+#define ATU_LOCKED_ENTRIES_MAX	64
+
+struct atu_locked_entry {
+	struct list_head list;
+	struct mv88e6xxx_chip *chip;
+	int port;
+	u8	mac[ETH_ALEN];
+	u16 fid;
+	u16 vid;
+	struct timer_list timer;
+	bool timed_out;
+};
+
+int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip,
+			       int port,
+			       struct mv88e6xxx_atu_entry *entry,
+			       u16 fid,
+			       u16 type);
+bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port,
+					   const unsigned char *addr, u16 vid);
+void mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port);
+void mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port);
+void mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port);
+
+#endif /* DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_ */
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index 795b3128768f..c4e5e7174129 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -14,9 +14,11 @@
 #include <linux/phylink.h>
 
 #include "chip.h"
+#include "global1.h"
 #include "global2.h"
 #include "port.h"
 #include "serdes.h"
+#include "mv88e6xxx_switchdev.h"
 
 int mv88e6xxx_port_read(struct mv88e6xxx_chip *chip, int port, int reg,
 			u16 *val)
@@ -1239,6 +1241,25 @@ int mv88e6xxx_port_set_mirror(struct mv88e6xxx_chip *chip, int port,
 	return err;
 }
 
+bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port, bool chiplock)
+{
+	bool locked = false;
+	u16 reg;
+
+	if (chiplock)
+		mv88e6xxx_reg_lock(chip);
+
+	if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
+		goto out;
+	locked = reg & MV88E6XXX_PORT_CTL0_SA_FILT_DROP_ON_LOCK;
+
+out:
+	if (chiplock)
+		mv88e6xxx_reg_unlock(chip);
+
+	return locked;
+}
+
 int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
 			    bool locked)
 {
@@ -1261,10 +1282,13 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
 	if (err)
 		return err;
 
-	reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
-	if (locked)
-		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
-
+	reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
+	if (locked) {
+		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
+			MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
+			MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
+			MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
+	}
 	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg);
 }
 
diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
index e0a705d82019..d377abd6ab17 100644
--- a/drivers/net/dsa/mv88e6xxx/port.h
+++ b/drivers/net/dsa/mv88e6xxx/port.h
@@ -231,6 +231,7 @@
 #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT		0x2000
 #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG	0x1000
 #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED	0x0800
+#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK		0x07ff
 
 /* Offset 0x0C: Port ATU Control */
 #define MV88E6XXX_PORT_ATU_CTL		0x0c
@@ -374,6 +375,7 @@ int mv88e6xxx_port_set_fid(struct mv88e6xxx_chip *chip, int port, u16 fid);
 int mv88e6xxx_port_get_pvid(struct mv88e6xxx_chip *chip, int port, u16 *pvid);
 int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid);
 
+bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port, bool chiplock);
 int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
 			    bool locked);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-05-24 15:21   ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 15:21 UTC (permalink / raw)
  To: davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, linux-kernel, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Paolo Abeni, Vladimir Oltean,
	Shuah Khan, Vivien Didelot

This implementation for the Marvell mv88e6xxx chip series, is
based on handling ATU miss violations occurring when packets
ingress on a port that is locked. The mac address triggering
the ATU miss violation is communicated through switchdev to
the bridge module, which adds a fdb entry with the fdb locked
flag set. The entry is kept according to the bridges ageing
time, thus simulating a dynamic entry.

Note: The locked port must have learning enabled for the ATU
miss violation to occur.

Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/Makefile            |   1 +
 drivers/net/dsa/mv88e6xxx/chip.c              |  40 ++-
 drivers/net/dsa/mv88e6xxx/chip.h              |   5 +
 drivers/net/dsa/mv88e6xxx/global1.h           |   1 +
 drivers/net/dsa/mv88e6xxx/global1_atu.c       |  35 ++-
 .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c   | 249 ++++++++++++++++++
 .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h   |  40 +++
 drivers/net/dsa/mv88e6xxx/port.c              |  32 ++-
 drivers/net/dsa/mv88e6xxx/port.h              |   2 +
 9 files changed, 389 insertions(+), 16 deletions(-)
 create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
 create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h

diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile
index c8eca2b6f959..3ca57709730d 100644
--- a/drivers/net/dsa/mv88e6xxx/Makefile
+++ b/drivers/net/dsa/mv88e6xxx/Makefile
@@ -15,3 +15,4 @@ mv88e6xxx-objs += port_hidden.o
 mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_PTP) += ptp.o
 mv88e6xxx-objs += serdes.o
 mv88e6xxx-objs += smi.o
+mv88e6xxx-objs += mv88e6xxx_switchdev.o
\ No newline at end of file
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 5d2c57a7c708..f7a16886bee9 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -42,6 +42,7 @@
 #include "ptp.h"
 #include "serdes.h"
 #include "smi.h"
+#include "mv88e6xxx_switchdev.h"
 
 static void assert_reg_lock(struct mv88e6xxx_chip *chip)
 {
@@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
 	if (err)
 		dev_err(chip->dev,
 			"p%d: failed to force MAC link down\n", port);
+	else
+		if (mv88e6xxx_port_is_locked(chip, port, true))
+			mv88e6xxx_atu_locked_entry_flush(ds, port);
 }
 
 static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port,
@@ -1685,6 +1689,9 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port)
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
 
+	if (mv88e6xxx_port_is_locked(chip, port, true))
+		mv88e6xxx_atu_locked_entry_flush(ds, port);
+
 	mv88e6xxx_reg_lock(chip);
 	err = mv88e6xxx_port_fast_age_fid(chip, port, 0);
 	mv88e6xxx_reg_unlock(chip);
@@ -1721,11 +1728,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid,
 	return err;
 }
 
-static int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
-			      int (*cb)(struct mv88e6xxx_chip *chip,
-					const struct mv88e6xxx_vtu_entry *entry,
-					void *priv),
-			      void *priv)
+int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
+		       int (*cb)(struct mv88e6xxx_chip *chip,
+				 const struct mv88e6xxx_vtu_entry *entry,
+				 void *priv),
+		       void *priv)
 {
 	struct mv88e6xxx_vtu_entry entry = {
 		.vid = mv88e6xxx_max_vid(chip),
@@ -2722,9 +2729,12 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
 
+	if (mv88e6xxx_port_is_locked(chip, port, true))
+		mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
+
 	mv88e6xxx_reg_lock(chip);
 	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
-					   MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
+			MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
 	mv88e6xxx_reg_unlock(chip);
 
 	return err;
@@ -2735,12 +2745,17 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
 				  struct dsa_db db)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
+	bool locked_found = false;
 	int err;
 
-	mv88e6xxx_reg_lock(chip);
-	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
-	mv88e6xxx_reg_unlock(chip);
+	if (mv88e6xxx_port_is_locked(chip, port, true))
+		locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
 
+	if (!locked_found) {
+		mv88e6xxx_reg_lock(chip);
+		err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
+		mv88e6xxx_reg_unlock(chip);
+	}
 	return err;
 }
 
@@ -3809,11 +3824,16 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
 
 static int mv88e6xxx_port_setup(struct dsa_switch *ds, int port)
 {
-	return mv88e6xxx_setup_devlink_regions_port(ds, port);
+	int err;
+
+	err = mv88e6xxx_setup_devlink_regions_port(ds, port);
+	mv88e6xxx_init_violation_handler(ds, port);
+	return err;
 }
 
 static void mv88e6xxx_port_teardown(struct dsa_switch *ds, int port)
 {
+	mv88e6xxx_teardown_violation_handler(ds, port);
 	mv88e6xxx_teardown_devlink_regions_port(ds, port);
 }
 
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index 5e03cfe50156..c9a8404a6293 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -803,6 +803,11 @@ static inline void mv88e6xxx_reg_unlock(struct mv88e6xxx_chip *chip)
 	mutex_unlock(&chip->reg_lock);
 }
 
+int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
+		       int (*cb)(struct mv88e6xxx_chip *chip,
+				 const struct mv88e6xxx_vtu_entry *entry,
+				 void *priv),
+		       void *priv);
 int mv88e6xxx_fid_map(struct mv88e6xxx_chip *chip, unsigned long *bitmap);
 
 #endif /* _MV88E6XXX_CHIP_H */
diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h
index 65958b2a0d3a..503fbf216670 100644
--- a/drivers/net/dsa/mv88e6xxx/global1.h
+++ b/drivers/net/dsa/mv88e6xxx/global1.h
@@ -136,6 +136,7 @@
 #define MV88E6XXX_G1_ATU_DATA_TRUNK				0x8000
 #define MV88E6XXX_G1_ATU_DATA_TRUNK_ID_MASK			0x00f0
 #define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_MASK			0x3ff0
+#define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS		0x0000
 #define MV88E6XXX_G1_ATU_DATA_STATE_MASK			0x000f
 #define MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED			0x0000
 #define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_1_OLDEST		0x0001
diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c
index 40bd67a5c8e9..517376271f64 100644
--- a/drivers/net/dsa/mv88e6xxx/global1_atu.c
+++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c
@@ -12,6 +12,8 @@
 
 #include "chip.h"
 #include "global1.h"
+#include "port.h"
+#include "mv88e6xxx_switchdev.h"
 
 /* Offset 0x01: ATU FID Register */
 
@@ -114,6 +116,18 @@ static int mv88e6xxx_g1_atu_op_wait(struct mv88e6xxx_chip *chip)
 	return mv88e6xxx_g1_wait_bit(chip, MV88E6XXX_G1_ATU_OP, bit, 0);
 }
 
+static int mv88e6xxx_g1_read_atu_violation(struct mv88e6xxx_chip *chip)
+{
+	int err;
+
+	err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_ATU_OP,
+				 MV88E6XXX_G1_ATU_OP_BUSY | MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);
+	if (err)
+		return err;
+
+	return mv88e6xxx_g1_atu_op_wait(chip);
+}
+
 static int mv88e6xxx_g1_atu_op(struct mv88e6xxx_chip *chip, u16 fid, u16 op)
 {
 	u16 val;
@@ -356,11 +370,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
 	int spid;
 	int err;
 	u16 val;
+	u16 fid;
 
 	mv88e6xxx_reg_lock(chip);
 
-	err = mv88e6xxx_g1_atu_op(chip, 0,
-				  MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);
+	err = mv88e6xxx_g1_read_atu_violation(chip);
 	if (err)
 		goto out;
 
@@ -368,6 +382,10 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
 	if (err)
 		goto out;
 
+	err = mv88e6xxx_g1_read(chip, MV88E6352_G1_ATU_FID, &fid);
+	if (err)
+		goto out;
+
 	err = mv88e6xxx_g1_atu_data_read(chip, &entry);
 	if (err)
 		goto out;
@@ -382,6 +400,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
 		dev_err_ratelimited(chip->dev,
 				    "ATU age out violation for %pM\n",
 				    entry.mac);
+		err = mv88e6xxx_handle_violation(chip,
+						 chip->ports[spid].port,
+						 &entry,
+						 fid,
+						 MV88E6XXX_G1_ATU_OP_AGE_OUT_VIOLATION);
 	}
 
 	if (val & MV88E6XXX_G1_ATU_OP_MEMBER_VIOLATION) {
@@ -396,6 +419,14 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
 				    "ATU miss violation for %pM portvec %x spid %d\n",
 				    entry.mac, entry.portvec, spid);
 		chip->ports[spid].atu_miss_violation++;
+		if (mv88e6xxx_port_is_locked(chip, chip->ports[spid].port, false))
+			err = mv88e6xxx_handle_violation(chip,
+							 chip->ports[spid].port,
+							 &entry,
+							 fid,
+							 MV88E6XXX_G1_ATU_OP_MISS_VIOLATION);
+		if (err)
+			goto out;
 	}
 
 	if (val & MV88E6XXX_G1_ATU_OP_FULL_VIOLATION) {
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
new file mode 100644
index 000000000000..8436655ceb9a
--- /dev/null
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
@@ -0,0 +1,249 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * mv88e6xxx_switchdev.c
+ *
+ *	Authors:
+ *	Hans J. Schultz		<hans.schultz@westermo.com>
+ *
+ */
+
+#include <net/switchdev.h>
+#include <linux/list.h>
+#include "chip.h"
+#include "global1.h"
+#include "mv88e6xxx_switchdev.h"
+
+static void mv88e6xxx_atu_locked_entry_timer_work(struct atu_locked_entry *ale)
+{
+	struct switchdev_notifier_fdb_info info = {
+		.addr = ale->mac,
+		.vid = ale->vid,
+		.added_by_user = false,
+		.is_local = false,
+		.offloaded = true,
+		.locked = true,
+	};
+	struct mv88e6xxx_atu_entry entry;
+	struct net_device *brport;
+	struct dsa_port *dp;
+
+	entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED;
+	entry.trunk = false;
+	memcpy(&entry.mac, &ale->mac, ETH_ALEN);
+
+	mv88e6xxx_reg_lock(ale->chip);
+	mv88e6xxx_g1_atu_loadpurge(ale->chip, ale->fid, &entry);
+	mv88e6xxx_reg_unlock(ale->chip);
+
+	dp = dsa_to_port(ale->chip->ds, ale->port);
+	brport = dsa_port_to_bridge_port(dp);
+
+	if (brport) {
+		if (!rtnl_is_locked()) {
+			rtnl_lock();
+			call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE,
+						 brport, &info.info, NULL);
+			rtnl_unlock();
+		} else {
+			call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE,
+						 brport, &info.info, NULL);
+		}
+	} else {
+		dev_err(ale->chip->dev, "ERR: No bridge port for dsa port belonging to port %d\n",
+			ale->port);
+	}
+}
+
+static inline void mv88e6xxx_atu_locked_entry_purge(struct atu_locked_entry *ale)
+{
+	mv88e6xxx_atu_locked_entry_timer_work(ale);
+	del_timer(&ale->timer);
+	list_del(&ale->list);
+	kfree(ale);
+}
+
+static void mv88e6xxx_atu_locked_entry_cleanup(struct work_struct *work)
+{
+	struct dsa_port *dp = container_of(work, struct dsa_port, atu_work.work);
+	struct atu_locked_entry *ale, *tmp;
+
+	mutex_lock(&dp->locked_entries_list_lock);
+	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
+		if (ale->timed_out) {
+			mv88e6xxx_atu_locked_entry_purge(ale);
+			atomic_dec(&dp->atu_locked_entry_cnt);
+		}
+	}
+	mutex_unlock(&dp->locked_entries_list_lock);
+
+	mod_delayed_work(system_long_wq, &dp->atu_work, msecs_to_jiffies(100));
+}
+
+static void mv88e6xxx_atu_locked_entry_timer_handler(struct timer_list *t)
+{
+	struct atu_locked_entry *ale = from_timer(ale, t, timer);
+
+	if (ale)
+		ale->timed_out = true;
+}
+
+struct mv88e6xxx_fid_search_ctx {
+	u16 fid_search;
+	u16 vid_found;
+};
+
+static int mv88e6xxx_find_vid_on_matching_fid(struct mv88e6xxx_chip *chip,
+					      const struct mv88e6xxx_vtu_entry *entry,
+					      void *priv)
+{
+	struct mv88e6xxx_fid_search_ctx *ctx = priv;
+
+	if (ctx->fid_search == entry->fid) {
+		ctx->vid_found = entry->vid;
+		return 1;
+	}
+
+	return 0;
+}
+
+int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip,
+			       int port,
+			       struct mv88e6xxx_atu_entry *entry,
+			       u16 fid,
+			       u16 type)
+{
+	struct switchdev_notifier_fdb_info info = {
+		.addr = entry->mac,
+		.vid = 0,
+		.added_by_user = false,
+		.is_local = false,
+		.offloaded = true,
+		.locked = true,
+	};
+	struct atu_locked_entry *locked_entry;
+	struct mv88e6xxx_fid_search_ctx ctx;
+	struct net_device *brport;
+	struct dsa_port *dp;
+	int err;
+
+	ctx.fid_search = fid;
+	err = mv88e6xxx_vtu_walk(chip, mv88e6xxx_find_vid_on_matching_fid, &ctx);
+	if (err < 0)
+		return err;
+	if (err == 1)
+		info.vid = ctx.vid_found;
+	else
+		return -ENODATA;
+
+	dp = dsa_to_port(chip->ds, port);
+	brport = dsa_port_to_bridge_port(dp);
+
+	if (!brport)
+		return -ENODEV;
+
+	switch (type) {
+	case MV88E6XXX_G1_ATU_OP_MISS_VIOLATION:
+		if (atomic_read(&dp->atu_locked_entry_cnt) >= ATU_LOCKED_ENTRIES_MAX) {
+			mv88e6xxx_reg_unlock(chip);
+			return 0;
+		}
+		entry->portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS;
+		entry->state = MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC;
+		entry->trunk = false;
+		err = mv88e6xxx_g1_atu_loadpurge(chip, fid, entry);
+		if (err)
+			goto fail;
+
+		locked_entry = kmalloc(sizeof(*locked_entry), GFP_ATOMIC);
+		if (!locked_entry)
+			return -ENOMEM;
+		timer_setup(&locked_entry->timer, mv88e6xxx_atu_locked_entry_timer_handler, 0);
+		locked_entry->timer.expires = jiffies + dp->ageing_time / 10;
+		locked_entry->chip = chip;
+		locked_entry->port = port;
+		locked_entry->fid = fid;
+		locked_entry->vid = info.vid;
+		locked_entry->timed_out = false;
+		memcpy(&locked_entry->mac, entry->mac, ETH_ALEN);
+
+		mutex_lock(&dp->locked_entries_list_lock);
+		add_timer(&locked_entry->timer);
+		list_add(&locked_entry->list, &dp->atu_locked_entries_list);
+		mutex_unlock(&dp->locked_entries_list_lock);
+		atomic_inc(&dp->atu_locked_entry_cnt);
+
+		mv88e6xxx_reg_unlock(chip);
+
+		rtnl_lock();
+		err = call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE,
+					       brport, &info.info, NULL);
+		break;
+	}
+	rtnl_unlock();
+
+	return err;
+
+fail:
+	mv88e6xxx_reg_unlock(chip);
+	return err;
+}
+
+bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port,
+					   const unsigned char *addr, u16 vid)
+{
+	struct dsa_port *dp = dsa_to_port(ds, port);
+	struct atu_locked_entry *ale, *tmp;
+	bool found = false;
+
+	mutex_lock(&dp->locked_entries_list_lock);
+	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
+		if (!memcmp(&ale->mac, addr, ETH_ALEN)) {
+			if (ale->vid == vid) {
+				mv88e6xxx_atu_locked_entry_purge(ale);
+				atomic_dec(&dp->atu_locked_entry_cnt);
+				found = true;
+				break;
+			}
+		}
+	}
+	mutex_unlock(&dp->locked_entries_list_lock);
+	return found;
+}
+
+void mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port)
+{
+	struct dsa_port *dp = dsa_to_port(ds, port);
+	struct atu_locked_entry *ale, *tmp;
+
+	mutex_lock(&dp->locked_entries_list_lock);
+	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
+		mv88e6xxx_atu_locked_entry_purge(ale);
+		atomic_dec(&dp->atu_locked_entry_cnt);
+	}
+	mutex_unlock(&dp->locked_entries_list_lock);
+
+	if (atomic_read(&dp->atu_locked_entry_cnt) != 0)
+		dev_err(ds->dev,
+			"ERROR: Locked entries count is not zero after flush on port %d\n",
+			port);
+}
+
+void mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port)
+{
+	struct dsa_port *dp = dsa_to_port(ds, port);
+
+	INIT_LIST_HEAD(&dp->atu_locked_entries_list);
+	mutex_init(&dp->locked_entries_list_lock);
+	dp->atu_locked_entry_cnt.counter = 0;
+	INIT_DELAYED_WORK(&dp->atu_work, mv88e6xxx_atu_locked_entry_cleanup);
+	mod_delayed_work(system_long_wq, &dp->atu_work, msecs_to_jiffies(100));
+}
+
+void mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port)
+{
+	struct dsa_port *dp = dsa_to_port(ds, port);
+
+	cancel_delayed_work(&dp->atu_work);
+	mv88e6xxx_atu_locked_entry_flush(ds, port);
+	mutex_destroy(&dp->locked_entries_list_lock);
+}
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h
new file mode 100644
index 000000000000..f0e7abf7c361
--- /dev/null
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * mv88e6xxx_switchdev.h
+ *
+ *	Authors:
+ *	Hans J. Schultz		<hans.schultz@westermo.com>
+ *
+ */
+
+#ifndef DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_
+#define DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_
+
+#include <net/switchdev.h>
+#include "chip.h"
+
+#define ATU_LOCKED_ENTRIES_MAX	64
+
+struct atu_locked_entry {
+	struct list_head list;
+	struct mv88e6xxx_chip *chip;
+	int port;
+	u8	mac[ETH_ALEN];
+	u16 fid;
+	u16 vid;
+	struct timer_list timer;
+	bool timed_out;
+};
+
+int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip,
+			       int port,
+			       struct mv88e6xxx_atu_entry *entry,
+			       u16 fid,
+			       u16 type);
+bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port,
+					   const unsigned char *addr, u16 vid);
+void mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port);
+void mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port);
+void mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port);
+
+#endif /* DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_ */
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index 795b3128768f..c4e5e7174129 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -14,9 +14,11 @@
 #include <linux/phylink.h>
 
 #include "chip.h"
+#include "global1.h"
 #include "global2.h"
 #include "port.h"
 #include "serdes.h"
+#include "mv88e6xxx_switchdev.h"
 
 int mv88e6xxx_port_read(struct mv88e6xxx_chip *chip, int port, int reg,
 			u16 *val)
@@ -1239,6 +1241,25 @@ int mv88e6xxx_port_set_mirror(struct mv88e6xxx_chip *chip, int port,
 	return err;
 }
 
+bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port, bool chiplock)
+{
+	bool locked = false;
+	u16 reg;
+
+	if (chiplock)
+		mv88e6xxx_reg_lock(chip);
+
+	if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
+		goto out;
+	locked = reg & MV88E6XXX_PORT_CTL0_SA_FILT_DROP_ON_LOCK;
+
+out:
+	if (chiplock)
+		mv88e6xxx_reg_unlock(chip);
+
+	return locked;
+}
+
 int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
 			    bool locked)
 {
@@ -1261,10 +1282,13 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
 	if (err)
 		return err;
 
-	reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
-	if (locked)
-		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
-
+	reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
+	if (locked) {
+		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
+			MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
+			MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
+			MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
+	}
 	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg);
 }
 
diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
index e0a705d82019..d377abd6ab17 100644
--- a/drivers/net/dsa/mv88e6xxx/port.h
+++ b/drivers/net/dsa/mv88e6xxx/port.h
@@ -231,6 +231,7 @@
 #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT		0x2000
 #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG	0x1000
 #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED	0x0800
+#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK		0x07ff
 
 /* Offset 0x0C: Port ATU Control */
 #define MV88E6XXX_PORT_ATU_CTL		0x0c
@@ -374,6 +375,7 @@ int mv88e6xxx_port_set_fid(struct mv88e6xxx_chip *chip, int port, u16 fid);
 int mv88e6xxx_port_get_pvid(struct mv88e6xxx_chip *chip, int port, u16 *pvid);
 int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid);
 
+bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port, bool chiplock);
 int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
 			    bool locked);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 net-next 4/4] selftests: forwarding: add test of MAC-Auth Bypass to locked port tests
  2022-05-24 15:21 ` [Bridge] " Hans Schultz
@ 2022-05-24 15:21   ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 15:21 UTC (permalink / raw)
  To: davem, kuba
  Cc: netdev, Hans Schultz, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

Verify that the MAC-Auth mechanism works by adding a FDB entry with the
locked flag set. denying access until the FDB entry is replaced with a
FDB entry without the locked flag set.

Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
---
 .../net/forwarding/bridge_locked_port.sh      | 42 ++++++++++++++++---
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
index 5b02b6b60ce7..50b9048d044a 100755
--- a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
+++ b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
@@ -1,7 +1,7 @@
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0
 
-ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan"
+ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan locked_port_mab"
 NUM_NETIFS=4
 CHECK_TC="no"
 source lib.sh
@@ -94,13 +94,13 @@ locked_port_ipv4()
 	ping_do $h1 192.0.2.2
 	check_fail $? "Ping worked after locking port, but before adding FDB entry"
 
-	bridge fdb add `mac_get $h1` dev $swp1 master static
+	bridge fdb replace `mac_get $h1` dev $swp1 master static
 
 	ping_do $h1 192.0.2.2
 	check_err $? "Ping did not work after locking port and adding FDB entry"
 
 	bridge link set dev $swp1 locked off
-	bridge fdb del `mac_get $h1` dev $swp1 master static
+	bridge fdb del `mac_get $h1` dev $swp1 master
 
 	ping_do $h1 192.0.2.2
 	check_err $? "Ping did not work after unlocking port and removing FDB entry."
@@ -124,13 +124,13 @@ locked_port_vlan()
 	ping_do $h1.100 198.51.100.2
 	check_fail $? "Ping through vlan worked after locking port, but before adding FDB entry"
 
-	bridge fdb add `mac_get $h1` dev $swp1 vlan 100 master static
+	bridge fdb replace `mac_get $h1` dev $swp1 master static
 
 	ping_do $h1.100 198.51.100.2
 	check_err $? "Ping through vlan did not work after locking port and adding FDB entry"
 
 	bridge link set dev $swp1 locked off
-	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master static
+	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master
 
 	ping_do $h1.100 198.51.100.2
 	check_err $? "Ping through vlan did not work after unlocking port and removing FDB entry"
@@ -153,7 +153,8 @@ locked_port_ipv6()
 	ping6_do $h1 2001:db8:1::2
 	check_fail $? "Ping6 worked after locking port, but before adding FDB entry"
 
-	bridge fdb add `mac_get $h1` dev $swp1 master static
+	bridge fdb replace `mac_get $h1` dev $swp1 master static
+
 	ping6_do $h1 2001:db8:1::2
 	check_err $? "Ping6 did not work after locking port and adding FDB entry"
 
@@ -166,6 +167,35 @@ locked_port_ipv6()
 	log_test "Locked port ipv6"
 }
 
+locked_port_mab()
+{
+	RET=0
+	check_locked_port_support || return 0
+
+	ping_do $h1 192.0.2.2
+	check_err $? "MAB: Ping did not work before locking port"
+
+	bridge link set dev $swp1 locked on
+	bridge link set dev $swp1 learning on
+
+	bridge fdb del `mac_get $h1` dev $swp1 master
+
+	ping_do $h1 192.0.2.2
+	check_fail $? "MAB: Ping worked on locked port without FDB entry"
+
+	bridge fdb show | grep `mac_get $h1` | grep -q "locked"
+	check_err $? "MAB: No locked fdb entry after ping on locked port"
+
+	bridge fdb replace `mac_get $h1` dev $swp1 master static
+
+	ping_do $h1 192.0.2.2
+	check_err $? "MAB: Ping did not work with fdb entry without locked flag"
+
+	bridge fdb del `mac_get $h1` dev $swp1 master
+	bridge link set dev $swp1 locked off
+
+	log_test "Locked port MAB"
+}
 trap cleanup EXIT
 
 setup_prepare
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [Bridge] [PATCH V3 net-next 4/4] selftests: forwarding: add test of MAC-Auth Bypass to locked port tests
@ 2022-05-24 15:21   ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 15:21 UTC (permalink / raw)
  To: davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, linux-kernel, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Paolo Abeni, Vladimir Oltean,
	Shuah Khan, Vivien Didelot

Verify that the MAC-Auth mechanism works by adding a FDB entry with the
locked flag set. denying access until the FDB entry is replaced with a
FDB entry without the locked flag set.

Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
---
 .../net/forwarding/bridge_locked_port.sh      | 42 ++++++++++++++++---
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
index 5b02b6b60ce7..50b9048d044a 100755
--- a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
+++ b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
@@ -1,7 +1,7 @@
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0
 
-ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan"
+ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan locked_port_mab"
 NUM_NETIFS=4
 CHECK_TC="no"
 source lib.sh
@@ -94,13 +94,13 @@ locked_port_ipv4()
 	ping_do $h1 192.0.2.2
 	check_fail $? "Ping worked after locking port, but before adding FDB entry"
 
-	bridge fdb add `mac_get $h1` dev $swp1 master static
+	bridge fdb replace `mac_get $h1` dev $swp1 master static
 
 	ping_do $h1 192.0.2.2
 	check_err $? "Ping did not work after locking port and adding FDB entry"
 
 	bridge link set dev $swp1 locked off
-	bridge fdb del `mac_get $h1` dev $swp1 master static
+	bridge fdb del `mac_get $h1` dev $swp1 master
 
 	ping_do $h1 192.0.2.2
 	check_err $? "Ping did not work after unlocking port and removing FDB entry."
@@ -124,13 +124,13 @@ locked_port_vlan()
 	ping_do $h1.100 198.51.100.2
 	check_fail $? "Ping through vlan worked after locking port, but before adding FDB entry"
 
-	bridge fdb add `mac_get $h1` dev $swp1 vlan 100 master static
+	bridge fdb replace `mac_get $h1` dev $swp1 master static
 
 	ping_do $h1.100 198.51.100.2
 	check_err $? "Ping through vlan did not work after locking port and adding FDB entry"
 
 	bridge link set dev $swp1 locked off
-	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master static
+	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master
 
 	ping_do $h1.100 198.51.100.2
 	check_err $? "Ping through vlan did not work after unlocking port and removing FDB entry"
@@ -153,7 +153,8 @@ locked_port_ipv6()
 	ping6_do $h1 2001:db8:1::2
 	check_fail $? "Ping6 worked after locking port, but before adding FDB entry"
 
-	bridge fdb add `mac_get $h1` dev $swp1 master static
+	bridge fdb replace `mac_get $h1` dev $swp1 master static
+
 	ping6_do $h1 2001:db8:1::2
 	check_err $? "Ping6 did not work after locking port and adding FDB entry"
 
@@ -166,6 +167,35 @@ locked_port_ipv6()
 	log_test "Locked port ipv6"
 }
 
+locked_port_mab()
+{
+	RET=0
+	check_locked_port_support || return 0
+
+	ping_do $h1 192.0.2.2
+	check_err $? "MAB: Ping did not work before locking port"
+
+	bridge link set dev $swp1 locked on
+	bridge link set dev $swp1 learning on
+
+	bridge fdb del `mac_get $h1` dev $swp1 master
+
+	ping_do $h1 192.0.2.2
+	check_fail $? "MAB: Ping worked on locked port without FDB entry"
+
+	bridge fdb show | grep `mac_get $h1` | grep -q "locked"
+	check_err $? "MAB: No locked fdb entry after ping on locked port"
+
+	bridge fdb replace `mac_get $h1` dev $swp1 master static
+
+	ping_do $h1 192.0.2.2
+	check_err $? "MAB: Ping did not work with fdb entry without locked flag"
+
+	bridge fdb del `mac_get $h1` dev $swp1 master
+	bridge link set dev $swp1 locked off
+
+	log_test "Locked port MAB"
+}
 trap cleanup EXIT
 
 setup_prepare
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-24 15:21   ` [Bridge] " Hans Schultz
@ 2022-05-24 15:39     ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-05-24 15:39 UTC (permalink / raw)
  To: Hans Schultz, davem, kuba
  Cc: netdev, Hans Schultz, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Shuah Khan,
	Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

On 24/05/2022 18:21, Hans Schultz wrote:
> Add an intermediate state for clients behind a locked port to allow for
> possible opening of the port for said clients. This feature corresponds
> to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The
> latter defined by Cisco.
> Locked FDB entries will be limited in number, so as to prevent DOS
> attacks by spamming the port with random entries. The limit will be
> a per port limit as it is a port based feature and that the port flushes
> all FDB entries on link down.
> 
> Only the kernel can set this FDB entry flag, while userspace can read
> the flag and remove it by deleting the FDB entry.
> 
> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> ---
>  include/uapi/linux/neighbour.h |  1 +
>  net/bridge/br_fdb.c            | 11 +++++++++++
>  net/bridge/br_if.c             |  1 +
>  net/bridge/br_input.c          | 11 ++++++++++-
>  net/bridge/br_private.h        |  7 ++++++-
>  5 files changed, 29 insertions(+), 2 deletions(-)
> 

Hi Hans,
So this approach has a fundamental problem, f->dst is changed without any synchronization
you cannot rely on it and thus you cannot account for these entries properly. We must be very
careful if we try to add any new synchronization not to affect performance as well.
More below...

> diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
> index 39c565e460c7..76d65b481086 100644
> --- a/include/uapi/linux/neighbour.h
> +++ b/include/uapi/linux/neighbour.h
> @@ -53,6 +53,7 @@ enum {
>  #define NTF_ROUTER	(1 << 7)
>  /* Extended flags under NDA_FLAGS_EXT: */
>  #define NTF_EXT_MANAGED	(1 << 0)
> +#define NTF_EXT_LOCKED	(1 << 1)
>  
>  /*
>   *	Neighbor Cache Entry States.
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index e7f4fccb6adb..6b83e2d6435d 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -105,6 +105,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
>  	struct nda_cacheinfo ci;
>  	struct nlmsghdr *nlh;
>  	struct ndmsg *ndm;
> +	u32 ext_flags = 0;
>  
>  	nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags);
>  	if (nlh == NULL)
> @@ -125,11 +126,16 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
>  		ndm->ndm_flags |= NTF_EXT_LEARNED;
>  	if (test_bit(BR_FDB_STICKY, &fdb->flags))
>  		ndm->ndm_flags |= NTF_STICKY;
> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags))
> +		ext_flags |= NTF_EXT_LOCKED;
>  
>  	if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr))
>  		goto nla_put_failure;
>  	if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex))
>  		goto nla_put_failure;
> +	if (nla_put_u32(skb, NDA_FLAGS_EXT, ext_flags))
> +		goto nla_put_failure;
> +
>  	ci.ndm_used	 = jiffies_to_clock_t(now - fdb->used);
>  	ci.ndm_confirmed = 0;
>  	ci.ndm_updated	 = jiffies_to_clock_t(now - fdb->updated);
> @@ -171,6 +177,7 @@ static inline size_t fdb_nlmsg_size(void)
>  	return NLMSG_ALIGN(sizeof(struct ndmsg))
>  		+ nla_total_size(ETH_ALEN) /* NDA_LLADDR */
>  		+ nla_total_size(sizeof(u32)) /* NDA_MASTER */
> +		+ nla_total_size(sizeof(u32)) /* NDA_FLAGS_EXT */
>  		+ nla_total_size(sizeof(u16)) /* NDA_VLAN */
>  		+ nla_total_size(sizeof(struct nda_cacheinfo))
>  		+ nla_total_size(0) /* NDA_FDB_EXT_ATTRS */
> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>  		fdb_del_hw_addr(br, f->key.addr.addr);
>  
> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
> +		atomic_dec(&f->dst->locked_entry_cnt);

Sorry but you cannot do this for multiple reasons:
 - f->dst can be NULL
 - f->dst changes without any synchronization
 - there is no synchronization between fdb's flags and its ->dst

Cheers,
 Nik

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-24 15:39     ` Nikolay Aleksandrov
  0 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-05-24 15:39 UTC (permalink / raw)
  To: Hans Schultz, davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Ido Schimmel, bridge, Eric Dumazet,
	linux-kernel, Hans Schultz, linux-kselftest, Roopa Prabhu,
	Paolo Abeni, Vladimir Oltean, Shuah Khan, Vivien Didelot

On 24/05/2022 18:21, Hans Schultz wrote:
> Add an intermediate state for clients behind a locked port to allow for
> possible opening of the port for said clients. This feature corresponds
> to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The
> latter defined by Cisco.
> Locked FDB entries will be limited in number, so as to prevent DOS
> attacks by spamming the port with random entries. The limit will be
> a per port limit as it is a port based feature and that the port flushes
> all FDB entries on link down.
> 
> Only the kernel can set this FDB entry flag, while userspace can read
> the flag and remove it by deleting the FDB entry.
> 
> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> ---
>  include/uapi/linux/neighbour.h |  1 +
>  net/bridge/br_fdb.c            | 11 +++++++++++
>  net/bridge/br_if.c             |  1 +
>  net/bridge/br_input.c          | 11 ++++++++++-
>  net/bridge/br_private.h        |  7 ++++++-
>  5 files changed, 29 insertions(+), 2 deletions(-)
> 

Hi Hans,
So this approach has a fundamental problem, f->dst is changed without any synchronization
you cannot rely on it and thus you cannot account for these entries properly. We must be very
careful if we try to add any new synchronization not to affect performance as well.
More below...

> diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
> index 39c565e460c7..76d65b481086 100644
> --- a/include/uapi/linux/neighbour.h
> +++ b/include/uapi/linux/neighbour.h
> @@ -53,6 +53,7 @@ enum {
>  #define NTF_ROUTER	(1 << 7)
>  /* Extended flags under NDA_FLAGS_EXT: */
>  #define NTF_EXT_MANAGED	(1 << 0)
> +#define NTF_EXT_LOCKED	(1 << 1)
>  
>  /*
>   *	Neighbor Cache Entry States.
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index e7f4fccb6adb..6b83e2d6435d 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -105,6 +105,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
>  	struct nda_cacheinfo ci;
>  	struct nlmsghdr *nlh;
>  	struct ndmsg *ndm;
> +	u32 ext_flags = 0;
>  
>  	nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags);
>  	if (nlh == NULL)
> @@ -125,11 +126,16 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
>  		ndm->ndm_flags |= NTF_EXT_LEARNED;
>  	if (test_bit(BR_FDB_STICKY, &fdb->flags))
>  		ndm->ndm_flags |= NTF_STICKY;
> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags))
> +		ext_flags |= NTF_EXT_LOCKED;
>  
>  	if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr))
>  		goto nla_put_failure;
>  	if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex))
>  		goto nla_put_failure;
> +	if (nla_put_u32(skb, NDA_FLAGS_EXT, ext_flags))
> +		goto nla_put_failure;
> +
>  	ci.ndm_used	 = jiffies_to_clock_t(now - fdb->used);
>  	ci.ndm_confirmed = 0;
>  	ci.ndm_updated	 = jiffies_to_clock_t(now - fdb->updated);
> @@ -171,6 +177,7 @@ static inline size_t fdb_nlmsg_size(void)
>  	return NLMSG_ALIGN(sizeof(struct ndmsg))
>  		+ nla_total_size(ETH_ALEN) /* NDA_LLADDR */
>  		+ nla_total_size(sizeof(u32)) /* NDA_MASTER */
> +		+ nla_total_size(sizeof(u32)) /* NDA_FLAGS_EXT */
>  		+ nla_total_size(sizeof(u16)) /* NDA_VLAN */
>  		+ nla_total_size(sizeof(struct nda_cacheinfo))
>  		+ nla_total_size(0) /* NDA_FDB_EXT_ATTRS */
> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>  		fdb_del_hw_addr(br, f->key.addr.addr);
>  
> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
> +		atomic_dec(&f->dst->locked_entry_cnt);

Sorry but you cannot do this for multiple reasons:
 - f->dst can be NULL
 - f->dst changes without any synchronization
 - there is no synchronization between fdb's flags and its ->dst

Cheers,
 Nik

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-24 15:39     ` [Bridge] " Nikolay Aleksandrov
@ 2022-05-24 16:08       ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 16:08 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz, davem, kuba
  Cc: netdev, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Vladimir Oltean, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

>
> Hi Hans,
> So this approach has a fundamental problem, f->dst is changed without any synchronization
> you cannot rely on it and thus you cannot account for these entries properly. We must be very
> careful if we try to add any new synchronization not to affect performance as well.
> More below...
>
>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>  
>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>> +		atomic_dec(&f->dst->locked_entry_cnt);
>
> Sorry but you cannot do this for multiple reasons:
>  - f->dst can be NULL
>  - f->dst changes without any synchronization
>  - there is no synchronization between fdb's flags and its ->dst
>
> Cheers,
>  Nik

Hi Nik,

I could check if f->dst is NULL, but in general this should be able to
work on a per port basis, so do you have an idea of how to keep a per
port counter of added locked fdb entries?

Best,
Hans

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-24 16:08       ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 16:08 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz, davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Ido Schimmel, bridge, linux-kernel,
	Eric Dumazet, linux-kselftest, Roopa Prabhu, Paolo Abeni,
	Vladimir Oltean, Shuah Khan, Vivien Didelot

>
> Hi Hans,
> So this approach has a fundamental problem, f->dst is changed without any synchronization
> you cannot rely on it and thus you cannot account for these entries properly. We must be very
> careful if we try to add any new synchronization not to affect performance as well.
> More below...
>
>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>  
>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>> +		atomic_dec(&f->dst->locked_entry_cnt);
>
> Sorry but you cannot do this for multiple reasons:
>  - f->dst can be NULL
>  - f->dst changes without any synchronization
>  - there is no synchronization between fdb's flags and its ->dst
>
> Cheers,
>  Nik

Hi Nik,

I could check if f->dst is NULL, but in general this should be able to
work on a per port basis, so do you have an idea of how to keep a per
port counter of added locked fdb entries?

Best,
Hans

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-24 15:39     ` [Bridge] " Nikolay Aleksandrov
@ 2022-05-24 16:21       ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 16:21 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz, davem, kuba
  Cc: netdev, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Vladimir Oltean, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

>
> Hi Hans,
> So this approach has a fundamental problem, f->dst is changed without any synchronization
> you cannot rely on it and thus you cannot account for these entries properly. We must be very
> careful if we try to add any new synchronization not to affect performance as well.
> More below...
>
>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>  
>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>> +		atomic_dec(&f->dst->locked_entry_cnt);
>
> Sorry but you cannot do this for multiple reasons:
>  - f->dst can be NULL
>  - f->dst changes without any synchronization
>  - there is no synchronization between fdb's flags and its ->dst
>
> Cheers,
>  Nik

Hi Nik,

if a port is decoupled from the bridge, the locked entries would of
course be invalid, so maybe if adding and removing a port is accounted
for wrt locked entries and the count of locked entries, would that not
work?

Best,
Hans

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-24 16:21       ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-24 16:21 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz, davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Ido Schimmel, bridge, linux-kernel,
	Eric Dumazet, linux-kselftest, Roopa Prabhu, Paolo Abeni,
	Vladimir Oltean, Shuah Khan, Vivien Didelot

>
> Hi Hans,
> So this approach has a fundamental problem, f->dst is changed without any synchronization
> you cannot rely on it and thus you cannot account for these entries properly. We must be very
> careful if we try to add any new synchronization not to affect performance as well.
> More below...
>
>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>  
>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>> +		atomic_dec(&f->dst->locked_entry_cnt);
>
> Sorry but you cannot do this for multiple reasons:
>  - f->dst can be NULL
>  - f->dst changes without any synchronization
>  - there is no synchronization between fdb's flags and its ->dst
>
> Cheers,
>  Nik

Hi Nik,

if a port is decoupled from the bridge, the locked entries would of
course be invalid, so maybe if adding and removing a port is accounted
for wrt locked entries and the count of locked entries, would that not
work?

Best,
Hans

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-05-24 15:21   ` [Bridge] " Hans Schultz
  (?)
@ 2022-05-24 21:36   ` kernel test robot
  -1 siblings, 0 replies; 108+ messages in thread
From: kernel test robot @ 2022-05-24 21:36 UTC (permalink / raw)
  To: Hans Schultz, davem, kuba
  Cc: llvm, kbuild-all, netdev, Hans Schultz, Andrew Lunn,
	Vivien Didelot, Florian Fainelli, Vladimir Oltean, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Nikolay Aleksandrov, Shuah Khan, Daniel Borkmann, Ido Schimmel,
	linux-kernel, bridge, linux-kselftest

Hi Hans,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Hans-Schultz/Extend-locked-port-feature-with-FDB-locked-flag-MAC-Auth-MAB/20220524-232455
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 677fb7525331375ba2f90f4bc94a80b9b6e697a3
config: x86_64-randconfig-a005 (https://download.01.org/0day-ci/archive/20220525/202205250511.beuDDi9L-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 10c9ecce9f6096e18222a331c5e7d085bd813f75)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/5c2d731ec7670b3eb06906c64d66c6098c588a6a
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Hans-Schultz/Extend-locked-port-feature-with-FDB-locked-flag-MAC-Auth-MAB/20220524-232455
        git checkout 5c2d731ec7670b3eb06906c64d66c6098c588a6a
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/net/dsa/mv88e6xxx/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/net/dsa/mv88e6xxx/chip.c:2754:6: warning: variable 'err' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
           if (!locked_found) {
               ^~~~~~~~~~~~~
   drivers/net/dsa/mv88e6xxx/chip.c:2759:9: note: uninitialized use occurs here
           return err;
                  ^~~
   drivers/net/dsa/mv88e6xxx/chip.c:2754:2: note: remove the 'if' if its condition is always true
           if (!locked_found) {
           ^~~~~~~~~~~~~~~~~~~
   drivers/net/dsa/mv88e6xxx/chip.c:2749:9: note: initialize the variable 'err' to silence this warning
           int err;
                  ^
                   = 0
   1 warning generated.


vim +2754 drivers/net/dsa/mv88e6xxx/chip.c

  2742	
  2743	static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
  2744					  const unsigned char *addr, u16 vid,
  2745					  struct dsa_db db)
  2746	{
  2747		struct mv88e6xxx_chip *chip = ds->priv;
  2748		bool locked_found = false;
  2749		int err;
  2750	
  2751		if (mv88e6xxx_port_is_locked(chip, port, true))
  2752			locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
  2753	
> 2754		if (!locked_found) {
  2755			mv88e6xxx_reg_lock(chip);
  2756			err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
  2757			mv88e6xxx_reg_unlock(chip);
  2758		}
  2759		return err;
  2760	}
  2761	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-24 16:21       ` [Bridge] " Hans Schultz
@ 2022-05-25  8:06         ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-05-25  8:06 UTC (permalink / raw)
  To: Hans Schultz, davem, kuba
  Cc: netdev, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Vladimir Oltean, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

On 24/05/2022 19:21, Hans Schultz wrote:
>>
>> Hi Hans,
>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>> careful if we try to add any new synchronization not to affect performance as well.
>> More below...
>>
>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>  
>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>
>> Sorry but you cannot do this for multiple reasons:
>>  - f->dst can be NULL
>>  - f->dst changes without any synchronization
>>  - there is no synchronization between fdb's flags and its ->dst
>>
>> Cheers,
>>  Nik
> 
> Hi Nik,
> 
> if a port is decoupled from the bridge, the locked entries would of
> course be invalid, so maybe if adding and removing a port is accounted
> for wrt locked entries and the count of locked entries, would that not
> work?
> 
> Best,
> Hans

Hi Hans,
Unfortunately you need the correct amount of locked entries per-port if you want
to limit their number per-port, instead of globally. So you need a consistent
fdb view with all its attributes when changing its dst in this case, which would
require new locking because you have multiple dependent struct fields and it will
kill roaming/learning scalability. I don't think this use case is worth the complexity it
will bring, so I'd suggest an alternative - you can monitor the number of locked entries
per-port from a user-space agent and disable port learning or some similar solution that
doesn't require any complex kernel changes. Is the limit a requirement to add the feature?

I have an idea how to do it and to minimize the performance hit if it really is needed
but it'll add a lot of complexity which I'd like to avoid if possible.

Cheers,
 Nik

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-25  8:06         ` Nikolay Aleksandrov
  0 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-05-25  8:06 UTC (permalink / raw)
  To: Hans Schultz, davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Ido Schimmel, bridge, linux-kernel,
	Eric Dumazet, linux-kselftest, Roopa Prabhu, Paolo Abeni,
	Vladimir Oltean, Shuah Khan, Vivien Didelot

On 24/05/2022 19:21, Hans Schultz wrote:
>>
>> Hi Hans,
>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>> careful if we try to add any new synchronization not to affect performance as well.
>> More below...
>>
>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>  
>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>
>> Sorry but you cannot do this for multiple reasons:
>>  - f->dst can be NULL
>>  - f->dst changes without any synchronization
>>  - there is no synchronization between fdb's flags and its ->dst
>>
>> Cheers,
>>  Nik
> 
> Hi Nik,
> 
> if a port is decoupled from the bridge, the locked entries would of
> course be invalid, so maybe if adding and removing a port is accounted
> for wrt locked entries and the count of locked entries, would that not
> work?
> 
> Best,
> Hans

Hi Hans,
Unfortunately you need the correct amount of locked entries per-port if you want
to limit their number per-port, instead of globally. So you need a consistent
fdb view with all its attributes when changing its dst in this case, which would
require new locking because you have multiple dependent struct fields and it will
kill roaming/learning scalability. I don't think this use case is worth the complexity it
will bring, so I'd suggest an alternative - you can monitor the number of locked entries
per-port from a user-space agent and disable port learning or some similar solution that
doesn't require any complex kernel changes. Is the limit a requirement to add the feature?

I have an idea how to do it and to minimize the performance hit if it really is needed
but it'll add a lot of complexity which I'd like to avoid if possible.

Cheers,
 Nik

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-25  8:06         ` [Bridge] " Nikolay Aleksandrov
@ 2022-05-25  8:34           ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-25  8:34 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz, davem, kuba
  Cc: netdev, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Vladimir Oltean, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

On ons, maj 25, 2022 at 11:06, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> On 24/05/2022 19:21, Hans Schultz wrote:
>>>
>>> Hi Hans,
>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>>> careful if we try to add any new synchronization not to affect performance as well.
>>> More below...
>>>
>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>>  
>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>>
>>> Sorry but you cannot do this for multiple reasons:
>>>  - f->dst can be NULL
>>>  - f->dst changes without any synchronization
>>>  - there is no synchronization between fdb's flags and its ->dst
>>>
>>> Cheers,
>>>  Nik
>> 
>> Hi Nik,
>> 
>> if a port is decoupled from the bridge, the locked entries would of
>> course be invalid, so maybe if adding and removing a port is accounted
>> for wrt locked entries and the count of locked entries, would that not
>> work?
>> 
>> Best,
>> Hans
>
> Hi Hans,
> Unfortunately you need the correct amount of locked entries per-port if you want
> to limit their number per-port, instead of globally. So you need a
> consistent

Hi Nik,
the used dst is a port structure, so it is per-port and not globally.

Best,
Hans

> fdb view with all its attributes when changing its dst in this case, which would
> require new locking because you have multiple dependent struct fields and it will
> kill roaming/learning scalability. I don't think this use case is worth the complexity it
> will bring, so I'd suggest an alternative - you can monitor the number of locked entries
> per-port from a user-space agent and disable port learning or some similar solution that
> doesn't require any complex kernel changes. Is the limit a requirement to add the feature?
>
> I have an idea how to do it and to minimize the performance hit if it really is needed
> but it'll add a lot of complexity which I'd like to avoid if possible.
>
> Cheers,
>  Nik

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-25  8:34           ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-25  8:34 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz, davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Ido Schimmel, bridge, linux-kernel,
	Eric Dumazet, linux-kselftest, Roopa Prabhu, Paolo Abeni,
	Vladimir Oltean, Shuah Khan, Vivien Didelot

On ons, maj 25, 2022 at 11:06, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> On 24/05/2022 19:21, Hans Schultz wrote:
>>>
>>> Hi Hans,
>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>>> careful if we try to add any new synchronization not to affect performance as well.
>>> More below...
>>>
>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>>  
>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>>
>>> Sorry but you cannot do this for multiple reasons:
>>>  - f->dst can be NULL
>>>  - f->dst changes without any synchronization
>>>  - there is no synchronization between fdb's flags and its ->dst
>>>
>>> Cheers,
>>>  Nik
>> 
>> Hi Nik,
>> 
>> if a port is decoupled from the bridge, the locked entries would of
>> course be invalid, so maybe if adding and removing a port is accounted
>> for wrt locked entries and the count of locked entries, would that not
>> work?
>> 
>> Best,
>> Hans
>
> Hi Hans,
> Unfortunately you need the correct amount of locked entries per-port if you want
> to limit their number per-port, instead of globally. So you need a
> consistent

Hi Nik,
the used dst is a port structure, so it is per-port and not globally.

Best,
Hans

> fdb view with all its attributes when changing its dst in this case, which would
> require new locking because you have multiple dependent struct fields and it will
> kill roaming/learning scalability. I don't think this use case is worth the complexity it
> will bring, so I'd suggest an alternative - you can monitor the number of locked entries
> per-port from a user-space agent and disable port learning or some similar solution that
> doesn't require any complex kernel changes. Is the limit a requirement to add the feature?
>
> I have an idea how to do it and to minimize the performance hit if it really is needed
> but it'll add a lot of complexity which I'd like to avoid if possible.
>
> Cheers,
>  Nik

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-25  8:34           ` [Bridge] " Hans Schultz
@ 2022-05-25  8:38             ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-05-25  8:38 UTC (permalink / raw)
  To: Hans Schultz, davem, kuba
  Cc: netdev, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Vladimir Oltean, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

On 25/05/2022 11:34, Hans Schultz wrote:
> On ons, maj 25, 2022 at 11:06, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>> On 24/05/2022 19:21, Hans Schultz wrote:
>>>>
>>>> Hi Hans,
>>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>>>> careful if we try to add any new synchronization not to affect performance as well.
>>>> More below...
>>>>
>>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>>>  
>>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>>>
>>>> Sorry but you cannot do this for multiple reasons:
>>>>  - f->dst can be NULL
>>>>  - f->dst changes without any synchronization
>>>>  - there is no synchronization between fdb's flags and its ->dst
>>>>
>>>> Cheers,
>>>>  Nik
>>>
>>> Hi Nik,
>>>
>>> if a port is decoupled from the bridge, the locked entries would of
>>> course be invalid, so maybe if adding and removing a port is accounted
>>> for wrt locked entries and the count of locked entries, would that not
>>> work?
>>>
>>> Best,
>>> Hans
>>
>> Hi Hans,
>> Unfortunately you need the correct amount of locked entries per-port if you want
>> to limit their number per-port, instead of globally. So you need a
>> consistent
> 
> Hi Nik,
> the used dst is a port structure, so it is per-port and not globally.
> 
> Best,
> Hans
> 

Yeah, I know. :) That's why I wrote it, if the limit is not a feature requirement I'd suggest
dropping it altogether, it can be enforced externally (e.g. from user-space) if needed.

By the way just fyi net-next is closed right now due to merge window. And one more
thing please include a short log of changes between versions when you send a new one.
I had to go look for v2 to find out what changed.

>> fdb view with all its attributes when changing its dst in this case, which would
>> require new locking because you have multiple dependent struct fields and it will
>> kill roaming/learning scalability. I don't think this use case is worth the complexity it
>> will bring, so I'd suggest an alternative - you can monitor the number of locked entries
>> per-port from a user-space agent and disable port learning or some similar solution that
>> doesn't require any complex kernel changes. Is the limit a requirement to add the feature?
>>
>> I have an idea how to do it and to minimize the performance hit if it really is needed
>> but it'll add a lot of complexity which I'd like to avoid if possible.
>>
>> Cheers,
>>  Nik


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-25  8:38             ` Nikolay Aleksandrov
  0 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-05-25  8:38 UTC (permalink / raw)
  To: Hans Schultz, davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Ido Schimmel, bridge, linux-kernel,
	Eric Dumazet, linux-kselftest, Roopa Prabhu, Paolo Abeni,
	Vladimir Oltean, Shuah Khan, Vivien Didelot

On 25/05/2022 11:34, Hans Schultz wrote:
> On ons, maj 25, 2022 at 11:06, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>> On 24/05/2022 19:21, Hans Schultz wrote:
>>>>
>>>> Hi Hans,
>>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>>>> careful if we try to add any new synchronization not to affect performance as well.
>>>> More below...
>>>>
>>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>>>  
>>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>>>
>>>> Sorry but you cannot do this for multiple reasons:
>>>>  - f->dst can be NULL
>>>>  - f->dst changes without any synchronization
>>>>  - there is no synchronization between fdb's flags and its ->dst
>>>>
>>>> Cheers,
>>>>  Nik
>>>
>>> Hi Nik,
>>>
>>> if a port is decoupled from the bridge, the locked entries would of
>>> course be invalid, so maybe if adding and removing a port is accounted
>>> for wrt locked entries and the count of locked entries, would that not
>>> work?
>>>
>>> Best,
>>> Hans
>>
>> Hi Hans,
>> Unfortunately you need the correct amount of locked entries per-port if you want
>> to limit their number per-port, instead of globally. So you need a
>> consistent
> 
> Hi Nik,
> the used dst is a port structure, so it is per-port and not globally.
> 
> Best,
> Hans
> 

Yeah, I know. :) That's why I wrote it, if the limit is not a feature requirement I'd suggest
dropping it altogether, it can be enforced externally (e.g. from user-space) if needed.

By the way just fyi net-next is closed right now due to merge window. And one more
thing please include a short log of changes between versions when you send a new one.
I had to go look for v2 to find out what changed.

>> fdb view with all its attributes when changing its dst in this case, which would
>> require new locking because you have multiple dependent struct fields and it will
>> kill roaming/learning scalability. I don't think this use case is worth the complexity it
>> will bring, so I'd suggest an alternative - you can monitor the number of locked entries
>> per-port from a user-space agent and disable port learning or some similar solution that
>> doesn't require any complex kernel changes. Is the limit a requirement to add the feature?
>>
>> I have an idea how to do it and to minimize the performance hit if it really is needed
>> but it'll add a lot of complexity which I'd like to avoid if possible.
>>
>> Cheers,
>>  Nik


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-25  8:38             ` [Bridge] " Nikolay Aleksandrov
@ 2022-05-25  9:11               ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-25  9:11 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz, davem, kuba
  Cc: netdev, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Vladimir Oltean, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

On ons, maj 25, 2022 at 11:38, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> On 25/05/2022 11:34, Hans Schultz wrote:
>> On ons, maj 25, 2022 at 11:06, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>>> On 24/05/2022 19:21, Hans Schultz wrote:
>>>>>
>>>>> Hi Hans,
>>>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>>>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>>>>> careful if we try to add any new synchronization not to affect performance as well.
>>>>> More below...
>>>>>
>>>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>>>>  
>>>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>>>>
>>>>> Sorry but you cannot do this for multiple reasons:
>>>>>  - f->dst can be NULL
>>>>>  - f->dst changes without any synchronization
>>>>>  - there is no synchronization between fdb's flags and its ->dst
>>>>>
>>>>> Cheers,
>>>>>  Nik
>>>>
>>>> Hi Nik,
>>>>
>>>> if a port is decoupled from the bridge, the locked entries would of
>>>> course be invalid, so maybe if adding and removing a port is accounted
>>>> for wrt locked entries and the count of locked entries, would that not
>>>> work?
>>>>
>>>> Best,
>>>> Hans
>>>
>>> Hi Hans,
>>> Unfortunately you need the correct amount of locked entries per-port if you want
>>> to limit their number per-port, instead of globally. So you need a
>>> consistent
>> 
>> Hi Nik,
>> the used dst is a port structure, so it is per-port and not globally.
>> 
>> Best,
>> Hans
>> 
>
> Yeah, I know. :) That's why I wrote it, if the limit is not a feature requirement I'd suggest
> dropping it altogether, it can be enforced externally (e.g. from user-space) if needed.
>
> By the way just fyi net-next is closed right now due to merge window. And one more
> thing please include a short log of changes between versions when you send a new one.
> I had to go look for v2 to find out what changed.
>

Okay, I will drop the limit in the bridge module, which is an easy thing
to do. :) (It is mostly there to ensure against DOS attacks if someone
bombards a locked port with random mac addresses.)
I have a similar limitation in the driver, which should then probably be
dropped too?

The mayor difference between v2 and v3 is in the mv88e6xxx driver, where
I now keep an inventory of locked ATU entries and remove them based on a
timer (mv88e6xxx_switchcore.c).

I guess the mentioned log should be in the cover letter part?


>>> fdb view with all its attributes when changing its dst in this case, which would
>>> require new locking because you have multiple dependent struct fields and it will
>>> kill roaming/learning scalability. I don't think this use case is worth the complexity it
>>> will bring, so I'd suggest an alternative - you can monitor the number of locked entries
>>> per-port from a user-space agent and disable port learning or some similar solution that
>>> doesn't require any complex kernel changes. Is the limit a requirement to add the feature?
>>>
>>> I have an idea how to do it and to minimize the performance hit if it really is needed
>>> but it'll add a lot of complexity which I'd like to avoid if possible.
>>>
>>> Cheers,
>>>  Nik

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-25  9:11               ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-25  9:11 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz, davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Ido Schimmel, bridge, linux-kernel,
	Eric Dumazet, linux-kselftest, Roopa Prabhu, Paolo Abeni,
	Vladimir Oltean, Shuah Khan, Vivien Didelot

On ons, maj 25, 2022 at 11:38, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> On 25/05/2022 11:34, Hans Schultz wrote:
>> On ons, maj 25, 2022 at 11:06, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>>> On 24/05/2022 19:21, Hans Schultz wrote:
>>>>>
>>>>> Hi Hans,
>>>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>>>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>>>>> careful if we try to add any new synchronization not to affect performance as well.
>>>>> More below...
>>>>>
>>>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>>>>  
>>>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>>>>
>>>>> Sorry but you cannot do this for multiple reasons:
>>>>>  - f->dst can be NULL
>>>>>  - f->dst changes without any synchronization
>>>>>  - there is no synchronization between fdb's flags and its ->dst
>>>>>
>>>>> Cheers,
>>>>>  Nik
>>>>
>>>> Hi Nik,
>>>>
>>>> if a port is decoupled from the bridge, the locked entries would of
>>>> course be invalid, so maybe if adding and removing a port is accounted
>>>> for wrt locked entries and the count of locked entries, would that not
>>>> work?
>>>>
>>>> Best,
>>>> Hans
>>>
>>> Hi Hans,
>>> Unfortunately you need the correct amount of locked entries per-port if you want
>>> to limit their number per-port, instead of globally. So you need a
>>> consistent
>> 
>> Hi Nik,
>> the used dst is a port structure, so it is per-port and not globally.
>> 
>> Best,
>> Hans
>> 
>
> Yeah, I know. :) That's why I wrote it, if the limit is not a feature requirement I'd suggest
> dropping it altogether, it can be enforced externally (e.g. from user-space) if needed.
>
> By the way just fyi net-next is closed right now due to merge window. And one more
> thing please include a short log of changes between versions when you send a new one.
> I had to go look for v2 to find out what changed.
>

Okay, I will drop the limit in the bridge module, which is an easy thing
to do. :) (It is mostly there to ensure against DOS attacks if someone
bombards a locked port with random mac addresses.)
I have a similar limitation in the driver, which should then probably be
dropped too?

The mayor difference between v2 and v3 is in the mv88e6xxx driver, where
I now keep an inventory of locked ATU entries and remove them based on a
timer (mv88e6xxx_switchcore.c).

I guess the mentioned log should be in the cover letter part?


>>> fdb view with all its attributes when changing its dst in this case, which would
>>> require new locking because you have multiple dependent struct fields and it will
>>> kill roaming/learning scalability. I don't think this use case is worth the complexity it
>>> will bring, so I'd suggest an alternative - you can monitor the number of locked entries
>>> per-port from a user-space agent and disable port learning or some similar solution that
>>> doesn't require any complex kernel changes. Is the limit a requirement to add the feature?
>>>
>>> I have an idea how to do it and to minimize the performance hit if it really is needed
>>> but it'll add a lot of complexity which I'd like to avoid if possible.
>>>
>>> Cheers,
>>>  Nik

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-25  9:11               ` [Bridge] " Hans Schultz
@ 2022-05-25 10:18                 ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-05-25 10:18 UTC (permalink / raw)
  To: Hans Schultz, davem, kuba
  Cc: netdev, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Vladimir Oltean, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

On 25/05/2022 12:11, Hans Schultz wrote:
> On ons, maj 25, 2022 at 11:38, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>> On 25/05/2022 11:34, Hans Schultz wrote:
>>> On ons, maj 25, 2022 at 11:06, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>>>> On 24/05/2022 19:21, Hans Schultz wrote:
>>>>>>
>>>>>> Hi Hans,
>>>>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>>>>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>>>>>> careful if we try to add any new synchronization not to affect performance as well.
>>>>>> More below...
>>>>>>
>>>>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>>>>>  
>>>>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>>>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>>>>>
>>>>>> Sorry but you cannot do this for multiple reasons:
>>>>>>  - f->dst can be NULL
>>>>>>  - f->dst changes without any synchronization
>>>>>>  - there is no synchronization between fdb's flags and its ->dst
>>>>>>
>>>>>> Cheers,
>>>>>>  Nik
>>>>>
>>>>> Hi Nik,
>>>>>
>>>>> if a port is decoupled from the bridge, the locked entries would of
>>>>> course be invalid, so maybe if adding and removing a port is accounted
>>>>> for wrt locked entries and the count of locked entries, would that not
>>>>> work?
>>>>>
>>>>> Best,
>>>>> Hans
>>>>
>>>> Hi Hans,
>>>> Unfortunately you need the correct amount of locked entries per-port if you want
>>>> to limit their number per-port, instead of globally. So you need a
>>>> consistent
>>>
>>> Hi Nik,
>>> the used dst is a port structure, so it is per-port and not globally.
>>>
>>> Best,
>>> Hans
>>>
>>
>> Yeah, I know. :) That's why I wrote it, if the limit is not a feature requirement I'd suggest
>> dropping it altogether, it can be enforced externally (e.g. from user-space) if needed.
>>
>> By the way just fyi net-next is closed right now due to merge window. And one more
>> thing please include a short log of changes between versions when you send a new one.
>> I had to go look for v2 to find out what changed.
>>
> 
> Okay, I will drop the limit in the bridge module, which is an easy thing
> to do. :) (It is mostly there to ensure against DOS attacks if someone
> bombards a locked port with random mac addresses.)
> I have a similar limitation in the driver, which should then probably be
> dropped too?
> 

That is up to you/driver, I'd try looking for similar problems in other switch drivers
and check how those were handled. There are people in the CC above that can
directly answer that. :)

> The mayor difference between v2 and v3 is in the mv88e6xxx driver, where
> I now keep an inventory of locked ATU entries and remove them based on a
> timer (mv88e6xxx_switchcore.c).
> 

ack

> I guess the mentioned log should be in the cover letter part?
> 

Yep, usually a short mention of what changed to make it easier for reviewers.
Some people also add the patch-specific changes to each patch under the ---
so they're not included in the log, but I'm fine either way as long as I don't
have to go digging up the old versions.

> 
>>>> fdb view with all its attributes when changing its dst in this case, which would
>>>> require new locking because you have multiple dependent struct fields and it will
>>>> kill roaming/learning scalability. I don't think this use case is worth the complexity it
>>>> will bring, so I'd suggest an alternative - you can monitor the number of locked entries
>>>> per-port from a user-space agent and disable port learning or some similar solution that
>>>> doesn't require any complex kernel changes. Is the limit a requirement to add the feature?
>>>>
>>>> I have an idea how to do it and to minimize the performance hit if it really is needed
>>>> but it'll add a lot of complexity which I'd like to avoid if possible.
>>>>
>>>> Cheers,
>>>>  Nik


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-25 10:18                 ` Nikolay Aleksandrov
  0 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-05-25 10:18 UTC (permalink / raw)
  To: Hans Schultz, davem, kuba
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Ido Schimmel, bridge, linux-kernel,
	Eric Dumazet, linux-kselftest, Roopa Prabhu, Paolo Abeni,
	Vladimir Oltean, Shuah Khan, Vivien Didelot

On 25/05/2022 12:11, Hans Schultz wrote:
> On ons, maj 25, 2022 at 11:38, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>> On 25/05/2022 11:34, Hans Schultz wrote:
>>> On ons, maj 25, 2022 at 11:06, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>>>> On 24/05/2022 19:21, Hans Schultz wrote:
>>>>>>
>>>>>> Hi Hans,
>>>>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>>>>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>>>>>> careful if we try to add any new synchronization not to affect performance as well.
>>>>>> More below...
>>>>>>
>>>>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>>>>>  
>>>>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>>>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>>>>>
>>>>>> Sorry but you cannot do this for multiple reasons:
>>>>>>  - f->dst can be NULL
>>>>>>  - f->dst changes without any synchronization
>>>>>>  - there is no synchronization between fdb's flags and its ->dst
>>>>>>
>>>>>> Cheers,
>>>>>>  Nik
>>>>>
>>>>> Hi Nik,
>>>>>
>>>>> if a port is decoupled from the bridge, the locked entries would of
>>>>> course be invalid, so maybe if adding and removing a port is accounted
>>>>> for wrt locked entries and the count of locked entries, would that not
>>>>> work?
>>>>>
>>>>> Best,
>>>>> Hans
>>>>
>>>> Hi Hans,
>>>> Unfortunately you need the correct amount of locked entries per-port if you want
>>>> to limit their number per-port, instead of globally. So you need a
>>>> consistent
>>>
>>> Hi Nik,
>>> the used dst is a port structure, so it is per-port and not globally.
>>>
>>> Best,
>>> Hans
>>>
>>
>> Yeah, I know. :) That's why I wrote it, if the limit is not a feature requirement I'd suggest
>> dropping it altogether, it can be enforced externally (e.g. from user-space) if needed.
>>
>> By the way just fyi net-next is closed right now due to merge window. And one more
>> thing please include a short log of changes between versions when you send a new one.
>> I had to go look for v2 to find out what changed.
>>
> 
> Okay, I will drop the limit in the bridge module, which is an easy thing
> to do. :) (It is mostly there to ensure against DOS attacks if someone
> bombards a locked port with random mac addresses.)
> I have a similar limitation in the driver, which should then probably be
> dropped too?
> 

That is up to you/driver, I'd try looking for similar problems in other switch drivers
and check how those were handled. There are people in the CC above that can
directly answer that. :)

> The mayor difference between v2 and v3 is in the mv88e6xxx driver, where
> I now keep an inventory of locked ATU entries and remove them based on a
> timer (mv88e6xxx_switchcore.c).
> 

ack

> I guess the mentioned log should be in the cover letter part?
> 

Yep, usually a short mention of what changed to make it easier for reviewers.
Some people also add the patch-specific changes to each patch under the ---
so they're not included in the log, but I'm fine either way as long as I don't
have to go digging up the old versions.

> 
>>>> fdb view with all its attributes when changing its dst in this case, which would
>>>> require new locking because you have multiple dependent struct fields and it will
>>>> kill roaming/learning scalability. I don't think this use case is worth the complexity it
>>>> will bring, so I'd suggest an alternative - you can monitor the number of locked entries
>>>> per-port from a user-space agent and disable port learning or some similar solution that
>>>> doesn't require any complex kernel changes. Is the limit a requirement to add the feature?
>>>>
>>>> I have an idea how to do it and to minimize the performance hit if it really is needed
>>>> but it'll add a lot of complexity which I'd like to avoid if possible.
>>>>
>>>> Cheers,
>>>>  Nik


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-24 15:21   ` [Bridge] " Hans Schultz
@ 2022-05-26 14:13     ` Ido Schimmel
  -1 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-05-26 14:13 UTC (permalink / raw)
  To: Hans Schultz
  Cc: davem, kuba, netdev, Hans Schultz, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

On Tue, May 24, 2022 at 05:21:41PM +0200, Hans Schultz wrote:
> Add an intermediate state for clients behind a locked port to allow for
> possible opening of the port for said clients. This feature corresponds
> to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The
> latter defined by Cisco.
> Locked FDB entries will be limited in number, so as to prevent DOS
> attacks by spamming the port with random entries. The limit will be
> a per port limit as it is a port based feature and that the port flushes
> all FDB entries on link down.

Why locked FDB entries need a special treatment compared to regular
entries? A port that has learning enabled can be spammed with random
source MACs just as well.

The authorization daemon that is monitoring FDB notifications can have a
policy to shut down a port if the rate / number of locked entries is
above a given threshold.

I don't think this kind of policy belongs in the kernel. If it resides
in user space, then the threshold can be adjusted. Currently it's hard
coded to 64 and I don't see how user space can change or monitor it.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-26 14:13     ` Ido Schimmel
  0 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-05-26 14:13 UTC (permalink / raw)
  To: Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem, linux-kernel

On Tue, May 24, 2022 at 05:21:41PM +0200, Hans Schultz wrote:
> Add an intermediate state for clients behind a locked port to allow for
> possible opening of the port for said clients. This feature corresponds
> to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The
> latter defined by Cisco.
> Locked FDB entries will be limited in number, so as to prevent DOS
> attacks by spamming the port with random entries. The limit will be
> a per port limit as it is a port based feature and that the port flushes
> all FDB entries on link down.

Why locked FDB entries need a special treatment compared to regular
entries? A port that has learning enabled can be spammed with random
source MACs just as well.

The authorization daemon that is monitoring FDB notifications can have a
policy to shut down a port if the rate / number of locked entries is
above a given threshold.

I don't think this kind of policy belongs in the kernel. If it resides
in user space, then the threshold can be adjusted. Currently it's hard
coded to 64 and I don't see how user space can change or monitor it.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 4/4] selftests: forwarding: add test of MAC-Auth Bypass to locked port tests
  2022-05-24 15:21   ` [Bridge] " Hans Schultz
@ 2022-05-26 14:27     ` Ido Schimmel
  -1 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-05-26 14:27 UTC (permalink / raw)
  To: Hans Schultz
  Cc: davem, kuba, netdev, Hans Schultz, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

On Tue, May 24, 2022 at 05:21:44PM +0200, Hans Schultz wrote:
> Verify that the MAC-Auth mechanism works by adding a FDB entry with the
> locked flag set. denying access until the FDB entry is replaced with a
> FDB entry without the locked flag set.
> 
> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> ---
>  .../net/forwarding/bridge_locked_port.sh      | 42 ++++++++++++++++---
>  1 file changed, 36 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
> index 5b02b6b60ce7..50b9048d044a 100755
> --- a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
> +++ b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
> @@ -1,7 +1,7 @@
>  #!/bin/bash
>  # SPDX-License-Identifier: GPL-2.0
>  
> -ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan"
> +ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan locked_port_mab"
>  NUM_NETIFS=4
>  CHECK_TC="no"
>  source lib.sh
> @@ -94,13 +94,13 @@ locked_port_ipv4()
>  	ping_do $h1 192.0.2.2
>  	check_fail $? "Ping worked after locking port, but before adding FDB entry"
>  
> -	bridge fdb add `mac_get $h1` dev $swp1 master static
> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>  
>  	ping_do $h1 192.0.2.2
>  	check_err $? "Ping did not work after locking port and adding FDB entry"
>  
>  	bridge link set dev $swp1 locked off
> -	bridge fdb del `mac_get $h1` dev $swp1 master static
> +	bridge fdb del `mac_get $h1` dev $swp1 master
>  
>  	ping_do $h1 192.0.2.2
>  	check_err $? "Ping did not work after unlocking port and removing FDB entry."
> @@ -124,13 +124,13 @@ locked_port_vlan()
>  	ping_do $h1.100 198.51.100.2
>  	check_fail $? "Ping through vlan worked after locking port, but before adding FDB entry"
>  
> -	bridge fdb add `mac_get $h1` dev $swp1 vlan 100 master static
> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>  
>  	ping_do $h1.100 198.51.100.2
>  	check_err $? "Ping through vlan did not work after locking port and adding FDB entry"
>  
>  	bridge link set dev $swp1 locked off
> -	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master static
> +	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master
>  
>  	ping_do $h1.100 198.51.100.2
>  	check_err $? "Ping through vlan did not work after unlocking port and removing FDB entry"
> @@ -153,7 +153,8 @@ locked_port_ipv6()
>  	ping6_do $h1 2001:db8:1::2
>  	check_fail $? "Ping6 worked after locking port, but before adding FDB entry"
>  
> -	bridge fdb add `mac_get $h1` dev $swp1 master static
> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
> +
>  	ping6_do $h1 2001:db8:1::2
>  	check_err $? "Ping6 did not work after locking port and adding FDB entry"
>  
> @@ -166,6 +167,35 @@ locked_port_ipv6()
>  	log_test "Locked port ipv6"
>  }

Why did you change s/add/replace/? Also, from the subject and commit
message I understand the patch is about adding a new test, not changing
existing ones.

>  
> +locked_port_mab()
> +{
> +	RET=0
> +	check_locked_port_support || return 0
> +
> +	ping_do $h1 192.0.2.2
> +	check_err $? "MAB: Ping did not work before locking port"
> +
> +	bridge link set dev $swp1 locked on
> +	bridge link set dev $swp1 learning on
> +
> +	bridge fdb del `mac_get $h1` dev $swp1 master

Why the delete is needed? Aren't you getting errors on trying to delete
a non-existing entry? In previous test cases learning is disabled and it
seems the FDB entry is cleaned up.

> +
> +	ping_do $h1 192.0.2.2
> +	check_fail $? "MAB: Ping worked on locked port without FDB entry"
> +
> +	bridge fdb show | grep `mac_get $h1` | grep -q "locked"
> +	check_err $? "MAB: No locked fdb entry after ping on locked port"
> +
> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
> +
> +	ping_do $h1 192.0.2.2
> +	check_err $? "MAB: Ping did not work with fdb entry without locked flag"
> +
> +	bridge fdb del `mac_get $h1` dev $swp1 master

bridge link set dev $swp1 learning off

> +	bridge link set dev $swp1 locked off
> +
> +	log_test "Locked port MAB"
> +}
>  trap cleanup EXIT
>  
>  setup_prepare
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 4/4] selftests: forwarding: add test of MAC-Auth Bypass to locked port tests
@ 2022-05-26 14:27     ` Ido Schimmel
  0 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-05-26 14:27 UTC (permalink / raw)
  To: Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem, linux-kernel

On Tue, May 24, 2022 at 05:21:44PM +0200, Hans Schultz wrote:
> Verify that the MAC-Auth mechanism works by adding a FDB entry with the
> locked flag set. denying access until the FDB entry is replaced with a
> FDB entry without the locked flag set.
> 
> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> ---
>  .../net/forwarding/bridge_locked_port.sh      | 42 ++++++++++++++++---
>  1 file changed, 36 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
> index 5b02b6b60ce7..50b9048d044a 100755
> --- a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
> +++ b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
> @@ -1,7 +1,7 @@
>  #!/bin/bash
>  # SPDX-License-Identifier: GPL-2.0
>  
> -ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan"
> +ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan locked_port_mab"
>  NUM_NETIFS=4
>  CHECK_TC="no"
>  source lib.sh
> @@ -94,13 +94,13 @@ locked_port_ipv4()
>  	ping_do $h1 192.0.2.2
>  	check_fail $? "Ping worked after locking port, but before adding FDB entry"
>  
> -	bridge fdb add `mac_get $h1` dev $swp1 master static
> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>  
>  	ping_do $h1 192.0.2.2
>  	check_err $? "Ping did not work after locking port and adding FDB entry"
>  
>  	bridge link set dev $swp1 locked off
> -	bridge fdb del `mac_get $h1` dev $swp1 master static
> +	bridge fdb del `mac_get $h1` dev $swp1 master
>  
>  	ping_do $h1 192.0.2.2
>  	check_err $? "Ping did not work after unlocking port and removing FDB entry."
> @@ -124,13 +124,13 @@ locked_port_vlan()
>  	ping_do $h1.100 198.51.100.2
>  	check_fail $? "Ping through vlan worked after locking port, but before adding FDB entry"
>  
> -	bridge fdb add `mac_get $h1` dev $swp1 vlan 100 master static
> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>  
>  	ping_do $h1.100 198.51.100.2
>  	check_err $? "Ping through vlan did not work after locking port and adding FDB entry"
>  
>  	bridge link set dev $swp1 locked off
> -	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master static
> +	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master
>  
>  	ping_do $h1.100 198.51.100.2
>  	check_err $? "Ping through vlan did not work after unlocking port and removing FDB entry"
> @@ -153,7 +153,8 @@ locked_port_ipv6()
>  	ping6_do $h1 2001:db8:1::2
>  	check_fail $? "Ping6 worked after locking port, but before adding FDB entry"
>  
> -	bridge fdb add `mac_get $h1` dev $swp1 master static
> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
> +
>  	ping6_do $h1 2001:db8:1::2
>  	check_err $? "Ping6 did not work after locking port and adding FDB entry"
>  
> @@ -166,6 +167,35 @@ locked_port_ipv6()
>  	log_test "Locked port ipv6"
>  }

Why did you change s/add/replace/? Also, from the subject and commit
message I understand the patch is about adding a new test, not changing
existing ones.

>  
> +locked_port_mab()
> +{
> +	RET=0
> +	check_locked_port_support || return 0
> +
> +	ping_do $h1 192.0.2.2
> +	check_err $? "MAB: Ping did not work before locking port"
> +
> +	bridge link set dev $swp1 locked on
> +	bridge link set dev $swp1 learning on
> +
> +	bridge fdb del `mac_get $h1` dev $swp1 master

Why the delete is needed? Aren't you getting errors on trying to delete
a non-existing entry? In previous test cases learning is disabled and it
seems the FDB entry is cleaned up.

> +
> +	ping_do $h1 192.0.2.2
> +	check_fail $? "MAB: Ping worked on locked port without FDB entry"
> +
> +	bridge fdb show | grep `mac_get $h1` | grep -q "locked"
> +	check_err $? "MAB: No locked fdb entry after ping on locked port"
> +
> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
> +
> +	ping_do $h1 192.0.2.2
> +	check_err $? "MAB: Ping did not work with fdb entry without locked flag"
> +
> +	bridge fdb del `mac_get $h1` dev $swp1 master

bridge link set dev $swp1 learning off

> +	bridge link set dev $swp1 locked off
> +
> +	log_test "Locked port MAB"
> +}
>  trap cleanup EXIT
>  
>  setup_prepare
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-26 14:13     ` [Bridge] " Ido Schimmel
@ 2022-05-27  8:52       ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-27  8:52 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

On tor, maj 26, 2022 at 17:13, Ido Schimmel <idosch@idosch.org> wrote:
> On Tue, May 24, 2022 at 05:21:41PM +0200, Hans Schultz wrote:
>> Add an intermediate state for clients behind a locked port to allow for
>> possible opening of the port for said clients. This feature corresponds
>> to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The
>> latter defined by Cisco.
>> Locked FDB entries will be limited in number, so as to prevent DOS
>> attacks by spamming the port with random entries. The limit will be
>> a per port limit as it is a port based feature and that the port flushes
>> all FDB entries on link down.
>
> Why locked FDB entries need a special treatment compared to regular
> entries? A port that has learning enabled can be spammed with random
> source MACs just as well.
>
> The authorization daemon that is monitoring FDB notifications can have a
> policy to shut down a port if the rate / number of locked entries is
> above a given threshold.
>
> I don't think this kind of policy belongs in the kernel. If it resides
> in user space, then the threshold can be adjusted. Currently it's hard
> coded to 64 and I don't see how user space can change or monitor it.

In the Mac-Auth/MAB context, the locked port feature is really a form of
CPU based learning, and on mv88e6xxx switchcores, this is facilitated by
violation interrupts. Based on miss violation interrupts, the locked
entries are then added to a list with a timer to remove the entries
according to the bridge timeout.
As this is very CPU intensive compared to normal operation, the
assessment is that all this will jam up most devices if bombarded with
random entries at link speed, and my estimate is that any userspace 
daemon that listens to the ensuing fdb events will never get a chance
to stop this flood and eventually the device will lock down/reset. To
prevent this, the limit is introduced.

Ideally this limit could be adjustable from userspace, but in real
use-cases a cap like 64 should be more than enough, as that corresponds
to 64 possible devices behind a port that cannot authenticate by other
means (printers etc.) than having their mac addresses white-listed.

The software bridge behavior was then just set to correspond to the
offloaded behavior, but after correspondence with Nik, the software
bridge locked entries limit will be removed.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-27  8:52       ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-27  8:52 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Ido Schimmel, Vivien Didelot, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem

On tor, maj 26, 2022 at 17:13, Ido Schimmel <idosch@idosch.org> wrote:
> On Tue, May 24, 2022 at 05:21:41PM +0200, Hans Schultz wrote:
>> Add an intermediate state for clients behind a locked port to allow for
>> possible opening of the port for said clients. This feature corresponds
>> to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The
>> latter defined by Cisco.
>> Locked FDB entries will be limited in number, so as to prevent DOS
>> attacks by spamming the port with random entries. The limit will be
>> a per port limit as it is a port based feature and that the port flushes
>> all FDB entries on link down.
>
> Why locked FDB entries need a special treatment compared to regular
> entries? A port that has learning enabled can be spammed with random
> source MACs just as well.
>
> The authorization daemon that is monitoring FDB notifications can have a
> policy to shut down a port if the rate / number of locked entries is
> above a given threshold.
>
> I don't think this kind of policy belongs in the kernel. If it resides
> in user space, then the threshold can be adjusted. Currently it's hard
> coded to 64 and I don't see how user space can change or monitor it.

In the Mac-Auth/MAB context, the locked port feature is really a form of
CPU based learning, and on mv88e6xxx switchcores, this is facilitated by
violation interrupts. Based on miss violation interrupts, the locked
entries are then added to a list with a timer to remove the entries
according to the bridge timeout.
As this is very CPU intensive compared to normal operation, the
assessment is that all this will jam up most devices if bombarded with
random entries at link speed, and my estimate is that any userspace 
daemon that listens to the ensuing fdb events will never get a chance
to stop this flood and eventually the device will lock down/reset. To
prevent this, the limit is introduced.

Ideally this limit could be adjustable from userspace, but in real
use-cases a cap like 64 should be more than enough, as that corresponds
to 64 possible devices behind a port that cannot authenticate by other
means (printers etc.) than having their mac addresses white-listed.

The software bridge behavior was then just set to correspond to the
offloaded behavior, but after correspondence with Nik, the software
bridge locked entries limit will be removed.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 4/4] selftests: forwarding: add test of MAC-Auth Bypass to locked port tests
  2022-05-26 14:27     ` [Bridge] " Ido Schimmel
@ 2022-05-27  9:07       ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-27  9:07 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

On tor, maj 26, 2022 at 17:27, Ido Schimmel <idosch@idosch.org> wrote:
> On Tue, May 24, 2022 at 05:21:44PM +0200, Hans Schultz wrote:
>> Verify that the MAC-Auth mechanism works by adding a FDB entry with the
>> locked flag set. denying access until the FDB entry is replaced with a
>> FDB entry without the locked flag set.
>> 
>> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
>> ---
>>  .../net/forwarding/bridge_locked_port.sh      | 42 ++++++++++++++++---
>>  1 file changed, 36 insertions(+), 6 deletions(-)
>> 
>> diff --git a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
>> index 5b02b6b60ce7..50b9048d044a 100755
>> --- a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
>> +++ b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
>> @@ -1,7 +1,7 @@
>>  #!/bin/bash
>>  # SPDX-License-Identifier: GPL-2.0
>>  
>> -ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan"
>> +ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan locked_port_mab"
>>  NUM_NETIFS=4
>>  CHECK_TC="no"
>>  source lib.sh
>> @@ -94,13 +94,13 @@ locked_port_ipv4()
>>  	ping_do $h1 192.0.2.2
>>  	check_fail $? "Ping worked after locking port, but before adding FDB entry"
>>  
>> -	bridge fdb add `mac_get $h1` dev $swp1 master static
>> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>>  
>>  	ping_do $h1 192.0.2.2
>>  	check_err $? "Ping did not work after locking port and adding FDB entry"
>>  
>>  	bridge link set dev $swp1 locked off
>> -	bridge fdb del `mac_get $h1` dev $swp1 master static
>> +	bridge fdb del `mac_get $h1` dev $swp1 master
>>  
>>  	ping_do $h1 192.0.2.2
>>  	check_err $? "Ping did not work after unlocking port and removing FDB entry."
>> @@ -124,13 +124,13 @@ locked_port_vlan()
>>  	ping_do $h1.100 198.51.100.2
>>  	check_fail $? "Ping through vlan worked after locking port, but before adding FDB entry"
>>  
>> -	bridge fdb add `mac_get $h1` dev $swp1 vlan 100 master static
>> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>>  
>>  	ping_do $h1.100 198.51.100.2
>>  	check_err $? "Ping through vlan did not work after locking port and adding FDB entry"
>>  
>>  	bridge link set dev $swp1 locked off
>> -	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master static
>> +	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master
>>  
>>  	ping_do $h1.100 198.51.100.2
>>  	check_err $? "Ping through vlan did not work after unlocking port and removing FDB entry"
>> @@ -153,7 +153,8 @@ locked_port_ipv6()
>>  	ping6_do $h1 2001:db8:1::2
>>  	check_fail $? "Ping6 worked after locking port, but before adding FDB entry"
>>  
>> -	bridge fdb add `mac_get $h1` dev $swp1 master static
>> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>> +
>>  	ping6_do $h1 2001:db8:1::2
>>  	check_err $? "Ping6 did not work after locking port and adding FDB entry"
>>  
>> @@ -166,6 +167,35 @@ locked_port_ipv6()
>>  	log_test "Locked port ipv6"
>>  }
>
> Why did you change s/add/replace/? Also, from the subject and commit
> message I understand the patch is about adding a new test, not changing
> existing ones.
>

Sorry, I might have lost a bit track of the kernel selftests, as for
internal reasons there has been a pause in the work. I will remove the
changes to the previous tests, and I hope it will be fine.

>>  
>> +locked_port_mab()
>> +{
>> +	RET=0
>> +	check_locked_port_support || return 0
>> +
>> +	ping_do $h1 192.0.2.2
>> +	check_err $? "MAB: Ping did not work before locking port"
>> +
>> +	bridge link set dev $swp1 locked on
>> +	bridge link set dev $swp1 learning on
>> +
>> +	bridge fdb del `mac_get $h1` dev $swp1 master
>
> Why the delete is needed? Aren't you getting errors on trying to delete
> a non-existing entry? In previous test cases learning is disabled and it
> seems the FDB entry is cleaned up.
>

I guess you are right.

>> +
>> +	ping_do $h1 192.0.2.2
>> +	check_fail $? "MAB: Ping worked on locked port without FDB entry"
>> +
>> +	bridge fdb show | grep `mac_get $h1` | grep -q "locked"
>> +	check_err $? "MAB: No locked fdb entry after ping on locked port"
>> +
>> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>> +
>> +	ping_do $h1 192.0.2.2
>> +	check_err $? "MAB: Ping did not work with fdb entry without locked flag"
>> +
>> +	bridge fdb del `mac_get $h1` dev $swp1 master
>
> bridge link set dev $swp1 learning off
>

noted.

>> +	bridge link set dev $swp1 locked off
>> +
>> +	log_test "Locked port MAB"
>> +}
>>  trap cleanup EXIT
>>  
>>  setup_prepare
>> -- 
>> 2.30.2
>> 

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 4/4] selftests: forwarding: add test of MAC-Auth Bypass to locked port tests
@ 2022-05-27  9:07       ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-27  9:07 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Ido Schimmel, Vivien Didelot, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem

On tor, maj 26, 2022 at 17:27, Ido Schimmel <idosch@idosch.org> wrote:
> On Tue, May 24, 2022 at 05:21:44PM +0200, Hans Schultz wrote:
>> Verify that the MAC-Auth mechanism works by adding a FDB entry with the
>> locked flag set. denying access until the FDB entry is replaced with a
>> FDB entry without the locked flag set.
>> 
>> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
>> ---
>>  .../net/forwarding/bridge_locked_port.sh      | 42 ++++++++++++++++---
>>  1 file changed, 36 insertions(+), 6 deletions(-)
>> 
>> diff --git a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
>> index 5b02b6b60ce7..50b9048d044a 100755
>> --- a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
>> +++ b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh
>> @@ -1,7 +1,7 @@
>>  #!/bin/bash
>>  # SPDX-License-Identifier: GPL-2.0
>>  
>> -ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan"
>> +ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan locked_port_mab"
>>  NUM_NETIFS=4
>>  CHECK_TC="no"
>>  source lib.sh
>> @@ -94,13 +94,13 @@ locked_port_ipv4()
>>  	ping_do $h1 192.0.2.2
>>  	check_fail $? "Ping worked after locking port, but before adding FDB entry"
>>  
>> -	bridge fdb add `mac_get $h1` dev $swp1 master static
>> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>>  
>>  	ping_do $h1 192.0.2.2
>>  	check_err $? "Ping did not work after locking port and adding FDB entry"
>>  
>>  	bridge link set dev $swp1 locked off
>> -	bridge fdb del `mac_get $h1` dev $swp1 master static
>> +	bridge fdb del `mac_get $h1` dev $swp1 master
>>  
>>  	ping_do $h1 192.0.2.2
>>  	check_err $? "Ping did not work after unlocking port and removing FDB entry."
>> @@ -124,13 +124,13 @@ locked_port_vlan()
>>  	ping_do $h1.100 198.51.100.2
>>  	check_fail $? "Ping through vlan worked after locking port, but before adding FDB entry"
>>  
>> -	bridge fdb add `mac_get $h1` dev $swp1 vlan 100 master static
>> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>>  
>>  	ping_do $h1.100 198.51.100.2
>>  	check_err $? "Ping through vlan did not work after locking port and adding FDB entry"
>>  
>>  	bridge link set dev $swp1 locked off
>> -	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master static
>> +	bridge fdb del `mac_get $h1` dev $swp1 vlan 100 master
>>  
>>  	ping_do $h1.100 198.51.100.2
>>  	check_err $? "Ping through vlan did not work after unlocking port and removing FDB entry"
>> @@ -153,7 +153,8 @@ locked_port_ipv6()
>>  	ping6_do $h1 2001:db8:1::2
>>  	check_fail $? "Ping6 worked after locking port, but before adding FDB entry"
>>  
>> -	bridge fdb add `mac_get $h1` dev $swp1 master static
>> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>> +
>>  	ping6_do $h1 2001:db8:1::2
>>  	check_err $? "Ping6 did not work after locking port and adding FDB entry"
>>  
>> @@ -166,6 +167,35 @@ locked_port_ipv6()
>>  	log_test "Locked port ipv6"
>>  }
>
> Why did you change s/add/replace/? Also, from the subject and commit
> message I understand the patch is about adding a new test, not changing
> existing ones.
>

Sorry, I might have lost a bit track of the kernel selftests, as for
internal reasons there has been a pause in the work. I will remove the
changes to the previous tests, and I hope it will be fine.

>>  
>> +locked_port_mab()
>> +{
>> +	RET=0
>> +	check_locked_port_support || return 0
>> +
>> +	ping_do $h1 192.0.2.2
>> +	check_err $? "MAB: Ping did not work before locking port"
>> +
>> +	bridge link set dev $swp1 locked on
>> +	bridge link set dev $swp1 learning on
>> +
>> +	bridge fdb del `mac_get $h1` dev $swp1 master
>
> Why the delete is needed? Aren't you getting errors on trying to delete
> a non-existing entry? In previous test cases learning is disabled and it
> seems the FDB entry is cleaned up.
>

I guess you are right.

>> +
>> +	ping_do $h1 192.0.2.2
>> +	check_fail $? "MAB: Ping worked on locked port without FDB entry"
>> +
>> +	bridge fdb show | grep `mac_get $h1` | grep -q "locked"
>> +	check_err $? "MAB: No locked fdb entry after ping on locked port"
>> +
>> +	bridge fdb replace `mac_get $h1` dev $swp1 master static
>> +
>> +	ping_do $h1 192.0.2.2
>> +	check_err $? "MAB: Ping did not work with fdb entry without locked flag"
>> +
>> +	bridge fdb del `mac_get $h1` dev $swp1 master
>
> bridge link set dev $swp1 learning off
>

noted.

>> +	bridge link set dev $swp1 locked off
>> +
>> +	log_test "Locked port MAB"
>> +}
>>  trap cleanup EXIT
>>  
>>  setup_prepare
>> -- 
>> 2.30.2
>> 

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-27  8:52       ` [Bridge] " Hans Schultz
@ 2022-05-27  9:58         ` Ido Schimmel
  -1 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-05-27  9:58 UTC (permalink / raw)
  To: Hans Schultz
  Cc: davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

On Fri, May 27, 2022 at 10:52:27AM +0200, Hans Schultz wrote:
> On tor, maj 26, 2022 at 17:13, Ido Schimmel <idosch@idosch.org> wrote:
> > On Tue, May 24, 2022 at 05:21:41PM +0200, Hans Schultz wrote:
> >> Add an intermediate state for clients behind a locked port to allow for
> >> possible opening of the port for said clients. This feature corresponds
> >> to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The
> >> latter defined by Cisco.
> >> Locked FDB entries will be limited in number, so as to prevent DOS
> >> attacks by spamming the port with random entries. The limit will be
> >> a per port limit as it is a port based feature and that the port flushes
> >> all FDB entries on link down.
> >
> > Why locked FDB entries need a special treatment compared to regular
> > entries? A port that has learning enabled can be spammed with random
> > source MACs just as well.
> >
> > The authorization daemon that is monitoring FDB notifications can have a
> > policy to shut down a port if the rate / number of locked entries is
> > above a given threshold.
> >
> > I don't think this kind of policy belongs in the kernel. If it resides
> > in user space, then the threshold can be adjusted. Currently it's hard
> > coded to 64 and I don't see how user space can change or monitor it.
> 
> In the Mac-Auth/MAB context, the locked port feature is really a form of
> CPU based learning, and on mv88e6xxx switchcores, this is facilitated by
> violation interrupts. Based on miss violation interrupts, the locked
> entries are then added to a list with a timer to remove the entries
> according to the bridge timeout.
> As this is very CPU intensive compared to normal operation, the
> assessment is that all this will jam up most devices if bombarded with
> random entries at link speed, and my estimate is that any userspace 
> daemon that listens to the ensuing fdb events will never get a chance
> to stop this flood and eventually the device will lock down/reset. To
> prevent this, the limit is introduced.
> 
> Ideally this limit could be adjustable from userspace, but in real
> use-cases a cap like 64 should be more than enough, as that corresponds
> to 64 possible devices behind a port that cannot authenticate by other
> means (printers etc.) than having their mac addresses white-listed.
> 
> The software bridge behavior was then just set to correspond to the
> offloaded behavior, but after correspondence with Nik, the software
> bridge locked entries limit will be removed.

As far as the bridge is concerned, locked entries are not really
different from regular learned entries in terms of processing and since
we don't have limits for regular entries I don't think we should have
limits for locked entries.

I do understand the problem you have in mv88e6xxx and I think it would
be wise to hard code a reasonable limit there. It can be adjusted over
time based on feedback and possibly exposed to user space.

Just to give you another data point about how this works in other
devices, I can say that at least in Spectrum this works a bit
differently. Packets that ingress via a locked port and incur an FDB
miss are trapped to the CPU where they should be injected into the Rx
path so that the bridge will create the 'locked' FDB entry and notify it
to user space. The packets are obviously rated limited as the CPU cannot
handle billions of packets per second, unlike the ASIC. The limit is not
per bridge port (or even per bridge), but instead global to the entire
device.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-27  9:58         ` Ido Schimmel
  0 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-05-27  9:58 UTC (permalink / raw)
  To: Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Ido Schimmel, Vivien Didelot, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem

On Fri, May 27, 2022 at 10:52:27AM +0200, Hans Schultz wrote:
> On tor, maj 26, 2022 at 17:13, Ido Schimmel <idosch@idosch.org> wrote:
> > On Tue, May 24, 2022 at 05:21:41PM +0200, Hans Schultz wrote:
> >> Add an intermediate state for clients behind a locked port to allow for
> >> possible opening of the port for said clients. This feature corresponds
> >> to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The
> >> latter defined by Cisco.
> >> Locked FDB entries will be limited in number, so as to prevent DOS
> >> attacks by spamming the port with random entries. The limit will be
> >> a per port limit as it is a port based feature and that the port flushes
> >> all FDB entries on link down.
> >
> > Why locked FDB entries need a special treatment compared to regular
> > entries? A port that has learning enabled can be spammed with random
> > source MACs just as well.
> >
> > The authorization daemon that is monitoring FDB notifications can have a
> > policy to shut down a port if the rate / number of locked entries is
> > above a given threshold.
> >
> > I don't think this kind of policy belongs in the kernel. If it resides
> > in user space, then the threshold can be adjusted. Currently it's hard
> > coded to 64 and I don't see how user space can change or monitor it.
> 
> In the Mac-Auth/MAB context, the locked port feature is really a form of
> CPU based learning, and on mv88e6xxx switchcores, this is facilitated by
> violation interrupts. Based on miss violation interrupts, the locked
> entries are then added to a list with a timer to remove the entries
> according to the bridge timeout.
> As this is very CPU intensive compared to normal operation, the
> assessment is that all this will jam up most devices if bombarded with
> random entries at link speed, and my estimate is that any userspace 
> daemon that listens to the ensuing fdb events will never get a chance
> to stop this flood and eventually the device will lock down/reset. To
> prevent this, the limit is introduced.
> 
> Ideally this limit could be adjustable from userspace, but in real
> use-cases a cap like 64 should be more than enough, as that corresponds
> to 64 possible devices behind a port that cannot authenticate by other
> means (printers etc.) than having their mac addresses white-listed.
> 
> The software bridge behavior was then just set to correspond to the
> offloaded behavior, but after correspondence with Nik, the software
> bridge locked entries limit will be removed.

As far as the bridge is concerned, locked entries are not really
different from regular learned entries in terms of processing and since
we don't have limits for regular entries I don't think we should have
limits for locked entries.

I do understand the problem you have in mv88e6xxx and I think it would
be wise to hard code a reasonable limit there. It can be adjusted over
time based on feedback and possibly exposed to user space.

Just to give you another data point about how this works in other
devices, I can say that at least in Spectrum this works a bit
differently. Packets that ingress via a locked port and incur an FDB
miss are trapped to the CPU where they should be injected into the Rx
path so that the bridge will create the 'locked' FDB entry and notify it
to user space. The packets are obviously rated limited as the CPU cannot
handle billions of packets per second, unlike the ASIC. The limit is not
per bridge port (or even per bridge), but instead global to the entire
device.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-27  9:58         ` [Bridge] " Ido Schimmel
@ 2022-05-27 16:00           ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-27 16:00 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

>
> As far as the bridge is concerned, locked entries are not really
> different from regular learned entries in terms of processing and since
> we don't have limits for regular entries I don't think we should have
> limits for locked entries.
>
> I do understand the problem you have in mv88e6xxx and I think it would
> be wise to hard code a reasonable limit there. It can be adjusted over
> time based on feedback and possibly exposed to user space.
>
> Just to give you another data point about how this works in other
> devices, I can say that at least in Spectrum this works a bit
> differently. Packets that ingress via a locked port and incur an FDB
> miss are trapped to the CPU where they should be injected into the Rx
> path so that the bridge will create the 'locked' FDB entry and notify it
> to user space. The packets are obviously rated limited as the CPU cannot
> handle billions of packets per second, unlike the ASIC. The limit is not
> per bridge port (or even per bridge), but instead global to the entire
> device.

Ahh, I see. I think that the best is for those FDB entries to be created
as dynamic entries, so that user-space does not have to keep track of
unauthorized entries.
Also the mv88e6xxx chipsets have a 'hold at one' feature, for authorized
entries, so that if a device goes quiet after being authorized the
driver can signal to the software bridge and further to userspace that
an authorized device has gone quiet, and the entry can then be
removed. This though requires user added dynamic entries in the ATU, or
you can call it synthetically 'learned' entries.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-27 16:00           ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-27 16:00 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Ido Schimmel, Vivien Didelot, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem

>
> As far as the bridge is concerned, locked entries are not really
> different from regular learned entries in terms of processing and since
> we don't have limits for regular entries I don't think we should have
> limits for locked entries.
>
> I do understand the problem you have in mv88e6xxx and I think it would
> be wise to hard code a reasonable limit there. It can be adjusted over
> time based on feedback and possibly exposed to user space.
>
> Just to give you another data point about how this works in other
> devices, I can say that at least in Spectrum this works a bit
> differently. Packets that ingress via a locked port and incur an FDB
> miss are trapped to the CPU where they should be injected into the Rx
> path so that the bridge will create the 'locked' FDB entry and notify it
> to user space. The packets are obviously rated limited as the CPU cannot
> handle billions of packets per second, unlike the ASIC. The limit is not
> per bridge port (or even per bridge), but instead global to the entire
> device.

Ahh, I see. I think that the best is for those FDB entries to be created
as dynamic entries, so that user-space does not have to keep track of
unauthorized entries.
Also the mv88e6xxx chipsets have a 'hold at one' feature, for authorized
entries, so that if a device goes quiet after being authorized the
driver can signal to the software bridge and further to userspace that
an authorized device has gone quiet, and the entry can then be
removed. This though requires user added dynamic entries in the ATU, or
you can call it synthetically 'learned' entries.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-27  9:58         ` [Bridge] " Ido Schimmel
@ 2022-05-31  9:34           ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-31  9:34 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

> Just to give you another data point about how this works in other
> devices, I can say that at least in Spectrum this works a bit
> differently. Packets that ingress via a locked port and incur an FDB
> miss are trapped to the CPU where they should be injected into the Rx
> path so that the bridge will create the 'locked' FDB entry and notify it
> to user space. The packets are obviously rated limited as the CPU cannot
> handle billions of packets per second, unlike the ASIC. The limit is not
> per bridge port (or even per bridge), but instead global to the entire
> device.

Btw, will the bridge not create a SWITCHDEV_FDB_ADD_TO_DEVICE event
towards the switchcore in the scheme you mention and thus add an entry
that opens up for the specified mac address?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-31  9:34           ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-31  9:34 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Ido Schimmel, Vivien Didelot, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem

> Just to give you another data point about how this works in other
> devices, I can say that at least in Spectrum this works a bit
> differently. Packets that ingress via a locked port and incur an FDB
> miss are trapped to the CPU where they should be injected into the Rx
> path so that the bridge will create the 'locked' FDB entry and notify it
> to user space. The packets are obviously rated limited as the CPU cannot
> handle billions of packets per second, unlike the ASIC. The limit is not
> per bridge port (or even per bridge), but instead global to the entire
> device.

Btw, will the bridge not create a SWITCHDEV_FDB_ADD_TO_DEVICE event
towards the switchcore in the scheme you mention and thus add an entry
that opens up for the specified mac address?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-31  9:34           ` [Bridge] " Hans Schultz
@ 2022-05-31 14:23             ` Ido Schimmel
  -1 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-05-31 14:23 UTC (permalink / raw)
  To: Hans Schultz
  Cc: Ido Schimmel, davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, linux-kernel, bridge,
	linux-kselftest

On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
> > Just to give you another data point about how this works in other
> > devices, I can say that at least in Spectrum this works a bit
> > differently. Packets that ingress via a locked port and incur an FDB
> > miss are trapped to the CPU where they should be injected into the Rx
> > path so that the bridge will create the 'locked' FDB entry and notify it
> > to user space. The packets are obviously rated limited as the CPU cannot
> > handle billions of packets per second, unlike the ASIC. The limit is not
> > per bridge port (or even per bridge), but instead global to the entire
> > device.
> 
> Btw, will the bridge not create a SWITCHDEV_FDB_ADD_TO_DEVICE event
> towards the switchcore in the scheme you mention and thus add an entry
> that opens up for the specified mac address?

It will, but the driver needs to ignore FDB entries that are notified
with locked flag. I see that you extended 'struct
switchdev_notifier_fdb_info' with the locked flag, but it's not
initialized in br_switchdev_fdb_populate(). Can you add it in the next
version?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-31 14:23             ` Ido Schimmel
  0 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-05-31 14:23 UTC (permalink / raw)
  To: Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Vivien Didelot, Ido Schimmel, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem

On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
> > Just to give you another data point about how this works in other
> > devices, I can say that at least in Spectrum this works a bit
> > differently. Packets that ingress via a locked port and incur an FDB
> > miss are trapped to the CPU where they should be injected into the Rx
> > path so that the bridge will create the 'locked' FDB entry and notify it
> > to user space. The packets are obviously rated limited as the CPU cannot
> > handle billions of packets per second, unlike the ASIC. The limit is not
> > per bridge port (or even per bridge), but instead global to the entire
> > device.
> 
> Btw, will the bridge not create a SWITCHDEV_FDB_ADD_TO_DEVICE event
> towards the switchcore in the scheme you mention and thus add an entry
> that opens up for the specified mac address?

It will, but the driver needs to ignore FDB entries that are notified
with locked flag. I see that you extended 'struct
switchdev_notifier_fdb_info' with the locked flag, but it's not
initialized in br_switchdev_fdb_populate(). Can you add it in the next
version?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-31 14:23             ` [Bridge] " Ido Schimmel
@ 2022-05-31 15:49               ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-31 15:49 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: Ido Schimmel, davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, linux-kernel, bridge,
	linux-kselftest

On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
>> > Just to give you another data point about how this works in other
>> > devices, I can say that at least in Spectrum this works a bit
>> > differently. Packets that ingress via a locked port and incur an FDB
>> > miss are trapped to the CPU where they should be injected into the Rx
>> > path so that the bridge will create the 'locked' FDB entry and notify it
>> > to user space. The packets are obviously rated limited as the CPU cannot
>> > handle billions of packets per second, unlike the ASIC. The limit is not
>> > per bridge port (or even per bridge), but instead global to the entire
>> > device.
>> 
>> Btw, will the bridge not create a SWITCHDEV_FDB_ADD_TO_DEVICE event
>> towards the switchcore in the scheme you mention and thus add an entry
>> that opens up for the specified mac address?
>
> It will, but the driver needs to ignore FDB entries that are notified
> with locked flag. I see that you extended 'struct
> switchdev_notifier_fdb_info' with the locked flag, but it's not
> initialized in br_switchdev_fdb_populate(). Can you add it in the next
> version?

Yes, definitely. I have only had focus on it in the messages coming up
from the driver, and neglected it the other way.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-05-31 15:49               ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-05-31 15:49 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Vivien Didelot, Ido Schimmel, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem

On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
>> > Just to give you another data point about how this works in other
>> > devices, I can say that at least in Spectrum this works a bit
>> > differently. Packets that ingress via a locked port and incur an FDB
>> > miss are trapped to the CPU where they should be injected into the Rx
>> > path so that the bridge will create the 'locked' FDB entry and notify it
>> > to user space. The packets are obviously rated limited as the CPU cannot
>> > handle billions of packets per second, unlike the ASIC. The limit is not
>> > per bridge port (or even per bridge), but instead global to the entire
>> > device.
>> 
>> Btw, will the bridge not create a SWITCHDEV_FDB_ADD_TO_DEVICE event
>> towards the switchcore in the scheme you mention and thus add an entry
>> that opens up for the specified mac address?
>
> It will, but the driver needs to ignore FDB entries that are notified
> with locked flag. I see that you extended 'struct
> switchdev_notifier_fdb_info' with the locked flag, but it's not
> initialized in br_switchdev_fdb_populate(). Can you add it in the next
> version?

Yes, definitely. I have only had focus on it in the messages coming up
from the driver, and neglected it the other way.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-31 14:23             ` [Bridge] " Ido Schimmel
@ 2022-06-02  9:17               ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-06-02  9:17 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: Ido Schimmel, davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, linux-kernel, bridge,
	linux-kselftest

On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
>> > Just to give you another data point about how this works in other
>> > devices, I can say that at least in Spectrum this works a bit
>> > differently. Packets that ingress via a locked port and incur an FDB
>> > miss are trapped to the CPU where they should be injected into the Rx
>> > path so that the bridge will create the 'locked' FDB entry and notify it
>> > to user space. The packets are obviously rated limited as the CPU cannot
>> > handle billions of packets per second, unlike the ASIC. The limit is not
>> > per bridge port (or even per bridge), but instead global to the entire
>> > device.
>> 
>> Btw, will the bridge not create a SWITCHDEV_FDB_ADD_TO_DEVICE event
>> towards the switchcore in the scheme you mention and thus add an entry
>> that opens up for the specified mac address?
>
> It will, but the driver needs to ignore FDB entries that are notified
> with locked flag. I see that you extended 'struct
> switchdev_notifier_fdb_info' with the locked flag, but it's not
> initialized in br_switchdev_fdb_populate(). Can you add it in the next
> version?

An issue with sending the flag to the driver is that port_fdb_add() is
suddenly getting more and more arguments and getting messy in my
opinion, but maybe that's just how it is...

Another issue is that
bridge fdb add MAC dev DEV master static
seems to add the entry with the SELF flag set, which I don't think is
what we would want it to do or?
Also the replace command is not really supported properly as it is. I
have made a fix for that which looks something like this:

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 6cbb27e3b976..f43aa204f375 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -917,6 +917,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
                if (flags & NLM_F_EXCL)
                        return -EEXIST;
 
+               if (flags & NLM_F_REPLACE)
+                       modified = true;
+
                if (READ_ONCE(fdb->dst) != source) {
                        WRITE_ONCE(fdb->dst, source);
                        modified = true;

The argument for always sending notifications to the driver in the case
of replace is that a replace command will refresh the entries timeout if
the entry is the same. Any thoughts on this?

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-06-02  9:17               ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-06-02  9:17 UTC (permalink / raw)
  To: Ido Schimmel, Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Vivien Didelot, Ido Schimmel, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem

On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
>> > Just to give you another data point about how this works in other
>> > devices, I can say that at least in Spectrum this works a bit
>> > differently. Packets that ingress via a locked port and incur an FDB
>> > miss are trapped to the CPU where they should be injected into the Rx
>> > path so that the bridge will create the 'locked' FDB entry and notify it
>> > to user space. The packets are obviously rated limited as the CPU cannot
>> > handle billions of packets per second, unlike the ASIC. The limit is not
>> > per bridge port (or even per bridge), but instead global to the entire
>> > device.
>> 
>> Btw, will the bridge not create a SWITCHDEV_FDB_ADD_TO_DEVICE event
>> towards the switchcore in the scheme you mention and thus add an entry
>> that opens up for the specified mac address?
>
> It will, but the driver needs to ignore FDB entries that are notified
> with locked flag. I see that you extended 'struct
> switchdev_notifier_fdb_info' with the locked flag, but it's not
> initialized in br_switchdev_fdb_populate(). Can you add it in the next
> version?

An issue with sending the flag to the driver is that port_fdb_add() is
suddenly getting more and more arguments and getting messy in my
opinion, but maybe that's just how it is...

Another issue is that
bridge fdb add MAC dev DEV master static
seems to add the entry with the SELF flag set, which I don't think is
what we would want it to do or?
Also the replace command is not really supported properly as it is. I
have made a fix for that which looks something like this:

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 6cbb27e3b976..f43aa204f375 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -917,6 +917,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
                if (flags & NLM_F_EXCL)
                        return -EEXIST;
 
+               if (flags & NLM_F_REPLACE)
+                       modified = true;
+
                if (READ_ONCE(fdb->dst) != source) {
                        WRITE_ONCE(fdb->dst, source);
                        modified = true;

The argument for always sending notifications to the driver in the case
of replace is that a replace command will refresh the entries timeout if
the entry is the same. Any thoughts on this?

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-06-02  9:17               ` [Bridge] " Hans Schultz
@ 2022-06-02  9:33                 ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-06-02  9:33 UTC (permalink / raw)
  To: Hans Schultz, Ido Schimmel
  Cc: Ido Schimmel, davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Shuah Khan,
	Daniel Borkmann, linux-kernel, bridge, linux-kselftest

On 02/06/2022 12:17, Hans Schultz wrote:
> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
>>>> Just to give you another data point about how this works in other
>>>> devices, I can say that at least in Spectrum this works a bit
>>>> differently. Packets that ingress via a locked port and incur an FDB
>>>> miss are trapped to the CPU where they should be injected into the Rx
>>>> path so that the bridge will create the 'locked' FDB entry and notify it
>>>> to user space. The packets are obviously rated limited as the CPU cannot
>>>> handle billions of packets per second, unlike the ASIC. The limit is not
>>>> per bridge port (or even per bridge), but instead global to the entire
>>>> device.
>>>
>>> Btw, will the bridge not create a SWITCHDEV_FDB_ADD_TO_DEVICE event
>>> towards the switchcore in the scheme you mention and thus add an entry
>>> that opens up for the specified mac address?
>>
>> It will, but the driver needs to ignore FDB entries that are notified
>> with locked flag. I see that you extended 'struct
>> switchdev_notifier_fdb_info' with the locked flag, but it's not
>> initialized in br_switchdev_fdb_populate(). Can you add it in the next
>> version?
> 
> An issue with sending the flag to the driver is that port_fdb_add() is
> suddenly getting more and more arguments and getting messy in my
> opinion, but maybe that's just how it is...
> 
> Another issue is that
> bridge fdb add MAC dev DEV master static
> seems to add the entry with the SELF flag set, which I don't think is
> what we would want it to do or?

I don't see such thing (hacked iproute2 to print the flags before cmd):
$ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
flags 0x4

0x4 = NTF_MASTER only

> Also the replace command is not really supported properly as it is. I
> have made a fix for that which looks something like this:
> 
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index 6cbb27e3b976..f43aa204f375 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -917,6 +917,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
>                 if (flags & NLM_F_EXCL)
>                         return -EEXIST;
>  
> +               if (flags & NLM_F_REPLACE)
> +                       modified = true;
> +
>                 if (READ_ONCE(fdb->dst) != source) {
>                         WRITE_ONCE(fdb->dst, source);
>                         modified = true;
> 
> The argument for always sending notifications to the driver in the case
> of replace is that a replace command will refresh the entries timeout if
> the entry is the same. Any thoughts on this?

I don't think so. It always updates its "used" timer, not its "updated" timer which is the one
for expire. A replace that doesn't actually change anything on the entry shouldn't generate
a notification.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-06-02  9:33                 ` Nikolay Aleksandrov
  0 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-06-02  9:33 UTC (permalink / raw)
  To: Hans Schultz, Ido Schimmel
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, bridge, linux-kernel, Vivien Didelot,
	Ido Schimmel, Eric Dumazet, Paolo Abeni, linux-kselftest,
	Roopa Prabhu, kuba, Vladimir Oltean, Shuah Khan, davem

On 02/06/2022 12:17, Hans Schultz wrote:
> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
>>>> Just to give you another data point about how this works in other
>>>> devices, I can say that at least in Spectrum this works a bit
>>>> differently. Packets that ingress via a locked port and incur an FDB
>>>> miss are trapped to the CPU where they should be injected into the Rx
>>>> path so that the bridge will create the 'locked' FDB entry and notify it
>>>> to user space. The packets are obviously rated limited as the CPU cannot
>>>> handle billions of packets per second, unlike the ASIC. The limit is not
>>>> per bridge port (or even per bridge), but instead global to the entire
>>>> device.
>>>
>>> Btw, will the bridge not create a SWITCHDEV_FDB_ADD_TO_DEVICE event
>>> towards the switchcore in the scheme you mention and thus add an entry
>>> that opens up for the specified mac address?
>>
>> It will, but the driver needs to ignore FDB entries that are notified
>> with locked flag. I see that you extended 'struct
>> switchdev_notifier_fdb_info' with the locked flag, but it's not
>> initialized in br_switchdev_fdb_populate(). Can you add it in the next
>> version?
> 
> An issue with sending the flag to the driver is that port_fdb_add() is
> suddenly getting more and more arguments and getting messy in my
> opinion, but maybe that's just how it is...
> 
> Another issue is that
> bridge fdb add MAC dev DEV master static
> seems to add the entry with the SELF flag set, which I don't think is
> what we would want it to do or?

I don't see such thing (hacked iproute2 to print the flags before cmd):
$ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
flags 0x4

0x4 = NTF_MASTER only

> Also the replace command is not really supported properly as it is. I
> have made a fix for that which looks something like this:
> 
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index 6cbb27e3b976..f43aa204f375 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -917,6 +917,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
>                 if (flags & NLM_F_EXCL)
>                         return -EEXIST;
>  
> +               if (flags & NLM_F_REPLACE)
> +                       modified = true;
> +
>                 if (READ_ONCE(fdb->dst) != source) {
>                         WRITE_ONCE(fdb->dst, source);
>                         modified = true;
> 
> The argument for always sending notifications to the driver in the case
> of replace is that a replace command will refresh the entries timeout if
> the entry is the same. Any thoughts on this?

I don't think so. It always updates its "used" timer, not its "updated" timer which is the one
for expire. A replace that doesn't actually change anything on the entry shouldn't generate
a notification.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-06-02  9:33                 ` [Bridge] " Nikolay Aleksandrov
@ 2022-06-02 10:17                   ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-06-02 10:17 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz, Ido Schimmel
  Cc: Ido Schimmel, davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Shuah Khan,
	Daniel Borkmann, linux-kernel, bridge, linux-kselftest

On tor, jun 02, 2022 at 12:33, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> On 02/06/2022 12:17, Hans Schultz wrote:
>> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
>>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:

>> Another issue is that
>> bridge fdb add MAC dev DEV master static
>> seems to add the entry with the SELF flag set, which I don't think is
>> what we would want it to do or?
>
> I don't see such thing (hacked iproute2 to print the flags before cmd):
> $ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
> flags 0x4
>
> 0x4 = NTF_MASTER only
>

I also get 0x4 from iproute2, but I still get SELF entries when I look
with:
bridge fdb show dev DEV

>> Also the replace command is not really supported properly as it is. I
>> have made a fix for that which looks something like this:
>> 
>> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
>> index 6cbb27e3b976..f43aa204f375 100644
>> --- a/net/bridge/br_fdb.c
>> +++ b/net/bridge/br_fdb.c
>> @@ -917,6 +917,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
>>                 if (flags & NLM_F_EXCL)
>>                         return -EEXIST;
>>  
>> +               if (flags & NLM_F_REPLACE)
>> +                       modified = true;
>> +
>>                 if (READ_ONCE(fdb->dst) != source) {
>>                         WRITE_ONCE(fdb->dst, source);
>>                         modified = true;
>> 
>> The argument for always sending notifications to the driver in the case
>> of replace is that a replace command will refresh the entries timeout if
>> the entry is the same. Any thoughts on this?
>
> I don't think so. It always updates its "used" timer, not its "updated" timer which is the one
> for expire. A replace that doesn't actually change anything on the entry shouldn't generate
> a notification.

Okay, so then there is missing checks on flags as the issue arose from
replacing locked entries with dynamic entries. I will do another fix
based on flags as modified needs to be true for the driver to get notified.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-06-02 10:17                   ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-06-02 10:17 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz, Ido Schimmel
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, bridge, linux-kernel, Vivien Didelot,
	Ido Schimmel, Eric Dumazet, Paolo Abeni, linux-kselftest,
	Roopa Prabhu, kuba, Vladimir Oltean, Shuah Khan, davem

On tor, jun 02, 2022 at 12:33, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> On 02/06/2022 12:17, Hans Schultz wrote:
>> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
>>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:

>> Another issue is that
>> bridge fdb add MAC dev DEV master static
>> seems to add the entry with the SELF flag set, which I don't think is
>> what we would want it to do or?
>
> I don't see such thing (hacked iproute2 to print the flags before cmd):
> $ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
> flags 0x4
>
> 0x4 = NTF_MASTER only
>

I also get 0x4 from iproute2, but I still get SELF entries when I look
with:
bridge fdb show dev DEV

>> Also the replace command is not really supported properly as it is. I
>> have made a fix for that which looks something like this:
>> 
>> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
>> index 6cbb27e3b976..f43aa204f375 100644
>> --- a/net/bridge/br_fdb.c
>> +++ b/net/bridge/br_fdb.c
>> @@ -917,6 +917,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
>>                 if (flags & NLM_F_EXCL)
>>                         return -EEXIST;
>>  
>> +               if (flags & NLM_F_REPLACE)
>> +                       modified = true;
>> +
>>                 if (READ_ONCE(fdb->dst) != source) {
>>                         WRITE_ONCE(fdb->dst, source);
>>                         modified = true;
>> 
>> The argument for always sending notifications to the driver in the case
>> of replace is that a replace command will refresh the entries timeout if
>> the entry is the same. Any thoughts on this?
>
> I don't think so. It always updates its "used" timer, not its "updated" timer which is the one
> for expire. A replace that doesn't actually change anything on the entry shouldn't generate
> a notification.

Okay, so then there is missing checks on flags as the issue arose from
replacing locked entries with dynamic entries. I will do another fix
based on flags as modified needs to be true for the driver to get notified.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-06-02 10:17                   ` [Bridge] " Hans Schultz
@ 2022-06-02 10:30                     ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-06-02 10:30 UTC (permalink / raw)
  To: Hans Schultz, Ido Schimmel
  Cc: Ido Schimmel, davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Shuah Khan,
	Daniel Borkmann, linux-kernel, bridge, linux-kselftest

On 02/06/2022 13:17, Hans Schultz wrote:
> On tor, jun 02, 2022 at 12:33, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>> On 02/06/2022 12:17, Hans Schultz wrote:
>>> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
>>>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
> 
>>> Another issue is that
>>> bridge fdb add MAC dev DEV master static
>>> seems to add the entry with the SELF flag set, which I don't think is
>>> what we would want it to do or?
>>
>> I don't see such thing (hacked iproute2 to print the flags before cmd):
>> $ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
>> flags 0x4
>>
>> 0x4 = NTF_MASTER only
>>
> 
> I also get 0x4 from iproute2, but I still get SELF entries when I look
> with:
> bridge fdb show dev DEV
> 

after the above add:
$ bridge fdb show dev vnet110 | grep 00:11
00:11:22:33:44:55 master virbr0 static

>>> Also the replace command is not really supported properly as it is. I
>>> have made a fix for that which looks something like this:
>>>
>>> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
>>> index 6cbb27e3b976..f43aa204f375 100644
>>> --- a/net/bridge/br_fdb.c
>>> +++ b/net/bridge/br_fdb.c
>>> @@ -917,6 +917,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
>>>                 if (flags & NLM_F_EXCL)
>>>                         return -EEXIST;
>>>  
>>> +               if (flags & NLM_F_REPLACE)
>>> +                       modified = true;
>>> +
>>>                 if (READ_ONCE(fdb->dst) != source) {
>>>                         WRITE_ONCE(fdb->dst, source);
>>>                         modified = true;
>>>
>>> The argument for always sending notifications to the driver in the case
>>> of replace is that a replace command will refresh the entries timeout if
>>> the entry is the same. Any thoughts on this?
>>
>> I don't think so. It always updates its "used" timer, not its "updated" timer which is the one
>> for expire. A replace that doesn't actually change anything on the entry shouldn't generate
>> a notification.
> 
> Okay, so then there is missing checks on flags as the issue arose from
> replacing locked entries with dynamic entries. I will do another fix
> based on flags as modified needs to be true for the driver to get notified.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-06-02 10:30                     ` Nikolay Aleksandrov
  0 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-06-02 10:30 UTC (permalink / raw)
  To: Hans Schultz, Ido Schimmel
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, bridge, linux-kernel, Vivien Didelot,
	Ido Schimmel, Eric Dumazet, Paolo Abeni, linux-kselftest,
	Roopa Prabhu, kuba, Vladimir Oltean, Shuah Khan, davem

On 02/06/2022 13:17, Hans Schultz wrote:
> On tor, jun 02, 2022 at 12:33, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>> On 02/06/2022 12:17, Hans Schultz wrote:
>>> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
>>>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
> 
>>> Another issue is that
>>> bridge fdb add MAC dev DEV master static
>>> seems to add the entry with the SELF flag set, which I don't think is
>>> what we would want it to do or?
>>
>> I don't see such thing (hacked iproute2 to print the flags before cmd):
>> $ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
>> flags 0x4
>>
>> 0x4 = NTF_MASTER only
>>
> 
> I also get 0x4 from iproute2, but I still get SELF entries when I look
> with:
> bridge fdb show dev DEV
> 

after the above add:
$ bridge fdb show dev vnet110 | grep 00:11
00:11:22:33:44:55 master virbr0 static

>>> Also the replace command is not really supported properly as it is. I
>>> have made a fix for that which looks something like this:
>>>
>>> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
>>> index 6cbb27e3b976..f43aa204f375 100644
>>> --- a/net/bridge/br_fdb.c
>>> +++ b/net/bridge/br_fdb.c
>>> @@ -917,6 +917,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
>>>                 if (flags & NLM_F_EXCL)
>>>                         return -EEXIST;
>>>  
>>> +               if (flags & NLM_F_REPLACE)
>>> +                       modified = true;
>>> +
>>>                 if (READ_ONCE(fdb->dst) != source) {
>>>                         WRITE_ONCE(fdb->dst, source);
>>>                         modified = true;
>>>
>>> The argument for always sending notifications to the driver in the case
>>> of replace is that a replace command will refresh the entries timeout if
>>> the entry is the same. Any thoughts on this?
>>
>> I don't think so. It always updates its "used" timer, not its "updated" timer which is the one
>> for expire. A replace that doesn't actually change anything on the entry shouldn't generate
>> a notification.
> 
> Okay, so then there is missing checks on flags as the issue arose from
> replacing locked entries with dynamic entries. I will do another fix
> based on flags as modified needs to be true for the driver to get notified.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-06-02 10:30                     ` [Bridge] " Nikolay Aleksandrov
@ 2022-06-02 10:39                       ` Ido Schimmel
  -1 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-06-02 10:39 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Hans Schultz, Ido Schimmel, davem, kuba, netdev, Andrew Lunn,
	Vivien Didelot, Florian Fainelli, Vladimir Oltean, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu, Shuah Khan,
	Daniel Borkmann, linux-kernel, bridge, linux-kselftest

On Thu, Jun 02, 2022 at 01:30:06PM +0300, Nikolay Aleksandrov wrote:
> On 02/06/2022 13:17, Hans Schultz wrote:
> > On tor, jun 02, 2022 at 12:33, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> >> On 02/06/2022 12:17, Hans Schultz wrote:
> >>> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
> >>>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
> > 
> >>> Another issue is that
> >>> bridge fdb add MAC dev DEV master static
> >>> seems to add the entry with the SELF flag set, which I don't think is
> >>> what we would want it to do or?
> >>
> >> I don't see such thing (hacked iproute2 to print the flags before cmd):
> >> $ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
> >> flags 0x4
> >>
> >> 0x4 = NTF_MASTER only
> >>
> > 
> > I also get 0x4 from iproute2, but I still get SELF entries when I look
> > with:
> > bridge fdb show dev DEV
> > 
> 
> after the above add:
> $ bridge fdb show dev vnet110 | grep 00:11
> 00:11:22:33:44:55 master virbr0 static

I think Hans is testing with mv88e6xxx which dumps entries directly from
HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
NTF_SELF.

Hans, are you seeing the entry twice? Once with 'master' and once with
'self'?

> 
> >>> Also the replace command is not really supported properly as it is. I
> >>> have made a fix for that which looks something like this:
> >>>
> >>> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> >>> index 6cbb27e3b976..f43aa204f375 100644
> >>> --- a/net/bridge/br_fdb.c
> >>> +++ b/net/bridge/br_fdb.c
> >>> @@ -917,6 +917,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
> >>>                 if (flags & NLM_F_EXCL)
> >>>                         return -EEXIST;
> >>>  
> >>> +               if (flags & NLM_F_REPLACE)
> >>> +                       modified = true;
> >>> +
> >>>                 if (READ_ONCE(fdb->dst) != source) {
> >>>                         WRITE_ONCE(fdb->dst, source);
> >>>                         modified = true;
> >>>
> >>> The argument for always sending notifications to the driver in the case
> >>> of replace is that a replace command will refresh the entries timeout if
> >>> the entry is the same. Any thoughts on this?
> >>
> >> I don't think so. It always updates its "used" timer, not its "updated" timer which is the one
> >> for expire. A replace that doesn't actually change anything on the entry shouldn't generate
> >> a notification.
> > 
> > Okay, so then there is missing checks on flags as the issue arose from
> > replacing locked entries with dynamic entries. I will do another fix
> > based on flags as modified needs to be true for the driver to get notified.
> 

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-06-02 10:39                       ` Ido Schimmel
  0 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-06-02 10:39 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Hans Schultz, bridge, linux-kernel,
	Vivien Didelot, Ido Schimmel, Eric Dumazet, Paolo Abeni,
	linux-kselftest, Roopa Prabhu, kuba, Vladimir Oltean, Shuah Khan,
	davem

On Thu, Jun 02, 2022 at 01:30:06PM +0300, Nikolay Aleksandrov wrote:
> On 02/06/2022 13:17, Hans Schultz wrote:
> > On tor, jun 02, 2022 at 12:33, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> >> On 02/06/2022 12:17, Hans Schultz wrote:
> >>> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
> >>>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
> > 
> >>> Another issue is that
> >>> bridge fdb add MAC dev DEV master static
> >>> seems to add the entry with the SELF flag set, which I don't think is
> >>> what we would want it to do or?
> >>
> >> I don't see such thing (hacked iproute2 to print the flags before cmd):
> >> $ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
> >> flags 0x4
> >>
> >> 0x4 = NTF_MASTER only
> >>
> > 
> > I also get 0x4 from iproute2, but I still get SELF entries when I look
> > with:
> > bridge fdb show dev DEV
> > 
> 
> after the above add:
> $ bridge fdb show dev vnet110 | grep 00:11
> 00:11:22:33:44:55 master virbr0 static

I think Hans is testing with mv88e6xxx which dumps entries directly from
HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
NTF_SELF.

Hans, are you seeing the entry twice? Once with 'master' and once with
'self'?

> 
> >>> Also the replace command is not really supported properly as it is. I
> >>> have made a fix for that which looks something like this:
> >>>
> >>> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> >>> index 6cbb27e3b976..f43aa204f375 100644
> >>> --- a/net/bridge/br_fdb.c
> >>> +++ b/net/bridge/br_fdb.c
> >>> @@ -917,6 +917,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
> >>>                 if (flags & NLM_F_EXCL)
> >>>                         return -EEXIST;
> >>>  
> >>> +               if (flags & NLM_F_REPLACE)
> >>> +                       modified = true;
> >>> +
> >>>                 if (READ_ONCE(fdb->dst) != source) {
> >>>                         WRITE_ONCE(fdb->dst, source);
> >>>                         modified = true;
> >>>
> >>> The argument for always sending notifications to the driver in the case
> >>> of replace is that a replace command will refresh the entries timeout if
> >>> the entry is the same. Any thoughts on this?
> >>
> >> I don't think so. It always updates its "used" timer, not its "updated" timer which is the one
> >> for expire. A replace that doesn't actually change anything on the entry shouldn't generate
> >> a notification.
> > 
> > Okay, so then there is missing checks on flags as the issue arose from
> > replacing locked entries with dynamic entries. I will do another fix
> > based on flags as modified needs to be true for the driver to get notified.
> 

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-06-02 10:39                       ` [Bridge] " Ido Schimmel
@ 2022-06-02 11:36                         ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-06-02 11:36 UTC (permalink / raw)
  To: Ido Schimmel, Nikolay Aleksandrov
  Cc: Hans Schultz, Ido Schimmel, davem, kuba, netdev, Andrew Lunn,
	Vivien Didelot, Florian Fainelli, Vladimir Oltean, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu, Shuah Khan,
	Daniel Borkmann, linux-kernel, bridge, linux-kselftest

On tor, jun 02, 2022 at 13:39, Ido Schimmel <idosch@nvidia.com> wrote:
> On Thu, Jun 02, 2022 at 01:30:06PM +0300, Nikolay Aleksandrov wrote:
>> On 02/06/2022 13:17, Hans Schultz wrote:
>> > On tor, jun 02, 2022 at 12:33, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>> >> On 02/06/2022 12:17, Hans Schultz wrote:
>> >>> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
>> >>>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
>> > 
>> >>> Another issue is that
>> >>> bridge fdb add MAC dev DEV master static
>> >>> seems to add the entry with the SELF flag set, which I don't think is
>> >>> what we would want it to do or?
>> >>
>> >> I don't see such thing (hacked iproute2 to print the flags before cmd):
>> >> $ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
>> >> flags 0x4
>> >>
>> >> 0x4 = NTF_MASTER only
>> >>
>> > 
>> > I also get 0x4 from iproute2, but I still get SELF entries when I look
>> > with:
>> > bridge fdb show dev DEV
>> > 
>> 
>> after the above add:
>> $ bridge fdb show dev vnet110 | grep 00:11
>> 00:11:22:33:44:55 master virbr0 static

>
> I think Hans is testing with mv88e6xxx which dumps entries directly from
> HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
> NTF_SELF.
>
> Hans, are you seeing the entry twice? Once with 'master' and once with
> 'self'?
>

Well yes, but I get some additional entries with 'self' for different
vlans. So from clean adding a random fdb entry I get 4 entries on the
port, 2 with 'master' and two with 'self'.
It looks like this:

# bridge fdb add  00:22:33:44:55:66 dev eth6 master static
# bridge fdb show dev eth6 | grep 55
00:22:33:44:55:66 vlan 1 master br0 offload static
00:22:33:44:55:66 master br0 offload static
00:22:33:44:55:66 vlan 1 self static
00:22:33:44:55:66 vlan 4095 self static

If I do a replace of a locked entry I only get one with the 'self' flag.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-06-02 11:36                         ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-06-02 11:36 UTC (permalink / raw)
  To: Ido Schimmel, Nikolay Aleksandrov
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Hans Schultz, bridge, linux-kernel,
	Vivien Didelot, Ido Schimmel, Eric Dumazet, Paolo Abeni,
	linux-kselftest, Roopa Prabhu, kuba, Vladimir Oltean, Shuah Khan,
	davem

On tor, jun 02, 2022 at 13:39, Ido Schimmel <idosch@nvidia.com> wrote:
> On Thu, Jun 02, 2022 at 01:30:06PM +0300, Nikolay Aleksandrov wrote:
>> On 02/06/2022 13:17, Hans Schultz wrote:
>> > On tor, jun 02, 2022 at 12:33, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>> >> On 02/06/2022 12:17, Hans Schultz wrote:
>> >>> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
>> >>>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
>> > 
>> >>> Another issue is that
>> >>> bridge fdb add MAC dev DEV master static
>> >>> seems to add the entry with the SELF flag set, which I don't think is
>> >>> what we would want it to do or?
>> >>
>> >> I don't see such thing (hacked iproute2 to print the flags before cmd):
>> >> $ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
>> >> flags 0x4
>> >>
>> >> 0x4 = NTF_MASTER only
>> >>
>> > 
>> > I also get 0x4 from iproute2, but I still get SELF entries when I look
>> > with:
>> > bridge fdb show dev DEV
>> > 
>> 
>> after the above add:
>> $ bridge fdb show dev vnet110 | grep 00:11
>> 00:11:22:33:44:55 master virbr0 static

>
> I think Hans is testing with mv88e6xxx which dumps entries directly from
> HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
> NTF_SELF.
>
> Hans, are you seeing the entry twice? Once with 'master' and once with
> 'self'?
>

Well yes, but I get some additional entries with 'self' for different
vlans. So from clean adding a random fdb entry I get 4 entries on the
port, 2 with 'master' and two with 'self'.
It looks like this:

# bridge fdb add  00:22:33:44:55:66 dev eth6 master static
# bridge fdb show dev eth6 | grep 55
00:22:33:44:55:66 vlan 1 master br0 offload static
00:22:33:44:55:66 master br0 offload static
00:22:33:44:55:66 vlan 1 self static
00:22:33:44:55:66 vlan 4095 self static

If I do a replace of a locked entry I only get one with the 'self' flag.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-06-02 11:36                         ` [Bridge] " Hans Schultz
@ 2022-06-02 11:55                           ` Ido Schimmel
  -1 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-06-02 11:55 UTC (permalink / raw)
  To: Hans Schultz
  Cc: Nikolay Aleksandrov, Ido Schimmel, davem, kuba, netdev,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	Eric Dumazet, Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Shuah Khan, Daniel Borkmann, linux-kernel, bridge,
	linux-kselftest

On Thu, Jun 02, 2022 at 01:36:56PM +0200, Hans Schultz wrote:
> On tor, jun 02, 2022 at 13:39, Ido Schimmel <idosch@nvidia.com> wrote:
> > On Thu, Jun 02, 2022 at 01:30:06PM +0300, Nikolay Aleksandrov wrote:
> >> On 02/06/2022 13:17, Hans Schultz wrote:
> >> > On tor, jun 02, 2022 at 12:33, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> >> >> On 02/06/2022 12:17, Hans Schultz wrote:
> >> >>> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
> >> >>>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
> >> > 
> >> >>> Another issue is that
> >> >>> bridge fdb add MAC dev DEV master static
> >> >>> seems to add the entry with the SELF flag set, which I don't think is
> >> >>> what we would want it to do or?
> >> >>
> >> >> I don't see such thing (hacked iproute2 to print the flags before cmd):
> >> >> $ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
> >> >> flags 0x4
> >> >>
> >> >> 0x4 = NTF_MASTER only
> >> >>
> >> > 
> >> > I also get 0x4 from iproute2, but I still get SELF entries when I look
> >> > with:
> >> > bridge fdb show dev DEV
> >> > 
> >> 
> >> after the above add:
> >> $ bridge fdb show dev vnet110 | grep 00:11
> >> 00:11:22:33:44:55 master virbr0 static
> 
> >
> > I think Hans is testing with mv88e6xxx which dumps entries directly from
> > HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
> > NTF_SELF.
> >
> > Hans, are you seeing the entry twice? Once with 'master' and once with
> > 'self'?
> >
> 
> Well yes, but I get some additional entries with 'self' for different
> vlans. So from clean adding a random fdb entry I get 4 entries on the
> port, 2 with 'master' and two with 'self'.
> It looks like this:
> 
> # bridge fdb add  00:22:33:44:55:66 dev eth6 master static
> # bridge fdb show dev eth6 | grep 55
> 00:22:33:44:55:66 vlan 1 master br0 offload static
> 00:22:33:44:55:66 master br0 offload static

These two entries are added by the bridge driver ('master' is set). You
get two entries because you didn't specify a VLAN, so one entry is
installed with VLAN 0 (no VLAN) and the second is installed because VLAN
1 is configured on eth6.

> 00:22:33:44:55:66 vlan 1 self static

This entry is from the HW. It corresponds to the first entry above.

> 00:22:33:44:55:66 vlan 4095 self static

I assume you are using VLAN 4095 for untagged traffic, so this entry
probably corresponds to the second entry above.

> 
> If I do a replace of a locked entry I only get one with the 'self' flag.

IIUC, your driver is adding the entry to the bridge and with a specific
VLAN. So you have one entry reported by the bridge driver and a
corresponding entry in HW.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-06-02 11:55                           ` Ido Schimmel
  0 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-06-02 11:55 UTC (permalink / raw)
  To: Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Vivien Didelot, Ido Schimmel, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem

On Thu, Jun 02, 2022 at 01:36:56PM +0200, Hans Schultz wrote:
> On tor, jun 02, 2022 at 13:39, Ido Schimmel <idosch@nvidia.com> wrote:
> > On Thu, Jun 02, 2022 at 01:30:06PM +0300, Nikolay Aleksandrov wrote:
> >> On 02/06/2022 13:17, Hans Schultz wrote:
> >> > On tor, jun 02, 2022 at 12:33, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> >> >> On 02/06/2022 12:17, Hans Schultz wrote:
> >> >>> On tis, maj 31, 2022 at 17:23, Ido Schimmel <idosch@nvidia.com> wrote:
> >> >>>> On Tue, May 31, 2022 at 11:34:21AM +0200, Hans Schultz wrote:
> >> > 
> >> >>> Another issue is that
> >> >>> bridge fdb add MAC dev DEV master static
> >> >>> seems to add the entry with the SELF flag set, which I don't think is
> >> >>> what we would want it to do or?
> >> >>
> >> >> I don't see such thing (hacked iproute2 to print the flags before cmd):
> >> >> $ bridge fdb add 00:11:22:33:44:55 dev vnet110 master static
> >> >> flags 0x4
> >> >>
> >> >> 0x4 = NTF_MASTER only
> >> >>
> >> > 
> >> > I also get 0x4 from iproute2, but I still get SELF entries when I look
> >> > with:
> >> > bridge fdb show dev DEV
> >> > 
> >> 
> >> after the above add:
> >> $ bridge fdb show dev vnet110 | grep 00:11
> >> 00:11:22:33:44:55 master virbr0 static
> 
> >
> > I think Hans is testing with mv88e6xxx which dumps entries directly from
> > HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
> > NTF_SELF.
> >
> > Hans, are you seeing the entry twice? Once with 'master' and once with
> > 'self'?
> >
> 
> Well yes, but I get some additional entries with 'self' for different
> vlans. So from clean adding a random fdb entry I get 4 entries on the
> port, 2 with 'master' and two with 'self'.
> It looks like this:
> 
> # bridge fdb add  00:22:33:44:55:66 dev eth6 master static
> # bridge fdb show dev eth6 | grep 55
> 00:22:33:44:55:66 vlan 1 master br0 offload static
> 00:22:33:44:55:66 master br0 offload static

These two entries are added by the bridge driver ('master' is set). You
get two entries because you didn't specify a VLAN, so one entry is
installed with VLAN 0 (no VLAN) and the second is installed because VLAN
1 is configured on eth6.

> 00:22:33:44:55:66 vlan 1 self static

This entry is from the HW. It corresponds to the first entry above.

> 00:22:33:44:55:66 vlan 4095 self static

I assume you are using VLAN 4095 for untagged traffic, so this entry
probably corresponds to the second entry above.

> 
> If I do a replace of a locked entry I only get one with the 'self' flag.

IIUC, your driver is adding the entry to the bridge and with a specific
VLAN. So you have one entry reported by the bridge driver and a
corresponding entry in HW.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-06-02 10:39                       ` [Bridge] " Ido Schimmel
@ 2022-06-02 12:08                         ` Hans Schultz
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-06-02 12:08 UTC (permalink / raw)
  To: Ido Schimmel, Nikolay Aleksandrov
  Cc: Hans Schultz, Ido Schimmel, davem, kuba, netdev, Andrew Lunn,
	Vivien Didelot, Florian Fainelli, Vladimir Oltean, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu, Shuah Khan,
	Daniel Borkmann, linux-kernel, bridge, linux-kselftest

>
> I think Hans is testing with mv88e6xxx which dumps entries directly from
> HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
> NTF_SELF.
>
> Hans, are you seeing the entry twice? Once with 'master' and once with
> 'self'?
>

When replacing a locked entry it looks like this:

# bridge fdb show dev eth6 | grep 4c
00:4c:4c:4c:4c:4c vlan 1 master br0 extern_learn offload locked

# bridge fdb replace 00:4c:4c:4c:4c:4c dev eth6 vlan 1 master static ; bridge fdb show dev eth6 | grep 4c
00:4c:4c:4c:4c:4c vlan 1 self static

The problem is then that the function
br_fdb_find_rcu(br,eth_hdr(skb)->h_source, vid);
, where the h_source and vid is the entry above, does not find the entry.
My hypothesis was then that this is because of the 'self' flag that I
see.

I am thinking that the function dsa_slave_port_fdb_do_dump() is only for
debug, and thus does not really set any flags in the bridge modules FDB,
but then I don't understand why the above find function does not find
the entry?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-06-02 12:08                         ` Hans Schultz
  0 siblings, 0 replies; 108+ messages in thread
From: Hans Schultz @ 2022-06-02 12:08 UTC (permalink / raw)
  To: Ido Schimmel, Nikolay Aleksandrov
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Hans Schultz, bridge, linux-kernel,
	Vivien Didelot, Ido Schimmel, Eric Dumazet, Paolo Abeni,
	linux-kselftest, Roopa Prabhu, kuba, Vladimir Oltean, Shuah Khan,
	davem

>
> I think Hans is testing with mv88e6xxx which dumps entries directly from
> HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
> NTF_SELF.
>
> Hans, are you seeing the entry twice? Once with 'master' and once with
> 'self'?
>

When replacing a locked entry it looks like this:

# bridge fdb show dev eth6 | grep 4c
00:4c:4c:4c:4c:4c vlan 1 master br0 extern_learn offload locked

# bridge fdb replace 00:4c:4c:4c:4c:4c dev eth6 vlan 1 master static ; bridge fdb show dev eth6 | grep 4c
00:4c:4c:4c:4c:4c vlan 1 self static

The problem is then that the function
br_fdb_find_rcu(br,eth_hdr(skb)->h_source, vid);
, where the h_source and vid is the entry above, does not find the entry.
My hypothesis was then that this is because of the 'self' flag that I
see.

I am thinking that the function dsa_slave_port_fdb_do_dump() is only for
debug, and thus does not really set any flags in the bridge modules FDB,
but then I don't understand why the above find function does not find
the entry?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-06-02 12:08                         ` [Bridge] " Hans Schultz
@ 2022-06-02 12:18                           ` Ido Schimmel
  -1 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-06-02 12:18 UTC (permalink / raw)
  To: Hans Schultz
  Cc: Nikolay Aleksandrov, Ido Schimmel, davem, kuba, netdev,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	Eric Dumazet, Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Shuah Khan, Daniel Borkmann, linux-kernel, bridge,
	linux-kselftest

On Thu, Jun 02, 2022 at 02:08:41PM +0200, Hans Schultz wrote:
> >
> > I think Hans is testing with mv88e6xxx which dumps entries directly from
> > HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
> > NTF_SELF.
> >
> > Hans, are you seeing the entry twice? Once with 'master' and once with
> > 'self'?
> >
> 
> When replacing a locked entry it looks like this:
> 
> # bridge fdb show dev eth6 | grep 4c
> 00:4c:4c:4c:4c:4c vlan 1 master br0 extern_learn offload locked
> 
> # bridge fdb replace 00:4c:4c:4c:4c:4c dev eth6 vlan 1 master static ; bridge fdb show dev eth6 | grep 4c
> 00:4c:4c:4c:4c:4c vlan 1 self static

This output means that the FDB entry was deleted from the bridge driver
FDB.

> 
> The problem is then that the function
> br_fdb_find_rcu(br,eth_hdr(skb)->h_source, vid);
> , where the h_source and vid is the entry above, does not find the entry.
> My hypothesis was then that this is because of the 'self' flag that I
> see.

br_fdb_find_rcu() does a lookup in the bridge driver FDB, but per the
output above, the entry isn't there for some reason. It's only in HW.

Can it be that you driver is deleting these entries from the bridge
driver FDB via SWITCHDEV_FDB_DEL_TO_BRIDGE for some reason?

> 
> I am thinking that the function dsa_slave_port_fdb_do_dump() is only for
> debug, and thus does not really set any flags in the bridge modules FDB,
> but then I don't understand why the above find function does not find
> the entry?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-06-02 12:18                           ` Ido Schimmel
  0 siblings, 0 replies; 108+ messages in thread
From: Ido Schimmel @ 2022-06-02 12:18 UTC (permalink / raw)
  To: Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Vivien Didelot, Ido Schimmel, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, kuba,
	Vladimir Oltean, Shuah Khan, davem

On Thu, Jun 02, 2022 at 02:08:41PM +0200, Hans Schultz wrote:
> >
> > I think Hans is testing with mv88e6xxx which dumps entries directly from
> > HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
> > NTF_SELF.
> >
> > Hans, are you seeing the entry twice? Once with 'master' and once with
> > 'self'?
> >
> 
> When replacing a locked entry it looks like this:
> 
> # bridge fdb show dev eth6 | grep 4c
> 00:4c:4c:4c:4c:4c vlan 1 master br0 extern_learn offload locked
> 
> # bridge fdb replace 00:4c:4c:4c:4c:4c dev eth6 vlan 1 master static ; bridge fdb show dev eth6 | grep 4c
> 00:4c:4c:4c:4c:4c vlan 1 self static

This output means that the FDB entry was deleted from the bridge driver
FDB.

> 
> The problem is then that the function
> br_fdb_find_rcu(br,eth_hdr(skb)->h_source, vid);
> , where the h_source and vid is the entry above, does not find the entry.
> My hypothesis was then that this is because of the 'self' flag that I
> see.

br_fdb_find_rcu() does a lookup in the bridge driver FDB, but per the
output above, the entry isn't there for some reason. It's only in HW.

Can it be that you driver is deleting these entries from the bridge
driver FDB via SWITCHDEV_FDB_DEL_TO_BRIDGE for some reason?

> 
> I am thinking that the function dsa_slave_port_fdb_do_dump() is only for
> debug, and thus does not really set any flags in the bridge modules FDB,
> but then I don't understand why the above find function does not find
> the entry?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-06-02 12:18                           ` [Bridge] " Ido Schimmel
  (?)
@ 2022-06-02 12:53                           ` Hans S
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-06-02 12:53 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Vivien Didelot, Ido Schimmel, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, Jakub Kicinski,
	Vladimir Oltean, Shuah Khan, David S. Miller

[-- Attachment #1: Type: text/plain, Size: 2574 bytes --]

On tor, jun 02, 2022 at 15:18, Ido Schimmel <idosch@nvidia.com> wrote:
> On Thu, Jun 02, 2022 at 02:08:41PM +0200, Hans Schultz wrote:
>
> br_fdb_find_rcu() does a lookup in the bridge driver FDB, but per the
> output above, the entry isn't there for some reason. It's only in HW.
>
> Can it be that you driver is deleting these entries from the bridge
> driver FDB via SWITCHDEV_FDB_DEL_TO_BRIDGE for some reason?
>

Yes, that sounds much like the case. So the replace of course just
modifies the SW fdb entry, and then it obviously removes the HW entry,
which triggers SWITCHDEV_FDB_DEL_TO_BRIDGE and then adds the replaced,
but because it comes from the SW bridge it doesn't need to do more.
But as there are only ops for port_fdb_add() and port_fdb_del(), there
is no way to know how to handle a replace. So it ends up just deleting
the SW fdb entry after all...

It is difficult for me to see how this scenario will be able to work
differently, or?

On Thu, Jun 2, 2022 at 2:18 PM Ido Schimmel <idosch@nvidia.com> wrote:

> On Thu, Jun 02, 2022 at 02:08:41PM +0200, Hans Schultz wrote:
> > >
> > > I think Hans is testing with mv88e6xxx which dumps entries directly
> from
> > > HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
> > > NTF_SELF.
> > >
> > > Hans, are you seeing the entry twice? Once with 'master' and once with
> > > 'self'?
> > >
> >
> > When replacing a locked entry it looks like this:
> >
> > # bridge fdb show dev eth6 | grep 4c
> > 00:4c:4c:4c:4c:4c vlan 1 master br0 extern_learn offload locked
> >
> > # bridge fdb replace 00:4c:4c:4c:4c:4c dev eth6 vlan 1 master static ;
> bridge fdb show dev eth6 | grep 4c
> > 00:4c:4c:4c:4c:4c vlan 1 self static
>
> This output means that the FDB entry was deleted from the bridge driver
> FDB.
>
> >
> > The problem is then that the function
> > br_fdb_find_rcu(br,eth_hdr(skb)->h_source, vid);
> > , where the h_source and vid is the entry above, does not find the entry.
> > My hypothesis was then that this is because of the 'self' flag that I
> > see.
>
> br_fdb_find_rcu() does a lookup in the bridge driver FDB, but per the
> output above, the entry isn't there for some reason. It's only in HW.
>
> Can it be that you driver is deleting these entries from the bridge
> driver FDB via SWITCHDEV_FDB_DEL_TO_BRIDGE for some reason?
>
> >
> > I am thinking that the function dsa_slave_port_fdb_do_dump() is only for
> > debug, and thus does not really set any flags in the bridge modules FDB,
> > but then I don't understand why the above find function does not find
> > the entry?
>

[-- Attachment #2: Type: text/html, Size: 3258 bytes --]

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-06-02 12:18                           ` [Bridge] " Ido Schimmel
@ 2022-06-02 13:27                             ` Hans S
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-06-02 13:27 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Nikolay Aleksandrov, Ido Schimmel, David S. Miller,
	Jakub Kicinski, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Shuah Khan,
	Daniel Borkmann, linux-kernel, bridge, linux-kselftest

Yes, that sounds much like the case. So the replace of course just
modifies the SW fdb entry, and then it just uses port_fdb_add() to
replace HW entry I assume, which then in my case triggers
SWITCHDEV_FDB_DEL_TO_BRIDGE as the locked entry is removed.
So I should not send the SWITCHDEV_FDB_DEL_TO_BRIDGE message when
removing the locked entry from port_fdb_add() function...

(note: having problems with smtp.gmail.com...)


On Thu, Jun 2, 2022 at 2:18 PM Ido Schimmel <idosch@nvidia.com> wrote:
>
> On Thu, Jun 02, 2022 at 02:08:41PM +0200, Hans Schultz wrote:
> > >
> > > I think Hans is testing with mv88e6xxx which dumps entries directly from
> > > HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
> > > NTF_SELF.
> > >
> > > Hans, are you seeing the entry twice? Once with 'master' and once with
> > > 'self'?
> > >
> >
> > When replacing a locked entry it looks like this:
> >
> > # bridge fdb show dev eth6 | grep 4c
> > 00:4c:4c:4c:4c:4c vlan 1 master br0 extern_learn offload locked
> >
> > # bridge fdb replace 00:4c:4c:4c:4c:4c dev eth6 vlan 1 master static ; bridge fdb show dev eth6 | grep 4c
> > 00:4c:4c:4c:4c:4c vlan 1 self static
>
> This output means that the FDB entry was deleted from the bridge driver
> FDB.
>
> >
> > The problem is then that the function
> > br_fdb_find_rcu(br,eth_hdr(skb)->h_source, vid);
> > , where the h_source and vid is the entry above, does not find the entry.
> > My hypothesis was then that this is because of the 'self' flag that I
> > see.
>
> br_fdb_find_rcu() does a lookup in the bridge driver FDB, but per the
> output above, the entry isn't there for some reason. It's only in HW.
>
> Can it be that you driver is deleting these entries from the bridge
> driver FDB via SWITCHDEV_FDB_DEL_TO_BRIDGE for some reason?
>
> >
> > I am thinking that the function dsa_slave_port_fdb_do_dump() is only for
> > debug, and thus does not really set any flags in the bridge modules FDB,
> > but then I don't understand why the above find function does not find
> > the entry?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-06-02 13:27                             ` Hans S
  0 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-06-02 13:27 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	linux-kernel, Vivien Didelot, Ido Schimmel, Eric Dumazet,
	Paolo Abeni, linux-kselftest, Roopa Prabhu, Jakub Kicinski,
	Vladimir Oltean, Shuah Khan, David S. Miller

Yes, that sounds much like the case. So the replace of course just
modifies the SW fdb entry, and then it just uses port_fdb_add() to
replace HW entry I assume, which then in my case triggers
SWITCHDEV_FDB_DEL_TO_BRIDGE as the locked entry is removed.
So I should not send the SWITCHDEV_FDB_DEL_TO_BRIDGE message when
removing the locked entry from port_fdb_add() function...

(note: having problems with smtp.gmail.com...)


On Thu, Jun 2, 2022 at 2:18 PM Ido Schimmel <idosch@nvidia.com> wrote:
>
> On Thu, Jun 02, 2022 at 02:08:41PM +0200, Hans Schultz wrote:
> > >
> > > I think Hans is testing with mv88e6xxx which dumps entries directly from
> > > HW via ndo_fdb_dump(). See dsa_slave_port_fdb_do_dump() which sets
> > > NTF_SELF.
> > >
> > > Hans, are you seeing the entry twice? Once with 'master' and once with
> > > 'self'?
> > >
> >
> > When replacing a locked entry it looks like this:
> >
> > # bridge fdb show dev eth6 | grep 4c
> > 00:4c:4c:4c:4c:4c vlan 1 master br0 extern_learn offload locked
> >
> > # bridge fdb replace 00:4c:4c:4c:4c:4c dev eth6 vlan 1 master static ; bridge fdb show dev eth6 | grep 4c
> > 00:4c:4c:4c:4c:4c vlan 1 self static
>
> This output means that the FDB entry was deleted from the bridge driver
> FDB.
>
> >
> > The problem is then that the function
> > br_fdb_find_rcu(br,eth_hdr(skb)->h_source, vid);
> > , where the h_source and vid is the entry above, does not find the entry.
> > My hypothesis was then that this is because of the 'self' flag that I
> > see.
>
> br_fdb_find_rcu() does a lookup in the bridge driver FDB, but per the
> output above, the entry isn't there for some reason. It's only in HW.
>
> Can it be that you driver is deleting these entries from the bridge
> driver FDB via SWITCHDEV_FDB_DEL_TO_BRIDGE for some reason?
>
> >
> > I am thinking that the function dsa_slave_port_fdb_do_dump() is only for
> > debug, and thus does not really set any flags in the bridge modules FDB,
> > but then I don't understand why the above find function does not find
> > the entry?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-05-24 15:21   ` [Bridge] " Hans Schultz
@ 2022-06-27 12:58     ` Hans S
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-06-27 12:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Hans Schultz, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, Eric Dumazet, Paolo Abeni,
	Jiri Pirko, Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov,
	Shuah Khan, Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

Hi Vladimir,
maybe you have missed my upstreaming of this patch set...

According to our earlier discussions I have now implemented the
feature, so that the ATU locked entries are owned by the driver, so to
make the entries dynamic I add the entries to a list and use kernel
timers to age out the entries as they should be 'dynamic'. See
mv88e6xxx_switchdev.c.

Hans

On Tue, May 24, 2022 at 5:22 PM Hans Schultz <schultz.hans@gmail.com> wrote:
>
> This implementation for the Marvell mv88e6xxx chip series, is
> based on handling ATU miss violations occurring when packets
> ingress on a port that is locked. The mac address triggering
> the ATU miss violation is communicated through switchdev to
> the bridge module, which adds a fdb entry with the fdb locked
> flag set. The entry is kept according to the bridges ageing
> time, thus simulating a dynamic entry.
>
> Note: The locked port must have learning enabled for the ATU
> miss violation to occur.
>
> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> ---
>  drivers/net/dsa/mv88e6xxx/Makefile            |   1 +
>  drivers/net/dsa/mv88e6xxx/chip.c              |  40 ++-
>  drivers/net/dsa/mv88e6xxx/chip.h              |   5 +
>  drivers/net/dsa/mv88e6xxx/global1.h           |   1 +
>  drivers/net/dsa/mv88e6xxx/global1_atu.c       |  35 ++-
>  .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c   | 249 ++++++++++++++++++
>  .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h   |  40 +++
>  drivers/net/dsa/mv88e6xxx/port.c              |  32 ++-
>  drivers/net/dsa/mv88e6xxx/port.h              |   2 +
>  9 files changed, 389 insertions(+), 16 deletions(-)
>  create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
>  create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h
>

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-06-27 12:58     ` Hans S
  0 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-06-27 12:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, linux-kernel, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Paolo Abeni, Vladimir Oltean,
	Shuah Khan, Vivien Didelot

Hi Vladimir,
maybe you have missed my upstreaming of this patch set...

According to our earlier discussions I have now implemented the
feature, so that the ATU locked entries are owned by the driver, so to
make the entries dynamic I add the entries to a list and use kernel
timers to age out the entries as they should be 'dynamic'. See
mv88e6xxx_switchdev.c.

Hans

On Tue, May 24, 2022 at 5:22 PM Hans Schultz <schultz.hans@gmail.com> wrote:
>
> This implementation for the Marvell mv88e6xxx chip series, is
> based on handling ATU miss violations occurring when packets
> ingress on a port that is locked. The mac address triggering
> the ATU miss violation is communicated through switchdev to
> the bridge module, which adds a fdb entry with the fdb locked
> flag set. The entry is kept according to the bridges ageing
> time, thus simulating a dynamic entry.
>
> Note: The locked port must have learning enabled for the ATU
> miss violation to occur.
>
> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> ---
>  drivers/net/dsa/mv88e6xxx/Makefile            |   1 +
>  drivers/net/dsa/mv88e6xxx/chip.c              |  40 ++-
>  drivers/net/dsa/mv88e6xxx/chip.h              |   5 +
>  drivers/net/dsa/mv88e6xxx/global1.h           |   1 +
>  drivers/net/dsa/mv88e6xxx/global1_atu.c       |  35 ++-
>  .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c   | 249 ++++++++++++++++++
>  .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h   |  40 +++
>  drivers/net/dsa/mv88e6xxx/port.c              |  32 ++-
>  drivers/net/dsa/mv88e6xxx/port.h              |   2 +
>  9 files changed, 389 insertions(+), 16 deletions(-)
>  create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
>  create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h
>

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 2/4] net: switchdev: add support for offloading of fdb locked flag
  2022-05-24 15:21   ` [Bridge] " Hans Schultz
@ 2022-06-27 16:06     ` Vladimir Oltean
  -1 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-06-27 16:06 UTC (permalink / raw)
  To: Hans Schultz
  Cc: davem, kuba, netdev, Hans Schultz, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov, Shuah Khan,
	Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

On Tue, May 24, 2022 at 05:21:42PM +0200, Hans Schultz wrote:
> Used for Mac-auth/MAB feature in the offloaded case.
> 
> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> ---
>  include/net/dsa.h       | 6 ++++++
>  include/net/switchdev.h | 3 ++-
>  net/bridge/br.c         | 3 ++-
>  net/bridge/br_fdb.c     | 7 +++++--
>  net/bridge/br_private.h | 2 +-
>  5 files changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index 14f07275852b..a5a843b2d67d 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -330,6 +330,12 @@ struct dsa_port {
>  	/* List of VLANs that CPU and DSA ports are members of. */
>  	struct mutex		vlans_lock;
>  	struct list_head	vlans;
> +
> +	/* List and maintenance of locked ATU entries */
> +	struct mutex		locked_entries_list_lock;
> +	struct list_head	atu_locked_entries_list;
> +	atomic_t		atu_locked_entry_cnt;
> +	struct delayed_work	atu_work;

DSA is not Marvell only, so please remove these from struct dsa_port and
place them somewhere like struct mv88e6xxx_port. Also, the change has
nothing to do in a patch with the "net: switchdev: " prefix.

>  };
>  
>  /* TODO: ideally DSA ports would have a single dp->link_dp member,
> diff --git a/include/net/switchdev.h b/include/net/switchdev.h
> index aa0171d5786d..62f4f7c9c7c2 100644
> --- a/include/net/switchdev.h
> +++ b/include/net/switchdev.h
> @@ -245,7 +245,8 @@ struct switchdev_notifier_fdb_info {
>  	u16 vid;
>  	u8 added_by_user:1,
>  	   is_local:1,
> -	   offloaded:1;
> +	   offloaded:1,
> +	   locked:1;

As mentioned by Ido, please update br_switchdev_fdb_populate() as part
of this change, in the bridge->switchdev direction. We should add a
comment near struct switchdev_notifier_fdb_info stating just that,
so that people don't forget.

>  };
>  
>  struct switchdev_notifier_port_obj_info {
> diff --git a/net/bridge/br.c b/net/bridge/br.c
> index 96e91d69a9a8..12933388a5a4 100644
> --- a/net/bridge/br.c
> +++ b/net/bridge/br.c
> @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,
>  	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
>  		fdb_info = ptr;
>  		err = br_fdb_external_learn_add(br, p, fdb_info->addr,
> -						fdb_info->vid, false);
> +						fdb_info->vid, false,
> +						fdb_info->locked);
>  		if (err) {
>  			err = notifier_from_errno(err);
>  			break;
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index 6b83e2d6435d..92469547283a 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -1135,7 +1135,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
>  					   "FDB entry towards bridge must be permanent");
>  			return -EINVAL;
>  		}
> -		err = br_fdb_external_learn_add(br, p, addr, vid, true);
> +		err = br_fdb_external_learn_add(br, p, addr, vid, true, false);
>  	} else {
>  		spin_lock_bh(&br->hash_lock);
>  		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
> @@ -1365,7 +1365,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
>  
>  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>  			      const unsigned char *addr, u16 vid,
> -			      bool swdev_notify)
> +			      bool swdev_notify, bool locked)
>  {
>  	struct net_bridge_fdb_entry *fdb;
>  	bool modified = false;
> @@ -1385,6 +1385,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>  		if (!p)
>  			flags |= BIT(BR_FDB_LOCAL);
>  
> +		if (locked)
> +			flags |= BIT(BR_FDB_ENTRY_LOCKED);
> +
>  		fdb = fdb_create(br, p, addr, vid, flags);
>  		if (!fdb) {
>  			err = -ENOMEM;
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index be17c99efe65..88913e6a59e1 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -815,7 +815,7 @@ int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
>  void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
>  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>  			      const unsigned char *addr, u16 vid,
> -			      bool swdev_notify);
> +			      bool swdev_notify, bool locked);
>  int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
>  			      const unsigned char *addr, u16 vid,
>  			      bool swdev_notify);
> -- 
> 2.30.2
> 


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 2/4] net: switchdev: add support for offloading of fdb locked flag
@ 2022-06-27 16:06     ` Vladimir Oltean
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-06-27 16:06 UTC (permalink / raw)
  To: Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, kuba, Paolo Abeni, Shuah Khan,
	davem, linux-kernel

On Tue, May 24, 2022 at 05:21:42PM +0200, Hans Schultz wrote:
> Used for Mac-auth/MAB feature in the offloaded case.
> 
> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> ---
>  include/net/dsa.h       | 6 ++++++
>  include/net/switchdev.h | 3 ++-
>  net/bridge/br.c         | 3 ++-
>  net/bridge/br_fdb.c     | 7 +++++--
>  net/bridge/br_private.h | 2 +-
>  5 files changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index 14f07275852b..a5a843b2d67d 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -330,6 +330,12 @@ struct dsa_port {
>  	/* List of VLANs that CPU and DSA ports are members of. */
>  	struct mutex		vlans_lock;
>  	struct list_head	vlans;
> +
> +	/* List and maintenance of locked ATU entries */
> +	struct mutex		locked_entries_list_lock;
> +	struct list_head	atu_locked_entries_list;
> +	atomic_t		atu_locked_entry_cnt;
> +	struct delayed_work	atu_work;

DSA is not Marvell only, so please remove these from struct dsa_port and
place them somewhere like struct mv88e6xxx_port. Also, the change has
nothing to do in a patch with the "net: switchdev: " prefix.

>  };
>  
>  /* TODO: ideally DSA ports would have a single dp->link_dp member,
> diff --git a/include/net/switchdev.h b/include/net/switchdev.h
> index aa0171d5786d..62f4f7c9c7c2 100644
> --- a/include/net/switchdev.h
> +++ b/include/net/switchdev.h
> @@ -245,7 +245,8 @@ struct switchdev_notifier_fdb_info {
>  	u16 vid;
>  	u8 added_by_user:1,
>  	   is_local:1,
> -	   offloaded:1;
> +	   offloaded:1,
> +	   locked:1;

As mentioned by Ido, please update br_switchdev_fdb_populate() as part
of this change, in the bridge->switchdev direction. We should add a
comment near struct switchdev_notifier_fdb_info stating just that,
so that people don't forget.

>  };
>  
>  struct switchdev_notifier_port_obj_info {
> diff --git a/net/bridge/br.c b/net/bridge/br.c
> index 96e91d69a9a8..12933388a5a4 100644
> --- a/net/bridge/br.c
> +++ b/net/bridge/br.c
> @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,
>  	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
>  		fdb_info = ptr;
>  		err = br_fdb_external_learn_add(br, p, fdb_info->addr,
> -						fdb_info->vid, false);
> +						fdb_info->vid, false,
> +						fdb_info->locked);
>  		if (err) {
>  			err = notifier_from_errno(err);
>  			break;
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index 6b83e2d6435d..92469547283a 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -1135,7 +1135,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
>  					   "FDB entry towards bridge must be permanent");
>  			return -EINVAL;
>  		}
> -		err = br_fdb_external_learn_add(br, p, addr, vid, true);
> +		err = br_fdb_external_learn_add(br, p, addr, vid, true, false);
>  	} else {
>  		spin_lock_bh(&br->hash_lock);
>  		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
> @@ -1365,7 +1365,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
>  
>  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>  			      const unsigned char *addr, u16 vid,
> -			      bool swdev_notify)
> +			      bool swdev_notify, bool locked)
>  {
>  	struct net_bridge_fdb_entry *fdb;
>  	bool modified = false;
> @@ -1385,6 +1385,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>  		if (!p)
>  			flags |= BIT(BR_FDB_LOCAL);
>  
> +		if (locked)
> +			flags |= BIT(BR_FDB_ENTRY_LOCKED);
> +
>  		fdb = fdb_create(br, p, addr, vid, flags);
>  		if (!fdb) {
>  			err = -ENOMEM;
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index be17c99efe65..88913e6a59e1 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -815,7 +815,7 @@ int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
>  void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
>  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>  			      const unsigned char *addr, u16 vid,
> -			      bool swdev_notify);
> +			      bool swdev_notify, bool locked);
>  int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
>  			      const unsigned char *addr, u16 vid,
>  			      bool swdev_notify);
> -- 
> 2.30.2
> 


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-05-24 15:21   ` [Bridge] " Hans Schultz
@ 2022-06-27 18:05     ` Vladimir Oltean
  -1 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-06-27 18:05 UTC (permalink / raw)
  To: Hans Schultz
  Cc: davem, kuba, netdev, Hans Schultz, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Nikolay Aleksandrov, Shuah Khan,
	Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

Hi Hans,

On Tue, May 24, 2022 at 05:21:43PM +0200, Hans Schultz wrote:
> This implementation for the Marvell mv88e6xxx chip series, is
> based on handling ATU miss violations occurring when packets
> ingress on a port that is locked. The mac address triggering
> the ATU miss violation is communicated through switchdev to
> the bridge module, which adds a fdb entry with the fdb locked
> flag set. The entry is kept according to the bridges ageing
> time, thus simulating a dynamic entry.
> 
> Note: The locked port must have learning enabled for the ATU
> miss violation to occur.
> 
> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> ---

I'm sorry that I couldn't focus on the big picture of this patch,
but locking is an absolute disaster and I just stopped after a while,
it's really distracting :)

Would you mind addressing the feedback below first, and I'll take
another look when you send v4?

>  drivers/net/dsa/mv88e6xxx/Makefile            |   1 +
>  drivers/net/dsa/mv88e6xxx/chip.c              |  40 ++-
>  drivers/net/dsa/mv88e6xxx/chip.h              |   5 +
>  drivers/net/dsa/mv88e6xxx/global1.h           |   1 +
>  drivers/net/dsa/mv88e6xxx/global1_atu.c       |  35 ++-
>  .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c   | 249 ++++++++++++++++++
>  .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h   |  40 +++
>  drivers/net/dsa/mv88e6xxx/port.c              |  32 ++-
>  drivers/net/dsa/mv88e6xxx/port.h              |   2 +
>  9 files changed, 389 insertions(+), 16 deletions(-)
>  create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
>  create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile
> index c8eca2b6f959..3ca57709730d 100644
> --- a/drivers/net/dsa/mv88e6xxx/Makefile
> +++ b/drivers/net/dsa/mv88e6xxx/Makefile
> @@ -15,3 +15,4 @@ mv88e6xxx-objs += port_hidden.o
>  mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_PTP) += ptp.o
>  mv88e6xxx-objs += serdes.o
>  mv88e6xxx-objs += smi.o
> +mv88e6xxx-objs += mv88e6xxx_switchdev.o
> \ No newline at end of file
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 5d2c57a7c708..f7a16886bee9 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -42,6 +42,7 @@
>  #include "ptp.h"
>  #include "serdes.h"
>  #include "smi.h"
> +#include "mv88e6xxx_switchdev.h"
>  
>  static void assert_reg_lock(struct mv88e6xxx_chip *chip)
>  {
> @@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
>  	if (err)
>  		dev_err(chip->dev,
>  			"p%d: failed to force MAC link down\n", port);
> +	else
> +		if (mv88e6xxx_port_is_locked(chip, port, true))
> +			mv88e6xxx_atu_locked_entry_flush(ds, port);

This is superfluous, is it not? The bridge will transition a port whose
link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
fast-age the dynamic FDB entries on the port, which you've already
handled below.

>  }
>  
>  static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port,
> @@ -1685,6 +1689,9 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port)
>  	struct mv88e6xxx_chip *chip = ds->priv;
>  	int err;
>  
> +	if (mv88e6xxx_port_is_locked(chip, port, true))
> +		mv88e6xxx_atu_locked_entry_flush(ds, port);
> +

Dumb question: if you only flush the locked entries at fast age if the
port is locked, then what happens with the existing locked entries if
the port becomes unlocked before an FDB flush takes place?
Shouldn't mv88e6xxx_port_set_lock() call mv88e6xxx_atu_locked_entry_flush()
too?

>  	mv88e6xxx_reg_lock(chip);
>  	err = mv88e6xxx_port_fast_age_fid(chip, port, 0);
>  	mv88e6xxx_reg_unlock(chip);
> @@ -1721,11 +1728,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid,
>  	return err;
>  }
>  
> -static int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
> -			      int (*cb)(struct mv88e6xxx_chip *chip,
> -					const struct mv88e6xxx_vtu_entry *entry,
> -					void *priv),
> -			      void *priv)
> +int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
> +		       int (*cb)(struct mv88e6xxx_chip *chip,
> +				 const struct mv88e6xxx_vtu_entry *entry,
> +				 void *priv),
> +		       void *priv)
>  {
>  	struct mv88e6xxx_vtu_entry entry = {
>  		.vid = mv88e6xxx_max_vid(chip),
> @@ -2722,9 +2729,12 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
>  	struct mv88e6xxx_chip *chip = ds->priv;
>  	int err;
>  
> +	if (mv88e6xxx_port_is_locked(chip, port, true))
> +		mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
> +
>  	mv88e6xxx_reg_lock(chip);
>  	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
> -					   MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
> +			MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);

Unrelated and unjustified change.

>  	mv88e6xxx_reg_unlock(chip);
>  
>  	return err;
> @@ -2735,12 +2745,17 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
>  				  struct dsa_db db)
>  {
>  	struct mv88e6xxx_chip *chip = ds->priv;
> +	bool locked_found = false;
>  	int err;
>  
> -	mv88e6xxx_reg_lock(chip);
> -	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
> -	mv88e6xxx_reg_unlock(chip);
> +	if (mv88e6xxx_port_is_locked(chip, port, true))
> +		locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
>  
> +	if (!locked_found) {
> +		mv88e6xxx_reg_lock(chip);
> +		err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
> +		mv88e6xxx_reg_unlock(chip);
> +	}
>  	return err;
>  }
>  
> @@ -3809,11 +3824,16 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
>  
>  static int mv88e6xxx_port_setup(struct dsa_switch *ds, int port)
>  {
> -	return mv88e6xxx_setup_devlink_regions_port(ds, port);
> +	int err;
> +
> +	err = mv88e6xxx_setup_devlink_regions_port(ds, port);
> +	mv88e6xxx_init_violation_handler(ds, port);

What's with this quirky placement? You need to do error checking and
call mv88e6xxx_teardown_violation_handler() if setting up the devlink
port regions fails, otherwise the port will fail to probe but no one
will quiesce its delayed ATU work.

By the way, do all mv88e6xxx switches support 802.1X and MAC Auth Bypass,
or do we need to initialize these structures depending on some capability?

> +	return err;
>  }
>  
>  static void mv88e6xxx_port_teardown(struct dsa_switch *ds, int port)
>  {
> +	mv88e6xxx_teardown_violation_handler(ds, port);
>  	mv88e6xxx_teardown_devlink_regions_port(ds, port);
>  }
>  
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
> index 5e03cfe50156..c9a8404a6293 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.h
> +++ b/drivers/net/dsa/mv88e6xxx/chip.h
> @@ -803,6 +803,11 @@ static inline void mv88e6xxx_reg_unlock(struct mv88e6xxx_chip *chip)
>  	mutex_unlock(&chip->reg_lock);
>  }
>  
> +int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
> +		       int (*cb)(struct mv88e6xxx_chip *chip,
> +				 const struct mv88e6xxx_vtu_entry *entry,
> +				 void *priv),
> +		       void *priv);
>  int mv88e6xxx_fid_map(struct mv88e6xxx_chip *chip, unsigned long *bitmap);
>  
>  #endif /* _MV88E6XXX_CHIP_H */
> diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h
> index 65958b2a0d3a..503fbf216670 100644
> --- a/drivers/net/dsa/mv88e6xxx/global1.h
> +++ b/drivers/net/dsa/mv88e6xxx/global1.h
> @@ -136,6 +136,7 @@
>  #define MV88E6XXX_G1_ATU_DATA_TRUNK				0x8000
>  #define MV88E6XXX_G1_ATU_DATA_TRUNK_ID_MASK			0x00f0
>  #define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_MASK			0x3ff0
> +#define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS		0x0000
>  #define MV88E6XXX_G1_ATU_DATA_STATE_MASK			0x000f
>  #define MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED			0x0000
>  #define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_1_OLDEST		0x0001
> diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c
> index 40bd67a5c8e9..517376271f64 100644
> --- a/drivers/net/dsa/mv88e6xxx/global1_atu.c
> +++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c
> @@ -12,6 +12,8 @@
>  
>  #include "chip.h"
>  #include "global1.h"
> +#include "port.h"
> +#include "mv88e6xxx_switchdev.h"
>  
>  /* Offset 0x01: ATU FID Register */
>  
> @@ -114,6 +116,18 @@ static int mv88e6xxx_g1_atu_op_wait(struct mv88e6xxx_chip *chip)
>  	return mv88e6xxx_g1_wait_bit(chip, MV88E6XXX_G1_ATU_OP, bit, 0);
>  }
>  
> +static int mv88e6xxx_g1_read_atu_violation(struct mv88e6xxx_chip *chip)
> +{
> +	int err;
> +
> +	err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_ATU_OP,
> +				 MV88E6XXX_G1_ATU_OP_BUSY | MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);

Split on 3 lines please.

> +	if (err)
> +		return err;
> +
> +	return mv88e6xxx_g1_atu_op_wait(chip);
> +}
> +
>  static int mv88e6xxx_g1_atu_op(struct mv88e6xxx_chip *chip, u16 fid, u16 op)
>  {
>  	u16 val;
> @@ -356,11 +370,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
>  	int spid;
>  	int err;
>  	u16 val;
> +	u16 fid;
>  
>  	mv88e6xxx_reg_lock(chip);
>  
> -	err = mv88e6xxx_g1_atu_op(chip, 0,
> -				  MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);
> +	err = mv88e6xxx_g1_read_atu_violation(chip);

I cannot comment on the validity of this change: previously, we were
writing FID 0 as part of mv88e6xxx_g1_atu_op(), now we are reading back
the FID. Definitely too much going on in a single change, this needs a
separate patch with an explanation.

>  	if (err)
>  		goto out;
>  
> @@ -368,6 +382,10 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
>  	if (err)
>  		goto out;
>  
> +	err = mv88e6xxx_g1_read(chip, MV88E6352_G1_ATU_FID, &fid);
> +	if (err)
> +		goto out;

Is it ok to read the MV88E6352_G1_ATU_FID register from an IRQ handler
common for all switches, I wonder?

> +
>  	err = mv88e6xxx_g1_atu_data_read(chip, &entry);
>  	if (err)
>  		goto out;
> @@ -382,6 +400,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
>  		dev_err_ratelimited(chip->dev,
>  				    "ATU age out violation for %pM\n",
>  				    entry.mac);
> +		err = mv88e6xxx_handle_violation(chip,
> +						 chip->ports[spid].port,

Dumb question: isn't chip->ports[spid].port == spid?

> +						 &entry,
> +						 fid,
> +						 MV88E6XXX_G1_ATU_OP_AGE_OUT_VIOLATION);

This fits on 3 lines instead of 5 (and same below).

>  	}
>  
>  	if (val & MV88E6XXX_G1_ATU_OP_MEMBER_VIOLATION) {
> @@ -396,6 +419,14 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
>  				    "ATU miss violation for %pM portvec %x spid %d\n",
>  				    entry.mac, entry.portvec, spid);
>  		chip->ports[spid].atu_miss_violation++;
> +		if (mv88e6xxx_port_is_locked(chip, chip->ports[spid].port, false))
> +			err = mv88e6xxx_handle_violation(chip,
> +							 chip->ports[spid].port,
> +							 &entry,
> +							 fid,
> +							 MV88E6XXX_G1_ATU_OP_MISS_VIOLATION);
> +		if (err)
> +			goto out;
>  	}
>  
>  	if (val & MV88E6XXX_G1_ATU_OP_FULL_VIOLATION) {
> diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
> new file mode 100644
> index 000000000000..8436655ceb9a
> --- /dev/null
> +++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
> @@ -0,0 +1,249 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * mv88e6xxx_switchdev.c
> + *
> + *	Authors:
> + *	Hans J. Schultz		<hans.schultz@westermo.com>
> + *
> + */
> +
> +#include <net/switchdev.h>
> +#include <linux/list.h>
> +#include "chip.h"
> +#include "global1.h"
> +#include "mv88e6xxx_switchdev.h"
> +
> +static void mv88e6xxx_atu_locked_entry_timer_work(struct atu_locked_entry *ale)

Please find a more adequate name for this function.

> +{
> +	struct switchdev_notifier_fdb_info info = {
> +		.addr = ale->mac,
> +		.vid = ale->vid,
> +		.added_by_user = false,
> +		.is_local = false,

No need to have an initializer for the false members.

> +		.offloaded = true,
> +		.locked = true,
> +	};
> +	struct mv88e6xxx_atu_entry entry;
> +	struct net_device *brport;
> +	struct dsa_port *dp;
> +
> +	entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED;
> +	entry.trunk = false;
> +	memcpy(&entry.mac, &ale->mac, ETH_ALEN);

ether_addr_copy

> +
> +	mv88e6xxx_reg_lock(ale->chip);
> +	mv88e6xxx_g1_atu_loadpurge(ale->chip, ale->fid, &entry);

The portvec will be junk memory that's on stack, is that what you want?

> +	mv88e6xxx_reg_unlock(ale->chip);
> +
> +	dp = dsa_to_port(ale->chip->ds, ale->port);
> +	brport = dsa_port_to_bridge_port(dp);
> +
> +	if (brport) {
> +		if (!rtnl_is_locked()) {
> +			rtnl_lock();

No, no, no, no, no, no, no.

As I've explained already:
https://patchwork.kernel.org/project/netdevbpf/patch/20220317093902.1305816-4-schultz.hans+netdev@gmail.com/#24782974
dsa_port_to_bridge_port() needs to be called with the rtnl_mutex held.

Please take a moment to figure out which function expects which lock and
for what operation, then draw a call graph, figure out a consistent lock
hierarchy where things are always acquired in the same order, and if a
function needs a locking context but not all callers offer it, put an
ASSERT_RTNL() (for example) and transfer the locking responsibility to
the caller.

Doing this will also help you name your functions better than
"locked entry timer work" (which are called from... drum roll...
mv88e6xxx_port_fdb_del and mv88e6xxx_port_fast_age).

Which by the way, reminds me that.....
You can't take rtnl_lock() from port_fdb_add() and port_fdb_del(),
see commits d7d0d423dbaa ("net: dsa: flush switchdev workqueue when
leaving the bridge") and 0faf890fc519 ("net: dsa: drop rtnl_lock from
dsa_slave_switchdev_event_work"), as you'll deadlock with
dsa_port_pre_bridge_leave(). In fact you never could, but for a slightly
different reason.

From the discussion with Ido and Nikolay I get the impression that
you're not doing the right thing here either, notifying a
SWITCHDEV_FDB_DEL_TO_BRIDGE from what is effectively the
SWITCHDEV_FDB_DEL_TO_DEVICE handler (port_fdb_del).

> +			call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE,
> +						 brport, &info.info, NULL);
> +			rtnl_unlock();
> +		} else {
> +			call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE,
> +						 brport, &info.info, NULL);
> +		}
> +	} else {
> +		dev_err(ale->chip->dev, "ERR: No bridge port for dsa port belonging to port %d\n",
> +			ale->port);
> +	}
> +}
> +
> +static inline void mv88e6xxx_atu_locked_entry_purge(struct atu_locked_entry *ale)

No inline functions in .c files.

> +{
> +	mv88e6xxx_atu_locked_entry_timer_work(ale);
> +	del_timer(&ale->timer);
> +	list_del(&ale->list);
> +	kfree(ale);
> +}
> +
> +static void mv88e6xxx_atu_locked_entry_cleanup(struct work_struct *work)
> +{
> +	struct dsa_port *dp = container_of(work, struct dsa_port, atu_work.work);
> +	struct atu_locked_entry *ale, *tmp;
> +
> +	mutex_lock(&dp->locked_entries_list_lock);
> +	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
> +		if (ale->timed_out) {
> +			mv88e6xxx_atu_locked_entry_purge(ale);

Nasty lock ordering inversion. In mv88e6xxx_handle_violation() we take
&dp->locked_entries_list_lock with mv88e6xxx_reg_lock() held.
Here (in mv88e6xxx_atu_locked_entry_timer_work called from here) we take
mv88e6xxx_reg_lock() with &dp->locked_entries_list_lock held.

> +			atomic_dec(&dp->atu_locked_entry_cnt);
> +		}
> +	}
> +	mutex_unlock(&dp->locked_entries_list_lock);
> +
> +	mod_delayed_work(system_long_wq, &dp->atu_work, msecs_to_jiffies(100));
> +}
> +
> +static void mv88e6xxx_atu_locked_entry_timer_handler(struct timer_list *t)
> +{
> +	struct atu_locked_entry *ale = from_timer(ale, t, timer);
> +
> +	if (ale)
> +		ale->timed_out = true;
> +}
> +
> +struct mv88e6xxx_fid_search_ctx {
> +	u16 fid_search;
> +	u16 vid_found;
> +};
> +
> +static int mv88e6xxx_find_vid_on_matching_fid(struct mv88e6xxx_chip *chip,
> +					      const struct mv88e6xxx_vtu_entry *entry,
> +					      void *priv)
> +{
> +	struct mv88e6xxx_fid_search_ctx *ctx = priv;
> +
> +	if (ctx->fid_search == entry->fid) {
> +		ctx->vid_found = entry->vid;
> +		return 1;
> +	}
> +
> +	return 0;
> +}
> +
> +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip,
> +			       int port,
> +			       struct mv88e6xxx_atu_entry *entry,
> +			       u16 fid,
> +			       u16 type)
> +{
> +	struct switchdev_notifier_fdb_info info = {
> +		.addr = entry->mac,
> +		.vid = 0,
> +		.added_by_user = false,
> +		.is_local = false,
> +		.offloaded = true,
> +		.locked = true,
> +	};
> +	struct atu_locked_entry *locked_entry;
> +	struct mv88e6xxx_fid_search_ctx ctx;
> +	struct net_device *brport;
> +	struct dsa_port *dp;
> +	int err;
> +
> +	ctx.fid_search = fid;
> +	err = mv88e6xxx_vtu_walk(chip, mv88e6xxx_find_vid_on_matching_fid, &ctx);
> +	if (err < 0)
> +		return err;
> +	if (err == 1)
> +		info.vid = ctx.vid_found;
> +	else
> +		return -ENODATA;
> +
> +	dp = dsa_to_port(chip->ds, port);
> +	brport = dsa_port_to_bridge_port(dp);
> +
> +	if (!brport)
> +		return -ENODEV;
> +
> +	switch (type) {
> +	case MV88E6XXX_G1_ATU_OP_MISS_VIOLATION:
> +		if (atomic_read(&dp->atu_locked_entry_cnt) >= ATU_LOCKED_ENTRIES_MAX) {
> +			mv88e6xxx_reg_unlock(chip);

You call mv88e6xxx_reg_lock() from mv88e6xxx_g1_atu_prob_irq_thread_fn()
and mv88e6xxx_reg_unlock() from mv88e6xxx_handle_violation()? Nice!

And I understand why that is: to avoid a lock ordering inversion with
rtnl_lock(). Just unlock the mv88e6xxx registers after the last hardware
access in mv88e6xxx_g1_atu_prob_irq_thread_fn() - after mv88e6xxx_g1_atu_mac_read(),
and call mv88e6xxx_handle_violation() with the registers unlocked, and
lock them when you need them.

> +			return 0;
> +		}
> +		entry->portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS;
> +		entry->state = MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC;
> +		entry->trunk = false;
> +		err = mv88e6xxx_g1_atu_loadpurge(chip, fid, entry);
> +		if (err)
> +			goto fail;
> +
> +		locked_entry = kmalloc(sizeof(*locked_entry), GFP_ATOMIC);

Please be consistent in your naming of struct atu_locked_entry
variables, be they "locked_entry" or "ale" or otherwise.
And please create a helper function that creates such a structure and
initializes it.

> +		if (!locked_entry)
> +			return -ENOMEM;
> +		timer_setup(&locked_entry->timer, mv88e6xxx_atu_locked_entry_timer_handler, 0);

Does this have to be a dedicated timer per entry, or can you just record
the "jiffies" at creation time per locked entry, and compare it with the
current jiffies from the periodic, sleepable mv88e6xxx_atu_locked_entry_cleanup?

> +		locked_entry->timer.expires = jiffies + dp->ageing_time / 10;
> +		locked_entry->chip = chip;
> +		locked_entry->port = port;
> +		locked_entry->fid = fid;
> +		locked_entry->vid = info.vid;
> +		locked_entry->timed_out = false;
> +		memcpy(&locked_entry->mac, entry->mac, ETH_ALEN);
> +
> +		mutex_lock(&dp->locked_entries_list_lock);
> +		add_timer(&locked_entry->timer);
> +		list_add(&locked_entry->list, &dp->atu_locked_entries_list);
> +		mutex_unlock(&dp->locked_entries_list_lock);
> +		atomic_inc(&dp->atu_locked_entry_cnt);
> +
> +		mv88e6xxx_reg_unlock(chip);
> +
> +		rtnl_lock();
> +		err = call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE,
> +					       brport, &info.info, NULL);
> +		break;
> +	}
> +	rtnl_unlock();

Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().

> +
> +	return err;
> +
> +fail:
> +	mv88e6xxx_reg_unlock(chip);
> +	return err;
> +}
> +
> +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port,
> +					   const unsigned char *addr, u16 vid)
> +{
> +	struct dsa_port *dp = dsa_to_port(ds, port);
> +	struct atu_locked_entry *ale, *tmp;
> +	bool found = false;
> +
> +	mutex_lock(&dp->locked_entries_list_lock);
> +	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
> +		if (!memcmp(&ale->mac, addr, ETH_ALEN)) {
> +			if (ale->vid == vid) {
> +				mv88e6xxx_atu_locked_entry_purge(ale);
> +				atomic_dec(&dp->atu_locked_entry_cnt);
> +				found = true;
> +				break;
> +			}
> +		}
> +	}
> +	mutex_unlock(&dp->locked_entries_list_lock);
> +	return found;
> +}
> +
> +void mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port)
> +{
> +	struct dsa_port *dp = dsa_to_port(ds, port);
> +	struct atu_locked_entry *ale, *tmp;
> +
> +	mutex_lock(&dp->locked_entries_list_lock);
> +	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
> +		mv88e6xxx_atu_locked_entry_purge(ale);
> +		atomic_dec(&dp->atu_locked_entry_cnt);
> +	}
> +	mutex_unlock(&dp->locked_entries_list_lock);
> +
> +	if (atomic_read(&dp->atu_locked_entry_cnt) != 0)
> +		dev_err(ds->dev,
> +			"ERROR: Locked entries count is not zero after flush on port %d\n",
> +			port);

And generally speaking, why would you expect it to be 0, since there's
nothing that stops this check from racing with mv88e6xxx_handle_violation?
I suspect that if locking is properly thought through, the atu_locked_entry_cnt
can just be a plain int instead of an improperly used atomic.

Also, random fact: no need to say ERROR when printing with the KERN_ERR
log level. It's kind of implied from the log level.

> +}
> +
> +void mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port)
> +{
> +	struct dsa_port *dp = dsa_to_port(ds, port);
> +
> +	INIT_LIST_HEAD(&dp->atu_locked_entries_list);
> +	mutex_init(&dp->locked_entries_list_lock);
> +	dp->atu_locked_entry_cnt.counter = 0;

atomic_set()

> +	INIT_DELAYED_WORK(&dp->atu_work, mv88e6xxx_atu_locked_entry_cleanup);
> +	mod_delayed_work(system_long_wq, &dp->atu_work, msecs_to_jiffies(100));
> +}
> +
> +void mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port)
> +{
> +	struct dsa_port *dp = dsa_to_port(ds, port);
> +
> +	cancel_delayed_work(&dp->atu_work);
> +	mv88e6xxx_atu_locked_entry_flush(ds, port);
> +	mutex_destroy(&dp->locked_entries_list_lock);
> +}
> diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h

This and mv88e6xxx_switchdev.c are the only source files belonging to
this driver which have the mv88e6xxx_ prefix (others are "chip.c" etc).
Can you please follow the convention?

> new file mode 100644
> index 000000000000..f0e7abf7c361
> --- /dev/null
> +++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h
> @@ -0,0 +1,40 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later
> + *
> + * mv88e6xxx_switchdev.h
> + *
> + *	Authors:
> + *	Hans J. Schultz		<hans.schultz@westermo.com>
> + *
> + */
> +
> +#ifndef DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_
> +#define DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_
> +
> +#include <net/switchdev.h>
> +#include "chip.h"
> +
> +#define ATU_LOCKED_ENTRIES_MAX	64
> +
> +struct atu_locked_entry {

mv88e6xxx driver specific structure names should be prefixed with mv88e6xxx_.

> +	struct list_head list;
> +	struct mv88e6xxx_chip *chip;
> +	int port;
> +	u8	mac[ETH_ALEN];

Either align everything with tabs, or nothing.

> +	u16 fid;
> +	u16 vid;
> +	struct timer_list timer;
> +	bool timed_out;
> +};
> +
> +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip,
> +			       int port,
> +			       struct mv88e6xxx_atu_entry *entry,
> +			       u16 fid,
> +			       u16 type);

Both this and the function definition can easily fit on 3 lines.

> +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port,
> +					   const unsigned char *addr, u16 vid);
> +void mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port);
> +void mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port);
> +void mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port);
> +
> +#endif /* DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_ */
> diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
> index 795b3128768f..c4e5e7174129 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.c
> +++ b/drivers/net/dsa/mv88e6xxx/port.c
> @@ -14,9 +14,11 @@
>  #include <linux/phylink.h>
>  
>  #include "chip.h"
> +#include "global1.h"
>  #include "global2.h"
>  #include "port.h"
>  #include "serdes.h"
> +#include "mv88e6xxx_switchdev.h"
>  
>  int mv88e6xxx_port_read(struct mv88e6xxx_chip *chip, int port, int reg,
>  			u16 *val)
> @@ -1239,6 +1241,25 @@ int mv88e6xxx_port_set_mirror(struct mv88e6xxx_chip *chip, int port,
>  	return err;
>  }
>  
> +bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port, bool chiplock)
> +{
> +	bool locked = false;
> +	u16 reg;
> +
> +	if (chiplock)
> +		mv88e6xxx_reg_lock(chip);

Please, no "if (chiplock) mutex_lock()" hacks. Just lockdep_assert_held(&chip->reg_lock),
which serves both for documentation and for validation purposes, ensure
the lock is always taken at the caller (which in this case is super easy)
and move on.

> +
> +	if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
> +		goto out;

It would be good to actually propagate the error to the caller and
"locked" via a pass-by-reference bool pointer argument, not just say
that I/O errors mean that the port is unlocked.

> +	locked = reg & MV88E6XXX_PORT_CTL0_SA_FILT_DROP_ON_LOCK;
> +
> +out:
> +	if (chiplock)
> +		mv88e6xxx_reg_unlock(chip);
> +
> +	return locked;
> +}
> +
>  int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
>  			    bool locked)
>  {
> @@ -1261,10 +1282,13 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
>  	if (err)
>  		return err;
>  
> -	reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
> -	if (locked)
> -		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
> -
> +	reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
> +	if (locked) {
> +		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> +			MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> +			MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> +			MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;

I'd suggest aligning these macros vertically.

> +	}
>  	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg);
>  }
>  
> diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
> index e0a705d82019..d377abd6ab17 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.h
> +++ b/drivers/net/dsa/mv88e6xxx/port.h
> @@ -231,6 +231,7 @@
>  #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT		0x2000
>  #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG	0x1000
>  #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED	0x0800
> +#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK		0x07ff
>  
>  /* Offset 0x0C: Port ATU Control */
>  #define MV88E6XXX_PORT_ATU_CTL		0x0c
> @@ -374,6 +375,7 @@ int mv88e6xxx_port_set_fid(struct mv88e6xxx_chip *chip, int port, u16 fid);
>  int mv88e6xxx_port_get_pvid(struct mv88e6xxx_chip *chip, int port, u16 *pvid);
>  int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid);
>  
> +bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port, bool chiplock);
>  int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
>  			    bool locked);
>  
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-06-27 18:05     ` Vladimir Oltean
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-06-27 18:05 UTC (permalink / raw)
  To: Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, kuba, Paolo Abeni, Shuah Khan,
	davem, linux-kernel

Hi Hans,

On Tue, May 24, 2022 at 05:21:43PM +0200, Hans Schultz wrote:
> This implementation for the Marvell mv88e6xxx chip series, is
> based on handling ATU miss violations occurring when packets
> ingress on a port that is locked. The mac address triggering
> the ATU miss violation is communicated through switchdev to
> the bridge module, which adds a fdb entry with the fdb locked
> flag set. The entry is kept according to the bridges ageing
> time, thus simulating a dynamic entry.
> 
> Note: The locked port must have learning enabled for the ATU
> miss violation to occur.
> 
> Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> ---

I'm sorry that I couldn't focus on the big picture of this patch,
but locking is an absolute disaster and I just stopped after a while,
it's really distracting :)

Would you mind addressing the feedback below first, and I'll take
another look when you send v4?

>  drivers/net/dsa/mv88e6xxx/Makefile            |   1 +
>  drivers/net/dsa/mv88e6xxx/chip.c              |  40 ++-
>  drivers/net/dsa/mv88e6xxx/chip.h              |   5 +
>  drivers/net/dsa/mv88e6xxx/global1.h           |   1 +
>  drivers/net/dsa/mv88e6xxx/global1_atu.c       |  35 ++-
>  .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c   | 249 ++++++++++++++++++
>  .../net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h   |  40 +++
>  drivers/net/dsa/mv88e6xxx/port.c              |  32 ++-
>  drivers/net/dsa/mv88e6xxx/port.h              |   2 +
>  9 files changed, 389 insertions(+), 16 deletions(-)
>  create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
>  create mode 100644 drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile
> index c8eca2b6f959..3ca57709730d 100644
> --- a/drivers/net/dsa/mv88e6xxx/Makefile
> +++ b/drivers/net/dsa/mv88e6xxx/Makefile
> @@ -15,3 +15,4 @@ mv88e6xxx-objs += port_hidden.o
>  mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_PTP) += ptp.o
>  mv88e6xxx-objs += serdes.o
>  mv88e6xxx-objs += smi.o
> +mv88e6xxx-objs += mv88e6xxx_switchdev.o
> \ No newline at end of file
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 5d2c57a7c708..f7a16886bee9 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -42,6 +42,7 @@
>  #include "ptp.h"
>  #include "serdes.h"
>  #include "smi.h"
> +#include "mv88e6xxx_switchdev.h"
>  
>  static void assert_reg_lock(struct mv88e6xxx_chip *chip)
>  {
> @@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
>  	if (err)
>  		dev_err(chip->dev,
>  			"p%d: failed to force MAC link down\n", port);
> +	else
> +		if (mv88e6xxx_port_is_locked(chip, port, true))
> +			mv88e6xxx_atu_locked_entry_flush(ds, port);

This is superfluous, is it not? The bridge will transition a port whose
link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
fast-age the dynamic FDB entries on the port, which you've already
handled below.

>  }
>  
>  static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port,
> @@ -1685,6 +1689,9 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port)
>  	struct mv88e6xxx_chip *chip = ds->priv;
>  	int err;
>  
> +	if (mv88e6xxx_port_is_locked(chip, port, true))
> +		mv88e6xxx_atu_locked_entry_flush(ds, port);
> +

Dumb question: if you only flush the locked entries at fast age if the
port is locked, then what happens with the existing locked entries if
the port becomes unlocked before an FDB flush takes place?
Shouldn't mv88e6xxx_port_set_lock() call mv88e6xxx_atu_locked_entry_flush()
too?

>  	mv88e6xxx_reg_lock(chip);
>  	err = mv88e6xxx_port_fast_age_fid(chip, port, 0);
>  	mv88e6xxx_reg_unlock(chip);
> @@ -1721,11 +1728,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid,
>  	return err;
>  }
>  
> -static int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
> -			      int (*cb)(struct mv88e6xxx_chip *chip,
> -					const struct mv88e6xxx_vtu_entry *entry,
> -					void *priv),
> -			      void *priv)
> +int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
> +		       int (*cb)(struct mv88e6xxx_chip *chip,
> +				 const struct mv88e6xxx_vtu_entry *entry,
> +				 void *priv),
> +		       void *priv)
>  {
>  	struct mv88e6xxx_vtu_entry entry = {
>  		.vid = mv88e6xxx_max_vid(chip),
> @@ -2722,9 +2729,12 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
>  	struct mv88e6xxx_chip *chip = ds->priv;
>  	int err;
>  
> +	if (mv88e6xxx_port_is_locked(chip, port, true))
> +		mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
> +
>  	mv88e6xxx_reg_lock(chip);
>  	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
> -					   MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
> +			MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);

Unrelated and unjustified change.

>  	mv88e6xxx_reg_unlock(chip);
>  
>  	return err;
> @@ -2735,12 +2745,17 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
>  				  struct dsa_db db)
>  {
>  	struct mv88e6xxx_chip *chip = ds->priv;
> +	bool locked_found = false;
>  	int err;
>  
> -	mv88e6xxx_reg_lock(chip);
> -	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
> -	mv88e6xxx_reg_unlock(chip);
> +	if (mv88e6xxx_port_is_locked(chip, port, true))
> +		locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
>  
> +	if (!locked_found) {
> +		mv88e6xxx_reg_lock(chip);
> +		err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
> +		mv88e6xxx_reg_unlock(chip);
> +	}
>  	return err;
>  }
>  
> @@ -3809,11 +3824,16 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
>  
>  static int mv88e6xxx_port_setup(struct dsa_switch *ds, int port)
>  {
> -	return mv88e6xxx_setup_devlink_regions_port(ds, port);
> +	int err;
> +
> +	err = mv88e6xxx_setup_devlink_regions_port(ds, port);
> +	mv88e6xxx_init_violation_handler(ds, port);

What's with this quirky placement? You need to do error checking and
call mv88e6xxx_teardown_violation_handler() if setting up the devlink
port regions fails, otherwise the port will fail to probe but no one
will quiesce its delayed ATU work.

By the way, do all mv88e6xxx switches support 802.1X and MAC Auth Bypass,
or do we need to initialize these structures depending on some capability?

> +	return err;
>  }
>  
>  static void mv88e6xxx_port_teardown(struct dsa_switch *ds, int port)
>  {
> +	mv88e6xxx_teardown_violation_handler(ds, port);
>  	mv88e6xxx_teardown_devlink_regions_port(ds, port);
>  }
>  
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
> index 5e03cfe50156..c9a8404a6293 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.h
> +++ b/drivers/net/dsa/mv88e6xxx/chip.h
> @@ -803,6 +803,11 @@ static inline void mv88e6xxx_reg_unlock(struct mv88e6xxx_chip *chip)
>  	mutex_unlock(&chip->reg_lock);
>  }
>  
> +int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip,
> +		       int (*cb)(struct mv88e6xxx_chip *chip,
> +				 const struct mv88e6xxx_vtu_entry *entry,
> +				 void *priv),
> +		       void *priv);
>  int mv88e6xxx_fid_map(struct mv88e6xxx_chip *chip, unsigned long *bitmap);
>  
>  #endif /* _MV88E6XXX_CHIP_H */
> diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h
> index 65958b2a0d3a..503fbf216670 100644
> --- a/drivers/net/dsa/mv88e6xxx/global1.h
> +++ b/drivers/net/dsa/mv88e6xxx/global1.h
> @@ -136,6 +136,7 @@
>  #define MV88E6XXX_G1_ATU_DATA_TRUNK				0x8000
>  #define MV88E6XXX_G1_ATU_DATA_TRUNK_ID_MASK			0x00f0
>  #define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_MASK			0x3ff0
> +#define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS		0x0000
>  #define MV88E6XXX_G1_ATU_DATA_STATE_MASK			0x000f
>  #define MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED			0x0000
>  #define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_1_OLDEST		0x0001
> diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c
> index 40bd67a5c8e9..517376271f64 100644
> --- a/drivers/net/dsa/mv88e6xxx/global1_atu.c
> +++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c
> @@ -12,6 +12,8 @@
>  
>  #include "chip.h"
>  #include "global1.h"
> +#include "port.h"
> +#include "mv88e6xxx_switchdev.h"
>  
>  /* Offset 0x01: ATU FID Register */
>  
> @@ -114,6 +116,18 @@ static int mv88e6xxx_g1_atu_op_wait(struct mv88e6xxx_chip *chip)
>  	return mv88e6xxx_g1_wait_bit(chip, MV88E6XXX_G1_ATU_OP, bit, 0);
>  }
>  
> +static int mv88e6xxx_g1_read_atu_violation(struct mv88e6xxx_chip *chip)
> +{
> +	int err;
> +
> +	err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_ATU_OP,
> +				 MV88E6XXX_G1_ATU_OP_BUSY | MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);

Split on 3 lines please.

> +	if (err)
> +		return err;
> +
> +	return mv88e6xxx_g1_atu_op_wait(chip);
> +}
> +
>  static int mv88e6xxx_g1_atu_op(struct mv88e6xxx_chip *chip, u16 fid, u16 op)
>  {
>  	u16 val;
> @@ -356,11 +370,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
>  	int spid;
>  	int err;
>  	u16 val;
> +	u16 fid;
>  
>  	mv88e6xxx_reg_lock(chip);
>  
> -	err = mv88e6xxx_g1_atu_op(chip, 0,
> -				  MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);
> +	err = mv88e6xxx_g1_read_atu_violation(chip);

I cannot comment on the validity of this change: previously, we were
writing FID 0 as part of mv88e6xxx_g1_atu_op(), now we are reading back
the FID. Definitely too much going on in a single change, this needs a
separate patch with an explanation.

>  	if (err)
>  		goto out;
>  
> @@ -368,6 +382,10 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
>  	if (err)
>  		goto out;
>  
> +	err = mv88e6xxx_g1_read(chip, MV88E6352_G1_ATU_FID, &fid);
> +	if (err)
> +		goto out;

Is it ok to read the MV88E6352_G1_ATU_FID register from an IRQ handler
common for all switches, I wonder?

> +
>  	err = mv88e6xxx_g1_atu_data_read(chip, &entry);
>  	if (err)
>  		goto out;
> @@ -382,6 +400,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
>  		dev_err_ratelimited(chip->dev,
>  				    "ATU age out violation for %pM\n",
>  				    entry.mac);
> +		err = mv88e6xxx_handle_violation(chip,
> +						 chip->ports[spid].port,

Dumb question: isn't chip->ports[spid].port == spid?

> +						 &entry,
> +						 fid,
> +						 MV88E6XXX_G1_ATU_OP_AGE_OUT_VIOLATION);

This fits on 3 lines instead of 5 (and same below).

>  	}
>  
>  	if (val & MV88E6XXX_G1_ATU_OP_MEMBER_VIOLATION) {
> @@ -396,6 +419,14 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
>  				    "ATU miss violation for %pM portvec %x spid %d\n",
>  				    entry.mac, entry.portvec, spid);
>  		chip->ports[spid].atu_miss_violation++;
> +		if (mv88e6xxx_port_is_locked(chip, chip->ports[spid].port, false))
> +			err = mv88e6xxx_handle_violation(chip,
> +							 chip->ports[spid].port,
> +							 &entry,
> +							 fid,
> +							 MV88E6XXX_G1_ATU_OP_MISS_VIOLATION);
> +		if (err)
> +			goto out;
>  	}
>  
>  	if (val & MV88E6XXX_G1_ATU_OP_FULL_VIOLATION) {
> diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
> new file mode 100644
> index 000000000000..8436655ceb9a
> --- /dev/null
> +++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.c
> @@ -0,0 +1,249 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * mv88e6xxx_switchdev.c
> + *
> + *	Authors:
> + *	Hans J. Schultz		<hans.schultz@westermo.com>
> + *
> + */
> +
> +#include <net/switchdev.h>
> +#include <linux/list.h>
> +#include "chip.h"
> +#include "global1.h"
> +#include "mv88e6xxx_switchdev.h"
> +
> +static void mv88e6xxx_atu_locked_entry_timer_work(struct atu_locked_entry *ale)

Please find a more adequate name for this function.

> +{
> +	struct switchdev_notifier_fdb_info info = {
> +		.addr = ale->mac,
> +		.vid = ale->vid,
> +		.added_by_user = false,
> +		.is_local = false,

No need to have an initializer for the false members.

> +		.offloaded = true,
> +		.locked = true,
> +	};
> +	struct mv88e6xxx_atu_entry entry;
> +	struct net_device *brport;
> +	struct dsa_port *dp;
> +
> +	entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED;
> +	entry.trunk = false;
> +	memcpy(&entry.mac, &ale->mac, ETH_ALEN);

ether_addr_copy

> +
> +	mv88e6xxx_reg_lock(ale->chip);
> +	mv88e6xxx_g1_atu_loadpurge(ale->chip, ale->fid, &entry);

The portvec will be junk memory that's on stack, is that what you want?

> +	mv88e6xxx_reg_unlock(ale->chip);
> +
> +	dp = dsa_to_port(ale->chip->ds, ale->port);
> +	brport = dsa_port_to_bridge_port(dp);
> +
> +	if (brport) {
> +		if (!rtnl_is_locked()) {
> +			rtnl_lock();

No, no, no, no, no, no, no.

As I've explained already:
https://patchwork.kernel.org/project/netdevbpf/patch/20220317093902.1305816-4-schultz.hans+netdev@gmail.com/#24782974
dsa_port_to_bridge_port() needs to be called with the rtnl_mutex held.

Please take a moment to figure out which function expects which lock and
for what operation, then draw a call graph, figure out a consistent lock
hierarchy where things are always acquired in the same order, and if a
function needs a locking context but not all callers offer it, put an
ASSERT_RTNL() (for example) and transfer the locking responsibility to
the caller.

Doing this will also help you name your functions better than
"locked entry timer work" (which are called from... drum roll...
mv88e6xxx_port_fdb_del and mv88e6xxx_port_fast_age).

Which by the way, reminds me that.....
You can't take rtnl_lock() from port_fdb_add() and port_fdb_del(),
see commits d7d0d423dbaa ("net: dsa: flush switchdev workqueue when
leaving the bridge") and 0faf890fc519 ("net: dsa: drop rtnl_lock from
dsa_slave_switchdev_event_work"), as you'll deadlock with
dsa_port_pre_bridge_leave(). In fact you never could, but for a slightly
different reason.

From the discussion with Ido and Nikolay I get the impression that
you're not doing the right thing here either, notifying a
SWITCHDEV_FDB_DEL_TO_BRIDGE from what is effectively the
SWITCHDEV_FDB_DEL_TO_DEVICE handler (port_fdb_del).

> +			call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE,
> +						 brport, &info.info, NULL);
> +			rtnl_unlock();
> +		} else {
> +			call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE,
> +						 brport, &info.info, NULL);
> +		}
> +	} else {
> +		dev_err(ale->chip->dev, "ERR: No bridge port for dsa port belonging to port %d\n",
> +			ale->port);
> +	}
> +}
> +
> +static inline void mv88e6xxx_atu_locked_entry_purge(struct atu_locked_entry *ale)

No inline functions in .c files.

> +{
> +	mv88e6xxx_atu_locked_entry_timer_work(ale);
> +	del_timer(&ale->timer);
> +	list_del(&ale->list);
> +	kfree(ale);
> +}
> +
> +static void mv88e6xxx_atu_locked_entry_cleanup(struct work_struct *work)
> +{
> +	struct dsa_port *dp = container_of(work, struct dsa_port, atu_work.work);
> +	struct atu_locked_entry *ale, *tmp;
> +
> +	mutex_lock(&dp->locked_entries_list_lock);
> +	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
> +		if (ale->timed_out) {
> +			mv88e6xxx_atu_locked_entry_purge(ale);

Nasty lock ordering inversion. In mv88e6xxx_handle_violation() we take
&dp->locked_entries_list_lock with mv88e6xxx_reg_lock() held.
Here (in mv88e6xxx_atu_locked_entry_timer_work called from here) we take
mv88e6xxx_reg_lock() with &dp->locked_entries_list_lock held.

> +			atomic_dec(&dp->atu_locked_entry_cnt);
> +		}
> +	}
> +	mutex_unlock(&dp->locked_entries_list_lock);
> +
> +	mod_delayed_work(system_long_wq, &dp->atu_work, msecs_to_jiffies(100));
> +}
> +
> +static void mv88e6xxx_atu_locked_entry_timer_handler(struct timer_list *t)
> +{
> +	struct atu_locked_entry *ale = from_timer(ale, t, timer);
> +
> +	if (ale)
> +		ale->timed_out = true;
> +}
> +
> +struct mv88e6xxx_fid_search_ctx {
> +	u16 fid_search;
> +	u16 vid_found;
> +};
> +
> +static int mv88e6xxx_find_vid_on_matching_fid(struct mv88e6xxx_chip *chip,
> +					      const struct mv88e6xxx_vtu_entry *entry,
> +					      void *priv)
> +{
> +	struct mv88e6xxx_fid_search_ctx *ctx = priv;
> +
> +	if (ctx->fid_search == entry->fid) {
> +		ctx->vid_found = entry->vid;
> +		return 1;
> +	}
> +
> +	return 0;
> +}
> +
> +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip,
> +			       int port,
> +			       struct mv88e6xxx_atu_entry *entry,
> +			       u16 fid,
> +			       u16 type)
> +{
> +	struct switchdev_notifier_fdb_info info = {
> +		.addr = entry->mac,
> +		.vid = 0,
> +		.added_by_user = false,
> +		.is_local = false,
> +		.offloaded = true,
> +		.locked = true,
> +	};
> +	struct atu_locked_entry *locked_entry;
> +	struct mv88e6xxx_fid_search_ctx ctx;
> +	struct net_device *brport;
> +	struct dsa_port *dp;
> +	int err;
> +
> +	ctx.fid_search = fid;
> +	err = mv88e6xxx_vtu_walk(chip, mv88e6xxx_find_vid_on_matching_fid, &ctx);
> +	if (err < 0)
> +		return err;
> +	if (err == 1)
> +		info.vid = ctx.vid_found;
> +	else
> +		return -ENODATA;
> +
> +	dp = dsa_to_port(chip->ds, port);
> +	brport = dsa_port_to_bridge_port(dp);
> +
> +	if (!brport)
> +		return -ENODEV;
> +
> +	switch (type) {
> +	case MV88E6XXX_G1_ATU_OP_MISS_VIOLATION:
> +		if (atomic_read(&dp->atu_locked_entry_cnt) >= ATU_LOCKED_ENTRIES_MAX) {
> +			mv88e6xxx_reg_unlock(chip);

You call mv88e6xxx_reg_lock() from mv88e6xxx_g1_atu_prob_irq_thread_fn()
and mv88e6xxx_reg_unlock() from mv88e6xxx_handle_violation()? Nice!

And I understand why that is: to avoid a lock ordering inversion with
rtnl_lock(). Just unlock the mv88e6xxx registers after the last hardware
access in mv88e6xxx_g1_atu_prob_irq_thread_fn() - after mv88e6xxx_g1_atu_mac_read(),
and call mv88e6xxx_handle_violation() with the registers unlocked, and
lock them when you need them.

> +			return 0;
> +		}
> +		entry->portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS;
> +		entry->state = MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC;
> +		entry->trunk = false;
> +		err = mv88e6xxx_g1_atu_loadpurge(chip, fid, entry);
> +		if (err)
> +			goto fail;
> +
> +		locked_entry = kmalloc(sizeof(*locked_entry), GFP_ATOMIC);

Please be consistent in your naming of struct atu_locked_entry
variables, be they "locked_entry" or "ale" or otherwise.
And please create a helper function that creates such a structure and
initializes it.

> +		if (!locked_entry)
> +			return -ENOMEM;
> +		timer_setup(&locked_entry->timer, mv88e6xxx_atu_locked_entry_timer_handler, 0);

Does this have to be a dedicated timer per entry, or can you just record
the "jiffies" at creation time per locked entry, and compare it with the
current jiffies from the periodic, sleepable mv88e6xxx_atu_locked_entry_cleanup?

> +		locked_entry->timer.expires = jiffies + dp->ageing_time / 10;
> +		locked_entry->chip = chip;
> +		locked_entry->port = port;
> +		locked_entry->fid = fid;
> +		locked_entry->vid = info.vid;
> +		locked_entry->timed_out = false;
> +		memcpy(&locked_entry->mac, entry->mac, ETH_ALEN);
> +
> +		mutex_lock(&dp->locked_entries_list_lock);
> +		add_timer(&locked_entry->timer);
> +		list_add(&locked_entry->list, &dp->atu_locked_entries_list);
> +		mutex_unlock(&dp->locked_entries_list_lock);
> +		atomic_inc(&dp->atu_locked_entry_cnt);
> +
> +		mv88e6xxx_reg_unlock(chip);
> +
> +		rtnl_lock();
> +		err = call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE,
> +					       brport, &info.info, NULL);
> +		break;
> +	}
> +	rtnl_unlock();

Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().

> +
> +	return err;
> +
> +fail:
> +	mv88e6xxx_reg_unlock(chip);
> +	return err;
> +}
> +
> +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port,
> +					   const unsigned char *addr, u16 vid)
> +{
> +	struct dsa_port *dp = dsa_to_port(ds, port);
> +	struct atu_locked_entry *ale, *tmp;
> +	bool found = false;
> +
> +	mutex_lock(&dp->locked_entries_list_lock);
> +	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
> +		if (!memcmp(&ale->mac, addr, ETH_ALEN)) {
> +			if (ale->vid == vid) {
> +				mv88e6xxx_atu_locked_entry_purge(ale);
> +				atomic_dec(&dp->atu_locked_entry_cnt);
> +				found = true;
> +				break;
> +			}
> +		}
> +	}
> +	mutex_unlock(&dp->locked_entries_list_lock);
> +	return found;
> +}
> +
> +void mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port)
> +{
> +	struct dsa_port *dp = dsa_to_port(ds, port);
> +	struct atu_locked_entry *ale, *tmp;
> +
> +	mutex_lock(&dp->locked_entries_list_lock);
> +	list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
> +		mv88e6xxx_atu_locked_entry_purge(ale);
> +		atomic_dec(&dp->atu_locked_entry_cnt);
> +	}
> +	mutex_unlock(&dp->locked_entries_list_lock);
> +
> +	if (atomic_read(&dp->atu_locked_entry_cnt) != 0)
> +		dev_err(ds->dev,
> +			"ERROR: Locked entries count is not zero after flush on port %d\n",
> +			port);

And generally speaking, why would you expect it to be 0, since there's
nothing that stops this check from racing with mv88e6xxx_handle_violation?
I suspect that if locking is properly thought through, the atu_locked_entry_cnt
can just be a plain int instead of an improperly used atomic.

Also, random fact: no need to say ERROR when printing with the KERN_ERR
log level. It's kind of implied from the log level.

> +}
> +
> +void mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port)
> +{
> +	struct dsa_port *dp = dsa_to_port(ds, port);
> +
> +	INIT_LIST_HEAD(&dp->atu_locked_entries_list);
> +	mutex_init(&dp->locked_entries_list_lock);
> +	dp->atu_locked_entry_cnt.counter = 0;

atomic_set()

> +	INIT_DELAYED_WORK(&dp->atu_work, mv88e6xxx_atu_locked_entry_cleanup);
> +	mod_delayed_work(system_long_wq, &dp->atu_work, msecs_to_jiffies(100));
> +}
> +
> +void mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port)
> +{
> +	struct dsa_port *dp = dsa_to_port(ds, port);
> +
> +	cancel_delayed_work(&dp->atu_work);
> +	mv88e6xxx_atu_locked_entry_flush(ds, port);
> +	mutex_destroy(&dp->locked_entries_list_lock);
> +}
> diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h

This and mv88e6xxx_switchdev.c are the only source files belonging to
this driver which have the mv88e6xxx_ prefix (others are "chip.c" etc).
Can you please follow the convention?

> new file mode 100644
> index 000000000000..f0e7abf7c361
> --- /dev/null
> +++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx_switchdev.h
> @@ -0,0 +1,40 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later
> + *
> + * mv88e6xxx_switchdev.h
> + *
> + *	Authors:
> + *	Hans J. Schultz		<hans.schultz@westermo.com>
> + *
> + */
> +
> +#ifndef DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_
> +#define DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_
> +
> +#include <net/switchdev.h>
> +#include "chip.h"
> +
> +#define ATU_LOCKED_ENTRIES_MAX	64
> +
> +struct atu_locked_entry {

mv88e6xxx driver specific structure names should be prefixed with mv88e6xxx_.

> +	struct list_head list;
> +	struct mv88e6xxx_chip *chip;
> +	int port;
> +	u8	mac[ETH_ALEN];

Either align everything with tabs, or nothing.

> +	u16 fid;
> +	u16 vid;
> +	struct timer_list timer;
> +	bool timed_out;
> +};
> +
> +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip,
> +			       int port,
> +			       struct mv88e6xxx_atu_entry *entry,
> +			       u16 fid,
> +			       u16 type);

Both this and the function definition can easily fit on 3 lines.

> +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port,
> +					   const unsigned char *addr, u16 vid);
> +void mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port);
> +void mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port);
> +void mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port);
> +
> +#endif /* DRIVERS_NET_DSA_MV88E6XXX_MV88E6XXX_SWITCHDEV_H_ */
> diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
> index 795b3128768f..c4e5e7174129 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.c
> +++ b/drivers/net/dsa/mv88e6xxx/port.c
> @@ -14,9 +14,11 @@
>  #include <linux/phylink.h>
>  
>  #include "chip.h"
> +#include "global1.h"
>  #include "global2.h"
>  #include "port.h"
>  #include "serdes.h"
> +#include "mv88e6xxx_switchdev.h"
>  
>  int mv88e6xxx_port_read(struct mv88e6xxx_chip *chip, int port, int reg,
>  			u16 *val)
> @@ -1239,6 +1241,25 @@ int mv88e6xxx_port_set_mirror(struct mv88e6xxx_chip *chip, int port,
>  	return err;
>  }
>  
> +bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port, bool chiplock)
> +{
> +	bool locked = false;
> +	u16 reg;
> +
> +	if (chiplock)
> +		mv88e6xxx_reg_lock(chip);

Please, no "if (chiplock) mutex_lock()" hacks. Just lockdep_assert_held(&chip->reg_lock),
which serves both for documentation and for validation purposes, ensure
the lock is always taken at the caller (which in this case is super easy)
and move on.

> +
> +	if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
> +		goto out;

It would be good to actually propagate the error to the caller and
"locked" via a pass-by-reference bool pointer argument, not just say
that I/O errors mean that the port is unlocked.

> +	locked = reg & MV88E6XXX_PORT_CTL0_SA_FILT_DROP_ON_LOCK;
> +
> +out:
> +	if (chiplock)
> +		mv88e6xxx_reg_unlock(chip);
> +
> +	return locked;
> +}
> +
>  int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
>  			    bool locked)
>  {
> @@ -1261,10 +1282,13 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
>  	if (err)
>  		return err;
>  
> -	reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
> -	if (locked)
> -		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
> -
> +	reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
> +	if (locked) {
> +		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> +			MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> +			MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> +			MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;

I'd suggest aligning these macros vertically.

> +	}
>  	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg);
>  }
>  
> diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
> index e0a705d82019..d377abd6ab17 100644
> --- a/drivers/net/dsa/mv88e6xxx/port.h
> +++ b/drivers/net/dsa/mv88e6xxx/port.h
> @@ -231,6 +231,7 @@
>  #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT		0x2000
>  #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG	0x1000
>  #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED	0x0800
> +#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK		0x07ff
>  
>  /* Offset 0x0C: Port ATU Control */
>  #define MV88E6XXX_PORT_ATU_CTL		0x0c
> @@ -374,6 +375,7 @@ int mv88e6xxx_port_set_fid(struct mv88e6xxx_chip *chip, int port, u16 fid);
>  int mv88e6xxx_port_get_pvid(struct mv88e6xxx_chip *chip, int port, u16 *pvid);
>  int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid);
>  
> +bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port, bool chiplock);
>  int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port,
>  			    bool locked);
>  
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-06-27 18:05     ` [Bridge] " Vladimir Oltean
@ 2022-06-28 12:26       ` Hans S
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-06-28 12:26 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: David S. Miller, Jakub Kicinski, netdev, Hans Schultz,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Nikolay Aleksandrov, Shuah Khan, Daniel Borkmann, Ido Schimmel,
	linux-kernel, bridge, linux-kselftest

On Mon, Jun 27, 2022 at 8:06 PM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> Hi Hans,
>
> On Tue, May 24, 2022 at 05:21:43PM +0200, Hans Schultz wrote:
> > This implementation for the Marvell mv88e6xxx chip series, is
> > based on handling ATU miss violations occurring when packets
> > ingress on a port that is locked. The mac address triggering
> > the ATU miss violation is communicated through switchdev to
> > the bridge module, which adds a fdb entry with the fdb locked
> > flag set. The entry is kept according to the bridges ageing
> > time, thus simulating a dynamic entry.
> >
> > Note: The locked port must have learning enabled for the ATU
> > miss violation to occur.
> >
> > Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> > ---
>
> I'm sorry that I couldn't focus on the big picture of this patch,
> but locking is an absolute disaster and I just stopped after a while,
> it's really distracting :)

The code works, but I think that we should "undisaster" it. :)

>
> Would you mind addressing the feedback below first, and I'll take
> another look when you send v4?

fine :)

> >       if (err)
> >               dev_err(chip->dev,
> >                       "p%d: failed to force MAC link down\n", port);
> > +     else
> > +             if (mv88e6xxx_port_is_locked(chip, port, true))
> > +                     mv88e6xxx_atu_locked_entry_flush(ds, port);
>
> This is superfluous, is it not? The bridge will transition a port whose
> link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
> fast-age the dynamic FDB entries on the port, which you've already
> handled below.

I guess you are right.


> >  }
> >
> >  static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port,
> > @@ -1685,6 +1689,9 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port)
> >       struct mv88e6xxx_chip *chip = ds->priv;
> >       int err;
> >
> > +     if (mv88e6xxx_port_is_locked(chip, port, true))
> > +             mv88e6xxx_atu_locked_entry_flush(ds, port);
> > +
>
> Dumb question: if you only flush the locked entries at fast age if the
> port is locked, then what happens with the existing locked entries if
> the port becomes unlocked before an FDB flush takes place?
> Shouldn't mv88e6xxx_port_set_lock() call mv88e6xxx_atu_locked_entry_flush()
> too?

That was my first thought too, but the way the flags are handled with
the mask etc, does so that
mv88e6xxx_port_set_lock() is called when other flags change. It could
be done by the transition
from locked->unlocked by checking if the port is locked already.
On the other hand, the timers will timeout and the entries will be
removed anyhow.

> > +     if (mv88e6xxx_port_is_locked(chip, port, true))
> > +             mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
> > +
> >       mv88e6xxx_reg_lock(chip);
> >       err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
> > -                                        MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
> > +                     MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
>
> Unrelated and unjustified change.
>
Ups, missed that one.

> >  static int mv88e6xxx_port_setup(struct dsa_switch *ds, int port)
> >  {
> > -     return mv88e6xxx_setup_devlink_regions_port(ds, port);
> > +     int err;
> > +
> > +     err = mv88e6xxx_setup_devlink_regions_port(ds, port);
> > +     mv88e6xxx_init_violation_handler(ds, port);
>
> What's with this quirky placement? You need to do error checking and
> call mv88e6xxx_teardown_violation_handler() if setting up the devlink
> port regions fails, otherwise the port will fail to probe but no one
> will quiesce its delayed ATU work.

Yes, of course.

> By the way, do all mv88e6xxx switches support 802.1X and MAC Auth Bypass,
> or do we need to initialize these structures depending on some capability?

I will have to look into that, but I think they all do support these features.

> > +     err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_ATU_OP,
> > +                              MV88E6XXX_G1_ATU_OP_BUSY | MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);
>
> Split on 3 lines please.

OK.

> > +     if (err)
> > +             return err;
> > +
> > +     return mv88e6xxx_g1_atu_op_wait(chip);
> > +}
> > +
> >  static int mv88e6xxx_g1_atu_op(struct mv88e6xxx_chip *chip, u16 fid, u16 op)
> >  {
> >       u16 val;
> > @@ -356,11 +370,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
> >       int spid;
> >       int err;
> >       u16 val;
> > +     u16 fid;
> >
> >       mv88e6xxx_reg_lock(chip);
> >
> > -     err = mv88e6xxx_g1_atu_op(chip, 0,
> > -                               MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);
> > +     err = mv88e6xxx_g1_read_atu_violation(chip);
>
> I cannot comment on the validity of this change: previously, we were
> writing FID 0 as part of mv88e6xxx_g1_atu_op(), now we are reading back
> the FID. Definitely too much going on in a single change, this needs a
> separate patch with an explanation.

It is of course needed to read the fid and I couldn't really
understand the reasoning behind how it was before,
but I will do as you say as best I can.

>
> >       if (err)
> >               goto out;
> >
> > @@ -368,6 +382,10 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
> >       if (err)
> >               goto out;
> >
> > +     err = mv88e6xxx_g1_read(chip, MV88E6352_G1_ATU_FID, &fid);
> > +     if (err)
> > +             goto out;
>
> Is it ok to read the MV88E6352_G1_ATU_FID register from an IRQ handler
> common for all switches, I wonder?

I don't know about the naming of this define (I probably overlooked
the 6352 part), but it is the same as I have in the
spec for 6097, and I don't see any alternative...

> > +
> >       err = mv88e6xxx_g1_atu_data_read(chip, &entry);
> >       if (err)
> >               goto out;
> > @@ -382,6 +400,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
> >               dev_err_ratelimited(chip->dev,
> >                                   "ATU age out violation for %pM\n",
> >                                   entry.mac);
> > +             err = mv88e6xxx_handle_violation(chip,
> > +                                              chip->ports[spid].port,
>
> Dumb question: isn't chip->ports[spid].port == spid?

Probably you are right.

>
> > +                                              &entry,
> > +                                              fid,
> > +                                              MV88E6XXX_G1_ATU_OP_AGE_OUT_VIOLATION);
>
> This fits on 3 lines instead of 5 (and same below).

OK

>
> > +static void mv88e6xxx_atu_locked_entry_timer_work(struct atu_locked_entry *ale)
>
> Please find a more adequate name for this function.

Any suggestions?

>
> > +{
> > +     struct switchdev_notifier_fdb_info info = {
> > +             .addr = ale->mac,
> > +             .vid = ale->vid,
> > +             .added_by_user = false,
> > +             .is_local = false,
>
> No need to have an initializer for the false members.

OK

> > +             .offloaded = true,
> > +             .locked = true,
> > +     };
> > +     struct mv88e6xxx_atu_entry entry;
> > +     struct net_device *brport;
> > +     struct dsa_port *dp;
> > +
> > +     entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED;
> > +     entry.trunk = false;
> > +     memcpy(&entry.mac, &ale->mac, ETH_ALEN);
>
> ether_addr_copy
>
> > +
> > +     mv88e6xxx_reg_lock(ale->chip);
> > +     mv88e6xxx_g1_atu_loadpurge(ale->chip, ale->fid, &entry);
>
> The portvec will be junk memory that's on stack, is that what you want?
>
Probably not what I want.

> > +     if (brport) {
> > +             if (!rtnl_is_locked()) {
> > +                     rtnl_lock();
>
> No, no, no, no, no, no, no.
>
> As I've explained already:
> https://patchwork.kernel.org/project/netdevbpf/patch/20220317093902.1305816-4-schultz.hans+netdev@gmail.com/#24782974
> dsa_port_to_bridge_port() needs to be called with the rtnl_mutex held.
>
> Please take a moment to figure out which function expects which lock and
> for what operation, then draw a call graph, figure out a consistent lock
> hierarchy where things are always acquired in the same order, and if a
> function needs a locking context but not all callers offer it, put an
> ASSERT_RTNL() (for example) and transfer the locking responsibility to
> the caller.

As I remember it was because mv88e6xxx_atu_locked_entry_flush() was called both
with and without the lock, but there was something I didn't know about
how link down
handling works.

>
> Doing this will also help you name your functions better than
> "locked entry timer work" (which are called from... drum roll...
> mv88e6xxx_port_fdb_del and mv88e6xxx_port_fast_age).
>
> Which by the way, reminds me that.....
> You can't take rtnl_lock() from port_fdb_add() and port_fdb_del(),
> see commits d7d0d423dbaa ("net: dsa: flush switchdev workqueue when
> leaving the bridge") and 0faf890fc519 ("net: dsa: drop rtnl_lock from
> dsa_slave_switchdev_event_work"), as you'll deadlock with
> dsa_port_pre_bridge_leave(). In fact you never could, but for a slightly
> different reason.
>
> From the discussion with Ido and Nikolay I get the impression that
> you're not doing the right thing here either, notifying a
> SWITCHDEV_FDB_DEL_TO_BRIDGE from what is effectively the
> SWITCHDEV_FDB_DEL_TO_DEVICE handler (port_fdb_del).

Hmm, my experience tells me that much is opposite the normal
conventions when dealing with
locked ports, as there was never switchdev notifications from the
driver to the bridge before, but
that is needed to keep ATU and FDB entries in sync.

>
> No inline functions in .c files.

OK

> Nasty lock ordering inversion. In mv88e6xxx_handle_violation() we take
> &dp->locked_entries_list_lock with mv88e6xxx_reg_lock() held.
> Here (in mv88e6xxx_atu_locked_entry_timer_work called from here) we take
> mv88e6xxx_reg_lock() with &dp->locked_entries_list_lock held.
>
I will look into that.

> > +     switch (type) {
> > +     case MV88E6XXX_G1_ATU_OP_MISS_VIOLATION:
> > +             if (atomic_read(&dp->atu_locked_entry_cnt) >= ATU_LOCKED_ENTRIES_MAX) {
> > +                     mv88e6xxx_reg_unlock(chip);
>
> You call mv88e6xxx_reg_lock() from mv88e6xxx_g1_atu_prob_irq_thread_fn()
> and mv88e6xxx_reg_unlock() from mv88e6xxx_handle_violation()? Nice!
>
> And I understand why that is: to avoid a lock ordering inversion with
> rtnl_lock(). Just unlock the mv88e6xxx registers after the last hardware
> access in mv88e6xxx_g1_atu_prob_irq_thread_fn() - after mv88e6xxx_g1_atu_mac_read(),
> and call mv88e6xxx_handle_violation() with the registers unlocked, and
> lock them when you need them.

OK.

> > +             locked_entry = kmalloc(sizeof(*locked_entry), GFP_ATOMIC);
>
> Please be consistent in your naming of struct atu_locked_entry
> variables, be they "locked_entry" or "ale" or otherwise.
> And please create a helper function that creates such a structure and
> initializes it.

OK

> > +             if (!locked_entry)
> > +                     return -ENOMEM;
> > +             timer_setup(&locked_entry->timer, mv88e6xxx_atu_locked_entry_timer_handler, 0);
>
> Does this have to be a dedicated timer per entry, or can you just record
> the "jiffies" at creation time per locked entry, and compare it with the
> current jiffies from the periodic, sleepable mv88e6xxx_atu_locked_entry_cleanup?

I think that approach should be sufficient too.

>
> Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
> Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().

Just a small optimization as I also have another case of the switch
(only one switch case if
you didn't notice) belonging to the next patch set regarding dynamic
ATU entries.

> > +void mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port)
> > +{
> > +     struct dsa_port *dp = dsa_to_port(ds, port);
> > +     struct atu_locked_entry *ale, *tmp;
> > +
> > +     mutex_lock(&dp->locked_entries_list_lock);
> > +     list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
> > +             mv88e6xxx_atu_locked_entry_purge(ale);
> > +             atomic_dec(&dp->atu_locked_entry_cnt);
> > +     }
> > +     mutex_unlock(&dp->locked_entries_list_lock);
> > +
> > +     if (atomic_read(&dp->atu_locked_entry_cnt) != 0)
> > +             dev_err(ds->dev,
> > +                     "ERROR: Locked entries count is not zero after flush on port %d\n",
> > +                     port);
>
> And generally speaking, why would you expect it to be 0, since there's
> nothing that stops this check from racing with mv88e6xxx_handle_violation?

I guess you are right that when setting the port STP state to BLOCKED, there is
the potential race you mention.

> Also, random fact: no need to say ERROR when printing with the KERN_ERR
> log level. It's kind of implied from the log level.

Of course.

> > +     dp->atu_locked_entry_cnt.counter = 0;
>
> atomic_set()

Right!

> This and mv88e6xxx_switchdev.c are the only source files belonging to
> this driver which have the mv88e6xxx_ prefix (others are "chip.c" etc).
> Can you please follow the convention?

Yes. I think I got that idea from some other driver, thus avoiding
switchdev.h file,
but I will change it.

> > +struct atu_locked_entry {
>
> mv88e6xxx driver specific structure names should be prefixed with mv88e6xxx_.

OK

> > +     u8      mac[ETH_ALEN];
>
> Either align everything with tabs, or nothing.

Ups.

> > +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip,
> > +                            int port,
> > +                            struct mv88e6xxx_atu_entry *entry,
> > +                            u16 fid,
> > +                            u16 type);
>
> Both this and the function definition can easily fit on 3 lines.

OK

> Please, no "if (chiplock) mutex_lock()" hacks. Just lockdep_assert_held(&chip->reg_lock),
> which serves both for documentation and for validation purposes, ensure
> the lock is always taken at the caller (which in this case is super easy)
> and move on.

As I am calling the function in if statement checks, it would make
that code more messy, while with
this approach the function can be called from anywhere. I also looked
at having two functions, with
one being a wrapper function taking the lock and calling the other...

>
> > +
> > +     if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
> > +             goto out;
>
> It would be good to actually propagate the error to the caller and
> "locked" via a pass-by-reference bool pointer argument, not just say
> that I/O errors mean that the port is unlocked.

Again the wish to be able to call it from if statement checks,.

> > +     reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
> > +     if (locked) {
> > +             reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> > +                     MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> > +                     MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> > +                     MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
>
> I'd suggest aligning these macros vertically.

They are according to the Linux kernel coding standard wrt indentation afaik.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-06-28 12:26       ` Hans S
  0 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-06-28 12:26 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, David S. Miller, linux-kernel

On Mon, Jun 27, 2022 at 8:06 PM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> Hi Hans,
>
> On Tue, May 24, 2022 at 05:21:43PM +0200, Hans Schultz wrote:
> > This implementation for the Marvell mv88e6xxx chip series, is
> > based on handling ATU miss violations occurring when packets
> > ingress on a port that is locked. The mac address triggering
> > the ATU miss violation is communicated through switchdev to
> > the bridge module, which adds a fdb entry with the fdb locked
> > flag set. The entry is kept according to the bridges ageing
> > time, thus simulating a dynamic entry.
> >
> > Note: The locked port must have learning enabled for the ATU
> > miss violation to occur.
> >
> > Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
> > ---
>
> I'm sorry that I couldn't focus on the big picture of this patch,
> but locking is an absolute disaster and I just stopped after a while,
> it's really distracting :)

The code works, but I think that we should "undisaster" it. :)

>
> Would you mind addressing the feedback below first, and I'll take
> another look when you send v4?

fine :)

> >       if (err)
> >               dev_err(chip->dev,
> >                       "p%d: failed to force MAC link down\n", port);
> > +     else
> > +             if (mv88e6xxx_port_is_locked(chip, port, true))
> > +                     mv88e6xxx_atu_locked_entry_flush(ds, port);
>
> This is superfluous, is it not? The bridge will transition a port whose
> link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
> fast-age the dynamic FDB entries on the port, which you've already
> handled below.

I guess you are right.


> >  }
> >
> >  static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port,
> > @@ -1685,6 +1689,9 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port)
> >       struct mv88e6xxx_chip *chip = ds->priv;
> >       int err;
> >
> > +     if (mv88e6xxx_port_is_locked(chip, port, true))
> > +             mv88e6xxx_atu_locked_entry_flush(ds, port);
> > +
>
> Dumb question: if you only flush the locked entries at fast age if the
> port is locked, then what happens with the existing locked entries if
> the port becomes unlocked before an FDB flush takes place?
> Shouldn't mv88e6xxx_port_set_lock() call mv88e6xxx_atu_locked_entry_flush()
> too?

That was my first thought too, but the way the flags are handled with
the mask etc, does so that
mv88e6xxx_port_set_lock() is called when other flags change. It could
be done by the transition
from locked->unlocked by checking if the port is locked already.
On the other hand, the timers will timeout and the entries will be
removed anyhow.

> > +     if (mv88e6xxx_port_is_locked(chip, port, true))
> > +             mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
> > +
> >       mv88e6xxx_reg_lock(chip);
> >       err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
> > -                                        MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
> > +                     MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
>
> Unrelated and unjustified change.
>
Ups, missed that one.

> >  static int mv88e6xxx_port_setup(struct dsa_switch *ds, int port)
> >  {
> > -     return mv88e6xxx_setup_devlink_regions_port(ds, port);
> > +     int err;
> > +
> > +     err = mv88e6xxx_setup_devlink_regions_port(ds, port);
> > +     mv88e6xxx_init_violation_handler(ds, port);
>
> What's with this quirky placement? You need to do error checking and
> call mv88e6xxx_teardown_violation_handler() if setting up the devlink
> port regions fails, otherwise the port will fail to probe but no one
> will quiesce its delayed ATU work.

Yes, of course.

> By the way, do all mv88e6xxx switches support 802.1X and MAC Auth Bypass,
> or do we need to initialize these structures depending on some capability?

I will have to look into that, but I think they all do support these features.

> > +     err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_ATU_OP,
> > +                              MV88E6XXX_G1_ATU_OP_BUSY | MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);
>
> Split on 3 lines please.

OK.

> > +     if (err)
> > +             return err;
> > +
> > +     return mv88e6xxx_g1_atu_op_wait(chip);
> > +}
> > +
> >  static int mv88e6xxx_g1_atu_op(struct mv88e6xxx_chip *chip, u16 fid, u16 op)
> >  {
> >       u16 val;
> > @@ -356,11 +370,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
> >       int spid;
> >       int err;
> >       u16 val;
> > +     u16 fid;
> >
> >       mv88e6xxx_reg_lock(chip);
> >
> > -     err = mv88e6xxx_g1_atu_op(chip, 0,
> > -                               MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION);
> > +     err = mv88e6xxx_g1_read_atu_violation(chip);
>
> I cannot comment on the validity of this change: previously, we were
> writing FID 0 as part of mv88e6xxx_g1_atu_op(), now we are reading back
> the FID. Definitely too much going on in a single change, this needs a
> separate patch with an explanation.

It is of course needed to read the fid and I couldn't really
understand the reasoning behind how it was before,
but I will do as you say as best I can.

>
> >       if (err)
> >               goto out;
> >
> > @@ -368,6 +382,10 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
> >       if (err)
> >               goto out;
> >
> > +     err = mv88e6xxx_g1_read(chip, MV88E6352_G1_ATU_FID, &fid);
> > +     if (err)
> > +             goto out;
>
> Is it ok to read the MV88E6352_G1_ATU_FID register from an IRQ handler
> common for all switches, I wonder?

I don't know about the naming of this define (I probably overlooked
the 6352 part), but it is the same as I have in the
spec for 6097, and I don't see any alternative...

> > +
> >       err = mv88e6xxx_g1_atu_data_read(chip, &entry);
> >       if (err)
> >               goto out;
> > @@ -382,6 +400,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
> >               dev_err_ratelimited(chip->dev,
> >                                   "ATU age out violation for %pM\n",
> >                                   entry.mac);
> > +             err = mv88e6xxx_handle_violation(chip,
> > +                                              chip->ports[spid].port,
>
> Dumb question: isn't chip->ports[spid].port == spid?

Probably you are right.

>
> > +                                              &entry,
> > +                                              fid,
> > +                                              MV88E6XXX_G1_ATU_OP_AGE_OUT_VIOLATION);
>
> This fits on 3 lines instead of 5 (and same below).

OK

>
> > +static void mv88e6xxx_atu_locked_entry_timer_work(struct atu_locked_entry *ale)
>
> Please find a more adequate name for this function.

Any suggestions?

>
> > +{
> > +     struct switchdev_notifier_fdb_info info = {
> > +             .addr = ale->mac,
> > +             .vid = ale->vid,
> > +             .added_by_user = false,
> > +             .is_local = false,
>
> No need to have an initializer for the false members.

OK

> > +             .offloaded = true,
> > +             .locked = true,
> > +     };
> > +     struct mv88e6xxx_atu_entry entry;
> > +     struct net_device *brport;
> > +     struct dsa_port *dp;
> > +
> > +     entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED;
> > +     entry.trunk = false;
> > +     memcpy(&entry.mac, &ale->mac, ETH_ALEN);
>
> ether_addr_copy
>
> > +
> > +     mv88e6xxx_reg_lock(ale->chip);
> > +     mv88e6xxx_g1_atu_loadpurge(ale->chip, ale->fid, &entry);
>
> The portvec will be junk memory that's on stack, is that what you want?
>
Probably not what I want.

> > +     if (brport) {
> > +             if (!rtnl_is_locked()) {
> > +                     rtnl_lock();
>
> No, no, no, no, no, no, no.
>
> As I've explained already:
> https://patchwork.kernel.org/project/netdevbpf/patch/20220317093902.1305816-4-schultz.hans+netdev@gmail.com/#24782974
> dsa_port_to_bridge_port() needs to be called with the rtnl_mutex held.
>
> Please take a moment to figure out which function expects which lock and
> for what operation, then draw a call graph, figure out a consistent lock
> hierarchy where things are always acquired in the same order, and if a
> function needs a locking context but not all callers offer it, put an
> ASSERT_RTNL() (for example) and transfer the locking responsibility to
> the caller.

As I remember it was because mv88e6xxx_atu_locked_entry_flush() was called both
with and without the lock, but there was something I didn't know about
how link down
handling works.

>
> Doing this will also help you name your functions better than
> "locked entry timer work" (which are called from... drum roll...
> mv88e6xxx_port_fdb_del and mv88e6xxx_port_fast_age).
>
> Which by the way, reminds me that.....
> You can't take rtnl_lock() from port_fdb_add() and port_fdb_del(),
> see commits d7d0d423dbaa ("net: dsa: flush switchdev workqueue when
> leaving the bridge") and 0faf890fc519 ("net: dsa: drop rtnl_lock from
> dsa_slave_switchdev_event_work"), as you'll deadlock with
> dsa_port_pre_bridge_leave(). In fact you never could, but for a slightly
> different reason.
>
> From the discussion with Ido and Nikolay I get the impression that
> you're not doing the right thing here either, notifying a
> SWITCHDEV_FDB_DEL_TO_BRIDGE from what is effectively the
> SWITCHDEV_FDB_DEL_TO_DEVICE handler (port_fdb_del).

Hmm, my experience tells me that much is opposite the normal
conventions when dealing with
locked ports, as there was never switchdev notifications from the
driver to the bridge before, but
that is needed to keep ATU and FDB entries in sync.

>
> No inline functions in .c files.

OK

> Nasty lock ordering inversion. In mv88e6xxx_handle_violation() we take
> &dp->locked_entries_list_lock with mv88e6xxx_reg_lock() held.
> Here (in mv88e6xxx_atu_locked_entry_timer_work called from here) we take
> mv88e6xxx_reg_lock() with &dp->locked_entries_list_lock held.
>
I will look into that.

> > +     switch (type) {
> > +     case MV88E6XXX_G1_ATU_OP_MISS_VIOLATION:
> > +             if (atomic_read(&dp->atu_locked_entry_cnt) >= ATU_LOCKED_ENTRIES_MAX) {
> > +                     mv88e6xxx_reg_unlock(chip);
>
> You call mv88e6xxx_reg_lock() from mv88e6xxx_g1_atu_prob_irq_thread_fn()
> and mv88e6xxx_reg_unlock() from mv88e6xxx_handle_violation()? Nice!
>
> And I understand why that is: to avoid a lock ordering inversion with
> rtnl_lock(). Just unlock the mv88e6xxx registers after the last hardware
> access in mv88e6xxx_g1_atu_prob_irq_thread_fn() - after mv88e6xxx_g1_atu_mac_read(),
> and call mv88e6xxx_handle_violation() with the registers unlocked, and
> lock them when you need them.

OK.

> > +             locked_entry = kmalloc(sizeof(*locked_entry), GFP_ATOMIC);
>
> Please be consistent in your naming of struct atu_locked_entry
> variables, be they "locked_entry" or "ale" or otherwise.
> And please create a helper function that creates such a structure and
> initializes it.

OK

> > +             if (!locked_entry)
> > +                     return -ENOMEM;
> > +             timer_setup(&locked_entry->timer, mv88e6xxx_atu_locked_entry_timer_handler, 0);
>
> Does this have to be a dedicated timer per entry, or can you just record
> the "jiffies" at creation time per locked entry, and compare it with the
> current jiffies from the periodic, sleepable mv88e6xxx_atu_locked_entry_cleanup?

I think that approach should be sufficient too.

>
> Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
> Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().

Just a small optimization as I also have another case of the switch
(only one switch case if
you didn't notice) belonging to the next patch set regarding dynamic
ATU entries.

> > +void mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port)
> > +{
> > +     struct dsa_port *dp = dsa_to_port(ds, port);
> > +     struct atu_locked_entry *ale, *tmp;
> > +
> > +     mutex_lock(&dp->locked_entries_list_lock);
> > +     list_for_each_entry_safe(ale, tmp, &dp->atu_locked_entries_list, list) {
> > +             mv88e6xxx_atu_locked_entry_purge(ale);
> > +             atomic_dec(&dp->atu_locked_entry_cnt);
> > +     }
> > +     mutex_unlock(&dp->locked_entries_list_lock);
> > +
> > +     if (atomic_read(&dp->atu_locked_entry_cnt) != 0)
> > +             dev_err(ds->dev,
> > +                     "ERROR: Locked entries count is not zero after flush on port %d\n",
> > +                     port);
>
> And generally speaking, why would you expect it to be 0, since there's
> nothing that stops this check from racing with mv88e6xxx_handle_violation?

I guess you are right that when setting the port STP state to BLOCKED, there is
the potential race you mention.

> Also, random fact: no need to say ERROR when printing with the KERN_ERR
> log level. It's kind of implied from the log level.

Of course.

> > +     dp->atu_locked_entry_cnt.counter = 0;
>
> atomic_set()

Right!

> This and mv88e6xxx_switchdev.c are the only source files belonging to
> this driver which have the mv88e6xxx_ prefix (others are "chip.c" etc).
> Can you please follow the convention?

Yes. I think I got that idea from some other driver, thus avoiding
switchdev.h file,
but I will change it.

> > +struct atu_locked_entry {
>
> mv88e6xxx driver specific structure names should be prefixed with mv88e6xxx_.

OK

> > +     u8      mac[ETH_ALEN];
>
> Either align everything with tabs, or nothing.

Ups.

> > +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip,
> > +                            int port,
> > +                            struct mv88e6xxx_atu_entry *entry,
> > +                            u16 fid,
> > +                            u16 type);
>
> Both this and the function definition can easily fit on 3 lines.

OK

> Please, no "if (chiplock) mutex_lock()" hacks. Just lockdep_assert_held(&chip->reg_lock),
> which serves both for documentation and for validation purposes, ensure
> the lock is always taken at the caller (which in this case is super easy)
> and move on.

As I am calling the function in if statement checks, it would make
that code more messy, while with
this approach the function can be called from anywhere. I also looked
at having two functions, with
one being a wrapper function taking the lock and calling the other...

>
> > +
> > +     if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
> > +             goto out;
>
> It would be good to actually propagate the error to the caller and
> "locked" via a pass-by-reference bool pointer argument, not just say
> that I/O errors mean that the port is unlocked.

Again the wish to be able to call it from if statement checks,.

> > +     reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
> > +     if (locked) {
> > +             reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> > +                     MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> > +                     MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> > +                     MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
>
> I'd suggest aligning these macros vertically.

They are according to the Linux kernel coding standard wrt indentation afaik.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-06-28 12:26       ` [Bridge] " Hans S
@ 2022-07-05 15:05         ` Hans S
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-05 15:05 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: David S. Miller, Jakub Kicinski, netdev, Hans Schultz,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Nikolay Aleksandrov, Shuah Khan, Daniel Borkmann, Ido Schimmel,
	linux-kernel, bridge, linux-kselftest

Hi, does anybody know what it going on with this variable?
struct dsa_port *dp ->ageing_time;

I experience that it changes value like a factor ~10 at times.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-07-05 15:05         ` Hans S
  0 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-05 15:05 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, David S. Miller, linux-kernel

Hi, does anybody know what it going on with this variable?
struct dsa_port *dp ->ageing_time;

I experience that it changes value like a factor ~10 at times.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-06-28 12:26       ` [Bridge] " Hans S
@ 2022-07-06  8:55         ` Vladimir Oltean
  -1 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-06  8:55 UTC (permalink / raw)
  To: Hans S
  Cc: David S. Miller, Jakub Kicinski, netdev, Hans Schultz,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Nikolay Aleksandrov, Shuah Khan, Daniel Borkmann, Ido Schimmel,
	linux-kernel, bridge, linux-kselftest

On Tue, Jun 28, 2022 at 02:26:43PM +0200, Hans S wrote:
> > Dumb question: if you only flush the locked entries at fast age if the
> > port is locked, then what happens with the existing locked entries if
> > the port becomes unlocked before an FDB flush takes place?
> > Shouldn't mv88e6xxx_port_set_lock() call mv88e6xxx_atu_locked_entry_flush()
> > too?
> 
> That was my first thought too, but the way the flags are handled with the mask etc, does so that
> mv88e6xxx_port_set_lock() is called when other flags change. It could be done by the transition
> from locked->unlocked by checking if the port is locked already.

Why does mv88e6xxx_port_set_lock() get called when other flags change?

> On the other hand, the timers will timeout and the entries will be removed anyhow.

> > > +static void mv88e6xxx_atu_locked_entry_timer_work(struct atu_locked_entry *ale)
> >
> > Please find a more adequate name for this function.
> 
> Any suggestions?

Not sure. It depends on whether you leave just the logic to delete a
locked ATU entry, or also the switchdev FDB_DEL_TO_BRIDGE notifier.
In any case, pick a name that reflects what it does. Something with
locked_entry_delete() can't be too wrong.

> > From the discussion with Ido and Nikolay I get the impression that
> > you're not doing the right thing here either, notifying a
> > SWITCHDEV_FDB_DEL_TO_BRIDGE from what is effectively the
> > SWITCHDEV_FDB_DEL_TO_DEVICE handler (port_fdb_del).
> 
> Hmm, my experience tells me that much is opposite the normal
> conventions when dealing with
> locked ports, as there was never switchdev notifications from the
> driver to the bridge before, but
> that is needed to keep ATU and FDB entries in sync.

On delete you mean? So the bridge signals switchdev a deletion of a
locked FDB entry (as I pointed out, this function gets indirectly called
from port_fdb_del), but it won't get deleted until switchdev signals it
back, is what you're saying?

> > Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
> > Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().
> 
> Just a small optimization as I also have another case of the switch
> (only one switch case if
> you didn't notice) belonging to the next patch set regarding dynamic
> ATU entries.

What kind of optimization are you even talking about? Please get rid of
coding patterns like this, sorry.

> > Please, no "if (chiplock) mutex_lock()" hacks. Just lockdep_assert_held(&chip->reg_lock),
> > which serves both for documentation and for validation purposes, ensure
> > the lock is always taken at the caller (which in this case is super easy)
> > and move on.
> 
> As I am calling the function in if statement checks, it would make
> that code more messy, while with
> this approach the function can be called from anywhere. I also looked
> at having two functions, with
> one being a wrapper function taking the lock and calling the other...

There are many functions in mv88e6xxx that require the reg_lock to be
held, there's nothing new or special here.

> >
> > > +
> > > +     if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
> > > +             goto out;
> >
> > It would be good to actually propagate the error to the caller and
> > "locked" via a pass-by-reference bool pointer argument, not just say
> > that I/O errors mean that the port is unlocked.
> 
> Again the wish to be able to call it from if statement checks,.
> 
> > > +     reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
> > > +     if (locked) {
> > > +             reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
> >
> > I'd suggest aligning these macros vertically.
> 
> They are according to the Linux kernel coding standard wrt indentation afaik.

Compare:

		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
			MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
			MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
			MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;

with:

		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
		       MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
		       MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
		       MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-07-06  8:55         ` Vladimir Oltean
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-06  8:55 UTC (permalink / raw)
  To: Hans S
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, David S. Miller, linux-kernel

On Tue, Jun 28, 2022 at 02:26:43PM +0200, Hans S wrote:
> > Dumb question: if you only flush the locked entries at fast age if the
> > port is locked, then what happens with the existing locked entries if
> > the port becomes unlocked before an FDB flush takes place?
> > Shouldn't mv88e6xxx_port_set_lock() call mv88e6xxx_atu_locked_entry_flush()
> > too?
> 
> That was my first thought too, but the way the flags are handled with the mask etc, does so that
> mv88e6xxx_port_set_lock() is called when other flags change. It could be done by the transition
> from locked->unlocked by checking if the port is locked already.

Why does mv88e6xxx_port_set_lock() get called when other flags change?

> On the other hand, the timers will timeout and the entries will be removed anyhow.

> > > +static void mv88e6xxx_atu_locked_entry_timer_work(struct atu_locked_entry *ale)
> >
> > Please find a more adequate name for this function.
> 
> Any suggestions?

Not sure. It depends on whether you leave just the logic to delete a
locked ATU entry, or also the switchdev FDB_DEL_TO_BRIDGE notifier.
In any case, pick a name that reflects what it does. Something with
locked_entry_delete() can't be too wrong.

> > From the discussion with Ido and Nikolay I get the impression that
> > you're not doing the right thing here either, notifying a
> > SWITCHDEV_FDB_DEL_TO_BRIDGE from what is effectively the
> > SWITCHDEV_FDB_DEL_TO_DEVICE handler (port_fdb_del).
> 
> Hmm, my experience tells me that much is opposite the normal
> conventions when dealing with
> locked ports, as there was never switchdev notifications from the
> driver to the bridge before, but
> that is needed to keep ATU and FDB entries in sync.

On delete you mean? So the bridge signals switchdev a deletion of a
locked FDB entry (as I pointed out, this function gets indirectly called
from port_fdb_del), but it won't get deleted until switchdev signals it
back, is what you're saying?

> > Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
> > Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().
> 
> Just a small optimization as I also have another case of the switch
> (only one switch case if
> you didn't notice) belonging to the next patch set regarding dynamic
> ATU entries.

What kind of optimization are you even talking about? Please get rid of
coding patterns like this, sorry.

> > Please, no "if (chiplock) mutex_lock()" hacks. Just lockdep_assert_held(&chip->reg_lock),
> > which serves both for documentation and for validation purposes, ensure
> > the lock is always taken at the caller (which in this case is super easy)
> > and move on.
> 
> As I am calling the function in if statement checks, it would make
> that code more messy, while with
> this approach the function can be called from anywhere. I also looked
> at having two functions, with
> one being a wrapper function taking the lock and calling the other...

There are many functions in mv88e6xxx that require the reg_lock to be
held, there's nothing new or special here.

> >
> > > +
> > > +     if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
> > > +             goto out;
> >
> > It would be good to actually propagate the error to the caller and
> > "locked" via a pass-by-reference bool pointer argument, not just say
> > that I/O errors mean that the port is unlocked.
> 
> Again the wish to be able to call it from if statement checks,.
> 
> > > +     reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
> > > +     if (locked) {
> > > +             reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
> >
> > I'd suggest aligning these macros vertically.
> 
> They are according to the Linux kernel coding standard wrt indentation afaik.

Compare:

		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
			MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
			MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
			MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;

with:

		reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
		       MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
		       MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
		       MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-07-06  8:55         ` [Bridge] " Vladimir Oltean
@ 2022-07-06 10:12           ` Hans S
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-06 10:12 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: David S. Miller, Jakub Kicinski, netdev, Hans Schultz,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Nikolay Aleksandrov, Shuah Khan, Daniel Borkmann, Ido Schimmel,
	linux-kernel, bridge, linux-kselftest

On Wed, Jul 6, 2022 at 10:56 AM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Tue, Jun 28, 2022 at 02:26:43PM +0200, Hans S wrote:
> > > Dumb question: if you only flush the locked entries at fast age if the
> > > port is locked, then what happens with the existing locked entries if
> > > the port becomes unlocked before an FDB flush takes place?
> > > Shouldn't mv88e6xxx_port_set_lock() call mv88e6xxx_atu_locked_entry_flush()
> > > too?

BTW:
>> @@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
>>       if (err)
>>               dev_err(chip->dev,
>>                       "p%d: failed to force MAC link down\n", port);
>> +     else
>> +             if (mv88e6xxx_port_is_locked(chip, port, true))
>> +                     mv88e6xxx_atu_locked_entry_flush(ds, port);
>
>This is superfluous, is it not? The bridge will transition a port whose
>link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
>fast-age the dynamic FDB entries on the port, which you've already
>handled below.

I removed this code, but then on link down the locked entries were not
cleared out. Something not as thought?

> >
> > That was my first thought too, but the way the flags are handled with the mask etc, does so that
> > mv88e6xxx_port_set_lock() is called when other flags change. It could be done by the transition
> > from locked->unlocked by checking if the port is locked already.
>
> Why does mv88e6xxx_port_set_lock() get called when other flags change?

That is what seems to happen, but maybe I am wrong. Looking at the dsa
functions dsa_port_inherit_brport_flags() and
dsa_port_clear_brport_flags(), they set the mask for which underlying
function is called, and it looks to me that they call once for all the
flags: BR_LEARNING, BR_FLOOD, BR_MCAST_FLOOD, BR_BCAST_FLOOD,
BR_PORT_LOCKED?

>
> > On the other hand, the timers will timeout and the entries will be removed anyhow.
>
> > > > +static void mv88e6xxx_atu_locked_entry_timer_work(struct atu_locked_entry *ale)
> > >
> > > Please find a more adequate name for this function.
> >
> > Any suggestions?
>
> Not sure. It depends on whether you leave just the logic to delete a
> locked ATU entry, or also the switchdev FDB_DEL_TO_BRIDGE notifier.
> In any case, pick a name that reflects what it does. Something with
> locked_entry_delete() can't be too wrong.
>
> > > From the discussion with Ido and Nikolay I get the impression that
> > > you're not doing the right thing here either, notifying a
> > > SWITCHDEV_FDB_DEL_TO_BRIDGE from what is effectively the
> > > SWITCHDEV_FDB_DEL_TO_DEVICE handler (port_fdb_del).
> >
> > Hmm, my experience tells me that much is opposite the normal
> > conventions when dealing with
> > locked ports, as there was never switchdev notifications from the
> > driver to the bridge before, but
> > that is needed to keep ATU and FDB entries in sync.
>
> On delete you mean? So the bridge signals switchdev a deletion of a
> locked FDB entry (as I pointed out, this function gets indirectly called
> from port_fdb_del), but it won't get deleted until switchdev signals it
> back, is what you're saying?
>

When added they are added with bridge FDB flags: extern_learn offload
locked, with a SWITCHDEV_FDB_ADD_TO_BRIDGE event. So they are owned by
the driver.
When the driver deletes the locked entry the bridge FDB entry is
removes by the SWITCHDEV_FDB_DEL_TO_BRIDGE event from the driver. That
seems quite fair?

> > > Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
> > > Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().
> >
> > Just a small optimization as I also have another case of the switch
> > (only one switch case if
> > you didn't notice) belonging to the next patch set regarding dynamic
> > ATU entries.
>
> What kind of optimization are you even talking about? Please get rid of
> coding patterns like this, sorry.
>
Right!

> > > Please, no "if (chiplock) mutex_lock()" hacks. Just lockdep_assert_held(&chip->reg_lock),
> > > which serves both for documentation and for validation purposes, ensure
> > > the lock is always taken at the caller (which in this case is super easy)
> > > and move on.
> >
> > As I am calling the function in if statement checks, it would make
> > that code more messy, while with
> > this approach the function can be called from anywhere. I also looked
> > at having two functions, with
> > one being a wrapper function taking the lock and calling the other...
>
> There are many functions in mv88e6xxx that require the reg_lock to be
> held, there's nothing new or special here.
>

Now I take the lock in the function regardless. No boolean.

> > >
> > > > +
> > > > +     if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
> > > > +             goto out;
> > >
> > > It would be good to actually propagate the error to the caller and
> > > "locked" via a pass-by-reference bool pointer argument, not just say
> > > that I/O errors mean that the port is unlocked.
> >
> > Again the wish to be able to call it from if statement checks,.
> >
> > > > +     reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
> > > > +     if (locked) {
> > > > +             reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
> > >
> > > I'd suggest aligning these macros vertically.
> >
> > They are according to the Linux kernel coding standard wrt indentation afaik.
>
> Compare:
>
>                 reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
>                         MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
>                         MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
>                         MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
>
> with:
>
>                 reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
>                        MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
>                        MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
>                        MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;

I cannot see the difference here...?

Another issue...

I have removed the timers as they are superfluous and now just use the
worker and jiffies. But I have found that the whole ageing time seems
to be broken on the 5.17 kernel I am running. I don't know if it has
been fixed, but the ageing timeout is supposed to be given in seconds.
Here is the output from various functions after the command "ip link
set dev br0 type bridge ageing_time 1500" (that is nominally 1500
seconds according to man page!):

dsa_switch_ageing_time: ageing_time 10000, ageing_time_min 1000,
ageing_time_max 3825000
mv88e6xxx_set_ageing_time: set ageing time to 10000
br0: failed (err=-34) to set attribute (id=6)
dsa_switch_ageing_time: ageing_time 15000, ageing_time_min 1000,
ageing_time_max 3825000
mv88e6xxx_set_ageing_time: set ageing time to 15000

The 15000 set corresponds to 150 seconds! (I hardcoded the dsa
ageing_time_min to 1000)

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-07-06 10:12           ` Hans S
  0 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-06 10:12 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, David S. Miller, linux-kernel

On Wed, Jul 6, 2022 at 10:56 AM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Tue, Jun 28, 2022 at 02:26:43PM +0200, Hans S wrote:
> > > Dumb question: if you only flush the locked entries at fast age if the
> > > port is locked, then what happens with the existing locked entries if
> > > the port becomes unlocked before an FDB flush takes place?
> > > Shouldn't mv88e6xxx_port_set_lock() call mv88e6xxx_atu_locked_entry_flush()
> > > too?

BTW:
>> @@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
>>       if (err)
>>               dev_err(chip->dev,
>>                       "p%d: failed to force MAC link down\n", port);
>> +     else
>> +             if (mv88e6xxx_port_is_locked(chip, port, true))
>> +                     mv88e6xxx_atu_locked_entry_flush(ds, port);
>
>This is superfluous, is it not? The bridge will transition a port whose
>link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
>fast-age the dynamic FDB entries on the port, which you've already
>handled below.

I removed this code, but then on link down the locked entries were not
cleared out. Something not as thought?

> >
> > That was my first thought too, but the way the flags are handled with the mask etc, does so that
> > mv88e6xxx_port_set_lock() is called when other flags change. It could be done by the transition
> > from locked->unlocked by checking if the port is locked already.
>
> Why does mv88e6xxx_port_set_lock() get called when other flags change?

That is what seems to happen, but maybe I am wrong. Looking at the dsa
functions dsa_port_inherit_brport_flags() and
dsa_port_clear_brport_flags(), they set the mask for which underlying
function is called, and it looks to me that they call once for all the
flags: BR_LEARNING, BR_FLOOD, BR_MCAST_FLOOD, BR_BCAST_FLOOD,
BR_PORT_LOCKED?

>
> > On the other hand, the timers will timeout and the entries will be removed anyhow.
>
> > > > +static void mv88e6xxx_atu_locked_entry_timer_work(struct atu_locked_entry *ale)
> > >
> > > Please find a more adequate name for this function.
> >
> > Any suggestions?
>
> Not sure. It depends on whether you leave just the logic to delete a
> locked ATU entry, or also the switchdev FDB_DEL_TO_BRIDGE notifier.
> In any case, pick a name that reflects what it does. Something with
> locked_entry_delete() can't be too wrong.
>
> > > From the discussion with Ido and Nikolay I get the impression that
> > > you're not doing the right thing here either, notifying a
> > > SWITCHDEV_FDB_DEL_TO_BRIDGE from what is effectively the
> > > SWITCHDEV_FDB_DEL_TO_DEVICE handler (port_fdb_del).
> >
> > Hmm, my experience tells me that much is opposite the normal
> > conventions when dealing with
> > locked ports, as there was never switchdev notifications from the
> > driver to the bridge before, but
> > that is needed to keep ATU and FDB entries in sync.
>
> On delete you mean? So the bridge signals switchdev a deletion of a
> locked FDB entry (as I pointed out, this function gets indirectly called
> from port_fdb_del), but it won't get deleted until switchdev signals it
> back, is what you're saying?
>

When added they are added with bridge FDB flags: extern_learn offload
locked, with a SWITCHDEV_FDB_ADD_TO_BRIDGE event. So they are owned by
the driver.
When the driver deletes the locked entry the bridge FDB entry is
removes by the SWITCHDEV_FDB_DEL_TO_BRIDGE event from the driver. That
seems quite fair?

> > > Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
> > > Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().
> >
> > Just a small optimization as I also have another case of the switch
> > (only one switch case if
> > you didn't notice) belonging to the next patch set regarding dynamic
> > ATU entries.
>
> What kind of optimization are you even talking about? Please get rid of
> coding patterns like this, sorry.
>
Right!

> > > Please, no "if (chiplock) mutex_lock()" hacks. Just lockdep_assert_held(&chip->reg_lock),
> > > which serves both for documentation and for validation purposes, ensure
> > > the lock is always taken at the caller (which in this case is super easy)
> > > and move on.
> >
> > As I am calling the function in if statement checks, it would make
> > that code more messy, while with
> > this approach the function can be called from anywhere. I also looked
> > at having two functions, with
> > one being a wrapper function taking the lock and calling the other...
>
> There are many functions in mv88e6xxx that require the reg_lock to be
> held, there's nothing new or special here.
>

Now I take the lock in the function regardless. No boolean.

> > >
> > > > +
> > > > +     if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
> > > > +             goto out;
> > >
> > > It would be good to actually propagate the error to the caller and
> > > "locked" via a pass-by-reference bool pointer argument, not just say
> > > that I/O errors mean that the port is unlocked.
> >
> > Again the wish to be able to call it from if statement checks,.
> >
> > > > +     reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
> > > > +     if (locked) {
> > > > +             reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
> > >
> > > I'd suggest aligning these macros vertically.
> >
> > They are according to the Linux kernel coding standard wrt indentation afaik.
>
> Compare:
>
>                 reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
>                         MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
>                         MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
>                         MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
>
> with:
>
>                 reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
>                        MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
>                        MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
>                        MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;

I cannot see the difference here...?

Another issue...

I have removed the timers as they are superfluous and now just use the
worker and jiffies. But I have found that the whole ageing time seems
to be broken on the 5.17 kernel I am running. I don't know if it has
been fixed, but the ageing timeout is supposed to be given in seconds.
Here is the output from various functions after the command "ip link
set dev br0 type bridge ageing_time 1500" (that is nominally 1500
seconds according to man page!):

dsa_switch_ageing_time: ageing_time 10000, ageing_time_min 1000,
ageing_time_max 3825000
mv88e6xxx_set_ageing_time: set ageing time to 10000
br0: failed (err=-34) to set attribute (id=6)
dsa_switch_ageing_time: ageing_time 15000, ageing_time_min 1000,
ageing_time_max 3825000
mv88e6xxx_set_ageing_time: set ageing time to 15000

The 15000 set corresponds to 150 seconds! (I hardcoded the dsa
ageing_time_min to 1000)

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-07-05 15:05         ` [Bridge] " Hans S
@ 2022-07-06 13:28           ` Vladimir Oltean
  -1 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-06 13:28 UTC (permalink / raw)
  To: Hans S
  Cc: David S. Miller, Jakub Kicinski, netdev, Hans Schultz,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Nikolay Aleksandrov, Shuah Khan, Daniel Borkmann, Ido Schimmel,
	linux-kernel, bridge, linux-kselftest

On Tue, Jul 05, 2022 at 05:05:52PM +0200, Hans S wrote:
> Hi, does anybody know what it going on with this variable?
> struct dsa_port *dp ->ageing_time;
> 
> I experience that it changes value like a factor ~10 at times.

Could you be a bit more specific? Are you talking about STP Topology
Change Notification BPDUs, which trigger this code path?

diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index 7d27b2e6038f..9b25bc2dcb3e 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -671,10 +671,10 @@ void __br_set_topology_change(struct net_bridge *br, unsigned char val)
 
 		if (val) {
 			t = 2 * br->forward_delay;
-			br_debug(br, "decreasing ageing time to %lu\n", t);
+			br_info(br, "decreasing ageing time to %lu\n", t);
 		} else {
 			t = br->bridge_ageing_time;
-			br_debug(br, "restoring ageing time to %lu\n", t);
+			br_info(br, "restoring ageing time to %lu\n", t);
 		}
 
 		err = __set_ageing_time(br->dev, t);

Coincidentally the default values of 2 * br->forward_delay and br->bridge_ageing_time
are 1 order of magnitude apart from each other.

[  139.998310] br0: topology change detected, propagating
[  140.003490] br0: decreasing ageing time to 3000
[  175.193054] br0: restoring ageing time to 30000

What's the problem anyway?

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-07-06 13:28           ` Vladimir Oltean
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-06 13:28 UTC (permalink / raw)
  To: Hans S
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, David S. Miller, linux-kernel

On Tue, Jul 05, 2022 at 05:05:52PM +0200, Hans S wrote:
> Hi, does anybody know what it going on with this variable?
> struct dsa_port *dp ->ageing_time;
> 
> I experience that it changes value like a factor ~10 at times.

Could you be a bit more specific? Are you talking about STP Topology
Change Notification BPDUs, which trigger this code path?

diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index 7d27b2e6038f..9b25bc2dcb3e 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -671,10 +671,10 @@ void __br_set_topology_change(struct net_bridge *br, unsigned char val)
 
 		if (val) {
 			t = 2 * br->forward_delay;
-			br_debug(br, "decreasing ageing time to %lu\n", t);
+			br_info(br, "decreasing ageing time to %lu\n", t);
 		} else {
 			t = br->bridge_ageing_time;
-			br_debug(br, "restoring ageing time to %lu\n", t);
+			br_info(br, "restoring ageing time to %lu\n", t);
 		}
 
 		err = __set_ageing_time(br->dev, t);

Coincidentally the default values of 2 * br->forward_delay and br->bridge_ageing_time
are 1 order of magnitude apart from each other.

[  139.998310] br0: topology change detected, propagating
[  140.003490] br0: decreasing ageing time to 3000
[  175.193054] br0: restoring ageing time to 30000

What's the problem anyway?

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-07-06 13:28           ` [Bridge] " Vladimir Oltean
@ 2022-07-06 13:48             ` Hans S
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-06 13:48 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: David S. Miller, Jakub Kicinski, netdev, Hans Schultz,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Nikolay Aleksandrov, Shuah Khan, Daniel Borkmann, Ido Schimmel,
	linux-kernel, bridge, linux-kselftest

On Wed, Jul 6, 2022 at 3:28 PM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Tue, Jul 05, 2022 at 05:05:52PM +0200, Hans S wrote:
> > Hi, does anybody know what it going on with this variable?
> > struct dsa_port *dp ->ageing_time;
> >
> > I experience that it changes value like a factor ~10 at times.
>
> Could you be a bit more specific? Are you talking about STP Topology
> Change Notification BPDUs, which trigger this code path?
>
> diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
> index 7d27b2e6038f..9b25bc2dcb3e 100644
> --- a/net/bridge/br_stp.c
> +++ b/net/bridge/br_stp.c
> @@ -671,10 +671,10 @@ void __br_set_topology_change(struct net_bridge *br, unsigned char val)
>
>                 if (val) {
>                         t = 2 * br->forward_delay;
> -                       br_debug(br, "decreasing ageing time to %lu\n", t);
> +                       br_info(br, "decreasing ageing time to %lu\n", t);
>                 } else {
>                         t = br->bridge_ageing_time;
> -                       br_debug(br, "restoring ageing time to %lu\n", t);
> +                       br_info(br, "restoring ageing time to %lu\n", t);
>                 }
>
>                 err = __set_ageing_time(br->dev, t);
>
> Coincidentally the default values of 2 * br->forward_delay and br->bridge_ageing_time
> are 1 order of magnitude apart from each other.
>
> [  139.998310] br0: topology change detected, propagating
> [  140.003490] br0: decreasing ageing time to 3000
> [  175.193054] br0: restoring ageing time to 30000
>
> What's the problem anyway?

It might be a topology change as you indicate, though I am not sure.
So I am not using that variable any more for determining the ageing
time for the locked FDB entries, but instead I have made a function to
read the time from the chip instead.

The problem with that, I have mentioned in my latest reply to the
mac-auth patch set...

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-07-06 13:48             ` Hans S
  0 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-06 13:48 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, David S. Miller, linux-kernel

On Wed, Jul 6, 2022 at 3:28 PM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Tue, Jul 05, 2022 at 05:05:52PM +0200, Hans S wrote:
> > Hi, does anybody know what it going on with this variable?
> > struct dsa_port *dp ->ageing_time;
> >
> > I experience that it changes value like a factor ~10 at times.
>
> Could you be a bit more specific? Are you talking about STP Topology
> Change Notification BPDUs, which trigger this code path?
>
> diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
> index 7d27b2e6038f..9b25bc2dcb3e 100644
> --- a/net/bridge/br_stp.c
> +++ b/net/bridge/br_stp.c
> @@ -671,10 +671,10 @@ void __br_set_topology_change(struct net_bridge *br, unsigned char val)
>
>                 if (val) {
>                         t = 2 * br->forward_delay;
> -                       br_debug(br, "decreasing ageing time to %lu\n", t);
> +                       br_info(br, "decreasing ageing time to %lu\n", t);
>                 } else {
>                         t = br->bridge_ageing_time;
> -                       br_debug(br, "restoring ageing time to %lu\n", t);
> +                       br_info(br, "restoring ageing time to %lu\n", t);
>                 }
>
>                 err = __set_ageing_time(br->dev, t);
>
> Coincidentally the default values of 2 * br->forward_delay and br->bridge_ageing_time
> are 1 order of magnitude apart from each other.
>
> [  139.998310] br0: topology change detected, propagating
> [  140.003490] br0: decreasing ageing time to 3000
> [  175.193054] br0: restoring ageing time to 30000
>
> What's the problem anyway?

It might be a topology change as you indicate, though I am not sure.
So I am not using that variable any more for determining the ageing
time for the locked FDB entries, but instead I have made a function to
read the time from the chip instead.

The problem with that, I have mentioned in my latest reply to the
mac-auth patch set...

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-07-06 10:12           ` [Bridge] " Hans S
@ 2022-07-06 14:23             ` Hans S
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-06 14:23 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: David S. Miller, Jakub Kicinski, netdev, Hans Schultz,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Nikolay Aleksandrov, Shuah Khan, Daniel Borkmann, Ido Schimmel,
	linux-kernel, bridge, linux-kselftest

> >> @@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
> >>       if (err)
> >>               dev_err(chip->dev,
> >>                       "p%d: failed to force MAC link down\n", port);
> >> +     else
> >> +             if (mv88e6xxx_port_is_locked(chip, port, true))
> >> +                     mv88e6xxx_atu_locked_entry_flush(ds, port);
> >
> >This is superfluous, is it not? The bridge will transition a port whose
> >link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
> >fast-age the dynamic FDB entries on the port, which you've already
> >handled below.
>
> I removed this code, but then on link down the locked entries were not
> cleared out. Something not as thought?

I don't see a fast ageing happening on link down. There is the two cases:
1. Soft link down

With iproute2 command the link is brought down and
mv88e6xxx_mac_link_down() is called with rtnl lock taken.

2. Hard link down

I remove the cable from the port and mv88e6xxx_mac_link_down() is
called without rtnl lock.

As the hard link down case calls without rtnl lock, either I trigger
the case you have mentioned or I have to use rtnl_is_locked()
somewhere along the line?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-07-06 14:23             ` Hans S
  0 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-06 14:23 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, David S. Miller, linux-kernel

> >> @@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
> >>       if (err)
> >>               dev_err(chip->dev,
> >>                       "p%d: failed to force MAC link down\n", port);
> >> +     else
> >> +             if (mv88e6xxx_port_is_locked(chip, port, true))
> >> +                     mv88e6xxx_atu_locked_entry_flush(ds, port);
> >
> >This is superfluous, is it not? The bridge will transition a port whose
> >link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
> >fast-age the dynamic FDB entries on the port, which you've already
> >handled below.
>
> I removed this code, but then on link down the locked entries were not
> cleared out. Something not as thought?

I don't see a fast ageing happening on link down. There is the two cases:
1. Soft link down

With iproute2 command the link is brought down and
mv88e6xxx_mac_link_down() is called with rtnl lock taken.

2. Hard link down

I remove the cable from the port and mv88e6xxx_mac_link_down() is
called without rtnl lock.

As the hard link down case calls without rtnl lock, either I trigger
the case you have mentioned or I have to use rtnl_is_locked()
somewhere along the line?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-07-06 10:12           ` [Bridge] " Hans S
@ 2022-07-06 14:33             ` Vladimir Oltean
  -1 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-06 14:33 UTC (permalink / raw)
  To: Hans S
  Cc: David S. Miller, Jakub Kicinski, netdev, Hans Schultz,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Nikolay Aleksandrov, Shuah Khan, Daniel Borkmann, Ido Schimmel,
	linux-kernel, bridge, linux-kselftest

On Wed, Jul 06, 2022 at 12:12:01PM +0200, Hans S wrote:
> On Wed, Jul 6, 2022 at 10:56 AM Vladimir Oltean <olteanv@gmail.com> wrote:
> >
> > On Tue, Jun 28, 2022 at 02:26:43PM +0200, Hans S wrote:
> > > > Dumb question: if you only flush the locked entries at fast age if the
> > > > port is locked, then what happens with the existing locked entries if
> > > > the port becomes unlocked before an FDB flush takes place?
> > > > Shouldn't mv88e6xxx_port_set_lock() call mv88e6xxx_atu_locked_entry_flush()
> > > > too?
> 
> BTW:
> >> @@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
> >>       if (err)
> >>               dev_err(chip->dev,
> >>                       "p%d: failed to force MAC link down\n", port);
> >> +     else
> >> +             if (mv88e6xxx_port_is_locked(chip, port, true))
> >> +                     mv88e6xxx_atu_locked_entry_flush(ds, port);
> >
> >This is superfluous, is it not? The bridge will transition a port whose
> >link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
> >fast-age the dynamic FDB entries on the port, which you've already
> >handled below.
> 
> I removed this code, but then on link down the locked entries were not
> cleared out. Something not as thought?

What was the port's STP state before the link down event, and did it
change after the link down?

The relevant code in DSA is:

int dsa_port_set_state(struct dsa_port *dp, u8 state, bool do_fast_age)
{
	struct dsa_switch *ds = dp->ds;
	int port = dp->index;

	if (!ds->ops->port_stp_state_set)
		return -EOPNOTSUPP;

	ds->ops->port_stp_state_set(ds, port, state);

	if (!dsa_port_can_configure_learning(dp) ||
	    (do_fast_age && dp->learning)) {
		/* Fast age FDB entries or flush appropriate forwarding database
		 * for the given port, if we are moving it from Learning or
		 * Forwarding state, to Disabled or Blocking or Listening state.
		 * Ports that were standalone before the STP state change don't
		 * need to fast age the FDB, since address learning is off in
		 * standalone mode.
		 */

		if ((dp->stp_state == BR_STATE_LEARNING ||
		     dp->stp_state == BR_STATE_FORWARDING) &&
		    (state == BR_STATE_DISABLED ||
		     state == BR_STATE_BLOCKING ||
		     state == BR_STATE_LISTENING))
			dsa_port_fast_age(dp);
	}

	dp->stp_state = state;

	return 0;
}

If the STP state wasn't LEARNING or FORWARDING, there weren't supposed
to be dynamic FDB entries on the port in the first place, so DSA says
there's nothing to flush, and doesn't call dsa_port_fast_age().
Are there dynamic FDB entries being installed on a port that isn't
in a state that's supposed to learn? I guess the answer is yes.
Is that what you want, or should the locked entries be recorded only in
the LEARNING or FORWARDING states, otherwise discarded?

> > > That was my first thought too, but the way the flags are handled with the mask etc, does so that
> > > mv88e6xxx_port_set_lock() is called when other flags change. It could be done by the transition
> > > from locked->unlocked by checking if the port is locked already.
> >
> > Why does mv88e6xxx_port_set_lock() get called when other flags change?
> 
> That is what seems to happen, but maybe I am wrong. Looking at the dsa
> functions dsa_port_inherit_brport_flags() and
> dsa_port_clear_brport_flags(), they set the mask for which underlying
> function is called, and it looks to me that they call once for all the
> flags: BR_LEARNING, BR_FLOOD, BR_MCAST_FLOOD, BR_BCAST_FLOOD,
> BR_PORT_LOCKED?

What you actually want to say is: "mv88e6xxx_port_set_lock() is also
called when the DSA port joins a bridge, due to the switchdev attribute
replay logic present in dsa_port_switchdev_sync_attrs()".

Which, by the way, is logic that you've added yourself, in commit
b9e8b58fd2cb ("net: dsa: Include BR_PORT_LOCKED in the list of synced
brport flags") ;)

You are free to return early from mv88e6xxx_port_set_lock() if nothing has
changed. The DSA layer doesn't keep track of the locked state of the
port so it cannot deduce whether propagating to the switch driver is
necessary or not.

> > > > From the discussion with Ido and Nikolay I get the impression that
> > > > you're not doing the right thing here either, notifying a
> > > > SWITCHDEV_FDB_DEL_TO_BRIDGE from what is effectively the
> > > > SWITCHDEV_FDB_DEL_TO_DEVICE handler (port_fdb_del).
> > >
> > > Hmm, my experience tells me that much is opposite the normal
> > > conventions when dealing with
> > > locked ports, as there was never switchdev notifications from the
> > > driver to the bridge before, but
> > > that is needed to keep ATU and FDB entries in sync.
> >
> > On delete you mean? So the bridge signals switchdev a deletion of a
> > locked FDB entry (as I pointed out, this function gets indirectly called
> > from port_fdb_del), but it won't get deleted until switchdev signals it
> > back, is what you're saying?
> >
> 
> When added they are added with bridge FDB flags: extern_learn offload
> locked, with a SWITCHDEV_FDB_ADD_TO_BRIDGE event. So they are owned by
> the driver.
> When the driver deletes the locked entry the bridge FDB entry is
> removes by the SWITCHDEV_FDB_DEL_TO_BRIDGE event from the driver. That
> seems quite fair?

I'm just pointing out that you left other (probably unintended) code
paths for which the SWITCHDEV_FDB_DEL_TO_BRIDGE notifier is quite
useless. I haven't yet looked at your newest revision to see what
changed there.

> > > > Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
> > > > Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().
> > >
> > > Just a small optimization as I also have another case of the switch
> > > (only one switch case if
> > > you didn't notice) belonging to the next patch set regarding dynamic
> > > ATU entries.
> >
> > What kind of optimization are you even talking about? Please get rid of
> > coding patterns like this, sorry.
> >
> Right!

Right what? I'm genuinely curious what optimization are you talking about.

> > > > > +
> > > > > +     if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
> > > > > +             goto out;
> > > >
> > > > It would be good to actually propagate the error to the caller and
> > > > "locked" via a pass-by-reference bool pointer argument, not just say
> > > > that I/O errors mean that the port is unlocked.
> > >
> > > Again the wish to be able to call it from if statement checks,.
> > >
> > > > > +     reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
> > > > > +     if (locked) {
> > > > > +             reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> > > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> > > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> > > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
> > > >
> > > > I'd suggest aligning these macros vertically.
> > >
> > > They are according to the Linux kernel coding standard wrt indentation afaik.
> >
> > Compare:
> >
> >                 reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> >                         MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> >                         MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> >                         MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
> >
> > with:
> >
> >                 reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> >                        MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> >                        MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> >                        MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
> 
> I cannot see the difference here...?

Just out of curiosity, are you even trying, are you looking at the
difference using a monospace font?

> Another issue...
> 
> I have removed the timers as they are superfluous and now just use the
> worker and jiffies. But I have found that the whole ageing time seems
> to be broken on the 5.17 kernel I am running. I don't know if it has
> been fixed, but the ageing timeout is supposed to be given in seconds.
> Here is the output from various functions after the command "ip link
> set dev br0 type bridge ageing_time 1500" (that is nominally 1500
> seconds according to man page!):
> 
> dsa_switch_ageing_time: ageing_time 10000, ageing_time_min 1000,
> ageing_time_max 3825000
> mv88e6xxx_set_ageing_time: set ageing time to 10000
> br0: failed (err=-34) to set attribute (id=6)
> dsa_switch_ageing_time: ageing_time 15000, ageing_time_min 1000,
> ageing_time_max 3825000
> mv88e6xxx_set_ageing_time: set ageing time to 15000
> 
> The 15000 set corresponds to 150 seconds! (I hardcoded the dsa
> ageing_time_min to 1000)

Are you talking about this known problem, that the ageing time values in
seconds need to be scaled up by a factor of USER_HZ when passed to the
kernel?
https://www.spinics.net/lists/netdev/msg672070.html
https://www.spinics.net/lists/netdev/msg567332.html

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-07-06 14:33             ` Vladimir Oltean
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-06 14:33 UTC (permalink / raw)
  To: Hans S
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, David S. Miller, linux-kernel

On Wed, Jul 06, 2022 at 12:12:01PM +0200, Hans S wrote:
> On Wed, Jul 6, 2022 at 10:56 AM Vladimir Oltean <olteanv@gmail.com> wrote:
> >
> > On Tue, Jun 28, 2022 at 02:26:43PM +0200, Hans S wrote:
> > > > Dumb question: if you only flush the locked entries at fast age if the
> > > > port is locked, then what happens with the existing locked entries if
> > > > the port becomes unlocked before an FDB flush takes place?
> > > > Shouldn't mv88e6xxx_port_set_lock() call mv88e6xxx_atu_locked_entry_flush()
> > > > too?
> 
> BTW:
> >> @@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
> >>       if (err)
> >>               dev_err(chip->dev,
> >>                       "p%d: failed to force MAC link down\n", port);
> >> +     else
> >> +             if (mv88e6xxx_port_is_locked(chip, port, true))
> >> +                     mv88e6xxx_atu_locked_entry_flush(ds, port);
> >
> >This is superfluous, is it not? The bridge will transition a port whose
> >link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
> >fast-age the dynamic FDB entries on the port, which you've already
> >handled below.
> 
> I removed this code, but then on link down the locked entries were not
> cleared out. Something not as thought?

What was the port's STP state before the link down event, and did it
change after the link down?

The relevant code in DSA is:

int dsa_port_set_state(struct dsa_port *dp, u8 state, bool do_fast_age)
{
	struct dsa_switch *ds = dp->ds;
	int port = dp->index;

	if (!ds->ops->port_stp_state_set)
		return -EOPNOTSUPP;

	ds->ops->port_stp_state_set(ds, port, state);

	if (!dsa_port_can_configure_learning(dp) ||
	    (do_fast_age && dp->learning)) {
		/* Fast age FDB entries or flush appropriate forwarding database
		 * for the given port, if we are moving it from Learning or
		 * Forwarding state, to Disabled or Blocking or Listening state.
		 * Ports that were standalone before the STP state change don't
		 * need to fast age the FDB, since address learning is off in
		 * standalone mode.
		 */

		if ((dp->stp_state == BR_STATE_LEARNING ||
		     dp->stp_state == BR_STATE_FORWARDING) &&
		    (state == BR_STATE_DISABLED ||
		     state == BR_STATE_BLOCKING ||
		     state == BR_STATE_LISTENING))
			dsa_port_fast_age(dp);
	}

	dp->stp_state = state;

	return 0;
}

If the STP state wasn't LEARNING or FORWARDING, there weren't supposed
to be dynamic FDB entries on the port in the first place, so DSA says
there's nothing to flush, and doesn't call dsa_port_fast_age().
Are there dynamic FDB entries being installed on a port that isn't
in a state that's supposed to learn? I guess the answer is yes.
Is that what you want, or should the locked entries be recorded only in
the LEARNING or FORWARDING states, otherwise discarded?

> > > That was my first thought too, but the way the flags are handled with the mask etc, does so that
> > > mv88e6xxx_port_set_lock() is called when other flags change. It could be done by the transition
> > > from locked->unlocked by checking if the port is locked already.
> >
> > Why does mv88e6xxx_port_set_lock() get called when other flags change?
> 
> That is what seems to happen, but maybe I am wrong. Looking at the dsa
> functions dsa_port_inherit_brport_flags() and
> dsa_port_clear_brport_flags(), they set the mask for which underlying
> function is called, and it looks to me that they call once for all the
> flags: BR_LEARNING, BR_FLOOD, BR_MCAST_FLOOD, BR_BCAST_FLOOD,
> BR_PORT_LOCKED?

What you actually want to say is: "mv88e6xxx_port_set_lock() is also
called when the DSA port joins a bridge, due to the switchdev attribute
replay logic present in dsa_port_switchdev_sync_attrs()".

Which, by the way, is logic that you've added yourself, in commit
b9e8b58fd2cb ("net: dsa: Include BR_PORT_LOCKED in the list of synced
brport flags") ;)

You are free to return early from mv88e6xxx_port_set_lock() if nothing has
changed. The DSA layer doesn't keep track of the locked state of the
port so it cannot deduce whether propagating to the switch driver is
necessary or not.

> > > > From the discussion with Ido and Nikolay I get the impression that
> > > > you're not doing the right thing here either, notifying a
> > > > SWITCHDEV_FDB_DEL_TO_BRIDGE from what is effectively the
> > > > SWITCHDEV_FDB_DEL_TO_DEVICE handler (port_fdb_del).
> > >
> > > Hmm, my experience tells me that much is opposite the normal
> > > conventions when dealing with
> > > locked ports, as there was never switchdev notifications from the
> > > driver to the bridge before, but
> > > that is needed to keep ATU and FDB entries in sync.
> >
> > On delete you mean? So the bridge signals switchdev a deletion of a
> > locked FDB entry (as I pointed out, this function gets indirectly called
> > from port_fdb_del), but it won't get deleted until switchdev signals it
> > back, is what you're saying?
> >
> 
> When added they are added with bridge FDB flags: extern_learn offload
> locked, with a SWITCHDEV_FDB_ADD_TO_BRIDGE event. So they are owned by
> the driver.
> When the driver deletes the locked entry the bridge FDB entry is
> removes by the SWITCHDEV_FDB_DEL_TO_BRIDGE event from the driver. That
> seems quite fair?

I'm just pointing out that you left other (probably unintended) code
paths for which the SWITCHDEV_FDB_DEL_TO_BRIDGE notifier is quite
useless. I haven't yet looked at your newest revision to see what
changed there.

> > > > Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
> > > > Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().
> > >
> > > Just a small optimization as I also have another case of the switch
> > > (only one switch case if
> > > you didn't notice) belonging to the next patch set regarding dynamic
> > > ATU entries.
> >
> > What kind of optimization are you even talking about? Please get rid of
> > coding patterns like this, sorry.
> >
> Right!

Right what? I'm genuinely curious what optimization are you talking about.

> > > > > +
> > > > > +     if (mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, &reg))
> > > > > +             goto out;
> > > >
> > > > It would be good to actually propagate the error to the caller and
> > > > "locked" via a pass-by-reference bool pointer argument, not just say
> > > > that I/O errors mean that the port is unlocked.
> > >
> > > Again the wish to be able to call it from if statement checks,.
> > >
> > > > > +     reg &= MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK;
> > > > > +     if (locked) {
> > > > > +             reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> > > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> > > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> > > > > +                     MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
> > > >
> > > > I'd suggest aligning these macros vertically.
> > >
> > > They are according to the Linux kernel coding standard wrt indentation afaik.
> >
> > Compare:
> >
> >                 reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> >                         MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> >                         MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> >                         MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
> >
> > with:
> >
> >                 reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
> >                        MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT |
> >                        MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT |
> >                        MV88E6XXX_PORT_ASSOC_VECTOR_HOLD_AT_1;
> 
> I cannot see the difference here...?

Just out of curiosity, are you even trying, are you looking at the
difference using a monospace font?

> Another issue...
> 
> I have removed the timers as they are superfluous and now just use the
> worker and jiffies. But I have found that the whole ageing time seems
> to be broken on the 5.17 kernel I am running. I don't know if it has
> been fixed, but the ageing timeout is supposed to be given in seconds.
> Here is the output from various functions after the command "ip link
> set dev br0 type bridge ageing_time 1500" (that is nominally 1500
> seconds according to man page!):
> 
> dsa_switch_ageing_time: ageing_time 10000, ageing_time_min 1000,
> ageing_time_max 3825000
> mv88e6xxx_set_ageing_time: set ageing time to 10000
> br0: failed (err=-34) to set attribute (id=6)
> dsa_switch_ageing_time: ageing_time 15000, ageing_time_min 1000,
> ageing_time_max 3825000
> mv88e6xxx_set_ageing_time: set ageing time to 15000
> 
> The 15000 set corresponds to 150 seconds! (I hardcoded the dsa
> ageing_time_min to 1000)

Are you talking about this known problem, that the ageing time values in
seconds need to be scaled up by a factor of USER_HZ when passed to the
kernel?
https://www.spinics.net/lists/netdev/msg672070.html
https://www.spinics.net/lists/netdev/msg567332.html

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-07-06 14:33             ` [Bridge] " Vladimir Oltean
@ 2022-07-06 15:38               ` Hans S
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-06 15:38 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: David S. Miller, Jakub Kicinski, netdev, Hans Schultz,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Nikolay Aleksandrov, Shuah Khan, Daniel Borkmann, Ido Schimmel,
	linux-kernel, bridge, linux-kselftest

On Wed, Jul 6, 2022 at 4:33 PM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> > >> @@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
> > >>       if (err)
> > >>               dev_err(chip->dev,
> > >>                       "p%d: failed to force MAC link down\n", port);
> > >> +     else
> > >> +             if (mv88e6xxx_port_is_locked(chip, port, true))
> > >> +                     mv88e6xxx_atu_locked_entry_flush(ds, port);
> > >
> > >This is superfluous, is it not? The bridge will transition a port whose
> > >link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
> > >fast-age the dynamic FDB entries on the port, which you've already
> > >handled below.
> >
> > I removed this code, but then on link down the locked entries were not
> > cleared out. Something not as thought?
>
> What was the port's STP state before the link down event, and did it
> change after the link down?

The stp state is FORWARDING.

>
> If the STP state wasn't LEARNING or FORWARDING, there weren't supposed
> to be dynamic FDB entries on the port in the first place, so DSA says
> there's nothing to flush, and doesn't call dsa_port_fast_age().
> Are there dynamic FDB entries being installed on a port that isn't
> in a state that's supposed to learn? I guess the answer is yes.
> Is that what you want, or should the locked entries be recorded only in
> the LEARNING or FORWARDING states, otherwise discarded?
>
Learning is off as has been discussed, and I do want the locked
entries to be dynamic in the sense that the driver removes them after
the system ageing time has passed.

>
> What you actually want to say is: "mv88e6xxx_port_set_lock() is also
> called when the DSA port joins a bridge, due to the switchdev attribute
> replay logic present in dsa_port_switchdev_sync_attrs()".
>
> Which, by the way, is logic that you've added yourself, in commit
> b9e8b58fd2cb ("net: dsa: Include BR_PORT_LOCKED in the list of synced
> brport flags") ;)
>
> You are free to return early from mv88e6xxx_port_set_lock() if nothing has
> changed. The DSA layer doesn't keep track of the locked state of the
> port so it cannot deduce whether propagating to the switch driver is
> necessary or not.
>

I think I can safely call mv88e6xxx_atu_locked_entry_flush() from
mv88e6xxx_port_set_lock() when locked is off as the port setup for the
respective port must have been completed successfully.

> > When added they are added with bridge FDB flags: extern_learn offload
> > locked, with a SWITCHDEV_FDB_ADD_TO_BRIDGE event. So they are owned by
> > the driver.
> > When the driver deletes the locked entry the bridge FDB entry is
> > removes by the SWITCHDEV_FDB_DEL_TO_BRIDGE event from the driver. That
> > seems quite fair?
>
> I'm just pointing out that you left other (probably unintended) code
> paths for which the SWITCHDEV_FDB_DEL_TO_BRIDGE notifier is quite
> useless. I haven't yet looked at your newest revision to see what
> changed there.
>

I guess I should add a boolean to tell if
mv88e6xxx_atu_locked_entry_purge() should send a notification or not.
So that port_fdb_del() will not cause a SWITCHDEV_FDB_DEL_TO_BRIDGE
event.

> > > > > Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
> > > > > Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().
> > > >
> > > > Just a small optimization as I also have another case of the switch
> > > > (only one switch case if
> > > > you didn't notice) belonging to the next patch set regarding dynamic
> > > > ATU entries.
> > >
> > > What kind of optimization are you even talking about? Please get rid of
> > > coding patterns like this, sorry.
> > >
> > Right!
>
> Right what? I'm genuinely curious what optimization are you talking about.
>

I am just confirming that what you wrote is correct, e.g. the
"Right!". So I have fixed that. :-)

>
> Just out of curiosity, are you even trying, are you looking at the
> difference using a monospace font?
>
> > Another issue...
> >
> > I have removed the timers as they are superfluous and now just use the
> > worker and jiffies. But I have found that the whole ageing time seems
> > to be broken on the 5.17 kernel I am running. I don't know if it has
> > been fixed, but the ageing timeout is supposed to be given in seconds.
> > Here is the output from various functions after the command "ip link
> > set dev br0 type bridge ageing_time 1500" (that is nominally 1500
> > seconds according to man page!):
> >
> > dsa_switch_ageing_time: ageing_time 10000, ageing_time_min 1000,
> > ageing_time_max 3825000
> > mv88e6xxx_set_ageing_time: set ageing time to 10000
> > br0: failed (err=-34) to set attribute (id=6)
> > dsa_switch_ageing_time: ageing_time 15000, ageing_time_min 1000,
> > ageing_time_max 3825000
> > mv88e6xxx_set_ageing_time: set ageing time to 15000
> >
> > The 15000 set corresponds to 150 seconds! (I hardcoded the dsa
> > ageing_time_min to 1000)
>
> Are you talking about this known problem, that the ageing time values in
> seconds need to be scaled up by a factor of USER_HZ when passed to the
> kernel?
> https://www.spinics.net/lists/netdev/msg672070.html
> https://www.spinics.net/lists/netdev/msg567332.html

It might be so, but there is another factor 10 which might be
regarding topology change as I understand. If I want a ageing timeout
of say 15 or 30 seconds, that hardly seems possible?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-07-06 15:38               ` Hans S
  0 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-06 15:38 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, David S. Miller, linux-kernel

On Wed, Jul 6, 2022 at 4:33 PM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> > >> @@ -919,6 +920,9 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port,
> > >>       if (err)
> > >>               dev_err(chip->dev,
> > >>                       "p%d: failed to force MAC link down\n", port);
> > >> +     else
> > >> +             if (mv88e6xxx_port_is_locked(chip, port, true))
> > >> +                     mv88e6xxx_atu_locked_entry_flush(ds, port);
> > >
> > >This is superfluous, is it not? The bridge will transition a port whose
> > >link goes down to BR_STATE_DISABLED, which will make dsa_port_set_state()
> > >fast-age the dynamic FDB entries on the port, which you've already
> > >handled below.
> >
> > I removed this code, but then on link down the locked entries were not
> > cleared out. Something not as thought?
>
> What was the port's STP state before the link down event, and did it
> change after the link down?

The stp state is FORWARDING.

>
> If the STP state wasn't LEARNING or FORWARDING, there weren't supposed
> to be dynamic FDB entries on the port in the first place, so DSA says
> there's nothing to flush, and doesn't call dsa_port_fast_age().
> Are there dynamic FDB entries being installed on a port that isn't
> in a state that's supposed to learn? I guess the answer is yes.
> Is that what you want, or should the locked entries be recorded only in
> the LEARNING or FORWARDING states, otherwise discarded?
>
Learning is off as has been discussed, and I do want the locked
entries to be dynamic in the sense that the driver removes them after
the system ageing time has passed.

>
> What you actually want to say is: "mv88e6xxx_port_set_lock() is also
> called when the DSA port joins a bridge, due to the switchdev attribute
> replay logic present in dsa_port_switchdev_sync_attrs()".
>
> Which, by the way, is logic that you've added yourself, in commit
> b9e8b58fd2cb ("net: dsa: Include BR_PORT_LOCKED in the list of synced
> brport flags") ;)
>
> You are free to return early from mv88e6xxx_port_set_lock() if nothing has
> changed. The DSA layer doesn't keep track of the locked state of the
> port so it cannot deduce whether propagating to the switch driver is
> necessary or not.
>

I think I can safely call mv88e6xxx_atu_locked_entry_flush() from
mv88e6xxx_port_set_lock() when locked is off as the port setup for the
respective port must have been completed successfully.

> > When added they are added with bridge FDB flags: extern_learn offload
> > locked, with a SWITCHDEV_FDB_ADD_TO_BRIDGE event. So they are owned by
> > the driver.
> > When the driver deletes the locked entry the bridge FDB entry is
> > removes by the SWITCHDEV_FDB_DEL_TO_BRIDGE event from the driver. That
> > seems quite fair?
>
> I'm just pointing out that you left other (probably unintended) code
> paths for which the SWITCHDEV_FDB_DEL_TO_BRIDGE notifier is quite
> useless. I haven't yet looked at your newest revision to see what
> changed there.
>

I guess I should add a boolean to tell if
mv88e6xxx_atu_locked_entry_purge() should send a notification or not.
So that port_fdb_del() will not cause a SWITCHDEV_FDB_DEL_TO_BRIDGE
event.

> > > > > Why is the rtnl_unlock() outside the switch statement but the rtnl_lock() inside?
> > > > > Not to mention, the dsa_port_to_bridge_port() call needs to be under rtnl_lock().
> > > >
> > > > Just a small optimization as I also have another case of the switch
> > > > (only one switch case if
> > > > you didn't notice) belonging to the next patch set regarding dynamic
> > > > ATU entries.
> > >
> > > What kind of optimization are you even talking about? Please get rid of
> > > coding patterns like this, sorry.
> > >
> > Right!
>
> Right what? I'm genuinely curious what optimization are you talking about.
>

I am just confirming that what you wrote is correct, e.g. the
"Right!". So I have fixed that. :-)

>
> Just out of curiosity, are you even trying, are you looking at the
> difference using a monospace font?
>
> > Another issue...
> >
> > I have removed the timers as they are superfluous and now just use the
> > worker and jiffies. But I have found that the whole ageing time seems
> > to be broken on the 5.17 kernel I am running. I don't know if it has
> > been fixed, but the ageing timeout is supposed to be given in seconds.
> > Here is the output from various functions after the command "ip link
> > set dev br0 type bridge ageing_time 1500" (that is nominally 1500
> > seconds according to man page!):
> >
> > dsa_switch_ageing_time: ageing_time 10000, ageing_time_min 1000,
> > ageing_time_max 3825000
> > mv88e6xxx_set_ageing_time: set ageing time to 10000
> > br0: failed (err=-34) to set attribute (id=6)
> > dsa_switch_ageing_time: ageing_time 15000, ageing_time_min 1000,
> > ageing_time_max 3825000
> > mv88e6xxx_set_ageing_time: set ageing time to 15000
> >
> > The 15000 set corresponds to 150 seconds! (I hardcoded the dsa
> > ageing_time_min to 1000)
>
> Are you talking about this known problem, that the ageing time values in
> seconds need to be scaled up by a factor of USER_HZ when passed to the
> kernel?
> https://www.spinics.net/lists/netdev/msg672070.html
> https://www.spinics.net/lists/netdev/msg567332.html

It might be so, but there is another factor 10 which might be
regarding topology change as I understand. If I want a ageing timeout
of say 15 or 30 seconds, that hardly seems possible?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-05-25 10:18                 ` [Bridge] " Nikolay Aleksandrov
@ 2022-07-06 18:13                   ` Vladimir Oltean
  -1 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-06 18:13 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz
  Cc: davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

Hi Nikolay,

On Wed, May 25, 2022 at 01:18:49PM +0300, Nikolay Aleksandrov wrote:
> >>>>>> Hi Hans,
> >>>>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
> >>>>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
> >>>>>> careful if we try to add any new synchronization not to affect performance as well.
> >>>>>> More below...
> >>>>>>
> >>>>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
> >>>>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
> >>>>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
> >>>>>>>  
> >>>>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
> >>>>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
> >>>>>>
> >>>>>> Sorry but you cannot do this for multiple reasons:
> >>>>>>  - f->dst can be NULL
> >>>>>>  - f->dst changes without any synchronization
> >>>>>>  - there is no synchronization between fdb's flags and its ->dst
> >>>>>>
> >>>>>> Cheers,
> >>>>>>  Nik
> >>>>>
> >>>>> Hi Nik,
> >>>>>
> >>>>> if a port is decoupled from the bridge, the locked entries would of
> >>>>> course be invalid, so maybe if adding and removing a port is accounted
> >>>>> for wrt locked entries and the count of locked entries, would that not
> >>>>> work?
> >>>>>
> >>>>> Best,
> >>>>> Hans
> >>>>
> >>>> Hi Hans,
> >>>> Unfortunately you need the correct amount of locked entries per-port if you want
> >>>> to limit their number per-port, instead of globally. So you need a
> >>>> consistent
> >>>
> >>> Hi Nik,
> >>> the used dst is a port structure, so it is per-port and not globally.
> >>>
> >>> Best,
> >>> Hans
> >>>
> >>
> >> Yeah, I know. :) That's why I wrote it, if the limit is not a feature requirement I'd suggest
> >> dropping it altogether, it can be enforced externally (e.g. from user-space) if needed.
> >>
> >> By the way just fyi net-next is closed right now due to merge window. And one more
> >> thing please include a short log of changes between versions when you send a new one.
> >> I had to go look for v2 to find out what changed.
> >>
> > 
> > Okay, I will drop the limit in the bridge module, which is an easy thing
> > to do. :) (It is mostly there to ensure against DOS attacks if someone
> > bombards a locked port with random mac addresses.)
> > I have a similar limitation in the driver, which should then probably be
> > dropped too?
> > 
> 
> That is up to you/driver, I'd try looking for similar problems in other switch drivers
> and check how those were handled. There are people in the CC above that can
> directly answer that. :)

Not sure whom you're referring to?

In fact I was pretty sure that I didn't see any OOM protection in the
source code of the Linux bridge driver itself either, so I wanted to
check that for myself, so I wrote a small "killswitch" program that's
supposed to, well, kill a switch. It took me a while to find a few free
hours to do the test, sorry for that.

https://github.com/vladimiroltean/killswitch/blob/master/src/killswitch.c

Sure enough, I can kill a Marvell Armada 3720 device with 1GB of RAM
within 3 minutes of running the test program.

[  273.864203] ksoftirqd/0: page allocation failure: order:0, mode:0x40a20(GFP_ATOMIC|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
[  273.876426] CPU: 0 PID: 12 Comm: ksoftirqd/0 Not tainted 5.18.7-rc1-00013-g52b92343db13 #74
[  273.884775] Hardware name: CZ.NIC Turris Mox Board (DT)
[  273.889994] Call trace:
[  273.892437]  dump_backtrace.part.0+0xc8/0xd4
[  273.896721]  show_stack+0x18/0x70
[  273.900039]  dump_stack_lvl+0x68/0x84
[  273.903703]  dump_stack+0x18/0x34
[  273.907017]  warn_alloc+0x114/0x1a0
[  273.910508]  __alloc_pages+0xbb0/0xbe0
[  273.914257]  cache_grow_begin+0x60/0x300
[  273.918183]  fallback_alloc+0x184/0x220
[  273.922017]  ____cache_alloc_node+0x174/0x190
[  273.926373]  kmem_cache_alloc+0x1a4/0x220
[  273.930381]  fdb_create+0x40/0x430
[  273.933784]  br_fdb_update+0x198/0x210
[  273.937532]  br_handle_frame_finish+0x244/0x530
[  273.942063]  br_handle_frame+0x1c0/0x270
[  273.945986]  __netif_receive_skb_core.constprop.0+0x29c/0xd30
[  273.951734]  __netif_receive_skb_list_core+0xe8/0x210
[  273.956784]  netif_receive_skb_list_internal+0x180/0x29c
[  273.962091]  napi_gro_receive+0x174/0x190
[  273.966099]  mvneta_rx_swbm+0x6b8/0xb40
[  273.969935]  mvneta_poll+0x684/0x900
[  273.973506]  __napi_poll+0x38/0x18c
[  273.976988]  net_rx_action+0xe8/0x280
[  273.980643]  __do_softirq+0x124/0x2a0
[  273.984299]  run_ksoftirqd+0x4c/0x60
[  273.987871]  smpboot_thread_fn+0x23c/0x270
[  273.991963]  kthread+0x10c/0x110
[  273.995188]  ret_from_fork+0x10/0x20

(followed by lots upon lots of vomiting, followed by ...)

[  311.138590] Out of memory and no killable processes...
[  311.143774] Kernel panic - not syncing: System is deadlocked on memory
[  311.150295] CPU: 0 PID: 6 Comm: kworker/0:0 Not tainted 5.18.7-rc1-00013-g52b92343db13 #74
[  311.158550] Hardware name: CZ.NIC Turris Mox Board (DT)
[  311.163766] Workqueue: events rht_deferred_worker
[  311.168477] Call trace:
[  311.170916]  dump_backtrace.part.0+0xc8/0xd4
[  311.175188]  show_stack+0x18/0x70
[  311.178501]  dump_stack_lvl+0x68/0x84
[  311.182159]  dump_stack+0x18/0x34
[  311.185466]  panic+0x168/0x328
[  311.188515]  out_of_memory+0x568/0x584
[  311.192261]  __alloc_pages+0xb04/0xbe0
[  311.196006]  __alloc_pages_bulk+0x15c/0x604
[  311.200185]  alloc_pages_bulk_array_mempolicy+0xbc/0x24c
[  311.205491]  __vmalloc_node_range+0x238/0x550
[  311.209843]  __vmalloc_node_range+0x1c0/0x550
[  311.214195]  kvmalloc_node+0xe0/0x124
[  311.217856]  bucket_table_alloc.isra.0+0x40/0x150
[  311.222554]  rhashtable_rehash_alloc.isra.0+0x20/0x8c
[  311.227599]  rht_deferred_worker+0x7c/0x540
[  311.231775]  process_one_work+0x1d0/0x320
[  311.235779]  worker_thread+0x70/0x440
[  311.239435]  kthread+0x10c/0x110
[  311.242661]  ret_from_fork+0x10/0x20
[  311.246238] SMP: stopping secondary CPUs
[  311.250161] Kernel Offset: disabled
[  311.253642] CPU features: 0x000,00020009,00001086
[  311.258338] Memory Limit: none
[  311.261390] ---[ end Kernel panic - not syncing: System is deadlocked on memory ]---

That can't be quite alright? Shouldn't we have some sort of protection
in the bridge itself too, not just tell hardware driver writers to deal
with it? Or is it somewhere, but it needs to be enabled/configured?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-07-06 18:13                   ` Vladimir Oltean
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-06 18:13 UTC (permalink / raw)
  To: Nikolay Aleksandrov, Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Ido Schimmel, bridge, linux-kernel,
	Vivien Didelot, Eric Dumazet, linux-kselftest, Roopa Prabhu,
	kuba, Paolo Abeni, Shuah Khan, davem

Hi Nikolay,

On Wed, May 25, 2022 at 01:18:49PM +0300, Nikolay Aleksandrov wrote:
> >>>>>> Hi Hans,
> >>>>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
> >>>>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
> >>>>>> careful if we try to add any new synchronization not to affect performance as well.
> >>>>>> More below...
> >>>>>>
> >>>>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
> >>>>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
> >>>>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
> >>>>>>>  
> >>>>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
> >>>>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
> >>>>>>
> >>>>>> Sorry but you cannot do this for multiple reasons:
> >>>>>>  - f->dst can be NULL
> >>>>>>  - f->dst changes without any synchronization
> >>>>>>  - there is no synchronization between fdb's flags and its ->dst
> >>>>>>
> >>>>>> Cheers,
> >>>>>>  Nik
> >>>>>
> >>>>> Hi Nik,
> >>>>>
> >>>>> if a port is decoupled from the bridge, the locked entries would of
> >>>>> course be invalid, so maybe if adding and removing a port is accounted
> >>>>> for wrt locked entries and the count of locked entries, would that not
> >>>>> work?
> >>>>>
> >>>>> Best,
> >>>>> Hans
> >>>>
> >>>> Hi Hans,
> >>>> Unfortunately you need the correct amount of locked entries per-port if you want
> >>>> to limit their number per-port, instead of globally. So you need a
> >>>> consistent
> >>>
> >>> Hi Nik,
> >>> the used dst is a port structure, so it is per-port and not globally.
> >>>
> >>> Best,
> >>> Hans
> >>>
> >>
> >> Yeah, I know. :) That's why I wrote it, if the limit is not a feature requirement I'd suggest
> >> dropping it altogether, it can be enforced externally (e.g. from user-space) if needed.
> >>
> >> By the way just fyi net-next is closed right now due to merge window. And one more
> >> thing please include a short log of changes between versions when you send a new one.
> >> I had to go look for v2 to find out what changed.
> >>
> > 
> > Okay, I will drop the limit in the bridge module, which is an easy thing
> > to do. :) (It is mostly there to ensure against DOS attacks if someone
> > bombards a locked port with random mac addresses.)
> > I have a similar limitation in the driver, which should then probably be
> > dropped too?
> > 
> 
> That is up to you/driver, I'd try looking for similar problems in other switch drivers
> and check how those were handled. There are people in the CC above that can
> directly answer that. :)

Not sure whom you're referring to?

In fact I was pretty sure that I didn't see any OOM protection in the
source code of the Linux bridge driver itself either, so I wanted to
check that for myself, so I wrote a small "killswitch" program that's
supposed to, well, kill a switch. It took me a while to find a few free
hours to do the test, sorry for that.

https://github.com/vladimiroltean/killswitch/blob/master/src/killswitch.c

Sure enough, I can kill a Marvell Armada 3720 device with 1GB of RAM
within 3 minutes of running the test program.

[  273.864203] ksoftirqd/0: page allocation failure: order:0, mode:0x40a20(GFP_ATOMIC|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
[  273.876426] CPU: 0 PID: 12 Comm: ksoftirqd/0 Not tainted 5.18.7-rc1-00013-g52b92343db13 #74
[  273.884775] Hardware name: CZ.NIC Turris Mox Board (DT)
[  273.889994] Call trace:
[  273.892437]  dump_backtrace.part.0+0xc8/0xd4
[  273.896721]  show_stack+0x18/0x70
[  273.900039]  dump_stack_lvl+0x68/0x84
[  273.903703]  dump_stack+0x18/0x34
[  273.907017]  warn_alloc+0x114/0x1a0
[  273.910508]  __alloc_pages+0xbb0/0xbe0
[  273.914257]  cache_grow_begin+0x60/0x300
[  273.918183]  fallback_alloc+0x184/0x220
[  273.922017]  ____cache_alloc_node+0x174/0x190
[  273.926373]  kmem_cache_alloc+0x1a4/0x220
[  273.930381]  fdb_create+0x40/0x430
[  273.933784]  br_fdb_update+0x198/0x210
[  273.937532]  br_handle_frame_finish+0x244/0x530
[  273.942063]  br_handle_frame+0x1c0/0x270
[  273.945986]  __netif_receive_skb_core.constprop.0+0x29c/0xd30
[  273.951734]  __netif_receive_skb_list_core+0xe8/0x210
[  273.956784]  netif_receive_skb_list_internal+0x180/0x29c
[  273.962091]  napi_gro_receive+0x174/0x190
[  273.966099]  mvneta_rx_swbm+0x6b8/0xb40
[  273.969935]  mvneta_poll+0x684/0x900
[  273.973506]  __napi_poll+0x38/0x18c
[  273.976988]  net_rx_action+0xe8/0x280
[  273.980643]  __do_softirq+0x124/0x2a0
[  273.984299]  run_ksoftirqd+0x4c/0x60
[  273.987871]  smpboot_thread_fn+0x23c/0x270
[  273.991963]  kthread+0x10c/0x110
[  273.995188]  ret_from_fork+0x10/0x20

(followed by lots upon lots of vomiting, followed by ...)

[  311.138590] Out of memory and no killable processes...
[  311.143774] Kernel panic - not syncing: System is deadlocked on memory
[  311.150295] CPU: 0 PID: 6 Comm: kworker/0:0 Not tainted 5.18.7-rc1-00013-g52b92343db13 #74
[  311.158550] Hardware name: CZ.NIC Turris Mox Board (DT)
[  311.163766] Workqueue: events rht_deferred_worker
[  311.168477] Call trace:
[  311.170916]  dump_backtrace.part.0+0xc8/0xd4
[  311.175188]  show_stack+0x18/0x70
[  311.178501]  dump_stack_lvl+0x68/0x84
[  311.182159]  dump_stack+0x18/0x34
[  311.185466]  panic+0x168/0x328
[  311.188515]  out_of_memory+0x568/0x584
[  311.192261]  __alloc_pages+0xb04/0xbe0
[  311.196006]  __alloc_pages_bulk+0x15c/0x604
[  311.200185]  alloc_pages_bulk_array_mempolicy+0xbc/0x24c
[  311.205491]  __vmalloc_node_range+0x238/0x550
[  311.209843]  __vmalloc_node_range+0x1c0/0x550
[  311.214195]  kvmalloc_node+0xe0/0x124
[  311.217856]  bucket_table_alloc.isra.0+0x40/0x150
[  311.222554]  rhashtable_rehash_alloc.isra.0+0x20/0x8c
[  311.227599]  rht_deferred_worker+0x7c/0x540
[  311.231775]  process_one_work+0x1d0/0x320
[  311.235779]  worker_thread+0x70/0x440
[  311.239435]  kthread+0x10c/0x110
[  311.242661]  ret_from_fork+0x10/0x20
[  311.246238] SMP: stopping secondary CPUs
[  311.250161] Kernel Offset: disabled
[  311.253642] CPU features: 0x000,00020009,00001086
[  311.258338] Memory Limit: none
[  311.261390] ---[ end Kernel panic - not syncing: System is deadlocked on memory ]---

That can't be quite alright? Shouldn't we have some sort of protection
in the bridge itself too, not just tell hardware driver writers to deal
with it? Or is it somewhere, but it needs to be enabled/configured?

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-07-06 18:13                   ` [Bridge] " Vladimir Oltean
@ 2022-07-06 19:38                     ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-07-06 19:38 UTC (permalink / raw)
  To: Vladimir Oltean, Hans Schultz
  Cc: davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

On 06/07/2022 21:13, Vladimir Oltean wrote:
> Hi Nikolay,
> 
> On Wed, May 25, 2022 at 01:18:49PM +0300, Nikolay Aleksandrov wrote:
>>>>>>>> Hi Hans,
>>>>>>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>>>>>>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>>>>>>>> careful if we try to add any new synchronization not to affect performance as well.
>>>>>>>> More below...
>>>>>>>>
>>>>>>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>>>>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>>>>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>>>>>>>  
>>>>>>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>>>>>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>>>>>>>
>>>>>>>> Sorry but you cannot do this for multiple reasons:
>>>>>>>>  - f->dst can be NULL
>>>>>>>>  - f->dst changes without any synchronization
>>>>>>>>  - there is no synchronization between fdb's flags and its ->dst
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>  Nik
>>>>>>>
>>>>>>> Hi Nik,
>>>>>>>
>>>>>>> if a port is decoupled from the bridge, the locked entries would of
>>>>>>> course be invalid, so maybe if adding and removing a port is accounted
>>>>>>> for wrt locked entries and the count of locked entries, would that not
>>>>>>> work?
>>>>>>>
>>>>>>> Best,
>>>>>>> Hans
>>>>>>
>>>>>> Hi Hans,
>>>>>> Unfortunately you need the correct amount of locked entries per-port if you want
>>>>>> to limit their number per-port, instead of globally. So you need a
>>>>>> consistent
>>>>>
>>>>> Hi Nik,
>>>>> the used dst is a port structure, so it is per-port and not globally.
>>>>>
>>>>> Best,
>>>>> Hans
>>>>>
>>>>
>>>> Yeah, I know. :) That's why I wrote it, if the limit is not a feature requirement I'd suggest
>>>> dropping it altogether, it can be enforced externally (e.g. from user-space) if needed.
>>>>
>>>> By the way just fyi net-next is closed right now due to merge window. And one more
>>>> thing please include a short log of changes between versions when you send a new one.
>>>> I had to go look for v2 to find out what changed.
>>>>
>>>
>>> Okay, I will drop the limit in the bridge module, which is an easy thing
>>> to do. :) (It is mostly there to ensure against DOS attacks if someone
>>> bombards a locked port with random mac addresses.)
>>> I have a similar limitation in the driver, which should then probably be
>>> dropped too?
>>>
>>
>> That is up to you/driver, I'd try looking for similar problems in other switch drivers
>> and check how those were handled. There are people in the CC above that can
>> directly answer that. :)
> 
> Not sure whom you're referring to?

I meant people who have dealt with hardware resource management in the drivers.

> 
> In fact I was pretty sure that I didn't see any OOM protection in the
> source code of the Linux bridge driver itself either, so I wanted to
> check that for myself, so I wrote a small "killswitch" program that's
> supposed to, well, kill a switch. It took me a while to find a few free
> hours to do the test, sorry for that.
> 
> https://github.com/vladimiroltean/killswitch/blob/master/src/killswitch.c
> 
> Sure enough, I can kill a Marvell Armada 3720 device with 1GB of RAM
> within 3 minutes of running the test program.
> 

I don't think that is new or surprising, if there isn't anything to control the
device resources you'll get there. You don't really need to write any new programs
you can easily do it with mausezahn. I have tests that add over 10 million fdbs on
devices for a few seconds.

The point is it's not the bridge's task to limit memory consumption or to watch for resource
management. You can limit new entries from the device driver (in case of swdev learning) or
you can use a daemon to watch the number of entries and disable learning. There are many
different ways to avoid this. We've discussed it before and I don't mind adding a hard fdb
per-port limit in the bridge as long as it's done properly. We've also discussed LRU and similar
algorithms for fdb learning and eviction. But any hardcoded limits or limits that can break
current default use cases are unacceptable, they must be opt-in.

> [  273.864203] ksoftirqd/0: page allocation failure: order:0, mode:0x40a20(GFP_ATOMIC|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
> [  273.876426] CPU: 0 PID: 12 Comm: ksoftirqd/0 Not tainted 5.18.7-rc1-00013-g52b92343db13 #74
> [  273.884775] Hardware name: CZ.NIC Turris Mox Board (DT)
> [  273.889994] Call trace:
> [  273.892437]  dump_backtrace.part.0+0xc8/0xd4
> [  273.896721]  show_stack+0x18/0x70
> [  273.900039]  dump_stack_lvl+0x68/0x84
> [  273.903703]  dump_stack+0x18/0x34
> [  273.907017]  warn_alloc+0x114/0x1a0
> [  273.910508]  __alloc_pages+0xbb0/0xbe0
> [  273.914257]  cache_grow_begin+0x60/0x300
> [  273.918183]  fallback_alloc+0x184/0x220
> [  273.922017]  ____cache_alloc_node+0x174/0x190
> [  273.926373]  kmem_cache_alloc+0x1a4/0x220
> [  273.930381]  fdb_create+0x40/0x430
> [  273.933784]  br_fdb_update+0x198/0x210
> [  273.937532]  br_handle_frame_finish+0x244/0x530
> [  273.942063]  br_handle_frame+0x1c0/0x270
> [  273.945986]  __netif_receive_skb_core.constprop.0+0x29c/0xd30
> [  273.951734]  __netif_receive_skb_list_core+0xe8/0x210
> [  273.956784]  netif_receive_skb_list_internal+0x180/0x29c
> [  273.962091]  napi_gro_receive+0x174/0x190
> [  273.966099]  mvneta_rx_swbm+0x6b8/0xb40
> [  273.969935]  mvneta_poll+0x684/0x900
> [  273.973506]  __napi_poll+0x38/0x18c
> [  273.976988]  net_rx_action+0xe8/0x280
> [  273.980643]  __do_softirq+0x124/0x2a0
> [  273.984299]  run_ksoftirqd+0x4c/0x60
> [  273.987871]  smpboot_thread_fn+0x23c/0x270
> [  273.991963]  kthread+0x10c/0x110
> [  273.995188]  ret_from_fork+0x10/0x20
> 
> (followed by lots upon lots of vomiting, followed by ...)
> 
> [  311.138590] Out of memory and no killable processes...
> [  311.143774] Kernel panic - not syncing: System is deadlocked on memory
> [  311.150295] CPU: 0 PID: 6 Comm: kworker/0:0 Not tainted 5.18.7-rc1-00013-g52b92343db13 #74
> [  311.158550] Hardware name: CZ.NIC Turris Mox Board (DT)
> [  311.163766] Workqueue: events rht_deferred_worker
> [  311.168477] Call trace:
> [  311.170916]  dump_backtrace.part.0+0xc8/0xd4
> [  311.175188]  show_stack+0x18/0x70
> [  311.178501]  dump_stack_lvl+0x68/0x84
> [  311.182159]  dump_stack+0x18/0x34
> [  311.185466]  panic+0x168/0x328
> [  311.188515]  out_of_memory+0x568/0x584
> [  311.192261]  __alloc_pages+0xb04/0xbe0
> [  311.196006]  __alloc_pages_bulk+0x15c/0x604
> [  311.200185]  alloc_pages_bulk_array_mempolicy+0xbc/0x24c
> [  311.205491]  __vmalloc_node_range+0x238/0x550
> [  311.209843]  __vmalloc_node_range+0x1c0/0x550
> [  311.214195]  kvmalloc_node+0xe0/0x124
> [  311.217856]  bucket_table_alloc.isra.0+0x40/0x150
> [  311.222554]  rhashtable_rehash_alloc.isra.0+0x20/0x8c
> [  311.227599]  rht_deferred_worker+0x7c/0x540
> [  311.231775]  process_one_work+0x1d0/0x320
> [  311.235779]  worker_thread+0x70/0x440
> [  311.239435]  kthread+0x10c/0x110
> [  311.242661]  ret_from_fork+0x10/0x20
> [  311.246238] SMP: stopping secondary CPUs
> [  311.250161] Kernel Offset: disabled
> [  311.253642] CPU features: 0x000,00020009,00001086
> [  311.258338] Memory Limit: none
> [  311.261390] ---[ end Kernel panic - not syncing: System is deadlocked on memory ]---
> 
> That can't be quite alright? Shouldn't we have some sort of protection
> in the bridge itself too, not just tell hardware driver writers to deal
> with it? Or is it somewhere, but it needs to be enabled/configured?

This is expected, if you'd like feel free to add a hard learning limit in the driver
and the bridge (again if implemented properly).
Nothing can save you if someone has L2 access to the device, they can poison any
table if learning is enabled.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-07-06 19:38                     ` Nikolay Aleksandrov
  0 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-07-06 19:38 UTC (permalink / raw)
  To: Vladimir Oltean, Hans Schultz
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Ido Schimmel, bridge, linux-kernel,
	Vivien Didelot, Eric Dumazet, linux-kselftest, Roopa Prabhu,
	kuba, Paolo Abeni, Shuah Khan, davem

On 06/07/2022 21:13, Vladimir Oltean wrote:
> Hi Nikolay,
> 
> On Wed, May 25, 2022 at 01:18:49PM +0300, Nikolay Aleksandrov wrote:
>>>>>>>> Hi Hans,
>>>>>>>> So this approach has a fundamental problem, f->dst is changed without any synchronization
>>>>>>>> you cannot rely on it and thus you cannot account for these entries properly. We must be very
>>>>>>>> careful if we try to add any new synchronization not to affect performance as well.
>>>>>>>> More below...
>>>>>>>>
>>>>>>>>> @@ -319,6 +326,9 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
>>>>>>>>>  	if (test_bit(BR_FDB_STATIC, &f->flags))
>>>>>>>>>  		fdb_del_hw_addr(br, f->key.addr.addr);
>>>>>>>>>  
>>>>>>>>> +	if (test_bit(BR_FDB_ENTRY_LOCKED, &f->flags) && !test_bit(BR_FDB_OFFLOADED, &f->flags))
>>>>>>>>> +		atomic_dec(&f->dst->locked_entry_cnt);
>>>>>>>>
>>>>>>>> Sorry but you cannot do this for multiple reasons:
>>>>>>>>  - f->dst can be NULL
>>>>>>>>  - f->dst changes without any synchronization
>>>>>>>>  - there is no synchronization between fdb's flags and its ->dst
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>  Nik
>>>>>>>
>>>>>>> Hi Nik,
>>>>>>>
>>>>>>> if a port is decoupled from the bridge, the locked entries would of
>>>>>>> course be invalid, so maybe if adding and removing a port is accounted
>>>>>>> for wrt locked entries and the count of locked entries, would that not
>>>>>>> work?
>>>>>>>
>>>>>>> Best,
>>>>>>> Hans
>>>>>>
>>>>>> Hi Hans,
>>>>>> Unfortunately you need the correct amount of locked entries per-port if you want
>>>>>> to limit their number per-port, instead of globally. So you need a
>>>>>> consistent
>>>>>
>>>>> Hi Nik,
>>>>> the used dst is a port structure, so it is per-port and not globally.
>>>>>
>>>>> Best,
>>>>> Hans
>>>>>
>>>>
>>>> Yeah, I know. :) That's why I wrote it, if the limit is not a feature requirement I'd suggest
>>>> dropping it altogether, it can be enforced externally (e.g. from user-space) if needed.
>>>>
>>>> By the way just fyi net-next is closed right now due to merge window. And one more
>>>> thing please include a short log of changes between versions when you send a new one.
>>>> I had to go look for v2 to find out what changed.
>>>>
>>>
>>> Okay, I will drop the limit in the bridge module, which is an easy thing
>>> to do. :) (It is mostly there to ensure against DOS attacks if someone
>>> bombards a locked port with random mac addresses.)
>>> I have a similar limitation in the driver, which should then probably be
>>> dropped too?
>>>
>>
>> That is up to you/driver, I'd try looking for similar problems in other switch drivers
>> and check how those were handled. There are people in the CC above that can
>> directly answer that. :)
> 
> Not sure whom you're referring to?

I meant people who have dealt with hardware resource management in the drivers.

> 
> In fact I was pretty sure that I didn't see any OOM protection in the
> source code of the Linux bridge driver itself either, so I wanted to
> check that for myself, so I wrote a small "killswitch" program that's
> supposed to, well, kill a switch. It took me a while to find a few free
> hours to do the test, sorry for that.
> 
> https://github.com/vladimiroltean/killswitch/blob/master/src/killswitch.c
> 
> Sure enough, I can kill a Marvell Armada 3720 device with 1GB of RAM
> within 3 minutes of running the test program.
> 

I don't think that is new or surprising, if there isn't anything to control the
device resources you'll get there. You don't really need to write any new programs
you can easily do it with mausezahn. I have tests that add over 10 million fdbs on
devices for a few seconds.

The point is it's not the bridge's task to limit memory consumption or to watch for resource
management. You can limit new entries from the device driver (in case of swdev learning) or
you can use a daemon to watch the number of entries and disable learning. There are many
different ways to avoid this. We've discussed it before and I don't mind adding a hard fdb
per-port limit in the bridge as long as it's done properly. We've also discussed LRU and similar
algorithms for fdb learning and eviction. But any hardcoded limits or limits that can break
current default use cases are unacceptable, they must be opt-in.

> [  273.864203] ksoftirqd/0: page allocation failure: order:0, mode:0x40a20(GFP_ATOMIC|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
> [  273.876426] CPU: 0 PID: 12 Comm: ksoftirqd/0 Not tainted 5.18.7-rc1-00013-g52b92343db13 #74
> [  273.884775] Hardware name: CZ.NIC Turris Mox Board (DT)
> [  273.889994] Call trace:
> [  273.892437]  dump_backtrace.part.0+0xc8/0xd4
> [  273.896721]  show_stack+0x18/0x70
> [  273.900039]  dump_stack_lvl+0x68/0x84
> [  273.903703]  dump_stack+0x18/0x34
> [  273.907017]  warn_alloc+0x114/0x1a0
> [  273.910508]  __alloc_pages+0xbb0/0xbe0
> [  273.914257]  cache_grow_begin+0x60/0x300
> [  273.918183]  fallback_alloc+0x184/0x220
> [  273.922017]  ____cache_alloc_node+0x174/0x190
> [  273.926373]  kmem_cache_alloc+0x1a4/0x220
> [  273.930381]  fdb_create+0x40/0x430
> [  273.933784]  br_fdb_update+0x198/0x210
> [  273.937532]  br_handle_frame_finish+0x244/0x530
> [  273.942063]  br_handle_frame+0x1c0/0x270
> [  273.945986]  __netif_receive_skb_core.constprop.0+0x29c/0xd30
> [  273.951734]  __netif_receive_skb_list_core+0xe8/0x210
> [  273.956784]  netif_receive_skb_list_internal+0x180/0x29c
> [  273.962091]  napi_gro_receive+0x174/0x190
> [  273.966099]  mvneta_rx_swbm+0x6b8/0xb40
> [  273.969935]  mvneta_poll+0x684/0x900
> [  273.973506]  __napi_poll+0x38/0x18c
> [  273.976988]  net_rx_action+0xe8/0x280
> [  273.980643]  __do_softirq+0x124/0x2a0
> [  273.984299]  run_ksoftirqd+0x4c/0x60
> [  273.987871]  smpboot_thread_fn+0x23c/0x270
> [  273.991963]  kthread+0x10c/0x110
> [  273.995188]  ret_from_fork+0x10/0x20
> 
> (followed by lots upon lots of vomiting, followed by ...)
> 
> [  311.138590] Out of memory and no killable processes...
> [  311.143774] Kernel panic - not syncing: System is deadlocked on memory
> [  311.150295] CPU: 0 PID: 6 Comm: kworker/0:0 Not tainted 5.18.7-rc1-00013-g52b92343db13 #74
> [  311.158550] Hardware name: CZ.NIC Turris Mox Board (DT)
> [  311.163766] Workqueue: events rht_deferred_worker
> [  311.168477] Call trace:
> [  311.170916]  dump_backtrace.part.0+0xc8/0xd4
> [  311.175188]  show_stack+0x18/0x70
> [  311.178501]  dump_stack_lvl+0x68/0x84
> [  311.182159]  dump_stack+0x18/0x34
> [  311.185466]  panic+0x168/0x328
> [  311.188515]  out_of_memory+0x568/0x584
> [  311.192261]  __alloc_pages+0xb04/0xbe0
> [  311.196006]  __alloc_pages_bulk+0x15c/0x604
> [  311.200185]  alloc_pages_bulk_array_mempolicy+0xbc/0x24c
> [  311.205491]  __vmalloc_node_range+0x238/0x550
> [  311.209843]  __vmalloc_node_range+0x1c0/0x550
> [  311.214195]  kvmalloc_node+0xe0/0x124
> [  311.217856]  bucket_table_alloc.isra.0+0x40/0x150
> [  311.222554]  rhashtable_rehash_alloc.isra.0+0x20/0x8c
> [  311.227599]  rht_deferred_worker+0x7c/0x540
> [  311.231775]  process_one_work+0x1d0/0x320
> [  311.235779]  worker_thread+0x70/0x440
> [  311.239435]  kthread+0x10c/0x110
> [  311.242661]  ret_from_fork+0x10/0x20
> [  311.246238] SMP: stopping secondary CPUs
> [  311.250161] Kernel Offset: disabled
> [  311.253642] CPU features: 0x000,00020009,00001086
> [  311.258338] Memory Limit: none
> [  311.261390] ---[ end Kernel panic - not syncing: System is deadlocked on memory ]---
> 
> That can't be quite alright? Shouldn't we have some sort of protection
> in the bridge itself too, not just tell hardware driver writers to deal
> with it? Or is it somewhere, but it needs to be enabled/configured?

This is expected, if you'd like feel free to add a hard learning limit in the driver
and the bridge (again if implemented properly).
Nothing can save you if someone has L2 access to the device, they can poison any
table if learning is enabled.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-07-06 19:38                     ` [Bridge] " Nikolay Aleksandrov
@ 2022-07-06 20:21                       ` Vladimir Oltean
  -1 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-06 20:21 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Hans Schultz, davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
> I don't think that is new or surprising, if there isn't anything to control the
> device resources you'll get there. You don't really need to write any new programs
> you can easily do it with mausezahn. I have tests that add over 10 million fdbs on
> devices for a few seconds.

Of course it isn't new, but that doesn't make the situation in any way better,
quite the opposite...

> The point is it's not the bridge's task to limit memory consumption or to watch for resource
> management. You can limit new entries from the device driver (in case of swdev learning) or
> you can use a daemon to watch the number of entries and disable learning. There are many
> different ways to avoid this. We've discussed it before and I don't mind adding a hard fdb
> per-port limit in the bridge as long as it's done properly. We've also discussed LRU and similar
> algorithms for fdb learning and eviction. But any hardcoded limits or limits that can break
> current default use cases are unacceptable, they must be opt-in.

I don't think you can really say that it's not the bridge's task to
limit memory consumption when what it does is essentially allocate
memory from untrusted and unbounded user input, in kernel softirq
context.

That's in fact the problem, the kernel OOM killer will kick in, but
there will be no process to kill. This is why the kernel deadlocks on
memory and dies.

Maybe where our expectations differ is that I believe that a Linux
bridge shouldn't need gazillions of tweaks to not kill the kernel?
There are many devices in production using a bridge without such
configuration, you can't just make it opt-in.

Of course, performance under heavy stress is a separate concern, and
maybe user space monitoring would be a better idea for that.

I know you changed jobs, but did Cumulus Linux have an application to
monitor and limit the FDB entry count? Is there some standard
application which does this somewhere, or does everybody roll their own?

Anyway, limiting FDB entry count from user space is still theoretically
different from not dying. If you need to schedule a task to dispose of
the weight while the ship is sinking from softirq context, you may never
get to actually schedule that task in time. AFAIK the bridge UAPI doesn't
expose a pre-programmed limit, so what needs to be done is for user
space to manually delete entries until the count falls below the limit.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-07-06 20:21                       ` Vladimir Oltean
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-06 20:21 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Hans Schultz, bridge, linux-kernel,
	Ido Schimmel, Vivien Didelot, Eric Dumazet, linux-kselftest,
	Roopa Prabhu, kuba, Paolo Abeni, Shuah Khan, davem

On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
> I don't think that is new or surprising, if there isn't anything to control the
> device resources you'll get there. You don't really need to write any new programs
> you can easily do it with mausezahn. I have tests that add over 10 million fdbs on
> devices for a few seconds.

Of course it isn't new, but that doesn't make the situation in any way better,
quite the opposite...

> The point is it's not the bridge's task to limit memory consumption or to watch for resource
> management. You can limit new entries from the device driver (in case of swdev learning) or
> you can use a daemon to watch the number of entries and disable learning. There are many
> different ways to avoid this. We've discussed it before and I don't mind adding a hard fdb
> per-port limit in the bridge as long as it's done properly. We've also discussed LRU and similar
> algorithms for fdb learning and eviction. But any hardcoded limits or limits that can break
> current default use cases are unacceptable, they must be opt-in.

I don't think you can really say that it's not the bridge's task to
limit memory consumption when what it does is essentially allocate
memory from untrusted and unbounded user input, in kernel softirq
context.

That's in fact the problem, the kernel OOM killer will kick in, but
there will be no process to kill. This is why the kernel deadlocks on
memory and dies.

Maybe where our expectations differ is that I believe that a Linux
bridge shouldn't need gazillions of tweaks to not kill the kernel?
There are many devices in production using a bridge without such
configuration, you can't just make it opt-in.

Of course, performance under heavy stress is a separate concern, and
maybe user space monitoring would be a better idea for that.

I know you changed jobs, but did Cumulus Linux have an application to
monitor and limit the FDB entry count? Is there some standard
application which does this somewhere, or does everybody roll their own?

Anyway, limiting FDB entry count from user space is still theoretically
different from not dying. If you need to schedule a task to dispose of
the weight while the ship is sinking from softirq context, you may never
get to actually schedule that task in time. AFAIK the bridge UAPI doesn't
expose a pre-programmed limit, so what needs to be done is for user
space to manually delete entries until the count falls below the limit.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-07-06 20:21                       ` [Bridge] " Vladimir Oltean
@ 2022-07-06 21:01                         ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-07-06 21:01 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Hans Schultz, davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

On 06/07/2022 23:21, Vladimir Oltean wrote:
> On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
>> I don't think that is new or surprising, if there isn't anything to control the
>> device resources you'll get there. You don't really need to write any new programs
>> you can easily do it with mausezahn. I have tests that add over 10 million fdbs on
>> devices for a few seconds.
> 
> Of course it isn't new, but that doesn't make the situation in any way better,
> quite the opposite...
> 
>> The point is it's not the bridge's task to limit memory consumption or to watch for resource
>> management. You can limit new entries from the device driver (in case of swdev learning) or
>> you can use a daemon to watch the number of entries and disable learning. There are many
>> different ways to avoid this. We've discussed it before and I don't mind adding a hard fdb
>> per-port limit in the bridge as long as it's done properly. We've also discussed LRU and similar
>> algorithms for fdb learning and eviction. But any hardcoded limits or limits that can break
>> current default use cases are unacceptable, they must be opt-in.
> 
> I don't think you can really say that it's not the bridge's task to
> limit memory consumption when what it does is essentially allocate
> memory from untrusted and unbounded user input, in kernel softirq
> context.
> 
> That's in fact the problem, the kernel OOM killer will kick in, but
> there will be no process to kill. This is why the kernel deadlocks on
> memory and dies.
> 
> Maybe where our expectations differ is that I believe that a Linux
> bridge shouldn't need gazillions of tweaks to not kill the kernel?
> There are many devices in production using a bridge without such
> configuration, you can't just make it opt-in.
> 

No, you cannot suddenly enforce such limit because such limit cannot work for everyone.
There is no silver bullet that works for everyone. Opt-in is the only way to go
about this with specific config for different devices and deployments, anyone
interested can set their limits. They can be auto-adjusted by swdev drivers
after that if necessary, but first they must be implemented in software.

If you're interested in adding default limits based on memory heuristics and consumption
I'd be interested to see it.

> Of course, performance under heavy stress is a separate concern, and
> maybe user space monitoring would be a better idea for that.
> 

You can do the whole software learning from user-space if needed, not only under heavy stress.

> I know you changed jobs, but did Cumulus Linux have an application to
> monitor and limit the FDB entry count? Is there some standard
> application which does this somewhere, or does everybody roll their own?
> 

I don't see how that is relevant.

> Anyway, limiting FDB entry count from user space is still theoretically
> different from not dying. If you need to schedule a task to dispose of

you can disable learning altogether and add entries from a user-space daemon, ie
implement complete user-space learning agent, theoretically you can solve it in
many ways if that's the problem

> the weight while the ship is sinking from softirq context, you may never
> get to actually schedule that task in time. AFAIK the bridge UAPI doesn't
> expose a pre-programmed limit, so what needs to be done is for user
> space to manually delete entries until the count falls below the limit.

That is a single case speculation, it depends on how it was implemented in the first place. You
can disable learning and have more than enough time to deal with it.

I already said it's ok to add hard configurable limits if they're done properly performance-wise.
Any distribution can choose to set some default limits after the option exists.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-07-06 21:01                         ` Nikolay Aleksandrov
  0 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-07-06 21:01 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Hans Schultz, bridge, linux-kernel,
	Ido Schimmel, Vivien Didelot, Eric Dumazet, linux-kselftest,
	Roopa Prabhu, kuba, Paolo Abeni, Shuah Khan, davem

On 06/07/2022 23:21, Vladimir Oltean wrote:
> On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
>> I don't think that is new or surprising, if there isn't anything to control the
>> device resources you'll get there. You don't really need to write any new programs
>> you can easily do it with mausezahn. I have tests that add over 10 million fdbs on
>> devices for a few seconds.
> 
> Of course it isn't new, but that doesn't make the situation in any way better,
> quite the opposite...
> 
>> The point is it's not the bridge's task to limit memory consumption or to watch for resource
>> management. You can limit new entries from the device driver (in case of swdev learning) or
>> you can use a daemon to watch the number of entries and disable learning. There are many
>> different ways to avoid this. We've discussed it before and I don't mind adding a hard fdb
>> per-port limit in the bridge as long as it's done properly. We've also discussed LRU and similar
>> algorithms for fdb learning and eviction. But any hardcoded limits or limits that can break
>> current default use cases are unacceptable, they must be opt-in.
> 
> I don't think you can really say that it's not the bridge's task to
> limit memory consumption when what it does is essentially allocate
> memory from untrusted and unbounded user input, in kernel softirq
> context.
> 
> That's in fact the problem, the kernel OOM killer will kick in, but
> there will be no process to kill. This is why the kernel deadlocks on
> memory and dies.
> 
> Maybe where our expectations differ is that I believe that a Linux
> bridge shouldn't need gazillions of tweaks to not kill the kernel?
> There are many devices in production using a bridge without such
> configuration, you can't just make it opt-in.
> 

No, you cannot suddenly enforce such limit because such limit cannot work for everyone.
There is no silver bullet that works for everyone. Opt-in is the only way to go
about this with specific config for different devices and deployments, anyone
interested can set their limits. They can be auto-adjusted by swdev drivers
after that if necessary, but first they must be implemented in software.

If you're interested in adding default limits based on memory heuristics and consumption
I'd be interested to see it.

> Of course, performance under heavy stress is a separate concern, and
> maybe user space monitoring would be a better idea for that.
> 

You can do the whole software learning from user-space if needed, not only under heavy stress.

> I know you changed jobs, but did Cumulus Linux have an application to
> monitor and limit the FDB entry count? Is there some standard
> application which does this somewhere, or does everybody roll their own?
> 

I don't see how that is relevant.

> Anyway, limiting FDB entry count from user space is still theoretically
> different from not dying. If you need to schedule a task to dispose of

you can disable learning altogether and add entries from a user-space daemon, ie
implement complete user-space learning agent, theoretically you can solve it in
many ways if that's the problem

> the weight while the ship is sinking from softirq context, you may never
> get to actually schedule that task in time. AFAIK the bridge UAPI doesn't
> expose a pre-programmed limit, so what needs to be done is for user
> space to manually delete entries until the count falls below the limit.

That is a single case speculation, it depends on how it was implemented in the first place. You
can disable learning and have more than enough time to deal with it.

I already said it's ok to add hard configurable limits if they're done properly performance-wise.
Any distribution can choose to set some default limits after the option exists.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
  2022-07-06 15:38               ` [Bridge] " Hans S
@ 2022-07-07  6:54                 ` Hans S
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-07  6:54 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: David S. Miller, Jakub Kicinski, netdev, Hans Schultz,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu,
	Nikolay Aleksandrov, Shuah Khan, Daniel Borkmann, Ido Schimmel,
	linux-kernel, bridge, linux-kselftest

Hi Vladimir,

BTW, I have sent the patch to read the FID as you requested. You
should have received it yesterday (6th July) at around 12:25 UTC.

Hans

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation
@ 2022-07-07  6:54                 ` Hans S
  0 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-07  6:54 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Nikolay Aleksandrov, bridge,
	Eric Dumazet, Ido Schimmel, Vivien Didelot, Hans Schultz,
	linux-kselftest, Roopa Prabhu, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, David S. Miller, linux-kernel

Hi Vladimir,

BTW, I have sent the patch to read the FID as you requested. You
should have received it yesterday (6th July) at around 12:25 UTC.

Hans

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-07-06 21:01                         ` [Bridge] " Nikolay Aleksandrov
@ 2022-07-07 14:08                           ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-07-07 14:08 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Hans Schultz, davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

On 07/07/2022 00:01, Nikolay Aleksandrov wrote:
> On 06/07/2022 23:21, Vladimir Oltean wrote:
>> On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
[snip]
> I already said it's ok to add hard configurable limits if they're done properly performance-wise.
> Any distribution can choose to set some default limits after the option exists.
> 

Just fyi, and to avoid duplicate efforts, I already have patches for global and per-port software
fdb limits that I'll polish and submit soon (depending on time availability, of course). If I find
more time I might add per-vlan limits as well to the set. They use embedded netlink attributes
to config and dump, so we can easily extend them later (e.g. different action on limit hit, limit
statistics etc).


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-07-07 14:08                           ` Nikolay Aleksandrov
  0 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-07-07 14:08 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Hans Schultz, bridge, linux-kernel,
	Ido Schimmel, Vivien Didelot, Eric Dumazet, linux-kselftest,
	Roopa Prabhu, kuba, Paolo Abeni, Shuah Khan, davem

On 07/07/2022 00:01, Nikolay Aleksandrov wrote:
> On 06/07/2022 23:21, Vladimir Oltean wrote:
>> On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
[snip]
> I already said it's ok to add hard configurable limits if they're done properly performance-wise.
> Any distribution can choose to set some default limits after the option exists.
> 

Just fyi, and to avoid duplicate efforts, I already have patches for global and per-port software
fdb limits that I'll polish and submit soon (depending on time availability, of course). If I find
more time I might add per-vlan limits as well to the set. They use embedded netlink attributes
to config and dump, so we can easily extend them later (e.g. different action on limit hit, limit
statistics etc).


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-07-07 14:08                           ` [Bridge] " Nikolay Aleksandrov
@ 2022-07-07 17:15                             ` Vladimir Oltean
  -1 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-07 17:15 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Hans Schultz, davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

Hi Nikolay,

On Thu, Jul 07, 2022 at 05:08:15PM +0300, Nikolay Aleksandrov wrote:
> On 07/07/2022 00:01, Nikolay Aleksandrov wrote:
> > On 06/07/2022 23:21, Vladimir Oltean wrote:
> >> On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
> [snip]
> > I already said it's ok to add hard configurable limits if they're done properly performance-wise.
> > Any distribution can choose to set some default limits after the option exists.
> > 
> 
> Just fyi, and to avoid duplicate efforts, I already have patches for global and per-port software
> fdb limits that I'll polish and submit soon (depending on time availability, of course). If I find
> more time I might add per-vlan limits as well to the set. They use embedded netlink attributes
> to config and dump, so we can easily extend them later (e.g. different action on limit hit, limit
> statistics etc).

So again, to repeat myself, it's nice to have limits on FDB size, but
those won't fix the software bridges that are now out in the open and
can't have their configuration scripts changed.

I haven't had the time to expand on this in a proper change yet, but I
was thinking more along the lines of adding an OOM handler with
register_oom_notifier() in br_fdb_init(), and on OOM, do something, like
flush the FDB from all bridges. There are going to be complications, it
will schedule switchdev, switchdev is going to allocate memory which
we're low on, the workqueues aren't created with WQ_MEM_RECLAIM, so this
isn't necessarily going to be a silver bullet either. But this is what
concerns me the most, the unconfigured bridge killing the kernel so
easily. As you can see, with an OOM handler I'm not so much trying to
impose a fixed limit on FDB size, but do something sensible such that
the bridge doesn't contribute to the kernel dying.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-07-07 17:15                             ` Vladimir Oltean
  0 siblings, 0 replies; 108+ messages in thread
From: Vladimir Oltean @ 2022-07-07 17:15 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Hans Schultz, bridge, linux-kernel,
	Ido Schimmel, Vivien Didelot, Eric Dumazet, linux-kselftest,
	Roopa Prabhu, kuba, Paolo Abeni, Shuah Khan, davem

Hi Nikolay,

On Thu, Jul 07, 2022 at 05:08:15PM +0300, Nikolay Aleksandrov wrote:
> On 07/07/2022 00:01, Nikolay Aleksandrov wrote:
> > On 06/07/2022 23:21, Vladimir Oltean wrote:
> >> On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
> [snip]
> > I already said it's ok to add hard configurable limits if they're done properly performance-wise.
> > Any distribution can choose to set some default limits after the option exists.
> > 
> 
> Just fyi, and to avoid duplicate efforts, I already have patches for global and per-port software
> fdb limits that I'll polish and submit soon (depending on time availability, of course). If I find
> more time I might add per-vlan limits as well to the set. They use embedded netlink attributes
> to config and dump, so we can easily extend them later (e.g. different action on limit hit, limit
> statistics etc).

So again, to repeat myself, it's nice to have limits on FDB size, but
those won't fix the software bridges that are now out in the open and
can't have their configuration scripts changed.

I haven't had the time to expand on this in a proper change yet, but I
was thinking more along the lines of adding an OOM handler with
register_oom_notifier() in br_fdb_init(), and on OOM, do something, like
flush the FDB from all bridges. There are going to be complications, it
will schedule switchdev, switchdev is going to allocate memory which
we're low on, the workqueues aren't created with WQ_MEM_RECLAIM, so this
isn't necessarily going to be a silver bullet either. But this is what
concerns me the most, the unconfigured bridge killing the kernel so
easily. As you can see, with an OOM handler I'm not so much trying to
impose a fixed limit on FDB size, but do something sensible such that
the bridge doesn't contribute to the kernel dying.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-07-07 17:15                             ` [Bridge] " Vladimir Oltean
@ 2022-07-07 17:26                               ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-07-07 17:26 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Hans Schultz, davem, kuba, netdev, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Ivan Vecera, Roopa Prabhu, Shuah Khan, Daniel Borkmann,
	Ido Schimmel, linux-kernel, bridge, linux-kselftest

On 7 July 2022 20:15:07 EEST, Vladimir Oltean <olteanv@gmail.com> wrote:
>Hi Nikolay,
>
>On Thu, Jul 07, 2022 at 05:08:15PM +0300, Nikolay Aleksandrov wrote:
>> On 07/07/2022 00:01, Nikolay Aleksandrov wrote:
>> > On 06/07/2022 23:21, Vladimir Oltean wrote:
>> >> On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
>> [snip]
>> > I already said it's ok to add hard configurable limits if they're done properly performance-wise.
>> > Any distribution can choose to set some default limits after the option exists.
>> > 
>> 
>> Just fyi, and to avoid duplicate efforts, I already have patches for global and per-port software
>> fdb limits that I'll polish and submit soon (depending on time availability, of course). If I find
>> more time I might add per-vlan limits as well to the set. They use embedded netlink attributes
>> to config and dump, so we can easily extend them later (e.g. different action on limit hit, limit
>> statistics etc).
>
>So again, to repeat myself, it's nice to have limits on FDB size, but
>those won't fix the software bridges that are now out in the open and
>can't have their configuration scripts changed.
>
>I haven't had the time to expand on this in a proper change yet, but I
>was thinking more along the lines of adding an OOM handler with
>register_oom_notifier() in br_fdb_init(), and on OOM, do something, like
>flush the FDB from all bridges. There are going to be complications, it
>will schedule switchdev, switchdev is going to allocate memory which
>we're low on, the workqueues aren't created with WQ_MEM_RECLAIM, so this
>isn't necessarily going to be a silver bullet either. But this is what
>concerns me the most, the unconfigured bridge killing the kernel so
>easily. As you can see, with an OOM handler I'm not so much trying to
>impose a fixed limit on FDB size, but do something sensible such that
>the bridge doesn't contribute to the kernel dying.

Hi Vladimir,
Sounds good to me, the fdb limits have come up multiple times in the past so I decided 
to finally add them and build from there, with them configured oom shouldn't be hit.
These limits have never been present and people are fine (everyone deals with or leaves it), but I'll be happy to review and ack such changes. I hope you can correlate the oom and the bridge fdbs, not
just blindly flushing as that can be problematic if you plan to have it enabled by default.

Cheers,
  Nik

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-07-07 17:26                               ` Nikolay Aleksandrov
  0 siblings, 0 replies; 108+ messages in thread
From: Nikolay Aleksandrov @ 2022-07-07 17:26 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Hans Schultz, bridge, linux-kernel,
	Ido Schimmel, Vivien Didelot, Eric Dumazet, linux-kselftest,
	Roopa Prabhu, kuba, Paolo Abeni, Shuah Khan, davem

On 7 July 2022 20:15:07 EEST, Vladimir Oltean <olteanv@gmail.com> wrote:
>Hi Nikolay,
>
>On Thu, Jul 07, 2022 at 05:08:15PM +0300, Nikolay Aleksandrov wrote:
>> On 07/07/2022 00:01, Nikolay Aleksandrov wrote:
>> > On 06/07/2022 23:21, Vladimir Oltean wrote:
>> >> On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
>> [snip]
>> > I already said it's ok to add hard configurable limits if they're done properly performance-wise.
>> > Any distribution can choose to set some default limits after the option exists.
>> > 
>> 
>> Just fyi, and to avoid duplicate efforts, I already have patches for global and per-port software
>> fdb limits that I'll polish and submit soon (depending on time availability, of course). If I find
>> more time I might add per-vlan limits as well to the set. They use embedded netlink attributes
>> to config and dump, so we can easily extend them later (e.g. different action on limit hit, limit
>> statistics etc).
>
>So again, to repeat myself, it's nice to have limits on FDB size, but
>those won't fix the software bridges that are now out in the open and
>can't have their configuration scripts changed.
>
>I haven't had the time to expand on this in a proper change yet, but I
>was thinking more along the lines of adding an OOM handler with
>register_oom_notifier() in br_fdb_init(), and on OOM, do something, like
>flush the FDB from all bridges. There are going to be complications, it
>will schedule switchdev, switchdev is going to allocate memory which
>we're low on, the workqueues aren't created with WQ_MEM_RECLAIM, so this
>isn't necessarily going to be a silver bullet either. But this is what
>concerns me the most, the unconfigured bridge killing the kernel so
>easily. As you can see, with an OOM handler I'm not so much trying to
>impose a fixed limit on FDB size, but do something sensible such that
>the bridge doesn't contribute to the kernel dying.

Hi Vladimir,
Sounds good to me, the fdb limits have come up multiple times in the past so I decided 
to finally add them and build from there, with them configured oom shouldn't be hit.
These limits have never been present and people are fine (everyone deals with or leaves it), but I'll be happy to review and ack such changes. I hope you can correlate the oom and the bridge fdbs, not
just blindly flushing as that can be problematic if you plan to have it enabled by default.

Cheers,
  Nik

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
  2022-07-07 14:08                           ` [Bridge] " Nikolay Aleksandrov
@ 2022-07-08  6:38                             ` Hans S
  -1 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-08  6:38 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Vladimir Oltean, David S. Miller, Jakub Kicinski, netdev,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Ivan Vecera, Roopa Prabhu, Shuah Khan,
	Daniel Borkmann, Ido Schimmel, linux-kernel, bridge,
	linux-kselftest

On Thu, Jul 7, 2022 at 4:08 PM Nikolay Aleksandrov <razor@blackwall.org> wrote:
>
> On 07/07/2022 00:01, Nikolay Aleksandrov wrote:
> > On 06/07/2022 23:21, Vladimir Oltean wrote:
> >> On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
> [snip]
> > I already said it's ok to add hard configurable limits if they're done properly performance-wise.
> > Any distribution can choose to set some default limits after the option exists.
> >
>
> Just fyi, and to avoid duplicate efforts, I already have patches for global and per-port software
> fdb limits that I'll polish and submit soon (depending on time availability, of course). If I find
> more time I might add per-vlan limits as well to the set. They use embedded netlink attributes
> to config and dump, so we can easily extend them later (e.g. different action on limit hit, limit
> statistics etc).
>

Sounds good, I will just limit the number of locked entries in the
driver as they are not controllable from the bridge. :-)

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [Bridge] [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature
@ 2022-07-08  6:38                             ` Hans S
  0 siblings, 0 replies; 108+ messages in thread
From: Hans S @ 2022-07-08  6:38 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Ivan Vecera, Andrew Lunn, Florian Fainelli, Jiri Pirko,
	Daniel Borkmann, netdev, Ido Schimmel, bridge, linux-kernel,
	Vivien Didelot, Eric Dumazet, Paolo Abeni, linux-kselftest,
	Roopa Prabhu, Jakub Kicinski, Vladimir Oltean, Shuah Khan,
	David S. Miller

On Thu, Jul 7, 2022 at 4:08 PM Nikolay Aleksandrov <razor@blackwall.org> wrote:
>
> On 07/07/2022 00:01, Nikolay Aleksandrov wrote:
> > On 06/07/2022 23:21, Vladimir Oltean wrote:
> >> On Wed, Jul 06, 2022 at 10:38:04PM +0300, Nikolay Aleksandrov wrote:
> [snip]
> > I already said it's ok to add hard configurable limits if they're done properly performance-wise.
> > Any distribution can choose to set some default limits after the option exists.
> >
>
> Just fyi, and to avoid duplicate efforts, I already have patches for global and per-port software
> fdb limits that I'll polish and submit soon (depending on time availability, of course). If I find
> more time I might add per-vlan limits as well to the set. They use embedded netlink attributes
> to config and dump, so we can easily extend them later (e.g. different action on limit hit, limit
> statistics etc).
>

Sounds good, I will just limit the number of locked entries in the
driver as they are not controllable from the bridge. :-)

^ permalink raw reply	[flat|nested] 108+ messages in thread

end of thread, other threads:[~2022-07-08  6:38 UTC | newest]

Thread overview: 108+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-24 15:21 [PATCH V3 net-next 0/4] Extend locked port feature with FDB locked flag (MAC-Auth/MAB) Hans Schultz
2022-05-24 15:21 ` [Bridge] " Hans Schultz
2022-05-24 15:21 ` [PATCH V3 net-next 1/4] net: bridge: add fdb flag to extent locked port feature Hans Schultz
2022-05-24 15:21   ` [Bridge] " Hans Schultz
2022-05-24 15:39   ` Nikolay Aleksandrov
2022-05-24 15:39     ` [Bridge] " Nikolay Aleksandrov
2022-05-24 16:08     ` Hans Schultz
2022-05-24 16:08       ` [Bridge] " Hans Schultz
2022-05-24 16:21     ` Hans Schultz
2022-05-24 16:21       ` [Bridge] " Hans Schultz
2022-05-25  8:06       ` Nikolay Aleksandrov
2022-05-25  8:06         ` [Bridge] " Nikolay Aleksandrov
2022-05-25  8:34         ` Hans Schultz
2022-05-25  8:34           ` [Bridge] " Hans Schultz
2022-05-25  8:38           ` Nikolay Aleksandrov
2022-05-25  8:38             ` [Bridge] " Nikolay Aleksandrov
2022-05-25  9:11             ` Hans Schultz
2022-05-25  9:11               ` [Bridge] " Hans Schultz
2022-05-25 10:18               ` Nikolay Aleksandrov
2022-05-25 10:18                 ` [Bridge] " Nikolay Aleksandrov
2022-07-06 18:13                 ` Vladimir Oltean
2022-07-06 18:13                   ` [Bridge] " Vladimir Oltean
2022-07-06 19:38                   ` Nikolay Aleksandrov
2022-07-06 19:38                     ` [Bridge] " Nikolay Aleksandrov
2022-07-06 20:21                     ` Vladimir Oltean
2022-07-06 20:21                       ` [Bridge] " Vladimir Oltean
2022-07-06 21:01                       ` Nikolay Aleksandrov
2022-07-06 21:01                         ` [Bridge] " Nikolay Aleksandrov
2022-07-07 14:08                         ` Nikolay Aleksandrov
2022-07-07 14:08                           ` [Bridge] " Nikolay Aleksandrov
2022-07-07 17:15                           ` Vladimir Oltean
2022-07-07 17:15                             ` [Bridge] " Vladimir Oltean
2022-07-07 17:26                             ` Nikolay Aleksandrov
2022-07-07 17:26                               ` [Bridge] " Nikolay Aleksandrov
2022-07-08  6:38                           ` Hans S
2022-07-08  6:38                             ` [Bridge] " Hans S
2022-05-26 14:13   ` Ido Schimmel
2022-05-26 14:13     ` [Bridge] " Ido Schimmel
2022-05-27  8:52     ` Hans Schultz
2022-05-27  8:52       ` [Bridge] " Hans Schultz
2022-05-27  9:58       ` Ido Schimmel
2022-05-27  9:58         ` [Bridge] " Ido Schimmel
2022-05-27 16:00         ` Hans Schultz
2022-05-27 16:00           ` [Bridge] " Hans Schultz
2022-05-31  9:34         ` Hans Schultz
2022-05-31  9:34           ` [Bridge] " Hans Schultz
2022-05-31 14:23           ` Ido Schimmel
2022-05-31 14:23             ` [Bridge] " Ido Schimmel
2022-05-31 15:49             ` Hans Schultz
2022-05-31 15:49               ` [Bridge] " Hans Schultz
2022-06-02  9:17             ` Hans Schultz
2022-06-02  9:17               ` [Bridge] " Hans Schultz
2022-06-02  9:33               ` Nikolay Aleksandrov
2022-06-02  9:33                 ` [Bridge] " Nikolay Aleksandrov
2022-06-02 10:17                 ` Hans Schultz
2022-06-02 10:17                   ` [Bridge] " Hans Schultz
2022-06-02 10:30                   ` Nikolay Aleksandrov
2022-06-02 10:30                     ` [Bridge] " Nikolay Aleksandrov
2022-06-02 10:39                     ` Ido Schimmel
2022-06-02 10:39                       ` [Bridge] " Ido Schimmel
2022-06-02 11:36                       ` Hans Schultz
2022-06-02 11:36                         ` [Bridge] " Hans Schultz
2022-06-02 11:55                         ` Ido Schimmel
2022-06-02 11:55                           ` [Bridge] " Ido Schimmel
2022-06-02 12:08                       ` Hans Schultz
2022-06-02 12:08                         ` [Bridge] " Hans Schultz
2022-06-02 12:18                         ` Ido Schimmel
2022-06-02 12:18                           ` [Bridge] " Ido Schimmel
2022-06-02 12:53                           ` Hans S
2022-06-02 13:27                           ` Hans S
2022-06-02 13:27                             ` [Bridge] " Hans S
2022-05-24 15:21 ` [PATCH V3 net-next 2/4] net: switchdev: add support for offloading of fdb locked flag Hans Schultz
2022-05-24 15:21   ` [Bridge] " Hans Schultz
2022-06-27 16:06   ` Vladimir Oltean
2022-06-27 16:06     ` [Bridge] " Vladimir Oltean
2022-05-24 15:21 ` [PATCH V3 net-next 3/4] net: dsa: mv88e6xxx: mac-auth/MAB implementation Hans Schultz
2022-05-24 15:21   ` [Bridge] " Hans Schultz
2022-05-24 21:36   ` kernel test robot
2022-06-27 12:58   ` Hans S
2022-06-27 12:58     ` [Bridge] " Hans S
2022-06-27 18:05   ` Vladimir Oltean
2022-06-27 18:05     ` [Bridge] " Vladimir Oltean
2022-06-28 12:26     ` Hans S
2022-06-28 12:26       ` [Bridge] " Hans S
2022-07-05 15:05       ` Hans S
2022-07-05 15:05         ` [Bridge] " Hans S
2022-07-06 13:28         ` Vladimir Oltean
2022-07-06 13:28           ` [Bridge] " Vladimir Oltean
2022-07-06 13:48           ` Hans S
2022-07-06 13:48             ` [Bridge] " Hans S
2022-07-06  8:55       ` Vladimir Oltean
2022-07-06  8:55         ` [Bridge] " Vladimir Oltean
2022-07-06 10:12         ` Hans S
2022-07-06 10:12           ` [Bridge] " Hans S
2022-07-06 14:23           ` Hans S
2022-07-06 14:23             ` [Bridge] " Hans S
2022-07-06 14:33           ` Vladimir Oltean
2022-07-06 14:33             ` [Bridge] " Vladimir Oltean
2022-07-06 15:38             ` Hans S
2022-07-06 15:38               ` [Bridge] " Hans S
2022-07-07  6:54               ` Hans S
2022-07-07  6:54                 ` [Bridge] " Hans S
2022-05-24 15:21 ` [PATCH V3 net-next 4/4] selftests: forwarding: add test of MAC-Auth Bypass to locked port tests Hans Schultz
2022-05-24 15:21   ` [Bridge] " Hans Schultz
2022-05-26 14:27   ` Ido Schimmel
2022-05-26 14:27     ` [Bridge] " Ido Schimmel
2022-05-27  9:07     ` Hans Schultz
2022-05-27  9:07       ` [Bridge] " Hans Schultz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.