All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking
@ 2021-08-20 11:57 Vladimir Oltean
  2021-08-20 11:57 ` [PATCH v3 net-next 1/7] net: bridge: move br_fdb_replay inside br_switchdev.c Vladimir Oltean
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: Vladimir Oltean @ 2021-08-20 11:57 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller
  Cc: Roopa Prabhu, Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Vadym Kochan, Taras Chornyi,
	Jiri Pirko, Ido Schimmel, UNGLinuxDriver, Grygorii Strashko,
	Marek Behun, DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens,
	Woojung Huh, Sean Wang, Landen Chao, Claudiu Manoil,
	Alexandre Belloni, George McCollister, Ioana Ciornei,
	Saeed Mahameed, Leon Romanovsky, Lars Povlsen, Steen Hegelund,
	Julian Wiedmann, Alexandra Winter, Karsten Graul, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, Ivan Vecera, Vlad Buslov,
	Jianbo Liu, Mark Bloch, Roi Dayan, Tobias Waldekranz,
	Vignesh Raghavendra, Jesse Brandeburg, linux-s390

Problem statement:

Any time a driver needs to create a private association between a bridge
upper interface and use that association within its
SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handler, we have an issue with FDB
entries deleted by the bridge when the port leaves. The issue is that
all switchdev drivers schedule a work item to have sleepable context,
and that work item can be actually scheduled after the port has left the
bridge, which means the association might have already been broken by
the time the scheduled FDB work item attempts to use it.

The solution is to modify switchdev to use its embedded SWITCHDEV_F_DEFER
mechanism to make the FDB notifiers emitted from the fastpath be
scheduled in sleepable context. All drivers are converted to handle
SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE from their blocking notifier block
handler (or register a blocking switchdev notifier handler if they
didn't have one). This solves the aforementioned problem because the
bridge waits for the switchdev deferred work items to finish before a
port leaves (del_nbp calls switchdev_deferred_process), whereas a work
item privately scheduled by the driver will obviously not be waited upon
by the bridge, leading to the possibility of having the race.

This is a dependency for the "DSA FDB isolation" posted here. It was
split out of that series hence the numbering starts directly at v2.

https://patchwork.kernel.org/project/netdevbpf/cover/20210818120150.892647-1-vladimir.oltean@nxp.com/

Changes in v3:
- make "addr" part of switchdev_fdb_notifier_info to avoid dangling
  pointers not watched by RCU
- mlx5 correction
- build fixes in the S/390 qeth driver

Vladimir Oltean (7):
  net: bridge: move br_fdb_replay inside br_switchdev.c
  net: switchdev: keep the MAC address by value in struct
    switchdev_notifier_fdb_info
  net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking
    notifier chain
  net: bridge: switchdev: make br_fdb_replay offer sleepable context to
    consumers
  net: switchdev: drop the atomic notifier block from
    switchdev_bridge_port_{,un}offload
  net: switchdev: don't assume RCU context in
    switchdev_handle_fdb_{add,del}_to_device
  net: dsa: handle SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE synchronously

 .../ethernet/freescale/dpaa2/dpaa2-switch.c   |  75 ++++------
 .../marvell/prestera/prestera_switchdev.c     | 104 ++++++-------
 .../mellanox/mlx5/core/en/rep/bridge.c        |  65 +++++++--
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  |   2 +-
 .../ethernet/mellanox/mlxsw/spectrum_router.c |   4 +-
 .../mellanox/mlxsw/spectrum_switchdev.c       |  62 ++++++--
 .../microchip/sparx5/sparx5_mactable.c        |   2 +-
 .../microchip/sparx5/sparx5_switchdev.c       |  72 ++++-----
 drivers/net/ethernet/mscc/ocelot_net.c        |   3 -
 drivers/net/ethernet/rocker/rocker_main.c     |  67 ++++-----
 drivers/net/ethernet/rocker/rocker_ofdpa.c    |   6 +-
 drivers/net/ethernet/ti/am65-cpsw-nuss.c      |   4 +-
 drivers/net/ethernet/ti/am65-cpsw-switchdev.c |  54 +++----
 drivers/net/ethernet/ti/cpsw_new.c            |   4 +-
 drivers/net/ethernet/ti/cpsw_switchdev.c      |  57 ++++----
 drivers/s390/net/qeth_l2_main.c               |  26 ++--
 include/net/switchdev.h                       |  33 ++++-
 net/bridge/br.c                               |   5 +-
 net/bridge/br_fdb.c                           |  54 -------
 net/bridge/br_private.h                       |   6 -
 net/bridge/br_switchdev.c                     | 128 +++++++++++++---
 net/dsa/dsa.c                                 |  15 --
 net/dsa/dsa_priv.h                            |  15 --
 net/dsa/port.c                                |   3 -
 net/dsa/slave.c                               | 138 ++++++------------
 net/switchdev/switchdev.c                     |  61 +++++++-
 26 files changed, 550 insertions(+), 515 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v3 net-next 1/7] net: bridge: move br_fdb_replay inside br_switchdev.c
  2021-08-20 11:57 [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vladimir Oltean
@ 2021-08-20 11:57 ` Vladimir Oltean
  2021-08-20 11:57 ` [PATCH v3 net-next 2/7] net: switchdev: keep the MAC address by value in struct switchdev_notifier_fdb_info Vladimir Oltean
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Vladimir Oltean @ 2021-08-20 11:57 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller
  Cc: Roopa Prabhu, Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Vadym Kochan, Taras Chornyi,
	Jiri Pirko, Ido Schimmel, UNGLinuxDriver, Grygorii Strashko,
	Marek Behun, DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens,
	Woojung Huh, Sean Wang, Landen Chao, Claudiu Manoil,
	Alexandre Belloni, George McCollister, Ioana Ciornei,
	Saeed Mahameed, Leon Romanovsky, Lars Povlsen, Steen Hegelund,
	Julian Wiedmann, Alexandra Winter, Karsten Graul, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, Ivan Vecera, Vlad Buslov,
	Jianbo Liu, Mark Bloch, Roi Dayan, Tobias Waldekranz,
	Vignesh Raghavendra, Jesse Brandeburg, linux-s390

br_fdb_replay is only called from switchdev code paths, so it makes
sense to be disabled if switchdev is not enabled in the first place.

As opposed to br_mdb_replay and br_vlan_replay which might be turned off
depending on bridge support for multicast and VLANs, FDB support is
always on. So moving br_mdb_replay and br_vlan_replay inside
br_switchdev.c would mean adding some #ifdef's in br_switchdev.c, so we
keep those where they are.

The reason for the movement is that in future changes there will be some
code reuse between br_switchdev_fdb_notify and br_fdb_replay.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v3: patch is new

 net/bridge/br_fdb.c       | 54 ---------------------------------------
 net/bridge/br_private.h   |  2 --
 net/bridge/br_switchdev.c | 54 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 56 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 46812b659710..c6e51701bc37 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -732,60 +732,6 @@ static inline size_t fdb_nlmsg_size(void)
 		+ nla_total_size(sizeof(u8)); /* NFEA_ACTIVITY_NOTIFY */
 }
 
-static int br_fdb_replay_one(struct net_bridge *br, struct notifier_block *nb,
-			     const struct net_bridge_fdb_entry *fdb,
-			     unsigned long action, const void *ctx)
-{
-	const struct net_bridge_port *p = READ_ONCE(fdb->dst);
-	struct switchdev_notifier_fdb_info item;
-	int err;
-
-	item.addr = fdb->key.addr.addr;
-	item.vid = fdb->key.vlan_id;
-	item.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
-	item.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
-	item.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
-	item.info.dev = (!p || item.is_local) ? br->dev : p->dev;
-	item.info.ctx = ctx;
-
-	err = nb->notifier_call(nb, action, &item);
-	return notifier_to_errno(err);
-}
-
-int br_fdb_replay(const struct net_device *br_dev, const void *ctx, bool adding,
-		  struct notifier_block *nb)
-{
-	struct net_bridge_fdb_entry *fdb;
-	struct net_bridge *br;
-	unsigned long action;
-	int err = 0;
-
-	if (!nb)
-		return 0;
-
-	if (!netif_is_bridge_master(br_dev))
-		return -EINVAL;
-
-	br = netdev_priv(br_dev);
-
-	if (adding)
-		action = SWITCHDEV_FDB_ADD_TO_DEVICE;
-	else
-		action = SWITCHDEV_FDB_DEL_TO_DEVICE;
-
-	rcu_read_lock();
-
-	hlist_for_each_entry_rcu(fdb, &br->fdb_list, fdb_node) {
-		err = br_fdb_replay_one(br, nb, fdb, action, ctx);
-		if (err)
-			break;
-	}
-
-	rcu_read_unlock();
-
-	return err;
-}
-
 static void fdb_notify(struct net_bridge *br,
 		       const struct net_bridge_fdb_entry *fdb, int type,
 		       bool swdev_notify)
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 21b292eb2b3e..390c807d1c7c 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -778,8 +778,6 @@ int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 			      bool swdev_notify);
 void br_fdb_offloaded_set(struct net_bridge *br, struct net_bridge_port *p,
 			  const unsigned char *addr, u16 vid, bool offloaded);
-int br_fdb_replay(const struct net_device *br_dev, const void *ctx, bool adding,
-		  struct notifier_block *nb);
 
 /* br_forward.c */
 enum br_pkt_type {
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 6bf518d78f02..8a45b1cfe06f 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -270,6 +270,60 @@ static void nbp_switchdev_del(struct net_bridge_port *p)
 	}
 }
 
+static int br_fdb_replay_one(struct net_bridge *br, struct notifier_block *nb,
+			     const struct net_bridge_fdb_entry *fdb,
+			     unsigned long action, const void *ctx)
+{
+	const struct net_bridge_port *p = READ_ONCE(fdb->dst);
+	struct switchdev_notifier_fdb_info item;
+	int err;
+
+	item.addr = fdb->key.addr.addr;
+	item.vid = fdb->key.vlan_id;
+	item.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
+	item.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
+	item.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
+	item.info.dev = (!p || item.is_local) ? br->dev : p->dev;
+	item.info.ctx = ctx;
+
+	err = nb->notifier_call(nb, action, &item);
+	return notifier_to_errno(err);
+}
+
+static int br_fdb_replay(const struct net_device *br_dev, const void *ctx,
+			 bool adding, struct notifier_block *nb)
+{
+	struct net_bridge_fdb_entry *fdb;
+	struct net_bridge *br;
+	unsigned long action;
+	int err = 0;
+
+	if (!nb)
+		return 0;
+
+	if (!netif_is_bridge_master(br_dev))
+		return -EINVAL;
+
+	br = netdev_priv(br_dev);
+
+	if (adding)
+		action = SWITCHDEV_FDB_ADD_TO_DEVICE;
+	else
+		action = SWITCHDEV_FDB_DEL_TO_DEVICE;
+
+	rcu_read_lock();
+
+	hlist_for_each_entry_rcu(fdb, &br->fdb_list, fdb_node) {
+		err = br_fdb_replay_one(br, nb, fdb, action, ctx);
+		if (err)
+			break;
+	}
+
+	rcu_read_unlock();
+
+	return err;
+}
+
 static int nbp_switchdev_sync_objs(struct net_bridge_port *p, const void *ctx,
 				   struct notifier_block *atomic_nb,
 				   struct notifier_block *blocking_nb,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 net-next 2/7] net: switchdev: keep the MAC address by value in struct switchdev_notifier_fdb_info
  2021-08-20 11:57 [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vladimir Oltean
  2021-08-20 11:57 ` [PATCH v3 net-next 1/7] net: bridge: move br_fdb_replay inside br_switchdev.c Vladimir Oltean
@ 2021-08-20 11:57 ` Vladimir Oltean
  2021-08-20 11:57 ` [PATCH v3 net-next 3/7] net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking notifier chain Vladimir Oltean
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Vladimir Oltean @ 2021-08-20 11:57 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller
  Cc: Roopa Prabhu, Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Vadym Kochan, Taras Chornyi,
	Jiri Pirko, Ido Schimmel, UNGLinuxDriver, Grygorii Strashko,
	Marek Behun, DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens,
	Woojung Huh, Sean Wang, Landen Chao, Claudiu Manoil,
	Alexandre Belloni, George McCollister, Ioana Ciornei,
	Saeed Mahameed, Leon Romanovsky, Lars Povlsen, Steen Hegelund,
	Julian Wiedmann, Alexandra Winter, Karsten Graul, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, Ivan Vecera, Vlad Buslov,
	Jianbo Liu, Mark Bloch, Roi Dayan, Tobias Waldekranz,
	Vignesh Raghavendra, Jesse Brandeburg, linux-s390

ETH_ALEN is 6 bytes, which is 2 bytes more than a pointer on a 32-bit
machine, and 2 bytes less than a pointer on a 64-bit machine. So in
terms of memory usage, when you consider that the variant with the
pointer needs memory for the pointer _and_ the address, we would rather
embed the address within the structure directly. Not a huge difference
either way though.

But with the upcoming conversion of the SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE
notifiers from the atomic call chain to the blocking call chain, where
switchdev_deferred_enqueue is used to copy the FDB entry created by the
bridge, we can see that only a shallow copy is performed (with memcpy),
leaving behind the fdb_info->addr to point to bridge memory.

When switchdev hands out a shallow fdb_info copy to drivers where the
addr still points to bridge memory, the RCU read-side critical section
is dropped quickly, much quicker than the driver might act upon the
fdb_info copy. So the address might be freed by the bridge after an RCU
grace period expiration. This does currently not happen, because RCU
watches for the entire period during which br_switchdev_fdb_notify runs.

We want to avoid that, but we don't want to patch switchdev_deferred_enqueue
to perform a deep copy, and we don't want to create a deep copy ourselves
which we pass to switchdev_deferred_enqueue for it to make a shallow one.
Who frees the fdb_info->addr? Do we really bother to do it under RCU?

Instead we can notice that switchdev_deferred_enqueue works just fine
with struct switchdev_obj_port_mdb, because the addr is embedded inside
the structure there, and does not reference bridge memory.

So we do the same with struct switchdev_notifier_fdb_info. Making that
change means we also need to touch all drivers that operate on
fdb_info->addr.

For the most part, the change is for the better: the shallow memcpy that
most drivers do in the bridge -> driver direction is now good enough to
also copy the MAC address. So we delete a bunch of code that used to
allocate an ETH_ALEN unsigned char array, and the ether_addr_copy that
goes with it.

As for notifiers in the reverse direction (driver -> bridge), it is true
that drivers now do an extra memcpy where they previously did not, but I
have heard it is desirable for SWITCHDEV_FDB_{ADD,DEL}_TO_BRIDGE to be
converted to switchdev_deferred_enqueue in the future too, and this
change will only make that easier.

Some drivers perform a type cast from the void *ptr to a struct
switchdev_notifier_info *, then use container_of to get from info to
fdb_info, even though this is not necessary as fdb_info is guaranteed to
be the first element of info. Anyway, type safety is good to have..

Except these same drivers then proceed to memcpy into their fdb_info
structure directly from the ptr variable, and only copy from the
fdb_info pointer the MAC address... So much for type safety...

I didn't really want to touch that in this patch too, but with the
elimination of the ether_addr_copy from fdb_info, the fdb_info pointer
is really no longer used. So let's really use it for something.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v3: patch is new

 .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 14 +------------
 .../marvell/prestera/prestera_switchdev.c     | 18 +++--------------
 .../mellanox/mlx5/core/en/rep/bridge.c        | 10 ----------
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  |  2 +-
 .../ethernet/mellanox/mlxsw/spectrum_router.c |  4 ++--
 .../mellanox/mlxsw/spectrum_switchdev.c       | 11 ++--------
 .../microchip/sparx5/sparx5_mactable.c        |  2 +-
 .../microchip/sparx5/sparx5_switchdev.c       | 12 +----------
 drivers/net/ethernet/rocker/rocker_main.c     | 13 ++----------
 drivers/net/ethernet/rocker/rocker_ofdpa.c    |  2 +-
 drivers/net/ethernet/ti/am65-cpsw-switchdev.c | 14 ++-----------
 drivers/net/ethernet/ti/cpsw_switchdev.c      | 14 ++-----------
 drivers/s390/net/qeth_l2_main.c               | 11 ++++------
 include/net/switchdev.h                       |  2 +-
 net/bridge/br_switchdev.c                     | 20 ++++++++++---------
 net/dsa/slave.c                               |  2 +-
 16 files changed, 35 insertions(+), 116 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index d260993ab2dc..dd0096cc3221 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -2244,7 +2244,6 @@ static void dpaa2_switch_event_work(struct work_struct *work)
 	}
 
 	rtnl_unlock();
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(dev);
 }
@@ -2276,15 +2275,8 @@ static int dpaa2_switch_port_event(struct notifier_block *nb,
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!switchdev_work->fdb_info.addr)
-			goto err_addr_alloc;
-
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
-
 		/* Take a reference on the device to avoid being freed. */
 		dev_hold(dev);
 		break;
@@ -2296,10 +2288,6 @@ static int dpaa2_switch_port_event(struct notifier_block *nb,
 	queue_work(ethsw->workqueue, &switchdev_work->work);
 
 	return NOTIFY_DONE;
-
-err_addr_alloc:
-	kfree(switchdev_work);
-	return NOTIFY_BAD;
 }
 
 static int dpaa2_switch_port_obj_event(unsigned long event,
diff --git a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
index 3ce6ccd0f539..236b07c42df0 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
@@ -760,7 +760,7 @@ prestera_fdb_offload_notify(struct prestera_port *port,
 {
 	struct switchdev_notifier_fdb_info send_info = {};
 
-	send_info.addr = info->addr;
+	ether_addr_copy(send_info.addr, info->addr);
 	send_info.vid = info->vid;
 	send_info.offloaded = true;
 
@@ -836,7 +836,6 @@ static void prestera_fdb_event_work(struct work_struct *work)
 out_unlock:
 	rtnl_unlock();
 
-	kfree(swdev_work->fdb_info.addr);
 	kfree(swdev_work);
 	dev_put(dev);
 }
@@ -883,15 +882,8 @@ static int prestera_switchdev_event(struct notifier_block *unused,
 					info);
 
 		INIT_WORK(&swdev_work->work, prestera_fdb_event_work);
-		memcpy(&swdev_work->fdb_info, ptr,
+		memcpy(&swdev_work->fdb_info, fdb_info,
 		       sizeof(swdev_work->fdb_info));
-
-		swdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!swdev_work->fdb_info.addr)
-			goto out_bad;
-
-		ether_addr_copy((u8 *)swdev_work->fdb_info.addr,
-				fdb_info->addr);
 		dev_hold(dev);
 		break;
 
@@ -902,10 +894,6 @@ static int prestera_switchdev_event(struct notifier_block *unused,
 
 	queue_work(swdev_wq, &swdev_work->work);
 	return NOTIFY_DONE;
-
-out_bad:
-	kfree(swdev_work);
-	return NOTIFY_BAD;
 }
 
 static int
@@ -1156,7 +1144,7 @@ static void prestera_fdb_event(struct prestera_switch *sw,
 	if (!dev)
 		return;
 
-	info.addr = evt->fdb_evt.data.mac;
+	ether_addr_copy(info.addr, evt->fdb_evt.data.mac);
 	info.vid = evt->fdb_evt.vid;
 	info.offloaded = true;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
index 0c38c2e319be..3e11420d8057 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -306,7 +306,6 @@ static void
 mlx5_esw_bridge_cleanup_switchdev_fdb_work(struct mlx5_bridge_switchdev_fdb_work *fdb_work)
 {
 	dev_put(fdb_work->dev);
-	kfree(fdb_work->fdb_info.addr);
 	kfree(fdb_work);
 }
 
@@ -345,7 +344,6 @@ mlx5_esw_bridge_init_switchdev_fdb_work(struct net_device *dev, bool add,
 					struct mlx5_esw_bridge_offloads *br_offloads)
 {
 	struct mlx5_bridge_switchdev_fdb_work *work;
-	u8 *addr;
 
 	work = kzalloc(sizeof(*work), GFP_ATOMIC);
 	if (!work)
@@ -354,14 +352,6 @@ mlx5_esw_bridge_init_switchdev_fdb_work(struct net_device *dev, bool add,
 	INIT_WORK(&work->work, mlx5_esw_bridge_switchdev_fdb_event_work);
 	memcpy(&work->fdb_info, fdb_info, sizeof(work->fdb_info));
 
-	addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-	if (!addr) {
-		kfree(work);
-		return ERR_PTR(-ENOMEM);
-	}
-	ether_addr_copy(addr, fdb_info->addr);
-	work->fdb_info.addr = addr;
-
 	dev_hold(dev);
 	work->dev = dev;
 	work->br_offloads = br_offloads;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 7e221038df8d..38beef330b94 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -71,7 +71,7 @@ mlx5_esw_bridge_fdb_offload_notify(struct net_device *dev, const unsigned char *
 {
 	struct switchdev_notifier_fdb_info send_info = {};
 
-	send_info.addr = addr;
+	ether_addr_copy(send_info.addr, addr);
 	send_info.vid = vid;
 	send_info.offloaded = true;
 	call_switchdev_notifiers(val, dev, &send_info.info, NULL);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index f69cbb3852d5..9e129c93581b 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -9086,7 +9086,7 @@ static void mlxsw_sp_rif_fid_fdb_del(struct mlxsw_sp_rif *rif, const char *mac)
 	if (!dev)
 		return;
 
-	info.addr = mac;
+	ether_addr_copy(info.addr, mac);
 	info.vid = 0;
 	call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE, dev, &info.info,
 				 NULL);
@@ -9137,7 +9137,7 @@ static void mlxsw_sp_rif_vlan_fdb_del(struct mlxsw_sp_rif *rif, const char *mac)
 	if (!dev)
 		return;
 
-	info.addr = mac;
+	ether_addr_copy(info.addr, mac);
 	info.vid = vid;
 	call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE, dev, &info.info,
 				 NULL);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 22fede5cb32c..78e5059beafa 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -2522,7 +2522,7 @@ mlxsw_sp_fdb_call_notifiers(enum switchdev_notifier_type type,
 {
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = mac;
+	ether_addr_copy(info.addr, mac);
 	info.vid = vid;
 	info.offloaded = offloaded;
 	call_switchdev_notifiers(type, dev, &info.info, NULL);
@@ -3013,7 +3013,6 @@ static void mlxsw_sp_switchdev_bridge_fdb_event_work(struct work_struct *work)
 
 out:
 	rtnl_unlock();
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(dev);
 }
@@ -3256,13 +3255,8 @@ static int mlxsw_sp_switchdev_event(struct notifier_block *unused,
 					info);
 		INIT_WORK(&switchdev_work->work,
 			  mlxsw_sp_switchdev_bridge_fdb_event_work);
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!switchdev_work->fdb_info.addr)
-			goto err_addr_alloc;
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
 		/* Take a reference on the device. This can be either
 		 * upper device containig mlxsw_sp_port or just a
 		 * mlxsw_sp_port
@@ -3289,7 +3283,6 @@ static int mlxsw_sp_switchdev_event(struct notifier_block *unused,
 	return NOTIFY_DONE;
 
 err_vxlan_work_prepare:
-err_addr_alloc:
 	kfree(switchdev_work);
 	return NOTIFY_BAD;
 }
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c b/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c
index 9a8e4f201eb1..3dcb6a887ea5 100644
--- a/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c
+++ b/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c
@@ -279,7 +279,7 @@ static void sparx5_fdb_call_notifiers(enum switchdev_notifier_type type,
 {
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = mac;
+	ether_addr_copy(info.addr, mac);
 	info.vid = vid;
 	info.offloaded = offloaded;
 	call_switchdev_notifiers(type, dev, &info.info, NULL);
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
index 649ca609884a..5c5eb557a19c 100644
--- a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
+++ b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
@@ -254,7 +254,6 @@ static void sparx5_switchdev_bridge_fdb_event_work(struct work_struct *work)
 
 out:
 	rtnl_unlock();
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(dev);
 }
@@ -294,14 +293,8 @@ static int sparx5_switchdev_event(struct notifier_block *unused,
 					info);
 		INIT_WORK(&switchdev_work->work,
 			  sparx5_switchdev_bridge_fdb_event_work);
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!switchdev_work->fdb_info.addr)
-			goto err_addr_alloc;
-
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
 		dev_hold(dev);
 
 		sparx5_schedule_work(&switchdev_work->work);
@@ -309,9 +302,6 @@ static int sparx5_switchdev_event(struct notifier_block *unused,
 	}
 
 	return NOTIFY_DONE;
-err_addr_alloc:
-	kfree(switchdev_work);
-	return NOTIFY_BAD;
 }
 
 static void sparx5_sync_port_dev_addr(struct sparx5 *sparx5,
diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index 3364b6a56bd1..d490d006cc98 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -2718,7 +2718,7 @@ rocker_fdb_offload_notify(struct rocker_port *rocker_port,
 {
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = recv_info->addr;
+	ether_addr_copy(info.addr, recv_info->addr);
 	info.vid = recv_info->vid;
 	info.offloaded = true;
 	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
@@ -2757,7 +2757,6 @@ static void rocker_switchdev_event_work(struct work_struct *work)
 	}
 	rtnl_unlock();
 
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(rocker_port->dev);
 }
@@ -2789,16 +2788,8 @@ static int rocker_switchdev_event(struct notifier_block *unused,
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (unlikely(!switchdev_work->fdb_info.addr)) {
-			kfree(switchdev_work);
-			return NOTIFY_BAD;
-		}
-
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
 		/* Take a reference on the rocker device */
 		dev_hold(dev);
 		break;
diff --git a/drivers/net/ethernet/rocker/rocker_ofdpa.c b/drivers/net/ethernet/rocker/rocker_ofdpa.c
index 3e1ca7a8d029..abdb4b49add1 100644
--- a/drivers/net/ethernet/rocker/rocker_ofdpa.c
+++ b/drivers/net/ethernet/rocker/rocker_ofdpa.c
@@ -1824,7 +1824,7 @@ static void ofdpa_port_fdb_learn_work(struct work_struct *work)
 	bool learned = (lw->flags & OFDPA_OP_FLAG_LEARNED);
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = lw->addr;
+	ether_addr_copy(info.addr, lw->addr);
 	info.vid = lw->vid;
 
 	rtnl_lock();
diff --git a/drivers/net/ethernet/ti/am65-cpsw-switchdev.c b/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
index 599708a3e81d..860214e1a8ca 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
@@ -360,7 +360,7 @@ static void am65_cpsw_fdb_offload_notify(struct net_device *ndev,
 {
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = rcv->addr;
+	ether_addr_copy(info.addr, rcv->addr);
 	info.vid = rcv->vid;
 	info.offloaded = true;
 	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
@@ -414,7 +414,6 @@ static void am65_cpsw_switchdev_event_work(struct work_struct *work)
 	}
 	rtnl_unlock();
 
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(port->ndev);
 }
@@ -450,13 +449,8 @@ static int am65_cpsw_switchdev_event(struct notifier_block *unused,
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!switchdev_work->fdb_info.addr)
-			goto err_addr_alloc;
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
 		dev_hold(ndev);
 		break;
 	default:
@@ -467,10 +461,6 @@ static int am65_cpsw_switchdev_event(struct notifier_block *unused,
 	queue_work(system_long_wq, &switchdev_work->work);
 
 	return NOTIFY_DONE;
-
-err_addr_alloc:
-	kfree(switchdev_work);
-	return NOTIFY_BAD;
 }
 
 static struct notifier_block cpsw_switchdev_notifier = {
diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.c b/drivers/net/ethernet/ti/cpsw_switchdev.c
index a7d97d429e06..786bb848ddeb 100644
--- a/drivers/net/ethernet/ti/cpsw_switchdev.c
+++ b/drivers/net/ethernet/ti/cpsw_switchdev.c
@@ -370,7 +370,7 @@ static void cpsw_fdb_offload_notify(struct net_device *ndev,
 {
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = rcv->addr;
+	ether_addr_copy(info.addr, rcv->addr);
 	info.vid = rcv->vid;
 	info.offloaded = true;
 	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
@@ -424,7 +424,6 @@ static void cpsw_switchdev_event_work(struct work_struct *work)
 	}
 	rtnl_unlock();
 
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(priv->ndev);
 }
@@ -460,13 +459,8 @@ static int cpsw_switchdev_event(struct notifier_block *unused,
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!switchdev_work->fdb_info.addr)
-			goto err_addr_alloc;
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
 		dev_hold(ndev);
 		break;
 	default:
@@ -477,10 +471,6 @@ static int cpsw_switchdev_event(struct notifier_block *unused,
 	queue_work(system_long_wq, &switchdev_work->work);
 
 	return NOTIFY_DONE;
-
-err_addr_alloc:
-	kfree(switchdev_work);
-	return NOTIFY_BAD;
 }
 
 static struct notifier_block cpsw_switchdev_notifier = {
diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index 72e84ff9fea5..de98f79c11ab 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -283,7 +283,6 @@ static void qeth_l2_dev2br_fdb_flush(struct qeth_card *card)
 
 	QETH_CARD_TEXT(card, 2, "fdbflush");
 
-	info.addr = NULL;
 	/* flush all VLANs: */
 	info.vid = 0;
 	info.added_by_user = false;
@@ -637,20 +636,18 @@ static void qeth_l2_dev2br_fdb_notify(struct qeth_card *card, u8 code,
 				      struct mac_addr_lnid *addr_lnid)
 {
 	struct switchdev_notifier_fdb_info info = {};
-	u8 ntfy_mac[ETH_ALEN];
 
-	ether_addr_copy(ntfy_mac, addr_lnid->mac);
 	/* Ignore VLAN only changes */
 	if (!(code & IPA_ADDR_CHANGE_CODE_MACADDR))
 		return;
 	/* Ignore mcast entries */
-	if (is_multicast_ether_addr(ntfy_mac))
+	if (is_multicast_ether_addr(addr_lnid->mac))
 		return;
 	/* Ignore my own addresses */
 	if (qeth_is_my_net_if_token(card, token))
 		return;
 
-	info.addr = ntfy_mac;
+	ether_addr_copy(info.addr, addr_lnid->mac);
 	/* don't report VLAN IDs */
 	info.vid = 0;
 	info.added_by_user = false;
@@ -661,13 +658,13 @@ static void qeth_l2_dev2br_fdb_notify(struct qeth_card *card, u8 code,
 					 card->dev, &info.info, NULL);
 		QETH_CARD_TEXT(card, 4, "andelmac");
 		QETH_CARD_TEXT_(card, 4,
-				"mc%012lx", ether_addr_to_u64(ntfy_mac));
+				"mc%012lx", ether_addr_to_u64(info.addr));
 	} else {
 		call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE,
 					 card->dev, &info.info, NULL);
 		QETH_CARD_TEXT(card, 4, "anaddmac");
 		QETH_CARD_TEXT_(card, 4,
-				"mc%012lx", ether_addr_to_u64(ntfy_mac));
+				"mc%012lx", ether_addr_to_u64(info.addr));
 	}
 }
 
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 60d806b6a5ae..6764fb7692e2 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -218,7 +218,7 @@ struct switchdev_notifier_info {
 
 struct switchdev_notifier_fdb_info {
 	struct switchdev_notifier_info info; /* must be first */
-	const unsigned char *addr;
+	unsigned char addr[ETH_ALEN];
 	u16 vid;
 	u8 added_by_user:1,
 	   is_local:1,
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 8a45b1cfe06f..7e62904089c8 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -127,14 +127,16 @@ br_switchdev_fdb_notify(struct net_bridge *br,
 			const struct net_bridge_fdb_entry *fdb, int type)
 {
 	const struct net_bridge_port *dst = READ_ONCE(fdb->dst);
-	struct switchdev_notifier_fdb_info info = {
-		.addr = fdb->key.addr.addr,
-		.vid = fdb->key.vlan_id,
-		.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags),
-		.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags),
-		.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags),
-	};
-	struct net_device *dev = (!dst || info.is_local) ? br->dev : dst->dev;
+	struct switchdev_notifier_fdb_info info = {};
+	struct net_device *dev;
+
+	ether_addr_copy(info.addr, fdb->key.addr.addr);
+	info.vid = fdb->key.vlan_id;
+	info.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
+	info.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
+	info.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
+
+	dev = (!dst || info.is_local) ? br->dev : dst->dev;
 
 	switch (type) {
 	case RTM_DELNEIGH:
@@ -278,7 +280,7 @@ static int br_fdb_replay_one(struct net_bridge *br, struct notifier_block *nb,
 	struct switchdev_notifier_fdb_info item;
 	int err;
 
-	item.addr = fdb->key.addr.addr;
+	ether_addr_copy(item.addr, fdb->key.addr.addr);
 	item.vid = fdb->key.vlan_id;
 	item.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
 	item.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index eb9d9e53c536..7bc88767db9d 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2288,7 +2288,7 @@ dsa_fdb_offload_notify(struct dsa_switchdev_event_work *switchdev_work)
 	if (!dsa_is_user_port(ds, switchdev_work->port))
 		return;
 
-	info.addr = switchdev_work->addr;
+	ether_addr_copy(info.addr, switchdev_work->addr);
 	info.vid = switchdev_work->vid;
 	info.offloaded = true;
 	dp = dsa_to_port(ds, switchdev_work->port);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 net-next 3/7] net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking notifier chain
  2021-08-20 11:57 [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vladimir Oltean
  2021-08-20 11:57 ` [PATCH v3 net-next 1/7] net: bridge: move br_fdb_replay inside br_switchdev.c Vladimir Oltean
  2021-08-20 11:57 ` [PATCH v3 net-next 2/7] net: switchdev: keep the MAC address by value in struct switchdev_notifier_fdb_info Vladimir Oltean
@ 2021-08-20 11:57 ` Vladimir Oltean
  2021-08-20 15:54   ` Jakub Kicinski
  2021-08-20 11:57 ` [PATCH v3 net-next 4/7] net: bridge: switchdev: make br_fdb_replay offer sleepable context to consumers Vladimir Oltean
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 14+ messages in thread
From: Vladimir Oltean @ 2021-08-20 11:57 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller
  Cc: Roopa Prabhu, Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Vadym Kochan, Taras Chornyi,
	Jiri Pirko, Ido Schimmel, UNGLinuxDriver, Grygorii Strashko,
	Marek Behun, DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens,
	Woojung Huh, Sean Wang, Landen Chao, Claudiu Manoil,
	Alexandre Belloni, George McCollister, Ioana Ciornei,
	Saeed Mahameed, Leon Romanovsky, Lars Povlsen, Steen Hegelund,
	Julian Wiedmann, Alexandra Winter, Karsten Graul, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, Ivan Vecera, Vlad Buslov,
	Jianbo Liu, Mark Bloch, Roi Dayan, Tobias Waldekranz,
	Vignesh Raghavendra, Jesse Brandeburg, linux-s390

Currently, br_switchdev_fdb_notify() uses call_switchdev_notifiers (and
br_fdb_replay() open-codes the same thing). This means that drivers
handle the SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE events on the atomic
switchdev notifier block.

Most existing switchdev drivers either talk to firmware, or to a device
over a bus where the I/O is sleepable (SPI, I2C, MDIO etc). So there
exists an (anti)pattern where drivers make a sleepable context for
offloading the given FDB entry by registering an ordered workqueue and
scheduling work items on it, and doing all the work from there.

The problem is the inherent limitation that this design imposes upon
what a switchdev driver can do with those FDB entries.

For example, a switchdev driver might want to perform FDB isolation,
i.e. associate each FDB entry with the bridge it belongs to. Maybe the
driver associates each bridge with a number, allocating that number when
the first port of the driver joins that bridge, and freeing it when the
last port leaves it.

And this is where the problem is. When user space deletes a bridge and
all the ports leave, the bridge will notify us of the deletion of all
FDB entries in atomic context, and switchdev drivers will schedule their
private work items on their private workqueue.

The FDB entry deletion notifications will succeed, the bridge will then
finish deleting itself, but the switchdev work items have not run yet.
When they will eventually get scheduled, the aforementioned association
between the bridge_dev and a number will have already been broken by the
switchdev driver. All ports are standalone now, the bridge is a foreign
interface!

One might say "why don't you cache all your associations while you're
still in the atomic context and they're still valid, pass them by value
through your switchdev_work and work with the cached values as opposed
to the current ones?"

This option smells of poor design, because instead of fixing a central
problem, we add tens of lateral workarounds to avoid it. It should be
easier to use switchdev, not harder, and we should look at the common
patterns which lead to code duplication and eliminate them.

In this case, we must notice that
(a) switchdev already has the concept of notifiers emitted from the fast
    path that are still processed by drivers from blocking context. This
    is accomplished through the SWITCHDEV_F_DEFER flag which is used by
    e.g. SWITCHDEV_OBJ_ID_HOST_MDB.
(b) the bridge del_nbp() function already calls switchdev_deferred_process().
    So if we could hook into that, we could have a chance that the
    bridge simply waits for our FDB entry offloading procedure to finish
    before it calls netdev_upper_dev_unlink() - which is almost
    immediately afterwards, and also when switchdev drivers typically
    break their stateful associations between the bridge upper and
    private data.

So it is in fact possible to use switchdev's generic
switchdev_deferred_enqueue mechanism to get a sleepable callback, and
from there we can call_switchdev_blocking_notifiers().

In the case of br_fdb_replay(), the only code path is from
switchdev_bridge_port_offload(), which is already in blocking context.
So we don't need to go through switchdev_deferred_enqueue, and we can
just call the blocking notifier block directly.

To preserve the same behavior as before, all drivers need to have their
SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handlers moved from their switchdev
atomic notifier blocks to the blocking ones. This patch attempts to make
that trivial movement. Note that now they might schedule a work item for
nothing (since they are now called from a work item themselves), but I
don't have the energy or hardware to test all of them, so this will have
to do.

Note that previously, we were under rcu_read_lock() but now we're not.
I have eyeballed the drivers that make any sort of RCU assumption and
for the most part, enclosed them between a private pair of
rcu_read_lock() and rcu_read_unlock(). The RCU protection can be dropped
from the other drivers when they are reworked to stop scheduling.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v3:
- make sure that a shallow memcpy is enough to not have dangling bridge
  memory inside struct switchdev_notifier_fdb_info.
- removed a bogus mlx5_esw_bridge_is_local (suggested by Vlad Buslov)
- actually compile-tested the S/390 qeth driver

 .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 61 +++++++-------
 .../marvell/prestera/prestera_switchdev.c     | 82 ++++++++++---------
 .../mellanox/mlx5/core/en/rep/bridge.c        | 55 ++++++++++++-
 .../mellanox/mlxsw/spectrum_switchdev.c       | 47 ++++++++++-
 .../microchip/sparx5/sparx5_switchdev.c       | 58 +++++++------
 drivers/net/ethernet/rocker/rocker_main.c     | 56 +++++++------
 drivers/net/ethernet/ti/am65-cpsw-switchdev.c | 42 +++++-----
 drivers/net/ethernet/ti/cpsw_switchdev.c      | 45 +++++-----
 drivers/s390/net/qeth_l2_main.c               | 15 +++-
 include/net/switchdev.h                       | 25 +++++-
 net/bridge/br_switchdev.c                     | 12 +--
 net/dsa/slave.c                               | 32 ++++----
 net/switchdev/switchdev.c                     | 47 +++++++++++
 13 files changed, 385 insertions(+), 192 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index dd0096cc3221..d0d63da7f01f 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -2253,40 +2253,10 @@ static int dpaa2_switch_port_event(struct notifier_block *nb,
 				   unsigned long event, void *ptr)
 {
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
-	struct ethsw_port_priv *port_priv = netdev_priv(dev);
-	struct ethsw_switchdev_event_work *switchdev_work;
-	struct switchdev_notifier_fdb_info *fdb_info = ptr;
-	struct ethsw_core *ethsw = port_priv->ethsw_data;
 
 	if (event == SWITCHDEV_PORT_ATTR_SET)
 		return dpaa2_switch_port_attr_set_event(dev, ptr);
 
-	if (!dpaa2_switch_port_dev_check(dev))
-		return NOTIFY_DONE;
-
-	switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
-	if (!switchdev_work)
-		return NOTIFY_BAD;
-
-	INIT_WORK(&switchdev_work->work, dpaa2_switch_event_work);
-	switchdev_work->dev = dev;
-	switchdev_work->event = event;
-
-	switch (event) {
-	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, fdb_info,
-		       sizeof(switchdev_work->fdb_info));
-		/* Take a reference on the device to avoid being freed. */
-		dev_hold(dev);
-		break;
-	default:
-		kfree(switchdev_work);
-		return NOTIFY_DONE;
-	}
-
-	queue_work(ethsw->workqueue, &switchdev_work->work);
-
 	return NOTIFY_DONE;
 }
 
@@ -2312,6 +2282,34 @@ static int dpaa2_switch_port_obj_event(unsigned long event,
 	return notifier_from_errno(err);
 }
 
+static int dpaa2_switch_fdb_event(unsigned long event,
+				  struct net_device *dev,
+				  struct switchdev_notifier_fdb_info *fdb_info)
+{
+	struct ethsw_port_priv *port_priv = netdev_priv(dev);
+	struct ethsw_switchdev_event_work *switchdev_work;
+	struct ethsw_core *ethsw = port_priv->ethsw_data;
+
+	if (!dpaa2_switch_port_dev_check(dev))
+		return NOTIFY_DONE;
+
+	switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
+	if (!switchdev_work)
+		return NOTIFY_BAD;
+
+	INIT_WORK(&switchdev_work->work, dpaa2_switch_event_work);
+	switchdev_work->dev = dev;
+	switchdev_work->event = event;
+	memcpy(&switchdev_work->fdb_info, fdb_info, sizeof(*fdb_info));
+
+	/* Take a reference on the device to avoid being freed. */
+	dev_hold(dev);
+
+	queue_work(ethsw->workqueue, &switchdev_work->work);
+
+	return NOTIFY_DONE;
+}
+
 static int dpaa2_switch_port_blocking_event(struct notifier_block *nb,
 					    unsigned long event, void *ptr)
 {
@@ -2323,6 +2321,9 @@ static int dpaa2_switch_port_blocking_event(struct notifier_block *nb,
 		return dpaa2_switch_port_obj_event(event, dev, ptr);
 	case SWITCHDEV_PORT_ATTR_SET:
 		return dpaa2_switch_port_attr_set_event(dev, ptr);
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		return dpaa2_switch_fdb_event(event, dev, ptr);
 	}
 
 	return NOTIFY_DONE;
diff --git a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
index 236b07c42df0..a89cda394685 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
@@ -844,10 +844,6 @@ static int prestera_switchdev_event(struct notifier_block *unused,
 				    unsigned long event, void *ptr)
 {
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
-	struct switchdev_notifier_fdb_info *fdb_info;
-	struct switchdev_notifier_info *info = ptr;
-	struct prestera_fdb_event_work *swdev_work;
-	struct net_device *upper;
 	int err;
 
 	if (event == SWITCHDEV_PORT_ATTR_SET) {
@@ -857,42 +853,6 @@ static int prestera_switchdev_event(struct notifier_block *unused,
 		return notifier_from_errno(err);
 	}
 
-	if (!prestera_netdev_check(dev))
-		return NOTIFY_DONE;
-
-	upper = netdev_master_upper_dev_get_rcu(dev);
-	if (!upper)
-		return NOTIFY_DONE;
-
-	if (!netif_is_bridge_master(upper))
-		return NOTIFY_DONE;
-
-	swdev_work = kzalloc(sizeof(*swdev_work), GFP_ATOMIC);
-	if (!swdev_work)
-		return NOTIFY_BAD;
-
-	swdev_work->event = event;
-	swdev_work->dev = dev;
-
-	switch (event) {
-	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		fdb_info = container_of(info,
-					struct switchdev_notifier_fdb_info,
-					info);
-
-		INIT_WORK(&swdev_work->work, prestera_fdb_event_work);
-		memcpy(&swdev_work->fdb_info, fdb_info,
-		       sizeof(swdev_work->fdb_info));
-		dev_hold(dev);
-		break;
-
-	default:
-		kfree(swdev_work);
-		return NOTIFY_DONE;
-	}
-
-	queue_work(swdev_wq, &swdev_work->work);
 	return NOTIFY_DONE;
 }
 
@@ -1089,6 +1049,42 @@ static int prestera_port_obj_del(struct net_device *dev, const void *ctx,
 	}
 }
 
+static int prestera_switchdev_fdb_event(struct net_device *dev,
+					unsigned long event,
+					struct switchdev_notifier_info *info)
+{
+	struct switchdev_notifier_fdb_info *fdb_info;
+	struct prestera_fdb_event_work *swdev_work;
+	struct net_device *upper;
+
+	if (!prestera_netdev_check(dev))
+		return 0;
+
+	upper = netdev_master_upper_dev_get_rcu(dev);
+	if (!upper)
+		return 0;
+
+	if (!netif_is_bridge_master(upper))
+		return 0;
+
+	swdev_work = kzalloc(sizeof(*swdev_work), GFP_ATOMIC);
+	if (!swdev_work)
+		return -ENOMEM;
+
+	swdev_work->event = event;
+	swdev_work->dev = dev;
+
+	fdb_info = container_of(info, struct switchdev_notifier_fdb_info,
+				info);
+
+	INIT_WORK(&swdev_work->work, prestera_fdb_event_work);
+	memcpy(&swdev_work->fdb_info, fdb_info, sizeof(*fdb_info));
+	dev_hold(dev);
+
+	queue_work(swdev_wq, &swdev_work->work);
+	return 0;
+}
+
 static int prestera_switchdev_blk_event(struct notifier_block *unused,
 					unsigned long event, void *ptr)
 {
@@ -1111,6 +1107,12 @@ static int prestera_switchdev_blk_event(struct notifier_block *unused,
 						     prestera_netdev_check,
 						     prestera_port_obj_attr_set);
 		break;
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		rcu_read_lock();
+		err = prestera_switchdev_fdb_event(dev, event, ptr);
+		rcu_read_unlock();
+		break;
 	default:
 		err = -EOPNOTSUPP;
 	}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
index 3e11420d8057..06bb8bb8e39e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -276,6 +276,51 @@ mlx5_esw_bridge_port_obj_attr_set(struct net_device *dev,
 	return err;
 }
 
+static struct mlx5_bridge_switchdev_fdb_work *
+mlx5_esw_bridge_init_switchdev_fdb_work(struct net_device *dev, bool add,
+					struct switchdev_notifier_fdb_info *fdb_info,
+					struct mlx5_esw_bridge_offloads *br_offloads);
+
+static int
+mlx5_esw_bridge_fdb_event(struct net_device *dev, unsigned long event,
+			  struct switchdev_notifier_info *info,
+			  struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct switchdev_notifier_fdb_info *fdb_info;
+	struct mlx5_bridge_switchdev_fdb_work *work;
+	struct mlx5_eswitch *esw = br_offloads->esw;
+	u16 vport_num, esw_owner_vhca_id;
+	struct net_device *upper, *rep;
+
+	upper = netdev_master_upper_dev_get_rcu(dev);
+	if (!upper)
+		return 0;
+	if (!netif_is_bridge_master(upper))
+		return 0;
+
+	rep = mlx5_esw_bridge_rep_vport_num_vhca_id_get(dev, esw,
+							&vport_num,
+							&esw_owner_vhca_id);
+	if (!rep)
+		return 0;
+
+	fdb_info = container_of(info, struct switchdev_notifier_fdb_info, info);
+
+	work = mlx5_esw_bridge_init_switchdev_fdb_work(dev,
+						       event == SWITCHDEV_FDB_ADD_TO_DEVICE,
+						       fdb_info,
+						       br_offloads);
+	if (IS_ERR(work)) {
+		WARN_ONCE(1, "Failed to init switchdev work, err=%ld",
+			  PTR_ERR(work));
+		return PTR_ERR(work);
+	}
+
+	queue_work(br_offloads->wq, &work->work);
+
+	return 0;
+}
+
 static int mlx5_esw_bridge_event_blocking(struct notifier_block *nb,
 					  unsigned long event, void *ptr)
 {
@@ -295,6 +340,12 @@ static int mlx5_esw_bridge_event_blocking(struct notifier_block *nb,
 	case SWITCHDEV_PORT_ATTR_SET:
 		err = mlx5_esw_bridge_port_obj_attr_set(dev, ptr, br_offloads);
 		break;
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		rcu_read_lock();
+		err = mlx5_esw_bridge_fdb_event(dev, event, ptr, br_offloads);
+		rcu_read_unlock();
+		break;
 	default:
 		err = 0;
 	}
@@ -405,9 +456,7 @@ static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
 		/* only handle the event on peers */
 		if (mlx5_esw_bridge_is_local(dev, rep, esw))
 			break;
-		fallthrough;
-	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+
 		fdb_info = container_of(info,
 					struct switchdev_notifier_fdb_info,
 					info);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 78e5059beafa..fbaed9de3929 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -3246,8 +3246,6 @@ static int mlxsw_sp_switchdev_event(struct notifier_block *unused,
 	switchdev_work->event = event;
 
 	switch (event) {
-	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
 	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
 	case SWITCHDEV_FDB_DEL_TO_BRIDGE:
 		fdb_info = container_of(info,
@@ -3506,6 +3504,45 @@ mlxsw_sp_switchdev_handle_vxlan_obj_del(struct net_device *vxlan_dev,
 	}
 }
 
+static int mlxsw_sp_switchdev_fdb_event(struct net_device *dev, unsigned long event,
+					struct switchdev_notifier_info *info)
+{
+	struct mlxsw_sp_switchdev_event_work *switchdev_work;
+	struct switchdev_notifier_fdb_info *fdb_info;
+	struct net_device *br_dev;
+
+	/* Tunnel devices are not our uppers, so check their master instead */
+	br_dev = netdev_master_upper_dev_get_rcu(dev);
+	if (!br_dev)
+		return 0;
+	if (!netif_is_bridge_master(br_dev))
+		return 0;
+	if (!mlxsw_sp_port_dev_lower_find_rcu(br_dev))
+		return 0;
+
+	switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
+	if (!switchdev_work)
+		return -ENOMEM;
+
+	switchdev_work->dev = dev;
+	switchdev_work->event = event;
+
+	fdb_info = container_of(info, struct switchdev_notifier_fdb_info,
+				info);
+	INIT_WORK(&switchdev_work->work,
+		  mlxsw_sp_switchdev_bridge_fdb_event_work);
+	memcpy(&switchdev_work->fdb_info, fdb_info, sizeof(*fdb_info));
+	/* Take a reference on the device. This can be either
+	 * upper device containig mlxsw_sp_port or just a
+	 * mlxsw_sp_port
+	 */
+	dev_hold(dev);
+
+	mlxsw_core_schedule_work(&switchdev_work->work);
+
+	return 0;
+}
+
 static int mlxsw_sp_switchdev_blocking_event(struct notifier_block *unused,
 					     unsigned long event, void *ptr)
 {
@@ -3534,6 +3571,12 @@ static int mlxsw_sp_switchdev_blocking_event(struct notifier_block *unused,
 						     mlxsw_sp_port_dev_check,
 						     mlxsw_sp_port_attr_set);
 		return notifier_from_errno(err);
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		rcu_read_lock();
+		err = mlxsw_sp_switchdev_fdb_event(dev, event, ptr);
+		rcu_read_unlock();
+		return notifier_from_errno(err);
 	}
 
 	return NOTIFY_DONE;
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
index 5c5eb557a19c..0d19f2be0895 100644
--- a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
+++ b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
@@ -267,9 +267,6 @@ static int sparx5_switchdev_event(struct notifier_block *unused,
 				  unsigned long event, void *ptr)
 {
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
-	struct sparx5_switchdev_event_work *switchdev_work;
-	struct switchdev_notifier_fdb_info *fdb_info;
-	struct switchdev_notifier_info *info = ptr;
 	int err;
 
 	switch (event) {
@@ -278,27 +275,6 @@ static int sparx5_switchdev_event(struct notifier_block *unused,
 						     sparx5_netdevice_check,
 						     sparx5_port_attr_set);
 		return notifier_from_errno(err);
-	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-		fallthrough;
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
-		if (!switchdev_work)
-			return NOTIFY_BAD;
-
-		switchdev_work->dev = dev;
-		switchdev_work->event = event;
-
-		fdb_info = container_of(info,
-					struct switchdev_notifier_fdb_info,
-					info);
-		INIT_WORK(&switchdev_work->work,
-			  sparx5_switchdev_bridge_fdb_event_work);
-		memcpy(&switchdev_work->fdb_info, fdb_info,
-		       sizeof(switchdev_work->fdb_info));
-		dev_hold(dev);
-
-		sparx5_schedule_work(&switchdev_work->work);
-		break;
 	}
 
 	return NOTIFY_DONE;
@@ -449,6 +425,36 @@ static int sparx5_handle_port_obj_del(struct net_device *dev,
 	return err;
 }
 
+static int sparx5_switchdev_fdb_event(struct net_device *dev, unsigned long event,
+				      struct switchdev_notifier_info *info)
+{
+	struct sparx5_switchdev_event_work *switchdev_work;
+	struct switchdev_notifier_fdb_info *fdb_info;
+
+	switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
+	if (!switchdev_work)
+		return -ENOMEM;
+
+	switchdev_work->dev = dev;
+	switchdev_work->event = event;
+
+	fdb_info = container_of(info,
+				struct switchdev_notifier_fdb_info,
+				info);
+	INIT_WORK(&switchdev_work->work,
+		  sparx5_switchdev_bridge_fdb_event_work);
+	memcpy(&switchdev_work->fdb_info, fdb_info, sizeof(*fdb_info));
+	dev_hold(dev);
+
+	sparx5_schedule_work(&switchdev_work->work);
+
+	return 0;
+
+err_addr_alloc:
+	kfree(switchdev_work);
+	return -ENOMEM;
+}
+
 static int sparx5_switchdev_blocking_event(struct notifier_block *nb,
 					   unsigned long event,
 					   void *ptr)
@@ -468,6 +474,10 @@ static int sparx5_switchdev_blocking_event(struct notifier_block *nb,
 						     sparx5_netdevice_check,
 						     sparx5_port_attr_set);
 		return notifier_from_errno(err);
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		err = sparx5_switchdev_fdb_event(dev, event, ptr);
+		return notifier_from_errno(err);
 	}
 
 	return NOTIFY_DONE;
diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index d490d006cc98..f6facbf4dc2c 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -2766,9 +2766,6 @@ static int rocker_switchdev_event(struct notifier_block *unused,
 				  unsigned long event, void *ptr)
 {
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
-	struct rocker_switchdev_event_work *switchdev_work;
-	struct switchdev_notifier_fdb_info *fdb_info = ptr;
-	struct rocker_port *rocker_port;
 
 	if (!rocker_port_dev_check(dev))
 		return NOTIFY_DONE;
@@ -2776,30 +2773,6 @@ static int rocker_switchdev_event(struct notifier_block *unused,
 	if (event == SWITCHDEV_PORT_ATTR_SET)
 		return rocker_switchdev_port_attr_set_event(dev, ptr);
 
-	rocker_port = netdev_priv(dev);
-	switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
-	if (WARN_ON(!switchdev_work))
-		return NOTIFY_BAD;
-
-	INIT_WORK(&switchdev_work->work, rocker_switchdev_event_work);
-	switchdev_work->rocker_port = rocker_port;
-	switchdev_work->event = event;
-
-	switch (event) {
-	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, fdb_info,
-		       sizeof(switchdev_work->fdb_info));
-		/* Take a reference on the rocker device */
-		dev_hold(dev);
-		break;
-	default:
-		kfree(switchdev_work);
-		return NOTIFY_DONE;
-	}
-
-	queue_work(rocker_port->rocker->rocker_owq,
-		   &switchdev_work->work);
 	return NOTIFY_DONE;
 }
 
@@ -2822,6 +2795,32 @@ rocker_switchdev_port_obj_event(unsigned long event, struct net_device *netdev,
 	return notifier_from_errno(err);
 }
 
+static int
+rocker_switchdev_fdb_event(unsigned long event, struct net_device *dev,
+			   struct switchdev_notifier_fdb_info *fdb_info)
+{
+	struct rocker_switchdev_event_work *switchdev_work;
+	struct rocker_port *rocker_port;
+
+	rocker_port = netdev_priv(dev);
+	switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
+	if (WARN_ON(!switchdev_work))
+		return NOTIFY_BAD;
+
+	INIT_WORK(&switchdev_work->work, rocker_switchdev_event_work);
+	switchdev_work->rocker_port = rocker_port;
+	switchdev_work->event = event;
+
+	memcpy(&switchdev_work->fdb_info, fdb_info, sizeof(*fdb_info));
+	/* Take a reference on the rocker device */
+	dev_hold(dev);
+
+	queue_work(rocker_port->rocker->rocker_owq,
+		   &switchdev_work->work);
+
+	return NOTIFY_DONE;
+}
+
 static int rocker_switchdev_blocking_event(struct notifier_block *unused,
 					   unsigned long event, void *ptr)
 {
@@ -2836,6 +2835,9 @@ static int rocker_switchdev_blocking_event(struct notifier_block *unused,
 		return rocker_switchdev_port_obj_event(event, dev, ptr);
 	case SWITCHDEV_PORT_ATTR_SET:
 		return rocker_switchdev_port_attr_set_event(dev, ptr);
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		return rocker_switchdev_fdb_event(event, dev, ptr);
 	}
 
 	return NOTIFY_DONE;
diff --git a/drivers/net/ethernet/ti/am65-cpsw-switchdev.c b/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
index 860214e1a8ca..5087d529c93e 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
@@ -423,9 +423,6 @@ static int am65_cpsw_switchdev_event(struct notifier_block *unused,
 				     unsigned long event, void *ptr)
 {
 	struct net_device *ndev = switchdev_notifier_info_to_dev(ptr);
-	struct am65_cpsw_switchdev_event_work *switchdev_work;
-	struct am65_cpsw_port *port = am65_ndev_to_port(ndev);
-	struct switchdev_notifier_fdb_info *fdb_info = ptr;
 	int err;
 
 	if (event == SWITCHDEV_PORT_ATTR_SET) {
@@ -435,38 +432,39 @@ static int am65_cpsw_switchdev_event(struct notifier_block *unused,
 		return notifier_from_errno(err);
 	}
 
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block cpsw_switchdev_notifier = {
+	.notifier_call = am65_cpsw_switchdev_event,
+};
+
+static int am65_cpsw_switchdev_fdb_event(struct net_device *ndev,
+					 unsigned long event,
+					 struct switchdev_notifier_fdb_info *fdb_info)
+{
+	struct am65_cpsw_switchdev_event_work *switchdev_work;
+	struct am65_cpsw_port *port = am65_ndev_to_port(ndev);
+
 	if (!am65_cpsw_port_dev_check(ndev))
-		return NOTIFY_DONE;
+		return 0;
 
 	switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
 	if (WARN_ON(!switchdev_work))
-		return NOTIFY_BAD;
+		return -ENOMEM;
 
 	INIT_WORK(&switchdev_work->work, am65_cpsw_switchdev_event_work);
 	switchdev_work->port = port;
 	switchdev_work->event = event;
 
-	switch (event) {
-	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, fdb_info,
-		       sizeof(switchdev_work->fdb_info));
-		dev_hold(ndev);
-		break;
-	default:
-		kfree(switchdev_work);
-		return NOTIFY_DONE;
-	}
+	memcpy(&switchdev_work->fdb_info, fdb_info, sizeof(*fdb_info));
+	dev_hold(ndev);
 
 	queue_work(system_long_wq, &switchdev_work->work);
 
 	return NOTIFY_DONE;
 }
 
-static struct notifier_block cpsw_switchdev_notifier = {
-	.notifier_call = am65_cpsw_switchdev_event,
-};
-
 static int am65_cpsw_switchdev_blocking_event(struct notifier_block *unused,
 					      unsigned long event, void *ptr)
 {
@@ -489,6 +487,10 @@ static int am65_cpsw_switchdev_blocking_event(struct notifier_block *unused,
 						     am65_cpsw_port_dev_check,
 						     am65_cpsw_port_attr_set);
 		return notifier_from_errno(err);
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		err = am65_cpsw_switchdev_fdb_event(dev, event, ptr);
+		return notifier_from_errno(err);
 	default:
 		break;
 	}
diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.c b/drivers/net/ethernet/ti/cpsw_switchdev.c
index 786bb848ddeb..8d463d1283f9 100644
--- a/drivers/net/ethernet/ti/cpsw_switchdev.c
+++ b/drivers/net/ethernet/ti/cpsw_switchdev.c
@@ -433,9 +433,6 @@ static int cpsw_switchdev_event(struct notifier_block *unused,
 				unsigned long event, void *ptr)
 {
 	struct net_device *ndev = switchdev_notifier_info_to_dev(ptr);
-	struct switchdev_notifier_fdb_info *fdb_info = ptr;
-	struct cpsw_switchdev_event_work *switchdev_work;
-	struct cpsw_priv *priv = netdev_priv(ndev);
 	int err;
 
 	if (event == SWITCHDEV_PORT_ATTR_SET) {
@@ -445,38 +442,40 @@ static int cpsw_switchdev_event(struct notifier_block *unused,
 		return notifier_from_errno(err);
 	}
 
-	if (!cpsw_port_dev_check(ndev))
-		return NOTIFY_DONE;
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block cpsw_switchdev_notifier = {
+	.notifier_call = cpsw_switchdev_event,
+};
+
+static int cpsw_switchdev_fdb_event(struct net_device *dev, unsigned long event,
+				    struct switchdev_notifier_fdb_info *fdb_info)
+{
+	struct cpsw_switchdev_event_work *switchdev_work;
+	struct cpsw_priv *priv;
+
+	if (!cpsw_port_dev_check(dev))
+		return 0;
+
+	priv = netdev_priv(dev);
 
 	switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
 	if (WARN_ON(!switchdev_work))
-		return NOTIFY_BAD;
+		return -ENOMEM;
 
 	INIT_WORK(&switchdev_work->work, cpsw_switchdev_event_work);
 	switchdev_work->priv = priv;
 	switchdev_work->event = event;
 
-	switch (event) {
-	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, fdb_info,
-		       sizeof(switchdev_work->fdb_info));
-		dev_hold(ndev);
-		break;
-	default:
-		kfree(switchdev_work);
-		return NOTIFY_DONE;
-	}
+	memcpy(&switchdev_work->fdb_info, fdb_info, sizeof(*fdb_info));
+	dev_hold(dev);
 
 	queue_work(system_long_wq, &switchdev_work->work);
 
 	return NOTIFY_DONE;
 }
 
-static struct notifier_block cpsw_switchdev_notifier = {
-	.notifier_call = cpsw_switchdev_event,
-};
-
 static int cpsw_switchdev_blocking_event(struct notifier_block *unused,
 					 unsigned long event, void *ptr)
 {
@@ -499,6 +498,10 @@ static int cpsw_switchdev_blocking_event(struct notifier_block *unused,
 						     cpsw_port_dev_check,
 						     cpsw_port_attr_set);
 		return notifier_from_errno(err);
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		err = cpsw_switchdev_fdb_event(dev, event, ptr);
+		return notifier_from_errno(err);
 	default:
 		break;
 	}
diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index de98f79c11ab..98678dc9d054 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -863,10 +863,15 @@ static int qeth_l2_switchdev_event(struct notifier_block *unused,
 	      event == SWITCHDEV_FDB_DEL_TO_DEVICE))
 		return NOTIFY_DONE;
 
+	rcu_read_lock();
+
 	dstdev = switchdev_notifier_info_to_dev(info);
 	brdev = netdev_master_upper_dev_get_rcu(dstdev);
-	if (!brdev || !netif_is_bridge_master(brdev))
+	if (!brdev || !netif_is_bridge_master(brdev)) {
+		rcu_read_unlock();
 		return NOTIFY_DONE;
+	}
+
 	fdb_info = container_of(info,
 				struct switchdev_notifier_fdb_info,
 				info);
@@ -881,11 +886,15 @@ static int qeth_l2_switchdev_event(struct notifier_block *unused,
 						       fdb_info->addr);
 			if (rc) {
 				QETH_CARD_TEXT(card, 2, "b2dqwerr");
+				rcu_read_unlock();
 				return NOTIFY_BAD;
 			}
 		}
 		lowerdev = netdev_next_lower_dev_rcu(brdev, &iter);
 	}
+
+	rcu_read_unlock();
+
 	return NOTIFY_DONE;
 }
 
@@ -901,7 +910,7 @@ static void qeth_l2_br2dev_get(void)
 	int rc;
 
 	if (!refcount_inc_not_zero(&qeth_l2_switchdev_notify_refcnt)) {
-		rc = register_switchdev_notifier(&qeth_l2_sw_notifier);
+		rc = register_switchdev_blocking_notifier(&qeth_l2_sw_notifier);
 		if (rc) {
 			QETH_DBF_MESSAGE(2,
 					 "failed to register qeth_l2_sw_notifier: %d\n",
@@ -921,7 +930,7 @@ static void qeth_l2_br2dev_put(void)
 	int rc;
 
 	if (refcount_dec_and_test(&qeth_l2_switchdev_notify_refcnt)) {
-		rc = unregister_switchdev_notifier(&qeth_l2_sw_notifier);
+		rc = unregister_switchdev_blocking_notifier(&qeth_l2_sw_notifier);
 		if (rc) {
 			QETH_DBF_MESSAGE(2,
 					 "failed to unregister qeth_l2_sw_notifier: %d\n",
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 6764fb7692e2..e27da5bd665f 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -191,8 +191,8 @@ struct switchdev_brport {
 enum switchdev_notifier_type {
 	SWITCHDEV_FDB_ADD_TO_BRIDGE = 1,
 	SWITCHDEV_FDB_DEL_TO_BRIDGE,
-	SWITCHDEV_FDB_ADD_TO_DEVICE,
-	SWITCHDEV_FDB_DEL_TO_DEVICE,
+	SWITCHDEV_FDB_ADD_TO_DEVICE, /* Blocking. */
+	SWITCHDEV_FDB_DEL_TO_DEVICE, /* Blocking. */
 	SWITCHDEV_FDB_OFFLOADED,
 	SWITCHDEV_FDB_FLUSH_TO_BRIDGE,
 
@@ -283,6 +283,13 @@ int switchdev_port_obj_add(struct net_device *dev,
 int switchdev_port_obj_del(struct net_device *dev,
 			   const struct switchdev_obj *obj);
 
+int
+switchdev_fdb_add_to_device(struct net_device *dev,
+			    const struct switchdev_notifier_fdb_info *fdb_info);
+int
+switchdev_fdb_del_to_device(struct net_device *dev,
+			    const struct switchdev_notifier_fdb_info *fdb_info);
+
 int register_switchdev_notifier(struct notifier_block *nb);
 int unregister_switchdev_notifier(struct notifier_block *nb);
 int call_switchdev_notifiers(unsigned long val, struct net_device *dev,
@@ -386,6 +393,20 @@ static inline int switchdev_port_obj_del(struct net_device *dev,
 	return -EOPNOTSUPP;
 }
 
+static inline int
+switchdev_fdb_add_to_device(struct net_device *dev,
+			    const struct switchdev_notifier_fdb_info *fdb_info)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int
+switchdev_fdb_del_to_device(struct net_device *dev,
+			    const struct switchdev_notifier_fdb_info *fdb_info)
+{
+	return -EOPNOTSUPP;
+}
+
 static inline int register_switchdev_notifier(struct notifier_block *nb)
 {
 	return 0;
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 7e62904089c8..c7c8e23c2147 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -140,12 +140,10 @@ br_switchdev_fdb_notify(struct net_bridge *br,
 
 	switch (type) {
 	case RTM_DELNEIGH:
-		call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_DEVICE,
-					 dev, &info.info, NULL);
+		switchdev_fdb_del_to_device(dev, &info);
 		break;
 	case RTM_NEWNEIGH:
-		call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_DEVICE,
-					 dev, &info.info, NULL);
+		switchdev_fdb_add_to_device(dev, &info);
 		break;
 	}
 }
@@ -303,6 +301,8 @@ static int br_fdb_replay(const struct net_device *br_dev, const void *ctx,
 	if (!nb)
 		return 0;
 
+	ASSERT_RTNL();
+
 	if (!netif_is_bridge_master(br_dev))
 		return -EINVAL;
 
@@ -343,7 +343,7 @@ static int nbp_switchdev_sync_objs(struct net_bridge_port *p, const void *ctx,
 	if (err && err != -EOPNOTSUPP)
 		return err;
 
-	err = br_fdb_replay(br_dev, ctx, true, atomic_nb);
+	err = br_fdb_replay(br_dev, ctx, true, blocking_nb);
 	if (err && err != -EOPNOTSUPP)
 		return err;
 
@@ -362,7 +362,7 @@ static void nbp_switchdev_unsync_objs(struct net_bridge_port *p,
 
 	br_mdb_replay(br_dev, dev, ctx, false, blocking_nb, NULL);
 
-	br_fdb_replay(br_dev, ctx, false, atomic_nb);
+	br_fdb_replay(br_dev, ctx, false, blocking_nb);
 }
 
 /* Let the bridge know that this port is offloaded, so that it can assign a
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 7bc88767db9d..6601224f6a5a 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2454,20 +2454,6 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 						     dsa_slave_dev_check,
 						     dsa_slave_port_attr_set);
 		return notifier_from_errno(err);
-	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-		err = switchdev_handle_fdb_add_to_device(dev, ptr,
-							 dsa_slave_dev_check,
-							 dsa_foreign_dev_check,
-							 dsa_slave_fdb_add_to_device,
-							 NULL);
-		return notifier_from_errno(err);
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		err = switchdev_handle_fdb_del_to_device(dev, ptr,
-							 dsa_slave_dev_check,
-							 dsa_foreign_dev_check,
-							 dsa_slave_fdb_del_to_device,
-							 NULL);
-		return notifier_from_errno(err);
 	default:
 		return NOTIFY_DONE;
 	}
@@ -2497,6 +2483,24 @@ static int dsa_slave_switchdev_blocking_event(struct notifier_block *unused,
 						     dsa_slave_dev_check,
 						     dsa_slave_port_attr_set);
 		return notifier_from_errno(err);
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+		rcu_read_lock();
+		err = switchdev_handle_fdb_add_to_device(dev, ptr,
+							 dsa_slave_dev_check,
+							 dsa_foreign_dev_check,
+							 dsa_slave_fdb_add_to_device,
+							 NULL);
+		rcu_read_unlock();
+		return notifier_from_errno(err);
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		rcu_read_lock();
+		err = switchdev_handle_fdb_del_to_device(dev, ptr,
+							 dsa_slave_dev_check,
+							 dsa_foreign_dev_check,
+							 dsa_slave_fdb_del_to_device,
+							 NULL);
+		rcu_read_unlock();
+		return notifier_from_errno(err);
 	}
 
 	return NOTIFY_DONE;
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 0b2c18efc079..c34c6abceec6 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -378,6 +378,53 @@ int call_switchdev_blocking_notifiers(unsigned long val, struct net_device *dev,
 }
 EXPORT_SYMBOL_GPL(call_switchdev_blocking_notifiers);
 
+static void switchdev_fdb_add_deferred(struct net_device *dev, const void *data)
+{
+	const struct switchdev_notifier_fdb_info *fdb_info = data;
+	struct switchdev_notifier_fdb_info tmp = *fdb_info;
+	int err;
+
+	ASSERT_RTNL();
+	err = call_switchdev_blocking_notifiers(SWITCHDEV_FDB_ADD_TO_DEVICE,
+						dev, &tmp.info, NULL);
+	err = notifier_to_errno(err);
+	if (err && err != -EOPNOTSUPP)
+		netdev_err(dev, "failed to add FDB entry: %pe\n", ERR_PTR(err));
+}
+
+static void switchdev_fdb_del_deferred(struct net_device *dev, const void *data)
+{
+	const struct switchdev_notifier_fdb_info *fdb_info = data;
+	struct switchdev_notifier_fdb_info tmp = *fdb_info;
+	int err;
+
+	ASSERT_RTNL();
+	err = call_switchdev_blocking_notifiers(SWITCHDEV_FDB_DEL_TO_DEVICE,
+						dev, &tmp.info, NULL);
+	err = notifier_to_errno(err);
+	if (err && err != -EOPNOTSUPP)
+		netdev_err(dev, "failed to delete FDB entry: %pe\n",
+			   ERR_PTR(err));
+}
+
+int
+switchdev_fdb_add_to_device(struct net_device *dev,
+			    const struct switchdev_notifier_fdb_info *fdb_info)
+{
+	return switchdev_deferred_enqueue(dev, fdb_info, sizeof(*fdb_info),
+					  switchdev_fdb_add_deferred);
+}
+EXPORT_SYMBOL_GPL(switchdev_fdb_add_to_device);
+
+int
+switchdev_fdb_del_to_device(struct net_device *dev,
+			    const struct switchdev_notifier_fdb_info *fdb_info)
+{
+	return switchdev_deferred_enqueue(dev, fdb_info, sizeof(*fdb_info),
+					  switchdev_fdb_del_deferred);
+}
+EXPORT_SYMBOL_GPL(switchdev_fdb_del_to_device);
+
 struct switchdev_nested_priv {
 	bool (*check_cb)(const struct net_device *dev);
 	bool (*foreign_dev_check_cb)(const struct net_device *dev,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 net-next 4/7] net: bridge: switchdev: make br_fdb_replay offer sleepable context to consumers
  2021-08-20 11:57 [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vladimir Oltean
                   ` (2 preceding siblings ...)
  2021-08-20 11:57 ` [PATCH v3 net-next 3/7] net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking notifier chain Vladimir Oltean
@ 2021-08-20 11:57 ` Vladimir Oltean
  2021-08-20 11:57 ` [PATCH v3 net-next 5/7] net: switchdev: drop the atomic notifier block from switchdev_bridge_port_{,un}offload Vladimir Oltean
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Vladimir Oltean @ 2021-08-20 11:57 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller
  Cc: Roopa Prabhu, Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Vadym Kochan, Taras Chornyi,
	Jiri Pirko, Ido Schimmel, UNGLinuxDriver, Grygorii Strashko,
	Marek Behun, DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens,
	Woojung Huh, Sean Wang, Landen Chao, Claudiu Manoil,
	Alexandre Belloni, George McCollister, Ioana Ciornei,
	Saeed Mahameed, Leon Romanovsky, Lars Povlsen, Steen Hegelund,
	Julian Wiedmann, Alexandra Winter, Karsten Graul, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, Ivan Vecera, Vlad Buslov,
	Jianbo Liu, Mark Bloch, Roi Dayan, Tobias Waldekranz,
	Vignesh Raghavendra, Jesse Brandeburg, linux-s390

Now that the SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE events are notified on
the blocking chain, it would be nice if we could also drop the
rcu_read_lock() atomic context from br_fdb_replay() so that drivers can
actually benefit from the blocking context and simplify their logic.

Do something similar to what is done in br_mdb_queue_one/br_mdb_replay_one.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v3:
since we want to drop the RCU protection early, we need to copy the
fdb->addr from the bridge FDB entries early too. So we need to queue the
FDB entries as switchdev objects and not bridge objects (those still
have an "addr" pointer and not array) and keep them in a list. So add a
list element to the switchdev FDB structure, like we have in struct
switchdev_obj too.

 include/net/switchdev.h   |  1 +
 net/bridge/br_switchdev.c | 88 +++++++++++++++++++++++++++------------
 2 files changed, 62 insertions(+), 27 deletions(-)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index e27da5bd665f..5570df4d9b76 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -218,6 +218,7 @@ struct switchdev_notifier_info {
 
 struct switchdev_notifier_fdb_info {
 	struct switchdev_notifier_info info; /* must be first */
+	struct list_head list;
 	unsigned char addr[ETH_ALEN];
 	u16 vid;
 	u8 added_by_user:1,
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index c7c8e23c2147..f2cb066e3ebb 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -122,21 +122,30 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p,
 	return 0;
 }
 
+static void br_switchdev_fdb_populate(struct net_bridge *br,
+				      struct switchdev_notifier_fdb_info *info,
+				      struct net_device **dev,
+				      const struct net_bridge_fdb_entry *fdb)
+{
+	const struct net_bridge_port *dst = READ_ONCE(fdb->dst);
+
+	ether_addr_copy(info->addr, fdb->key.addr.addr);
+	info->vid = fdb->key.vlan_id;
+	info->added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
+	info->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
+	info->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
+
+	*dev = (!dst || info->is_local) ? br->dev : dst->dev;
+}
+
 void
 br_switchdev_fdb_notify(struct net_bridge *br,
 			const struct net_bridge_fdb_entry *fdb, int type)
 {
-	const struct net_bridge_port *dst = READ_ONCE(fdb->dst);
 	struct switchdev_notifier_fdb_info info = {};
 	struct net_device *dev;
 
-	ether_addr_copy(info.addr, fdb->key.addr.addr);
-	info.vid = fdb->key.vlan_id;
-	info.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
-	info.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
-	info.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
-
-	dev = (!dst || info.is_local) ? br->dev : dst->dev;
+	br_switchdev_fdb_populate(br, &info, &dev, fdb);
 
 	switch (type) {
 	case RTM_DELNEIGH:
@@ -270,32 +279,43 @@ static void nbp_switchdev_del(struct net_bridge_port *p)
 	}
 }
 
-static int br_fdb_replay_one(struct net_bridge *br, struct notifier_block *nb,
-			     const struct net_bridge_fdb_entry *fdb,
-			     unsigned long action, const void *ctx)
+static int br_fdb_replay_one(struct notifier_block *nb,
+			     struct switchdev_notifier_fdb_info *info,
+			     unsigned long action)
 {
-	const struct net_bridge_port *p = READ_ONCE(fdb->dst);
-	struct switchdev_notifier_fdb_info item;
 	int err;
 
-	ether_addr_copy(item.addr, fdb->key.addr.addr);
-	item.vid = fdb->key.vlan_id;
-	item.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
-	item.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
-	item.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
-	item.info.dev = (!p || item.is_local) ? br->dev : p->dev;
-	item.info.ctx = ctx;
-
-	err = nb->notifier_call(nb, action, &item);
+	err = nb->notifier_call(nb, action, info);
 	return notifier_to_errno(err);
 }
 
+static int br_fdb_queue_one(struct net_bridge *br, struct list_head *fdb_list,
+			    const struct net_bridge_fdb_entry *fdb,
+			    const void *ctx)
+{
+	struct switchdev_notifier_fdb_info *info;
+	struct net_device *dev;
+
+	info = kzalloc(sizeof(*info), GFP_ATOMIC);
+	if (!info)
+		return -ENOMEM;
+
+	br_switchdev_fdb_populate(br, info, &dev, fdb);
+	info->info.dev = dev;
+	info->info.ctx = ctx;
+	list_add_tail(&info->list, fdb_list);
+
+	return 0;
+}
+
 static int br_fdb_replay(const struct net_device *br_dev, const void *ctx,
 			 bool adding, struct notifier_block *nb)
 {
+	struct switchdev_notifier_fdb_info *info, *tmp;
 	struct net_bridge_fdb_entry *fdb;
 	struct net_bridge *br;
 	unsigned long action;
+	LIST_HEAD(fdb_list);
 	int err = 0;
 
 	if (!nb)
@@ -308,20 +328,34 @@ static int br_fdb_replay(const struct net_device *br_dev, const void *ctx,
 
 	br = netdev_priv(br_dev);
 
+	rcu_read_lock();
+
+	hlist_for_each_entry_rcu(fdb, &br->fdb_list, fdb_node) {
+		err = br_fdb_queue_one(br, &fdb_list, fdb, ctx);
+		if (err) {
+			rcu_read_unlock();
+			goto out_free_fdb;
+		}
+	}
+
+	rcu_read_unlock();
+
 	if (adding)
 		action = SWITCHDEV_FDB_ADD_TO_DEVICE;
 	else
 		action = SWITCHDEV_FDB_DEL_TO_DEVICE;
 
-	rcu_read_lock();
-
-	hlist_for_each_entry_rcu(fdb, &br->fdb_list, fdb_node) {
-		err = br_fdb_replay_one(br, nb, fdb, action, ctx);
+	list_for_each_entry(info, &fdb_list, list) {
+		err = br_fdb_replay_one(nb, info, action);
 		if (err)
 			break;
 	}
 
-	rcu_read_unlock();
+out_free_fdb:
+	list_for_each_entry_safe(info, tmp, &fdb_list, list) {
+		list_del(&info->list);
+		kfree(info);
+	}
 
 	return err;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 net-next 5/7] net: switchdev: drop the atomic notifier block from switchdev_bridge_port_{,un}offload
  2021-08-20 11:57 [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vladimir Oltean
                   ` (3 preceding siblings ...)
  2021-08-20 11:57 ` [PATCH v3 net-next 4/7] net: bridge: switchdev: make br_fdb_replay offer sleepable context to consumers Vladimir Oltean
@ 2021-08-20 11:57 ` Vladimir Oltean
  2021-08-20 11:57 ` [PATCH v3 net-next 6/7] net: switchdev: don't assume RCU context in switchdev_handle_fdb_{add,del}_to_device Vladimir Oltean
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Vladimir Oltean @ 2021-08-20 11:57 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller
  Cc: Roopa Prabhu, Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Vadym Kochan, Taras Chornyi,
	Jiri Pirko, Ido Schimmel, UNGLinuxDriver, Grygorii Strashko,
	Marek Behun, DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens,
	Woojung Huh, Sean Wang, Landen Chao, Claudiu Manoil,
	Alexandre Belloni, George McCollister, Ioana Ciornei,
	Saeed Mahameed, Leon Romanovsky, Lars Povlsen, Steen Hegelund,
	Julian Wiedmann, Alexandra Winter, Karsten Graul, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, Ivan Vecera, Vlad Buslov,
	Jianbo Liu, Mark Bloch, Roi Dayan, Tobias Waldekranz,
	Vignesh Raghavendra, Jesse Brandeburg, linux-s390

Now that br_fdb_replay() uses the blocking_nb, there is no point in
passing the atomic nb anymore.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v3: none

 drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c       | 2 --
 .../net/ethernet/marvell/prestera/prestera_switchdev.c    | 6 +++---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c  | 4 ++--
 drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c  | 4 ++--
 drivers/net/ethernet/mscc/ocelot_net.c                    | 3 ---
 drivers/net/ethernet/rocker/rocker_ofdpa.c                | 4 ++--
 drivers/net/ethernet/ti/am65-cpsw-nuss.c                  | 4 ++--
 drivers/net/ethernet/ti/cpsw_new.c                        | 4 ++--
 include/net/switchdev.h                                   | 5 -----
 net/bridge/br.c                                           | 5 ++---
 net/bridge/br_private.h                                   | 4 ----
 net/bridge/br_switchdev.c                                 | 8 ++------
 net/dsa/port.c                                            | 3 ---
 net/switchdev/switchdev.c                                 | 4 ----
 14 files changed, 17 insertions(+), 43 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index d0d63da7f01f..f6fec0e7590c 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -2016,7 +2016,6 @@ static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
 		goto err_egress_flood;
 
 	err = switchdev_bridge_port_offload(netdev, netdev, NULL,
-					    &dpaa2_switch_port_switchdev_nb,
 					    &dpaa2_switch_port_switchdev_blocking_nb,
 					    false, extack);
 	if (err)
@@ -2053,7 +2052,6 @@ static int dpaa2_switch_port_restore_rxvlan(struct net_device *vdev, int vid, vo
 static void dpaa2_switch_port_pre_bridge_leave(struct net_device *netdev)
 {
 	switchdev_bridge_port_unoffload(netdev, NULL,
-					&dpaa2_switch_port_switchdev_nb,
 					&dpaa2_switch_port_switchdev_blocking_nb);
 }
 
diff --git a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
index a89cda394685..5a9f26d5ddd6 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
@@ -502,7 +502,7 @@ int prestera_bridge_port_join(struct net_device *br_dev,
 	}
 
 	err = switchdev_bridge_port_offload(br_port->dev, port->dev, NULL,
-					    NULL, NULL, false, extack);
+					    NULL, false, extack);
 	if (err)
 		goto err_switchdev_offload;
 
@@ -516,7 +516,7 @@ int prestera_bridge_port_join(struct net_device *br_dev,
 	return 0;
 
 err_port_join:
-	switchdev_bridge_port_unoffload(br_port->dev, NULL, NULL, NULL);
+	switchdev_bridge_port_unoffload(br_port->dev, NULL, NULL);
 err_switchdev_offload:
 	prestera_bridge_port_put(br_port);
 err_brport_create:
@@ -592,7 +592,7 @@ void prestera_bridge_port_leave(struct net_device *br_dev,
 	else
 		prestera_bridge_1d_port_leave(br_port);
 
-	switchdev_bridge_port_unoffload(br_port->dev, NULL, NULL, NULL);
+	switchdev_bridge_port_unoffload(br_port->dev, NULL, NULL);
 
 	prestera_hw_port_learning_set(port, false);
 	prestera_hw_port_flood_set(port, BR_FLOOD | BR_MCAST_FLOOD, 0);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index fbaed9de3929..589563a93cf2 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -362,7 +362,7 @@ mlxsw_sp_bridge_port_create(struct mlxsw_sp_bridge_device *bridge_device,
 	bridge_port->ref_count = 1;
 
 	err = switchdev_bridge_port_offload(brport_dev, mlxsw_sp_port->dev,
-					    NULL, NULL, NULL, false, extack);
+					    NULL, NULL, false, extack);
 	if (err)
 		goto err_switchdev_offload;
 
@@ -377,7 +377,7 @@ mlxsw_sp_bridge_port_create(struct mlxsw_sp_bridge_device *bridge_device,
 static void
 mlxsw_sp_bridge_port_destroy(struct mlxsw_sp_bridge_port *bridge_port)
 {
-	switchdev_bridge_port_unoffload(bridge_port->dev, NULL, NULL, NULL);
+	switchdev_bridge_port_unoffload(bridge_port->dev, NULL, NULL);
 	list_del(&bridge_port->list);
 	WARN_ON(!list_empty(&bridge_port->vlans_list));
 	kfree(bridge_port);
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
index 0d19f2be0895..dbc2a5560004 100644
--- a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
+++ b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
@@ -112,7 +112,7 @@ static int sparx5_port_bridge_join(struct sparx5_port *port,
 
 	set_bit(port->portno, sparx5->bridge_mask);
 
-	err = switchdev_bridge_port_offload(ndev, ndev, NULL, NULL, NULL,
+	err = switchdev_bridge_port_offload(ndev, ndev, NULL, NULL,
 					    false, extack);
 	if (err)
 		goto err_switchdev_offload;
@@ -134,7 +134,7 @@ static void sparx5_port_bridge_leave(struct sparx5_port *port,
 {
 	struct sparx5 *sparx5 = port->sparx5;
 
-	switchdev_bridge_port_unoffload(port->ndev, NULL, NULL, NULL);
+	switchdev_bridge_port_unoffload(port->ndev, NULL, NULL);
 
 	clear_bit(port->portno, sparx5->bridge_mask);
 	if (bitmap_empty(sparx5->bridge_mask, SPX5_PORTS))
diff --git a/drivers/net/ethernet/mscc/ocelot_net.c b/drivers/net/ethernet/mscc/ocelot_net.c
index 5e8965be968a..04ca55ff0fd0 100644
--- a/drivers/net/ethernet/mscc/ocelot_net.c
+++ b/drivers/net/ethernet/mscc/ocelot_net.c
@@ -1162,7 +1162,6 @@ static int ocelot_netdevice_bridge_join(struct net_device *dev,
 	ocelot_port_bridge_join(ocelot, port, bridge);
 
 	err = switchdev_bridge_port_offload(brport_dev, dev, priv,
-					    &ocelot_netdevice_nb,
 					    &ocelot_switchdev_blocking_nb,
 					    false, extack);
 	if (err)
@@ -1176,7 +1175,6 @@ static int ocelot_netdevice_bridge_join(struct net_device *dev,
 
 err_switchdev_sync:
 	switchdev_bridge_port_unoffload(brport_dev, priv,
-					&ocelot_netdevice_nb,
 					&ocelot_switchdev_blocking_nb);
 err_switchdev_offload:
 	ocelot_port_bridge_leave(ocelot, port, bridge);
@@ -1189,7 +1187,6 @@ static void ocelot_netdevice_pre_bridge_leave(struct net_device *dev,
 	struct ocelot_port_private *priv = netdev_priv(dev);
 
 	switchdev_bridge_port_unoffload(brport_dev, priv,
-					&ocelot_netdevice_nb,
 					&ocelot_switchdev_blocking_nb);
 }
 
diff --git a/drivers/net/ethernet/rocker/rocker_ofdpa.c b/drivers/net/ethernet/rocker/rocker_ofdpa.c
index abdb4b49add1..5790180fee07 100644
--- a/drivers/net/ethernet/rocker/rocker_ofdpa.c
+++ b/drivers/net/ethernet/rocker/rocker_ofdpa.c
@@ -2598,7 +2598,7 @@ static int ofdpa_port_bridge_join(struct ofdpa_port *ofdpa_port,
 	if (err)
 		return err;
 
-	return switchdev_bridge_port_offload(dev, dev, NULL, NULL, NULL,
+	return switchdev_bridge_port_offload(dev, dev, NULL, NULL,
 					     false, extack);
 }
 
@@ -2607,7 +2607,7 @@ static int ofdpa_port_bridge_leave(struct ofdpa_port *ofdpa_port)
 	struct net_device *dev = ofdpa_port->dev;
 	int err;
 
-	switchdev_bridge_port_unoffload(dev, NULL, NULL, NULL);
+	switchdev_bridge_port_unoffload(dev, NULL, NULL);
 
 	err = ofdpa_port_vlan_del(ofdpa_port, OFDPA_UNTAGGED_VID, 0);
 	if (err)
diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
index 130346f74ee8..3a7fde2bf861 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
@@ -2109,7 +2109,7 @@ static int am65_cpsw_netdevice_port_link(struct net_device *ndev,
 			return -EOPNOTSUPP;
 	}
 
-	err = switchdev_bridge_port_offload(ndev, ndev, NULL, NULL, NULL,
+	err = switchdev_bridge_port_offload(ndev, ndev, NULL, NULL,
 					    false, extack);
 	if (err)
 		return err;
@@ -2126,7 +2126,7 @@ static void am65_cpsw_netdevice_port_unlink(struct net_device *ndev)
 	struct am65_cpsw_common *common = am65_ndev_to_common(ndev);
 	struct am65_cpsw_ndev_priv *priv = am65_ndev_to_priv(ndev);
 
-	switchdev_bridge_port_unoffload(ndev, NULL, NULL, NULL);
+	switchdev_bridge_port_unoffload(ndev, NULL, NULL);
 
 	common->br_members &= ~BIT(priv->port->port_id);
 
diff --git a/drivers/net/ethernet/ti/cpsw_new.c b/drivers/net/ethernet/ti/cpsw_new.c
index 85d05b9be2b8..239ccdd6bc48 100644
--- a/drivers/net/ethernet/ti/cpsw_new.c
+++ b/drivers/net/ethernet/ti/cpsw_new.c
@@ -1518,7 +1518,7 @@ static int cpsw_netdevice_port_link(struct net_device *ndev,
 			return -EOPNOTSUPP;
 	}
 
-	err = switchdev_bridge_port_offload(ndev, ndev, NULL, NULL, NULL,
+	err = switchdev_bridge_port_offload(ndev, ndev, NULL, NULL,
 					    false, extack);
 	if (err)
 		return err;
@@ -1535,7 +1535,7 @@ static void cpsw_netdevice_port_unlink(struct net_device *ndev)
 	struct cpsw_priv *priv = netdev_priv(ndev);
 	struct cpsw_common *cpsw = priv->cpsw;
 
-	switchdev_bridge_port_unoffload(ndev, NULL, NULL, NULL);
+	switchdev_bridge_port_unoffload(ndev, NULL, NULL);
 
 	cpsw->br_members &= ~BIT(priv->emac_port);
 
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 5570df4d9b76..f6f200b77986 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -183,7 +183,6 @@ typedef int switchdev_obj_dump_cb_t(struct switchdev_obj *obj);
 struct switchdev_brport {
 	struct net_device *dev;
 	const void *ctx;
-	struct notifier_block *atomic_nb;
 	struct notifier_block *blocking_nb;
 	bool tx_fwd_offload;
 };
@@ -265,13 +264,11 @@ switchdev_fdb_is_dynamically_learned(const struct switchdev_notifier_fdb_info *f
 
 int switchdev_bridge_port_offload(struct net_device *brport_dev,
 				  struct net_device *dev, const void *ctx,
-				  struct notifier_block *atomic_nb,
 				  struct notifier_block *blocking_nb,
 				  bool tx_fwd_offload,
 				  struct netlink_ext_ack *extack);
 void switchdev_bridge_port_unoffload(struct net_device *brport_dev,
 				     const void *ctx,
-				     struct notifier_block *atomic_nb,
 				     struct notifier_block *blocking_nb);
 
 void switchdev_deferred_process(void);
@@ -354,7 +351,6 @@ int switchdev_handle_port_attr_set(struct net_device *dev,
 static inline int
 switchdev_bridge_port_offload(struct net_device *brport_dev,
 			      struct net_device *dev, const void *ctx,
-			      struct notifier_block *atomic_nb,
 			      struct notifier_block *blocking_nb,
 			      bool tx_fwd_offload,
 			      struct netlink_ext_ack *extack)
@@ -365,7 +361,6 @@ switchdev_bridge_port_offload(struct net_device *brport_dev,
 static inline void
 switchdev_bridge_port_unoffload(struct net_device *brport_dev,
 				const void *ctx,
-				struct notifier_block *atomic_nb,
 				struct notifier_block *blocking_nb)
 {
 }
diff --git a/net/bridge/br.c b/net/bridge/br.c
index d3a32c6813e0..ef92f57b14e6 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -222,7 +222,7 @@ static int br_switchdev_blocking_event(struct notifier_block *nb,
 		b = &brport_info->brport;
 
 		err = br_switchdev_port_offload(p, b->dev, b->ctx,
-						b->atomic_nb, b->blocking_nb,
+						b->blocking_nb,
 						b->tx_fwd_offload, extack);
 		err = notifier_from_errno(err);
 		break;
@@ -230,8 +230,7 @@ static int br_switchdev_blocking_event(struct notifier_block *nb,
 		brport_info = ptr;
 		b = &brport_info->brport;
 
-		br_switchdev_port_unoffload(p, b->ctx, b->atomic_nb,
-					    b->blocking_nb);
+		br_switchdev_port_unoffload(p, b->ctx, b->blocking_nb);
 		break;
 	}
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 390c807d1c7c..0320fe816a97 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -1946,13 +1946,11 @@ static inline void br_sysfs_delbr(struct net_device *dev) { return; }
 #ifdef CONFIG_NET_SWITCHDEV
 int br_switchdev_port_offload(struct net_bridge_port *p,
 			      struct net_device *dev, const void *ctx,
-			      struct notifier_block *atomic_nb,
 			      struct notifier_block *blocking_nb,
 			      bool tx_fwd_offload,
 			      struct netlink_ext_ack *extack);
 
 void br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx,
-				 struct notifier_block *atomic_nb,
 				 struct notifier_block *blocking_nb);
 
 bool br_switchdev_frame_uses_tx_fwd_offload(struct sk_buff *skb);
@@ -1986,7 +1984,6 @@ static inline void br_switchdev_frame_unmark(struct sk_buff *skb)
 static inline int
 br_switchdev_port_offload(struct net_bridge_port *p,
 			  struct net_device *dev, const void *ctx,
-			  struct notifier_block *atomic_nb,
 			  struct notifier_block *blocking_nb,
 			  bool tx_fwd_offload,
 			  struct netlink_ext_ack *extack)
@@ -1996,7 +1993,6 @@ br_switchdev_port_offload(struct net_bridge_port *p,
 
 static inline void
 br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx,
-			    struct notifier_block *atomic_nb,
 			    struct notifier_block *blocking_nb)
 {
 }
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index f2cb066e3ebb..659793fcb4ed 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -361,7 +361,6 @@ static int br_fdb_replay(const struct net_device *br_dev, const void *ctx,
 }
 
 static int nbp_switchdev_sync_objs(struct net_bridge_port *p, const void *ctx,
-				   struct notifier_block *atomic_nb,
 				   struct notifier_block *blocking_nb,
 				   struct netlink_ext_ack *extack)
 {
@@ -386,7 +385,6 @@ static int nbp_switchdev_sync_objs(struct net_bridge_port *p, const void *ctx,
 
 static void nbp_switchdev_unsync_objs(struct net_bridge_port *p,
 				      const void *ctx,
-				      struct notifier_block *atomic_nb,
 				      struct notifier_block *blocking_nb)
 {
 	struct net_device *br_dev = p->br->dev;
@@ -404,7 +402,6 @@ static void nbp_switchdev_unsync_objs(struct net_bridge_port *p,
  */
 int br_switchdev_port_offload(struct net_bridge_port *p,
 			      struct net_device *dev, const void *ctx,
-			      struct notifier_block *atomic_nb,
 			      struct notifier_block *blocking_nb,
 			      bool tx_fwd_offload,
 			      struct netlink_ext_ack *extack)
@@ -420,7 +417,7 @@ int br_switchdev_port_offload(struct net_bridge_port *p,
 	if (err)
 		return err;
 
-	err = nbp_switchdev_sync_objs(p, ctx, atomic_nb, blocking_nb, extack);
+	err = nbp_switchdev_sync_objs(p, ctx, blocking_nb, extack);
 	if (err)
 		goto out_switchdev_del;
 
@@ -433,10 +430,9 @@ int br_switchdev_port_offload(struct net_bridge_port *p,
 }
 
 void br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx,
-				 struct notifier_block *atomic_nb,
 				 struct notifier_block *blocking_nb)
 {
-	nbp_switchdev_unsync_objs(p, ctx, atomic_nb, blocking_nb);
+	nbp_switchdev_unsync_objs(p, ctx, blocking_nb);
 
 	nbp_switchdev_del(p);
 }
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 979042a64d1a..30071da45403 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -375,7 +375,6 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br,
 	tx_fwd_offload = dsa_port_bridge_tx_fwd_offload(dp, br);
 
 	err = switchdev_bridge_port_offload(brport_dev, dev, dp,
-					    &dsa_slave_switchdev_notifier,
 					    &dsa_slave_switchdev_blocking_notifier,
 					    tx_fwd_offload, extack);
 	if (err)
@@ -389,7 +388,6 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br,
 
 out_rollback_unoffload:
 	switchdev_bridge_port_unoffload(brport_dev, dp,
-					&dsa_slave_switchdev_notifier,
 					&dsa_slave_switchdev_blocking_notifier);
 out_rollback_unbridge:
 	dsa_broadcast(DSA_NOTIFIER_BRIDGE_LEAVE, &info);
@@ -403,7 +401,6 @@ void dsa_port_pre_bridge_leave(struct dsa_port *dp, struct net_device *br)
 	struct net_device *brport_dev = dsa_port_to_bridge_port(dp);
 
 	switchdev_bridge_port_unoffload(brport_dev, dp,
-					&dsa_slave_switchdev_notifier,
 					&dsa_slave_switchdev_blocking_notifier);
 }
 
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index c34c6abceec6..d09e8e9df5b6 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -859,7 +859,6 @@ EXPORT_SYMBOL_GPL(switchdev_handle_port_attr_set);
 
 int switchdev_bridge_port_offload(struct net_device *brport_dev,
 				  struct net_device *dev, const void *ctx,
-				  struct notifier_block *atomic_nb,
 				  struct notifier_block *blocking_nb,
 				  bool tx_fwd_offload,
 				  struct netlink_ext_ack *extack)
@@ -868,7 +867,6 @@ int switchdev_bridge_port_offload(struct net_device *brport_dev,
 		.brport = {
 			.dev = dev,
 			.ctx = ctx,
-			.atomic_nb = atomic_nb,
 			.blocking_nb = blocking_nb,
 			.tx_fwd_offload = tx_fwd_offload,
 		},
@@ -886,13 +884,11 @@ EXPORT_SYMBOL_GPL(switchdev_bridge_port_offload);
 
 void switchdev_bridge_port_unoffload(struct net_device *brport_dev,
 				     const void *ctx,
-				     struct notifier_block *atomic_nb,
 				     struct notifier_block *blocking_nb)
 {
 	struct switchdev_notifier_brport_info brport_info = {
 		.brport = {
 			.ctx = ctx,
-			.atomic_nb = atomic_nb,
 			.blocking_nb = blocking_nb,
 		},
 	};
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 net-next 6/7] net: switchdev: don't assume RCU context in switchdev_handle_fdb_{add,del}_to_device
  2021-08-20 11:57 [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vladimir Oltean
                   ` (4 preceding siblings ...)
  2021-08-20 11:57 ` [PATCH v3 net-next 5/7] net: switchdev: drop the atomic notifier block from switchdev_bridge_port_{,un}offload Vladimir Oltean
@ 2021-08-20 11:57 ` Vladimir Oltean
  2021-08-20 11:57 ` [PATCH v3 net-next 7/7] net: dsa: handle SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE synchronously Vladimir Oltean
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Vladimir Oltean @ 2021-08-20 11:57 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller
  Cc: Roopa Prabhu, Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Vadym Kochan, Taras Chornyi,
	Jiri Pirko, Ido Schimmel, UNGLinuxDriver, Grygorii Strashko,
	Marek Behun, DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens,
	Woojung Huh, Sean Wang, Landen Chao, Claudiu Manoil,
	Alexandre Belloni, George McCollister, Ioana Ciornei,
	Saeed Mahameed, Leon Romanovsky, Lars Povlsen, Steen Hegelund,
	Julian Wiedmann, Alexandra Winter, Karsten Graul, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, Ivan Vecera, Vlad Buslov,
	Jianbo Liu, Mark Bloch, Roi Dayan, Tobias Waldekranz,
	Vignesh Raghavendra, Jesse Brandeburg, linux-s390

Now that the SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE events are blocking, it
would be nice if callers of the fan-out helper functions (i.e. DSA)
could benefit from that blocking context.

But at the moment, switchdev_handle_fdb_{add,del}_to_device use some
netdev adjacency list checking functions that assume RCU protection.
Switch over to their rtnl_mutex equivalents, since we are also running
with that taken, and drop the surrounding rcu_read_lock from the callers.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v3: none

 net/dsa/slave.c           |  4 ----
 net/switchdev/switchdev.c | 10 +++++++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 6601224f6a5a..196a0e1f4294 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2484,22 +2484,18 @@ static int dsa_slave_switchdev_blocking_event(struct notifier_block *unused,
 						     dsa_slave_port_attr_set);
 		return notifier_from_errno(err);
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-		rcu_read_lock();
 		err = switchdev_handle_fdb_add_to_device(dev, ptr,
 							 dsa_slave_dev_check,
 							 dsa_foreign_dev_check,
 							 dsa_slave_fdb_add_to_device,
 							 NULL);
-		rcu_read_unlock();
 		return notifier_from_errno(err);
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		rcu_read_lock();
 		err = switchdev_handle_fdb_del_to_device(dev, ptr,
 							 dsa_slave_dev_check,
 							 dsa_foreign_dev_check,
 							 dsa_slave_fdb_del_to_device,
 							 NULL);
-		rcu_read_unlock();
 		return notifier_from_errno(err);
 	}
 
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index d09e8e9df5b6..fdbb73439f37 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -470,7 +470,7 @@ switchdev_lower_dev_find(struct net_device *dev,
 		.data = &switchdev_priv,
 	};
 
-	netdev_walk_all_lower_dev_rcu(dev, switchdev_lower_dev_walk, &priv);
+	netdev_walk_all_lower_dev(dev, switchdev_lower_dev_walk, &priv);
 
 	return switchdev_priv.lower_dev;
 }
@@ -543,7 +543,7 @@ static int __switchdev_handle_fdb_add_to_device(struct net_device *dev,
 	/* Event is neither on a bridge nor a LAG. Check whether it is on an
 	 * interface that is in a bridge with us.
 	 */
-	br = netdev_master_upper_dev_get_rcu(dev);
+	br = netdev_master_upper_dev_get(dev);
 	if (!br || !netif_is_bridge_master(br))
 		return 0;
 
@@ -569,6 +569,8 @@ int switchdev_handle_fdb_add_to_device(struct net_device *dev,
 {
 	int err;
 
+	ASSERT_RTNL();
+
 	err = __switchdev_handle_fdb_add_to_device(dev, dev, fdb_info,
 						   check_cb,
 						   foreign_dev_check_cb,
@@ -648,7 +650,7 @@ static int __switchdev_handle_fdb_del_to_device(struct net_device *dev,
 	/* Event is neither on a bridge nor a LAG. Check whether it is on an
 	 * interface that is in a bridge with us.
 	 */
-	br = netdev_master_upper_dev_get_rcu(dev);
+	br = netdev_master_upper_dev_get(dev);
 	if (!br || !netif_is_bridge_master(br))
 		return 0;
 
@@ -674,6 +676,8 @@ int switchdev_handle_fdb_del_to_device(struct net_device *dev,
 {
 	int err;
 
+	ASSERT_RTNL();
+
 	err = __switchdev_handle_fdb_del_to_device(dev, dev, fdb_info,
 						   check_cb,
 						   foreign_dev_check_cb,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 net-next 7/7] net: dsa: handle SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE synchronously
  2021-08-20 11:57 [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vladimir Oltean
                   ` (5 preceding siblings ...)
  2021-08-20 11:57 ` [PATCH v3 net-next 6/7] net: switchdev: don't assume RCU context in switchdev_handle_fdb_{add,del}_to_device Vladimir Oltean
@ 2021-08-20 11:57 ` Vladimir Oltean
  2021-08-20 15:46 ` [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vlad Buslov
  2021-08-26 14:35 ` Alexandra Winter
  8 siblings, 0 replies; 14+ messages in thread
From: Vladimir Oltean @ 2021-08-20 11:57 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller
  Cc: Roopa Prabhu, Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Vadym Kochan, Taras Chornyi,
	Jiri Pirko, Ido Schimmel, UNGLinuxDriver, Grygorii Strashko,
	Marek Behun, DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens,
	Woojung Huh, Sean Wang, Landen Chao, Claudiu Manoil,
	Alexandre Belloni, George McCollister, Ioana Ciornei,
	Saeed Mahameed, Leon Romanovsky, Lars Povlsen, Steen Hegelund,
	Julian Wiedmann, Alexandra Winter, Karsten Graul, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, Ivan Vecera, Vlad Buslov,
	Jianbo Liu, Mark Bloch, Roi Dayan, Tobias Waldekranz,
	Vignesh Raghavendra, Jesse Brandeburg, linux-s390

Since the switchdev FDB entry notifications are now blocking and
deferred by switchdev and not by us, switchdev will also wait for us to
finish, which means we can proceed with our FDB isolation mechanism
based on dp->bridge_num.

It also means that the ordered workqueue is no longer needed, drop it
and simply call the driver.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v2->v3: none

 net/dsa/dsa.c      |  15 -------
 net/dsa/dsa_priv.h |  15 -------
 net/dsa/slave.c    | 110 ++++++++++++---------------------------------
 3 files changed, 28 insertions(+), 112 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 1dc45e40f961..b2126334387f 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -338,13 +338,6 @@ static struct packet_type dsa_pack_type __read_mostly = {
 	.func	= dsa_switch_rcv,
 };
 
-static struct workqueue_struct *dsa_owq;
-
-bool dsa_schedule_work(struct work_struct *work)
-{
-	return queue_work(dsa_owq, work);
-}
-
 int dsa_devlink_param_get(struct devlink *dl, u32 id,
 			  struct devlink_param_gset_ctx *ctx)
 {
@@ -465,11 +458,6 @@ static int __init dsa_init_module(void)
 {
 	int rc;
 
-	dsa_owq = alloc_ordered_workqueue("dsa_ordered",
-					  WQ_MEM_RECLAIM);
-	if (!dsa_owq)
-		return -ENOMEM;
-
 	rc = dsa_slave_register_notifier();
 	if (rc)
 		goto register_notifier_fail;
@@ -482,8 +470,6 @@ static int __init dsa_init_module(void)
 	return 0;
 
 register_notifier_fail:
-	destroy_workqueue(dsa_owq);
-
 	return rc;
 }
 module_init(dsa_init_module);
@@ -494,7 +480,6 @@ static void __exit dsa_cleanup_module(void)
 
 	dsa_slave_unregister_notifier();
 	dev_remove_pack(&dsa_pack_type);
-	destroy_workqueue(dsa_owq);
 }
 module_exit(dsa_cleanup_module);
 
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index b7a269e0513f..f759abceeb18 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -125,20 +125,6 @@ struct dsa_notifier_tag_8021q_vlan_info {
 	u16 vid;
 };
 
-struct dsa_switchdev_event_work {
-	struct dsa_switch *ds;
-	int port;
-	struct net_device *dev;
-	struct work_struct work;
-	unsigned long event;
-	/* Specific for SWITCHDEV_FDB_ADD_TO_DEVICE and
-	 * SWITCHDEV_FDB_DEL_TO_DEVICE
-	 */
-	unsigned char addr[ETH_ALEN];
-	u16 vid;
-	bool host_addr;
-};
-
 /* DSA_NOTIFIER_HSR_* */
 struct dsa_notifier_hsr_info {
 	struct net_device *hsr;
@@ -169,7 +155,6 @@ const struct dsa_device_ops *dsa_tag_driver_get(int tag_protocol);
 void dsa_tag_driver_put(const struct dsa_device_ops *ops);
 const struct dsa_device_ops *dsa_find_tagger_by_name(const char *buf);
 
-bool dsa_schedule_work(struct work_struct *work);
 const char *dsa_tag_protocol_to_str(const struct dsa_device_ops *ops);
 
 static inline int dsa_tag_protocol_overhead(const struct dsa_device_ops *ops)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 196a0e1f4294..8507e0cd2b9e 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2278,73 +2278,18 @@ static int dsa_slave_netdevice_event(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
-static void
-dsa_fdb_offload_notify(struct dsa_switchdev_event_work *switchdev_work)
+static void dsa_fdb_offload_notify(struct net_device *dev,
+				   const unsigned char *addr,
+				   u16 vid)
 {
 	struct switchdev_notifier_fdb_info info = {};
-	struct dsa_switch *ds = switchdev_work->ds;
-	struct dsa_port *dp;
-
-	if (!dsa_is_user_port(ds, switchdev_work->port))
-		return;
 
-	ether_addr_copy(info.addr, switchdev_work->addr);
-	info.vid = switchdev_work->vid;
+	ether_addr_copy(info.addr, addr);
+	info.vid = vid;
 	info.offloaded = true;
-	dp = dsa_to_port(ds, switchdev_work->port);
-	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
-				 dp->slave, &info.info, NULL);
-}
-
-static void dsa_slave_switchdev_event_work(struct work_struct *work)
-{
-	struct dsa_switchdev_event_work *switchdev_work =
-		container_of(work, struct dsa_switchdev_event_work, work);
-	struct dsa_switch *ds = switchdev_work->ds;
-	struct dsa_port *dp;
-	int err;
-
-	dp = dsa_to_port(ds, switchdev_work->port);
-
-	rtnl_lock();
-	switch (switchdev_work->event) {
-	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-		if (switchdev_work->host_addr)
-			err = dsa_port_host_fdb_add(dp, switchdev_work->addr,
-						    switchdev_work->vid);
-		else
-			err = dsa_port_fdb_add(dp, switchdev_work->addr,
-					       switchdev_work->vid);
-		if (err) {
-			dev_err(ds->dev,
-				"port %d failed to add %pM vid %d to fdb: %d\n",
-				dp->index, switchdev_work->addr,
-				switchdev_work->vid, err);
-			break;
-		}
-		dsa_fdb_offload_notify(switchdev_work);
-		break;
-
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		if (switchdev_work->host_addr)
-			err = dsa_port_host_fdb_del(dp, switchdev_work->addr,
-						    switchdev_work->vid);
-		else
-			err = dsa_port_fdb_del(dp, switchdev_work->addr,
-					       switchdev_work->vid);
-		if (err) {
-			dev_err(ds->dev,
-				"port %d failed to delete %pM vid %d from fdb: %d\n",
-				dp->index, switchdev_work->addr,
-				switchdev_work->vid, err);
-		}
-
-		break;
-	}
-	rtnl_unlock();
 
-	dev_put(switchdev_work->dev);
-	kfree(switchdev_work);
+	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED, dev, &info.info,
+				 NULL);
 }
 
 static bool dsa_foreign_dev_check(const struct net_device *dev,
@@ -2369,10 +2314,12 @@ static int dsa_slave_fdb_event(struct net_device *dev,
 			       const struct switchdev_notifier_fdb_info *fdb_info,
 			       unsigned long event)
 {
-	struct dsa_switchdev_event_work *switchdev_work;
 	struct dsa_port *dp = dsa_slave_to_port(dev);
+	const unsigned char *addr = fdb_info->addr;
 	bool host_addr = fdb_info->is_local;
 	struct dsa_switch *ds = dp->ds;
+	u16 vid = fdb_info->vid;
+	int err;
 
 	if (ctx && ctx != dp)
 		return 0;
@@ -2397,30 +2344,29 @@ static int dsa_slave_fdb_event(struct net_device *dev,
 	if (dsa_foreign_dev_check(dev, orig_dev))
 		host_addr = true;
 
-	switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
-	if (!switchdev_work)
-		return -ENOMEM;
-
 	netdev_dbg(dev, "%s FDB entry towards %s, addr %pM vid %d%s\n",
 		   event == SWITCHDEV_FDB_ADD_TO_DEVICE ? "Adding" : "Deleting",
-		   orig_dev->name, fdb_info->addr, fdb_info->vid,
-		   host_addr ? " as host address" : "");
-
-	INIT_WORK(&switchdev_work->work, dsa_slave_switchdev_event_work);
-	switchdev_work->ds = ds;
-	switchdev_work->port = dp->index;
-	switchdev_work->event = event;
-	switchdev_work->dev = dev;
+		   orig_dev->name, addr, vid, host_addr ? " as host address" : "");
 
-	ether_addr_copy(switchdev_work->addr, fdb_info->addr);
-	switchdev_work->vid = fdb_info->vid;
-	switchdev_work->host_addr = host_addr;
+	switch (event) {
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+		if (host_addr)
+			err = dsa_port_host_fdb_add(dp, addr, vid);
+		else
+			err = dsa_port_fdb_add(dp, addr, vid);
+		if (!err)
+			dsa_fdb_offload_notify(dev, addr, vid);
+		break;
 
-	/* Hold a reference for dsa_fdb_offload_notify */
-	dev_hold(dev);
-	dsa_schedule_work(&switchdev_work->work);
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		if (host_addr)
+			err = dsa_port_host_fdb_del(dp, addr, vid);
+		else
+			err = dsa_port_fdb_del(dp, addr, vid);
+		break;
+	}
 
-	return 0;
+	return err;
 }
 
 static int
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking
  2021-08-20 11:57 [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vladimir Oltean
                   ` (6 preceding siblings ...)
  2021-08-20 11:57 ` [PATCH v3 net-next 7/7] net: dsa: handle SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE synchronously Vladimir Oltean
@ 2021-08-20 15:46 ` Vlad Buslov
  2021-08-26 14:35 ` Alexandra Winter
  8 siblings, 0 replies; 14+ messages in thread
From: Vlad Buslov @ 2021-08-20 15:46 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Jakub Kicinski, David S. Miller, Roopa Prabhu,
	Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Vadym Kochan, Taras Chornyi,
	Jiri Pirko, Ido Schimmel, UNGLinuxDriver, Grygorii Strashko,
	Marek Behun, DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens,
	Woojung Huh, Sean Wang, Landen Chao, Claudiu Manoil,
	Alexandre Belloni, George McCollister, Ioana Ciornei,
	Saeed Mahameed, Leon Romanovsky, Lars Povlsen, Steen Hegelund,
	Julian Wiedmann, Alexandra Winter, Karsten Graul, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, Ivan Vecera, Jianbo Liu,
	Mark Bloch, Roi Dayan, Tobias Waldekranz, Vignesh Raghavendra,
	Jesse Brandeburg, linux-s390

On Fri 20 Aug 2021 at 14:57, Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
> Problem statement:
>
> Any time a driver needs to create a private association between a bridge
> upper interface and use that association within its
> SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handler, we have an issue with FDB
> entries deleted by the bridge when the port leaves. The issue is that
> all switchdev drivers schedule a work item to have sleepable context,
> and that work item can be actually scheduled after the port has left the
> bridge, which means the association might have already been broken by
> the time the scheduled FDB work item attempts to use it.
>
> The solution is to modify switchdev to use its embedded SWITCHDEV_F_DEFER
> mechanism to make the FDB notifiers emitted from the fastpath be
> scheduled in sleepable context. All drivers are converted to handle
> SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE from their blocking notifier block
> handler (or register a blocking switchdev notifier handler if they
> didn't have one). This solves the aforementioned problem because the
> bridge waits for the switchdev deferred work items to finish before a
> port leaves (del_nbp calls switchdev_deferred_process), whereas a work
> item privately scheduled by the driver will obviously not be waited upon
> by the bridge, leading to the possibility of having the race.
>
> This is a dependency for the "DSA FDB isolation" posted here. It was
> split out of that series hence the numbering starts directly at v2.
>
> https://patchwork.kernel.org/project/netdevbpf/cover/20210818120150.892647-1-vladimir.oltean@nxp.com/
>
> Changes in v3:
> - make "addr" part of switchdev_fdb_notifier_info to avoid dangling
>   pointers not watched by RCU
> - mlx5 correction
> - build fixes in the S/390 qeth driver
>
> Vladimir Oltean (7):
>   net: bridge: move br_fdb_replay inside br_switchdev.c
>   net: switchdev: keep the MAC address by value in struct
>     switchdev_notifier_fdb_info
>   net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking
>     notifier chain
>   net: bridge: switchdev: make br_fdb_replay offer sleepable context to
>     consumers
>   net: switchdev: drop the atomic notifier block from
>     switchdev_bridge_port_{,un}offload
>   net: switchdev: don't assume RCU context in
>     switchdev_handle_fdb_{add,del}_to_device
>   net: dsa: handle SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE synchronously
>
>  .../ethernet/freescale/dpaa2/dpaa2-switch.c   |  75 ++++------
>  .../marvell/prestera/prestera_switchdev.c     | 104 ++++++-------
>  .../mellanox/mlx5/core/en/rep/bridge.c        |  65 +++++++--
>  .../ethernet/mellanox/mlx5/core/esw/bridge.c  |   2 +-
>  .../ethernet/mellanox/mlxsw/spectrum_router.c |   4 +-
>  .../mellanox/mlxsw/spectrum_switchdev.c       |  62 ++++++--
>  .../microchip/sparx5/sparx5_mactable.c        |   2 +-
>  .../microchip/sparx5/sparx5_switchdev.c       |  72 ++++-----
>  drivers/net/ethernet/mscc/ocelot_net.c        |   3 -
>  drivers/net/ethernet/rocker/rocker_main.c     |  67 ++++-----
>  drivers/net/ethernet/rocker/rocker_ofdpa.c    |   6 +-
>  drivers/net/ethernet/ti/am65-cpsw-nuss.c      |   4 +-
>  drivers/net/ethernet/ti/am65-cpsw-switchdev.c |  54 +++----
>  drivers/net/ethernet/ti/cpsw_new.c            |   4 +-
>  drivers/net/ethernet/ti/cpsw_switchdev.c      |  57 ++++----
>  drivers/s390/net/qeth_l2_main.c               |  26 ++--
>  include/net/switchdev.h                       |  33 ++++-
>  net/bridge/br.c                               |   5 +-
>  net/bridge/br_fdb.c                           |  54 -------
>  net/bridge/br_private.h                       |   6 -
>  net/bridge/br_switchdev.c                     | 128 +++++++++++++---
>  net/dsa/dsa.c                                 |  15 --
>  net/dsa/dsa_priv.h                            |  15 --
>  net/dsa/port.c                                |   3 -
>  net/dsa/slave.c                               | 138 ++++++------------
>  net/switchdev/switchdev.c                     |  61 +++++++-
>  26 files changed, 550 insertions(+), 515 deletions(-)

For mlx5 parts:

Reviewed-and-tested-by: Vlad Buslov <vladbu@nvidia.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 net-next 3/7] net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking notifier chain
  2021-08-20 11:57 ` [PATCH v3 net-next 3/7] net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking notifier chain Vladimir Oltean
@ 2021-08-20 15:54   ` Jakub Kicinski
  2021-08-20 15:57     ` Vladimir Oltean
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Kicinski @ 2021-08-20 15:54 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Roopa Prabhu, Nikolay Aleksandrov,
	Andrew Lunn, Florian Fainelli, Vivien Didelot, Vladimir Oltean,
	Vadym Kochan, Taras Chornyi, Jiri Pirko, Ido Schimmel,
	UNGLinuxDriver, Grygorii Strashko, Marek Behun, DENG Qingfang,
	Kurt Kanzenbach, Hauke Mehrtens, Woojung Huh, Sean Wang,
	Landen Chao, Claudiu Manoil, Alexandre Belloni,
	George McCollister, Ioana Ciornei, Saeed Mahameed,
	Leon Romanovsky, Lars Povlsen, Steen Hegelund, Julian Wiedmann,
	Alexandra Winter, Karsten Graul, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Ivan Vecera, Vlad Buslov, Jianbo Liu,
	Mark Bloch, Roi Dayan, Tobias Waldekranz, Vignesh Raghavendra,
	Jesse Brandeburg, linux-s390

On Fri, 20 Aug 2021 14:57:42 +0300 Vladimir Oltean wrote:
> Currently, br_switchdev_fdb_notify() uses call_switchdev_notifiers (and
> br_fdb_replay() open-codes the same thing). This means that drivers
> handle the SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE events on the atomic
> switchdev notifier block.

drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c: In function ‘sparx5_switchdev_fdb_event’:
drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c:453:1: warning: label ‘err_addr_alloc’ defined but not used [-Wunused-label]
  453 | err_addr_alloc:
      | ^~~~~~~~~~~~~~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 net-next 3/7] net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking notifier chain
  2021-08-20 15:54   ` Jakub Kicinski
@ 2021-08-20 15:57     ` Vladimir Oltean
  2021-08-20 16:55       ` Jakub Kicinski
  0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Oltean @ 2021-08-20 15:57 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Vladimir Oltean, netdev, David S. Miller, Roopa Prabhu,
	Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vadym Kochan, Taras Chornyi, Jiri Pirko,
	Ido Schimmel, UNGLinuxDriver, Grygorii Strashko, Marek Behun,
	DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens, Woojung Huh,
	Sean Wang, Landen Chao, Claudiu Manoil, Alexandre Belloni,
	George McCollister, Ioana Ciornei, Saeed Mahameed,
	Leon Romanovsky, Lars Povlsen, Steen Hegelund, Julian Wiedmann,
	Alexandra Winter, Karsten Graul, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Ivan Vecera, Vlad Buslov, Jianbo Liu,
	Mark Bloch, Roi Dayan, Tobias Waldekranz, Vignesh Raghavendra,
	Jesse Brandeburg, linux-s390

On Fri, Aug 20, 2021 at 08:54:02AM -0700, Jakub Kicinski wrote:
> On Fri, 20 Aug 2021 14:57:42 +0300 Vladimir Oltean wrote:
> > Currently, br_switchdev_fdb_notify() uses call_switchdev_notifiers (and
> > br_fdb_replay() open-codes the same thing). This means that drivers
> > handle the SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE events on the atomic
> > switchdev notifier block.
> 
> drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c: In function ‘sparx5_switchdev_fdb_event’:
> drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c:453:1: warning: label ‘err_addr_alloc’ defined but not used [-Wunused-label]
>   453 | err_addr_alloc:
>       | ^~~~~~~~~~~~~~

Yeah, I noticed (too late sadly). Other than a bit of dead code it does
not impact functionality, so that's why I didn't jump to resend until I
got some feedback first (thanks Vlad). Do you think it's time to resend?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 net-next 3/7] net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking notifier chain
  2021-08-20 15:57     ` Vladimir Oltean
@ 2021-08-20 16:55       ` Jakub Kicinski
  0 siblings, 0 replies; 14+ messages in thread
From: Jakub Kicinski @ 2021-08-20 16:55 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Vladimir Oltean, netdev, David S. Miller, Roopa Prabhu,
	Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vadym Kochan, Taras Chornyi, Jiri Pirko,
	Ido Schimmel, UNGLinuxDriver, Grygorii Strashko, Marek Behun,
	DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens, Woojung Huh,
	Sean Wang, Landen Chao, Claudiu Manoil, Alexandre Belloni,
	George McCollister, Ioana Ciornei, Saeed Mahameed,
	Leon Romanovsky, Lars Povlsen, Steen Hegelund, Julian Wiedmann,
	Alexandra Winter, Karsten Graul, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Ivan Vecera, Vlad Buslov, Jianbo Liu,
	Mark Bloch, Roi Dayan, Tobias Waldekranz, Vignesh Raghavendra,
	Jesse Brandeburg, linux-s390

On Fri, 20 Aug 2021 18:57:49 +0300 Vladimir Oltean wrote:
> On Fri, Aug 20, 2021 at 08:54:02AM -0700, Jakub Kicinski wrote:
> > On Fri, 20 Aug 2021 14:57:42 +0300 Vladimir Oltean wrote:  
> > > Currently, br_switchdev_fdb_notify() uses call_switchdev_notifiers (and
> > > br_fdb_replay() open-codes the same thing). This means that drivers
> > > handle the SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE events on the atomic
> > > switchdev notifier block.  
> > 
> > drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c: In function ‘sparx5_switchdev_fdb_event’:
> > drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c:453:1: warning: label ‘err_addr_alloc’ defined but not used [-Wunused-label]
> >   453 | err_addr_alloc:
> >       | ^~~~~~~~~~~~~~  
> 
> Yeah, I noticed (too late sadly). Other than a bit of dead code it does
> not impact functionality, so that's why I didn't jump to resend until I
> got some feedback first (thanks Vlad). Do you think it's time to resend?

No really, I just don't want to drop things from patchwork without
making a note of what the reason was. You can reply to your own posting
saying that you noticed an error but want it reviewed anyway, or wait
for me to complain and quietly stick to your plan.. No real preference.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking
  2021-08-20 11:57 [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vladimir Oltean
                   ` (7 preceding siblings ...)
  2021-08-20 15:46 ` [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vlad Buslov
@ 2021-08-26 14:35 ` Alexandra Winter
  2021-08-26 14:41   ` Vladimir Oltean
  8 siblings, 1 reply; 14+ messages in thread
From: Alexandra Winter @ 2021-08-26 14:35 UTC (permalink / raw)
  To: Vladimir Oltean, netdev, Jakub Kicinski, David S. Miller
  Cc: Roopa Prabhu, Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Vadym Kochan, Taras Chornyi,
	Jiri Pirko, Ido Schimmel, UNGLinuxDriver, Grygorii Strashko,
	Marek Behun, DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens,
	Woojung Huh, Sean Wang, Landen Chao, Claudiu Manoil,
	Alexandre Belloni, George McCollister, Ioana Ciornei,
	Saeed Mahameed, Leon Romanovsky, Lars Povlsen, Steen Hegelund,
	Julian Wiedmann, Karsten Graul, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Ivan Vecera, Vlad Buslov, Jianbo Liu,
	Mark Bloch, Roi Dayan, Tobias Waldekranz, Vignesh Raghavendra,
	Jesse Brandeburg, linux-s390



On 20.08.21 13:57, Vladimir Oltean wrote:
> Problem statement:
> 
> Any time a driver needs to create a private association between a bridge
> upper interface and use that association within its
> SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handler, we have an issue with FDB
> entries deleted by the bridge when the port leaves. The issue is that
> all switchdev drivers schedule a work item to have sleepable context,
> and that work item can be actually scheduled after the port has left the
> bridge, which means the association might have already been broken by
> the time the scheduled FDB work item attempts to use it.
> 
> The solution is to modify switchdev to use its embedded SWITCHDEV_F_DEFER
> mechanism to make the FDB notifiers emitted from the fastpath be
> scheduled in sleepable context. All drivers are converted to handle
> SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE from their blocking notifier block
> handler (or register a blocking switchdev notifier handler if they
> didn't have one). This solves the aforementioned problem because the
> bridge waits for the switchdev deferred work items to finish before a
> port leaves (del_nbp calls switchdev_deferred_process), whereas a work
> item privately scheduled by the driver will obviously not be waited upon
> by the bridge, leading to the possibility of having the race.
> 
> This is a dependency for the "DSA FDB isolation" posted here. It was
> split out of that series hence the numbering starts directly at v2.
> 
> https://patchwork.kernel.org/project/netdevbpf/cover/20210818120150.892647-1-vladimir.oltean@nxp.com/
> 
> Changes in v3:
> - make "addr" part of switchdev_fdb_notifier_info to avoid dangling
>   pointers not watched by RCU
> - mlx5 correction
> - build fixes in the S/390 qeth driver
> 
> Vladimir Oltean (7):
>   net: bridge: move br_fdb_replay inside br_switchdev.c
>   net: switchdev: keep the MAC address by value in struct
>     switchdev_notifier_fdb_info
>   net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking
>     notifier chain
>   net: bridge: switchdev: make br_fdb_replay offer sleepable context to
>     consumers
>   net: switchdev: drop the atomic notifier block from
>     switchdev_bridge_port_{,un}offload
>   net: switchdev: don't assume RCU context in
>     switchdev_handle_fdb_{add,del}_to_device
>   net: dsa: handle SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE synchronously
> 
>  .../ethernet/freescale/dpaa2/dpaa2-switch.c   |  75 ++++------
>  .../marvell/prestera/prestera_switchdev.c     | 104 ++++++-------
>  .../mellanox/mlx5/core/en/rep/bridge.c        |  65 +++++++--
>  .../ethernet/mellanox/mlx5/core/esw/bridge.c  |   2 +-
>  .../ethernet/mellanox/mlxsw/spectrum_router.c |   4 +-
>  .../mellanox/mlxsw/spectrum_switchdev.c       |  62 ++++++--
>  .../microchip/sparx5/sparx5_mactable.c        |   2 +-
>  .../microchip/sparx5/sparx5_switchdev.c       |  72 ++++-----
>  drivers/net/ethernet/mscc/ocelot_net.c        |   3 -
>  drivers/net/ethernet/rocker/rocker_main.c     |  67 ++++-----
>  drivers/net/ethernet/rocker/rocker_ofdpa.c    |   6 +-
>  drivers/net/ethernet/ti/am65-cpsw-nuss.c      |   4 +-
>  drivers/net/ethernet/ti/am65-cpsw-switchdev.c |  54 +++----
>  drivers/net/ethernet/ti/cpsw_new.c            |   4 +-
>  drivers/net/ethernet/ti/cpsw_switchdev.c      |  57 ++++----
>  drivers/s390/net/qeth_l2_main.c               |  26 ++--
>  include/net/switchdev.h                       |  33 ++++-
>  net/bridge/br.c                               |   5 +-
>  net/bridge/br_fdb.c                           |  54 -------
>  net/bridge/br_private.h                       |   6 -
>  net/bridge/br_switchdev.c                     | 128 +++++++++++++---
>  net/dsa/dsa.c                                 |  15 --
>  net/dsa/dsa_priv.h                            |  15 --
>  net/dsa/port.c                                |   3 -
>  net/dsa/slave.c                               | 138 ++++++------------
>  net/switchdev/switchdev.c                     |  61 +++++++-
>  26 files changed, 550 insertions(+), 515 deletions(-)
> 

For drivers/s390/net/qeth_l2_main.c :

Reviewed-and-tested-by: Alexandra Winter <wintera@linux.ibm.com>

Thank you for this proposal, it makes qeth switchdev code more robust, cleaner and gives the
opportunity for future enhancements, like you proposed.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking
  2021-08-26 14:35 ` Alexandra Winter
@ 2021-08-26 14:41   ` Vladimir Oltean
  0 siblings, 0 replies; 14+ messages in thread
From: Vladimir Oltean @ 2021-08-26 14:41 UTC (permalink / raw)
  To: Alexandra Winter
  Cc: Vladimir Oltean, netdev, Jakub Kicinski, David S. Miller,
	Roopa Prabhu, Nikolay Aleksandrov, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vadym Kochan, Taras Chornyi, Jiri Pirko,
	Ido Schimmel, UNGLinuxDriver, Grygorii Strashko, Marek Behun,
	DENG Qingfang, Kurt Kanzenbach, Hauke Mehrtens, Woojung Huh,
	Sean Wang, Landen Chao, Claudiu Manoil, Alexandre Belloni,
	George McCollister, Ioana Ciornei, Saeed Mahameed,
	Leon Romanovsky, Lars Povlsen, Steen Hegelund, Julian Wiedmann,
	Karsten Graul, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Ivan Vecera, Vlad Buslov, Jianbo Liu,
	Mark Bloch, Roi Dayan, Tobias Waldekranz, Vignesh Raghavendra,
	Jesse Brandeburg, linux-s390

Hi Alexandra,

On Thu, Aug 26, 2021 at 04:35:58PM +0200, Alexandra Winter wrote:
> For drivers/s390/net/qeth_l2_main.c :
> 
> Reviewed-and-tested-by: Alexandra Winter <wintera@linux.ibm.com>
> 
> Thank you for this proposal, it makes qeth switchdev code more robust, cleaner and gives the
> opportunity for future enhancements, like you proposed.

Thanks for reviewing and testing, and sorry that I did not copy you on
v2 too, only on v3. The discussion on v2 had not yet completed when I
posted the v3, and unless anybody has any better idea, a v4 is not going
to take place:

https://patchwork.kernel.org/project/netdevbpf/cover/20210819160723.2186424-1-vladimir.oltean@nxp.com/

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-08-26 14:41 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-20 11:57 [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vladimir Oltean
2021-08-20 11:57 ` [PATCH v3 net-next 1/7] net: bridge: move br_fdb_replay inside br_switchdev.c Vladimir Oltean
2021-08-20 11:57 ` [PATCH v3 net-next 2/7] net: switchdev: keep the MAC address by value in struct switchdev_notifier_fdb_info Vladimir Oltean
2021-08-20 11:57 ` [PATCH v3 net-next 3/7] net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking notifier chain Vladimir Oltean
2021-08-20 15:54   ` Jakub Kicinski
2021-08-20 15:57     ` Vladimir Oltean
2021-08-20 16:55       ` Jakub Kicinski
2021-08-20 11:57 ` [PATCH v3 net-next 4/7] net: bridge: switchdev: make br_fdb_replay offer sleepable context to consumers Vladimir Oltean
2021-08-20 11:57 ` [PATCH v3 net-next 5/7] net: switchdev: drop the atomic notifier block from switchdev_bridge_port_{,un}offload Vladimir Oltean
2021-08-20 11:57 ` [PATCH v3 net-next 6/7] net: switchdev: don't assume RCU context in switchdev_handle_fdb_{add,del}_to_device Vladimir Oltean
2021-08-20 11:57 ` [PATCH v3 net-next 7/7] net: dsa: handle SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE synchronously Vladimir Oltean
2021-08-20 15:46 ` [PATCH v3 net-next 0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vlad Buslov
2021-08-26 14:35 ` Alexandra Winter
2021-08-26 14:41   ` Vladimir Oltean

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.