All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch net-next v2 0/7] switchdev: change locking
@ 2015-10-12 13:15 Jiri Pirko
  2015-10-12 13:15 ` [patch net-next v2 1/7] switchdev: introduce switchdev workqueue Jiri Pirko
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Jiri Pirko @ 2015-10-12 13:15 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, sfeldma, f.fainelli, linux, vivien.didelot,
	andrew, john.fastabend, David.Laight, stephen

From: Jiri Pirko <jiri@mellanox.com>

This is something which I'm currently struggling with.
Callers of attr_set and obj_add/del often hold not only RTNL, but also
spinlock (bridge). So in that case, the driver implementing the op cannot sleep.

The way rocker is dealing with this now is just to invoke driver operation
and go out, without any checking or reporting of the operation status.

Since it would be nice to at least put a warning in case the operation fails,
it makes sense to do this in delayed work directly in switchdev core
instead of implementing this in separate drivers. And that is what this patchset
is introducing.

So from now on, the locking of switchdev mod ops is consistent. Caller either
holds rtnl mutex or in case it does not, caller sets defer flag, telling
switchdev core to process the op later in delayed work.

Flush function for switchdev deferred ops can be called by op
caller in appropriate location, for example after it releases
spin lock, to force switchdev core to process pending ops.

v1->v2:
- rebased on current net-next head (including Scott's ageing patchset)

Jiri Pirko (7):
  switchdev: introduce switchdev workqueue
  switchdev: allow caller to explicitly request attr_set as deferred
  switchdev: remove pointers from switchdev objects
  switchdev: introduce possibility to defer obj_add/del
  bridge: defer switchdev fdb del call in fdb_del_external_learn
  rocker: remove nowait from switchdev callbacks.
  switchdev: assert rtnl mutex when going over lower netdevs

 drivers/net/ethernet/rocker/rocker.c |  13 +-
 include/net/switchdev.h              |  14 +-
 net/bridge/br_fdb.c                  |   7 +-
 net/bridge/br_if.c                   |   3 +
 net/bridge/br_stp.c                  |   3 +-
 net/dsa/slave.c                      |   2 +-
 net/switchdev/switchdev.c            | 289 ++++++++++++++++++++++++-----------
 7 files changed, 231 insertions(+), 100 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [patch net-next v2 1/7] switchdev: introduce switchdev workqueue
  2015-10-12 13:15 [patch net-next v2 0/7] switchdev: change locking Jiri Pirko
@ 2015-10-12 13:15 ` Jiri Pirko
  2015-10-12 13:15 ` [patch net-next v2 2/7] switchdev: allow caller to explicitly request attr_set as deferred Jiri Pirko
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Jiri Pirko @ 2015-10-12 13:15 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, sfeldma, f.fainelli, linux, vivien.didelot,
	andrew, john.fastabend, David.Laight, stephen

From: Jiri Pirko <jiri@mellanox.com>

This is going to be used for deferred operations.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/switchdev.h   |  5 +++++
 net/switchdev/switchdev.c | 19 +++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 1ce7083..d2879f2 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -205,6 +205,7 @@ int switchdev_port_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
 void switchdev_port_fwd_mark_set(struct net_device *dev,
 				 struct net_device *group_dev,
 				 bool joining);
+void switchdev_flush_deferred(void);
 
 #else
 
@@ -326,6 +327,10 @@ static inline void switchdev_port_fwd_mark_set(struct net_device *dev,
 {
 }
 
+static inline void switchdev_flush_deferred(void)
+{
+}
+
 #endif
 
 #endif /* _LINUX_SWITCHDEV_H_ */
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 7a9ab90..d119e9c 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -17,9 +17,12 @@
 #include <linux/netdevice.h>
 #include <linux/if_bridge.h>
 #include <linux/list.h>
+#include <linux/workqueue.h>
 #include <net/ip_fib.h>
 #include <net/switchdev.h>
 
+static struct workqueue_struct *switchdev_wq;
+
 /**
  *	switchdev_trans_item_enqueue - Enqueue data item to transaction queue
  *
@@ -1217,3 +1220,19 @@ void switchdev_port_fwd_mark_set(struct net_device *dev,
 	dev->offload_fwd_mark = mark;
 }
 EXPORT_SYMBOL_GPL(switchdev_port_fwd_mark_set);
+
+void switchdev_flush_deferred(void)
+{
+	flush_workqueue(switchdev_wq);
+}
+EXPORT_SYMBOL_GPL(switchdev_flush_deferred);
+
+static int __init switchdev_init(void)
+{
+	switchdev_wq = create_workqueue("switchdev");
+	if (!switchdev_wq)
+		return -ENOMEM;
+	return 0;
+}
+
+subsys_initcall(switchdev_init);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [patch net-next v2 2/7] switchdev: allow caller to explicitly request attr_set as deferred
  2015-10-12 13:15 [patch net-next v2 0/7] switchdev: change locking Jiri Pirko
  2015-10-12 13:15 ` [patch net-next v2 1/7] switchdev: introduce switchdev workqueue Jiri Pirko
@ 2015-10-12 13:15 ` Jiri Pirko
  2015-10-12 13:15 ` [patch net-next v2 3/7] switchdev: remove pointers from switchdev objects Jiri Pirko
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Jiri Pirko @ 2015-10-12 13:15 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, sfeldma, f.fainelli, linux, vivien.didelot,
	andrew, john.fastabend, David.Laight, stephen

From: Jiri Pirko <jiri@mellanox.com>

Caller should know if he can call attr_set directly (when holding RTNL)
or if he has to defer the att_set processing for later.

This also allows drivers to sleep inside attr_set and report operation
status back to switchdev core. Switchdev core then warns if status is
not ok, instead of silent errors happening in drivers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/switchdev.h   |   1 +
 net/bridge/br_stp.c       |   3 +-
 net/switchdev/switchdev.c | 107 ++++++++++++++++++++++++----------------------
 3 files changed, 59 insertions(+), 52 deletions(-)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index d2879f2..6b109e4 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -17,6 +17,7 @@
 
 #define SWITCHDEV_F_NO_RECURSE		BIT(0)
 #define SWITCHDEV_F_SKIP_EOPNOTSUPP	BIT(1)
+#define SWITCHDEV_F_DEFER		BIT(2)
 
 struct switchdev_trans_item {
 	struct list_head list;
diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index db6d243de..80c34d7 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -41,13 +41,14 @@ void br_set_state(struct net_bridge_port *p, unsigned int state)
 {
 	struct switchdev_attr attr = {
 		.id = SWITCHDEV_ATTR_ID_PORT_STP_STATE,
+		.flags = SWITCHDEV_F_DEFER,
 		.u.stp_state = state,
 	};
 	int err;
 
 	p->state = state;
 	err = switchdev_port_attr_set(p->dev, &attr);
-	if (err && err != -EOPNOTSUPP)
+	if (err)
 		br_warn(p->br, "error setting offload STP state on port %u(%s)\n",
 				(unsigned int) p->port_no, p->dev->name);
 }
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index d119e9c..9f94272 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -173,6 +173,49 @@ done:
 	return err;
 }
 
+static int switchdev_port_attr_set_now(struct net_device *dev,
+				       struct switchdev_attr *attr)
+{
+	struct switchdev_trans trans;
+	int err;
+
+	switchdev_trans_init(&trans);
+
+	/* Phase I: prepare for attr set. Driver/device should fail
+	 * here if there are going to be issues in the commit phase,
+	 * such as lack of resources or support.  The driver/device
+	 * should reserve resources needed for the commit phase here,
+	 * but should not commit the attr.
+	 */
+
+	trans.ph_prepare = true;
+	err = __switchdev_port_attr_set(dev, attr, &trans);
+	if (err) {
+		/* Prepare phase failed: abort the transaction.  Any
+		 * resources reserved in the prepare phase are
+		 * released.
+		 */
+
+		if (err != -EOPNOTSUPP)
+			switchdev_trans_items_destroy(&trans);
+
+		return err;
+	}
+
+	/* Phase II: commit attr set.  This cannot fail as a fault
+	 * of driver/device.  If it does, it's a bug in the driver/device
+	 * because the driver said everythings was OK in phase I.
+	 */
+
+	trans.ph_prepare = false;
+	err = __switchdev_port_attr_set(dev, attr, &trans);
+	WARN(err, "%s: Commit of attribute (id=%d) failed.\n",
+	     dev->name, attr->id);
+	switchdev_trans_items_warn_destroy(dev, &trans);
+
+	return err;
+}
+
 struct switchdev_attr_set_work {
 	struct work_struct work;
 	struct net_device *dev;
@@ -183,14 +226,17 @@ static void switchdev_port_attr_set_work(struct work_struct *work)
 {
 	struct switchdev_attr_set_work *asw =
 		container_of(work, struct switchdev_attr_set_work, work);
+	bool rtnl_locked = rtnl_is_locked();
 	int err;
 
-	rtnl_lock();
-	err = switchdev_port_attr_set(asw->dev, &asw->attr);
+	if (!rtnl_locked)
+		rtnl_lock();
+	err = switchdev_port_attr_set_now(asw->dev, &asw->attr);
 	if (err && err != -EOPNOTSUPP)
 		netdev_err(asw->dev, "failed (err=%d) to set attribute (id=%d)\n",
 			   err, asw->attr.id);
-	rtnl_unlock();
+	if (!rtnl_locked)
+		rtnl_unlock();
 
 	dev_put(asw->dev);
 	kfree(work);
@@ -211,7 +257,7 @@ static int switchdev_port_attr_set_defer(struct net_device *dev,
 	asw->dev = dev;
 	memcpy(&asw->attr, attr, sizeof(asw->attr));
 
-	schedule_work(&asw->work);
+	queue_work(switchdev_wq, &asw->work);
 
 	return 0;
 }
@@ -225,57 +271,16 @@ static int switchdev_port_attr_set_defer(struct net_device *dev,
  *	Use a 2-phase prepare-commit transaction model to ensure
  *	system is not left in a partially updated state due to
  *	failure from driver/device.
+ *
+ *	rtnl_lock must be held and must not be in atomic section,
+ *	in case SWITCHDEV_F_DEFER flag is not set.
  */
 int switchdev_port_attr_set(struct net_device *dev, struct switchdev_attr *attr)
 {
-	struct switchdev_trans trans;
-	int err;
-
-	if (!rtnl_is_locked()) {
-		/* Running prepare-commit transaction across stacked
-		 * devices requires nothing moves, so if rtnl_lock is
-		 * not held, schedule a worker thread to hold rtnl_lock
-		 * while setting attr.
-		 */
-
+	if (attr->flags & SWITCHDEV_F_DEFER)
 		return switchdev_port_attr_set_defer(dev, attr);
-	}
-
-	switchdev_trans_init(&trans);
-
-	/* Phase I: prepare for attr set. Driver/device should fail
-	 * here if there are going to be issues in the commit phase,
-	 * such as lack of resources or support.  The driver/device
-	 * should reserve resources needed for the commit phase here,
-	 * but should not commit the attr.
-	 */
-
-	trans.ph_prepare = true;
-	err = __switchdev_port_attr_set(dev, attr, &trans);
-	if (err) {
-		/* Prepare phase failed: abort the transaction.  Any
-		 * resources reserved in the prepare phase are
-		 * released.
-		 */
-
-		if (err != -EOPNOTSUPP)
-			switchdev_trans_items_destroy(&trans);
-
-		return err;
-	}
-
-	/* Phase II: commit attr set.  This cannot fail as a fault
-	 * of driver/device.  If it does, it's a bug in the driver/device
-	 * because the driver said everythings was OK in phase I.
-	 */
-
-	trans.ph_prepare = false;
-	err = __switchdev_port_attr_set(dev, attr, &trans);
-	WARN(err, "%s: Commit of attribute (id=%d) failed.\n",
-	     dev->name, attr->id);
-	switchdev_trans_items_warn_destroy(dev, &trans);
-
-	return err;
+	ASSERT_RTNL();
+	return switchdev_port_attr_set_now(dev, attr);
 }
 EXPORT_SYMBOL_GPL(switchdev_port_attr_set);
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [patch net-next v2 3/7] switchdev: remove pointers from switchdev objects
  2015-10-12 13:15 [patch net-next v2 0/7] switchdev: change locking Jiri Pirko
  2015-10-12 13:15 ` [patch net-next v2 1/7] switchdev: introduce switchdev workqueue Jiri Pirko
  2015-10-12 13:15 ` [patch net-next v2 2/7] switchdev: allow caller to explicitly request attr_set as deferred Jiri Pirko
@ 2015-10-12 13:15 ` Jiri Pirko
  2015-10-12 13:15 ` [patch net-next v2 4/7] switchdev: introduce possibility to defer obj_add/del Jiri Pirko
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Jiri Pirko @ 2015-10-12 13:15 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, sfeldma, f.fainelli, linux, vivien.didelot,
	andrew, john.fastabend, David.Laight, stephen

From: Jiri Pirko <jiri@mellanox.com>

When object is used in deferred work, we cannot use pointers in
switchdev object structures because the memory they point at may be already
used by someone else. So rather do local copy of the value.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/rocker/rocker.c |  6 +++---
 include/net/switchdev.h              |  7 +++----
 net/bridge/br_fdb.c                  |  2 +-
 net/dsa/slave.c                      |  2 +-
 net/switchdev/switchdev.c            | 11 +++++++----
 5 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index eafa907..bb956a5 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -4469,7 +4469,7 @@ static int rocker_port_obj_add(struct net_device *dev,
 		fib4 = SWITCHDEV_OBJ_IPV4_FIB(obj);
 		err = rocker_port_fib_ipv4(rocker_port, trans,
 					   htonl(fib4->dst), fib4->dst_len,
-					   fib4->fi, fib4->tb_id, 0);
+					   &fib4->fi, fib4->tb_id, 0);
 		break;
 	case SWITCHDEV_OBJ_ID_PORT_FDB:
 		err = rocker_port_fdb_add(rocker_port, trans,
@@ -4541,7 +4541,7 @@ static int rocker_port_obj_del(struct net_device *dev,
 		fib4 = SWITCHDEV_OBJ_IPV4_FIB(obj);
 		err = rocker_port_fib_ipv4(rocker_port, NULL,
 					   htonl(fib4->dst), fib4->dst_len,
-					   fib4->fi, fib4->tb_id,
+					   &fib4->fi, fib4->tb_id,
 					   ROCKER_OP_FLAG_REMOVE);
 		break;
 	case SWITCHDEV_OBJ_ID_PORT_FDB:
@@ -4571,7 +4571,7 @@ static int rocker_port_fdb_dump(const struct rocker_port *rocker_port,
 	hash_for_each_safe(rocker->fdb_tbl, bkt, tmp, found, entry) {
 		if (found->key.rocker_port != rocker_port)
 			continue;
-		fdb->addr = found->key.addr;
+		ether_addr_copy(fdb->addr, found->key.addr);
 		fdb->ndm_state = NUD_REACHABLE;
 		fdb->vid = rocker_port_vlan_to_vid(rocker_port,
 						   found->key.vlan_id);
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 6b109e4..767d516 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -14,6 +14,7 @@
 #include <linux/netdevice.h>
 #include <linux/notifier.h>
 #include <linux/list.h>
+#include <net/ip_fib.h>
 
 #define SWITCHDEV_F_NO_RECURSE		BIT(0)
 #define SWITCHDEV_F_SKIP_EOPNOTSUPP	BIT(1)
@@ -59,8 +60,6 @@ struct switchdev_attr {
 	} u;
 };
 
-struct fib_info;
-
 enum switchdev_obj_id {
 	SWITCHDEV_OBJ_ID_UNDEFINED,
 	SWITCHDEV_OBJ_ID_PORT_VLAN,
@@ -88,7 +87,7 @@ struct switchdev_obj_ipv4_fib {
 	struct switchdev_obj obj;
 	u32 dst;
 	int dst_len;
-	struct fib_info *fi;
+	struct fib_info fi;
 	u8 tos;
 	u8 type;
 	u32 nlflags;
@@ -101,7 +100,7 @@ struct switchdev_obj_ipv4_fib {
 /* SWITCHDEV_OBJ_ID_PORT_FDB */
 struct switchdev_obj_port_fdb {
 	struct switchdev_obj obj;
-	const unsigned char *addr;
+	unsigned char addr[ETH_ALEN];
 	u16 vid;
 	u16 ndm_state;
 };
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index f43ce05..f5e7da0 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -135,10 +135,10 @@ static void fdb_del_external_learn(struct net_bridge_fdb_entry *f)
 {
 	struct switchdev_obj_port_fdb fdb = {
 		.obj.id = SWITCHDEV_OBJ_ID_PORT_FDB,
-		.addr = f->addr.addr,
 		.vid = f->vlan_id,
 	};
 
+	ether_addr_copy(fdb.addr, f->addr.addr);
 	switchdev_port_obj_del(f->dst->dev, &fdb.obj);
 }
 
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index bb2bd3b..2ad4e0e 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -393,7 +393,7 @@ static int dsa_slave_port_fdb_dump(struct net_device *dev,
 		if (ret < 0)
 			break;
 
-		fdb->addr = addr;
+		ether_addr_copy(fdb->addr, addr);
 		fdb->vid = vid;
 		fdb->ndm_state = is_static ? NUD_NOARP : NUD_REACHABLE;
 
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 9f94272..c23dd31 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -15,6 +15,7 @@
 #include <linux/mutex.h>
 #include <linux/notifier.h>
 #include <linux/netdevice.h>
+#include <linux/etherdevice.h>
 #include <linux/if_bridge.h>
 #include <linux/list.h>
 #include <linux/workqueue.h>
@@ -837,10 +838,10 @@ int switchdev_port_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 {
 	struct switchdev_obj_port_fdb fdb = {
 		.obj.id = SWITCHDEV_OBJ_ID_PORT_FDB,
-		.addr = addr,
 		.vid = vid,
 	};
 
+	ether_addr_copy(fdb.addr, addr);
 	return switchdev_port_obj_add(dev, &fdb.obj);
 }
 EXPORT_SYMBOL_GPL(switchdev_port_fdb_add);
@@ -862,10 +863,10 @@ int switchdev_port_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
 {
 	struct switchdev_obj_port_fdb fdb = {
 		.obj.id = SWITCHDEV_OBJ_ID_PORT_FDB,
-		.addr = addr,
 		.vid = vid,
 	};
 
+	ether_addr_copy(fdb.addr, addr);
 	return switchdev_port_obj_del(dev, &fdb.obj);
 }
 EXPORT_SYMBOL_GPL(switchdev_port_fdb_del);
@@ -1027,7 +1028,6 @@ int switchdev_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
 		.obj.id = SWITCHDEV_OBJ_ID_IPV4_FIB,
 		.dst = dst,
 		.dst_len = dst_len,
-		.fi = fi,
 		.tos = tos,
 		.type = type,
 		.nlflags = nlflags,
@@ -1036,6 +1036,8 @@ int switchdev_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
 	struct net_device *dev;
 	int err = 0;
 
+	memcpy(&ipv4_fib.fi, fi, sizeof(ipv4_fib.fi));
+
 	/* Don't offload route if using custom ip rules or if
 	 * IPv4 FIB offloading has been disabled completely.
 	 */
@@ -1079,7 +1081,6 @@ int switchdev_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
 		.obj.id = SWITCHDEV_OBJ_ID_IPV4_FIB,
 		.dst = dst,
 		.dst_len = dst_len,
-		.fi = fi,
 		.tos = tos,
 		.type = type,
 		.nlflags = 0,
@@ -1088,6 +1089,8 @@ int switchdev_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
 	struct net_device *dev;
 	int err = 0;
 
+	memcpy(&ipv4_fib.fi, fi, sizeof(ipv4_fib.fi));
+
 	if (!(fi->fib_flags & RTNH_F_OFFLOAD))
 		return 0;
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [patch net-next v2 4/7] switchdev: introduce possibility to defer obj_add/del
  2015-10-12 13:15 [patch net-next v2 0/7] switchdev: change locking Jiri Pirko
                   ` (2 preceding siblings ...)
  2015-10-12 13:15 ` [patch net-next v2 3/7] switchdev: remove pointers from switchdev objects Jiri Pirko
@ 2015-10-12 13:15 ` Jiri Pirko
  2015-10-12 14:34   ` Nikolay Aleksandrov
  2015-10-12 13:15 ` [patch net-next v2 5/7] bridge: defer switchdev fdb del call in fdb_del_external_learn Jiri Pirko
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Jiri Pirko @ 2015-10-12 13:15 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, sfeldma, f.fainelli, linux, vivien.didelot,
	andrew, john.fastabend, David.Laight, stephen

From: Jiri Pirko <jiri@mellanox.com>

Similar to the attr usecase, the caller knows if he is holding RTNL and is
in atomic section. So let the called to decide the correct call variant.

This allows drivers to sleep inside their ops and wait for hw to get the
operation status. Then the status is propagated into switchdev core.
This avoids silent errors in drivers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/switchdev.h   |   1 +
 net/switchdev/switchdev.c | 137 +++++++++++++++++++++++++++++++++++++---------
 2 files changed, 112 insertions(+), 26 deletions(-)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 767d516..14e2595 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -69,6 +69,7 @@ enum switchdev_obj_id {
 
 struct switchdev_obj {
 	enum switchdev_obj_id id;
+	u32 flags;
 };
 
 /* SWITCHDEV_OBJ_ID_PORT_VLAN */
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index c23dd31..5d0d731 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -311,21 +311,8 @@ static int __switchdev_port_obj_add(struct net_device *dev,
 	return err;
 }
 
-/**
- *	switchdev_port_obj_add - Add port object
- *
- *	@dev: port device
- *	@id: object ID
- *	@obj: object to add
- *
- *	Use a 2-phase prepare-commit transaction model to ensure
- *	system is not left in a partially updated state due to
- *	failure from driver/device.
- *
- *	rtnl_lock must be held.
- */
-int switchdev_port_obj_add(struct net_device *dev,
-			   const struct switchdev_obj *obj)
+static int switchdev_port_obj_add_now(struct net_device *dev,
+				      const struct switchdev_obj *obj)
 {
 	struct switchdev_trans trans;
 	int err;
@@ -367,17 +354,9 @@ int switchdev_port_obj_add(struct net_device *dev,
 
 	return err;
 }
-EXPORT_SYMBOL_GPL(switchdev_port_obj_add);
 
-/**
- *	switchdev_port_obj_del - Delete port object
- *
- *	@dev: port device
- *	@id: object ID
- *	@obj: object to delete
- */
-int switchdev_port_obj_del(struct net_device *dev,
-			   const struct switchdev_obj *obj)
+static int switchdev_port_obj_del_now(struct net_device *dev,
+				      const struct switchdev_obj *obj)
 {
 	const struct switchdev_ops *ops = dev->switchdev_ops;
 	struct net_device *lower_dev;
@@ -393,13 +372,119 @@ int switchdev_port_obj_del(struct net_device *dev,
 	 */
 
 	netdev_for_each_lower_dev(dev, lower_dev, iter) {
-		err = switchdev_port_obj_del(lower_dev, obj);
+		err = switchdev_port_obj_del_now(lower_dev, obj);
 		if (err)
 			break;
 	}
 
 	return err;
 }
+
+struct switchdev_obj_work {
+	struct work_struct work;
+	struct net_device *dev;
+	struct switchdev_obj obj;
+	bool add; /* add of del */
+};
+
+static void switchdev_port_obj_work(struct work_struct *work)
+{
+	struct switchdev_obj_work *ow =
+			container_of(work, struct switchdev_obj_work, work);
+	bool rtnl_locked = rtnl_is_locked();
+	int err;
+
+	if (!rtnl_locked)
+		rtnl_lock();
+	if (ow->add)
+		err = switchdev_port_obj_add_now(ow->dev, &ow->obj);
+	else
+		err = switchdev_port_obj_del_now(ow->dev, &ow->obj);
+	if (err && err != -EOPNOTSUPP)
+		netdev_err(ow->dev, "failed (err=%d) to %s object (id=%d)\n",
+			   err, ow->add ? "add" : "del", ow->obj.id);
+	if (!rtnl_locked)
+		rtnl_unlock();
+
+	dev_put(ow->dev);
+	kfree(ow);
+}
+
+static int switchdev_port_obj_work_schedule(struct net_device *dev,
+					    const struct switchdev_obj *obj,
+					    bool add)
+{
+	struct switchdev_obj_work *ow;
+
+	ow = kmalloc(sizeof(*ow), GFP_ATOMIC);
+	if (!ow)
+		return -ENOMEM;
+
+	INIT_WORK(&ow->work, switchdev_port_obj_work);
+
+	dev_hold(dev);
+	ow->dev = dev;
+	memcpy(&ow->obj, obj, sizeof(ow->obj));
+	ow->add = add;
+
+	queue_work(switchdev_wq, &ow->work);
+	return 0;
+}
+
+static int switchdev_port_obj_add_defer(struct net_device *dev,
+					const struct switchdev_obj *obj)
+{
+	return switchdev_port_obj_work_schedule(dev, obj, true);
+}
+
+/**
+ *	switchdev_port_obj_add - Add port object
+ *
+ *	@dev: port device
+ *	@id: object ID
+ *	@obj: object to add
+ *
+ *	Use a 2-phase prepare-commit transaction model to ensure
+ *	system is not left in a partially updated state due to
+ *	failure from driver/device.
+ *
+ *	rtnl_lock must be held and must not be in atomic section,
+ *	in case SWITCHDEV_F_DEFER flag is not set.
+ */
+int switchdev_port_obj_add(struct net_device *dev,
+			   const struct switchdev_obj *obj)
+{
+	if (obj->flags & SWITCHDEV_F_DEFER)
+		return switchdev_port_obj_add_defer(dev, obj);
+	ASSERT_RTNL();
+	return switchdev_port_obj_add_now(dev, obj);
+}
+EXPORT_SYMBOL_GPL(switchdev_port_obj_add);
+
+static int switchdev_port_obj_del_defer(struct net_device *dev,
+					const struct switchdev_obj *obj)
+{
+	return switchdev_port_obj_work_schedule(dev, obj, false);
+}
+
+/**
+ *	switchdev_port_obj_del - Delete port object
+ *
+ *	@dev: port device
+ *	@id: object ID
+ *	@obj: object to delete
+ *
+ *	rtnl_lock must be held and must not be in atomic section,
+ *	in case SWITCHDEV_F_DEFER flag is not set.
+ */
+int switchdev_port_obj_del(struct net_device *dev,
+			   const struct switchdev_obj *obj)
+{
+	if (obj->flags & SWITCHDEV_F_DEFER)
+		return switchdev_port_obj_del_defer(dev, obj);
+	ASSERT_RTNL();
+	return switchdev_port_obj_del_now(dev, obj);
+}
 EXPORT_SYMBOL_GPL(switchdev_port_obj_del);
 
 /**
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [patch net-next v2 5/7] bridge: defer switchdev fdb del call in fdb_del_external_learn
  2015-10-12 13:15 [patch net-next v2 0/7] switchdev: change locking Jiri Pirko
                   ` (3 preceding siblings ...)
  2015-10-12 13:15 ` [patch net-next v2 4/7] switchdev: introduce possibility to defer obj_add/del Jiri Pirko
@ 2015-10-12 13:15 ` Jiri Pirko
  2015-10-12 13:15 ` [patch net-next v2 6/7] rocker: remove nowait from switchdev callbacks Jiri Pirko
  2015-10-12 13:15 ` [patch net-next v2 7/7] switchdev: assert rtnl mutex when going over lower netdevs Jiri Pirko
  6 siblings, 0 replies; 13+ messages in thread
From: Jiri Pirko @ 2015-10-12 13:15 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, sfeldma, f.fainelli, linux, vivien.didelot,
	andrew, john.fastabend, David.Laight, stephen

From: Jiri Pirko <jiri@mellanox.com>

Since spinlock is held here, defer the switchdev operation.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/bridge/br_fdb.c | 5 ++++-
 net/bridge/br_if.c  | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index f5e7da0..c88bd8e 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -134,7 +134,10 @@ static void fdb_del_hw_addr(struct net_bridge *br, const unsigned char *addr)
 static void fdb_del_external_learn(struct net_bridge_fdb_entry *f)
 {
 	struct switchdev_obj_port_fdb fdb = {
-		.obj.id = SWITCHDEV_OBJ_ID_PORT_FDB,
+		.obj = {
+			.id = SWITCHDEV_OBJ_ID_PORT_FDB,
+			.flags = SWITCHDEV_F_DEFER,
+		},
 		.vid = f->vlan_id,
 	};
 
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 934cae9..09147cb 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -24,6 +24,7 @@
 #include <linux/slab.h>
 #include <net/sock.h>
 #include <linux/if_vlan.h>
+#include <net/switchdev.h>
 
 #include "br_private.h"
 
@@ -249,6 +250,8 @@ static void del_nbp(struct net_bridge_port *p)
 	list_del_rcu(&p->list);
 
 	br_fdb_delete_by_port(br, p, 0, 1);
+	switchdev_flush_deferred();
+
 	nbp_update_port_count(br);
 
 	netdev_upper_dev_unlink(dev, br->dev);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [patch net-next v2 6/7] rocker: remove nowait from switchdev callbacks.
  2015-10-12 13:15 [patch net-next v2 0/7] switchdev: change locking Jiri Pirko
                   ` (4 preceding siblings ...)
  2015-10-12 13:15 ` [patch net-next v2 5/7] bridge: defer switchdev fdb del call in fdb_del_external_learn Jiri Pirko
@ 2015-10-12 13:15 ` Jiri Pirko
  2015-10-12 13:15 ` [patch net-next v2 7/7] switchdev: assert rtnl mutex when going over lower netdevs Jiri Pirko
  6 siblings, 0 replies; 13+ messages in thread
From: Jiri Pirko @ 2015-10-12 13:15 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, sfeldma, f.fainelli, linux, vivien.didelot,
	andrew, john.fastabend, David.Laight, stephen

From: Jiri Pirko <jiri@mellanox.com>

No need to avoid sleeping in switchdev callbacks now, as the switchdev
core allows it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/rocker/rocker.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index bb956a5..9629c5b5 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -3672,7 +3672,7 @@ static int rocker_port_fdb_flush(struct rocker_port *rocker_port,
 	    rocker_port->stp_state == BR_STATE_FORWARDING)
 		return 0;
 
-	flags |= ROCKER_OP_FLAG_REMOVE;
+	flags |= ROCKER_OP_FLAG_NOWAIT | ROCKER_OP_FLAG_REMOVE;
 
 	spin_lock_irqsave(&rocker->fdb_tbl_lock, lock_flags);
 
@@ -4382,8 +4382,7 @@ static int rocker_port_attr_set(struct net_device *dev,
 
 	switch (attr->id) {
 	case SWITCHDEV_ATTR_ID_PORT_STP_STATE:
-		err = rocker_port_stp_update(rocker_port, trans,
-					     ROCKER_OP_FLAG_NOWAIT,
+		err = rocker_port_stp_update(rocker_port, trans, 0,
 					     attr->u.stp_state);
 		break;
 	case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS:
@@ -4517,7 +4516,7 @@ static int rocker_port_fdb_del(struct rocker_port *rocker_port,
 			       const struct switchdev_obj_port_fdb *fdb)
 {
 	__be16 vlan_id = rocker_port_vid_to_vlan(rocker_port, fdb->vid, NULL);
-	int flags = ROCKER_OP_FLAG_NOWAIT | ROCKER_OP_FLAG_REMOVE;
+	int flags = ROCKER_OP_FLAG_REMOVE;
 
 	if (!rocker_port_is_bridged(rocker_port))
 		return -EINVAL;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [patch net-next v2 7/7] switchdev: assert rtnl mutex when going over lower netdevs
  2015-10-12 13:15 [patch net-next v2 0/7] switchdev: change locking Jiri Pirko
                   ` (5 preceding siblings ...)
  2015-10-12 13:15 ` [patch net-next v2 6/7] rocker: remove nowait from switchdev callbacks Jiri Pirko
@ 2015-10-12 13:15 ` Jiri Pirko
  6 siblings, 0 replies; 13+ messages in thread
From: Jiri Pirko @ 2015-10-12 13:15 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, sfeldma, f.fainelli, linux, vivien.didelot,
	andrew, john.fastabend, David.Laight, stephen

From: Jiri Pirko <jiri@mellanox.com>

netdev_for_each_lower_dev has to be called with rtnl mutex held. So
better enforce it in switchdev functions.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/switchdev/switchdev.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 5d0d731..11e0291 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -494,6 +494,8 @@ EXPORT_SYMBOL_GPL(switchdev_port_obj_del);
  *	@id: object ID
  *	@obj: object to dump
  *	@cb: function to call with a filled object
+ *
+ *	rtnl_lock must be held.
  */
 int switchdev_port_obj_dump(struct net_device *dev, struct switchdev_obj *obj,
 			    switchdev_obj_dump_cb_t *cb)
@@ -503,6 +505,8 @@ int switchdev_port_obj_dump(struct net_device *dev, struct switchdev_obj *obj,
 	struct list_head *iter;
 	int err = -EOPNOTSUPP;
 
+	ASSERT_RTNL();
+
 	if (ops && ops->switchdev_port_obj_dump)
 		return ops->switchdev_port_obj_dump(dev, obj, cb);
 
@@ -1068,6 +1072,8 @@ static struct net_device *switchdev_get_dev_by_nhs(struct fib_info *fi)
 	struct net_device *dev = NULL;
 	int nhsel;
 
+	ASSERT_RTNL();
+
 	/* For this route, all nexthop devs must be on the same switch. */
 
 	for (nhsel = 0; nhsel < fi->fib_nhs; nhsel++) {
@@ -1298,10 +1304,11 @@ void switchdev_port_fwd_mark_set(struct net_device *dev,
 	u32 mark = dev->ifindex;
 	u32 reset_mark = 0;
 
-	if (group_dev && joining) {
-		mark = switchdev_port_fwd_mark_get(dev, group_dev);
-	} else if (group_dev && !joining) {
-		if (dev->offload_fwd_mark == mark)
+	if (group_dev) {
+		ASSERT_RTNL();
+		if (joining)
+			mark = switchdev_port_fwd_mark_get(dev, group_dev);
+		else if (dev->offload_fwd_mark == mark)
 			/* Ohoh, this port was the mark reference port,
 			 * but it's leaving the group, so reset the
 			 * mark for the remaining ports in the group.
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [patch net-next v2 4/7] switchdev: introduce possibility to defer obj_add/del
  2015-10-12 13:15 ` [patch net-next v2 4/7] switchdev: introduce possibility to defer obj_add/del Jiri Pirko
@ 2015-10-12 14:34   ` Nikolay Aleksandrov
  2015-10-12 14:42     ` Nikolay Aleksandrov
  2015-10-12 14:44     ` Jiri Pirko
  0 siblings, 2 replies; 13+ messages in thread
From: Nikolay Aleksandrov @ 2015-10-12 14:34 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, idosch, eladr, sfeldma, f.fainelli, linux, vivien.didelot,
	andrew, john.fastabend, David.Laight, stephen

On 10/12/2015 03:15 PM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Similar to the attr usecase, the caller knows if he is holding RTNL and is
> in atomic section. So let the called to decide the correct call variant.
> 
> This allows drivers to sleep inside their ops and wait for hw to get the
> operation status. Then the status is propagated into switchdev core.
> This avoids silent errors in drivers.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> ---
>  include/net/switchdev.h   |   1 +
>  net/switchdev/switchdev.c | 137 +++++++++++++++++++++++++++++++++++++---------
>  2 files changed, 112 insertions(+), 26 deletions(-)
> 
[snip]
> +
> +struct switchdev_obj_work {
> +	struct work_struct work;
> +	struct net_device *dev;
> +	struct switchdev_obj obj;
> +	bool add; /* add of del */
s/of/or/ ? :-)

> +};
> +
> +static void switchdev_port_obj_work(struct work_struct *work)
> +{
> +	struct switchdev_obj_work *ow =
> +			container_of(work, struct switchdev_obj_work, work);
> +	bool rtnl_locked = rtnl_is_locked();
> +	int err;
> +
> +	if (!rtnl_locked)
> +		rtnl_lock();
> +	if (ow->add)
> +		err = switchdev_port_obj_add_now(ow->dev, &ow->obj);
> +	else
> +		err = switchdev_port_obj_del_now(ow->dev, &ow->obj);
> +	if (err && err != -EOPNOTSUPP)
> +		netdev_err(ow->dev, "failed (err=%d) to %s object (id=%d)\n",
> +			   err, ow->add ? "add" : "del", ow->obj.id);
> +	if (!rtnl_locked)
> +		rtnl_unlock();
> +
> +	dev_put(ow->dev);
> +	kfree(ow);
> +}
> +
> +static int switchdev_port_obj_work_schedule(struct net_device *dev,
> +					    const struct switchdev_obj *obj,
> +					    bool add)
> +{
> +	struct switchdev_obj_work *ow;
> +
> +	ow = kmalloc(sizeof(*ow), GFP_ATOMIC);
> +	if (!ow)
> +		return -ENOMEM;
> +
> +	INIT_WORK(&ow->work, switchdev_port_obj_work);
> +
This can be called without rtnl, what stops the device from disappearing
between the above and the hold below ?

> +	dev_hold(dev);
> +	ow->dev = dev;
> +	memcpy(&ow->obj, obj, sizeof(ow->obj));
> +	ow->add = add;
> +
> +	queue_work(switchdev_wq, &ow->work);
> +	return 0;
> +}
> +
[snip]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [patch net-next v2 4/7] switchdev: introduce possibility to defer obj_add/del
  2015-10-12 14:34   ` Nikolay Aleksandrov
@ 2015-10-12 14:42     ` Nikolay Aleksandrov
  2015-10-12 14:57       ` Jiri Pirko
  2015-10-12 14:44     ` Jiri Pirko
  1 sibling, 1 reply; 13+ messages in thread
From: Nikolay Aleksandrov @ 2015-10-12 14:42 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, idosch, eladr, sfeldma, f.fainelli, linux, vivien.didelot,
	andrew, john.fastabend, David.Laight, stephen

On 10/12/2015 04:34 PM, Nikolay Aleksandrov wrote:
> On 10/12/2015 03:15 PM, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>>
>> Similar to the attr usecase, the caller knows if he is holding RTNL and is
>> in atomic section. So let the called to decide the correct call variant.
>>
>> This allows drivers to sleep inside their ops and wait for hw to get the
>> operation status. Then the status is propagated into switchdev core.
>> This avoids silent errors in drivers.
>>
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>> ---
>>  include/net/switchdev.h   |   1 +
>>  net/switchdev/switchdev.c | 137 +++++++++++++++++++++++++++++++++++++---------
>>  2 files changed, 112 insertions(+), 26 deletions(-)
>>
> [snip]
>> +
>> +struct switchdev_obj_work {
>> +	struct work_struct work;
>> +	struct net_device *dev;
>> +	struct switchdev_obj obj;
>> +	bool add; /* add of del */
> s/of/or/ ? :-)
> 
>> +};
>> +
>> +static void switchdev_port_obj_work(struct work_struct *work)
>> +{
>> +	struct switchdev_obj_work *ow =
>> +			container_of(work, struct switchdev_obj_work, work);
>> +	bool rtnl_locked = rtnl_is_locked();
>> +	int err;
>> +
>> +	if (!rtnl_locked)
>> +		rtnl_lock();
>> +	if (ow->add)
>> +		err = switchdev_port_obj_add_now(ow->dev, &ow->obj);
>> +	else
>> +		err = switchdev_port_obj_del_now(ow->dev, &ow->obj);
>> +	if (err && err != -EOPNOTSUPP)
>> +		netdev_err(ow->dev, "failed (err=%d) to %s object (id=%d)\n",
>> +			   err, ow->add ? "add" : "del", ow->obj.id);
>> +	if (!rtnl_locked)
>> +		rtnl_unlock();
>> +
>> +	dev_put(ow->dev);
>> +	kfree(ow);
>> +}
>> +
>> +static int switchdev_port_obj_work_schedule(struct net_device *dev,
>> +					    const struct switchdev_obj *obj,
>> +					    bool add)
>> +{
>> +	struct switchdev_obj_work *ow;
>> +
>> +	ow = kmalloc(sizeof(*ow), GFP_ATOMIC);
>> +	if (!ow)
>> +		return -ENOMEM;
>> +
>> +	INIT_WORK(&ow->work, switchdev_port_obj_work);
>> +
> This can be called without rtnl, what stops the device from disappearing
> between the above and the hold below ?
> 
Nevermind this question, got it.

Cheers,
 Nik

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [patch net-next v2 4/7] switchdev: introduce possibility to defer obj_add/del
  2015-10-12 14:34   ` Nikolay Aleksandrov
  2015-10-12 14:42     ` Nikolay Aleksandrov
@ 2015-10-12 14:44     ` Jiri Pirko
  2015-10-12 14:46       ` Nikolay Aleksandrov
  1 sibling, 1 reply; 13+ messages in thread
From: Jiri Pirko @ 2015-10-12 14:44 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, davem, idosch, eladr, sfeldma, f.fainelli, linux,
	vivien.didelot, andrew, john.fastabend, David.Laight, stephen

Mon, Oct 12, 2015 at 04:34:25PM CEST, nikolay@cumulusnetworks.com wrote:
>On 10/12/2015 03:15 PM, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> Similar to the attr usecase, the caller knows if he is holding RTNL and is
>> in atomic section. So let the called to decide the correct call variant.
>> 
>> This allows drivers to sleep inside their ops and wait for hw to get the
>> operation status. Then the status is propagated into switchdev core.
>> This avoids silent errors in drivers.
>> 
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>> ---
>>  include/net/switchdev.h   |   1 +
>>  net/switchdev/switchdev.c | 137 +++++++++++++++++++++++++++++++++++++---------
>>  2 files changed, 112 insertions(+), 26 deletions(-)
>> 
>[snip]
>> +
>> +struct switchdev_obj_work {
>> +	struct work_struct work;
>> +	struct net_device *dev;
>> +	struct switchdev_obj obj;
>> +	bool add; /* add of del */
>s/of/or/ ? :-)

will fix, thanks.


>
>> +};
>> +
>> +static void switchdev_port_obj_work(struct work_struct *work)
>> +{
>> +	struct switchdev_obj_work *ow =
>> +			container_of(work, struct switchdev_obj_work, work);
>> +	bool rtnl_locked = rtnl_is_locked();
>> +	int err;
>> +
>> +	if (!rtnl_locked)
>> +		rtnl_lock();
>> +	if (ow->add)
>> +		err = switchdev_port_obj_add_now(ow->dev, &ow->obj);
>> +	else
>> +		err = switchdev_port_obj_del_now(ow->dev, &ow->obj);
>> +	if (err && err != -EOPNOTSUPP)
>> +		netdev_err(ow->dev, "failed (err=%d) to %s object (id=%d)\n",
>> +			   err, ow->add ? "add" : "del", ow->obj.id);
>> +	if (!rtnl_locked)
>> +		rtnl_unlock();
>> +
>> +	dev_put(ow->dev);
>> +	kfree(ow);
>> +}
>> +
>> +static int switchdev_port_obj_work_schedule(struct net_device *dev,
>> +					    const struct switchdev_obj *obj,
>> +					    bool add)
>> +{
>> +	struct switchdev_obj_work *ow;
>> +
>> +	ow = kmalloc(sizeof(*ow), GFP_ATOMIC);
>> +	if (!ow)
>> +		return -ENOMEM;
>> +
>> +	INIT_WORK(&ow->work, switchdev_port_obj_work);
>> +
>This can be called without rtnl, what stops the device from disappearing
>between the above and the hold below ?

You are right. I will have to figure that out. Btw the same issue
already exists for attr_set deferred work.


>
>> +	dev_hold(dev);
>> +	ow->dev = dev;
>> +	memcpy(&ow->obj, obj, sizeof(ow->obj));
>> +	ow->add = add;
>> +
>> +	queue_work(switchdev_wq, &ow->work);
>> +	return 0;
>> +}
>> +
>[snip]
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [patch net-next v2 4/7] switchdev: introduce possibility to defer obj_add/del
  2015-10-12 14:44     ` Jiri Pirko
@ 2015-10-12 14:46       ` Nikolay Aleksandrov
  0 siblings, 0 replies; 13+ messages in thread
From: Nikolay Aleksandrov @ 2015-10-12 14:46 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, idosch, eladr, sfeldma, f.fainelli, linux,
	vivien.didelot, andrew, john.fastabend, David.Laight, stephen

On 10/12/2015 04:44 PM, Jiri Pirko wrote:
> Mon, Oct 12, 2015 at 04:34:25PM CEST, nikolay@cumulusnetworks.com wrote:
>> On 10/12/2015 03:15 PM, Jiri Pirko wrote:
>>> From: Jiri Pirko <jiri@mellanox.com>
>>>
>>> Similar to the attr usecase, the caller knows if he is holding RTNL and is
>>> in atomic section. So let the called to decide the correct call variant.
>>>
>>> This allows drivers to sleep inside their ops and wait for hw to get the
>>> operation status. Then the status is propagated into switchdev core.
>>> This avoids silent errors in drivers.
>>>
>>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>>> ---
>>>  include/net/switchdev.h   |   1 +
>>>  net/switchdev/switchdev.c | 137 +++++++++++++++++++++++++++++++++++++---------
>>>  2 files changed, 112 insertions(+), 26 deletions(-)
>>>
>> [snip]
>>> +
>>> +struct switchdev_obj_work {
>>> +	struct work_struct work;
>>> +	struct net_device *dev;
>>> +	struct switchdev_obj obj;
>>> +	bool add; /* add of del */
>> s/of/or/ ? :-)
> 
> will fix, thanks.
> 
> 
>>
>>> +};
>>> +
>>> +static void switchdev_port_obj_work(struct work_struct *work)
>>> +{
>>> +	struct switchdev_obj_work *ow =
>>> +			container_of(work, struct switchdev_obj_work, work);
>>> +	bool rtnl_locked = rtnl_is_locked();
>>> +	int err;
>>> +
>>> +	if (!rtnl_locked)
>>> +		rtnl_lock();
>>> +	if (ow->add)
>>> +		err = switchdev_port_obj_add_now(ow->dev, &ow->obj);
>>> +	else
>>> +		err = switchdev_port_obj_del_now(ow->dev, &ow->obj);
>>> +	if (err && err != -EOPNOTSUPP)
>>> +		netdev_err(ow->dev, "failed (err=%d) to %s object (id=%d)\n",
>>> +			   err, ow->add ? "add" : "del", ow->obj.id);
>>> +	if (!rtnl_locked)
>>> +		rtnl_unlock();
>>> +
>>> +	dev_put(ow->dev);
>>> +	kfree(ow);
>>> +}
>>> +
>>> +static int switchdev_port_obj_work_schedule(struct net_device *dev,
>>> +					    const struct switchdev_obj *obj,
>>> +					    bool add)
>>> +{
>>> +	struct switchdev_obj_work *ow;
>>> +
>>> +	ow = kmalloc(sizeof(*ow), GFP_ATOMIC);
>>> +	if (!ow)
>>> +		return -ENOMEM;
>>> +
>>> +	INIT_WORK(&ow->work, switchdev_port_obj_work);
>>> +
>> This can be called without rtnl, what stops the device from disappearing
>> between the above and the hold below ?
> 
> You are right. I will have to figure that out. Btw the same issue
> already exists for attr_set deferred work.
> 
> 

I have to say there're a few users now that need delayed RTNL execution
the bonding being a heavy one, teaming I think also has some rtnl delays.
Maybe it's time we do a generic delayed rtnl execution so it can be re-used
by all.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [patch net-next v2 4/7] switchdev: introduce possibility to defer obj_add/del
  2015-10-12 14:42     ` Nikolay Aleksandrov
@ 2015-10-12 14:57       ` Jiri Pirko
  0 siblings, 0 replies; 13+ messages in thread
From: Jiri Pirko @ 2015-10-12 14:57 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, davem, idosch, eladr, sfeldma, f.fainelli, linux,
	vivien.didelot, andrew, john.fastabend, David.Laight, stephen

Mon, Oct 12, 2015 at 04:42:12PM CEST, nikolay@cumulusnetworks.com wrote:
>On 10/12/2015 04:34 PM, Nikolay Aleksandrov wrote:
>> On 10/12/2015 03:15 PM, Jiri Pirko wrote:
>>> From: Jiri Pirko <jiri@mellanox.com>
>>>
>>> Similar to the attr usecase, the caller knows if he is holding RTNL and is
>>> in atomic section. So let the called to decide the correct call variant.
>>>
>>> This allows drivers to sleep inside their ops and wait for hw to get the
>>> operation status. Then the status is propagated into switchdev core.
>>> This avoids silent errors in drivers.
>>>
>>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>>> ---
>>>  include/net/switchdev.h   |   1 +
>>>  net/switchdev/switchdev.c | 137 +++++++++++++++++++++++++++++++++++++---------
>>>  2 files changed, 112 insertions(+), 26 deletions(-)
>>>
>> [snip]
>>> +
>>> +struct switchdev_obj_work {
>>> +	struct work_struct work;
>>> +	struct net_device *dev;
>>> +	struct switchdev_obj obj;
>>> +	bool add; /* add of del */
>> s/of/or/ ? :-)
>> 
>>> +};
>>> +
>>> +static void switchdev_port_obj_work(struct work_struct *work)
>>> +{
>>> +	struct switchdev_obj_work *ow =
>>> +			container_of(work, struct switchdev_obj_work, work);
>>> +	bool rtnl_locked = rtnl_is_locked();
>>> +	int err;
>>> +
>>> +	if (!rtnl_locked)
>>> +		rtnl_lock();
>>> +	if (ow->add)
>>> +		err = switchdev_port_obj_add_now(ow->dev, &ow->obj);
>>> +	else
>>> +		err = switchdev_port_obj_del_now(ow->dev, &ow->obj);
>>> +	if (err && err != -EOPNOTSUPP)
>>> +		netdev_err(ow->dev, "failed (err=%d) to %s object (id=%d)\n",
>>> +			   err, ow->add ? "add" : "del", ow->obj.id);
>>> +	if (!rtnl_locked)
>>> +		rtnl_unlock();
>>> +
>>> +	dev_put(ow->dev);
>>> +	kfree(ow);
>>> +}
>>> +
>>> +static int switchdev_port_obj_work_schedule(struct net_device *dev,
>>> +					    const struct switchdev_obj *obj,
>>> +					    bool add)
>>> +{
>>> +	struct switchdev_obj_work *ow;
>>> +
>>> +	ow = kmalloc(sizeof(*ow), GFP_ATOMIC);
>>> +	if (!ow)
>>> +		return -ENOMEM;
>>> +
>>> +	INIT_WORK(&ow->work, switchdev_port_obj_work);
>>> +
>> This can be called without rtnl, what stops the device from disappearing
>> between the above and the hold below ?
>> 
>Nevermind this question, got it.

Well anyone without rtnl should hold dev in the first place. So this
should be ok.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-10-12 14:57 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-12 13:15 [patch net-next v2 0/7] switchdev: change locking Jiri Pirko
2015-10-12 13:15 ` [patch net-next v2 1/7] switchdev: introduce switchdev workqueue Jiri Pirko
2015-10-12 13:15 ` [patch net-next v2 2/7] switchdev: allow caller to explicitly request attr_set as deferred Jiri Pirko
2015-10-12 13:15 ` [patch net-next v2 3/7] switchdev: remove pointers from switchdev objects Jiri Pirko
2015-10-12 13:15 ` [patch net-next v2 4/7] switchdev: introduce possibility to defer obj_add/del Jiri Pirko
2015-10-12 14:34   ` Nikolay Aleksandrov
2015-10-12 14:42     ` Nikolay Aleksandrov
2015-10-12 14:57       ` Jiri Pirko
2015-10-12 14:44     ` Jiri Pirko
2015-10-12 14:46       ` Nikolay Aleksandrov
2015-10-12 13:15 ` [patch net-next v2 5/7] bridge: defer switchdev fdb del call in fdb_del_external_learn Jiri Pirko
2015-10-12 13:15 ` [patch net-next v2 6/7] rocker: remove nowait from switchdev callbacks Jiri Pirko
2015-10-12 13:15 ` [patch net-next v2 7/7] switchdev: assert rtnl mutex when going over lower netdevs Jiri Pirko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.