netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC net-next 0/5] bonding: get rid of bond->lock
@ 2014-09-05 18:16 Nikolay Aleksandrov
  2014-09-05 18:16 ` [RFC net-next 1/5] bonding: 3ad: use curr_slave_lock instead " Nikolay Aleksandrov
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Nikolay Aleksandrov @ 2014-09-05 18:16 UTC (permalink / raw)
  To: netdev; +Cc: vfalico, j.vosburgh, andy, davem, Nikolay Aleksandrov

Hi all,
This is only a raw version of the idea, I've only compile-tested this
patch-set but I think it's possible to remove bond->lock completely as
its current users are only few and can be either removed or converted.
I'm sending it early as RFC to see if you have any suggestions, comments
or objections, and I'll do stress-testing and lockdep next week and
resubmit it with any changes/fixes that come up in the process.

If this gets accepted in some form, a re-work of curr_slave_lock will be
following as it can be dropped in a lot of places and only kept to sync the
few users that actually need it (e.g. 3ad wq handler and slave release, alb
wq handler and active change etc).
The goal of these patchsets is to simplify bond locking as it's getting
more convoluted and unnecessarily complex, in most of the cases we should
be able to rely on RTNL or rcu alone, and if the case is more complex we
should explain/document well why we need another lock.

Best regards,
 Nikolay Aleksandrov

Nikolay Aleksandrov (5):
  bonding: 3ad: use curr_slave_lock instead of bond->lock
  bonding: alb: clean bond->lock
  bonding: procfs: clean bond->lock usage and use RTNL
  bonding: options: remove bond->lock usage
  bonding: remove last users of bond->lock and bond->lock itself

 drivers/net/bonding/bond_3ad.c     |  9 +++----
 drivers/net/bonding/bond_alb.c     | 11 ++------
 drivers/net/bonding/bond_main.c    | 51 ++++++--------------------------------
 drivers/net/bonding/bond_options.c | 19 +-------------
 drivers/net/bonding/bond_procfs.c  |  8 ++----
 drivers/net/bonding/bonding.h      |  8 ++----
 6 files changed, 18 insertions(+), 88 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC net-next 1/5] bonding: 3ad: use curr_slave_lock instead of bond->lock
  2014-09-05 18:16 [RFC net-next 0/5] bonding: get rid of bond->lock Nikolay Aleksandrov
@ 2014-09-05 18:16 ` Nikolay Aleksandrov
  2014-09-05 20:37   ` Jay Vosburgh
  2014-09-05 18:16 ` [RFC net-next 2/5] bonding: alb: clean bond->lock Nikolay Aleksandrov
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Nikolay Aleksandrov @ 2014-09-05 18:16 UTC (permalink / raw)
  To: netdev; +Cc: vfalico, j.vosburgh, andy, davem, Nikolay Aleksandrov

In 3ad mode the only syncing needed by bond->lock is for the wq
and the recv handler, so change them to use curr_slave_lock.
There're no locking dependencies here as 3ad doesn't use
curr_slave_lock at all.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_3ad.c  |  9 ++++-----
 drivers/net/bonding/bond_main.c | 12 +++++++-----
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index ee2c73a9de39..5d27a6207384 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2057,7 +2057,7 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
 	struct port *port;
 	bool should_notify_rtnl = BOND_SLAVE_NOTIFY_LATER;
 
-	read_lock(&bond->lock);
+	read_lock(&bond->curr_slave_lock);
 	rcu_read_lock();
 
 	/* check if there are any slaves */
@@ -2120,7 +2120,7 @@ re_arm:
 		}
 	}
 	rcu_read_unlock();
-	read_unlock(&bond->lock);
+	read_unlock(&bond->curr_slave_lock);
 
 	if (should_notify_rtnl && rtnl_trylock()) {
 		bond_slave_state_notify(bond);
@@ -2395,7 +2395,6 @@ int __bond_3ad_get_active_agg_info(struct bonding *bond,
 	return 0;
 }
 
-/* Wrapper used to hold bond->lock so no slave manipulation can occur */
 int bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info)
 {
 	int ret;
@@ -2487,9 +2486,9 @@ int bond_3ad_lacpdu_recv(const struct sk_buff *skb, struct bonding *bond,
 	if (!lacpdu)
 		return ret;
 
-	read_lock(&bond->lock);
+	read_lock(&bond->curr_slave_lock);
 	ret = bond_3ad_rx_indication(lacpdu, slave, skb->len);
-	read_unlock(&bond->lock);
+	read_unlock(&bond->curr_slave_lock);
 	return ret;
 }
 
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index f0f5eab0fab1..dcd331bd0c17 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1687,13 +1687,15 @@ static int __bond_release_one(struct net_device *bond_dev,
 	 * for this slave anymore.
 	 */
 	netdev_rx_handler_unregister(slave_dev);
-	write_lock_bh(&bond->lock);
 
-	/* Inform AD package of unbinding of slave. */
-	if (BOND_MODE(bond) == BOND_MODE_8023AD)
+	if (BOND_MODE(bond) == BOND_MODE_8023AD) {
+		/* Sync against bond_3ad_rx_indication and
+		 * bond_3ad_state_machine_handler
+		 */
+		write_lock_bh(&bond->curr_slave_lock);
 		bond_3ad_unbind_slave(slave);
-
-	write_unlock_bh(&bond->lock);
+		write_unlock_bh(&bond->curr_slave_lock);
+	}
 
 	netdev_info(bond_dev, "Releasing %s interface %s\n",
 		    bond_is_active_slave(slave) ? "active" : "backup",
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC net-next 2/5] bonding: alb: clean bond->lock
  2014-09-05 18:16 [RFC net-next 0/5] bonding: get rid of bond->lock Nikolay Aleksandrov
  2014-09-05 18:16 ` [RFC net-next 1/5] bonding: 3ad: use curr_slave_lock instead " Nikolay Aleksandrov
@ 2014-09-05 18:16 ` Nikolay Aleksandrov
  2014-09-05 18:16 ` [RFC net-next 3/5] bonding: procfs: clean bond->lock usage and use RTNL Nikolay Aleksandrov
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Nikolay Aleksandrov @ 2014-09-05 18:16 UTC (permalink / raw)
  To: netdev; +Cc: vfalico, j.vosburgh, andy, davem, Nikolay Aleksandrov

We can remove the lock/unlock as it's no longer necessary since
RTNL should be held while calling bond_alb_set_mac_address().

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_alb.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 73c21e233131..028496205f39 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1775,8 +1775,7 @@ void bond_alb_handle_link_change(struct bonding *bond, struct slave *slave, char
  * Set the bond->curr_active_slave to @new_slave and handle
  * mac address swapping and promiscuity changes as needed.
  *
- * If new_slave is NULL, caller must hold curr_slave_lock or
- * bond->lock for write.
+ * If new_slave is NULL, caller must hold curr_slave_lock for write
  *
  * If new_slave is not NULL, caller must hold RTNL, curr_slave_lock
  * for write.  Processing here may sleep, so no other locks may be held.
@@ -1857,12 +1856,8 @@ void bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave
 	write_lock_bh(&bond->curr_slave_lock);
 }
 
-/*
- * Called with RTNL
- */
+/* Called with RTNL */
 int bond_alb_set_mac_address(struct net_device *bond_dev, void *addr)
-	__acquires(&bond->lock)
-	__releases(&bond->lock)
 {
 	struct bonding *bond = netdev_priv(bond_dev);
 	struct sockaddr *sa = addr;
@@ -1895,14 +1890,12 @@ int bond_alb_set_mac_address(struct net_device *bond_dev, void *addr)
 	} else {
 		alb_set_slave_mac_addr(curr_active, bond_dev->dev_addr);
 
-		read_lock(&bond->lock);
 		alb_send_learning_packets(curr_active,
 					  bond_dev->dev_addr, false);
 		if (bond->alb_info.rlb_enabled) {
 			/* inform clients mac address has changed */
 			rlb_req_update_slave_clients(bond, curr_active);
 		}
-		read_unlock(&bond->lock);
 	}
 
 	return 0;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC net-next 3/5] bonding: procfs: clean bond->lock usage and use RTNL
  2014-09-05 18:16 [RFC net-next 0/5] bonding: get rid of bond->lock Nikolay Aleksandrov
  2014-09-05 18:16 ` [RFC net-next 1/5] bonding: 3ad: use curr_slave_lock instead " Nikolay Aleksandrov
  2014-09-05 18:16 ` [RFC net-next 2/5] bonding: alb: clean bond->lock Nikolay Aleksandrov
@ 2014-09-05 18:16 ` Nikolay Aleksandrov
  2014-09-05 18:16 ` [RFC net-next 4/5] bonding: options: remove bond->lock usage Nikolay Aleksandrov
  2014-09-05 18:16 ` [RFC net-next 5/5] bonding: remove last users of bond->lock and bond->lock itself Nikolay Aleksandrov
  4 siblings, 0 replies; 10+ messages in thread
From: Nikolay Aleksandrov @ 2014-09-05 18:16 UTC (permalink / raw)
  To: netdev; +Cc: vfalico, j.vosburgh, andy, davem, Nikolay Aleksandrov

Use RTNL because bond->primary_slave might change otherwise.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_procfs.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/net/bonding/bond_procfs.c b/drivers/net/bonding/bond_procfs.c
index de62c0385dfb..eee937afe94e 100644
--- a/drivers/net/bonding/bond_procfs.c
+++ b/drivers/net/bonding/bond_procfs.c
@@ -7,7 +7,6 @@
 
 static void *bond_info_seq_start(struct seq_file *seq, loff_t *pos)
 	__acquires(RCU)
-	__acquires(&bond->lock)
 {
 	struct bonding *bond = seq->private;
 	struct list_head *iter;
@@ -15,8 +14,8 @@ static void *bond_info_seq_start(struct seq_file *seq, loff_t *pos)
 	loff_t off = 0;
 
 	/* make sure the bond won't be taken away */
+	rtnl_lock();
 	rcu_read_lock();
-	read_lock(&bond->lock);
 
 	if (*pos == 0)
 		return SEQ_START_TOKEN;
@@ -53,13 +52,10 @@ static void *bond_info_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 }
 
 static void bond_info_seq_stop(struct seq_file *seq, void *v)
-	__releases(&bond->lock)
 	__releases(RCU)
 {
-	struct bonding *bond = seq->private;
-
-	read_unlock(&bond->lock);
 	rcu_read_unlock();
+	rtnl_unlock();
 }
 
 static void bond_info_show_master(struct seq_file *seq)
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC net-next 4/5] bonding: options: remove bond->lock usage
  2014-09-05 18:16 [RFC net-next 0/5] bonding: get rid of bond->lock Nikolay Aleksandrov
                   ` (2 preceding siblings ...)
  2014-09-05 18:16 ` [RFC net-next 3/5] bonding: procfs: clean bond->lock usage and use RTNL Nikolay Aleksandrov
@ 2014-09-05 18:16 ` Nikolay Aleksandrov
  2014-09-05 18:16 ` [RFC net-next 5/5] bonding: remove last users of bond->lock and bond->lock itself Nikolay Aleksandrov
  4 siblings, 0 replies; 10+ messages in thread
From: Nikolay Aleksandrov @ 2014-09-05 18:16 UTC (permalink / raw)
  To: netdev; +Cc: vfalico, j.vosburgh, andy, davem, Nikolay Aleksandrov

We're safe to remove the bond->lock use from the arp targets because
arp_rcv_probe no longer acquires bond->lock, only rcu_read_lock.
Also setting the primary slave is safe because noone uses the bond->lock
as a syncing mechanism for that anymore.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_options.c | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index d8dc17faa6b4..67a1c9bcdc43 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -955,14 +955,7 @@ static int _bond_option_arp_ip_target_add(struct bonding *bond, __be32 target)
 
 static int bond_option_arp_ip_target_add(struct bonding *bond, __be32 target)
 {
-	int ret;
-
-	/* not to race with bond_arp_rcv */
-	write_lock_bh(&bond->lock);
-	ret = _bond_option_arp_ip_target_add(bond, target);
-	write_unlock_bh(&bond->lock);
-
-	return ret;
+	return _bond_option_arp_ip_target_add(bond, target);
 }
 
 static int bond_option_arp_ip_target_rem(struct bonding *bond, __be32 target)
@@ -991,9 +984,6 @@ static int bond_option_arp_ip_target_rem(struct bonding *bond, __be32 target)
 
 	netdev_info(bond->dev, "Removing ARP target %pI4\n", &target);
 
-	/* not to race with bond_arp_rcv */
-	write_lock_bh(&bond->lock);
-
 	bond_for_each_slave(bond, slave, iter) {
 		targets_rx = slave->target_last_arp_rx;
 		for (i = ind; (i < BOND_MAX_ARP_TARGETS-1) && targets[i+1]; i++)
@@ -1004,8 +994,6 @@ static int bond_option_arp_ip_target_rem(struct bonding *bond, __be32 target)
 		targets[i] = targets[i+1];
 	targets[i] = 0;
 
-	write_unlock_bh(&bond->lock);
-
 	return 0;
 }
 
@@ -1013,11 +1001,8 @@ void bond_option_arp_ip_targets_clear(struct bonding *bond)
 {
 	int i;
 
-	/* not to race with bond_arp_rcv */
-	write_lock_bh(&bond->lock);
 	for (i = 0; i < BOND_MAX_ARP_TARGETS; i++)
 		_bond_options_arp_ip_target_set(bond, i, 0, 0);
-	write_unlock_bh(&bond->lock);
 }
 
 static int bond_option_arp_ip_targets_set(struct bonding *bond,
@@ -1081,7 +1066,6 @@ static int bond_option_primary_set(struct bonding *bond,
 	struct slave *slave;
 
 	block_netpoll_tx();
-	read_lock(&bond->lock);
 	write_lock_bh(&bond->curr_slave_lock);
 
 	p = strchr(primary, '\n');
@@ -1120,7 +1104,6 @@ static int bond_option_primary_set(struct bonding *bond,
 
 out:
 	write_unlock_bh(&bond->curr_slave_lock);
-	read_unlock(&bond->lock);
 	unblock_netpoll_tx();
 
 	return 0;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC net-next 5/5] bonding: remove last users of bond->lock and bond->lock itself
  2014-09-05 18:16 [RFC net-next 0/5] bonding: get rid of bond->lock Nikolay Aleksandrov
                   ` (3 preceding siblings ...)
  2014-09-05 18:16 ` [RFC net-next 4/5] bonding: options: remove bond->lock usage Nikolay Aleksandrov
@ 2014-09-05 18:16 ` Nikolay Aleksandrov
  4 siblings, 0 replies; 10+ messages in thread
From: Nikolay Aleksandrov @ 2014-09-05 18:16 UTC (permalink / raw)
  To: netdev; +Cc: vfalico, j.vosburgh, andy, davem, Nikolay Aleksandrov

The usage of bond->lock in bond_main.c was completely unnecessary as it
didn't help to sync with anything, most of the spots already had RTNL.
Since there're no more users of bond->lock, remove it.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_main.c | 39 ---------------------------------------
 drivers/net/bonding/bonding.h   |  8 ++------
 2 files changed, 2 insertions(+), 45 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index dcd331bd0c17..4301c3d49fce 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3101,7 +3101,6 @@ static int bond_open(struct net_device *bond_dev)
 	struct slave *slave;
 
 	/* reset slave->backup and slave->inactive */
-	read_lock(&bond->lock);
 	if (bond_has_slaves(bond)) {
 		read_lock(&bond->curr_slave_lock);
 		bond_for_each_slave(bond, slave, iter) {
@@ -3116,7 +3115,6 @@ static int bond_open(struct net_device *bond_dev)
 		}
 		read_unlock(&bond->curr_slave_lock);
 	}
-	read_unlock(&bond->lock);
 
 	bond_work_init_all(bond);
 
@@ -3171,7 +3169,6 @@ static struct rtnl_link_stats64 *bond_get_stats(struct net_device *bond_dev,
 
 	memset(stats, 0, sizeof(*stats));
 
-	read_lock_bh(&bond->lock);
 	bond_for_each_slave(bond, slave, iter) {
 		const struct rtnl_link_stats64 *sstats =
 			dev_get_stats(slave->dev, &temp);
@@ -3202,7 +3199,6 @@ static struct rtnl_link_stats64 *bond_get_stats(struct net_device *bond_dev,
 		stats->tx_heartbeat_errors += sstats->tx_heartbeat_errors;
 		stats->tx_window_errors += sstats->tx_window_errors;
 	}
-	read_unlock_bh(&bond->lock);
 
 	return stats;
 }
@@ -3242,13 +3238,11 @@ static int bond_do_ioctl(struct net_device *bond_dev, struct ifreq *ifr, int cmd
 
 		if (mii->reg_num == 1) {
 			mii->val_out = 0;
-			read_lock(&bond->lock);
 			read_lock(&bond->curr_slave_lock);
 			if (netif_carrier_ok(bond->dev))
 				mii->val_out = BMSR_LSTATUS;
 
 			read_unlock(&bond->curr_slave_lock);
-			read_unlock(&bond->lock);
 		}
 
 		return 0;
@@ -3424,21 +3418,6 @@ static int bond_change_mtu(struct net_device *bond_dev, int new_mtu)
 
 	netdev_dbg(bond_dev, "bond=%p, new_mtu=%d\n", bond, new_mtu);
 
-	/* Can't hold bond->lock with bh disabled here since
-	 * some base drivers panic. On the other hand we can't
-	 * hold bond->lock without bh disabled because we'll
-	 * deadlock. The only solution is to rely on the fact
-	 * that we're under rtnl_lock here, and the slaves
-	 * list won't change. This doesn't solve the problem
-	 * of setting the slave's MTU while it is
-	 * transmitting, but the assumption is that the base
-	 * driver can handle that.
-	 *
-	 * TODO: figure out a way to safely iterate the slaves
-	 * list, but without holding a lock around the actual
-	 * call to the base driver.
-	 */
-
 	bond_for_each_slave(bond, slave, iter) {
 		netdev_dbg(bond_dev, "s %p c_m %p\n",
 			   slave, slave->dev->netdev_ops->ndo_change_mtu);
@@ -3513,21 +3492,6 @@ static int bond_set_mac_address(struct net_device *bond_dev, void *addr)
 	if (!is_valid_ether_addr(sa->sa_data))
 		return -EADDRNOTAVAIL;
 
-	/* Can't hold bond->lock with bh disabled here since
-	 * some base drivers panic. On the other hand we can't
-	 * hold bond->lock without bh disabled because we'll
-	 * deadlock. The only solution is to rely on the fact
-	 * that we're under rtnl_lock here, and the slaves
-	 * list won't change. This doesn't solve the problem
-	 * of setting the slave's hw address while it is
-	 * transmitting, but the assumption is that the base
-	 * driver can handle that.
-	 *
-	 * TODO: figure out a way to safely iterate the slaves
-	 * list, but without holding a lock around the actual
-	 * call to the base driver.
-	 */
-
 	bond_for_each_slave(bond, slave, iter) {
 		netdev_dbg(bond_dev, "slave %p %s\n", slave, slave->dev->name);
 		res = dev_set_mac_address(slave->dev, addr);
@@ -3853,7 +3817,6 @@ static int bond_ethtool_get_settings(struct net_device *bond_dev,
 	 * the true receive or transmit bandwidth (not all modes are symmetric)
 	 * this is an accurate maximum.
 	 */
-	read_lock(&bond->lock);
 	bond_for_each_slave(bond, slave, iter) {
 		if (bond_slave_can_tx(slave)) {
 			if (slave->speed != SPEED_UNKNOWN)
@@ -3864,7 +3827,6 @@ static int bond_ethtool_get_settings(struct net_device *bond_dev,
 		}
 	}
 	ethtool_cmd_speed_set(ecmd, speed ? : SPEED_UNKNOWN);
-	read_unlock(&bond->lock);
 
 	return 0;
 }
@@ -3927,7 +3889,6 @@ void bond_setup(struct net_device *bond_dev)
 	struct bonding *bond = netdev_priv(bond_dev);
 
 	/* initialize rwlocks */
-	rwlock_init(&bond->lock);
 	rwlock_init(&bond->curr_slave_lock);
 	bond->params = bonding_defaults;
 
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index aace510d08d1..1f44a78283ad 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -83,7 +83,7 @@
  * @pos:	current slave
  * @iter:	list_head * iterator
  *
- * Caller must hold bond->lock
+ * Caller must hold RTNL
  */
 #define bond_for_each_slave(bond, pos, iter) \
 	netdev_for_each_lower_private((bond)->dev, pos, iter)
@@ -185,11 +185,8 @@ struct slave {
 /*
  * Here are the locking policies for the two bonding locks:
  *
- * 1) Get bond->lock when reading/writing slave list.
+ * 1) Get rcu_read_lock when reading or RTNL when writing slave list.
  * 2) Get bond->curr_slave_lock when reading/writing bond->curr_active_slave.
- *    (It is unnecessary when the write-lock is put with bond->lock.)
- * 3) When we lock with bond->curr_slave_lock, we must lock with bond->lock
- *    beforehand.
  */
 struct bonding {
 	struct   net_device *dev; /* first - useful for panic debug */
@@ -200,7 +197,6 @@ struct bonding {
 	s32      slave_cnt; /* never change this value outside the attach/detach wrappers */
 	int     (*recv_probe)(const struct sk_buff *, struct bonding *,
 			      struct slave *);
-	rwlock_t lock;
 	rwlock_t curr_slave_lock;
 	u8	 send_peer_notif;
 	u8       igmp_retrans;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [RFC net-next 1/5] bonding: 3ad: use curr_slave_lock instead of bond->lock
  2014-09-05 18:16 ` [RFC net-next 1/5] bonding: 3ad: use curr_slave_lock instead " Nikolay Aleksandrov
@ 2014-09-05 20:37   ` Jay Vosburgh
  2014-09-05 21:00     ` Nikolay Aleksandrov
  0 siblings, 1 reply; 10+ messages in thread
From: Jay Vosburgh @ 2014-09-05 20:37 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev, vfalico, andy, davem

Nikolay Aleksandrov <nikolay@redhat.com> wrote:

>In 3ad mode the only syncing needed by bond->lock is for the wq
>and the recv handler, so change them to use curr_slave_lock.
>There're no locking dependencies here as 3ad doesn't use
>curr_slave_lock at all.

	One subtle aspect of the 3ad locking is that it's not really
using the "read" property of the read lock with regard to the state
machine; it's largely using it as a spin lock, because there is at most
one reader and at most one writer in the full state machine code
(although there are multiple reader possibilities elsewhere).  The code
would break if there actually were multiple read-lock holders in the
full state machine or aggregator selection logic simultaneously.

	Because the state machine and incoming LACPDU cases both acquire
the read lock for read, there is a separate per-port "state machine"
spin lock to protect only the per-port fields that LACPDU and periodic
state machine both touch.  The incoming LACPDU case doesn't call into
the full state machine, only the RX processing which can't go into agg
selection, so this works.

	The agg selection can be entered via the unbind path or the
periodic state machine (and only these two paths), and relies on the
"one reader max" usage of the read lock to mutex the code paths that may
enter agg selection.

	I suspect that what 3ad may need is a spin lock, not a read
lock, because the multiple reader property isn't really being utilized;
the incoming LACPDU and periodic state machine both acquire the read
lock for read, but then acquire a second per-port spin lock.  If the
"big" (bond->lock currently) lock is a spin lock, then the per-port
state machine lock is unnecessary, as the only purpose of the per-port
lock is to mutex the one case that does have multiple readers.

	In actual practice I doubt there are multiple simultaneous
readers very often; the periodic machine runs every 100 ms, but LACPDUs
arrive for each port either every second or every 30 seconds (depending
on admin configuration).

	Since contention on these locks is generally low, we're probably
better off in the long run with something simpler to understand.

	So, what I'm kind of saying here is that this patch isn't a bad
first step, but at least for the 3ad case, removal of the bond->lock
itself doesn't really simplify the locking as much as could be done.

	Thoughts?

	-J


>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>---
> drivers/net/bonding/bond_3ad.c  |  9 ++++-----
> drivers/net/bonding/bond_main.c | 12 +++++++-----
> 2 files changed, 11 insertions(+), 10 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index ee2c73a9de39..5d27a6207384 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -2057,7 +2057,7 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
> 	struct port *port;
> 	bool should_notify_rtnl = BOND_SLAVE_NOTIFY_LATER;
> 
>-	read_lock(&bond->lock);
>+	read_lock(&bond->curr_slave_lock);
> 	rcu_read_lock();
> 
> 	/* check if there are any slaves */
>@@ -2120,7 +2120,7 @@ re_arm:
> 		}
> 	}
> 	rcu_read_unlock();
>-	read_unlock(&bond->lock);
>+	read_unlock(&bond->curr_slave_lock);
> 
> 	if (should_notify_rtnl && rtnl_trylock()) {
> 		bond_slave_state_notify(bond);
>@@ -2395,7 +2395,6 @@ int __bond_3ad_get_active_agg_info(struct bonding *bond,
> 	return 0;
> }
> 
>-/* Wrapper used to hold bond->lock so no slave manipulation can occur */
> int bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info)
> {
> 	int ret;
>@@ -2487,9 +2486,9 @@ int bond_3ad_lacpdu_recv(const struct sk_buff *skb, struct bonding *bond,
> 	if (!lacpdu)
> 		return ret;
> 
>-	read_lock(&bond->lock);
>+	read_lock(&bond->curr_slave_lock);
> 	ret = bond_3ad_rx_indication(lacpdu, slave, skb->len);
>-	read_unlock(&bond->lock);
>+	read_unlock(&bond->curr_slave_lock);
> 	return ret;
> }
> 
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index f0f5eab0fab1..dcd331bd0c17 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1687,13 +1687,15 @@ static int __bond_release_one(struct net_device *bond_dev,
> 	 * for this slave anymore.
> 	 */
> 	netdev_rx_handler_unregister(slave_dev);
>-	write_lock_bh(&bond->lock);
> 
>-	/* Inform AD package of unbinding of slave. */
>-	if (BOND_MODE(bond) == BOND_MODE_8023AD)
>+	if (BOND_MODE(bond) == BOND_MODE_8023AD) {
>+		/* Sync against bond_3ad_rx_indication and
>+		 * bond_3ad_state_machine_handler
>+		 */
>+		write_lock_bh(&bond->curr_slave_lock);
> 		bond_3ad_unbind_slave(slave);
>-
>-	write_unlock_bh(&bond->lock);
>+		write_unlock_bh(&bond->curr_slave_lock);
>+	}
> 
> 	netdev_info(bond_dev, "Releasing %s interface %s\n",
> 		    bond_is_active_slave(slave) ? "active" : "backup",
>-- 
>1.9.3
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC net-next 1/5] bonding: 3ad: use curr_slave_lock instead of bond->lock
  2014-09-05 20:37   ` Jay Vosburgh
@ 2014-09-05 21:00     ` Nikolay Aleksandrov
  2014-09-06  1:08       ` Jay Vosburgh
  0 siblings, 1 reply; 10+ messages in thread
From: Nikolay Aleksandrov @ 2014-09-05 21:00 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev, vfalico, andy, davem

On 09/05/2014 10:37 PM, Jay Vosburgh wrote:
> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
> 
>> In 3ad mode the only syncing needed by bond->lock is for the wq
>> and the recv handler, so change them to use curr_slave_lock.
>> There're no locking dependencies here as 3ad doesn't use
>> curr_slave_lock at all.
> 
> 	One subtle aspect of the 3ad locking is that it's not really
> using the "read" property of the read lock with regard to the state
> machine; it's largely using it as a spin lock, because there is at most
> one reader and at most one writer in the full state machine code
> (although there are multiple reader possibilities elsewhere).  The code
> would break if there actually were multiple read-lock holders in the
> full state machine or aggregator selection logic simultaneously.
> 
> 	Because the state machine and incoming LACPDU cases both acquire
> the read lock for read, there is a separate per-port "state machine"
> spin lock to protect only the per-port fields that LACPDU and periodic
> state machine both touch.  The incoming LACPDU case doesn't call into
> the full state machine, only the RX processing which can't go into agg
> selection, so this works.
> 
> 	The agg selection can be entered via the unbind path or the
> periodic state machine (and only these two paths), and relies on the
> "one reader max" usage of the read lock to mutex the code paths that may
> enter agg selection.
> 
> 	I suspect that what 3ad may need is a spin lock, not a read
> lock, because the multiple reader property isn't really being utilized;
> the incoming LACPDU and periodic state machine both acquire the read
> lock for read, but then acquire a second per-port spin lock.  If the
> "big" (bond->lock currently) lock is a spin lock, then the per-port
> state machine lock is unnecessary, as the only purpose of the per-port
> lock is to mutex the one case that does have multiple readers.
> 
> 	In actual practice I doubt there are multiple simultaneous
> readers very often; the periodic machine runs every 100 ms, but LACPDUs
> arrive for each port either every second or every 30 seconds (depending
> on admin configuration).
> 
> 	Since contention on these locks is generally low, we're probably
> better off in the long run with something simpler to understand.
> 
> 	So, what I'm kind of saying here is that this patch isn't a bad
> first step, but at least for the 3ad case, removal of the bond->lock
> itself doesn't really simplify the locking as much as could be done.
> 
> 	Thoughts?
> 
> 	-J
> 
> 

Hi Jay,
That is a very good point, my main idea was to protect __bond_release_one
and the machine handler otherwise I'd have removed it altogether. I know
that this doesn't improve on the 3ad situation, I did it mostly to get rid
of bond->lock first. Going with a spinlock certainly makes sense there as
we don't spend much time inside and the contention is not high as you said
and would simplify the 3ad code so I like it :-)
I will include it in my bond-locking todo list and will post a follow-up
once I've cleared the details up as I'm speaking from the top of my head
right now, but first I'd like to clean the current lock use especially with
regard to curr_slave_lock and bring it to the necessary minimum. In the
long run I think that we either might be able to remove curr_slave_lock
completely or at least reduce it to ~ 3 places and with the spinlock that
you suggested here, we'll be definitely able to remove it from the 3ad code.

Thanks for the suggestion!

Nik

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC net-next 1/5] bonding: 3ad: use curr_slave_lock instead of bond->lock
  2014-09-05 21:00     ` Nikolay Aleksandrov
@ 2014-09-06  1:08       ` Jay Vosburgh
  2014-09-06 10:19         ` Nikolay Aleksandrov
  0 siblings, 1 reply; 10+ messages in thread
From: Jay Vosburgh @ 2014-09-06  1:08 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev, vfalico, andy, davem

Nikolay Aleksandrov <nikolay@redhat.com> wrote:

>On 09/05/2014 10:37 PM, Jay Vosburgh wrote:
>> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>> 
>>> In 3ad mode the only syncing needed by bond->lock is for the wq
>>> and the recv handler, so change them to use curr_slave_lock.
>>> There're no locking dependencies here as 3ad doesn't use
>>> curr_slave_lock at all.
>> 
>> 	One subtle aspect of the 3ad locking is that it's not really
>> using the "read" property of the read lock with regard to the state
>> machine; it's largely using it as a spin lock, because there is at most
>> one reader and at most one writer in the full state machine code
>> (although there are multiple reader possibilities elsewhere).  The code
>> would break if there actually were multiple read-lock holders in the
>> full state machine or aggregator selection logic simultaneously.
>> 
>> 	Because the state machine and incoming LACPDU cases both acquire
>> the read lock for read, there is a separate per-port "state machine"
>> spin lock to protect only the per-port fields that LACPDU and periodic
>> state machine both touch.  The incoming LACPDU case doesn't call into
>> the full state machine, only the RX processing which can't go into agg
>> selection, so this works.
>> 
>> 	The agg selection can be entered via the unbind path or the
>> periodic state machine (and only these two paths), and relies on the
>> "one reader max" usage of the read lock to mutex the code paths that may
>> enter agg selection.
>> 
>> 	I suspect that what 3ad may need is a spin lock, not a read
>> lock, because the multiple reader property isn't really being utilized;
>> the incoming LACPDU and periodic state machine both acquire the read
>> lock for read, but then acquire a second per-port spin lock.  If the
>> "big" (bond->lock currently) lock is a spin lock, then the per-port
>> state machine lock is unnecessary, as the only purpose of the per-port
>> lock is to mutex the one case that does have multiple readers.
>> 
>> 	In actual practice I doubt there are multiple simultaneous
>> readers very often; the periodic machine runs every 100 ms, but LACPDUs
>> arrive for each port either every second or every 30 seconds (depending
>> on admin configuration).
>> 
>> 	Since contention on these locks is generally low, we're probably
>> better off in the long run with something simpler to understand.
>> 
>> 	So, what I'm kind of saying here is that this patch isn't a bad
>> first step, but at least for the 3ad case, removal of the bond->lock
>> itself doesn't really simplify the locking as much as could be done.
>> 
>> 	Thoughts?
>> 
>> 	-J
>> 
>> 
>
>Hi Jay,
>That is a very good point, my main idea was to protect __bond_release_one
>and the machine handler otherwise I'd have removed it altogether. I know
>that this doesn't improve on the 3ad situation, I did it mostly to get rid
>of bond->lock first. Going with a spinlock certainly makes sense there as
>we don't spend much time inside and the contention is not high as you said
>and would simplify the 3ad code so I like it :-)
>I will include it in my bond-locking todo list and will post a follow-up
>once I've cleared the details up as I'm speaking from the top of my head
>right now, but first I'd like to clean the current lock use especially with
>regard to curr_slave_lock and bring it to the necessary minimum. In the
>long run I think that we either might be able to remove curr_slave_lock
>completely or at least reduce it to ~ 3 places and with the spinlock that
>you suggested here, we'll be definitely able to remove it from the 3ad code.

	I looked into curr_slave_lock some time ago, and as I recall
there's not much it really protects that's not also covered by RTNL,
since changes to the active slave are all happening under RTNL these
days.  And that was before the RCU conversion; with the current code,
I'm not sure the curr_slave_lock is providing much mutexing beyond what
we get from RTNL.

	There are a couple of special cases, like the TLB rebalance in
bond_alb_monitor, but that happens once every 10 seconds, and could just
grab RTNL for this bit:

			if (slave == rcu_access_pointer(bond->curr_active_slave)) {
				SLAVE_TLB_INFO(slave).load =
					bond_info->unbalanced_load /
						BOND_TLB_REBALANCE_INTERVAL;
				bond_info->unbalanced_load = 0;

	the "unbalanced_load," now that I'm looking at it, might already
have some race problems since it's now updated outside of bonding locks
in bond_do_alb_xmit.  It'll probably race with multiple bond_do_alb_xmit
functions running simultaneously as well as the tx rebalance in
bond_alb_monitor.  I think the worst that will happen is that the tx
traffic load is distributed suboptimally for 10 seconds.

	The rlb_clear_slave case that acquires curr_slave_lock already
also has RTNL, so I'm not sure that removing the curr_slave_lock will
have any impact there, either.  Many of the other curr_slave_lock
holders bounce the curr_slave_lock to call into net/core functions (set
promisc, change MAC, etc), so there's already reliance on RTNL.

	Separately, it also might be possible to combine the various
per-mode special locks internally into a generic "mode lock" so the alb
and rlb hashtbl lock and the 802.3ad state machine lock could be a
single "bond->mode_lock" that mutexes whatever special sauce the active
mode needs protected.  Not sure if that's worth the trouble or not, but
it seems plausible at first glance.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC net-next 1/5] bonding: 3ad: use curr_slave_lock instead of bond->lock
  2014-09-06  1:08       ` Jay Vosburgh
@ 2014-09-06 10:19         ` Nikolay Aleksandrov
  0 siblings, 0 replies; 10+ messages in thread
From: Nikolay Aleksandrov @ 2014-09-06 10:19 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev, vfalico, andy, davem

On 09/06/2014 03:08 AM, Jay Vosburgh wrote:
> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
> 
>> On 09/05/2014 10:37 PM, Jay Vosburgh wrote:
>>> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>>>
>>>> In 3ad mode the only syncing needed by bond->lock is for the wq
>>>> and the recv handler, so change them to use curr_slave_lock.
>>>> There're no locking dependencies here as 3ad doesn't use
>>>> curr_slave_lock at all.
>>>
>>> 	One subtle aspect of the 3ad locking is that it's not really
>>> using the "read" property of the read lock with regard to the state
>>> machine; it's largely using it as a spin lock, because there is at most
>>> one reader and at most one writer in the full state machine code
>>> (although there are multiple reader possibilities elsewhere).  The code
>>> would break if there actually were multiple read-lock holders in the
>>> full state machine or aggregator selection logic simultaneously.
>>>
>>> 	Because the state machine and incoming LACPDU cases both acquire
>>> the read lock for read, there is a separate per-port "state machine"
>>> spin lock to protect only the per-port fields that LACPDU and periodic
>>> state machine both touch.  The incoming LACPDU case doesn't call into
>>> the full state machine, only the RX processing which can't go into agg
>>> selection, so this works.
>>>
>>> 	The agg selection can be entered via the unbind path or the
>>> periodic state machine (and only these two paths), and relies on the
>>> "one reader max" usage of the read lock to mutex the code paths that may
>>> enter agg selection.
>>>
>>> 	I suspect that what 3ad may need is a spin lock, not a read
>>> lock, because the multiple reader property isn't really being utilized;
>>> the incoming LACPDU and periodic state machine both acquire the read
>>> lock for read, but then acquire a second per-port spin lock.  If the
>>> "big" (bond->lock currently) lock is a spin lock, then the per-port
>>> state machine lock is unnecessary, as the only purpose of the per-port
>>> lock is to mutex the one case that does have multiple readers.
>>>
>>> 	In actual practice I doubt there are multiple simultaneous
>>> readers very often; the periodic machine runs every 100 ms, but LACPDUs
>>> arrive for each port either every second or every 30 seconds (depending
>>> on admin configuration).
>>>
>>> 	Since contention on these locks is generally low, we're probably
>>> better off in the long run with something simpler to understand.
>>>
>>> 	So, what I'm kind of saying here is that this patch isn't a bad
>>> first step, but at least for the 3ad case, removal of the bond->lock
>>> itself doesn't really simplify the locking as much as could be done.
>>>
>>> 	Thoughts?
>>>
>>> 	-J
>>>
>>>
>>
>> Hi Jay,
>> That is a very good point, my main idea was to protect __bond_release_one
>> and the machine handler otherwise I'd have removed it altogether. I know
>> that this doesn't improve on the 3ad situation, I did it mostly to get rid
>> of bond->lock first. Going with a spinlock certainly makes sense there as
>> we don't spend much time inside and the contention is not high as you said
>> and would simplify the 3ad code so I like it :-)
>> I will include it in my bond-locking todo list and will post a follow-up
>> once I've cleared the details up as I'm speaking from the top of my head
>> right now, but first I'd like to clean the current lock use especially with
>> regard to curr_slave_lock and bring it to the necessary minimum. In the
>> long run I think that we either might be able to remove curr_slave_lock
>> completely or at least reduce it to ~ 3 places and with the spinlock that
>> you suggested here, we'll be definitely able to remove it from the 3ad code.
> 
> 	I looked into curr_slave_lock some time ago, and as I recall
> there's not much it really protects that's not also covered by RTNL,
> since changes to the active slave are all happening under RTNL these
> days.  And that was before the RCU conversion; with the current code,
> I'm not sure the curr_slave_lock is providing much mutexing beyond what
> we get from RTNL.
> 
Right, that was one of the main things that drove me to start this, after
reviewing Mahesh's patches I realized most of the callers already had RTNL
and curr_slave_lock looked mostly redundant, and bond->lock completely
redundant.

> 	There are a couple of special cases, like the TLB rebalance in
> bond_alb_monitor, but that happens once every 10 seconds, and could just
> grab RTNL for this bit:
> 
> 			if (slave == rcu_access_pointer(bond->curr_active_slave)) {
> 				SLAVE_TLB_INFO(slave).load =
> 					bond_info->unbalanced_load /
> 						BOND_TLB_REBALANCE_INTERVAL;
> 				bond_info->unbalanced_load = 0;
> 
> 	the "unbalanced_load," now that I'm looking at it, might already
> have some race problems since it's now updated outside of bonding locks
> in bond_do_alb_xmit.  It'll probably race with multiple bond_do_alb_xmit
> functions running simultaneously as well as the tx rebalance in
> bond_alb_monitor.  I think the worst that will happen is that the tx
> traffic load is distributed suboptimally for 10 seconds.
> 
Yes, I don't think this needs fixing, but we might revisit it once the rest
of the pieces are in place.

> 	The rlb_clear_slave case that acquires curr_slave_lock already
> also has RTNL, so I'm not sure that removing the curr_slave_lock will
> have any impact there, either.  Many of the other curr_slave_lock
> holders bounce the curr_slave_lock to call into net/core functions (set
> promisc, change MAC, etc), so there's already reliance on RTNL.
> 
Yes, rlb_clear_slave is not the problem, the problem if we remove
curr_slave_lock was that both rlb_clear_slave and bond_alb_monitor can be
transmitting packets at the same time or a mac swap might happen triggering
packet transmission while bond_alb_monitor() is also transmitting i.e.
pretty much everything that now gets curr_slave_lock for writing on the ALB
side and transmits could transmit with bond_alb_monitor(), that's what I
was trying to avoid.

> 	Separately, it also might be possible to combine the various
> per-mode special locks internally into a generic "mode lock" so the alb
> and rlb hashtbl lock and the 802.3ad state machine lock could be a
> single "bond->mode_lock" that mutexes whatever special sauce the active
> mode needs protected.  Not sure if that's worth the trouble or not, but
> it seems plausible at first glance.
> 
Yes, I had something similar in mind /actually called it sync_lock :)/, but
I only went as far as to use it for a few places that needed syncing beyond
RTNL after curr_slave_lock was removed, you've went one step further and
this certainly looks doable. I'll keep it in mind.

> 	-J
> 
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-09-06 10:19 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-05 18:16 [RFC net-next 0/5] bonding: get rid of bond->lock Nikolay Aleksandrov
2014-09-05 18:16 ` [RFC net-next 1/5] bonding: 3ad: use curr_slave_lock instead " Nikolay Aleksandrov
2014-09-05 20:37   ` Jay Vosburgh
2014-09-05 21:00     ` Nikolay Aleksandrov
2014-09-06  1:08       ` Jay Vosburgh
2014-09-06 10:19         ` Nikolay Aleksandrov
2014-09-05 18:16 ` [RFC net-next 2/5] bonding: alb: clean bond->lock Nikolay Aleksandrov
2014-09-05 18:16 ` [RFC net-next 3/5] bonding: procfs: clean bond->lock usage and use RTNL Nikolay Aleksandrov
2014-09-05 18:16 ` [RFC net-next 4/5] bonding: options: remove bond->lock usage Nikolay Aleksandrov
2014-09-05 18:16 ` [RFC net-next 5/5] bonding: remove last users of bond->lock and bond->lock itself Nikolay Aleksandrov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).