* [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate()
@ 2013-02-18 17:59 Nikolay Aleksandrov
  2013-02-18 17:59 ` [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock Nikolay Aleksandrov
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-18 17:59 UTC (permalink / raw)
  To: netdev; +Cc: davem, fubar, andy

port->slave can be NULL while the port is still being initialized in
bond_enslave(), so bond_3ad_update_lacp_rate() can dereference a NULL
pointer. Skip such ports. Also fix a minor bug which could leave a port
without AD_STATE_LACP_TIMEOUT, since there is no synchronization between
bond_3ad_update_lacp_rate() and bond_3ad_bind_slave(), by changing
the read_lock to a write_lock_bh in bond_3ad_update_lacp_rate().

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_3ad.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index a030e63..1720742 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2494,11 +2494,13 @@ void bond_3ad_update_lacp_rate(struct bonding *bond)
 	struct port *port = NULL;
 	int lacp_fast;
 
-	read_lock(&bond->lock);
+	write_lock_bh(&bond->lock);
 	lacp_fast = bond->params.lacp_fast;
 
 	bond_for_each_slave(bond, slave, i) {
 		port = &(SLAVE_AD_INFO(slave).port);
+		if (port->slave == NULL)
+			continue;
 		__get_state_machine_lock(port);
 		if (lacp_fast)
 			port->actor_oper_port_state |= AD_STATE_LACP_TIMEOUT;
@@ -2507,5 +2509,5 @@ void bond_3ad_update_lacp_rate(struct bonding *bond)
 		__release_state_machine_lock(port);
 	}
 
-	read_unlock(&bond->lock);
+	write_unlock_bh(&bond->lock);
 }
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 14+ messages in thread
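
For readers following the thread, the guard added by patch 1/3 can be
modelled in user space. This is only a sketch under stated assumptions: the
model_* names and the LACP_TIMEOUT_FLAG value are hypothetical stand-ins and
not the kernel's definitions, and the bond->lock / state machine locking is
left out entirely so that only the "skip unbound ports" logic is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for AD_STATE_LACP_TIMEOUT; the value is illustrative. */
#define LACP_TIMEOUT_FLAG 0x2

struct model_slave;

struct model_port {
	struct model_slave *slave;	/* NULL until the port has been bound */
	unsigned char oper_state;	/* models actor_oper_port_state */
};

struct model_slave {
	struct model_port port;
};

/*
 * Apply the LACP rate to every bound port and return how many ports were
 * updated. Ports whose slave pointer is still NULL are skipped, which is
 * the check the patch adds. Locking is intentionally not modelled.
 */
static int model_update_lacp_rate(struct model_slave *slaves, size_t n,
				  int lacp_fast)
{
	int updated = 0;
	size_t i;

	assert(slaves != NULL || n == 0);

	for (i = 0; i < n; i++) {
		struct model_port *port = &slaves[i].port;

		if (port->slave == NULL)
			continue;	/* port not initialized yet */
		if (lacp_fast)
			port->oper_state |= LACP_TIMEOUT_FLAG;
		else
			port->oper_state &= (unsigned char)~LACP_TIMEOUT_FLAG;
		updated++;
	}
	return updated;
}
```

In the real driver the loop additionally runs under bond->lock and takes the
per-port state machine lock, which this sketch does not attempt to reproduce.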

* [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock
  2013-02-18 17:59 [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Nikolay Aleksandrov
@ 2013-02-18 17:59 ` Nikolay Aleksandrov
  2013-02-18 21:33   ` Jay Vosburgh
  2013-02-18 17:59 ` [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-18 17:59 UTC (permalink / raw)
  To: netdev; +Cc: davem, fubar, andy

The 3ad machine state spinlock can be used before it is initialized
during bond_enslave() (while the port is being initialized), since
port->slave is set before the lock is prepared, thus causing soft
lock-ups and a multitude of other nasty bugs.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_3ad.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 1720742..96d471e 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -389,13 +389,13 @@ static u8 __get_duplex(struct port *port)
 
 /**
  * __initialize_port_locks - initialize a port's STATE machine spinlock
- * @port: the port we're looking at
+ * @port: the slave of the port we're looking at
  *
  */
-static inline void __initialize_port_locks(struct port *port)
+static inline void __initialize_port_locks(struct slave *port)
 {
 	// make sure it isn't called twice
-	spin_lock_init(&(SLAVE_AD_INFO(port->slave).state_machine_lock));
+	spin_lock_init(&(SLAVE_AD_INFO(port).state_machine_lock));
 }
 
 //conversions
@@ -1910,6 +1910,7 @@ int bond_3ad_bind_slave(struct slave *slave)
 
 		ad_initialize_port(port, bond->params.lacp_fast);
 
+		__initialize_port_locks(slave);
 		port->slave = slave;
 		port->actor_port_number = SLAVE_AD_INFO(slave).id;
 		// key is determined according to the link speed, duplex and user key(which is yet not supported)
@@ -1932,8 +1933,6 @@ int bond_3ad_bind_slave(struct slave *slave)
 		port->next_port_in_aggregator = NULL;
 
 		__disable_port(port);
-		__initialize_port_locks(port);
-
 
 		// aggregator initialization
 		aggregator = &(SLAVE_AD_INFO(slave).aggregator);
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 14+ messages in thread
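
The ordering problem that patch 2/3 describes can be illustrated with a
small single-threaded model. This is a sketch, not the driver code: all
model_* names are hypothetical, a "concurrent reader" is simulated by
calling a check between the two initialization steps, and real concurrency
concerns such as memory ordering are deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_lock {
	bool initialized;	/* models whether spin_lock_init() has run */
};

struct model_slave;

struct model_port {
	struct model_slave *slave;	/* non-NULL means "port is usable" */
};

struct model_slave {
	struct model_port port;
	struct model_lock state_machine_lock;
};

/*
 * A reader treats the port as usable as soon as port->slave is set.
 * Returns false if it would then take a lock that is not yet initialized.
 */
static bool model_reader_is_safe(const struct model_slave *s)
{
	assert(s != NULL);
	if (s->port.slave == NULL)
		return true;	/* port not published, reader skips it */
	return s->state_machine_lock.initialized;
}

/* Old ordering: publish port->slave first, initialize the lock later. */
static bool model_bind_publish_first(struct model_slave *s)
{
	bool safe;

	s->port.slave = s;
	safe = model_reader_is_safe(s);	/* simulated reader in the window */
	s->state_machine_lock.initialized = true;
	return safe;
}

/* Patched ordering: initialize the lock, then publish port->slave. */
static bool model_bind_init_first(struct model_slave *s)
{
	bool safe;

	s->state_machine_lock.initialized = true;
	safe = model_reader_is_safe(s);	/* simulated reader in the window */
	s->port.slave = s;
	return safe && model_reader_is_safe(s);
}
```

With the old ordering the simulated reader observes a published port whose
lock is not ready; with the patched ordering there is no such window in this
model.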

* [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies
  2013-02-18 17:59 [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Nikolay Aleksandrov
  2013-02-18 17:59 ` [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock Nikolay Aleksandrov
@ 2013-02-18 17:59 ` Nikolay Aleksandrov
  2013-02-18 21:56   ` Jay Vosburgh
  2013-02-18 21:09 ` [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Jay Vosburgh
  2013-02-19  0:09 ` [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
  3 siblings, 1 reply; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-18 17:59 UTC (permalink / raw)
  To: netdev; +Cc: davem, fubar, andy

This patch fixes the following inconsistencies in bond_release_all:
- IFF_BONDING flag is not stripped from slaves
- MTU is not restored
- no netdev notifiers are sent
Instead of trying to keep bond_release and bond_release_all in sync,
I think we can re-use bond_release, as the environment for calling it
is correct (RTNL is held). I have been running tests for the past
week and they came out successful. The only way for bond_release to fail
is for the slave to be attached to a different bond or to not be a slave,
but that cannot happen as RTNL is held and no slave manipulations can
take place.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_main.c | 106 ++--------------------------------------
 1 file changed, 5 insertions(+), 101 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 94c1534..fcfc880 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2140,113 +2140,17 @@ static int  bond_release_and_destroy(struct net_device *bond_dev,
 /*
  * This function releases all slaves.
  */
-static int bond_release_all(struct net_device *bond_dev)
+static void bond_release_all(struct net_device *bond_dev)
 {
 	struct bonding *bond = netdev_priv(bond_dev);
-	struct slave *slave;
-	struct net_device *slave_dev;
-	struct sockaddr addr;
-
-	write_lock_bh(&bond->lock);
-
-	netif_carrier_off(bond_dev);
 
 	if (bond->slave_cnt == 0)
-		goto out;
-
-	bond->current_arp_slave = NULL;
-	bond->primary_slave = NULL;
-	bond_change_active_slave(bond, NULL);
-
-	while ((slave = bond->first_slave) != NULL) {
-		/* Inform AD package of unbinding of slave
-		 * before slave is detached from the list.
-		 */
-		if (bond->params.mode == BOND_MODE_8023AD)
-			bond_3ad_unbind_slave(slave);
-
-		slave_dev = slave->dev;
-		bond_detach_slave(bond, slave);
-
-		/* now that the slave is detached, unlock and perform
-		 * all the undo steps that should not be called from
-		 * within a lock.
-		 */
-		write_unlock_bh(&bond->lock);
-
-		/* unregister rx_handler early so bond_handle_frame wouldn't
-		 * be called for this slave anymore.
-		 */
-		netdev_rx_handler_unregister(slave_dev);
-		synchronize_net();
-
-		if (bond_is_lb(bond)) {
-			/* must be called only after the slave
-			 * has been detached from the list
-			 */
-			bond_alb_deinit_slave(bond, slave);
-		}
-
-		bond_destroy_slave_symlinks(bond_dev, slave_dev);
-		bond_del_vlans_from_slave(bond, slave_dev);
-
-		/* If the mode USES_PRIMARY, then we should only remove its
-		 * promisc and mc settings if it was the curr_active_slave, but that was
-		 * already taken care of above when we detached the slave
-		 */
-		if (!USES_PRIMARY(bond->params.mode)) {
-			/* unset promiscuity level from slave */
-			if (bond_dev->flags & IFF_PROMISC)
-				dev_set_promiscuity(slave_dev, -1);
-
-			/* unset allmulti level from slave */
-			if (bond_dev->flags & IFF_ALLMULTI)
-				dev_set_allmulti(slave_dev, -1);
-
-			/* flush master's mc_list from slave */
-			netif_addr_lock_bh(bond_dev);
-			bond_mc_list_flush(bond_dev, slave_dev);
-			netif_addr_unlock_bh(bond_dev);
-		}
-
-		bond_upper_dev_unlink(bond_dev, slave_dev);
-
-		slave_disable_netpoll(slave);
-
-		/* close slave before restoring its mac address */
-		dev_close(slave_dev);
-
-		if (!bond->params.fail_over_mac) {
-			/* restore original ("permanent") mac address*/
-			memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
-			addr.sa_family = slave_dev->type;
-			dev_set_mac_address(slave_dev, &addr);
-		}
-
-		kfree(slave);
-
-		/* re-acquire the lock before getting the next slave */
-		write_lock_bh(&bond->lock);
-	}
-
-	eth_hw_addr_random(bond_dev);
-	bond->dev_addr_from_first = true;
-
-	if (bond_vlan_used(bond)) {
-		pr_warning("%s: Warning: clearing HW address of %s while it still has VLANs.\n",
-			   bond_dev->name, bond_dev->name);
-		pr_warning("%s: When re-adding slaves, make sure the bond's HW address matches its VLANs'.\n",
-			   bond_dev->name);
-	}
-
+		return;
+	while (bond->first_slave != NULL)
+		bond_release(bond_dev, bond->first_slave->dev);
 	pr_info("%s: released all slaves\n", bond_dev->name);
 
-out:
-	write_unlock_bh(&bond->lock);
-
-	bond_compute_features(bond);
-
-	return 0;
+	return;
 }
 
 /*
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate()
  2013-02-18 17:59 [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Nikolay Aleksandrov
  2013-02-18 17:59 ` [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock Nikolay Aleksandrov
  2013-02-18 17:59 ` [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
@ 2013-02-18 21:09 ` Jay Vosburgh
  2013-02-19  5:52   ` David Miller
  2013-02-19  0:09 ` [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
  3 siblings, 1 reply; 14+ messages in thread
From: Jay Vosburgh @ 2013-02-18 21:09 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev, davem, andy

Nikolay Aleksandrov <nikolay@redhat.com> wrote:

>port->slave can be NULL since it's being initialized in bond_enslave
>thus dereferencing a NULL pointer in bond_3ad_update_lacp_rate()
>Also fix a minor bug, which could cause a port not to have
>AD_STATE_LACP_TIMEOUT since there's no sync between
>bond_3ad_update_lacp_rate() and bond_3ad_bind_slave(), by changing
>the read_lock to a write_lock_bh in bond_3ad_update_lacp_rate().

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>---
> drivers/net/bonding/bond_3ad.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index a030e63..1720742 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -2494,11 +2494,13 @@ void bond_3ad_update_lacp_rate(struct bonding *bond)
> 	struct port *port = NULL;
> 	int lacp_fast;
>
>-	read_lock(&bond->lock);
>+	write_lock_bh(&bond->lock);
> 	lacp_fast = bond->params.lacp_fast;
>
> 	bond_for_each_slave(bond, slave, i) {
> 		port = &(SLAVE_AD_INFO(slave).port);
>+		if (port->slave == NULL)
>+			continue;
> 		__get_state_machine_lock(port);
> 		if (lacp_fast)
> 			port->actor_oper_port_state |= AD_STATE_LACP_TIMEOUT;
>@@ -2507,5 +2509,5 @@ void bond_3ad_update_lacp_rate(struct bonding *bond)
> 		__release_state_machine_lock(port);
> 	}
>
>-	read_unlock(&bond->lock);
>+	write_unlock_bh(&bond->lock);
> }
>-- 
>1.7.11.7
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock
  2013-02-18 17:59 ` [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock Nikolay Aleksandrov
@ 2013-02-18 21:33   ` Jay Vosburgh
  2013-02-18 21:51     ` Nikolay Aleksandrov
  2013-02-19  5:52     ` David Miller
  0 siblings, 2 replies; 14+ messages in thread
From: Jay Vosburgh @ 2013-02-18 21:33 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev, davem, andy

Nikolay Aleksandrov <nikolay@redhat.com> wrote:

>The 3ad machine state spinlock can be used before it is initialized
>while doing bond_enslave() (and the port is being initialized) since
>port->slave is set before the lock is prepared, thus causing soft
>lock-ups and a multitude of other nasty bugs.

	Does this change cause the "uninitialized port" warnings in
bond_3ad_state_machine_handler and bond_3ad_rx_indication to
intermittently print during the enslavement process?  If so (and it
looks to me like it will), I think the warnings should be removed, since
after this change, port->slave being NULL isn't really an error
condition that needs a warning to the log.

>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>---
> drivers/net/bonding/bond_3ad.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index 1720742..96d471e 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -389,13 +389,13 @@ static u8 __get_duplex(struct port *port)
>
> /**
>  * __initialize_port_locks - initialize a port's STATE machine spinlock
>- * @port: the port we're looking at
>+ * @port: the slave of the port we're looking at
>  *
>  */
>-static inline void __initialize_port_locks(struct port *port)
>+static inline void __initialize_port_locks(struct slave *port)
> {
> 	// make sure it isn't called twice
>-	spin_lock_init(&(SLAVE_AD_INFO(port->slave).state_machine_lock));
>+	spin_lock_init(&(SLAVE_AD_INFO(port).state_machine_lock));

	Change the name of the variable here, too, not just the type.
This is confusing.

	-J

> }
>
> //conversions
>@@ -1910,6 +1910,7 @@ int bond_3ad_bind_slave(struct slave *slave)
>
> 		ad_initialize_port(port, bond->params.lacp_fast);
>
>+		__initialize_port_locks(slave);
> 		port->slave = slave;
> 		port->actor_port_number = SLAVE_AD_INFO(slave).id;
> 		// key is determined according to the link speed, duplex and user key(which is yet not supported)
>@@ -1932,8 +1933,6 @@ int bond_3ad_bind_slave(struct slave *slave)
> 		port->next_port_in_aggregator = NULL;
>
> 		__disable_port(port);
>-		__initialize_port_locks(port);
>-
>
> 		// aggregator initialization
> 		aggregator = &(SLAVE_AD_INFO(slave).aggregator);
>-- 
>1.7.11.7
>

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock
  2013-02-18 21:33   ` Jay Vosburgh
@ 2013-02-18 21:51     ` Nikolay Aleksandrov
  2013-02-19  5:52     ` David Miller
  1 sibling, 0 replies; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-18 21:51 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev, davem, andy

On 18/02/13 22:33, Jay Vosburgh wrote:
> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
> 
>> The 3ad machine state spinlock can be used before it is initialized
>> while doing bond_enslave() (and the port is being initialized) since
>> port->slave is set before the lock is prepared, thus causing soft
>> lock-ups and a multitude of other nasty bugs.
> 
> 	Does this change cause the "uninitialized port" warnings in
> bond_3ad_state_machine_handler and bond_3ad_rx_indication to
> intermittently print during the enslavement process?  If so (and it
> looks to me like it will), I think the warnings should be removed, since
> after this change, port->slave being NULL isn't really an error
> condition that needs a warning to the log.
> 
This change couldn't cause that; it only initializes the spin lock
before the slave is set. After the first patch of this series this is
no longer strictly required, since as far as I can tell the only code
that could access the lock before the slave is set was the code fixed
there, but it is still a bug that could manifest later. I don't think it
has anything to do with the warnings; the only change is that the spin
lock is initialized prior to setting the slave on the port.
Am I missing something here?

>> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>> ---
>> drivers/net/bonding/bond_3ad.c | 9 ++++-----
>> 1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>> index 1720742..96d471e 100644
>> --- a/drivers/net/bonding/bond_3ad.c
>> +++ b/drivers/net/bonding/bond_3ad.c
>> @@ -389,13 +389,13 @@ static u8 __get_duplex(struct port *port)
>>
>> /**
>>  * __initialize_port_locks - initialize a port's STATE machine spinlock
>> - * @port: the port we're looking at
>> + * @port: the slave of the port we're looking at
>>  *
>>  */
>> -static inline void __initialize_port_locks(struct port *port)
>> +static inline void __initialize_port_locks(struct slave *port)
>> {
>> 	// make sure it isn't called twice
>> -	spin_lock_init(&(SLAVE_AD_INFO(port->slave).state_machine_lock));
>> +	spin_lock_init(&(SLAVE_AD_INFO(port).state_machine_lock));
> 
> 	Change the name of the variable here, too, not just the type.
> This is confusing.
> 
> 	-J
Thanks, I saw that after posting; I have already prepared this change.
> 
>> }
>>
>> //conversions
>> @@ -1910,6 +1910,7 @@ int bond_3ad_bind_slave(struct slave *slave)
>>
>> 		ad_initialize_port(port, bond->params.lacp_fast);
>>
>> +		__initialize_port_locks(slave);
>> 		port->slave = slave;
>> 		port->actor_port_number = SLAVE_AD_INFO(slave).id;
>> 		// key is determined according to the link speed, duplex and user key(which is yet not supported)
>> @@ -1932,8 +1933,6 @@ int bond_3ad_bind_slave(struct slave *slave)
>> 		port->next_port_in_aggregator = NULL;
>>
>> 		__disable_port(port);
>> -		__initialize_port_locks(port);
>> -
>>
>> 		// aggregator initialization
>> 		aggregator = &(SLAVE_AD_INFO(slave).aggregator);
>> -- 
>> 1.7.11.7
>>
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies
  2013-02-18 17:59 ` [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
@ 2013-02-18 21:56   ` Jay Vosburgh
  2013-02-18 22:13     ` Nikolay Aleksandrov
  0 siblings, 1 reply; 14+ messages in thread
From: Jay Vosburgh @ 2013-02-18 21:56 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev, davem, andy

Nikolay Aleksandrov <nikolay@redhat.com> wrote:

>This patch fixes the following inconsistencies in bond_release_all:
>- IFF_BONDING flag is not stripped from slaves
>- MTU is not restored
>- no netdev notifiers are sent
>Instead of trying to keep bond_release and bond_release_all in sync
>I think we can re-use bond_release as the environment for calling it
>is correct (RTNL is held). I have been running tests for the past
>week and they came out successful. The only way for bond_release to fail
>is for the slave to be attached in a different bond or to not be a slave
>but that cannot happen as RTNL is held and no slave manipulations can be
>achieved.

	It might be worthwhile to add an "all" argument to bond_release
that skips some things that don't make sense if all slaves are being
released.  I'm thinking in particular of this block:

	if (oldcurrent == slave) {
		/*
		 * Note that we hold RTNL over this sequence, so there
		 * is no concern that another slave add/remove event
		 * will interfere.
		 */
		write_unlock_bh(&bond->lock);
		read_lock(&bond->lock);
		write_lock_bh(&bond->curr_slave_lock);

		bond_select_active_slave(bond);

		write_unlock_bh(&bond->curr_slave_lock);
		read_unlock(&bond->lock);
		write_lock_bh(&bond->lock);
	}

	as it's written now, for the release all case, the code may go
to the trouble of assigning a new active slave each time one slave is
removed (including various log messages, maybe sending IGMPs, etc).  If
all slaves are being removed, that's pointless.  This could be something
like:

	if (release_all) {
		bond->curr_active_slave = NULL;
	} else if (oldcurrent == slave) {
		[ the current block of stuff ]
	}

	it's safe here to unconditionally set curr_active_slave to NULL
because we hold bond->lock for write.  The lock dance stuff for the
bond_select_active_slave() call is to satisfy its locking requirements.

	-J


>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>---
> drivers/net/bonding/bond_main.c | 106 ++--------------------------------------
> 1 file changed, 5 insertions(+), 101 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 94c1534..fcfc880 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2140,113 +2140,17 @@ static int  bond_release_and_destroy(struct net_device *bond_dev,
> /*
>  * This function releases all slaves.
>  */
>-static int bond_release_all(struct net_device *bond_dev)
>+static void bond_release_all(struct net_device *bond_dev)
> {
> 	struct bonding *bond = netdev_priv(bond_dev);
>-	struct slave *slave;
>-	struct net_device *slave_dev;
>-	struct sockaddr addr;
>-
>-	write_lock_bh(&bond->lock);
>-
>-	netif_carrier_off(bond_dev);
>
> 	if (bond->slave_cnt == 0)
>-		goto out;
>-
>-	bond->current_arp_slave = NULL;
>-	bond->primary_slave = NULL;
>-	bond_change_active_slave(bond, NULL);
>-
>-	while ((slave = bond->first_slave) != NULL) {
>-		/* Inform AD package of unbinding of slave
>-		 * before slave is detached from the list.
>-		 */
>-		if (bond->params.mode == BOND_MODE_8023AD)
>-			bond_3ad_unbind_slave(slave);
>-
>-		slave_dev = slave->dev;
>-		bond_detach_slave(bond, slave);
>-
>-		/* now that the slave is detached, unlock and perform
>-		 * all the undo steps that should not be called from
>-		 * within a lock.
>-		 */
>-		write_unlock_bh(&bond->lock);
>-
>-		/* unregister rx_handler early so bond_handle_frame wouldn't
>-		 * be called for this slave anymore.
>-		 */
>-		netdev_rx_handler_unregister(slave_dev);
>-		synchronize_net();
>-
>-		if (bond_is_lb(bond)) {
>-			/* must be called only after the slave
>-			 * has been detached from the list
>-			 */
>-			bond_alb_deinit_slave(bond, slave);
>-		}
>-
>-		bond_destroy_slave_symlinks(bond_dev, slave_dev);
>-		bond_del_vlans_from_slave(bond, slave_dev);
>-
>-		/* If the mode USES_PRIMARY, then we should only remove its
>-		 * promisc and mc settings if it was the curr_active_slave, but that was
>-		 * already taken care of above when we detached the slave
>-		 */
>-		if (!USES_PRIMARY(bond->params.mode)) {
>-			/* unset promiscuity level from slave */
>-			if (bond_dev->flags & IFF_PROMISC)
>-				dev_set_promiscuity(slave_dev, -1);
>-
>-			/* unset allmulti level from slave */
>-			if (bond_dev->flags & IFF_ALLMULTI)
>-				dev_set_allmulti(slave_dev, -1);
>-
>-			/* flush master's mc_list from slave */
>-			netif_addr_lock_bh(bond_dev);
>-			bond_mc_list_flush(bond_dev, slave_dev);
>-			netif_addr_unlock_bh(bond_dev);
>-		}
>-
>-		bond_upper_dev_unlink(bond_dev, slave_dev);
>-
>-		slave_disable_netpoll(slave);
>-
>-		/* close slave before restoring its mac address */
>-		dev_close(slave_dev);
>-
>-		if (!bond->params.fail_over_mac) {
>-			/* restore original ("permanent") mac address*/
>-			memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
>-			addr.sa_family = slave_dev->type;
>-			dev_set_mac_address(slave_dev, &addr);
>-		}
>-
>-		kfree(slave);
>-
>-		/* re-acquire the lock before getting the next slave */
>-		write_lock_bh(&bond->lock);
>-	}
>-
>-	eth_hw_addr_random(bond_dev);
>-	bond->dev_addr_from_first = true;
>-
>-	if (bond_vlan_used(bond)) {
>-		pr_warning("%s: Warning: clearing HW address of %s while it still has VLANs.\n",
>-			   bond_dev->name, bond_dev->name);
>-		pr_warning("%s: When re-adding slaves, make sure the bond's HW address matches its VLANs'.\n",
>-			   bond_dev->name);
>-	}
>-
>+		return;
>+	while (bond->first_slave != NULL)
>+		bond_release(bond_dev, bond->first_slave->dev);
> 	pr_info("%s: released all slaves\n", bond_dev->name);
>
>-out:
>-	write_unlock_bh(&bond->lock);
>-
>-	bond_compute_features(bond);
>-
>-	return 0;
>+	return;
> }
>
> /*
>-- 
>1.7.11.7

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies
  2013-02-18 21:56   ` Jay Vosburgh
@ 2013-02-18 22:13     ` Nikolay Aleksandrov
  2013-02-18 23:17       ` Jay Vosburgh
  0 siblings, 1 reply; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-18 22:13 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev, davem, andy

On 18/02/13 22:56, Jay Vosburgh wrote:
> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
> 
>> This patch fixes the following inconsistencies in bond_release_all:
>> - IFF_BONDING flag is not stripped from slaves
>> - MTU is not restored
>> - no netdev notifiers are sent
>> Instead of trying to keep bond_release and bond_release_all in sync
>> I think we can re-use bond_release as the environment for calling it
>> is correct (RTNL is held). I have been running tests for the past
>> week and they came out successful. The only way for bond_release to fail
>> is for the slave to be attached in a different bond or to not be a slave
>> but that cannot happen as RTNL is held and no slave manipulations can be
>> achieved.
> 
> 	It might be worthwhile to add an "all" argument to bond_release
> that skips some things that don't make sense if all slaves are being
> released.  I'm thinking in particular of this block:
> 
> 	if (oldcurrent == slave) {
> 		/*
> 		 * Note that we hold RTNL over this sequence, so there
> 		 * is no concern that another slave add/remove event
> 		 * will interfere.
> 		 */
> 		write_unlock_bh(&bond->lock);
> 		read_lock(&bond->lock);
> 		write_lock_bh(&bond->curr_slave_lock);
> 
> 		bond_select_active_slave(bond);
> 
> 		write_unlock_bh(&bond->curr_slave_lock);
> 		read_unlock(&bond->lock);
> 		write_lock_bh(&bond->lock);
> 	}
> 
> 	as it's written now, for the release all case, the code may go
> to the trouble of assigning a new active slave each time one slave is
> removed (including various log messages, maybe sending IGMPs, etc).  If
> all slaves are being removed, that's pointless.  This could be something
> like:
> 
> 	if (release_all) {
> 		bond->curr_active_slave = NULL;
> 	} else if (oldcurrent == slave) {
> 		[ the current block of stuff ]
> 	}
> 
> 	it's safe here to unconditionally set curr_active_slave to NULL
> because we hold bond->lock for write.  The lock dance stuff for the
> bond_select_active_slave() call is to satisfy its locking requirements.
> 
> 	-J
I see your point and I agree. I will prepare another version that
incorporates it, although I can't add it as an argument since
bond_release is used as ndo_del_slave. I'll have to make it a global
variable.

Nik

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies
  2013-02-18 22:13     ` Nikolay Aleksandrov
@ 2013-02-18 23:17       ` Jay Vosburgh
  0 siblings, 0 replies; 14+ messages in thread
From: Jay Vosburgh @ 2013-02-18 23:17 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev, davem, andy

Nikolay Aleksandrov <nikolay@redhat.com> wrote:

>On 18/02/13 22:56, Jay Vosburgh wrote:
>> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>> 
>>> This patch fixes the following inconsistencies in bond_release_all:
>>> - IFF_BONDING flag is not stripped from slaves
>>> - MTU is not restored
>>> - no netdev notifiers are sent
>>> Instead of trying to keep bond_release and bond_release_all in sync
>>> I think we can re-use bond_release as the environment for calling it
>>> is correct (RTNL is held). I have been running tests for the past
>>> week and they came out successful. The only way for bond_release to fail
>>> is for the slave to be attached in a different bond or to not be a slave
>>> but that cannot happen as RTNL is held and no slave manipulations can be
>>> achieved.
>> 
>> 	It might be worthwhile to add an "all" argument to bond_release
>> that skips some things that don't make sense if all slaves are being
>> released.  I'm thinking in particular of this block:
>> 
>> 	if (oldcurrent == slave) {
>> 		/*
>> 		 * Note that we hold RTNL over this sequence, so there
>> 		 * is no concern that another slave add/remove event
>> 		 * will interfere.
>> 		 */
>> 		write_unlock_bh(&bond->lock);
>> 		read_lock(&bond->lock);
>> 		write_lock_bh(&bond->curr_slave_lock);
>> 
>> 		bond_select_active_slave(bond);
>> 
>> 		write_unlock_bh(&bond->curr_slave_lock);
>> 		read_unlock(&bond->lock);
>> 		write_lock_bh(&bond->lock);
>> 	}
>> 
>> 	as it's written now, for the release all case, the code may go
>> to the trouble of assigning a new active slave each time one slave is
>> removed (including various log messages, maybe sending IGMPs, etc).  If
>> all slaves are being removed, that's pointless.  This could be something
>> like:
>> 
>> 	if (release_all) {
>> 		bond->curr_active_slave = NULL;
>> 	} else if (oldcurrent == slave) {
>> 		[ the current block of stuff ]
>> 	}
>> 
>> 	it's safe here to unconditionally set curr_active_slave to NULL
>> because we hold bond->lock for write.  The lock dance stuff for the
>> bond_select_active_slave() call is to satisfy its locking requirements.
>> 
>> 	-J
>I see your point and I agree. I will prepare another version that
>incorporates it, although I can't add it as an argument since
>bond_release is used as ndo_del_slave. I'll have to make it a global
>variable.

	No, just rename the current bond_release to __bond_release_one,
add the extra argument, and create a new bond_release .ndo_del_slave
that calls __bond_release_one with "all=0".  Then, bond_release_all
calls __bond_release_one with all=1.

	Also, there's only one caller of bond_release_all, and since the
new & improved bond_release_all is trivial, it could be open coded into
bond_uninit, eliminating bond_release_all as a function.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread
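
The restructuring suggested above (one internal release function with an
extra "all" argument, a thin wrapper for the single-slave case, and a loop
for the release-everything case) can be sketched in user space as follows.
Everything here is a hypothetical model: slaves are plain integer ids, the
model_* names do not exist in the driver, -1 stands in for an error return,
and "selecting a new active slave" is reduced to picking the first remaining
id and counting how often that happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_SLAVES 8

struct model_bond {
	int slaves[MODEL_MAX_SLAVES];	/* slave ids, in list order */
	size_t slave_cnt;
	int curr_active;		/* active slave id, -1 if none */
	int reselect_count;		/* how often a new active was chosen */
};

/* Models __bond_release_one(): release one slave, optionally in "all" mode. */
static int model_release_one(struct model_bond *bond, int slave_id, bool all)
{
	bool was_active;
	size_t i;

	assert(bond != NULL);

	for (i = 0; i < bond->slave_cnt; i++)
		if (bond->slaves[i] == slave_id)
			break;
	if (i == bond->slave_cnt)
		return -1;	/* not a slave of this bond */

	was_active = (bond->curr_active == slave_id);

	/* Detach the slave by closing the gap in the array. */
	for (; i + 1 < bond->slave_cnt; i++)
		bond->slaves[i] = bond->slaves[i + 1];
	bond->slave_cnt--;

	if (all) {
		/* Everything is going away: no point selecting a new active. */
		bond->curr_active = -1;
	} else if (was_active) {
		bond->curr_active = bond->slave_cnt ? bond->slaves[0] : -1;
		bond->reselect_count++;
	}
	return 0;
}

/* Models the bond_release() wrapper used for the single-slave case. */
static int model_release(struct model_bond *bond, int slave_id)
{
	return model_release_one(bond, slave_id, false);
}

/* Models the open-coded loop in the teardown path. */
static void model_uninit(struct model_bond *bond)
{
	while (bond->slave_cnt > 0)
		model_release_one(bond, bond->slaves[0], true);
}
```

In this model, tearing down a bond never increments reselect_count, which
corresponds to skipping the active-slave reselection that the review calls
pointless when all slaves are being removed.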

* [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies
  2013-02-18 17:59 [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Nikolay Aleksandrov
                   ` (2 preceding siblings ...)
  2013-02-18 21:09 ` [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Jay Vosburgh
@ 2013-02-19  0:09 ` Nikolay Aleksandrov
  2013-02-19  3:12   ` Jay Vosburgh
  3 siblings, 1 reply; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-19  0:09 UTC (permalink / raw)
  To: netdev; +Cc: andy, fubar

This patch fixes the following inconsistencies in bond_release_all:
- IFF_BONDING flag is not stripped from slaves
- MTU is not restored
- no netdev notifiers are sent
Instead of trying to keep bond_release and bond_release_all in sync,
I think we can re-use bond_release, as the environment for calling it
is correct (RTNL is held). I have been running tests for the past
week and they came out successful. The only way for bond_release to fail
is for the slave to be attached to a different bond or to not be a slave,
but that cannot happen as RTNL is held and no slave manipulations can
take place.

V2: As suggested, bond_release is renamed to __bond_release_one with a
new parameter "all", introduced to avoid calling unnecessary code while
destroying a bond, and a wrapper for it called bond_release is created
because of ndo_del_slave. bond_release_all() is removed.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_main.c | 135 ++++++----------------------------------
 1 file changed, 18 insertions(+), 117 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 94c1534..e242dd1 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1937,7 +1937,8 @@ err_undo_flags:
 /*
  * Try to release the slave device <slave> from the bond device <master>
  * It is legal to access curr_active_slave without a lock because all the function
- * is write-locked.
+ * is write-locked. If "all" is true it means that the function is being called
+ * while destroying a bond interface and all slaves are being released.
  *
  * The rules for slave state should be:
  *   for Active/Backup:
@@ -1945,7 +1946,9 @@ err_undo_flags:
  *   for Bonded connections:
  *     The first up interface should be left on and all others downed.
  */
-int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
+static int __bond_release_one(struct net_device *bond_dev,
+			      struct net_device *slave_dev,
+			      bool all)
 {
 	struct bonding *bond = netdev_priv(bond_dev);
 	struct slave *slave, *oldcurrent;
@@ -1982,7 +1985,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 	synchronize_net();
 	write_lock_bh(&bond->lock);
 
-	if (!bond->params.fail_over_mac) {
+	if (!all && !bond->params.fail_over_mac) {
 		if (ether_addr_equal(bond_dev->dev_addr, slave->perm_hwaddr) &&
 		    bond->slave_cnt > 1)
 			pr_warning("%s: Warning: the permanent HWaddr of %s - %pM - is still in use by %s. Set the HWaddr of %s to a different address to avoid conflicts.\n",
@@ -2028,7 +2031,9 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 		write_lock_bh(&bond->lock);
 	}
 
-	if (oldcurrent == slave) {
+	if (all) {
+		bond->curr_active_slave = NULL;
+	} else if (oldcurrent == slave) {
 		/*
 		 * Note that we hold RTNL over this sequence, so there
 		 * is no concern that another slave add/remove event
@@ -2117,6 +2122,12 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 	return 0;  /* deletion OK */
 }
 
+/* A wrapper used because of ndo_del_slave */
+int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
+{
+	return __bond_release_one(bond_dev, slave_dev, false);
+}
+
 /*
 * First release a slave and then destroy the bond if no more slaves are left.
 * Must be under rtnl_lock when this function is called.
@@ -2138,118 +2149,6 @@ static int  bond_release_and_destroy(struct net_device *bond_dev,
 }
 
 /*
- * This function releases all slaves.
- */
-static int bond_release_all(struct net_device *bond_dev)
-{
-	struct bonding *bond = netdev_priv(bond_dev);
-	struct slave *slave;
-	struct net_device *slave_dev;
-	struct sockaddr addr;
-
-	write_lock_bh(&bond->lock);
-
-	netif_carrier_off(bond_dev);
-
-	if (bond->slave_cnt == 0)
-		goto out;
-
-	bond->current_arp_slave = NULL;
-	bond->primary_slave = NULL;
-	bond_change_active_slave(bond, NULL);
-
-	while ((slave = bond->first_slave) != NULL) {
-		/* Inform AD package of unbinding of slave
-		 * before slave is detached from the list.
-		 */
-		if (bond->params.mode == BOND_MODE_8023AD)
-			bond_3ad_unbind_slave(slave);
-
-		slave_dev = slave->dev;
-		bond_detach_slave(bond, slave);
-
-		/* now that the slave is detached, unlock and perform
-		 * all the undo steps that should not be called from
-		 * within a lock.
-		 */
-		write_unlock_bh(&bond->lock);
-
-		/* unregister rx_handler early so bond_handle_frame wouldn't
-		 * be called for this slave anymore.
-		 */
-		netdev_rx_handler_unregister(slave_dev);
-		synchronize_net();
-
-		if (bond_is_lb(bond)) {
-			/* must be called only after the slave
-			 * has been detached from the list
-			 */
-			bond_alb_deinit_slave(bond, slave);
-		}
-
-		bond_destroy_slave_symlinks(bond_dev, slave_dev);
-		bond_del_vlans_from_slave(bond, slave_dev);
-
-		/* If the mode USES_PRIMARY, then we should only remove its
-		 * promisc and mc settings if it was the curr_active_slave, but that was
-		 * already taken care of above when we detached the slave
-		 */
-		if (!USES_PRIMARY(bond->params.mode)) {
-			/* unset promiscuity level from slave */
-			if (bond_dev->flags & IFF_PROMISC)
-				dev_set_promiscuity(slave_dev, -1);
-
-			/* unset allmulti level from slave */
-			if (bond_dev->flags & IFF_ALLMULTI)
-				dev_set_allmulti(slave_dev, -1);
-
-			/* flush master's mc_list from slave */
-			netif_addr_lock_bh(bond_dev);
-			bond_mc_list_flush(bond_dev, slave_dev);
-			netif_addr_unlock_bh(bond_dev);
-		}
-
-		bond_upper_dev_unlink(bond_dev, slave_dev);
-
-		slave_disable_netpoll(slave);
-
-		/* close slave before restoring its mac address */
-		dev_close(slave_dev);
-
-		if (!bond->params.fail_over_mac) {
-			/* restore original ("permanent") mac address*/
-			memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
-			addr.sa_family = slave_dev->type;
-			dev_set_mac_address(slave_dev, &addr);
-		}
-
-		kfree(slave);
-
-		/* re-acquire the lock before getting the next slave */
-		write_lock_bh(&bond->lock);
-	}
-
-	eth_hw_addr_random(bond_dev);
-	bond->dev_addr_from_first = true;
-
-	if (bond_vlan_used(bond)) {
-		pr_warning("%s: Warning: clearing HW address of %s while it still has VLANs.\n",
-			   bond_dev->name, bond_dev->name);
-		pr_warning("%s: When re-adding slaves, make sure the bond's HW address matches its VLANs'.\n",
-			   bond_dev->name);
-	}
-
-	pr_info("%s: released all slaves\n", bond_dev->name);
-
-out:
-	write_unlock_bh(&bond->lock);
-
-	bond_compute_features(bond);
-
-	return 0;
-}
-
-/*
  * This function changes the active slave to slave <slave_dev>.
  * It returns -EINVAL in the following cases.
  *  - <slave_dev> is not found in the list.
@@ -4440,7 +4339,9 @@ static void bond_uninit(struct net_device *bond_dev)
 	bond_netpoll_cleanup(bond_dev);
 
 	/* Release the bonded slaves */
-	bond_release_all(bond_dev);
+	while (bond->first_slave != NULL)
+		__bond_release_one(bond_dev, bond->first_slave->dev, true);
+	pr_info("%s: released all slaves\n", bond_dev->name);
 
 	list_del(&bond->bond_list);
 
-- 
1.7.11.7
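
The shape of this refactor can be illustrated outside the kernel. The following is a minimal userspace sketch, not the bonding driver's code: all model_* names and the reselect_count field are hypothetical, locking and notifiers are left out, and only the control flow (one shared release path with an "all" flag, a thin single-slave wrapper, and an open-coded teardown loop) follows the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for struct slave / struct bonding. */
struct model_slave {
	int id;
	struct model_slave *next;
};

struct model_bond {
	struct model_slave *first_slave;
	struct model_slave *curr_active_slave;
	int slave_cnt;
	int reselect_count;	/* how often a replacement active slave was chosen */
};

/* Add a slave at the head of the list; the first one added becomes active. */
static struct model_slave *model_enslave(struct model_bond *bond, int id)
{
	struct model_slave *s = malloc(sizeof(*s));

	if (s == NULL)
		return NULL;
	s->id = id;
	s->next = bond->first_slave;
	bond->first_slave = s;
	bond->slave_cnt++;
	if (bond->curr_active_slave == NULL)
		bond->curr_active_slave = s;
	return s;
}

/* Shared release path: "all" is true only while the whole bond is torn down. */
static int model_release_one(struct model_bond *bond,
			     struct model_slave *slave, bool all)
{
	struct model_slave **pp = &bond->first_slave;

	while (*pp != NULL && *pp != slave)
		pp = &(*pp)->next;
	if (*pp == NULL)
		return -1;	/* not a slave of this bond */

	*pp = slave->next;
	bond->slave_cnt--;

	if (all) {
		/* Teardown: electing a replacement would be wasted work. */
		bond->curr_active_slave = NULL;
	} else if (bond->curr_active_slave == slave) {
		bond->curr_active_slave = bond->first_slave;
		bond->reselect_count++;
	}
	free(slave);
	return 0;
}

/* Single-slave entry point, analogous to the bond_release() wrapper. */
static int model_release(struct model_bond *bond, struct model_slave *slave)
{
	return model_release_one(bond, slave, false);
}

/* Open-coded "release everything" loop, analogous to the bond_uninit() hunk. */
static void model_uninit(struct model_bond *bond)
{
	while (bond->first_slave != NULL)
		model_release_one(bond, bond->first_slave, true);
}
```

Because every slave goes through the same path, any cleanup step added to the shared function is picked up by the teardown loop automatically, which is the inconsistency the patch set out to remove.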


* Re: [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies
  2013-02-19  0:09 ` [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
@ 2013-02-19  3:12   ` Jay Vosburgh
  2013-02-19  5:53     ` David Miller
  0 siblings, 1 reply; 14+ messages in thread
From: Jay Vosburgh @ 2013-02-19  3:12 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev, andy

Nikolay Aleksandrov <nikolay@redhat.com> wrote:

>This patch fixes the following inconsistencies in bond_release_all:
>- IFF_BONDING flag is not stripped from slaves
>- MTU is not restored
>- no netdev notifiers are sent
>Instead of trying to keep bond_release and bond_release_all in sync,
>I think we can re-use bond_release, since the environment for calling it
>is correct (RTNL is held). I have been running tests for the past
>week and they all passed. The only way for bond_release to fail
>is for the slave to be attached to a different bond or to not be a slave
>at all, but that cannot happen because RTNL is held and no slave
>manipulation can take place concurrently.
>
>V2: As suggested, bond_release is renamed to __bond_release_one with a
>new parameter "all", so as to avoid running unnecessary code while
>destroying a bond, and a wrapper for it called bond_release is created
>because of ndo_del_slave. bond_release_all() is removed.
>
>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>


* Re: [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate()
  2013-02-18 21:09 ` [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Jay Vosburgh
@ 2013-02-19  5:52   ` David Miller
  0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2013-02-19  5:52 UTC (permalink / raw)
  To: fubar; +Cc: nikolay, netdev, andy

From: Jay Vosburgh <fubar@us.ibm.com>
Date: Mon, 18 Feb 2013 13:09:13 -0800

> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
> 
>>port->slave can be NULL since it's being initialized in bond_enslave
>>thus dereferencing a NULL pointer in bond_3ad_update_lacp_rate()
>>Also fix a minor bug, which could cause a port not to have
>>AD_STATE_LACP_TIMEOUT since there's no sync between
>>bond_3ad_update_lacp_rate() and bond_3ad_bind_slave(), by changing
>>the read_lock to a write_lock_bh in bond_3ad_update_lacp_rate().
> 
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
> 
>>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>

Applied
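
For readers following the thread, the NULL check from patch 1/3 can be modelled in plain C. This is only a sketch under assumptions: the model_* types and the MODEL_LACP_TIMEOUT bit are hypothetical stand-ins, the else branch that clears the bit is assumed rather than shown in the hunk, and the bond lock that the real function takes with write_lock_bh() is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit standing in for AD_STATE_LACP_TIMEOUT. */
#define MODEL_LACP_TIMEOUT 0x2u

struct model_slave;

struct model_port {
	struct model_slave *slave;	/* NULL until the port has been bound */
	unsigned int actor_oper_port_state;
};

struct model_slave {
	struct model_port port;
};

/* Apply the LACP rate to every bound port; returns how many were updated. */
static int model_update_lacp_rate(struct model_slave *slaves, size_t n,
				  int lacp_fast)
{
	int updated = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_port *port = &slaves[i].port;

		if (port->slave == NULL)
			continue;	/* still being enslaved, skip it */
		if (lacp_fast)
			port->actor_oper_port_state |= MODEL_LACP_TIMEOUT;
		else
			port->actor_oper_port_state &= ~MODEL_LACP_TIMEOUT;
		updated++;
	}
	return updated;
}
```

A port skipped here is not lost: per the commit message, the switch to the write lock is what keeps this update from interleaving with bond_3ad_bind_slave(), so a port bound afterwards sees the current lacp_fast setting.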


* Re: [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock
  2013-02-18 21:33   ` Jay Vosburgh
  2013-02-18 21:51     ` Nikolay Aleksandrov
@ 2013-02-19  5:52     ` David Miller
  1 sibling, 0 replies; 14+ messages in thread
From: David Miller @ 2013-02-19  5:52 UTC (permalink / raw)
  To: fubar; +Cc: nikolay, netdev, andy

From: Jay Vosburgh <fubar@us.ibm.com>
Date: Mon, 18 Feb 2013 13:33:10 -0800

> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
> 
>>The 3ad machine state spinlock can be used before it is initialized
>>while doing bond_enslave() (and the port is being initialized) since
>>port->slave is set before the lock is prepared, thus causing soft
>>lock-ups and a multitude of other nasty bugs.
> 
> 	Does this change cause the "uninitialized port" warnings in
> bond_3ad_state_machine_handler and bond_3ad_rx_indication to
> intermittently print during the enslavement process?  If so (and it
> looks to me like it will), I think the warnings should be removed, since
> after this change, port->slave being NULL isn't really an error
> condition that needs a warning to the log.
> 
>>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
 ...
>>-static inline void __initialize_port_locks(struct port *port)
>>+static inline void __initialize_port_locks(struct slave *port)
>> {
>> 	// make sure it isn't called twice
>>-	spin_lock_init(&(SLAVE_AD_INFO(port->slave).state_machine_lock));
>>+	spin_lock_init(&(SLAVE_AD_INFO(port).state_machine_lock));
> 
> 	Change the name of the variable here, too, not just the type.
> This is confusing.

I made this adjustment and applied Nikolay's patch.
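
The ordering issue discussed above can be shown with a small single-threaded model. It is a sketch under assumptions, not the driver's code: the model_* names are hypothetical, a boolean stands in for the initialized spinlock, and real concurrent code would also need the locking and ordering guarantees that this model ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_slave;

struct model_port {
	struct model_slave *slave;	/* non-NULL means "port is usable" */
};

struct model_slave {
	struct model_port port;
	bool lock_ready;		/* stands in for an initialized spinlock */
};

/* Takes the slave directly, so it can run before port->slave is set. */
static void model_init_port_locks(struct model_slave *slave)
{
	slave->lock_ready = true;
}

/* Invariant a reader relies on: a visible port always has a ready lock. */
static bool model_port_consistent(const struct model_slave *slave)
{
	return slave->port.slave == NULL || slave->lock_ready;
}

/* Bind in the safe order: prepare the lock first, then publish the port. */
static void model_bind_slave(struct model_slave *slave)
{
	model_init_port_locks(slave);
	slave->port.slave = slave;
}
```

Passing the slave rather than the port is what allows the lock setup to move ahead of the port->slave assignment, which is also why the review asked for the parameter to be renamed to match its new type.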


* Re: [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies
  2013-02-19  3:12   ` Jay Vosburgh
@ 2013-02-19  5:53     ` David Miller
  0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2013-02-19  5:53 UTC (permalink / raw)
  To: fubar; +Cc: nikolay, netdev, andy

From: Jay Vosburgh <fubar@us.ibm.com>
Date: Mon, 18 Feb 2013 19:12:01 -0800

> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
> 
>>This patch fixes the following inconsistencies in bond_release_all:
>>- IFF_BONDING flag is not stripped from slaves
>>- MTU is not restored
>>- no netdev notifiers are sent
>>Instead of trying to keep bond_release and bond_release_all in sync,
>>I think we can re-use bond_release, since the environment for calling it
>>is correct (RTNL is held). I have been running tests for the past
>>week and they all passed. The only way for bond_release to fail
>>is for the slave to be attached to a different bond or to not be a slave
>>at all, but that cannot happen because RTNL is held and no slave
>>manipulation can take place concurrently.
>>
>>V2: As suggested, bond_release is renamed to __bond_release_one with a
>>new parameter "all", so as to avoid running unnecessary code while
>>destroying a bond, and a wrapper for it called bond_release is created
>>because of ndo_del_slave. bond_release_all() is removed.
>>
>>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
> 
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

Applied.
