* [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate()
@ 2013-02-18 17:59 Nikolay Aleksandrov
2013-02-18 17:59 ` [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock Nikolay Aleksandrov
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-18 17:59 UTC
To: netdev; +Cc: davem, fubar, andy
port->slave can be NULL while bond_enslave() is still initializing the
port, so bond_3ad_update_lacp_rate() can dereference a NULL pointer.
Skip ports whose slave is not set yet.
Also fix a minor bug that could leave a port without
AD_STATE_LACP_TIMEOUT because there is no synchronization between
bond_3ad_update_lacp_rate() and bond_3ad_bind_slave(): change the
read_lock to a write_lock_bh in bond_3ad_update_lacp_rate().
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
drivers/net/bonding/bond_3ad.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index a030e63..1720742 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2494,11 +2494,13 @@ void bond_3ad_update_lacp_rate(struct bonding *bond)
struct port *port = NULL;
int lacp_fast;
- read_lock(&bond->lock);
+ write_lock_bh(&bond->lock);
lacp_fast = bond->params.lacp_fast;
bond_for_each_slave(bond, slave, i) {
port = &(SLAVE_AD_INFO(slave).port);
+ if (port->slave == NULL)
+ continue;
__get_state_machine_lock(port);
if (lacp_fast)
port->actor_oper_port_state |= AD_STATE_LACP_TIMEOUT;
@@ -2507,5 +2509,5 @@ void bond_3ad_update_lacp_rate(struct bonding *bond)
__release_state_machine_lock(port);
}
- read_unlock(&bond->lock);
+ write_unlock_bh(&bond->lock);
}
--
1.7.11.7
* [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock
2013-02-18 17:59 [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Nikolay Aleksandrov
@ 2013-02-18 17:59 ` Nikolay Aleksandrov
2013-02-18 21:33 ` Jay Vosburgh
2013-02-18 17:59 ` [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-18 17:59 UTC
To: netdev; +Cc: davem, fubar, andy
The 3ad machine state spinlock can be used before it is initialized
while doing bond_enslave() (and the port is being initialized) since
port->slave is set before the lock is prepared, thus causing soft
lock-ups and a multitude of other nasty bugs.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
drivers/net/bonding/bond_3ad.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 1720742..96d471e 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -389,13 +389,13 @@ static u8 __get_duplex(struct port *port)
/**
* __initialize_port_locks - initialize a port's STATE machine spinlock
- * @port: the port we're looking at
+ * @port: the slave of the port we're looking at
*
*/
-static inline void __initialize_port_locks(struct port *port)
+static inline void __initialize_port_locks(struct slave *port)
{
// make sure it isn't called twice
- spin_lock_init(&(SLAVE_AD_INFO(port->slave).state_machine_lock));
+ spin_lock_init(&(SLAVE_AD_INFO(port).state_machine_lock));
}
//conversions
@@ -1910,6 +1910,7 @@ int bond_3ad_bind_slave(struct slave *slave)
ad_initialize_port(port, bond->params.lacp_fast);
+ __initialize_port_locks(slave);
port->slave = slave;
port->actor_port_number = SLAVE_AD_INFO(slave).id;
// key is determined according to the link speed, duplex and user key(which is yet not supported)
@@ -1932,8 +1933,6 @@ int bond_3ad_bind_slave(struct slave *slave)
port->next_port_in_aggregator = NULL;
__disable_port(port);
- __initialize_port_locks(port);
-
// aggregator initialization
aggregator = &(SLAVE_AD_INFO(slave).aggregator);
--
1.7.11.7
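The ordering this patch establishes (prepare the lock first, publish port->slave second) can be sketched in userspace as follows. This is a single-threaded illustration under stated assumptions: a pthread mutex stands in for the state machine spinlock, all demo_* names and the lock_ready field are hypothetical, and a real concurrent version would also need the memory-ordering guarantees that the kernel's own locking provides.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct demo_slave;

struct demo_port {
	/* Other code paths treat a non-NULL pointer as "port is usable". */
	struct demo_slave *slave;
};

struct demo_slave {
	struct demo_port port;
	pthread_mutex_t state_machine_lock;
	int lock_ready;  /* demo-only marker, not part of the driver */
};

/* Takes the slave directly, so it does not depend on port->slave. */
static void demo_init_port_lock(struct demo_slave *slave)
{
	pthread_mutex_init(&slave->state_machine_lock, NULL);
	slave->lock_ready = 1;
}

static void demo_bind_slave(struct demo_slave *slave)
{
	/* 1. Prepare the lock while the port is still invisible. */
	demo_init_port_lock(slave);
	assert(slave->lock_ready);

	/* 2. Only now publish the back-pointer that readers check. */
	slave->port.slave = slave;
}

/* Mimics a reader that locks the port once port->slave is set. */
static int demo_try_touch_port(struct demo_port *port)
{
	if (port->slave == NULL)
		return 0;  /* not bound yet, leave the lock alone */

	pthread_mutex_lock(&port->slave->state_machine_lock);
	pthread_mutex_unlock(&port->slave->state_machine_lock);
	return 1;
}
```

Because the reader keys off port->slave, setting that pointer last means the reader can never reach a lock that has not been initialized.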
* [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies
2013-02-18 17:59 [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Nikolay Aleksandrov
2013-02-18 17:59 ` [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock Nikolay Aleksandrov
@ 2013-02-18 17:59 ` Nikolay Aleksandrov
2013-02-18 21:56 ` Jay Vosburgh
2013-02-18 21:09 ` [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Jay Vosburgh
2013-02-19 0:09 ` [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
3 siblings, 1 reply; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-18 17:59 UTC
To: netdev; +Cc: davem, fubar, andy
This patch fixes the following inconsistencies in bond_release_all:
- IFF_BONDING flag is not stripped from slaves
- MTU is not restored
- no netdev notifiers are sent
Instead of trying to keep bond_release and bond_release_all in sync
I think we can re-use bond_release as the environment for calling it
is correct (RTNL is held). I have been running tests for the past
week and they came out successful. bond_release can only fail if the
slave is attached to a different bond or is not a slave at all, and
neither can happen here because RTNL is held, so no concurrent slave
manipulation is possible.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
drivers/net/bonding/bond_main.c | 106 ++--------------------------------------
1 file changed, 5 insertions(+), 101 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 94c1534..fcfc880 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2140,113 +2140,17 @@ static int bond_release_and_destroy(struct net_device *bond_dev,
/*
* This function releases all slaves.
*/
-static int bond_release_all(struct net_device *bond_dev)
+static void bond_release_all(struct net_device *bond_dev)
{
struct bonding *bond = netdev_priv(bond_dev);
- struct slave *slave;
- struct net_device *slave_dev;
- struct sockaddr addr;
-
- write_lock_bh(&bond->lock);
-
- netif_carrier_off(bond_dev);
if (bond->slave_cnt == 0)
- goto out;
-
- bond->current_arp_slave = NULL;
- bond->primary_slave = NULL;
- bond_change_active_slave(bond, NULL);
-
- while ((slave = bond->first_slave) != NULL) {
- /* Inform AD package of unbinding of slave
- * before slave is detached from the list.
- */
- if (bond->params.mode == BOND_MODE_8023AD)
- bond_3ad_unbind_slave(slave);
-
- slave_dev = slave->dev;
- bond_detach_slave(bond, slave);
-
- /* now that the slave is detached, unlock and perform
- * all the undo steps that should not be called from
- * within a lock.
- */
- write_unlock_bh(&bond->lock);
-
- /* unregister rx_handler early so bond_handle_frame wouldn't
- * be called for this slave anymore.
- */
- netdev_rx_handler_unregister(slave_dev);
- synchronize_net();
-
- if (bond_is_lb(bond)) {
- /* must be called only after the slave
- * has been detached from the list
- */
- bond_alb_deinit_slave(bond, slave);
- }
-
- bond_destroy_slave_symlinks(bond_dev, slave_dev);
- bond_del_vlans_from_slave(bond, slave_dev);
-
- /* If the mode USES_PRIMARY, then we should only remove its
- * promisc and mc settings if it was the curr_active_slave, but that was
- * already taken care of above when we detached the slave
- */
- if (!USES_PRIMARY(bond->params.mode)) {
- /* unset promiscuity level from slave */
- if (bond_dev->flags & IFF_PROMISC)
- dev_set_promiscuity(slave_dev, -1);
-
- /* unset allmulti level from slave */
- if (bond_dev->flags & IFF_ALLMULTI)
- dev_set_allmulti(slave_dev, -1);
-
- /* flush master's mc_list from slave */
- netif_addr_lock_bh(bond_dev);
- bond_mc_list_flush(bond_dev, slave_dev);
- netif_addr_unlock_bh(bond_dev);
- }
-
- bond_upper_dev_unlink(bond_dev, slave_dev);
-
- slave_disable_netpoll(slave);
-
- /* close slave before restoring its mac address */
- dev_close(slave_dev);
-
- if (!bond->params.fail_over_mac) {
- /* restore original ("permanent") mac address*/
- memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
- addr.sa_family = slave_dev->type;
- dev_set_mac_address(slave_dev, &addr);
- }
-
- kfree(slave);
-
- /* re-acquire the lock before getting the next slave */
- write_lock_bh(&bond->lock);
- }
-
- eth_hw_addr_random(bond_dev);
- bond->dev_addr_from_first = true;
-
- if (bond_vlan_used(bond)) {
- pr_warning("%s: Warning: clearing HW address of %s while it still has VLANs.\n",
- bond_dev->name, bond_dev->name);
- pr_warning("%s: When re-adding slaves, make sure the bond's HW address matches its VLANs'.\n",
- bond_dev->name);
- }
-
+ return;
+ while (bond->first_slave != NULL)
+ bond_release(bond_dev, bond->first_slave->dev);
pr_info("%s: released all slaves\n", bond_dev->name);
-out:
- write_unlock_bh(&bond->lock);
-
- bond_compute_features(bond);
-
- return 0;
+ return;
}
/*
--
1.7.11.7
* Re: [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate()
2013-02-18 17:59 [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Nikolay Aleksandrov
2013-02-18 17:59 ` [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock Nikolay Aleksandrov
2013-02-18 17:59 ` [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
@ 2013-02-18 21:09 ` Jay Vosburgh
2013-02-19 5:52 ` David Miller
2013-02-19 0:09 ` [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
3 siblings, 1 reply; 14+ messages in thread
From: Jay Vosburgh @ 2013-02-18 21:09 UTC
To: Nikolay Aleksandrov; +Cc: netdev, davem, andy
Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>port->slave can be NULL while bond_enslave() is still initializing the
>port, so bond_3ad_update_lacp_rate() can dereference a NULL pointer.
>Skip ports whose slave is not set yet.
>Also fix a minor bug that could leave a port without
>AD_STATE_LACP_TIMEOUT because there is no synchronization between
>bond_3ad_update_lacp_rate() and bond_3ad_bind_slave(): change the
>read_lock to a write_lock_bh in bond_3ad_update_lacp_rate().
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>---
> drivers/net/bonding/bond_3ad.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index a030e63..1720742 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -2494,11 +2494,13 @@ void bond_3ad_update_lacp_rate(struct bonding *bond)
> struct port *port = NULL;
> int lacp_fast;
>
>- read_lock(&bond->lock);
>+ write_lock_bh(&bond->lock);
> lacp_fast = bond->params.lacp_fast;
>
> bond_for_each_slave(bond, slave, i) {
> port = &(SLAVE_AD_INFO(slave).port);
>+ if (port->slave == NULL)
>+ continue;
> __get_state_machine_lock(port);
> if (lacp_fast)
> port->actor_oper_port_state |= AD_STATE_LACP_TIMEOUT;
>@@ -2507,5 +2509,5 @@ void bond_3ad_update_lacp_rate(struct bonding *bond)
> __release_state_machine_lock(port);
> }
>
>- read_unlock(&bond->lock);
>+ write_unlock_bh(&bond->lock);
> }
>--
>1.7.11.7
>
* Re: [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock
2013-02-18 17:59 ` [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock Nikolay Aleksandrov
@ 2013-02-18 21:33 ` Jay Vosburgh
2013-02-18 21:51 ` Nikolay Aleksandrov
2013-02-19 5:52 ` David Miller
0 siblings, 2 replies; 14+ messages in thread
From: Jay Vosburgh @ 2013-02-18 21:33 UTC
To: Nikolay Aleksandrov; +Cc: netdev, davem, andy
Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>The 3ad machine state spinlock can be used before it is initialized
>while doing bond_enslave() (and the port is being initialized) since
>port->slave is set before the lock is prepared, thus causing soft
>lock-ups and a multitude of other nasty bugs.
Does this change cause the "uninitialized port" warnings in
bond_3ad_state_machine_handler and bond_3ad_rx_indication to
intermittently print during the enslavement process? If so (and it
looks to me like it will), I think the warnings should be removed, since
after this change, port->slave being NULL isn't really an error
condition that needs a warning to the log.
>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>---
> drivers/net/bonding/bond_3ad.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index 1720742..96d471e 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -389,13 +389,13 @@ static u8 __get_duplex(struct port *port)
>
> /**
> * __initialize_port_locks - initialize a port's STATE machine spinlock
>- * @port: the port we're looking at
>+ * @port: the slave of the port we're looking at
> *
> */
>-static inline void __initialize_port_locks(struct port *port)
>+static inline void __initialize_port_locks(struct slave *port)
> {
> // make sure it isn't called twice
>- spin_lock_init(&(SLAVE_AD_INFO(port->slave).state_machine_lock));
>+ spin_lock_init(&(SLAVE_AD_INFO(port).state_machine_lock));
Change the name of the variable here, too, not just the type.
This is confusing.
-J
> }
>
> //conversions
>@@ -1910,6 +1910,7 @@ int bond_3ad_bind_slave(struct slave *slave)
>
> ad_initialize_port(port, bond->params.lacp_fast);
>
>+ __initialize_port_locks(slave);
> port->slave = slave;
> port->actor_port_number = SLAVE_AD_INFO(slave).id;
> // key is determined according to the link speed, duplex and user key(which is yet not supported)
>@@ -1932,8 +1933,6 @@ int bond_3ad_bind_slave(struct slave *slave)
> port->next_port_in_aggregator = NULL;
>
> __disable_port(port);
>- __initialize_port_locks(port);
>-
>
> // aggregator initialization
> aggregator = &(SLAVE_AD_INFO(slave).aggregator);
>--
>1.7.11.7
>
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
* Re: [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock
2013-02-18 21:33 ` Jay Vosburgh
@ 2013-02-18 21:51 ` Nikolay Aleksandrov
2013-02-19 5:52 ` David Miller
1 sibling, 0 replies; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-18 21:51 UTC
To: Jay Vosburgh; +Cc: netdev, davem, andy
On 18/02/13 22:33, Jay Vosburgh wrote:
> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>
>> The 3ad machine state spinlock can be used before it is initialized
>> while doing bond_enslave() (and the port is being initialized) since
>> port->slave is set before the lock is prepared, thus causing soft
>> lock-ups and a multitude of other nasty bugs.
>
> Does this change cause the "uninitialized port" warnings in
> bond_3ad_state_machine_handler and bond_3ad_rx_indication to
> intermittently print during the enslavement process? If so (and it
> looks to me like it will), I think the warnings should be removed, since
> after this change, port->slave being NULL isn't really an error
> condition that needs a warning to the log.
>
This change can't cause that; it only initializes the spin lock
before the slave is set. After the first patch of this series this is
no longer strictly required: as far as I can tell, the only code that
could access the lock before the slave was set was the function fixed
there. It is still a bug that could manifest later, though. I don't
think it has anything to do with the warnings; the only change is that
the spin lock is initialized before the slave is assigned to the port.
Am I missing something here?
>> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>> ---
>> drivers/net/bonding/bond_3ad.c | 9 ++++-----
>> 1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>> index 1720742..96d471e 100644
>> --- a/drivers/net/bonding/bond_3ad.c
>> +++ b/drivers/net/bonding/bond_3ad.c
>> @@ -389,13 +389,13 @@ static u8 __get_duplex(struct port *port)
>>
>> /**
>> * __initialize_port_locks - initialize a port's STATE machine spinlock
>> - * @port: the port we're looking at
>> + * @port: the slave of the port we're looking at
>> *
>> */
>> -static inline void __initialize_port_locks(struct port *port)
>> +static inline void __initialize_port_locks(struct slave *port)
>> {
>> // make sure it isn't called twice
>> - spin_lock_init(&(SLAVE_AD_INFO(port->slave).state_machine_lock));
>> + spin_lock_init(&(SLAVE_AD_INFO(port).state_machine_lock));
>
> Change the name of the variable here, too, not just the type.
> This is confusing.
>
> -J
Thanks, I saw that after posting; I have already prepared this change.
>
>> }
>>
>> //conversions
>> @@ -1910,6 +1910,7 @@ int bond_3ad_bind_slave(struct slave *slave)
>>
>> ad_initialize_port(port, bond->params.lacp_fast);
>>
>> + __initialize_port_locks(slave);
>> port->slave = slave;
>> port->actor_port_number = SLAVE_AD_INFO(slave).id;
>> // key is determined according to the link speed, duplex and user key(which is yet not supported)
>> @@ -1932,8 +1933,6 @@ int bond_3ad_bind_slave(struct slave *slave)
>> port->next_port_in_aggregator = NULL;
>>
>> __disable_port(port);
>> - __initialize_port_locks(port);
>> -
>>
>> // aggregator initialization
>> aggregator = &(SLAVE_AD_INFO(slave).aggregator);
>> --
>> 1.7.11.7
>>
>
> ---
> -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
>
* Re: [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies
2013-02-18 17:59 ` [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
@ 2013-02-18 21:56 ` Jay Vosburgh
2013-02-18 22:13 ` Nikolay Aleksandrov
0 siblings, 1 reply; 14+ messages in thread
From: Jay Vosburgh @ 2013-02-18 21:56 UTC
To: Nikolay Aleksandrov; +Cc: netdev, davem, andy
Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>This patch fixes the following inconsistencies in bond_release_all:
>- IFF_BONDING flag is not stripped from slaves
>- MTU is not restored
>- no netdev notifiers are sent
>Instead of trying to keep bond_release and bond_release_all in sync
>I think we can re-use bond_release as the environment for calling it
>is correct (RTNL is held). I have been running tests for the past
>week and they came out successful. The only way for bond_release to fail
>is for the slave to be attached in a different bond or to not be a slave
>but that cannot happen as RTNL is held and no slave manipulations can be
>achieved.
It might be worthwhile to add an "all" argument to bond_release
that skips some things that don't make sense if all slaves are being
released. I'm thinking in particular of this block:
if (oldcurrent == slave) {
/*
* Note that we hold RTNL over this sequence, so there
* is no concern that another slave add/remove event
* will interfere.
*/
write_unlock_bh(&bond->lock);
read_lock(&bond->lock);
write_lock_bh(&bond->curr_slave_lock);
bond_select_active_slave(bond);
write_unlock_bh(&bond->curr_slave_lock);
read_unlock(&bond->lock);
write_lock_bh(&bond->lock);
}
as it's written now, for the release all case, the code may go
to the trouble of assigning a new active slave each time one slave is
removed (including various log messages, maybe sending IGMPs, etc). If
all slaves are being removed, that's pointless. This could be something
like:
if (release_all) {
bond->curr_active_slave = NULL;
} else if (oldcurrent == slave) {
[ the current block of stuff ]
}
it's safe here to unconditionally set curr_active_slave to NULL
because we hold bond->lock for write. The lock dance stuff for the
bond_select_active_slave() call is to satisfy its locking requirements.
-J
>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>---
> drivers/net/bonding/bond_main.c | 106 ++--------------------------------------
> 1 file changed, 5 insertions(+), 101 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 94c1534..fcfc880 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2140,113 +2140,17 @@ static int bond_release_and_destroy(struct net_device *bond_dev,
> /*
> * This function releases all slaves.
> */
>-static int bond_release_all(struct net_device *bond_dev)
>+static void bond_release_all(struct net_device *bond_dev)
> {
> struct bonding *bond = netdev_priv(bond_dev);
>- struct slave *slave;
>- struct net_device *slave_dev;
>- struct sockaddr addr;
>-
>- write_lock_bh(&bond->lock);
>-
>- netif_carrier_off(bond_dev);
>
> if (bond->slave_cnt == 0)
>- goto out;
>-
>- bond->current_arp_slave = NULL;
>- bond->primary_slave = NULL;
>- bond_change_active_slave(bond, NULL);
>-
>- while ((slave = bond->first_slave) != NULL) {
>- /* Inform AD package of unbinding of slave
>- * before slave is detached from the list.
>- */
>- if (bond->params.mode == BOND_MODE_8023AD)
>- bond_3ad_unbind_slave(slave);
>-
>- slave_dev = slave->dev;
>- bond_detach_slave(bond, slave);
>-
>- /* now that the slave is detached, unlock and perform
>- * all the undo steps that should not be called from
>- * within a lock.
>- */
>- write_unlock_bh(&bond->lock);
>-
>- /* unregister rx_handler early so bond_handle_frame wouldn't
>- * be called for this slave anymore.
>- */
>- netdev_rx_handler_unregister(slave_dev);
>- synchronize_net();
>-
>- if (bond_is_lb(bond)) {
>- /* must be called only after the slave
>- * has been detached from the list
>- */
>- bond_alb_deinit_slave(bond, slave);
>- }
>-
>- bond_destroy_slave_symlinks(bond_dev, slave_dev);
>- bond_del_vlans_from_slave(bond, slave_dev);
>-
>- /* If the mode USES_PRIMARY, then we should only remove its
>- * promisc and mc settings if it was the curr_active_slave, but that was
>- * already taken care of above when we detached the slave
>- */
>- if (!USES_PRIMARY(bond->params.mode)) {
>- /* unset promiscuity level from slave */
>- if (bond_dev->flags & IFF_PROMISC)
>- dev_set_promiscuity(slave_dev, -1);
>-
>- /* unset allmulti level from slave */
>- if (bond_dev->flags & IFF_ALLMULTI)
>- dev_set_allmulti(slave_dev, -1);
>-
>- /* flush master's mc_list from slave */
>- netif_addr_lock_bh(bond_dev);
>- bond_mc_list_flush(bond_dev, slave_dev);
>- netif_addr_unlock_bh(bond_dev);
>- }
>-
>- bond_upper_dev_unlink(bond_dev, slave_dev);
>-
>- slave_disable_netpoll(slave);
>-
>- /* close slave before restoring its mac address */
>- dev_close(slave_dev);
>-
>- if (!bond->params.fail_over_mac) {
>- /* restore original ("permanent") mac address*/
>- memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
>- addr.sa_family = slave_dev->type;
>- dev_set_mac_address(slave_dev, &addr);
>- }
>-
>- kfree(slave);
>-
>- /* re-acquire the lock before getting the next slave */
>- write_lock_bh(&bond->lock);
>- }
>-
>- eth_hw_addr_random(bond_dev);
>- bond->dev_addr_from_first = true;
>-
>- if (bond_vlan_used(bond)) {
>- pr_warning("%s: Warning: clearing HW address of %s while it still has VLANs.\n",
>- bond_dev->name, bond_dev->name);
>- pr_warning("%s: When re-adding slaves, make sure the bond's HW address matches its VLANs'.\n",
>- bond_dev->name);
>- }
>-
>+ return;
>+ while (bond->first_slave != NULL)
>+ bond_release(bond_dev, bond->first_slave->dev);
> pr_info("%s: released all slaves\n", bond_dev->name);
>
>-out:
>- write_unlock_bh(&bond->lock);
>-
>- bond_compute_features(bond);
>-
>- return 0;
>+ return;
> }
>
> /*
>--
>1.7.11.7
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
* Re: [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies
2013-02-18 21:56 ` Jay Vosburgh
@ 2013-02-18 22:13 ` Nikolay Aleksandrov
2013-02-18 23:17 ` Jay Vosburgh
0 siblings, 1 reply; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-18 22:13 UTC
To: Jay Vosburgh; +Cc: netdev, davem, andy
On 18/02/13 22:56, Jay Vosburgh wrote:
> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>
>> This patch fixes the following inconsistencies in bond_release_all:
>> - IFF_BONDING flag is not stripped from slaves
>> - MTU is not restored
>> - no netdev notifiers are sent
>> Instead of trying to keep bond_release and bond_release_all in sync
>> I think we can re-use bond_release as the environment for calling it
>> is correct (RTNL is held). I have been running tests for the past
>> week and they came out successful. The only way for bond_release to fail
>> is for the slave to be attached in a different bond or to not be a slave
>> but that cannot happen as RTNL is held and no slave manipulations can be
>> achieved.
>
> It might be worthwhile to add an "all" argument to bond_release
> that skips some things that don't make sense if all slaves are being
> released. I'm thinking in particular of this block:
>
> if (oldcurrent == slave) {
> /*
> * Note that we hold RTNL over this sequence, so there
> * is no concern that another slave add/remove event
> * will interfere.
> */
> write_unlock_bh(&bond->lock);
> read_lock(&bond->lock);
> write_lock_bh(&bond->curr_slave_lock);
>
> bond_select_active_slave(bond);
>
> write_unlock_bh(&bond->curr_slave_lock);
> read_unlock(&bond->lock);
> write_lock_bh(&bond->lock);
> }
>
> as it's written now, for the release all case, the code may go
> to the trouble of assigning a new active slave each time one slave is
> removed (including various log messages, maybe sending IGMPs, etc). If
> all slaves are being removed, that's pointless. This could be something
> like:
>
> if (release_all) {
> bond->curr_active_slave = NULL;
> } else if (oldcurrent == slave) {
> [ the current block of stuff ]
> }
>
> it's safe here to unconditionally set curr_active_slave to NULL
> because we hold bond->lock for write. The lock dance stuff for the
> bond_select_active_slave() call is to satisfy its locking requirements.
>
> -J
I see your point and I agree. I will prepare another version that
incorporates it, although I can't add it as an argument since
bond_release is used as ndo_del_slave. I'll have to make it a global
variable.
Nik
* Re: [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies
2013-02-18 22:13 ` Nikolay Aleksandrov
@ 2013-02-18 23:17 ` Jay Vosburgh
0 siblings, 0 replies; 14+ messages in thread
From: Jay Vosburgh @ 2013-02-18 23:17 UTC
To: Nikolay Aleksandrov; +Cc: netdev, davem, andy
Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>On 18/02/13 22:56, Jay Vosburgh wrote:
>> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>>
>>> This patch fixes the following inconsistencies in bond_release_all:
>>> - IFF_BONDING flag is not stripped from slaves
>>> - MTU is not restored
>>> - no netdev notifiers are sent
>>> Instead of trying to keep bond_release and bond_release_all in sync
>>> I think we can re-use bond_release as the environment for calling it
>>> is correct (RTNL is held). I have been running tests for the past
>>> week and they came out successful. The only way for bond_release to fail
>>> is for the slave to be attached in a different bond or to not be a slave
>>> but that cannot happen as RTNL is held and no slave manipulations can be
>>> achieved.
>>
>> It might be worthwhile to add an "all" argument to bond_release
>> that skips some things that don't make sense if all slaves are being
>> released. I'm thinking in particular of this block:
>>
>> if (oldcurrent == slave) {
>> /*
>> * Note that we hold RTNL over this sequence, so there
>> * is no concern that another slave add/remove event
>> * will interfere.
>> */
>> write_unlock_bh(&bond->lock);
>> read_lock(&bond->lock);
>> write_lock_bh(&bond->curr_slave_lock);
>>
>> bond_select_active_slave(bond);
>>
>> write_unlock_bh(&bond->curr_slave_lock);
>> read_unlock(&bond->lock);
>> write_lock_bh(&bond->lock);
>> }
>>
>> as it's written now, for the release all case, the code may go
>> to the trouble of assigning a new active slave each time one slave is
>> removed (including various log messages, maybe sending IGMPs, etc). If
>> all slaves are being removed, that's pointless. This could be something
>> like:
>>
>> if (release_all) {
>> bond->curr_active_slave = NULL;
>> } else if (oldcurrent == slave) {
>> [ the current block of stuff ]
>> }
>>
>> it's safe here to unconditionally set curr_active_slave to NULL
>> because we hold bond->lock for write. The lock dance stuff for the
>> bond_select_active_slave() call is to satisfy its locking requirements.
>>
>> -J
>I see your point and I agree. I will prepare another version that
>incorporates it, although I can't add it as an argument since
>bond_release is used as ndo_del_slave. I'll have to make it a global
>variable.
No, just rename the current bond_release to __bond_release_one,
add the extra argument, and create a new bond_release .ndo_del_slave
that calls __bond_release_one with "all=0". Then, bond_release_all
calls __bond_release_one with all=1.
Also, there's only one caller of bond_release_all, and since the
new & improved bond_release_all is trivial, it could be open coded into
bond_uninit, eliminating bond_release_all as a function.
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
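The refactoring suggested above (one shared worker that takes an extra flag, a thin wrapper that keeps the callback signature, and a release-all loop that passes the flag as true) can be sketched in userspace like this. It is an illustration under stated assumptions, not the driver code: slaves are plain integer ids in a fixed array, "reselecting" simply picks the first remaining slave, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_SLAVES 4

struct demo_bond {
	int slaves[DEMO_MAX_SLAVES]; /* hypothetical slave ids, in list order */
	size_t slave_cnt;
	int curr_active;             /* id of the active slave, -1 if none */
	unsigned int reselections;   /* counts trips through the reselect path */
};

/* Shared worker; "all" is true only when every slave is being released. */
static int demo_release_one(struct demo_bond *bond, int slave_id, bool all)
{
	size_t i, pos = bond->slave_cnt;

	assert(bond->slave_cnt <= DEMO_MAX_SLAVES);
	for (i = 0; i < bond->slave_cnt; i++) {
		if (bond->slaves[i] == slave_id) {
			pos = i;
			break;
		}
	}
	if (pos == bond->slave_cnt)
		return -1;  /* not a slave of this bond */

	/* Detach: close the gap left by the removed entry. */
	for (i = pos; i + 1 < bond->slave_cnt; i++)
		bond->slaves[i] = bond->slaves[i + 1];
	bond->slave_cnt--;

	if (all) {
		/* Everything is going away, so skip reselection entirely. */
		bond->curr_active = -1;
	} else if (bond->curr_active == slave_id) {
		/* Stand-in for the costly "select a new active slave" path. */
		bond->curr_active = bond->slave_cnt ? bond->slaves[0] : -1;
		bond->reselections++;
	}
	return 0;
}

/* Wrapper that keeps the two-argument callback signature. */
static int demo_release(struct demo_bond *bond, int slave_id)
{
	return demo_release_one(bond, slave_id, false);
}

/* Trivial enough that a caller could open code it, as suggested above. */
static void demo_release_all(struct demo_bond *bond)
{
	while (bond->slave_cnt > 0)
		demo_release_one(bond, bond->slaves[0], true);
}

/* The callback slot only accepts the two-argument form. */
typedef int (*demo_del_slave_fn)(struct demo_bond *bond, int slave_id);
static const demo_del_slave_fn demo_del_slave = demo_release;
```

Releasing the active slave through the wrapper triggers one reselection, while releasing the remaining slaves through demo_release_all() triggers none, which is the saving the extra argument is meant to buy.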
* [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies
2013-02-18 17:59 [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Nikolay Aleksandrov
` (2 preceding siblings ...)
2013-02-18 21:09 ` [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Jay Vosburgh
@ 2013-02-19 0:09 ` Nikolay Aleksandrov
2013-02-19 3:12 ` Jay Vosburgh
3 siblings, 1 reply; 14+ messages in thread
From: Nikolay Aleksandrov @ 2013-02-19 0:09 UTC
To: netdev; +Cc: andy, fubar
This patch fixes the following inconsistencies in bond_release_all:
- IFF_BONDING flag is not stripped from slaves
- MTU is not restored
- no netdev notifiers are sent
Instead of trying to keep bond_release and bond_release_all in sync
I think we can re-use bond_release as the environment for calling it
is correct (RTNL is held). I have been running tests for the past
week and they came out successful. bond_release can only fail if the
slave is attached to a different bond or is not a slave at all, and
neither can happen here because RTNL is held, so no concurrent slave
manipulation is possible.
V2: As suggested, bond_release is renamed to __bond_release_one and
gains a new parameter "all" so that unnecessary code is skipped while
destroying a bond. A wrapper called bond_release is created because
the function is used as ndo_del_slave. bond_release_all() is removed.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
drivers/net/bonding/bond_main.c | 135 ++++++----------------------------------
1 file changed, 18 insertions(+), 117 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 94c1534..e242dd1 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1937,7 +1937,8 @@ err_undo_flags:
/*
* Try to release the slave device <slave> from the bond device <master>
* It is legal to access curr_active_slave without a lock because all the function
- * is write-locked.
+ * is write-locked. If "all" is true it means that the function is being called
+ * while destroying a bond interface and all slaves are being released.
*
* The rules for slave state should be:
* for Active/Backup:
@@ -1945,7 +1946,9 @@ err_undo_flags:
* for Bonded connections:
* The first up interface should be left on and all others downed.
*/
-int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
+static int __bond_release_one(struct net_device *bond_dev,
+ struct net_device *slave_dev,
+ bool all)
{
struct bonding *bond = netdev_priv(bond_dev);
struct slave *slave, *oldcurrent;
@@ -1982,7 +1985,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
synchronize_net();
write_lock_bh(&bond->lock);
- if (!bond->params.fail_over_mac) {
+ if (!all && !bond->params.fail_over_mac) {
if (ether_addr_equal(bond_dev->dev_addr, slave->perm_hwaddr) &&
bond->slave_cnt > 1)
pr_warning("%s: Warning: the permanent HWaddr of %s - %pM - is still in use by %s. Set the HWaddr of %s to a different address to avoid conflicts.\n",
@@ -2028,7 +2031,9 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
write_lock_bh(&bond->lock);
}
- if (oldcurrent == slave) {
+ if (all) {
+ bond->curr_active_slave = NULL;
+ } else if (oldcurrent == slave) {
/*
* Note that we hold RTNL over this sequence, so there
* is no concern that another slave add/remove event
@@ -2117,6 +2122,12 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
return 0; /* deletion OK */
}
+/* A wrapper used because of ndo_del_link */
+int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
+{
+ return __bond_release_one(bond_dev, slave_dev, false);
+}
+
/*
* First release a slave and then destroy the bond if no more slaves are left.
* Must be under rtnl_lock when this function is called.
@@ -2138,118 +2149,6 @@ static int bond_release_and_destroy(struct net_device *bond_dev,
}
/*
- * This function releases all slaves.
- */
-static int bond_release_all(struct net_device *bond_dev)
-{
- struct bonding *bond = netdev_priv(bond_dev);
- struct slave *slave;
- struct net_device *slave_dev;
- struct sockaddr addr;
-
- write_lock_bh(&bond->lock);
-
- netif_carrier_off(bond_dev);
-
- if (bond->slave_cnt == 0)
- goto out;
-
- bond->current_arp_slave = NULL;
- bond->primary_slave = NULL;
- bond_change_active_slave(bond, NULL);
-
- while ((slave = bond->first_slave) != NULL) {
- /* Inform AD package of unbinding of slave
- * before slave is detached from the list.
- */
- if (bond->params.mode == BOND_MODE_8023AD)
- bond_3ad_unbind_slave(slave);
-
- slave_dev = slave->dev;
- bond_detach_slave(bond, slave);
-
- /* now that the slave is detached, unlock and perform
- * all the undo steps that should not be called from
- * within a lock.
- */
- write_unlock_bh(&bond->lock);
-
- /* unregister rx_handler early so bond_handle_frame wouldn't
- * be called for this slave anymore.
- */
- netdev_rx_handler_unregister(slave_dev);
- synchronize_net();
-
- if (bond_is_lb(bond)) {
- /* must be called only after the slave
- * has been detached from the list
- */
- bond_alb_deinit_slave(bond, slave);
- }
-
- bond_destroy_slave_symlinks(bond_dev, slave_dev);
- bond_del_vlans_from_slave(bond, slave_dev);
-
- /* If the mode USES_PRIMARY, then we should only remove its
- * promisc and mc settings if it was the curr_active_slave, but that was
- * already taken care of above when we detached the slave
- */
- if (!USES_PRIMARY(bond->params.mode)) {
- /* unset promiscuity level from slave */
- if (bond_dev->flags & IFF_PROMISC)
- dev_set_promiscuity(slave_dev, -1);
-
- /* unset allmulti level from slave */
- if (bond_dev->flags & IFF_ALLMULTI)
- dev_set_allmulti(slave_dev, -1);
-
- /* flush master's mc_list from slave */
- netif_addr_lock_bh(bond_dev);
- bond_mc_list_flush(bond_dev, slave_dev);
- netif_addr_unlock_bh(bond_dev);
- }
-
- bond_upper_dev_unlink(bond_dev, slave_dev);
-
- slave_disable_netpoll(slave);
-
- /* close slave before restoring its mac address */
- dev_close(slave_dev);
-
- if (!bond->params.fail_over_mac) {
- /* restore original ("permanent") mac address*/
- memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
- addr.sa_family = slave_dev->type;
- dev_set_mac_address(slave_dev, &addr);
- }
-
- kfree(slave);
-
- /* re-acquire the lock before getting the next slave */
- write_lock_bh(&bond->lock);
- }
-
- eth_hw_addr_random(bond_dev);
- bond->dev_addr_from_first = true;
-
- if (bond_vlan_used(bond)) {
- pr_warning("%s: Warning: clearing HW address of %s while it still has VLANs.\n",
- bond_dev->name, bond_dev->name);
- pr_warning("%s: When re-adding slaves, make sure the bond's HW address matches its VLANs'.\n",
- bond_dev->name);
- }
-
- pr_info("%s: released all slaves\n", bond_dev->name);
-
-out:
- write_unlock_bh(&bond->lock);
-
- bond_compute_features(bond);
-
- return 0;
-}
-
-/*
* This function changes the active slave to slave <slave_dev>.
* It returns -EINVAL in the following cases.
* - <slave_dev> is not found in the list.
@@ -4440,7 +4339,9 @@ static void bond_uninit(struct net_device *bond_dev)
bond_netpoll_cleanup(bond_dev);
/* Release the bonded slaves */
- bond_release_all(bond_dev);
+ while (bond->first_slave != NULL)
+ __bond_release_one(bond_dev, bond->first_slave->dev, true);
+ pr_info("%s: released all slaves\n", bond_dev->name);
list_del(&bond->bond_list);
--
1.7.11.7
* Re: [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies
2013-02-19 0:09 ` [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
@ 2013-02-19 3:12 ` Jay Vosburgh
2013-02-19 5:53 ` David Miller
0 siblings, 1 reply; 14+ messages in thread
From: Jay Vosburgh @ 2013-02-19 3:12 UTC (permalink / raw)
To: Nikolay Aleksandrov; +Cc: netdev, andy
Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>This patch fixes the following inconsistencies in bond_release_all:
>- IFF_BONDING flag is not stripped from slaves
>- MTU is not restored
>- no netdev notifiers are sent
>Instead of trying to keep bond_release and bond_release_all in sync,
>I think we can re-use bond_release, since the environment for calling it
>is correct (RTNL is held). I have been running tests for the past
>week and they passed. bond_release can only fail if the slave is
>attached to a different bond or is not a slave at all, but that cannot
>happen here because RTNL is held and no slave manipulation can take
>place concurrently.
>
>V2: As suggested, bond_release is renamed to __bond_release_one and
>gains a new parameter "all", which avoids running unnecessary code while
>destroying a bond. A wrapper named bond_release is kept because of
>ndo_del_link. bond_release_all() is removed.
>
>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
* Re: [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate()
2013-02-18 21:09 ` [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Jay Vosburgh
@ 2013-02-19 5:52 ` David Miller
0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2013-02-19 5:52 UTC (permalink / raw)
To: fubar; +Cc: nikolay, netdev, andy
From: Jay Vosburgh <fubar@us.ibm.com>
Date: Mon, 18 Feb 2013 13:09:13 -0800
> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>
>>port->slave can be NULL since it's being initialized in bond_enslave
>>thus dereferencing a NULL pointer in bond_3ad_update_lacp_rate()
>>Also fix a minor bug, which could cause a port not to have
>>AD_STATE_LACP_TIMEOUT since there's no sync between
>>bond_3ad_update_lacp_rate() and bond_3ad_bind_slave(), by changing
>>the read_lock to a write_lock_bh in bond_3ad_update_lacp_rate().
>
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
>
>>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Applied
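The guard described in the quoted changelog can be modelled in userspace
as follows. This is a sketch only: the model_* names and the
MODEL_STATE_LACP_TIMEOUT bit are hypothetical, and the locking that the
patch changes (read_lock to write_lock_bh) is deliberately left out, so
only the "skip ports that are not bound yet" logic is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bit, named for this sketch only. */
#define MODEL_STATE_LACP_TIMEOUT 0x2

/* Hypothetical stand-in for struct port. */
struct model_port {
	void *slave;			/* NULL until binding completes */
	unsigned char actor_oper_port_state;
};

/* Set or clear the timeout bit on every bound port.
 * Ports whose slave back-pointer is still NULL are skipped,
 * because they are not fully initialized yet.
 * Returns the number of ports that were updated.
 */
static int model_update_lacp_rate(struct model_port *ports, size_t n,
				  int lacp_fast)
{
	int updated = 0;
	size_t i;

	assert(ports != NULL || n == 0);

	for (i = 0; i < n; i++) {
		struct model_port *port = &ports[i];

		if (port->slave == NULL)
			continue;

		if (lacp_fast)
			port->actor_oper_port_state |= MODEL_STATE_LACP_TIMEOUT;
		else
			port->actor_oper_port_state &=
				(unsigned char)~MODEL_STATE_LACP_TIMEOUT;
		updated++;
	}

	return updated;
}
```

In the real driver the loop runs with bond->lock held for writing, which
is what keeps it from interleaving with bond_3ad_bind_slave(); the model
above assumes a single thread.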
* Re: [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock
2013-02-18 21:33 ` Jay Vosburgh
2013-02-18 21:51 ` Nikolay Aleksandrov
@ 2013-02-19 5:52 ` David Miller
1 sibling, 0 replies; 14+ messages in thread
From: David Miller @ 2013-02-19 5:52 UTC (permalink / raw)
To: fubar; +Cc: nikolay, netdev, andy
From: Jay Vosburgh <fubar@us.ibm.com>
Date: Mon, 18 Feb 2013 13:33:10 -0800
> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>
>>The 3ad machine state spinlock can be used before it is initialized
>>while doing bond_enslave() (and the port is being initialized) since
>>port->slave is set before the lock is prepared, thus causing soft
>>lock-ups and a multitude of other nasty bugs.
>
> Does this change cause the "uninitialized port" warnings in
> bond_3ad_state_machine_handler and bond_3ad_rx_indication to
> intermittently print during the enslavement process? If so (and it
> looks to me like it will), I think the warnings should be removed, since
> after this change, port->slave being NULL isn't really an error
> condition that needs a warning to the log.
>
>>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
...
>>-static inline void __initialize_port_locks(struct port *port)
>>+static inline void __initialize_port_locks(struct slave *port)
>> {
>> // make sure it isn't called twice
>>- spin_lock_init(&(SLAVE_AD_INFO(port->slave).state_machine_lock));
>>+ spin_lock_init(&(SLAVE_AD_INFO(port).state_machine_lock));
>
> Change the name of the variable here, too, not just the type.
> This is confusing.
I made this adjustment and applied Nikolay's patch.
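The ordering problem discussed above (the lock must be prepared before
port->slave is set) can be sketched in userspace like this. Everything
here is an assumption made for illustration: the model_* types, the fake
lock with an "initialized" flag, and the return codes. The point is only
the order of the two steps in model_bind_slave(), and that the
lock-initialization helper takes a slave and names its parameter
accordingly, as requested in the review.

```c
#include <assert.h>
#include <stddef.h>

/* Fake lock that can report being used before initialization. */
struct model_lock {
	int initialized;
	int held;
};

struct model_slave;

/* Hypothetical stand-in for struct port. */
struct model_port {
	struct model_slave *slave;	/* NULL until the port is bound */
};

/* Hypothetical stand-in for the slave and its per-slave 3ad info. */
struct model_slave {
	struct model_lock state_machine_lock;
	struct model_port port;
};

static void model_lock_init(struct model_lock *lock)
{
	lock->initialized = 1;
	lock->held = 0;
}

/* Returns -1 if the lock is taken before it was initialized. */
static int model_lock_acquire(struct model_lock *lock)
{
	if (!lock->initialized)
		return -1;
	lock->held = 1;
	return 0;
}

static void model_lock_release(struct model_lock *lock)
{
	assert(lock->held);
	lock->held = 0;
}

/* Takes the slave, and the parameter is named "slave", not "port". */
static void model_initialize_port_locks(struct model_slave *slave)
{
	model_lock_init(&slave->state_machine_lock);
}

/* Prepare the lock first, publish the back-pointer last. */
static void model_bind_slave(struct model_slave *slave)
{
	model_initialize_port_locks(slave);
	slave->port.slave = slave;
}

/* A reader that treats a NULL back-pointer as "not ready yet".
 * Returns 0 if the port was skipped, 1 if it was handled,
 * -1 if the lock was used before initialization.
 */
static int model_touch_port(struct model_port *port)
{
	struct model_slave *slave = port->slave;

	if (slave == NULL)
		return 0;
	if (model_lock_acquire(&slave->state_machine_lock) != 0)
		return -1;
	model_lock_release(&slave->state_machine_lock);
	return 1;
}
```

With this ordering a reader either sees a NULL back-pointer and skips the
port, or sees a port whose lock is already usable. Real concurrent code
would also need the appropriate locking or memory ordering, which this
single-threaded model does not attempt to show.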
* Re: [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies
2013-02-19 3:12 ` Jay Vosburgh
@ 2013-02-19 5:53 ` David Miller
0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2013-02-19 5:53 UTC (permalink / raw)
To: fubar; +Cc: nikolay, netdev, andy
From: Jay Vosburgh <fubar@us.ibm.com>
Date: Mon, 18 Feb 2013 19:12:01 -0800
> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
>
>>This patch fixes the following inconsistencies in bond_release_all:
>>- IFF_BONDING flag is not stripped from slaves
>>- MTU is not restored
>>- no netdev notifiers are sent
>>Instead of trying to keep bond_release and bond_release_all in sync,
>>I think we can re-use bond_release, since the environment for calling it
>>is correct (RTNL is held). I have been running tests for the past
>>week and they passed. bond_release can only fail if the slave is
>>attached to a different bond or is not a slave at all, but that cannot
>>happen here because RTNL is held and no slave manipulation can take
>>place concurrently.
>>
>>V2: As suggested, bond_release is renamed to __bond_release_one and
>>gains a new parameter "all", which avoids running unnecessary code while
>>destroying a bond. A wrapper named bond_release is kept because of
>>ndo_del_link. bond_release_all() is removed.
>>
>>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Applied.
Thread overview: 14+ messages
2013-02-18 17:59 [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Nikolay Aleksandrov
2013-02-18 17:59 ` [PATCH net 2/3] bonding: Fix initialize after use for 3ad machine state spinlock Nikolay Aleksandrov
2013-02-18 21:33 ` Jay Vosburgh
2013-02-18 21:51 ` Nikolay Aleksandrov
2013-02-19 5:52 ` David Miller
2013-02-18 17:59 ` [PATCH net-next 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
2013-02-18 21:56 ` Jay Vosburgh
2013-02-18 22:13 ` Nikolay Aleksandrov
2013-02-18 23:17 ` Jay Vosburgh
2013-02-18 21:09 ` [PATCH net 1/3] bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() Jay Vosburgh
2013-02-19 5:52 ` David Miller
2013-02-19 0:09 ` [PATCH net-next v2 3/3] bonding: fix bond_release_all inconsistencies Nikolay Aleksandrov
2013-02-19 3:12 ` Jay Vosburgh
2013-02-19 5:53 ` David Miller