netdev.vger.kernel.org archive mirror
* [PATCH net-next] bonding: RCUify bond_set_rx_mode()
@ 2013-08-05  9:26 Veaceslav Falico
  2013-08-05 10:21 ` Nikolay Aleksandrov
  2013-08-05 18:31 ` Veaceslav Falico
  0 siblings, 2 replies; 6+ messages in thread
From: Veaceslav Falico @ 2013-08-05  9:26 UTC (permalink / raw)
  To: netdev
  Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek, Nikolay Aleksandrov

Currently, we might easily deadlock with bond_set_rx_mode() and
bond_hw_addr_swap(). bond_set_rx_mode() is called via dev_set_rx_mode(),
which already holds netif_addr_lock_bh(bond), and inside it takes
bond->curr_slave_lock, while bond_hw_addr_swap() is called with
bond->curr_slave_lock held and then takes netif_addr_lock_bh(bond).
The two paths take the locks in opposite order, which results in deadlock.

CPU0                    CPU1
----                    ----
lock(&bonding_netdev_addr_lock_key);
			lock(&bond->curr_slave_lock);
			lock(&bonding_netdev_addr_lock_key);
lock(&bond->curr_slave_lock);

Fix this by using the RCU primitives in bond_set_rx_mode(). We're safe wrt
racing of dev_{uc,mc}_(un)sync() because we hold
lock(&bonding_netdev_addr_lock_key), and thus nobody will be able to modify
these lists before we finish.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_main.c |   10 ++++------
 1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 476df7d..fdc01c6 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3571,24 +3571,22 @@ static void bond_set_rx_mode(struct net_device *bond_dev)
 	struct bonding *bond = netdev_priv(bond_dev);
 	struct slave *slave;
 
-	read_lock(&bond->lock);
+	rcu_read_lock();
 
 	if (USES_PRIMARY(bond->params.mode)) {
-		read_lock(&bond->curr_slave_lock);
-		slave = bond->curr_active_slave;
+		slave = rcu_dereference(bond->curr_active_slave);
 		if (slave) {
 			dev_uc_sync(slave->dev, bond_dev);
 			dev_mc_sync(slave->dev, bond_dev);
 		}
-		read_unlock(&bond->curr_slave_lock);
 	} else {
-		bond_for_each_slave(bond, slave) {
+		bond_for_each_slave_rcu(bond, slave) {
 			dev_uc_sync_multiple(slave->dev, bond_dev);
 			dev_mc_sync_multiple(slave->dev, bond_dev);
 		}
 	}
 
-	read_unlock(&bond->lock);
+	rcu_read_unlock();
 }
 
 static int bond_neigh_init(struct neighbour *n)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread
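
The lock ordering in the report above can be replayed in user space. The
following is only a minimal sketch, not kernel code and not how lockdep is
implemented: the order_tracker type, its helpers and the two lock-class names
are hypothetical, introduced purely for illustration. It records which lock
class was held while another was taken and flags a later acquisition that
reverses a recorded pair. It only catches direct two-lock inversions.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { ADDR_LOCK, CURR_SLAVE_LOCK, NR_CLASSES };

struct order_tracker {
	/* before[a][b] is true once a was held while b was taken. */
	bool before[NR_CLASSES][NR_CLASSES];
	bool held[NR_CLASSES];
};

static void tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/* Record an acquisition; return false if it reverses a recorded order. */
static bool tracker_acquire(struct order_tracker *t, enum lock_class c)
{
	bool ok = true;

	for (int h = 0; h < NR_CLASSES; h++) {
		if (!t->held[h] || h == (int)c)
			continue;
		if (t->before[c][h])
			ok = false;	/* c -> h was seen before, now h -> c */
		t->before[h][c] = true;
	}
	t->held[c] = true;
	return ok;
}

static void tracker_release(struct order_tracker *t, enum lock_class c)
{
	t->held[c] = false;
}

/* Replay both call paths one after the other; true if an inversion is seen. */
static bool report_paths_invert(void)
{
	struct order_tracker t;
	bool ok = true;

	tracker_init(&t);

	/* dev_set_rx_mode() -> bond_set_rx_mode(): addr lock, then slave lock. */
	ok = tracker_acquire(&t, ADDR_LOCK) && ok;
	ok = tracker_acquire(&t, CURR_SLAVE_LOCK) && ok;
	tracker_release(&t, CURR_SLAVE_LOCK);
	tracker_release(&t, ADDR_LOCK);

	/* bond_hw_addr_swap(): slave lock, then addr lock. */
	ok = tracker_acquire(&t, CURR_SLAVE_LOCK) && ok;
	ok = tracker_acquire(&t, ADDR_LOCK) && ok;
	tracker_release(&t, ADDR_LOCK);
	tracker_release(&t, CURR_SLAVE_LOCK);

	return !ok;
}
```

Calling report_paths_invert() replays the two paths sequentially, so the
sketch reports that the orders conflict even though no two contexts ever ran
concurrently. Whether the two paths can really run at the same time is a
separate question that an ordering check alone does not answer.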

* Re: [PATCH net-next] bonding: RCUify bond_set_rx_mode()
  2013-08-05  9:26 [PATCH net-next] bonding: RCUify bond_set_rx_mode() Veaceslav Falico
@ 2013-08-05 10:21 ` Nikolay Aleksandrov
  2013-08-05 12:31   ` Veaceslav Falico
  2013-08-05 18:31 ` Veaceslav Falico
  1 sibling, 1 reply; 6+ messages in thread
From: Nikolay Aleksandrov @ 2013-08-05 10:21 UTC (permalink / raw)
  To: Veaceslav Falico; +Cc: netdev, Jay Vosburgh, Andy Gospodarek

On 08/05/2013 11:26 AM, Veaceslav Falico wrote:
> Currently, we might easily deadlock with bond_set_rx_mode() and
> bond_hw_addr_swap(). bond_set_rx_mode() is called via dev_set_rx_mode(),
> which already holds netif_addr_lock_bh(bond), and inside it takes
> bond->curr_slave_lock, while bond_hw_addr_swap() is called with
> bond->curr_slave_lock held and then takes netif_addr_lock_bh(bond).
> The two paths take the locks in opposite order, which results in deadlock.
> 
> CPU0                    CPU1
> ----                    ----
> lock(&bonding_netdev_addr_lock_key);
> 			lock(&bond->curr_slave_lock);
> 			lock(&bonding_netdev_addr_lock_key);
> lock(&bond->curr_slave_lock);
> 
> Fix this by using the RCU primitives in bond_set_rx_mode(). We're safe wrt
> racing of dev_{uc,mc}_(un)sync() because we hold
> lock(&bonding_netdev_addr_lock_key), and thus nobody will be able to modify
> these lists before we finish.
>
Hi,
I don't think this deadlock can actually happen, because bond_hw_addr_swap() is
called from bond_change_active_slave() only in USES_PRIMARY mode, and in that
mode it's always called with rtnl already held. Since dev_set_rx_mode() is also
called with rtnl held, IMO such a deadlock can't happen.
Also, I think bond_set_rx_mode() can work without RCU because rtnl is held: it
can be converted to ASSERT_RTNL() (this is optional) + rtnl_dereference() for
curr_active_slave.

Cheers,
 Nik

> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> CC: Nikolay Aleksandrov <nikolay@redhat.com>
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
> ---

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] bonding: RCUify bond_set_rx_mode()
  2013-08-05 10:21 ` Nikolay Aleksandrov
@ 2013-08-05 12:31   ` Veaceslav Falico
  0 siblings, 0 replies; 6+ messages in thread
From: Veaceslav Falico @ 2013-08-05 12:31 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev, Jay Vosburgh, Andy Gospodarek

On Mon, Aug 05, 2013 at 12:21:56PM +0200, Nikolay Aleksandrov wrote:
>On 08/05/2013 11:26 AM, Veaceslav Falico wrote:
>> Currently, we might easily deadlock with bond_set_rx_mode() and
>> bond_hw_addr_swap(). bond_set_rx_mode() is called via dev_set_rx_mode(),
>> which already holds netif_addr_lock_bh(bond), and inside it takes
>> bond->curr_slave_lock, while bond_hw_addr_swap() is called with
>> bond->curr_slave_lock held and then takes netif_addr_lock_bh(bond).
>> The two paths take the locks in opposite order, which results in deadlock.
>>
>> CPU0                    CPU1
>> ----                    ----
>> lock(&bonding_netdev_addr_lock_key);
>> 			lock(&bond->curr_slave_lock);
>> 			lock(&bonding_netdev_addr_lock_key);
>> lock(&bond->curr_slave_lock);
>>
>> Fix this by using the RCU primitives in bond_set_rx_mode(). We're safe wrt
>> racing of dev_{uc,mc}_(un)sync() because we hold
>> lock(&bonding_netdev_addr_lock_key), and thus nobody will be able to modify
>> these lists before we finish.
>>
>Hi,
>I don't think this deadlock can actually happen, because bond_hw_addr_swap() is
>called from bond_change_active_slave() only in USES_PRIMARY mode, and in that
>mode it's always called with rtnl already held. Since dev_set_rx_mode() is also
>called with rtnl held, IMO such a deadlock can't happen.

Yep, indeed, I missed the part with USES_PRIMARY(). So lockdep raised a
false alarm.

>Also, I think bond_set_rx_mode() can work without RCU because rtnl is held: it
>can be converted to ASSERT_RTNL() (this is optional) + rtnl_dereference() for
>curr_active_slave.

Yes, we don't need real RCU because we're under rtnl and everybody else
who touches it is also under rtnl. Awesome catch.

Thanks, will resubmit another patch (hard to call it v2...).

>
>Cheers,
> Nik
>
>> CC: Jay Vosburgh <fubar@us.ibm.com>
>> CC: Andy Gospodarek <andy@greyhouse.net>
>> CC: Nikolay Aleksandrov <nikolay@redhat.com>
>> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>> ---
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] bonding: RCUify bond_set_rx_mode()
  2013-08-05  9:26 [PATCH net-next] bonding: RCUify bond_set_rx_mode() Veaceslav Falico
  2013-08-05 10:21 ` Nikolay Aleksandrov
@ 2013-08-05 18:31 ` Veaceslav Falico
  1 sibling, 0 replies; 6+ messages in thread
From: Veaceslav Falico @ 2013-08-05 18:31 UTC (permalink / raw)
  To: netdev; +Cc: Jay Vosburgh, Andy Gospodarek, Nikolay Aleksandrov

On Mon, Aug 05, 2013 at 11:26:16AM +0200, Veaceslav Falico wrote:
>Currently, we might easily deadlock with bond_set_rx_mode() and
>bond_hw_addr_swap(). bond_set_rx_mode() is called via dev_set_rx_mode(),
>which already holds netif_addr_lock_bh(bond), and inside it takes
>bond->curr_slave_lock, while bond_hw_addr_swap() is called with
>bond->curr_slave_lock held and then takes netif_addr_lock_bh(bond).
>The two paths take the locks in opposite order, which results in deadlock.
>
>CPU0                    CPU1
>----                    ----
>lock(&bonding_netdev_addr_lock_key);
>			lock(&bond->curr_slave_lock);
>			lock(&bonding_netdev_addr_lock_key);
>lock(&bond->curr_slave_lock);
>
>Fix this by using the RCU primitives in bond_set_rx_mode(). We're safe wrt
>racing of dev_{uc,mc}_(un)sync() because we hold
>lock(&bonding_netdev_addr_lock_key), and thus nobody will be able to modify
>these lists before we finish.
>
>CC: Jay Vosburgh <fubar@us.ibm.com>
>CC: Andy Gospodarek <andy@greyhouse.net>
>CC: Nikolay Aleksandrov <nikolay@redhat.com>
>Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

Self-NAK, for clarity. Posted a reworked patch - "[net-next] bonding: remove
locking from bond_set_rx_mode()" for the same issue.

>---
> drivers/net/bonding/bond_main.c |   10 ++++------
> 1 files changed, 4 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 476df7d..fdc01c6 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -3571,24 +3571,22 @@ static void bond_set_rx_mode(struct net_device *bond_dev)
> 	struct bonding *bond = netdev_priv(bond_dev);
> 	struct slave *slave;
>
>-	read_lock(&bond->lock);
>+	rcu_read_lock();
>
> 	if (USES_PRIMARY(bond->params.mode)) {
>-		read_lock(&bond->curr_slave_lock);
>-		slave = bond->curr_active_slave;
>+		slave = rcu_dereference(bond->curr_active_slave);
> 		if (slave) {
> 			dev_uc_sync(slave->dev, bond_dev);
> 			dev_mc_sync(slave->dev, bond_dev);
> 		}
>-		read_unlock(&bond->curr_slave_lock);
> 	} else {
>-		bond_for_each_slave(bond, slave) {
>+		bond_for_each_slave_rcu(bond, slave) {
> 			dev_uc_sync_multiple(slave->dev, bond_dev);
> 			dev_mc_sync_multiple(slave->dev, bond_dev);
> 		}
> 	}
>
>-	read_unlock(&bond->lock);
>+	rcu_read_unlock();
> }
>
> static int bond_neigh_init(struct neighbour *n)
>-- 
>1.7.1
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] bonding: RCUify bond_set_rx_mode()
  2013-09-28 19:18 Veaceslav Falico
@ 2013-10-01  5:27 ` David Miller
  0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2013-10-01  5:27 UTC (permalink / raw)
  To: vfalico; +Cc: netdev, joe.lawrence, fubar, andy

From: Veaceslav Falico <vfalico@redhat.com>
Date: Sat, 28 Sep 2013 21:18:56 +0200

> Currently we rely on rtnl being held in bond_set_rx_mode(); however, that's
> not always the case:
> 
> RTNL: assertion failed at drivers/net/bonding/bond_main.c (3391)
> ...
>  [<ffffffff81651ca5>] dump_stack+0x54/0x74
>  [<ffffffffa029e717>] bond_set_rx_mode+0xc7/0xd0 [bonding]
>  [<ffffffff81553af7>] __dev_set_rx_mode+0x57/0xa0
>  [<ffffffff81557ff8>] __dev_mc_add+0x58/0x70
>  [<ffffffff81558020>] dev_mc_add+0x10/0x20
>  [<ffffffff8161e26e>] igmp6_group_added+0x18e/0x1d0
>  [<ffffffff81186f76>] ? kmem_cache_alloc_trace+0x236/0x260
>  [<ffffffff8161f80f>] ipv6_dev_mc_inc+0x29f/0x320
>  [<ffffffff8161f9e7>] ipv6_sock_mc_join+0x157/0x260
> ...
> 
> Fix this by using RCU primitives.
> 
> Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
> Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

Applied, thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH net-next] bonding: RCUify bond_set_rx_mode()
@ 2013-09-28 19:18 Veaceslav Falico
  2013-10-01  5:27 ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Veaceslav Falico @ 2013-09-28 19:18 UTC (permalink / raw)
  To: netdev; +Cc: joe.lawrence, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek

Currently we rely on rtnl being held in bond_set_rx_mode(); however, that's
not always the case:

RTNL: assertion failed at drivers/net/bonding/bond_main.c (3391)
...
 [<ffffffff81651ca5>] dump_stack+0x54/0x74
 [<ffffffffa029e717>] bond_set_rx_mode+0xc7/0xd0 [bonding]
 [<ffffffff81553af7>] __dev_set_rx_mode+0x57/0xa0
 [<ffffffff81557ff8>] __dev_mc_add+0x58/0x70
 [<ffffffff81558020>] dev_mc_add+0x10/0x20
 [<ffffffff8161e26e>] igmp6_group_added+0x18e/0x1d0
 [<ffffffff81186f76>] ? kmem_cache_alloc_trace+0x236/0x260
 [<ffffffff8161f80f>] ipv6_dev_mc_inc+0x29f/0x320
 [<ffffffff8161f9e7>] ipv6_sock_mc_join+0x157/0x260
...

Fix this by using RCU primitives.

Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_main.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index d5c3153..996d196 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3393,20 +3393,21 @@ static void bond_set_rx_mode(struct net_device *bond_dev)
 	struct list_head *iter;
 	struct slave *slave;
 
-	ASSERT_RTNL();
 
+	rcu_read_lock();
 	if (USES_PRIMARY(bond->params.mode)) {
-		slave = rtnl_dereference(bond->curr_active_slave);
+		slave = rcu_dereference(bond->curr_active_slave);
 		if (slave) {
 			dev_uc_sync(slave->dev, bond_dev);
 			dev_mc_sync(slave->dev, bond_dev);
 		}
 	} else {
-		bond_for_each_slave(bond, slave, iter) {
+		bond_for_each_slave_rcu(bond, slave, iter) {
 			dev_uc_sync_multiple(slave->dev, bond_dev);
 			dev_mc_sync_multiple(slave->dev, bond_dev);
 		}
 	}
+	rcu_read_unlock();
 }
 
 static int bond_neigh_init(struct neighbour *n)
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-10-01  5:20 UTC | newest]

Thread overview: 6+ messages
2013-08-05  9:26 [PATCH net-next] bonding: RCUify bond_set_rx_mode() Veaceslav Falico
2013-08-05 10:21 ` Nikolay Aleksandrov
2013-08-05 12:31   ` Veaceslav Falico
2013-08-05 18:31 ` Veaceslav Falico
2013-09-28 19:18 Veaceslav Falico
2013-10-01  5:27 ` David Miller
