netdev.vger.kernel.org archive mirror
* [PATCH net-next 0/2] bonding: fix locking in bond_ab_arp_probe
@ 2014-01-23 11:16 Veaceslav Falico
  2014-01-23 11:16 ` [PATCH net-next 1/2] bonding: RCUify bond_ab_arp_probe Veaceslav Falico
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Veaceslav Falico @ 2014-01-23 11:16 UTC (permalink / raw)
  To: netdev; +Cc: Jay Vosburgh, Andy Gospodarek, Veaceslav Falico

Hi,

After the latest patches, on every call of bond_ab_arp_probe() without an
active slave I see the following warning:

[    7.912314] RTNL: assertion failed at net/core/dev.c (4494)
...
[    7.922495]  [<ffffffff817acc6f>] dump_stack+0x51/0x72
[    7.923714]  [<ffffffff8168795e>] netdev_master_upper_dev_get+0x6e/0x70
[    7.924940]  [<ffffffff816a2a66>] rtnl_link_fill+0x116/0x260
[    7.926143]  [<ffffffff817acc6f>] ? dump_stack+0x51/0x72
[    7.927333]  [<ffffffff816a350c>] rtnl_fill_ifinfo+0x95c/0xb90
[    7.928529]  [<ffffffff8167af2b>] ? __kmalloc_reserve+0x3b/0xa0
[    7.929681]  [<ffffffff8167bfcf>] ? __alloc_skb+0x9f/0x1e0
[    7.930827]  [<ffffffff816a3b64>] rtmsg_ifinfo+0x84/0x100
[    7.931960]  [<ffffffffa00bca07>] bond_ab_arp_probe+0x1a7/0x370 [bonding]
[    7.933133]  [<ffffffffa00bcd78>] bond_activebackup_arp_mon+0x1a8/0x2f0 [bonding]
...

It happens because in bond_ab_arp_probe() we change the flags of a slave
without holding the RTNL lock.

To fix this, remove the unneeded curr_slave_lock, convert the function
fully to RCU, and take RTNL while changing the slave's flags.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: netdev@vger.kernel.org
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

---
 drivers/net/bonding/bond_main.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)


* [PATCH net-next 1/2] bonding: RCUify bond_ab_arp_probe
  2014-01-23 11:16 [PATCH net-next 0/2] bonding: fix locking in bond_ab_arp_probe Veaceslav Falico
@ 2014-01-23 11:16 ` Veaceslav Falico
  2014-01-23 11:16 ` [PATCH net-next 2/2] bonding: lock RTNL when setting (in)active slave flags Veaceslav Falico
  2014-01-23 11:25 ` [PATCH net-next 0/2] bonding: fix locking in bond_ab_arp_probe Veaceslav Falico
  2 siblings, 0 replies; 4+ messages in thread
From: Veaceslav Falico @ 2014-01-23 11:16 UTC (permalink / raw)
  To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek

Currently bond_ab_arp_probe() is always called under rcu_read_lock(), yet
it still takes curr_slave_lock to work with curr_active_slave.

Drop curr_slave_lock: rcu_dereference() the bond's curr_active_slave once
and use that local pointer from then on. RCU guarantees that the slave
won't go away while we use it, and we don't care if curr_active_slave
changes in the meantime.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_main.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index f9e0c8b..22d8b69 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2605,25 +2605,21 @@ do_failover:
 static void bond_ab_arp_probe(struct bonding *bond)
 {
 	struct slave *slave, *before = NULL, *new_slave = NULL,
-		     *curr_arp_slave = rcu_dereference(bond->current_arp_slave);
+		     *curr_arp_slave = rcu_dereference(bond->current_arp_slave),
+		     *curr_active_slave = rcu_dereference(bond->curr_active_slave);
 	struct list_head *iter;
 	bool found = false;
 
-	read_lock(&bond->curr_slave_lock);
-
-	if (curr_arp_slave && bond->curr_active_slave)
+	if (curr_arp_slave && curr_active_slave)
 		pr_info("PROBE: c_arp %s && cas %s BAD\n",
 			curr_arp_slave->dev->name,
-			bond->curr_active_slave->dev->name);
+			curr_active_slave->dev->name);
 
-	if (bond->curr_active_slave) {
-		bond_arp_send_all(bond, bond->curr_active_slave);
-		read_unlock(&bond->curr_slave_lock);
+	if (curr_active_slave) {
+		bond_arp_send_all(bond, curr_active_slave);
 		return;
 	}
 
-	read_unlock(&bond->curr_slave_lock);
-
 	/* if we don't have a curr_active_slave, search for the next available
 	 * backup slave from the current_arp_slave and make it the candidate
 	 * for becoming the curr_active_slave
-- 
1.8.4
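
The "dereference once, then use the local copy" idea from the patch above
can be illustrated outside the kernel. The following is only a userspace
sketch under stated assumptions: struct slave and struct bonding are
reduced, hypothetical stand-ins, snapshot_active() and probe_target() are
made-up helper names, and a C11 atomic load stands in for
rcu_dereference(). It does not model the RCU grace period that keeps the
slave alive in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct slave {
	const char *name;
};

struct bonding {
	_Atomic(struct slave *) curr_active_slave;
};

/*
 * Userspace analogue of rcu_dereference(): a single load of the shared
 * pointer. Acquire ordering is at least as strong as what is needed to
 * see the pointed-to data initialized.
 */
static struct slave *snapshot_active(struct bonding *bond)
{
	return atomic_load_explicit(&bond->curr_active_slave,
				    memory_order_acquire);
}

/*
 * Returns the name of the slave that would be probed, or NULL if there is
 * no active slave. The shared pointer is loaded exactly once; the NULL
 * check and the dereference both go through the local copy, so a
 * concurrent writer cannot make them disagree.
 */
static const char *probe_target(struct bonding *bond)
{
	struct slave *active = snapshot_active(bond);

	if (!active)
		return NULL;
	return active->name;
}
```

Re-reading bond->curr_active_slave for each use, as the pre-patch code did,
is only safe while curr_slave_lock is held; the single snapshot is what
makes the lock unnecessary for this reader.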


* [PATCH net-next 2/2] bonding: lock RTNL when setting (in)active slave flags
  2014-01-23 11:16 [PATCH net-next 0/2] bonding: fix locking in bond_ab_arp_probe Veaceslav Falico
  2014-01-23 11:16 ` [PATCH net-next 1/2] bonding: RCUify bond_ab_arp_probe Veaceslav Falico
@ 2014-01-23 11:16 ` Veaceslav Falico
  2014-01-23 11:25 ` [PATCH net-next 0/2] bonding: fix locking in bond_ab_arp_probe Veaceslav Falico
  2 siblings, 0 replies; 4+ messages in thread
From: Veaceslav Falico @ 2014-01-23 11:16 UTC (permalink / raw)
  To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek

Currently, on an (in)active slave flag change, we notify the stack via
rtmsg_ifinfo(), which requires the RTNL lock to be held.

However, bond_ab_arp_probe() does not hold it when there is no
curr_active_slave, which triggers a warning and might race with other
slave flag modifications.

Fix this by wrapping the flag changes in the RTNL lock. This is not a hot
path (it runs once every arp_interval), so there should be no performance
impact.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 22d8b69..50cddb9 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2631,9 +2631,11 @@ static void bond_ab_arp_probe(struct bonding *bond)
 			return;
 	}
 
+	rtnl_lock();
+
 	bond_set_slave_inactive_flags(curr_arp_slave);
 
-	bond_for_each_slave_rcu(bond, slave, iter) {
+	bond_for_each_slave(bond, slave, iter) {
 		if (!found && !before && IS_UP(slave->dev))
 			before = slave;
 
@@ -2660,6 +2662,8 @@ static void bond_ab_arp_probe(struct bonding *bond)
 			found = true;
 	}
 
+	rtnl_unlock();
+
 	if (!new_slave && before)
 		new_slave = before;
 
-- 
1.8.4
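
The warning in the cover letter comes from a "lock must be held" assertion
firing inside the notification path. A rough userspace sketch of that
pattern follows. Everything in it is a hypothetical stand-in: cfg_lock()
plays the role of rtnl_lock() (as a simple spin flag, not a sleeping
mutex), assert_cfg_locked() plays the role of ASSERT_RTNL(), and
set_slave_inactive() plays the role of a flag setter that notifies the
stack.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the RTNL mutex: a spin flag, for brevity. */
static atomic_bool cfg_locked;
static int assertion_failures;

static void cfg_lock(void)
{
	bool expected = false;

	/* Spin until we flip the flag from false to true. */
	while (!atomic_compare_exchange_weak(&cfg_locked, &expected, true))
		expected = false;
}

static void cfg_unlock(void)
{
	atomic_store(&cfg_locked, false);
}

/*
 * Analogue of ASSERT_RTNL(): record a failure but keep running. It checks
 * only that the lock is held by someone, not that the caller is the owner.
 */
static void assert_cfg_locked(void)
{
	if (!atomic_load(&cfg_locked))
		assertion_failures++;
}

struct slave {
	bool backup;
};

/* Flag setter whose notification step requires the lock to be held. */
static void set_slave_inactive(struct slave *s)
{
	assert_cfg_locked();
	s->backup = true;
}
```

Calling set_slave_inactive() without cfg_lock() bumps assertion_failures,
which mirrors the "RTNL: assertion failed" message; wrapping the call in
cfg_lock()/cfg_unlock() silences it, which is the shape of the change in
the diff above.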


* Re: [PATCH net-next 0/2] bonding: fix locking in bond_ab_arp_probe
  2014-01-23 11:16 [PATCH net-next 0/2] bonding: fix locking in bond_ab_arp_probe Veaceslav Falico
  2014-01-23 11:16 ` [PATCH net-next 1/2] bonding: RCUify bond_ab_arp_probe Veaceslav Falico
  2014-01-23 11:16 ` [PATCH net-next 2/2] bonding: lock RTNL when setting (in)active slave flags Veaceslav Falico
@ 2014-01-23 11:25 ` Veaceslav Falico
  2 siblings, 0 replies; 4+ messages in thread
From: Veaceslav Falico @ 2014-01-23 11:25 UTC (permalink / raw)
  To: netdev; +Cc: Jay Vosburgh, Andy Gospodarek

On Thu, Jan 23, 2014 at 12:16:02PM +0100, Veaceslav Falico wrote:
>Hi,
>
>After the latest patches, on every call of bond_ab_arp_probe() without an
>active slave I see the following warning:

Self-NAK, there are still warnings coming out of bond_ab_arp_probe() and
other parts. Calling functions that need RTNL from such low-level
functions wasn't the best idea...

Will send v2 to fix this warning.
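
One plausible reading of the problem (an assumption, not something stated
in this thread): rtnl_lock() may sleep, while bond_ab_arp_probe() runs
under rcu_read_lock(), where sleeping is not allowed. A general way to
handle that kind of conflict is to only decide inside the read-side
section and apply the change after leaving it, using a trylock and
retrying later on contention. The userspace sketch below illustrates that
pattern only; it is not taken from v2, and every name in it (cfg_trylock,
read_depth, probe, monitor_tick) is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool cfg_locked;	/* stand-in for the RTNL mutex */
static int read_depth;		/* stand-in for RCU read-side nesting */

static bool cfg_trylock(void)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&cfg_locked, &expected, true);
}

static void cfg_unlock(void)
{
	atomic_store(&cfg_locked, false);
}

struct slave {
	bool backup;
};

/*
 * Runs inside the read-side section: it only decides whether a flag
 * change is needed and never takes the lock.
 */
static bool probe(const struct slave *s)
{
	assert(read_depth > 0);
	return !s->backup;
}

/*
 * One monitor pass. Returns false if the change had to be postponed
 * because the lock was busy, so the caller can re-arm and retry.
 */
static bool monitor_tick(struct slave *s)
{
	bool pending;

	read_depth++;		/* enter read-side section */
	pending = probe(s);
	read_depth--;		/* leave it before touching the lock */

	if (!pending)
		return true;
	if (!cfg_trylock())
		return false;
	s->backup = true;	/* locked update, outside the read side */
	cfg_unlock();
	return true;
}
```

The trade-off is that a flag change can be delayed by one monitor interval
when the lock is contended, in exchange for never blocking inside the
read-side section.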

>
>[    7.912314] RTNL: assertion failed at net/core/dev.c (4494)
>...
>[    7.922495]  [<ffffffff817acc6f>] dump_stack+0x51/0x72
>[    7.923714]  [<ffffffff8168795e>] netdev_master_upper_dev_get+0x6e/0x70
>[    7.924940]  [<ffffffff816a2a66>] rtnl_link_fill+0x116/0x260
>[    7.926143]  [<ffffffff817acc6f>] ? dump_stack+0x51/0x72
>[    7.927333]  [<ffffffff816a350c>] rtnl_fill_ifinfo+0x95c/0xb90
>[    7.928529]  [<ffffffff8167af2b>] ? __kmalloc_reserve+0x3b/0xa0
>[    7.929681]  [<ffffffff8167bfcf>] ? __alloc_skb+0x9f/0x1e0
>[    7.930827]  [<ffffffff816a3b64>] rtmsg_ifinfo+0x84/0x100
>[    7.931960]  [<ffffffffa00bca07>] bond_ab_arp_probe+0x1a7/0x370 [bonding]
>[    7.933133]  [<ffffffffa00bcd78>] bond_activebackup_arp_mon+0x1a8/0x2f0 [bonding]
>...
>
>It happens because in bond_ab_arp_probe() we change the flags of a slave
>without holding the RTNL lock.
>
>To fix this, remove the unneeded curr_slave_lock, convert the function
>fully to RCU, and take RTNL while changing the slave's flags.
>
>CC: Jay Vosburgh <fubar@us.ibm.com>
>CC: Andy Gospodarek <andy@greyhouse.net>
>CC: netdev@vger.kernel.org
>Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>
>---
> drivers/net/bonding/bond_main.c | 22 +++++++++++-----------
> 1 file changed, 11 insertions(+), 11 deletions(-)
>

