netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v2 0/2] Fix RTNL: assertion failed at net/core/rtnetlink.c
@ 2014-02-19  9:05 Ding Tianhong
  2014-02-19  9:05 ` [PATCH net v2 1/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad mode Ding Tianhong
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Ding Tianhong @ 2014-02-19  9:05 UTC (permalink / raw)
  To: fubar, vfalico, andy
  Cc: cwang, jiri, thomas, eric.dumazet, sfeldma, davem, netdev

The commit 1d3ee88ae0d
(bonding: add netlink attributes to slave link dev)
make the bond_set_active_slave() and bond_set_backup_slave()
use rtmsg_ifinfo to send slave's states and this functions
should be called in RTNL.

But the 902.3ad and ARP monitor did not hold the RTNL when calling
thses two functions, so fix them.

v1->v2: Add new micro to indicate that the notification should be send
	later, not never.
	And add a new patch to fix the same problem for ARP mode.

Ding Tianhong (2):
  bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for
    802.3ad mode
  bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab
    arp monitor

 drivers/net/bonding/bond_3ad.c  |  16 +++++-
 drivers/net/bonding/bond_main.c | 117 ++++++++++++++++++++++------------------
 drivers/net/bonding/bonding.h   |  47 ++++++++++++++--
 3 files changed, 121 insertions(+), 59 deletions(-)

-- 
1.8.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH net v2 1/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad mode
  2014-02-19  9:05 [PATCH net v2 0/2] Fix RTNL: assertion failed at net/core/rtnetlink.c Ding Tianhong
@ 2014-02-19  9:05 ` Ding Tianhong
  2014-02-25 22:32   ` David Miller
  2014-02-19  9:05 ` [PATCH net v2 2/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitor Ding Tianhong
  2014-02-19 10:52 ` [PATCH net v2 0/2] Fix RTNL: assertion failed at net/core/rtnetlink.c Thomas Glanzmann
  2 siblings, 1 reply; 7+ messages in thread
From: Ding Tianhong @ 2014-02-19  9:05 UTC (permalink / raw)
  To: fubar, vfalico, andy
  Cc: cwang, jiri, thomas, eric.dumazet, sfeldma, davem, netdev

The problem was introduced by the commit 1d3ee88ae0d
(bonding: add netlink attributes to slave link dev).
The bond_set_active_slave() and bond_set_backup_slave()
will use rtmsg_ifinfo to send slave's states, so these
two functions should be called in RTNL.

In 802.3ad mode, acquiring RTNL for the __enable_port and
__disable_port cases is difficult, as those calls generally
already hold the state machine lock, and cannot unconditionally
call rtnl_lock because either they already hold RTNL (for calls
via bond_3ad_unbind_slave) or due to the potential for deadlock
with bond_3ad_adapter_speed_changed, bond_3ad_adapter_duplex_changed,
bond_3ad_link_change, or bond_3ad_update_lacp_rate.  All four of
those are called with RTNL held, and acquire the state machine lock
second.  The calling contexts for __enable_port and __disable_port
already hold the state machine lock, and may or may not need RTNL.

According to the Jay's opinion, I don't think it is a problem that
the slave don't send notify message synchronously when the status
changed, normally the state machine is running every 100 ms, send
the notify message at the end of the state machine if the slave's
state changed should be better.

I fix the problem through these steps:

1). add a new function bond_set_slave_state() which could change
    the slave's state and call rtmsg_ifinfo() according to the input
    parameters called notify.

2). Add a new slave parameter which called should_notify, if the slave's state
    changed and don't notify yet, the parameter will be set to 1, and then if
    the slave's state changed again, the param will be set to 0, it indicate that
    the slave's state has been restored, no need to notify any one.

3). the __enable_port and __disable_port should not call rtmsg_ifinfo
    in the state machine lock, any change in the state of slave could
    set a flag in the slave, it will indicated that an rtmsg_ifinfo
    should be called at the end of the state machine.

Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/bonding/bond_3ad.c  | 22 ++++++++++++++++++++--
 drivers/net/bonding/bond_main.c | 41 ++++++++++++++++++++++++++---------------
 drivers/net/bonding/bonding.h   | 34 +++++++++++++++++++++++++++++-----
 3 files changed, 75 insertions(+), 22 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 6d20fbd..d4ace0b 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -181,7 +181,7 @@ static inline int __agg_has_partner(struct aggregator *agg)
  */
 static inline void __disable_port(struct port *port)
 {
-	bond_set_slave_inactive_flags(port->slave);
+	bond_set_slave_inactive_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
 }
 
 /**
@@ -193,7 +193,7 @@ static inline void __enable_port(struct port *port)
 	struct slave *slave = port->slave;
 
 	if ((slave->link == BOND_LINK_UP) && IS_UP(slave->dev))
-		bond_set_slave_active_flags(slave);
+		bond_set_slave_active_flags(slave, BOND_SLAVE_NOTIFY_LATER);
 }
 
 /**
@@ -2062,6 +2062,7 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
 	struct list_head *iter;
 	struct slave *slave;
 	struct port *port;
+	int slave_should_notify = 0;
 
 	read_lock(&bond->lock);
 	rcu_read_lock();
@@ -2119,8 +2120,25 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
 	}
 
 re_arm:
+	bond_for_each_slave_rcu(bond, slave, iter) {
+		if (slave->should_notify) {
+			slave_should_notify = 1;
+			break;
+		}
+	}
 	rcu_read_unlock();
 	read_unlock(&bond->lock);
+
+	if (slave_should_notify && rtnl_trylock()) {
+		bond_for_each_slave(bond, slave, iter) {
+			if (slave->should_notify) {
+				rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0,
+					     GFP_KERNEL);
+				slave->should_notify = 0;
+			}
+		}
+		rtnl_unlock();
+	}
 	queue_delayed_work(bond->wq, &bond->ad_work, ad_delta_in_ticks);
 }
 
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 1c6104d..e02029b 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -829,21 +829,25 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
 	if (bond_is_lb(bond)) {
 		bond_alb_handle_active_change(bond, new_active);
 		if (old_active)
-			bond_set_slave_inactive_flags(old_active);
+			bond_set_slave_inactive_flags(old_active,
+						      BOND_SLAVE_NOTIFY_NOW);
 		if (new_active)
-			bond_set_slave_active_flags(new_active);
+			bond_set_slave_active_flags(new_active,
+						    BOND_SLAVE_NOTIFY_NOW);
 	} else {
 		rcu_assign_pointer(bond->curr_active_slave, new_active);
 	}
 
 	if (bond->params.mode == BOND_MODE_ACTIVEBACKUP) {
 		if (old_active)
-			bond_set_slave_inactive_flags(old_active);
+			bond_set_slave_inactive_flags(old_active,
+						      BOND_SLAVE_NOTIFY_NOW);
 
 		if (new_active) {
 			bool should_notify_peers = false;
 
-			bond_set_slave_active_flags(new_active);
+			bond_set_slave_active_flags(new_active,
+						    BOND_SLAVE_NOTIFY_NOW);
 
 			if (bond->params.fail_over_mac)
 				bond_do_fail_over_mac(bond, new_active,
@@ -1463,14 +1467,15 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 
 	switch (bond->params.mode) {
 	case BOND_MODE_ACTIVEBACKUP:
-		bond_set_slave_inactive_flags(new_slave);
+		bond_set_slave_inactive_flags(new_slave,
+					      BOND_SLAVE_NOTIFY_NOW);
 		break;
 	case BOND_MODE_8023AD:
 		/* in 802.3ad mode, the internal mechanism
 		 * will activate the slaves in the selected
 		 * aggregator
 		 */
-		bond_set_slave_inactive_flags(new_slave);
+		bond_set_slave_inactive_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
 		/* if this is the first slave */
 		if (!prev_slave) {
 			SLAVE_AD_INFO(new_slave).id = 1;
@@ -1488,7 +1493,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 	case BOND_MODE_TLB:
 	case BOND_MODE_ALB:
 		bond_set_active_slave(new_slave);
-		bond_set_slave_inactive_flags(new_slave);
+		bond_set_slave_inactive_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
 		break;
 	default:
 		pr_debug("This slave is always active in trunk mode\n");
@@ -2015,7 +2020,8 @@ static void bond_miimon_commit(struct bonding *bond)
 
 			if (bond->params.mode == BOND_MODE_ACTIVEBACKUP ||
 			    bond->params.mode == BOND_MODE_8023AD)
-				bond_set_slave_inactive_flags(slave);
+				bond_set_slave_inactive_flags(slave,
+							      BOND_SLAVE_NOTIFY_NOW);
 
 			pr_info("%s: link status definitely down for interface %s, disabling it\n",
 				bond->dev->name, slave->dev->name);
@@ -2562,7 +2568,8 @@ static void bond_ab_arp_commit(struct bonding *bond)
 				slave->link = BOND_LINK_UP;
 				if (bond->current_arp_slave) {
 					bond_set_slave_inactive_flags(
-						bond->current_arp_slave);
+						bond->current_arp_slave,
+						BOND_SLAVE_NOTIFY_NOW);
 					bond->current_arp_slave = NULL;
 				}
 
@@ -2582,7 +2589,8 @@ static void bond_ab_arp_commit(struct bonding *bond)
 				slave->link_failure_count++;
 
 			slave->link = BOND_LINK_DOWN;
-			bond_set_slave_inactive_flags(slave);
+			bond_set_slave_inactive_flags(slave,
+						      BOND_SLAVE_NOTIFY_NOW);
 
 			pr_info("%s: link status definitely down for interface %s, disabling it\n",
 				bond->dev->name, slave->dev->name);
@@ -2657,7 +2665,7 @@ static bool bond_ab_arp_probe(struct bonding *bond)
 		}
 	}
 
-	bond_set_slave_inactive_flags(curr_arp_slave);
+	bond_set_slave_inactive_flags(curr_arp_slave, BOND_SLAVE_NOTIFY_NOW);
 
 	bond_for_each_slave(bond, slave, iter) {
 		if (!found && !before && IS_UP(slave->dev))
@@ -2677,7 +2685,8 @@ static bool bond_ab_arp_probe(struct bonding *bond)
 			if (slave->link_failure_count < UINT_MAX)
 				slave->link_failure_count++;
 
-			bond_set_slave_inactive_flags(slave);
+			bond_set_slave_inactive_flags(slave,
+						      BOND_SLAVE_NOTIFY_NOW);
 
 			pr_info("%s: backup interface %s is now down.\n",
 				bond->dev->name, slave->dev->name);
@@ -2695,7 +2704,7 @@ static bool bond_ab_arp_probe(struct bonding *bond)
 	}
 
 	new_slave->link = BOND_LINK_BACK;
-	bond_set_slave_active_flags(new_slave);
+	bond_set_slave_active_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
 	bond_arp_send_all(bond, new_slave);
 	new_slave->jiffies = jiffies;
 	rcu_assign_pointer(bond->current_arp_slave, new_slave);
@@ -3046,9 +3055,11 @@ static int bond_open(struct net_device *bond_dev)
 		bond_for_each_slave(bond, slave, iter) {
 			if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP)
 				&& (slave != bond->curr_active_slave)) {
-				bond_set_slave_inactive_flags(slave);
+				bond_set_slave_inactive_flags(slave,
+							      BOND_SLAVE_NOTIFY_NOW);
 			} else {
-				bond_set_slave_active_flags(slave);
+				bond_set_slave_active_flags(slave,
+							    BOND_SLAVE_NOTIFY_NOW);
 			}
 		}
 		read_unlock(&bond->curr_slave_lock);
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 86ccfb9..b4e6c53 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -195,7 +195,8 @@ struct slave {
 	s8     new_link;
 	u8     backup:1,   /* indicates backup slave. Value corresponds with
 			      BOND_STATE_ACTIVE and BOND_STATE_BACKUP */
-	       inactive:1; /* indicates inactive slave */
+	       inactive:1, /* indicates inactive slave */
+	       should_notify:1; /* indicateds whether the state changed */
 	u8     duplex;
 	u32    original_mtu;
 	u32    link_failure_count;
@@ -303,6 +304,24 @@ static inline void bond_set_backup_slave(struct slave *slave)
 	}
 }
 
+static inline void bond_set_slave_state(struct slave *slave,
+					int slave_state, bool notify)
+{
+	if (slave->backup == slave_state)
+		return;
+
+	slave->backup = slave_state;
+	if (notify) {
+		rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_KERNEL);
+		slave->should_notify = 0;
+	} else {
+		if (slave->should_notify)
+			slave->should_notify = 0;
+		else
+			slave->should_notify = 1;
+	}
+}
+
 static inline void bond_slave_state_change(struct bonding *bond)
 {
 	struct list_head *iter;
@@ -343,6 +362,9 @@ static inline bool bond_is_active_slave(struct slave *slave)
 #define BOND_ARP_VALIDATE_ALL		(BOND_ARP_VALIDATE_ACTIVE | \
 					 BOND_ARP_VALIDATE_BACKUP)
 
+#define BOND_SLAVE_NOTIFY_NOW		true
+#define BOND_SLAVE_NOTIFY_LATER	false
+
 static inline int slave_do_arp_validate(struct bonding *bond,
 					struct slave *slave)
 {
@@ -394,17 +416,19 @@ static inline void bond_netpoll_send_skb(const struct slave *slave,
 }
 #endif
 
-static inline void bond_set_slave_inactive_flags(struct slave *slave)
+static inline void bond_set_slave_inactive_flags(struct slave *slave,
+						 bool notify)
 {
 	if (!bond_is_lb(slave->bond))
-		bond_set_backup_slave(slave);
+		bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
 	if (!slave->bond->params.all_slaves_active)
 		slave->inactive = 1;
 }
 
-static inline void bond_set_slave_active_flags(struct slave *slave)
+static inline void bond_set_slave_active_flags(struct slave *slave,
+					       bool notify)
 {
-	bond_set_active_slave(slave);
+	bond_set_slave_state(slave, BOND_STATE_ACTIVE, notify);
 	slave->inactive = 0;
 }
 
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net v2 2/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitor
  2014-02-19  9:05 [PATCH net v2 0/2] Fix RTNL: assertion failed at net/core/rtnetlink.c Ding Tianhong
  2014-02-19  9:05 ` [PATCH net v2 1/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad mode Ding Tianhong
@ 2014-02-19  9:05 ` Ding Tianhong
  2014-02-25 22:34   ` David Miller
  2014-02-19 10:52 ` [PATCH net v2 0/2] Fix RTNL: assertion failed at net/core/rtnetlink.c Thomas Glanzmann
  2 siblings, 1 reply; 7+ messages in thread
From: Ding Tianhong @ 2014-02-19  9:05 UTC (permalink / raw)
  To: fubar, vfalico, andy
  Cc: cwang, jiri, thomas, eric.dumazet, sfeldma, davem, netdev

Veaceslav has reported and fix this problem by commit f2ebd477f141bc0
(bonding: restructure locking of bond_ab_arp_probe()). According Jay's
opinion, the current solution is not very well, because the notification
is to indicate that the interface has actually changed state in a meaningful
way, but these calls in the ab ARP monitor are internal settings of the flags
to allow the ARP monitor to search for a slave to become active when there are
no active slaves. The flag setting to active or backup is to permit the ARP
monitor's response logic to do the right thing when deciding if the test
slave (current_arp_slave) is up or not.

So the best way to fix the problem is that we should not send a notification
when the slave is in testing state, and check the state at the end of the
monitor, if the slave's state recover, avoid to send pointless notification
twice. And RTNL is really a big lock, hold it regardless the slave's state
changed or not when the current_active_slave is null will loss performance
(every 100ms), so we should hold it only when the slave's state changed and
need to notify.

I revert the old commit and add new modifications.

Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/bonding/bond_3ad.c  |  8 +---
 drivers/net/bonding/bond_main.c | 82 +++++++++++++++++++++--------------------
 drivers/net/bonding/bonding.h   | 13 +++++++
 3 files changed, 56 insertions(+), 47 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index d4ace0b..bbe8448 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2130,13 +2130,7 @@ re_arm:
 	read_unlock(&bond->lock);
 
 	if (slave_should_notify && rtnl_trylock()) {
-		bond_for_each_slave(bond, slave, iter) {
-			if (slave->should_notify) {
-				rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0,
-					     GFP_KERNEL);
-				slave->should_notify = 0;
-			}
-		}
+		bond_slave_state_notify(bond);
 		rtnl_unlock();
 	}
 	queue_delayed_work(bond->wq, &bond->ad_work, ad_delta_in_ticks);
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e02029b..f107be9 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2623,17 +2623,17 @@ do_failover:
 
 /*
  * Send ARP probes for active-backup mode ARP monitor.
+ *
+ * Called with rcu_read_lock hold.
  */
-static bool bond_ab_arp_probe(struct bonding *bond)
+static int bond_ab_arp_probe(struct bonding *bond)
 {
 	struct slave *slave, *before = NULL, *new_slave = NULL,
-		     *curr_arp_slave, *curr_active_slave;
+		     *curr_arp_slave = rcu_dereference(bond->current_arp_slave),
+		     *curr_active_slave = rcu_dereference(bond->curr_active_slave);
 	struct list_head *iter;
 	bool found = false;
-
-	rcu_read_lock();
-	curr_arp_slave = rcu_dereference(bond->current_arp_slave);
-	curr_active_slave = rcu_dereference(bond->curr_active_slave);
+	int slave_should_notify = 0;
 
 	if (curr_arp_slave && curr_active_slave)
 		pr_info("PROBE: c_arp %s && cas %s BAD\n",
@@ -2642,32 +2642,23 @@ static bool bond_ab_arp_probe(struct bonding *bond)
 
 	if (curr_active_slave) {
 		bond_arp_send_all(bond, curr_active_slave);
-		rcu_read_unlock();
-		return true;
+		return slave_should_notify;
 	}
-	rcu_read_unlock();
 
 	/* if we don't have a curr_active_slave, search for the next available
 	 * backup slave from the current_arp_slave and make it the candidate
 	 * for becoming the curr_active_slave
 	 */
 
-	if (!rtnl_trylock())
-		return false;
-	/* curr_arp_slave might have gone away */
-	curr_arp_slave = ACCESS_ONCE(bond->current_arp_slave);
-
 	if (!curr_arp_slave) {
-		curr_arp_slave = bond_first_slave(bond);
-		if (!curr_arp_slave) {
-			rtnl_unlock();
-			return true;
-		}
+		curr_arp_slave = bond_first_slave_rcu(bond);
+		if (!curr_arp_slave)
+			return slave_should_notify;
 	}
 
-	bond_set_slave_inactive_flags(curr_arp_slave, BOND_SLAVE_NOTIFY_NOW);
+	bond_set_slave_inactive_flags(curr_arp_slave, BOND_SLAVE_NOTIFY_LATER);
 
-	bond_for_each_slave(bond, slave, iter) {
+	bond_for_each_slave_rcu(bond, slave, iter) {
 		if (!found && !before && IS_UP(slave->dev))
 			before = slave;
 
@@ -2686,7 +2677,7 @@ static bool bond_ab_arp_probe(struct bonding *bond)
 				slave->link_failure_count++;
 
 			bond_set_slave_inactive_flags(slave,
-						      BOND_SLAVE_NOTIFY_NOW);
+						      BOND_SLAVE_NOTIFY_LATER);
 
 			pr_info("%s: backup interface %s is now down.\n",
 				bond->dev->name, slave->dev->name);
@@ -2698,27 +2689,32 @@ static bool bond_ab_arp_probe(struct bonding *bond)
 	if (!new_slave && before)
 		new_slave = before;
 
-	if (!new_slave) {
-		rtnl_unlock();
-		return true;
-	}
+	if (!new_slave)
+		goto check_state;
 
 	new_slave->link = BOND_LINK_BACK;
-	bond_set_slave_active_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
+	bond_set_slave_active_flags(new_slave, BOND_SLAVE_NOTIFY_LATER);
 	bond_arp_send_all(bond, new_slave);
 	new_slave->jiffies = jiffies;
 	rcu_assign_pointer(bond->current_arp_slave, new_slave);
-	rtnl_unlock();
 
-	return true;
+check_state:
+	bond_for_each_slave_rcu(bond, slave, iter) {
+		if (slave->should_notify) {
+			slave_should_notify = 1;
+			break;
+		}
+	}
+	return slave_should_notify;
 }
 
 static void bond_activebackup_arp_mon(struct work_struct *work)
 {
 	struct bonding *bond = container_of(work, struct bonding,
 					    arp_work.work);
-	bool should_notify_peers = false, should_commit = false;
+	bool should_notify_peers = false;
 	int delta_in_ticks;
+	int slave_should_notify = 0;
 
 	delta_in_ticks = msecs_to_jiffies(bond->params.arp_interval);
 
@@ -2726,11 +2722,12 @@ static void bond_activebackup_arp_mon(struct work_struct *work)
 		goto re_arm;
 
 	rcu_read_lock();
+
 	should_notify_peers = bond_should_notify_peers(bond);
-	should_commit = bond_ab_arp_inspect(bond);
-	rcu_read_unlock();
 
-	if (should_commit) {
+	if (bond_ab_arp_inspect(bond)) {
+		rcu_read_unlock();
+
 		/* Race avoidance with bond_close flush of workqueue */
 		if (!rtnl_trylock()) {
 			delta_in_ticks = 1;
@@ -2739,23 +2736,28 @@ static void bond_activebackup_arp_mon(struct work_struct *work)
 		}
 
 		bond_ab_arp_commit(bond);
+
 		rtnl_unlock();
+		rcu_read_lock();
 	}
 
-	if (!bond_ab_arp_probe(bond)) {
-		/* rtnl locking failed, re-arm */
-		delta_in_ticks = 1;
-		should_notify_peers = false;
-	}
+	slave_should_notify = bond_ab_arp_probe(bond);
+	rcu_read_unlock();
 
 re_arm:
 	if (bond->params.arp_interval)
 		queue_delayed_work(bond->wq, &bond->arp_work, delta_in_ticks);
 
-	if (should_notify_peers) {
+	if (should_notify_peers || slave_should_notify) {
 		if (!rtnl_trylock())
 			return;
-		call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, bond->dev);
+
+		if (should_notify_peers)
+			call_netdevice_notifiers(NETDEV_NOTIFY_PEERS,
+						 bond->dev);
+		if (slave_should_notify)
+			bond_slave_state_notify(bond);
+
 		rtnl_unlock();
 	}
 }
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index b4e6c53..aeed642 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -335,6 +335,19 @@ static inline void bond_slave_state_change(struct bonding *bond)
 	}
 }
 
+static inline void bond_slave_state_notify(struct bonding *bond)
+{
+	struct list_head *iter;
+	struct slave *tmp;
+
+	bond_for_each_slave(bond, tmp, iter) {
+		if (tmp->should_notify) {
+			rtmsg_ifinfo(RTM_NEWLINK, tmp->dev, 0, GFP_KERNEL);
+			tmp->should_notify = 0;
+		}
+	}
+}
+
 static inline int bond_slave_state(struct slave *slave)
 {
 	return slave->backup;
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net v2 0/2] Fix RTNL: assertion failed at net/core/rtnetlink.c
  2014-02-19  9:05 [PATCH net v2 0/2] Fix RTNL: assertion failed at net/core/rtnetlink.c Ding Tianhong
  2014-02-19  9:05 ` [PATCH net v2 1/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad mode Ding Tianhong
  2014-02-19  9:05 ` [PATCH net v2 2/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitor Ding Tianhong
@ 2014-02-19 10:52 ` Thomas Glanzmann
  2 siblings, 0 replies; 7+ messages in thread
From: Thomas Glanzmann @ 2014-02-19 10:52 UTC (permalink / raw)
  To: Ding Tianhong
  Cc: fubar, vfalico, andy, cwang, jiri, eric.dumazet, sfeldma, davem, netdev

Hello Ding,

> v1->v2: Add new micro to indicate that the notification should be send
> 	later, not never.
> 	And add a new patch to fix the same problem for ARP mode.

> Ding Tianhong (2):
>   bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for
>     802.3ad mode
>   bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab
>     arp monitor

the patch did not apply cleanly for me, so I manually merged it.

The resulting patch is here:

(node-62) [~/work/linux-2.6] git diff | pbot
http://pbot.rmdir.de/TDMvhGB775BVPpKvm6NOXg

Find the results here:

(node-62) [~] dmesg | pbot
http://pbot.rmdir.de/nogPmqfGRxxk9zMNBlr0QQ
(node-62) [~] pbot /proc/net/bonding/bond0
http://pbot.rmdir.de/zNSSqmjSI0o1Qvt6DrSEQw
(node-62) [~] pbot /proc/net/bonding/bond1
http://pbot.rmdir.de/CoI00Rguie2P-kOQytOM9w

Tested-by: Thomas Glanzmann <thomas@glanzmann.de>

Cheers,
        Thomas

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net v2 1/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad mode
  2014-02-19  9:05 ` [PATCH net v2 1/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad mode Ding Tianhong
@ 2014-02-25 22:32   ` David Miller
  2014-02-26  1:47     ` Ding Tianhong
  0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2014-02-25 22:32 UTC (permalink / raw)
  To: dingtianhong
  Cc: fubar, vfalico, andy, cwang, jiri, thomas, eric.dumazet, sfeldma, netdev

From: Ding Tianhong <dingtianhong@huawei.com>
Date: Wed, 19 Feb 2014 17:05:10 +0800

> @@ -2062,6 +2062,7 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
>  	struct list_head *iter;
>  	struct slave *slave;
>  	struct port *port;
> +	int slave_should_notify = 0;

Please use bool, true/false, for slave_should_notify.

Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net v2 2/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitor
  2014-02-19  9:05 ` [PATCH net v2 2/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitor Ding Tianhong
@ 2014-02-25 22:34   ` David Miller
  0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2014-02-25 22:34 UTC (permalink / raw)
  To: dingtianhong
  Cc: fubar, vfalico, andy, cwang, jiri, thomas, eric.dumazet, sfeldma, netdev

From: Ding Tianhong <dingtianhong@huawei.com>
Date: Wed, 19 Feb 2014 17:05:11 +0800

> -static bool bond_ab_arp_probe(struct bonding *bond)
> +static int bond_ab_arp_probe(struct bonding *bond)
>  {
>  	struct slave *slave, *before = NULL, *new_slave = NULL,
> -		     *curr_arp_slave, *curr_active_slave;
> +		     *curr_arp_slave = rcu_dereference(bond->current_arp_slave),
> +		     *curr_active_slave = rcu_dereference(bond->curr_active_slave);
>  	struct list_head *iter;
>  	bool found = false;
> -
> -	rcu_read_lock();
> -	curr_arp_slave = rcu_dereference(bond->current_arp_slave);
> -	curr_active_slave = rcu_dereference(bond->curr_active_slave);
> +	int slave_should_notify = 0;

Likewise use bool here and then you don't have to change the return type
of this function.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net v2 1/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad mode
  2014-02-25 22:32   ` David Miller
@ 2014-02-26  1:47     ` Ding Tianhong
  0 siblings, 0 replies; 7+ messages in thread
From: Ding Tianhong @ 2014-02-26  1:47 UTC (permalink / raw)
  To: David Miller
  Cc: fubar, vfalico, andy, cwang, jiri, thomas, eric.dumazet, sfeldma, netdev

On 2014/2/26 6:32, David Miller wrote:
> From: Ding Tianhong <dingtianhong@huawei.com>
> Date: Wed, 19 Feb 2014 17:05:10 +0800
> 
>> @@ -2062,6 +2062,7 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
>>  	struct list_head *iter;
>>  	struct slave *slave;
>>  	struct port *port;
>> +	int slave_should_notify = 0;
> 
> Please use bool, true/false, for slave_should_notify.
> 
> Thanks.
> 
> .
> 

Yes, thanks, also fix the 2 patch.

Ding 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-02-26  1:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-19  9:05 [PATCH net v2 0/2] Fix RTNL: assertion failed at net/core/rtnetlink.c Ding Tianhong
2014-02-19  9:05 ` [PATCH net v2 1/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad mode Ding Tianhong
2014-02-25 22:32   ` David Miller
2014-02-26  1:47     ` Ding Tianhong
2014-02-19  9:05 ` [PATCH net v2 2/2] bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitor Ding Tianhong
2014-02-25 22:34   ` David Miller
2014-02-19 10:52 ` [PATCH net v2 0/2] Fix RTNL: assertion failed at net/core/rtnetlink.c Thomas Glanzmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).