All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/1] bonding: restrict up state in 802.3ad mode
@ 2015-12-17  8:03 zyjzyj2000
  2015-12-17 21:57 ` Jay Vosburgh
  0 siblings, 1 reply; 52+ messages in thread
From: zyjzyj2000 @ 2015-12-17  8:03 UTC (permalink / raw)
  To: j.vosburgh, vfalico, gospo, netdev, Boris.Shteinbock

From: Zhu Yanjun <zyjzyj2000@gmail.com>

In 802.3ad mode, the speed and duplex is needed. But in some NIC,
there is a time span between NIC up state and getting speed and duplex.
As such, sometimes a slave in 802.3ad mode is in up state without
speed and duplex. This will make bonding in 802.3ad mode can not
work well. 
To make bonding driver be compatible with more NICs, it is
necessary to restrict the up state in 802.3ad mode.

Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
---
 drivers/net/bonding/bond_main.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9e0f8a7..0a80fb3 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1991,6 +1991,25 @@ static int bond_miimon_inspect(struct bonding *bond)
 
 		link_state = bond_check_dev_link(bond, slave->dev, 0);
 
+		/* Since some NIC has time span between netif_running and
+		 * getting speed and duples. That is, after a NIC is up (netif_running),
+		 * there is a time span before this NIC is negotiated with speed and duplex.
+		 * During this time span, the slave in 802.3ad is configured without speed
+		 * and duplex. This 802.3ad bonding will not work because it needs slave's speed
+		 * and duplex to generate key field.
+		 * As such, we restrict up in 802.3ad mode to: netif_running && peed != SPEED_UNKNOWN &&
+		 * duplex != DUPLEX_UNKNOWN
+		 */
+		if ((BMSR_LSTATUS == link_state) &&
+		    (BOND_MODE(bond) == BOND_MODE_8023AD)) {
+			bond_update_speed_duplex(slave);
+			if ((slave->speed == SPEED_UNKNOWN) ||
+			    (slave->duplex == DUPLEX_UNKNOWN)) {
+				link_state = 0;
+				netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex");
+			}
+		}
+
 		switch (slave->link) {
 		case BOND_LINK_UP:
 			if (link_state)
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2015-12-17  8:03 [PATCH 1/1] bonding: restrict up state in 802.3ad mode zyjzyj2000
@ 2015-12-17 21:57 ` Jay Vosburgh
  2015-12-18  4:36   ` zyjzyj2000
  2015-12-28  8:43   ` [PATCH 1/1] bonding: restrict up state " Michal Kubecek
  0 siblings, 2 replies; 52+ messages in thread
From: Jay Vosburgh @ 2015-12-17 21:57 UTC (permalink / raw)
  To: zyjzyj2000; +Cc: vfalico, gospo, netdev, Boris.Shteinbock

<zyjzyj2000@gmail.com> wrote:

>From: Zhu Yanjun <zyjzyj2000@gmail.com>
>
>In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>there is a time span between NIC up state and getting speed and duplex.
>As such, sometimes a slave in 802.3ad mode is in up state without
>speed and duplex. This will make bonding in 802.3ad mode can not
>work well. 
>To make bonding driver be compatible with more NICs, it is
>necessary to restrict the up state in 802.3ad mode.

	What device is this?  It seems a bit odd that an Ethernet device
can be carrier up but not have the duplex and speed available.

	Also, what are the option settings for bonding?  Specifically,
is "use_carrier" set to 0?  The default setting is 1.

	In general, though, bonding expects a speed or duplex change to
be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
propagate to the 802.3ad logic.

	If the device here is going carrier up prior to having speed or
duplex available, then maybe it should call netdev_state_change() when
the duplex and speed are available, or delay calling netif_carrier_on().

>Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
>---
> drivers/net/bonding/bond_main.c |   19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 9e0f8a7..0a80fb3 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1991,6 +1991,25 @@ static int bond_miimon_inspect(struct bonding *bond)
> 
> 		link_state = bond_check_dev_link(bond, slave->dev, 0);
> 
>+		/* Since some NIC has time span between netif_running and
>+		 * getting speed and duples. That is, after a NIC is up (netif_running),
>+		 * there is a time span before this NIC is negotiated with speed and duplex.
>+		 * During this time span, the slave in 802.3ad is configured without speed
>+		 * and duplex. This 802.3ad bonding will not work because it needs slave's speed
>+		 * and duplex to generate key field.
>+		 * As such, we restrict up in 802.3ad mode to: netif_running && peed != SPEED_UNKNOWN &&
>+		 * duplex != DUPLEX_UNKNOWN
>+		 */
>+		if ((BMSR_LSTATUS == link_state) &&
>+		    (BOND_MODE(bond) == BOND_MODE_8023AD)) {
>+			bond_update_speed_duplex(slave);
>+			if ((slave->speed == SPEED_UNKNOWN) ||
>+			    (slave->duplex == DUPLEX_UNKNOWN)) {
>+				link_state = 0;
>+				netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex");
>+			}
>+		}

	Also, as a functional note on this patch, the above looks like
it will spam the log repeatedly every miimon interval for as long as the
"carrier up but no speed/duplex" situation persists.

	-J

> 		switch (slave->link) {
> 		case BOND_LINK_UP:
> 			if (link_state)
>-- 
>1.7.9.5
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2015-12-17 21:57 ` Jay Vosburgh
@ 2015-12-18  4:36   ` zyjzyj2000
  2015-12-18  4:36     ` [PATCH 1/1] bonding: delay up state without speed and duplex " zyjzyj2000
  2015-12-28  8:43   ` [PATCH 1/1] bonding: restrict up state " Michal Kubecek
  1 sibling, 1 reply; 52+ messages in thread
From: zyjzyj2000 @ 2015-12-18  4:36 UTC (permalink / raw)
  To: j.vosburgh; +Cc: vfalico, gospo, netdev, Boris.Shteinbock


Hi, Jay

Thanks for your reply.

Yes. The NIC is a bit odd. We have to be compatible with it.
I followed your advice to delay calling netif_carrier_on().

Changes:
Delay calling netif_carrier_on().

Best Regards!
Zhu Yanjun

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/1] bonding: delay up state without speed and duplex in 802.3ad mode
  2015-12-18  4:36   ` zyjzyj2000
@ 2015-12-18  4:36     ` zyjzyj2000
  2015-12-18  4:54       ` Jay Vosburgh
  2015-12-18 13:37       ` Sergei Shtylyov
  0 siblings, 2 replies; 52+ messages in thread
From: zyjzyj2000 @ 2015-12-18  4:36 UTC (permalink / raw)
  To: j.vosburgh; +Cc: vfalico, gospo, netdev, Boris.Shteinbock

From: yzhu1 <yzhu1@windriver.com>

In 802.3ad mode, the speed and duplex is needed. But in some NICs,
there is a time span between NIC up state and getting speed and duplex.
As such, sometimes a slave in 802.3ad mode is in up state without
speed and duplex. This will make bonding in 802.3ad mode can not
work well.

To make bonding driver robust and compatible with more NICs, it is
necessary to delay the up state without speed and duplex in 802.3ad
mode.

Signed-off-by: yzhu1 <yzhu1@windriver.com>
---
 drivers/net/bonding/bond_main.c |   34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9e0f8a7..a1d8708 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -419,6 +419,35 @@ const char *bond_slave_link_status(s8 link)
 	}
 }
 
+/* This function is to check the speed and duplex of a NIC.
+ * Since the speed and duplex of a slave device are very
+ * important to the bonding in the 802.3ad mode. As such,
+ * it is necessary to check the speed and duplex of a slave
+ * device in 802.3ad mode.
+ *
+ * speed != SPEED_UNKNOWN and duplex == DUPLEX_FULL  :  1
+ *                                           others  :  0
+ */
+static int __check_speed_duplex(struct net_device *netdev)
+{
+	struct ethtool_cmd ecmd;
+	u32 slave_speed = SPEED_UNKNOWN;
+	int res;
+
+	res = __ethtool_get_settings(netdev, &ecmd);
+	if (res < 0)
+		return 0;
+
+	slave_speed = ethtool_cmd_speed(&ecmd);
+	if (slave_speed == 0 || slave_speed == ((__u32) -1))
+		return 0;
+
+	if (DUPLEX_FULL != ecmd.duplex)
+		return 0;
+
+	return 1;
+}
+
 /* if <dev> supports MII link status reporting, check its link status.
  *
  * We either do MII/ETHTOOL ioctls, or check netif_carrier_ok(),
@@ -445,6 +474,11 @@ static int bond_check_dev_link(struct bonding *bond,
 	if (!reporting && !netif_running(slave_dev))
 		return 0;
 
+	/* Check the speed and duplex of the slave device in 802.3ad mode. */
+	if ((BOND_MODE(bond) == BOND_MODE_8023AD) &&
+	   !__check_speed_duplex(slave_dev))
+		return 0;
+
 	if (bond->params.use_carrier)
 		return netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0;
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: delay up state without speed and duplex in 802.3ad mode
  2015-12-18  4:36     ` [PATCH 1/1] bonding: delay up state without speed and duplex " zyjzyj2000
@ 2015-12-18  4:54       ` Jay Vosburgh
  2015-12-18 13:37       ` Sergei Shtylyov
  1 sibling, 0 replies; 52+ messages in thread
From: Jay Vosburgh @ 2015-12-18  4:54 UTC (permalink / raw)
  To: zyjzyj2000; +Cc: vfalico, gospo, netdev, Boris.Shteinbock

<zyjzyj2000@gmail.com> wrote:

>From: yzhu1 <yzhu1@windriver.com>
>
>In 802.3ad mode, the speed and duplex is needed. But in some NICs,
>there is a time span between NIC up state and getting speed and duplex.
>As such, sometimes a slave in 802.3ad mode is in up state without
>speed and duplex. This will make bonding in 802.3ad mode can not
>work well.
>
>To make bonding driver robust and compatible with more NICs, it is
>necessary to delay the up state without speed and duplex in 802.3ad
>mode.

	You misunderstood my comment.  What I meant is that the device
driver for the network device should change to either delay carrier up
or issue a second notifier when speed and duplex are available.  If the
driver doesn't handle duplex and speed notification properly, it will
likely have trouble with more than just bonding.

	You also didn't mention the identity of the network device that
requires this special handling.  Is the driver part of the linux kernel?

	-J

>Signed-off-by: yzhu1 <yzhu1@windriver.com>
>---
> drivers/net/bonding/bond_main.c |   34 ++++++++++++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 9e0f8a7..a1d8708 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -419,6 +419,35 @@ const char *bond_slave_link_status(s8 link)
> 	}
> }
> 
>+/* This function is to check the speed and duplex of a NIC.
>+ * Since the speed and duplex of a slave device are very
>+ * important to the bonding in the 802.3ad mode. As such,
>+ * it is necessary to check the speed and duplex of a slave
>+ * device in 802.3ad mode.
>+ *
>+ * speed != SPEED_UNKNOWN and duplex == DUPLEX_FULL  :  1
>+ *                                           others  :  0
>+ */
>+static int __check_speed_duplex(struct net_device *netdev)
>+{
>+	struct ethtool_cmd ecmd;
>+	u32 slave_speed = SPEED_UNKNOWN;
>+	int res;
>+
>+	res = __ethtool_get_settings(netdev, &ecmd);
>+	if (res < 0)
>+		return 0;
>+
>+	slave_speed = ethtool_cmd_speed(&ecmd);
>+	if (slave_speed == 0 || slave_speed == ((__u32) -1))
>+		return 0;
>+
>+	if (DUPLEX_FULL != ecmd.duplex)
>+		return 0;
>+
>+	return 1;
>+}
>+
> /* if <dev> supports MII link status reporting, check its link status.
>  *
>  * We either do MII/ETHTOOL ioctls, or check netif_carrier_ok(),
>@@ -445,6 +474,11 @@ static int bond_check_dev_link(struct bonding *bond,
> 	if (!reporting && !netif_running(slave_dev))
> 		return 0;
> 
>+	/* Check the speed and duplex of the slave device in 802.3ad mode. */
>+	if ((BOND_MODE(bond) == BOND_MODE_8023AD) &&
>+	   !__check_speed_duplex(slave_dev))
>+		return 0;
>+
> 	if (bond->params.use_carrier)
> 		return netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0;
> 
>-- 
>1.7.9.5
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: delay up state without speed and duplex in 802.3ad mode
  2015-12-18  4:36     ` [PATCH 1/1] bonding: delay up state without speed and duplex " zyjzyj2000
  2015-12-18  4:54       ` Jay Vosburgh
@ 2015-12-18 13:37       ` Sergei Shtylyov
  1 sibling, 0 replies; 52+ messages in thread
From: Sergei Shtylyov @ 2015-12-18 13:37 UTC (permalink / raw)
  To: zyjzyj2000, j.vosburgh; +Cc: vfalico, gospo, netdev, Boris.Shteinbock

Hello.

On 12/18/2015 7:36 AM, zyjzyj2000@gmail.com wrote:

> From: yzhu1 <yzhu1@windriver.com>
>
> In 802.3ad mode, the speed and duplex is needed. But in some NICs,
> there is a time span between NIC up state and getting speed and duplex.
> As such, sometimes a slave in 802.3ad mode is in up state without
> speed and duplex. This will make bonding in 802.3ad mode can not
> work well.
>
> To make bonding driver robust and compatible with more NICs, it is
> necessary to delay the up state without speed and duplex in 802.3ad
> mode.
>
> Signed-off-by: yzhu1 <yzhu1@windriver.com>
> ---
>   drivers/net/bonding/bond_main.c |   34 ++++++++++++++++++++++++++++++++++
>   1 file changed, 34 insertions(+)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 9e0f8a7..a1d8708 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -419,6 +419,35 @@ const char *bond_slave_link_status(s8 link)
>   	}
>   }
>
> +/* This function is to check the speed and duplex of a NIC.
> + * Since the speed and duplex of a slave device are very
> + * important to the bonding in the 802.3ad mode. As such,
> + * it is necessary to check the speed and duplex of a slave
> + * device in 802.3ad mode.
> + *
> + * speed != SPEED_UNKNOWN and duplex == DUPLEX_FULL  :  1
> + *                                           others  :  0
> + */
> +static int __check_speed_duplex(struct net_device *netdev)
> +{
> +	struct ethtool_cmd ecmd;
> +	u32 slave_speed = SPEED_UNKNOWN;
> +	int res;
> +
> +	res = __ethtool_get_settings(netdev, &ecmd);
> +	if (res < 0)
> +		return 0;
> +
> +	slave_speed = ethtool_cmd_speed(&ecmd);
> +	if (slave_speed == 0 || slave_speed == ((__u32) -1))
> +		return 0;
> +
> +	if (DUPLEX_FULL != ecmd.duplex)

    Please place the immediate operand to the right of the != operator.

[...]

MBR, Sergei

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2015-12-17 21:57 ` Jay Vosburgh
  2015-12-18  4:36   ` zyjzyj2000
@ 2015-12-28  8:43   ` Michal Kubecek
  2015-12-28  9:19     ` zhuyj
  1 sibling, 1 reply; 52+ messages in thread
From: Michal Kubecek @ 2015-12-28  8:43 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: zyjzyj2000, vfalico, gospo, netdev, Boris.Shteinbock

On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
> <zyjzyj2000@gmail.com> wrote:
> >In 802.3ad mode, the speed and duplex is needed. But in some NIC,
> >there is a time span between NIC up state and getting speed and duplex.
> >As such, sometimes a slave in 802.3ad mode is in up state without
> >speed and duplex. This will make bonding in 802.3ad mode can not
> >work well. 
> >To make bonding driver be compatible with more NICs, it is
> >necessary to restrict the up state in 802.3ad mode.
> 
> 	What device is this?  It seems a bit odd that an Ethernet device
> can be carrier up but not have the duplex and speed available.
...
> 	In general, though, bonding expects a speed or duplex change to
> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
> propagate to the 802.3ad logic.
> 
> 	If the device here is going carrier up prior to having speed or
> duplex available, then maybe it should call netdev_state_change() when
> the duplex and speed are available, or delay calling netif_carrier_on().

I have encountered this problem (NIC having carrier on before being able
to detect speed/duplex and driver not notifying when speed/duplex
becomes available) with netxen cards earlier. But it was eventually
fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
handling.") so this example rather supports what you said.

                                                          Michal Kubecek

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2015-12-28  8:43   ` [PATCH 1/1] bonding: restrict up state " Michal Kubecek
@ 2015-12-28  9:19     ` zhuyj
  2016-01-06  1:26       ` Tantilov, Emil S
  0 siblings, 1 reply; 52+ messages in thread
From: zhuyj @ 2015-12-28  9:19 UTC (permalink / raw)
  To: Michal Kubecek, Jay Vosburgh; +Cc: vfalico, gospo, netdev, Boris.Shteinbock

On 12/28/2015 04:43 PM, Michal Kubecek wrote:
> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>> <zyjzyj2000@gmail.com> wrote:
>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>> there is a time span between NIC up state and getting speed and duplex.
>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>> work well.
>>> To make bonding driver be compatible with more NICs, it is
>>> necessary to restrict the up state in 802.3ad mode.
>> 	What device is this?  It seems a bit odd that an Ethernet device
>> can be carrier up but not have the duplex and speed available.
> ...
>> 	In general, though, bonding expects a speed or duplex change to
>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>> propagate to the 802.3ad logic.
>>
>> 	If the device here is going carrier up prior to having speed or
>> duplex available, then maybe it should call netdev_state_change() when
>> the duplex and speed are available, or delay calling netif_carrier_on().
> I have encountered this problem (NIC having carrier on before being able
> to detect speed/duplex and driver not notifying when speed/duplex
> becomes available) with netxen cards earlier. But it was eventually
> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
> handling.") so this example rather supports what you said.
>
>                                                            Michal Kubecek
Thanks a lot.
I checked the commit 9d01412ae76f ("netxen: Fix link event
handling."). The symptoms are the same with mine.

The root cause is different. In my problem, the root cause is that LINKS 
register
can not provide link_up and link_speed at the same time. There is a time 
span between
link_up and link_speed. My solution is to force to synchronize link_up 
and link_speed in
ixgbe X540 NIC.

Best Regards!
Zhu Yanjun

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2015-12-28  9:19     ` zhuyj
@ 2016-01-06  1:26       ` Tantilov, Emil S
  2016-01-06  3:05         ` zhuyj
  0 siblings, 1 reply; 52+ messages in thread
From: Tantilov, Emil S @ 2016-01-06  1:26 UTC (permalink / raw)
  To: zhuyj, Michal Kubecek, Jay Vosburgh
  Cc: vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

>-----Original Message-----
>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>Behalf Of zhuyj
>Sent: Monday, December 28, 2015 1:19 AM
>To: Michal Kubecek; Jay Vosburgh
>Cc: vfalico@gmail.com; gospo@cumulusnetworks.com; netdev@vger.kernel.org;
>Shteinbock, Boris (Wind River)
>Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>
>On 12/28/2015 04:43 PM, Michal Kubecek wrote:
>> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>>> <zyjzyj2000@gmail.com> wrote:
>>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>>> there is a time span between NIC up state and getting speed and duplex.
>>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>>> work well.
>>>> To make bonding driver be compatible with more NICs, it is
>>>> necessary to restrict the up state in 802.3ad mode.
>>> 	What device is this?  It seems a bit odd that an Ethernet device
>>> can be carrier up but not have the duplex and speed available.
>> ...
>>> 	In general, though, bonding expects a speed or duplex change to
>>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>>> propagate to the 802.3ad logic.
>>>
>>> 	If the device here is going carrier up prior to having speed or
>>> duplex available, then maybe it should call netdev_state_change() when
>>> the duplex and speed are available, or delay calling netif_carrier_on().
>> I have encountered this problem (NIC having carrier on before being able
>> to detect speed/duplex and driver not notifying when speed/duplex
>> becomes available) with netxen cards earlier. But it was eventually
>> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
>> handling.") so this example rather supports what you said.
>>
>>                                                            Michal Kubecek
>Thanks a lot.
>I checked the commit 9d01412ae76f ("netxen: Fix link event
>handling."). The symptoms are the same with mine.
>
>The root cause is different. In my problem, the root cause is that LINKS
>register[]  can not provide link_up and link_speed at the same time.
>There is a time span between link_up and link_speed.

The LINK_UP and LINK_SPEED bits in the LINKS register for ixgbe HW are updated
simultaneously. Do you have any proof to show the delay you are referring to
as I am sure our HW engineers would like to know about it.

What we have seen in the case of bonding is that with some link partners there
may be a rapid link flap (up, down, up) and as result the bonding driver may
report the speed as unknown if just so happens that the speed is checked during
the period in which the interface is re-negotiating.

Thanks,
Emil
 
>My solution is to force to synchronize link_up and link_speed in ixgbe X540 NIC.
>
>Best Regards!
>Zhu Yanjun
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-06  1:26       ` Tantilov, Emil S
@ 2016-01-06  3:05         ` zhuyj
  2016-01-07  2:43           ` Tantilov, Emil S
  0 siblings, 1 reply; 52+ messages in thread
From: zhuyj @ 2016-01-06  3:05 UTC (permalink / raw)
  To: Tantilov, Emil S, Michal Kubecek, Jay Vosburgh
  Cc: vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

On 01/06/2016 09:26 AM, Tantilov, Emil S wrote:
>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>> Behalf Of zhuyj
>> Sent: Monday, December 28, 2015 1:19 AM
>> To: Michal Kubecek; Jay Vosburgh
>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com; netdev@vger.kernel.org;
>> Shteinbock, Boris (Wind River)
>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>
>> On 12/28/2015 04:43 PM, Michal Kubecek wrote:
>>> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>>>> <zyjzyj2000@gmail.com> wrote:
>>>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>>>> there is a time span between NIC up state and getting speed and duplex.
>>>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>>>> work well.
>>>>> To make bonding driver be compatible with more NICs, it is
>>>>> necessary to restrict the up state in 802.3ad mode.
>>>> 	What device is this?  It seems a bit odd that an Ethernet device
>>>> can be carrier up but not have the duplex and speed available.
>>> ...
>>>> 	In general, though, bonding expects a speed or duplex change to
>>>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>>>> propagate to the 802.3ad logic.
>>>>
>>>> 	If the device here is going carrier up prior to having speed or
>>>> duplex available, then maybe it should call netdev_state_change() when
>>>> the duplex and speed are available, or delay calling netif_carrier_on().
>>> I have encountered this problem (NIC having carrier on before being able
>>> to detect speed/duplex and driver not notifying when speed/duplex
>>> becomes available) with netxen cards earlier. But it was eventually
>>> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
>>> handling.") so this example rather supports what you said.
>>>
>>>                                                             Michal Kubecek
>> Thanks a lot.
>> I checked the commit 9d01412ae76f ("netxen: Fix link event
>> handling."). The symptoms are the same with mine.
>>
>> The root cause is different. In my problem, the root cause is that LINKS
>> register[]  can not provide link_up and link_speed at the same time.
>> There is a time span between link_up and link_speed.
> The LINK_UP and LINK_SPEED bits in the LINKS register for ixgbe HW are updated
> simultaneously. Do you have any proof to show the delay you are referring to
> as I am sure our HW engineers would like to know about it.
Sorry. I can not reproduce this problem locally. What I have is the 
feedback from the customer.

Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes:   100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes:  100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: Unknown!
    Duplex: Unknown! (255)
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: external
    Auto-negotiation: on
    MDI-X: Unknown
    Supports Wake-on: d
    Wake-on: d
    Current message level: 0x00000007 (7)
                   drv probe link
    Link detected: no
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes:   100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes:  100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: Unknown!
    Duplex: Unknown! (255)
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: external
    Auto-negotiation: on
    MDI-X: Unknown
    Supports Wake-on: d
    Wake-on: d
    Current message level: 0x00000007 (7)
                   drv probe link
    Link detected: yes
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes:   100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes:  100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: Unknown!
    Duplex: Unknown! (255)
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: external
    Auto-negotiation: on
    MDI-X: Unknown
    Supports Wake-on: d
    Wake-on: d
    Current message level: 0x00000007 (7)
                   drv probe link
    Link detected: yes
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes:   100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes:  100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: Unknown!
    Duplex: Unknown! (255)
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: external
    Auto-negotiation: on
    MDI-X: Unknown
    Supports Wake-on: d
    Wake-on: d
    Current message level: 0x00000007 (7)
                   drv probe link
    Link detected: yes
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes:   100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes:  100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: Unknown!
    Duplex: Unknown! (255)
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: external
    Auto-negotiation: on
    MDI-X: Unknown
    Supports Wake-on: d
    Wake-on: d
    Current message level: 0x00000007 (7)
                   drv probe link
    Link detected: yes
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes:   100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes:  100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: 10000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: external
    Auto-negotiation: on
    MDI-X: Unknown
    Supports Wake-on: d
    Wake-on: d
    Current message level: 0x00000007 (7)
                   drv probe link
    Link detected: yes


I think the time span between link_up and link_speed lasts several seconds.

 From this function
/**
  * ixgbe_service_timer - Timer Call-back
  * @data: pointer to adapter cast into an unsigned long
  **/
static void ixgbe_service_timer(unsigned long data)
{
         struct ixgbe_adapter *adapter = (struct ixgbe_adapter *)data;
         unsigned long next_event_offset;

         /* poll faster when waiting for link */
         if (adapter->flags & IXGBE_FLAG_NEED_LINK_UPDATE)
                 next_event_offset = HZ / 10;
         else
                 next_event_offset = HZ * 2;

         /* Reset the timer */
         mod_timer(&adapter->service_timer, next_event_offset + jiffies);

         ixgbe_service_event_schedule(adapter);
}

The timer will check link state every 100ms. In this several seconds, 
the link state
is updated for about several dozens of times.
>
> What we have seen in the case of bonding is that with some link partners there
> may be a rapid link flap (up, down, up) and as result the bonding driver may
> report the speed as unknown if just so happens that the speed is checked during
> the period in which the interface is re-negotiating.
Sure. What we have done is to avoid link_up without link_speed. Unless both
link_up and link_speed are ready, the bonding driver will not be 
triggered to check
both link_up and link_speed in 802.3ad mode.

Thanks a lot.
Zhu Yanjun
> Thanks,
> Emil
>   
>> My solution is to force to synchronize link_up and link_speed in ixgbe X540 NIC.
>>
>> Best Regards!
>> Zhu Yanjun
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-06  3:05         ` zhuyj
@ 2016-01-07  2:43           ` Tantilov, Emil S
  2016-01-07  3:33             ` zhuyj
  2016-01-07  7:47             ` zhuyj
  0 siblings, 2 replies; 52+ messages in thread
From: Tantilov, Emil S @ 2016-01-07  2:43 UTC (permalink / raw)
  To: zhuyj, Michal Kubecek, Jay Vosburgh
  Cc: vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

>-----Original Message-----
>From: zhuyj [mailto:zyjzyj2000@gmail.com]
>Sent: Tuesday, January 05, 2016 7:05 PM
>To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>Cc: vfalico@gmail.com; gospo@cumulusnetworks.com; netdev@vger.kernel.org;
>Shteinbock, Boris (Wind River)
>Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>
>On 01/06/2016 09:26 AM, Tantilov, Emil S wrote:
>>> -----Original Message-----
>>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>On
>>> Behalf Of zhuyj
>>> Sent: Monday, December 28, 2015 1:19 AM
>>> To: Michal Kubecek; Jay Vosburgh
>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>netdev@vger.kernel.org;
>>> Shteinbock, Boris (Wind River)
>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>
>>> On 12/28/2015 04:43 PM, Michal Kubecek wrote:
>>>> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>>>>> <zyjzyj2000@gmail.com> wrote:
>>>>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>>>>> there is a time span between NIC up state and getting speed and
>duplex.
>>>>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>>>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>>>>> work well.
>>>>>> To make bonding driver be compatible with more NICs, it is
>>>>>> necessary to restrict the up state in 802.3ad mode.
>>>>> 	What device is this?  It seems a bit odd that an Ethernet device
>>>>> can be carrier up but not have the duplex and speed available.
>>>> ...
>>>>> 	In general, though, bonding expects a speed or duplex change to
>>>>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>>>>> propagate to the 802.3ad logic.
>>>>>
>>>>> 	If the device here is going carrier up prior to having speed or
>>>>> duplex available, then maybe it should call netdev_state_change() when
>>>>> the duplex and speed are available, or delay calling
>netif_carrier_on().
>>>> I have encountered this problem (NIC having carrier on before being
>able
>>>> to detect speed/duplex and driver not notifying when speed/duplex
>>>> becomes available) with netxen cards earlier. But it was eventually
>>>> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
>>>> handling.") so this example rather supports what you said.
>>>>
>>>>                                                             Michal
>Kubecek
>>> Thanks a lot.
>>> I checked the commit 9d01412ae76f ("netxen: Fix link event
>>> handling."). The symptoms are the same with mine.
>>>
>>> The root cause is different. In my problem, the root cause is that LINKS
>>> register[]  can not provide link_up and link_speed at the same time.
>>> There is a time span between link_up and link_speed.
>> The LINK_UP and LINK_SPEED bits in the LINKS register for ixgbe HW are
>updated
>> simultaneously. Do you have any proof to show the delay you are referring
>to
>> as I am sure our HW engineers would like to know about it.
>Sorry. I can not reproduce this problem locally. What I have is the
>feedback from the customer.

So you are assuming that there is a delay due to the issue you are seeing?

>Settings for eth0:
>    Supported ports: [ TP ]
>    Supported link modes:   100baseT/Full
>                            1000baseT/Full
>                            10000baseT/Full
>    Supported pause frame use: No
>    Supports auto-negotiation: Yes
>    Advertised link modes:  100baseT/Full
>                            1000baseT/Full
>                            10000baseT/Full
>    Advertised pause frame use: No
>    Advertised auto-negotiation: Yes
>    Speed: Unknown!
>    Duplex: Unknown! (255)
>    Port: Twisted Pair
>    PHYAD: 0
>    Transceiver: external
>    Auto-negotiation: on
>    MDI-X: Unknown
>    Supports Wake-on: d
>    Wake-on: d
>    Current message level: 0x00000007 (7)
>                   drv probe link
>    Link detected: yes

The speed and the link state here are reported from
different sources:

>    Link detected: yes

Comes from a netif_carrier_ok() check. This is done via ethtool_op_get_link().

Only the speed is reported through the LINKS register - that is why it is reported
as "Unknown" - in other words link_up is false.

This is a trace from the case where the bonding driver reports 0 Mbps:

   kworker/u48:1-27950 [010] ....  6493.084916: ixgbe_service_task: eth1: link_speed = 80, link_up = false
   kworker/u48:1-27950 [011] ....  6493.184894: ixgbe_service_task: eth1: link_speed = 80, link_up = false
   kworker/u48:1-27950 [000] ....  6494.439883: ixgbe_service_task: eth1: link_speed = 80, link_up = true
   kworker/u48:1-27950 [000] ....  6494.464204: ixgbe_service_task: eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
     kworker/0:2-1926  [000] ....  6494.464249: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
  NetworkManager-3819  [008] ....  6494.464484: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
   kworker/u48:1-27950 [007] ....  6494.496886: bond_mii_monitor: bond0: link status definitely up for interface eth1, 0 Mbps full duplex
  NetworkManager-3819  [008] ....  6494.496967: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
   kworker/u48:1-27950 [008] ....  6495.288798: ixgbe_service_task: eth1: link_speed = 80, link_up = true
   kworker/u48:1-27950 [008] ....  6495.388806: ixgbe_service_task: eth1: link_speed = 80, link_up = true

As you can see the link is initially established, but then lost and if just so happens that the
bonding driver is checking it at that time it will report 0 Mbps.

I will give your patch a try and see if it helps in this situation.

Thanks,
Emil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  2:43           ` Tantilov, Emil S
@ 2016-01-07  3:33             ` zhuyj
  2016-01-07  5:02               ` Tantilov, Emil S
  2016-01-07  7:47             ` zhuyj
  1 sibling, 1 reply; 52+ messages in thread
From: zhuyj @ 2016-01-07  3:33 UTC (permalink / raw)
  To: Tantilov, Emil S, Michal Kubecek, Jay Vosburgh
  Cc: vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

On 01/07/2016 10:43 AM, Tantilov, Emil S wrote:
>> -----Original Message-----
>> From: zhuyj [mailto:zyjzyj2000@gmail.com]
>> Sent: Tuesday, January 05, 2016 7:05 PM
>> To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com; netdev@vger.kernel.org;
>> Shteinbock, Boris (Wind River)
>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>
>> On 01/06/2016 09:26 AM, Tantilov, Emil S wrote:
>>>> -----Original Message-----
>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>> On
>>>> Behalf Of zhuyj
>>>> Sent: Monday, December 28, 2015 1:19 AM
>>>> To: Michal Kubecek; Jay Vosburgh
>>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>> netdev@vger.kernel.org;
>>>> Shteinbock, Boris (Wind River)
>>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>>
>>>> On 12/28/2015 04:43 PM, Michal Kubecek wrote:
>>>>> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>>>>>> <zyjzyj2000@gmail.com> wrote:
>>>>>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>>>>>> there is a time span between NIC up state and getting speed and
>> duplex.
>>>>>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>>>>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>>>>>> work well.
>>>>>>> To make bonding driver be compatible with more NICs, it is
>>>>>>> necessary to restrict the up state in 802.3ad mode.
>>>>>> 	What device is this?  It seems a bit odd that an Ethernet device
>>>>>> can be carrier up but not have the duplex and speed available.
>>>>> ...
>>>>>> 	In general, though, bonding expects a speed or duplex change to
>>>>>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>>>>>> propagate to the 802.3ad logic.
>>>>>>
>>>>>> 	If the device here is going carrier up prior to having speed or
>>>>>> duplex available, then maybe it should call netdev_state_change() when
>>>>>> the duplex and speed are available, or delay calling
>> netif_carrier_on().
>>>>> I have encountered this problem (NIC having carrier on before being
>> able
>>>>> to detect speed/duplex and driver not notifying when speed/duplex
>>>>> becomes available) with netxen cards earlier. But it was eventually
>>>>> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
>>>>> handling.") so this example rather supports what you said.
>>>>>
>>>>>                                                              Michal
>> Kubecek
>>>> Thanks a lot.
>>>> I checked the commit 9d01412ae76f ("netxen: Fix link event
>>>> handling."). The symptoms are the same with mine.
>>>>
>>>> The root cause is different. In my problem, the root cause is that LINKS
>>>> register[]  can not provide link_up and link_speed at the same time.
>>>> There is a time span between link_up and link_speed.
>>> The LINK_UP and LINK_SPEED bits in the LINKS register for ixgbe HW are
>> updated
>>> simultaneously. Do you have any proof to show the delay you are referring
>> to
>>> as I am sure our HW engineers would like to know about it.
>> Sorry. I can not reproduce this problem locally. What I have is the
>> feedback from the customer.
> So you are assuming that there is a delay due to the issue you are seeing?

Sure. Before I get the further feedback from the customer, I can not 
make further conclusion.
My patch is based on the feedback from the customer.

>
>> Settings for eth0:
>>     Supported ports: [ TP ]
>>     Supported link modes:   100baseT/Full
>>                             1000baseT/Full
>>                             10000baseT/Full
>>     Supported pause frame use: No
>>     Supports auto-negotiation: Yes
>>     Advertised link modes:  100baseT/Full
>>                             1000baseT/Full
>>                             10000baseT/Full
>>     Advertised pause frame use: No
>>     Advertised auto-negotiation: Yes
>>     Speed: Unknown!
>>     Duplex: Unknown! (255)
>>     Port: Twisted Pair
>>     PHYAD: 0
>>     Transceiver: external
>>     Auto-negotiation: on
>>     MDI-X: Unknown
>>     Supports Wake-on: d
>>     Wake-on: d
>>     Current message level: 0x00000007 (7)
>>                    drv probe link
>>     Link detected: yes
> The speed and the link state here are reported from
> different sources:
Sure. 
ixgbe_get_settings->hw->mac.ops.check_link(X540)->ixgbe_check_mac_link_generic
In this function ixgbe_check_mac_link_generic, the register IXGBE_LINKS 
is checked. link_up and
link_speed is got from this register.

>
>>     Link detected: yes
> Comes from a netif_carrier_ok() check. This is done via ethtool_op_get_link()
>
> Only the speed is reported through the LINKS register - that is why it is reported
> as "Unknown" - in other words link_up is false.
Sorry. I do not agree with you.

static inline bool netif_carrier_ok(const struct net_device *dev)
{
         return !test_bit(__LINK_STATE_NOCARRIER, &dev->state);
}

netif_carrier_ok will check __LINK_STATE_NOCARRIER. This 
__LINK_STATE_NOCARRIER is set by netif_carrier_on.

/**
  *      netif_carrier_on - set carrier
  *      @dev: network device
  *
  * Device has detected that carrier.
  */
void netif_carrier_on(struct net_device *dev)
{
         if (test_and_clear_bit(__LINK_STATE_NOCARRIER, &dev->state)) {
                 if (dev->reg_state == NETREG_UNINITIALIZED)
                         return;
                 atomic_inc(&dev->carrier_changes);
                 linkwatch_fire_event(dev);
                 if (netif_running(dev))
                         __netdev_watchdog_up(dev);
         }
}

In ixgbe driver, in ixgbe_main.c +6506, this function 
ixgbe_watchdog_link_is_up runs
netif_carrier_on function.

ixgbe_watchdog_link_is_up is in service_task. If 
IXGBE_FLAG_NEED_LINK_UPDATE is set in adapter->flags,
the function ixgbe_watchdog_link_is_up will run every 100ms.

IXGBE_FLAG_NEED_LINK_UPDATE is set in ixgbe_check_lsc in x540. This 
function ixgbe_check_lsc is in irq handler.
link_up will trigger it.

As such, link_up will trriger ixgbe_check_lsc to set 
IXGBE_FLAG_NEED_LINK_UPDATE in adapter->flags. In the end,
service_task will check the register IXGBE_LINKS every 100ms.

So ixgbe_get_settings and netif_carrier_ok travel different paths to the 
function ixgbe_check_mac_link_generic.
And the time span between ixgbe_get_settings and netif_carrier_ok is 
very tiny, about 100ms. So we can treat it simultaneous.

>
> This is a trace from the case where the bonding driver reports 0 Mbps:
>
>     kworker/u48:1-27950 [010] ....  6493.084916: ixgbe_service_task: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [011] ....  6493.184894: ixgbe_service_task: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [000] ....  6494.439883: ixgbe_service_task: eth1: link_speed = 80, link_up = true
>     kworker/u48:1-27950 [000] ....  6494.464204: ixgbe_service_task: eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>       kworker/0:2-1926  [000] ....  6494.464249: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
>    NetworkManager-3819  [008] ....  6494.464484: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [007] ....  6494.496886: bond_mii_monitor: bond0: link status definitely up for interface eth1, 0 Mbps full duplex
>    NetworkManager-3819  [008] ....  6494.496967: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [008] ....  6495.288798: ixgbe_service_task: eth1: link_speed = 80, link_up = true
>     kworker/u48:1-27950 [008] ....  6495.388806: ixgbe_service_task: eth1: link_speed = 80, link_up = true
>
> As you can see the link is initially established, but then lost and if just so happens that the
> bonding driver is checking it at that time it will report 0 Mbps.
Thanks for your reply. I will delve into the source code.

Best Regards!
Zhu Yanjun
>
> I will give your patch a try and see if it helps in this situation.
>
> Thanks,
> Emil
>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  3:33             ` zhuyj
@ 2016-01-07  5:02               ` Tantilov, Emil S
  2016-01-07  6:15                 ` zyjzyj2000
  0 siblings, 1 reply; 52+ messages in thread
From: Tantilov, Emil S @ 2016-01-07  5:02 UTC (permalink / raw)
  To: zhuyj, Michal Kubecek, Jay Vosburgh
  Cc: vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

>-----Original Message-----
>From: zhuyj [mailto:zyjzyj2000@gmail.com]
>Sent: Wednesday, January 06, 2016 7:34 PM
>To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>Cc: vfalico@gmail.com; gospo@cumulusnetworks.com; netdev@vger.kernel.org;
>Shteinbock, Boris (Wind River)
>Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>
>On 01/07/2016 10:43 AM, Tantilov, Emil S wrote:
>>> -----Original Message-----
>>> From: zhuyj [mailto:zyjzyj2000@gmail.com]
>>> Sent: Tuesday, January 05, 2016 7:05 PM
>>> To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>netdev@vger.kernel.org;
>>> Shteinbock, Boris (Wind River)
>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>
>>> On 01/06/2016 09:26 AM, Tantilov, Emil S wrote:
>>>>> -----Original Message-----
>>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>owner@vger.kernel.org]
>>> On
>>>>> Behalf Of zhuyj
>>>>> Sent: Monday, December 28, 2015 1:19 AM
>>>>> To: Michal Kubecek; Jay Vosburgh
>>>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>>> netdev@vger.kernel.org;
>>>>> Shteinbock, Boris (Wind River)
>>>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>>>
>>>>> On 12/28/2015 04:43 PM, Michal Kubecek wrote:
>>>>>> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>>>>>>> <zyjzyj2000@gmail.com> wrote:
>>>>>>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>>>>>>> there is a time span between NIC up state and getting speed and
>>> duplex.
>>>>>>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>>>>>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>>>>>>> work well.
>>>>>>>> To make bonding driver be compatible with more NICs, it is
>>>>>>>> necessary to restrict the up state in 802.3ad mode.
>>>>>>> 	What device is this?  It seems a bit odd that an Ethernet
>device
>>>>>>> can be carrier up but not have the duplex and speed available.
>>>>>> ...
>>>>>>> 	In general, though, bonding expects a speed or duplex change to
>>>>>>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>>>>>>> propagate to the 802.3ad logic.
>>>>>>>
>>>>>>> 	If the device here is going carrier up prior to having speed or
>>>>>>> duplex available, then maybe it should call netdev_state_change()
>when
>>>>>>> the duplex and speed are available, or delay calling
>>> netif_carrier_on().
>>>>>> I have encountered this problem (NIC having carrier on before being
>>> able
>>>>>> to detect speed/duplex and driver not notifying when speed/duplex
>>>>>> becomes available) with netxen cards earlier. But it was eventually
>>>>>> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
>>>>>> handling.") so this example rather supports what you said.
>>>>>>
>>>>>>                                                              Michal
>>> Kubecek
>>>>> Thanks a lot.
>>>>> I checked the commit 9d01412ae76f ("netxen: Fix link event
>>>>> handling."). The symptoms are the same with mine.
>>>>>
>>>>> The root cause is different. In my problem, the root cause is that
>LINKS
>>>>> register[]  can not provide link_up and link_speed at the same time.
>>>>> There is a time span between link_up and link_speed.
>>>> The LINK_UP and LINK_SPEED bits in the LINKS register for ixgbe HW are
>>> updated
>>>> simultaneously. Do you have any proof to show the delay you are
>referring
>>> to
>>>> as I am sure our HW engineers would like to know about it.
>>> Sorry. I can not reproduce this problem locally. What I have is the
>>> feedback from the customer.
>> So you are assuming that there is a delay due to the issue you are
>seeing?
>
>Sure. Before I get the further feedback from the customer, I can not
>make further conclusion.
>My patch is based on the feedback from the customer.

Your patch is throwing an RTNL assertion warning:

RTNL: assertion failed at net/core/ethtool.c (357)

Looks like you may need to hold an RTNL lock for the slave before calling
bond_update_speed_duplex(), though I am not sure if it's a good idea in
general. 

Thanks,
Emil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  5:02               ` Tantilov, Emil S
@ 2016-01-07  6:15                 ` zyjzyj2000
  2016-01-07  6:22                   ` zhuyj
                                     ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: zyjzyj2000 @ 2016-01-07  6:15 UTC (permalink / raw)
  To: emil.s.tantilov, mkubecek, jay.vosburgh
  Cc: vfalico, gospo, netdev, boris.shteinbock

From: Zhu Yanjun <yanjun.zhu@windriver.com>

In 802.3ad mode, the speed and duplex is needed. But in some NIC,
there is a time span between NIC up state and getting speed and duplex.
As such, sometimes a slave in 802.3ad mode is in up state without
speed and duplex. This will make bonding in 802.3ad mode can not
work well.
To make bonding driver be compatible with more NICs, it is
necessary to restrict the up state in 802.3ad mode.

Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com>
---
 drivers/net/bonding/bond_main.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 09f8a48..7df8af5 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond)
 
 		link_state = bond_check_dev_link(bond, slave->dev, 0);
 
+		if ((BMSR_LSTATUS == link_state) &&
+		    (BOND_MODE(bond) == BOND_MODE_8023AD)) {
+			rtnl_lock();
+			bond_update_speed_duplex(slave);
+			rtnl_unlock();
+			if ((slave->speed == SPEED_UNKNOWN) ||
+			    (slave->duplex == DUPLEX_UNKNOWN)) {
+				link_state = 0;
+				netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex");
+			}
+		}
 		switch (slave->link) {
 		case BOND_LINK_UP:
 			if (link_state)
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  6:15                 ` zyjzyj2000
@ 2016-01-07  6:22                   ` zhuyj
  2016-01-07  6:33                   ` Jay Vosburgh
  2016-01-07  6:53                   ` Michal Kubecek
  2 siblings, 0 replies; 52+ messages in thread
From: zhuyj @ 2016-01-07  6:22 UTC (permalink / raw)
  To: emil.s.tantilov, mkubecek, jay.vosburgh
  Cc: vfalico, gospo, netdev, boris.shteinbock

Hi, Emil

Would you like to help me to make tests with this patch?
If the root cause is not the time span, I will make a new patch for this.

Thanks a lot.
Zhu Yanjun

On 01/07/2016 02:15 PM, zyjzyj2000@gmail.com wrote:
> From: Zhu Yanjun <yanjun.zhu@windriver.com>
>
> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
> there is a time span between NIC up state and getting speed and duplex.
> As such, sometimes a slave in 802.3ad mode is in up state without
> speed and duplex. This will make bonding in 802.3ad mode can not
> work well.
> To make bonding driver be compatible with more NICs, it is
> necessary to restrict the up state in 802.3ad mode.
>
> Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com>
> ---
>   drivers/net/bonding/bond_main.c |   11 +++++++++++
>   1 file changed, 11 insertions(+)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 09f8a48..7df8af5 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond)
>   
>   		link_state = bond_check_dev_link(bond, slave->dev, 0);
>   
> +		if ((BMSR_LSTATUS == link_state) &&
> +		    (BOND_MODE(bond) == BOND_MODE_8023AD)) {
> +			rtnl_lock();
> +			bond_update_speed_duplex(slave);
> +			rtnl_unlock();
> +			if ((slave->speed == SPEED_UNKNOWN) ||
> +			    (slave->duplex == DUPLEX_UNKNOWN)) {
> +				link_state = 0;
> +				netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex");
> +			}
> +		}
>   		switch (slave->link) {
>   		case BOND_LINK_UP:
>   			if (link_state)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  6:15                 ` zyjzyj2000
  2016-01-07  6:22                   ` zhuyj
@ 2016-01-07  6:33                   ` Jay Vosburgh
  2016-01-07 15:27                     ` Tantilov, Emil S
                                       ` (2 more replies)
  2016-01-07  6:53                   ` Michal Kubecek
  2 siblings, 3 replies; 52+ messages in thread
From: Jay Vosburgh @ 2016-01-07  6:33 UTC (permalink / raw)
  To: zyjzyj2000
  Cc: emil.s.tantilov, mkubecek, vfalico, gospo, netdev, boris.shteinbock

<zyjzyj2000@gmail.com> wrote:

>From: Zhu Yanjun <yanjun.zhu@windriver.com>
>
>In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>there is a time span between NIC up state and getting speed and duplex.
>As such, sometimes a slave in 802.3ad mode is in up state without
>speed and duplex. This will make bonding in 802.3ad mode can not
>work well.

	From my reading of Emil's comments in the discussion, I'm not
sure the above is an accurate description of the problem.  If I'm
understanding correctly, the cause is due to link flaps racing with the
bonding monitor workqueue polling the state.  Is this correct?

>To make bonding driver be compatible with more NICs, it is
>necessary to restrict the up state in 802.3ad mode.
>
>Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com>
>---
> drivers/net/bonding/bond_main.c |   11 +++++++++++
> 1 file changed, 11 insertions(+)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 09f8a48..7df8af5 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond)
> 
> 		link_state = bond_check_dev_link(bond, slave->dev, 0);
> 
>+		if ((BMSR_LSTATUS == link_state) &&
>+		    (BOND_MODE(bond) == BOND_MODE_8023AD)) {
>+			rtnl_lock();
>+			bond_update_speed_duplex(slave);
>+			rtnl_unlock();

	This will add a round trip on the RTNL mutex for every miimon
interval when the slave is carrier up.  At common miimon rates (10 - 50
ms), this will hit RTNL between 20 and 100 times per second.  I do not
see how this is acceptable.

	I believe the proper solution here is to supplant the periodic
miimon polling from bonding with link state detection based on notifiers
(As Stephen suggested, not for the first time).

	My suggestion is to have bonding set slave link state based on
notifiers if miimon is set to zero, and poll as usual if it is not.
This would preserve any backwards compatibility with any device out
there that might possibly still be doing netif_carrier_on/off
incorrectly or not at all.  The only minor complication is synchronizing
notifier carrier state detection with the ARP monitor.

	This should have been done a long time ago; I'll work something
up tomorrow (it's late here right now) and post a patch for testing.

	-J

>+			if ((slave->speed == SPEED_UNKNOWN) ||
>+			    (slave->duplex == DUPLEX_UNKNOWN)) {
>+				link_state = 0;
>+				netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex");
>+			}
>+		}
> 		switch (slave->link) {
> 		case BOND_LINK_UP:
> 			if (link_state)
>-- 
>1.7.9.5
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  6:15                 ` zyjzyj2000
  2016-01-07  6:22                   ` zhuyj
  2016-01-07  6:33                   ` Jay Vosburgh
@ 2016-01-07  6:53                   ` Michal Kubecek
  2016-01-07  7:37                     ` zhuyj
  2 siblings, 1 reply; 52+ messages in thread
From: Michal Kubecek @ 2016-01-07  6:53 UTC (permalink / raw)
  To: zyjzyj2000
  Cc: emil.s.tantilov, jay.vosburgh, vfalico, gospo, netdev, boris.shteinbock

On Thu, Jan 07, 2016 at 02:15:13PM +0800, zyjzyj2000@gmail.com wrote:
> From: Zhu Yanjun <yanjun.zhu@windriver.com>
> 
> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
> there is a time span between NIC up state and getting speed and duplex.
> As such, sometimes a slave in 802.3ad mode is in up state without
> speed and duplex. This will make bonding in 802.3ad mode can not
> work well.
> To make bonding driver be compatible with more NICs, it is
> necessary to restrict the up state in 802.3ad mode.
> 
> Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com>
> ---
>  drivers/net/bonding/bond_main.c |   11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 09f8a48..7df8af5 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond)
>  
>  		link_state = bond_check_dev_link(bond, slave->dev, 0);
>  
> +		if ((BMSR_LSTATUS == link_state) &&
> +		    (BOND_MODE(bond) == BOND_MODE_8023AD)) {
> +			rtnl_lock();
> +			bond_update_speed_duplex(slave);
> +			rtnl_unlock();
> +			if ((slave->speed == SPEED_UNKNOWN) ||
> +			    (slave->duplex == DUPLEX_UNKNOWN)) {
> +				link_state = 0;
> +				netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex");

If I read this right, whenever this state (link up but speed/duplex
unknown) is entered, you'll keep writing this message into kernel log
every miimon milliseconds until something changes. I'm not sure how long
a NIC can stay in such state but it might get quite annoying (even more
if something really goes wrong and NIC stays that way which can't be
completely ruled out, IMHO).


> +			}
> +		}
>  		switch (slave->link) {
>  		case BOND_LINK_UP:
>  			if (link_state)

BtW, you accidentally submitted this patch twice.

                                                          Michal Kubecek

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  6:53                   ` Michal Kubecek
@ 2016-01-07  7:37                     ` zhuyj
  2016-01-07  7:59                       ` Michal Kubecek
  0 siblings, 1 reply; 52+ messages in thread
From: zhuyj @ 2016-01-07  7:37 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: emil.s.tantilov, jay.vosburgh, vfalico, gospo, netdev, boris.shteinbock

On 01/07/2016 02:53 PM, Michal Kubecek wrote:
> On Thu, Jan 07, 2016 at 02:15:13PM +0800, zyjzyj2000@gmail.com wrote:
>> From: Zhu Yanjun <yanjun.zhu@windriver.com>
>>
>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>> there is a time span between NIC up state and getting speed and duplex.
>> As such, sometimes a slave in 802.3ad mode is in up state without
>> speed and duplex. This will make bonding in 802.3ad mode can not
>> work well.
>> To make bonding driver be compatible with more NICs, it is
>> necessary to restrict the up state in 802.3ad mode.
>>
>> Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com>
>> ---
>>   drivers/net/bonding/bond_main.c |   11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 09f8a48..7df8af5 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond)
>>   
>>   		link_state = bond_check_dev_link(bond, slave->dev, 0);
>>   
>> +		if ((BMSR_LSTATUS == link_state) &&
>> +		    (BOND_MODE(bond) == BOND_MODE_8023AD)) {
>> +			rtnl_lock();
>> +			bond_update_speed_duplex(slave);
>> +			rtnl_unlock();
>> +			if ((slave->speed == SPEED_UNKNOWN) ||
>> +			    (slave->duplex == DUPLEX_UNKNOWN)) {
>> +				link_state = 0;
>> +				netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex");
> If I read this right, whenever this state (link up but speed/duplex
> unknown) is entered, you'll keep writing this message into kernel log
> every miimon milliseconds until something changes. I'm not sure how long
> a NIC can stay in such state but it might get quite annoying (even more
> if something really goes wrong and NIC stays that way which can't be
> completely ruled out, IMHO).

Sure, Thanks a lot. I want to confirm link_up without link_speed. It is 
not usual. So I think this only lasts for several seconds.
It is very important to us since it can help us to find the root cause.

Zhu Yanjun

>
>
>> +			}
>> +		}
>>   		switch (slave->link) {
>>   		case BOND_LINK_UP:
>>   			if (link_state)
> BtW, you accidentally submitted this patch twice.
>
>                                                            Michal Kubecek

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  2:43           ` Tantilov, Emil S
  2016-01-07  3:33             ` zhuyj
@ 2016-01-07  7:47             ` zhuyj
  2016-01-07 18:28               ` Tantilov, Emil S
  1 sibling, 1 reply; 52+ messages in thread
From: zhuyj @ 2016-01-07  7:47 UTC (permalink / raw)
  To: Tantilov, Emil S, Michal Kubecek, Jay Vosburgh
  Cc: vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

On 01/07/2016 10:43 AM, Tantilov, Emil S wrote:
>> -----Original Message-----
>> From: zhuyj [mailto:zyjzyj2000@gmail.com]
>> Sent: Tuesday, January 05, 2016 7:05 PM
>> To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com; netdev@vger.kernel.org;
>> Shteinbock, Boris (Wind River)
>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>
>> On 01/06/2016 09:26 AM, Tantilov, Emil S wrote:
>>>> -----Original Message-----
>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>> On
>>>> Behalf Of zhuyj
>>>> Sent: Monday, December 28, 2015 1:19 AM
>>>> To: Michal Kubecek; Jay Vosburgh
>>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>> netdev@vger.kernel.org;
>>>> Shteinbock, Boris (Wind River)
>>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>>
>>>> On 12/28/2015 04:43 PM, Michal Kubecek wrote:
>>>>> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>>>>>> <zyjzyj2000@gmail.com> wrote:
>>>>>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>>>>>> there is a time span between NIC up state and getting speed and
>> duplex.
>>>>>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>>>>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>>>>>> work well.
>>>>>>> To make bonding driver be compatible with more NICs, it is
>>>>>>> necessary to restrict the up state in 802.3ad mode.
>>>>>> 	What device is this?  It seems a bit odd that an Ethernet device
>>>>>> can be carrier up but not have the duplex and speed available.
>>>>> ...
>>>>>> 	In general, though, bonding expects a speed or duplex change to
>>>>>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>>>>>> propagate to the 802.3ad logic.
>>>>>>
>>>>>> 	If the device here is going carrier up prior to having speed or
>>>>>> duplex available, then maybe it should call netdev_state_change() when
>>>>>> the duplex and speed are available, or delay calling
>> netif_carrier_on().
>>>>> I have encountered this problem (NIC having carrier on before being
>> able
>>>>> to detect speed/duplex and driver not notifying when speed/duplex
>>>>> becomes available) with netxen cards earlier. But it was eventually
>>>>> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
>>>>> handling.") so this example rather supports what you said.
>>>>>
>>>>>                                                              Michal
>> Kubecek
>>>> Thanks a lot.
>>>> I checked the commit 9d01412ae76f ("netxen: Fix link event
>>>> handling."). The symptoms are the same with mine.
>>>>
>>>> The root cause is different. In my problem, the root cause is that LINKS
>>>> register[]  can not provide link_up and link_speed at the same time.
>>>> There is a time span between link_up and link_speed.
>>> The LINK_UP and LINK_SPEED bits in the LINKS register for ixgbe HW are
>> updated
>>> simultaneously. Do you have any proof to show the delay you are referring
>> to
>>> as I am sure our HW engineers would like to know about it.
>> Sorry. I can not reproduce this problem locally. What I have is the
>> feedback from the customer.
> So you are assuming that there is a delay due to the issue you are seeing?
>
>> Settings for eth0:
>>     Supported ports: [ TP ]
>>     Supported link modes:   100baseT/Full
>>                             1000baseT/Full
>>                             10000baseT/Full
>>     Supported pause frame use: No
>>     Supports auto-negotiation: Yes
>>     Advertised link modes:  100baseT/Full
>>                             1000baseT/Full
>>                             10000baseT/Full
>>     Advertised pause frame use: No
>>     Advertised auto-negotiation: Yes
>>     Speed: Unknown!
>>     Duplex: Unknown! (255)
>>     Port: Twisted Pair
>>     PHYAD: 0
>>     Transceiver: external
>>     Auto-negotiation: on
>>     MDI-X: Unknown
>>     Supports Wake-on: d
>>     Wake-on: d
>>     Current message level: 0x00000007 (7)
>>                    drv probe link
>>     Link detected: yes
> The speed and the link state here are reported from
> different sources:
>
>>     Link detected: yes
> Comes from a netif_carrier_ok() check. This is done via ethtool_op_get_link().
>
> Only the speed is reported through the LINKS register - that is why it is reported
> as "Unknown" - in other words link_up is false.
>
> This is a trace from the case where the bonding driver reports 0 Mbps:
>
>     kworker/u48:1-27950 [010] ....  6493.084916: ixgbe_service_task: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [011] ....  6493.184894: ixgbe_service_task: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [000] ....  6494.439883: ixgbe_service_task: eth1: link_speed = 80, link_up = true
>     kworker/u48:1-27950 [000] ....  6494.464204: ixgbe_service_task: eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>       kworker/0:2-1926  [000] ....  6494.464249: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
>    NetworkManager-3819  [008] ....  6494.464484: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [007] ....  6494.496886: bond_mii_monitor: bond0: link status definitely up for interface eth1, 0 Mbps full duplex
>    NetworkManager-3819  [008] ....  6494.496967: ixgbe_get_settings: eth1: link_speed = 80, link_up = false
>     kworker/u48:1-27950 [008] ....  6495.288798: ixgbe_service_task: eth1: link_speed = 80, link_up = true
>     kworker/u48:1-27950 [008] ....  6495.388806: ixgbe_service_task: eth1: link_speed = 80, link_up = true

Hi, Emil

Thanks for your feedback.
 From your log, I think the following can explain why bonding driver can 
not get speed.

bonding                           ixgbe
.                                   .
.      <-----------------------   NETDEV_UP
.                                   .
bond_slave_netdev_event           NETDEV_DOWN
.                                   .
.                                   .
.                                   .
NETDEV_UP                           .
.              ----------------> get_settings
                                     .
speed unknown  <---------------  link_up false
.
.
link_up = true
link_speed = unknown

In the above, ixgbe is up and bonding gets this message, then bonding 
calls bond_slave_netdev_event while ixgbe is down.
In bond_slave_netdev_event, bonding call get_settings in ixgbe to get 
link_speed. Since now ixgbe is down, so link_speed is
unknown. In the end, bonding get the final state of ixgbe as link_up 
without link_speed.

If you agree with me, would you like to help me to make tests with the 
following patch?

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index d681273..3efc4d8 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -285,27 +285,24 @@ static int ixgbe_get_settings(struct net_device 
*netdev,
         }

         hw->mac.ops.check_link(hw, &link_speed, &link_up, false);
-       if (link_up) {
-               switch (link_speed) {
-               case IXGBE_LINK_SPEED_10GB_FULL:
-                       ethtool_cmd_speed_set(ecmd, SPEED_10000);
-                       break;
-               case IXGBE_LINK_SPEED_2_5GB_FULL:
-                       ethtool_cmd_speed_set(ecmd, SPEED_2500);
-                       break;
-               case IXGBE_LINK_SPEED_1GB_FULL:
-                       ethtool_cmd_speed_set(ecmd, SPEED_1000);
-                       break;
-               case IXGBE_LINK_SPEED_100_FULL:
-                       ethtool_cmd_speed_set(ecmd, SPEED_100);
-                       break;
-               default:
-                       break;
-               }
-               ecmd->duplex = DUPLEX_FULL;
-       } else {
-               ethtool_cmd_speed_set(ecmd, SPEED_UNKNOWN);
+
+       ecmd->duplex = DUPLEX_FULL;
+       switch (link_speed) {
+       case IXGBE_LINK_SPEED_10GB_FULL:
+               ethtool_cmd_speed_set(ecmd, SPEED_10000);
+               break;
+       case IXGBE_LINK_SPEED_2_5GB_FULL:
+               ethtool_cmd_speed_set(ecmd, SPEED_2500);
+               break;
+       case IXGBE_LINK_SPEED_1GB_FULL:
+               ethtool_cmd_speed_set(ecmd, SPEED_1000);
+               break;
+       case IXGBE_LINK_SPEED_100_FULL:
+               ethtool_cmd_speed_set(ecmd, SPEED_100);
+               break;
+       default:
                 ecmd->duplex = DUPLEX_UNKNOWN;
+               break;
         }

         return 0;

Thanks a lot.
Zhu Yanjun

>
> As you can see the link is initially established, but then lost and if just so happens that the
> bonding driver is checking it at that time it will report 0 Mbps.
>
> I will give your patch a try and see if it helps in this situation.
>
> Thanks,
> Emil
>
>

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  7:37                     ` zhuyj
@ 2016-01-07  7:59                       ` Michal Kubecek
  2016-01-07  8:35                         ` zhuyj
  0 siblings, 1 reply; 52+ messages in thread
From: Michal Kubecek @ 2016-01-07  7:59 UTC (permalink / raw)
  To: zhuyj
  Cc: emil.s.tantilov, jay.vosburgh, vfalico, gospo, netdev, boris.shteinbock

On Thu, Jan 07, 2016 at 03:37:26PM +0800, zhuyj wrote:
> >If I read this right, whenever this state (link up but speed/duplex
> >unknown) is entered, you'll keep writing this message into kernel log
> >every miimon milliseconds until something changes. I'm not sure how long
> >a NIC can stay in such state but it might get quite annoying (even more
> >if something really goes wrong and NIC stays that way which can't be
> >completely ruled out, IMHO).
> 
> Sure, Thanks a lot. I want to confirm link_up without link_speed. It
> is not usual. So I think this only lasts for several seconds.
> It is very important to us since it can help us to find the root cause.

For debugging purposes it's fine, of course. But this looked like an
officially submitted patch so I didn't like the idea of log spamming
(even one second could result in 10-100 messages and admins certainly
would hate that).

                                                       Michal Kubecek

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  7:59                       ` Michal Kubecek
@ 2016-01-07  8:35                         ` zhuyj
  0 siblings, 0 replies; 52+ messages in thread
From: zhuyj @ 2016-01-07  8:35 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: emil.s.tantilov, jay.vosburgh, vfalico, gospo, netdev, boris.shteinbock

On 01/07/2016 03:59 PM, Michal Kubecek wrote:
> On Thu, Jan 07, 2016 at 03:37:26PM +0800, zhuyj wrote:
>>> If I read this right, whenever this state (link up but speed/duplex
>>> unknown) is entered, you'll keep writing this message into kernel log
>>> every miimon milliseconds until something changes. I'm not sure how long
>>> a NIC can stay in such state but it might get quite annoying (even more
>>> if something really goes wrong and NIC stays that way which can't be
>>> completely ruled out, IMHO).
>> Sure, Thanks a lot. I want to confirm link_up without link_speed. It
>> is not usual. So I think this only lasts for several seconds.
>> It is very important to us since it can help us to find the root cause.
> For debugging purposes it's fine, of course. But this looked like an
> officially submitted patch so I didn't like the idea of log spamming
> (even one second could result in 10-100 messages and admins certainly
> would hate that).
>
>                                                         Michal Kubecek
>
Thanks a lot.

Zhu Yanjun

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  6:33                   ` Jay Vosburgh
@ 2016-01-07 15:27                     ` Tantilov, Emil S
  2016-01-08  1:28                     ` [RFC PATCH net-next] bonding: Use notifiers for slave link state detection Jay Vosburgh
  2016-01-08  2:29                     ` [PATCH 1/1] bonding: restrict up state in 802.3ad mode zhuyj
  2 siblings, 0 replies; 52+ messages in thread
From: Tantilov, Emil S @ 2016-01-07 15:27 UTC (permalink / raw)
  To: Jay Vosburgh, zyjzyj2000
  Cc: mkubecek, vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

>-----Original Message-----
>From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com]
>Sent: Wednesday, January 06, 2016 10:34 PM
>To: zyjzyj2000@gmail.com
>Cc: Tantilov, Emil S; mkubecek@suse.cz; vfalico@gmail.com;
>gospo@cumulusnetworks.com; netdev@vger.kernel.org; Shteinbock, Boris (Wind
>River)
>Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>
><zyjzyj2000@gmail.com> wrote:
>
>>From: Zhu Yanjun <yanjun.zhu@windriver.com>
>>
>>In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>there is a time span between NIC up state and getting speed and duplex.
>>As such, sometimes a slave in 802.3ad mode is in up state without
>>speed and duplex. This will make bonding in 802.3ad mode can not
>>work well.
>
>	From my reading of Emil's comments in the discussion, I'm not
>sure the above is an accurate description of the problem.  If I'm
>understanding correctly, the cause is due to link flaps racing with the
>bonding monitor workqueue polling the state.  Is this correct?

That is correct.

>>To make bonding driver be compatible with more NICs, it is
>>necessary to restrict the up state in 802.3ad mode.
>>
>>Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com>
>>---
>> drivers/net/bonding/bond_main.c |   11 +++++++++++
>> 1 file changed, 11 insertions(+)
>>
>>diff --git a/drivers/net/bonding/bond_main.c
>b/drivers/net/bonding/bond_main.c
>>index 09f8a48..7df8af5 100644
>>--- a/drivers/net/bonding/bond_main.c
>>+++ b/drivers/net/bonding/bond_main.c
>>@@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding
>*bond)
>>
>> 		link_state = bond_check_dev_link(bond, slave->dev, 0);
>>
>>+		if ((BMSR_LSTATUS == link_state) &&
>>+		    (BOND_MODE(bond) == BOND_MODE_8023AD)) {
>>+			rtnl_lock();
>>+			bond_update_speed_duplex(slave);
>>+			rtnl_unlock();
>
>	This will add a round trip on the RTNL mutex for every miimon
>interval when the slave is carrier up.  At common miimon rates (10 - 50
>ms), this will hit RTNL between 20 and 100 times per second.  I do not
>see how this is acceptable.
>
>	I believe the proper solution here is to supplant the periodic
>miimon polling from bonding with link state detection based on notifiers
>(As Stephen suggested, not for the first time).
>
>	My suggestion is to have bonding set slave link state based on
>notifiers if miimon is set to zero, and poll as usual if it is not.
>This would preserve any backwards compatibility with any device out
>there that might possibly still be doing netif_carrier_on/off
>incorrectly or not at all.  The only minor complication is synchronizing
>notifier carrier state detection with the ARP monitor.
>
>	This should have been done a long time ago; I'll work something
>up tomorrow (it's late here right now) and post a patch for testing.

That would be awesome. Looking forward to it.

Thanks,
Emil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  7:47             ` zhuyj
@ 2016-01-07 18:28               ` Tantilov, Emil S
  2016-01-08  6:09                 ` zhuyj
  0 siblings, 1 reply; 52+ messages in thread
From: Tantilov, Emil S @ 2016-01-07 18:28 UTC (permalink / raw)
  To: zhuyj, Michal Kubecek, Jay Vosburgh
  Cc: vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

>-----Original Message-----
>From: zhuyj [mailto:zyjzyj2000@gmail.com]
>Sent: Wednesday, January 06, 2016 11:47 PM
>To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>Cc: vfalico@gmail.com; gospo@cumulusnetworks.com; netdev@vger.kernel.org;
>Shteinbock, Boris (Wind River)
>Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>
>On 01/07/2016 10:43 AM, Tantilov, Emil S wrote:
>>> -----Original Message-----
>>> From: zhuyj [mailto:zyjzyj2000@gmail.com]
>>> Sent: Tuesday, January 05, 2016 7:05 PM
>>> To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>netdev@vger.kernel.org;
>>> Shteinbock, Boris (Wind River)
>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>
>>> On 01/06/2016 09:26 AM, Tantilov, Emil S wrote:
>>>>> -----Original Message-----
>>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>owner@vger.kernel.org]
>>> On
>>>>> Behalf Of zhuyj
>>>>> Sent: Monday, December 28, 2015 1:19 AM
>>>>> To: Michal Kubecek; Jay Vosburgh
>>>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>>> netdev@vger.kernel.org;
>>>>> Shteinbock, Boris (Wind River)
>>>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>>>
>>>>> On 12/28/2015 04:43 PM, Michal Kubecek wrote:
>>>>>> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>>>>>>> <zyjzyj2000@gmail.com> wrote:
>>>>>>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>>>>>>> there is a time span between NIC up state and getting speed and
>>> duplex.
>>>>>>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>>>>>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>>>>>>> work well.
>>>>>>>> To make bonding driver be compatible with more NICs, it is
>>>>>>>> necessary to restrict the up state in 802.3ad mode.
>>>>>>> 	What device is this?  It seems a bit odd that an Ethernet
>device
>>>>>>> can be carrier up but not have the duplex and speed available.
>>>>>> ...
>>>>>>> 	In general, though, bonding expects a speed or duplex change to
>>>>>>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>>>>>>> propagate to the 802.3ad logic.
>>>>>>>
>>>>>>> 	If the device here is going carrier up prior to having speed or
>>>>>>> duplex available, then maybe it should call netdev_state_change()
>when
>>>>>>> the duplex and speed are available, or delay calling
>>> netif_carrier_on().
>>>>>> I have encountered this problem (NIC having carrier on before being
>>> able
>>>>>> to detect speed/duplex and driver not notifying when speed/duplex
>>>>>> becomes available) with netxen cards earlier. But it was eventually
>>>>>> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
>>>>>> handling.") so this example rather supports what you said.
>>>>>>
>>>>>>                                                              Michal
>>> Kubecek
>>>>> Thanks a lot.
>>>>> I checked the commit 9d01412ae76f ("netxen: Fix link event
>>>>> handling."). The symptoms are the same with mine.
>>>>>
>>>>> The root cause is different. In my problem, the root cause is that
>LINKS
>>>>> register[]  can not provide link_up and link_speed at the same time.
>>>>> There is a time span between link_up and link_speed.
>>>> The LINK_UP and LINK_SPEED bits in the LINKS register for ixgbe HW are
>>> updated
>>>> simultaneously. Do you have any proof to show the delay you are
>referring
>>> to
>>>> as I am sure our HW engineers would like to know about it.
>>> Sorry. I can not reproduce this problem locally. What I have is the
>>> feedback from the customer.
>> So you are assuming that there is a delay due to the issue you are
>seeing?
>>
>>> Settings for eth0:
>>>     Supported ports: [ TP ]
>>>     Supported link modes:   100baseT/Full
>>>                             1000baseT/Full
>>>                             10000baseT/Full
>>>     Supported pause frame use: No
>>>     Supports auto-negotiation: Yes
>>>     Advertised link modes:  100baseT/Full
>>>                             1000baseT/Full
>>>                             10000baseT/Full
>>>     Advertised pause frame use: No
>>>     Advertised auto-negotiation: Yes
>>>     Speed: Unknown!
>>>     Duplex: Unknown! (255)
>>>     Port: Twisted Pair
>>>     PHYAD: 0
>>>     Transceiver: external
>>>     Auto-negotiation: on
>>>     MDI-X: Unknown
>>>     Supports Wake-on: d
>>>     Wake-on: d
>>>     Current message level: 0x00000007 (7)
>>>                    drv probe link
>>>     Link detected: yes
>> The speed and the link state here are reported from
>> different sources:
>>
>>>     Link detected: yes
>> Comes from a netif_carrier_ok() check. This is done via
>ethtool_op_get_link().
>>
>> Only the speed is reported through the LINKS register - that is why it is
>reported
>> as "Unknown" - in other words link_up is false.
>>
>> This is a trace from the case where the bonding driver reports 0 Mbps:
>>
>>     kworker/u48:1-27950 [010] ....  6493.084916: ixgbe_service_task:
>eth1: link_speed = 80, link_up = false
>>     kworker/u48:1-27950 [011] ....  6493.184894: ixgbe_service_task:
>eth1: link_speed = 80, link_up = false
>>     kworker/u48:1-27950 [000] ....  6494.439883: ixgbe_service_task:
>eth1: link_speed = 80, link_up = true
>>     kworker/u48:1-27950 [000] ....  6494.464204: ixgbe_service_task:
>eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>>       kworker/0:2-1926  [000] ....  6494.464249: ixgbe_get_settings:
>eth1: link_speed = 80, link_up = false
>>    NetworkManager-3819  [008] ....  6494.464484: ixgbe_get_settings:
>eth1: link_speed = 80, link_up = false
>>     kworker/u48:1-27950 [007] ....  6494.496886: bond_mii_monitor: bond0:
>link status definitely up for interface eth1, 0 Mbps full duplex
>>    NetworkManager-3819  [008] ....  6494.496967: ixgbe_get_settings:
>eth1: link_speed = 80, link_up = false
>>     kworker/u48:1-27950 [008] ....  6495.288798: ixgbe_service_task:
>eth1: link_speed = 80, link_up = true
>>     kworker/u48:1-27950 [008] ....  6495.388806: ixgbe_service_task:
>eth1: link_speed = 80, link_up = true
>
>Hi, Emil
>
>Thanks for your feedback.
> From your log, I think the following can explain why bonding driver can
>not get speed.
>
>bonding                           ixgbe
>.                                   .
>.      <-----------------------   NETDEV_UP
>.                                   .
>bond_slave_netdev_event           NETDEV_DOWN
>.                                   .
>.                                   .
>.                                   .
>NETDEV_UP                           .
>.              ----------------> get_settings
>                                     .
>speed unknown  <---------------  link_up false
>.
>.
>link_up = true
>link_speed = unknown
>
>In the above, ixgbe is up and bonding gets this message, then bonding
>calls bond_slave_netdev_event while ixgbe is down.
>In bond_slave_netdev_event, bonding call get_settings in ixgbe to get
>link_speed. Since now ixgbe is down, so link_speed is
>unknown. In the end, bonding get the final state of ixgbe as link_up
>without link_speed.
>
>If you agree with me, would you like to help me to make tests with the
>following patch?
>
>diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>index d681273..3efc4d8 100644
>--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>@@ -285,27 +285,24 @@ static int ixgbe_get_settings(struct net_device
>*netdev,
>         }
>
>         hw->mac.ops.check_link(hw, &link_speed, &link_up, false);
>-       if (link_up) {
>-               switch (link_speed) {
>-               case IXGBE_LINK_SPEED_10GB_FULL:
>-                       ethtool_cmd_speed_set(ecmd, SPEED_10000);
>-                       break;
>-               case IXGBE_LINK_SPEED_2_5GB_FULL:
>-                       ethtool_cmd_speed_set(ecmd, SPEED_2500);
>-                       break;
>-               case IXGBE_LINK_SPEED_1GB_FULL:
>-                       ethtool_cmd_speed_set(ecmd, SPEED_1000);
>-                       break;
>-               case IXGBE_LINK_SPEED_100_FULL:
>-                       ethtool_cmd_speed_set(ecmd, SPEED_100);
>-                       break;
>-               default:
>-                       break;
>-               }
>-               ecmd->duplex = DUPLEX_FULL;
>-       } else {
>-               ethtool_cmd_speed_set(ecmd, SPEED_UNKNOWN);
>+
>+       ecmd->duplex = DUPLEX_FULL;
>+       switch (link_speed) {
>+       case IXGBE_LINK_SPEED_10GB_FULL:
>+               ethtool_cmd_speed_set(ecmd, SPEED_10000);
>+               break;
>+       case IXGBE_LINK_SPEED_2_5GB_FULL:
>+               ethtool_cmd_speed_set(ecmd, SPEED_2500);
>+               break;
>+       case IXGBE_LINK_SPEED_1GB_FULL:
>+               ethtool_cmd_speed_set(ecmd, SPEED_1000);
>+               break;
>+       case IXGBE_LINK_SPEED_100_FULL:
>+               ethtool_cmd_speed_set(ecmd, SPEED_100);
>+               break;
>+       default:
>                 ecmd->duplex = DUPLEX_UNKNOWN;
>+               break;
>         }
>
>         return 0;

This will break speed reporting. You cannot ignore link_up.
The speed is only valid when the link_up bit is set.

Thanks,
Emil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [RFC PATCH net-next] bonding: Use notifiers for slave link state detection
  2016-01-07  6:33                   ` Jay Vosburgh
  2016-01-07 15:27                     ` Tantilov, Emil S
@ 2016-01-08  1:28                     ` Jay Vosburgh
  2016-01-08  4:36                       ` zhuyj
  2016-01-09  1:35                       ` [RFC PATCH net-next] bonding: Use notifiers for slave link state detection Tantilov, Emil S
  2016-01-08  2:29                     ` [PATCH 1/1] bonding: restrict up state in 802.3ad mode zhuyj
  2 siblings, 2 replies; 52+ messages in thread
From: Jay Vosburgh @ 2016-01-08  1:28 UTC (permalink / raw)
  To: zyjzyj2000, emil.s.tantilov
  Cc: mkubecek, vfalico, gospo, netdev, boris.shteinbock


	TEST PATCH

	This patch modifies bonding to utilize notifier callbacks to
detect slave link state changes.  It is intended to be used with miimon
set to zero, and does not support the updelay or downdelay options to
bonding.  It's not as complicated as it looks; most of the change set is
to break out the inner loop of bond_miimon_inspect into its own
function.

	Yanjun, can you test this with miimon=0 and see if it changes
the behavior you're seeing?

	Thanks,

	-J


diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index cab99fd..6fe68b1 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2012,104 +2012,103 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
 /*-------------------------------- Monitoring -------------------------------*/
 
 /* called with rcu_read_lock() */
-static int bond_miimon_inspect(struct bonding *bond)
+static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave)
 {
-	int link_state, commit = 0;
-	struct list_head *iter;
-	struct slave *slave;
+	int link_state;
 	bool ignore_updelay;
 
 	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
 
-	bond_for_each_slave_rcu(bond, slave, iter) {
-		slave->new_link = BOND_LINK_NOCHANGE;
+	slave->new_link = BOND_LINK_NOCHANGE;
 
-		link_state = bond_check_dev_link(bond, slave->dev, 0);
+	link_state = bond_check_dev_link(bond, slave->dev, 0);
 
-		switch (slave->link) {
-		case BOND_LINK_UP:
-			if (link_state)
-				continue;
+	switch (slave->link) {
+	case BOND_LINK_UP:
+		if (link_state)
+			return 0;
 
-			bond_set_slave_link_state(slave, BOND_LINK_FAIL,
+		bond_set_slave_link_state(slave, BOND_LINK_FAIL,
+					  BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.downdelay;
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
+				    (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ?
+				    (bond_is_active_slave(slave) ?
+				     "active " : "backup ") : "",
+				    slave->dev->name,
+				    bond->params.downdelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_FAIL:
+		if (link_state) {
+			/* recovered before downdelay expired */
+			bond_set_slave_link_state(slave, BOND_LINK_UP,
 						  BOND_SLAVE_NOTIFY_LATER);
-			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
-					    (BOND_MODE(bond) ==
-					     BOND_MODE_ACTIVEBACKUP) ?
-					     (bond_is_active_slave(slave) ?
-					      "active " : "backup ") : "",
-					    slave->dev->name,
-					    bond->params.downdelay * bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_FAIL:
-			if (link_state) {
-				/* recovered before downdelay expired */
-				bond_set_slave_link_state(slave, BOND_LINK_UP,
-							  BOND_SLAVE_NOTIFY_LATER);
-				slave->last_link_up = jiffies;
-				netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
-					    (bond->params.downdelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
-				continue;
-			}
+			slave->last_link_up = jiffies;
+			netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
+				    (bond->params.downdelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_DOWN;
-				commit++;
-				continue;
-			}
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_DOWN;
+			return 1;
+		}
 
-			slave->delay--;
-			break;
+		slave->delay--;
+		break;
 
-		case BOND_LINK_DOWN:
-			if (!link_state)
-				continue;
+	case BOND_LINK_DOWN:
+		if (!link_state)
+			return 0;
 
-			bond_set_slave_link_state(slave, BOND_LINK_BACK,
-						  BOND_SLAVE_NOTIFY_LATER);
-			slave->delay = bond->params.updelay;
-
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
-					    slave->dev->name,
-					    ignore_updelay ? 0 :
-					    bond->params.updelay *
-					    bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_BACK:
-			if (!link_state) {
-				bond_set_slave_link_state(slave,
-							  BOND_LINK_DOWN,
-							  BOND_SLAVE_NOTIFY_LATER);
-				netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
-					    (bond->params.updelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
+		bond_set_slave_link_state(slave, BOND_LINK_BACK,
+					  BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.updelay;
 
-				continue;
-			}
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
+				    slave->dev->name, ignore_updelay ? 0 :
+				    bond->params.updelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_BACK:
+		if (!link_state) {
+			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
+						  BOND_SLAVE_NOTIFY_LATER);
+			netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
+				    (bond->params.updelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
 
-			if (ignore_updelay)
-				slave->delay = 0;
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_UP;
-				commit++;
-				ignore_updelay = false;
-				continue;
-			}
+		if (ignore_updelay)
+			slave->delay = 0;
 
-			slave->delay--;
-			break;
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_UP;
+			return 1;
 		}
+
+		slave->delay--;
+		break;
 	}
 
+	return 0;
+}
+
+static int bond_miimon_inspect(struct bonding *bond)
+{
+	struct list_head *iter;
+	struct slave *slave;
+	int commit = 0;
+
+	bond_for_each_slave_rcu(bond, slave, iter)
+		commit += bond_miimon_inspect_slave(bond, slave);
+
 	return commit;
 }
 
@@ -3016,6 +3015,9 @@ static int bond_slave_netdev_event(unsigned long event,
 			bond_3ad_adapter_speed_duplex_changed(slave);
 		/* Fallthrough */
 	case NETDEV_DOWN:
+		if (bond_miimon_inspect_slave(bond, slave))
+			bond_miimon_commit(bond);
+
 		/* Refresh slave-array if applicable!
 		 * If the setup does not use miimon or arpmon (mode-specific!),
 		 * then these events will not cause the slave-array to be


---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07  6:33                   ` Jay Vosburgh
  2016-01-07 15:27                     ` Tantilov, Emil S
  2016-01-08  1:28                     ` [RFC PATCH net-next] bonding: Use notifiers for slave link state detection Jay Vosburgh
@ 2016-01-08  2:29                     ` zhuyj
  2 siblings, 0 replies; 52+ messages in thread
From: zhuyj @ 2016-01-08  2:29 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: emil.s.tantilov, mkubecek, vfalico, gospo, netdev, boris.shteinbock

On 01/07/2016 02:33 PM, Jay Vosburgh wrote:
> <zyjzyj2000@gmail.com> wrote:
>
>> From: Zhu Yanjun <yanjun.zhu@windriver.com>
>>
>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>> there is a time span between NIC up state and getting speed and duplex.
>> As such, sometimes a slave in 802.3ad mode is in up state without
>> speed and duplex. This will make bonding in 802.3ad mode can not
>> work well.
> 	From my reading of Emil's comments in the discussion, I'm not
> sure the above is an accurate description of the problem.  If I'm
> understanding correctly, the cause is due to link flaps racing with the
> bonding monitor workqueue polling the state.  Is this correct?
The following are from my user. What I have done is based on it.
"
Here's one theory that would seem to match my observations:

It seems that the x540T can sometimes report 'link up' a few moments 
before the speed and duplex are known. If the bond driver reads and 
stores the speed and duplex immediately after the link becomes 'up', it 
may occasionally miss the actual parameters by reading them before they 
are ready. If the parameters are only read when the link state changes, 
the 'unknown' status will stay until next state change happens.

I have attached a file 'test.log' that shows the kernel log and states 
of the relevant interfaces after the problem is hit. It shows that the 
final state has all the links up and speeds are known on individual 
interfaces, but bond0 shows only single interface speed. After 
successful negotiation bond0 shows the aggregate speed i.e. 20000Mb/s. 
In the end of the file there is bunch of ethtool runs that were taken in 
a tight loop during the negotiation. It shows in the end that the link 
becomes up some time before the speed and duplex are actually known. If 
the bond driver only reads the speed and duplex during this window, it 
will get it wrong and it won't be corrected when the real speed becomes 
known since the link state won't change at that time.
"

Zhu Yanjun
>
>> To make bonding driver be compatible with more NICs, it is
>> necessary to restrict the up state in 802.3ad mode.
>>
>> Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com>
>> ---
>> drivers/net/bonding/bond_main.c |   11 +++++++++++
>> 1 file changed, 11 insertions(+)
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 09f8a48..7df8af5 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -1991,6 +1991,17 @@ static int bond_miimon_inspect(struct bonding *bond)
>>
>> 		link_state = bond_check_dev_link(bond, slave->dev, 0);
>>
>> +		if ((BMSR_LSTATUS == link_state) &&
>> +		    (BOND_MODE(bond) == BOND_MODE_8023AD)) {
>> +			rtnl_lock();
>> +			bond_update_speed_duplex(slave);
>> +			rtnl_unlock();
> 	This will add a round trip on the RTNL mutex for every miimon
> interval when the slave is carrier up.  At common miimon rates (10 - 50
> ms), this will hit RTNL between 20 and 100 times per second.  I do not
> see how this is acceptable.
>
> 	I believe the proper solution here is to supplant the periodic
> miimon polling from bonding with link state detection based on notifiers
> (As Stephen suggested, not for the first time).
>
> 	My suggestion is to have bonding set slave link state based on
> notifiers if miimon is set to zero, and poll as usual if it is not.
> This would preserve any backwards compatibility with any device out
> there that might possibly still be doing netif_carrier_on/off
> incorrectly or not at all.  The only minor complication is synchronizing
> notifier carrier state detection with the ARP monitor.
>
> 	This should have been done a long time ago; I'll work something
> up tomorrow (it's late here right now) and post a patch for testing.
>
> 	-J
>
>> +			if ((slave->speed == SPEED_UNKNOWN) ||
>> +			    (slave->duplex == DUPLEX_UNKNOWN)) {
>> +				link_state = 0;
>> +				netdev_info(bond->dev, "In 802.3ad mode, it is not enough to up without speed and duplex");
>> +			}
>> +		}
>> 		switch (slave->link) {
>> 		case BOND_LINK_UP:
>> 			if (link_state)
>> -- 
>> 1.7.9.5
>>
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC PATCH net-next] bonding: Use notifiers for slave link state detection
  2016-01-08  1:28                     ` [RFC PATCH net-next] bonding: Use notifiers for slave link state detection Jay Vosburgh
@ 2016-01-08  4:36                       ` zhuyj
  2016-01-08  6:12                         ` Jay Vosburgh
  2016-01-09  1:35                       ` [RFC PATCH net-next] bonding: Use notifiers for slave link state detection Tantilov, Emil S
  1 sibling, 1 reply; 52+ messages in thread
From: zhuyj @ 2016-01-08  4:36 UTC (permalink / raw)
  To: Jay Vosburgh, emil.s.tantilov
  Cc: mkubecek, vfalico, gospo, netdev, boris.shteinbock

[-- Attachment #1: Type: text/plain, Size: 8443 bytes --]

Hi, Jay

Thank for your help.
I made a new patch based on the latest linux kernel. Now it is in the 
attachment.
When I run "make", the following errors will pop up.
And I can not find notifier callbacks in the patch.

   CHK     include/config/kernel.release
   CHK     include/generated/uapi/linux/version.h
   CHK     include/generated/utsrelease.h
   CHK     include/generated/bounds.h
   CHK     include/generated/timeconst.h
   CHK     include/generated/asm-offsets.h
   CALL    scripts/checksyscalls.sh
   CHK     include/generated/compile.h
   CC      block/blk-mq.o
   LD      block/built-in.o
   CC [M]  drivers/mtd/mtdcore.o
   LD [M]  drivers/mtd/mtd.o
   CC [M]  drivers/net/bonding/bond_main.o
drivers/net/bonding/bond_main.c: In function ‘bond_miimon_inspect_slave’:
drivers/net/bonding/bond_main.c:1996:3: error: too many arguments to 
function ‘bond_set_slave_link_state’
include/net/bonding.h:507:20: note: declared here
drivers/net/bonding/bond_main.c:2010:4: error: too many arguments to 
function ‘bond_set_slave_link_state’
include/net/bonding.h:507:20: note: declared here
drivers/net/bonding/bond_main.c:2030:3: error: too many arguments to 
function ‘bond_set_slave_link_state’
include/net/bonding.h:507:20: note: declared here
drivers/net/bonding/bond_main.c:2041:4: error: too many arguments to 
function ‘bond_set_slave_link_state’
include/net/bonding.h:507:20: note: declared here
make[3]: *** [drivers/net/bonding/bond_main.o] Error 1
make[2]: *** [drivers/net/bonding] Error 2
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2

Best Regards!
Zhu Yanjun

On 01/08/2016 09:28 AM, Jay Vosburgh wrote:
> 	TEST PATCH
>
> 	This patch modifies bonding to utilize notifier callbacks to
> detect slave link state changes.  It is intended to be used with miimon
> set to zero, and does not support the updelay or downdelay options to
> bonding.  It's not as complicated as it looks; most of the change set is
> to break out the inner loop of bond_miimon_inspect into its own
> function.
>
> 	Yanjun, can you test this with miimon=0 and see if it changes
> the behavior you're seeing?
>
> 	Thanks,
>
> 	-J
>
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index cab99fd..6fe68b1 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2012,104 +2012,103 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
>   /*-------------------------------- Monitoring -------------------------------*/
>   
>   /* called with rcu_read_lock() */
> -static int bond_miimon_inspect(struct bonding *bond)
> +static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave)
>   {
> -	int link_state, commit = 0;
> -	struct list_head *iter;
> -	struct slave *slave;
> +	int link_state;
>   	bool ignore_updelay;
>   
>   	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
>   
> -	bond_for_each_slave_rcu(bond, slave, iter) {
> -		slave->new_link = BOND_LINK_NOCHANGE;
> +	slave->new_link = BOND_LINK_NOCHANGE;
>   
> -		link_state = bond_check_dev_link(bond, slave->dev, 0);
> +	link_state = bond_check_dev_link(bond, slave->dev, 0);
>   
> -		switch (slave->link) {
> -		case BOND_LINK_UP:
> -			if (link_state)
> -				continue;
> +	switch (slave->link) {
> +	case BOND_LINK_UP:
> +		if (link_state)
> +			return 0;
>   
> -			bond_set_slave_link_state(slave, BOND_LINK_FAIL,
> +		bond_set_slave_link_state(slave, BOND_LINK_FAIL,
> +					  BOND_SLAVE_NOTIFY_LATER);
> +		slave->delay = bond->params.downdelay;
> +		if (slave->delay) {
> +			netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
> +				    (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ?
> +				    (bond_is_active_slave(slave) ?
> +				     "active " : "backup ") : "",
> +				    slave->dev->name,
> +				    bond->params.downdelay * bond->params.miimon);
> +		}
> +		/*FALLTHRU*/
> +	case BOND_LINK_FAIL:
> +		if (link_state) {
> +			/* recovered before downdelay expired */
> +			bond_set_slave_link_state(slave, BOND_LINK_UP,
>   						  BOND_SLAVE_NOTIFY_LATER);
> -			slave->delay = bond->params.downdelay;
> -			if (slave->delay) {
> -				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
> -					    (BOND_MODE(bond) ==
> -					     BOND_MODE_ACTIVEBACKUP) ?
> -					     (bond_is_active_slave(slave) ?
> -					      "active " : "backup ") : "",
> -					    slave->dev->name,
> -					    bond->params.downdelay * bond->params.miimon);
> -			}
> -			/*FALLTHRU*/
> -		case BOND_LINK_FAIL:
> -			if (link_state) {
> -				/* recovered before downdelay expired */
> -				bond_set_slave_link_state(slave, BOND_LINK_UP,
> -							  BOND_SLAVE_NOTIFY_LATER);
> -				slave->last_link_up = jiffies;
> -				netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
> -					    (bond->params.downdelay - slave->delay) *
> -					    bond->params.miimon,
> -					    slave->dev->name);
> -				continue;
> -			}
> +			slave->last_link_up = jiffies;
> +			netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
> +				    (bond->params.downdelay - slave->delay) *
> +				    bond->params.miimon, slave->dev->name);
> +			return 0;
> +		}
>   
> -			if (slave->delay <= 0) {
> -				slave->new_link = BOND_LINK_DOWN;
> -				commit++;
> -				continue;
> -			}
> +		if (slave->delay <= 0) {
> +			slave->new_link = BOND_LINK_DOWN;
> +			return 1;
> +		}
>   
> -			slave->delay--;
> -			break;
> +		slave->delay--;
> +		break;
>   
> -		case BOND_LINK_DOWN:
> -			if (!link_state)
> -				continue;
> +	case BOND_LINK_DOWN:
> +		if (!link_state)
> +			return 0;
>   
> -			bond_set_slave_link_state(slave, BOND_LINK_BACK,
> -						  BOND_SLAVE_NOTIFY_LATER);
> -			slave->delay = bond->params.updelay;
> -
> -			if (slave->delay) {
> -				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
> -					    slave->dev->name,
> -					    ignore_updelay ? 0 :
> -					    bond->params.updelay *
> -					    bond->params.miimon);
> -			}
> -			/*FALLTHRU*/
> -		case BOND_LINK_BACK:
> -			if (!link_state) {
> -				bond_set_slave_link_state(slave,
> -							  BOND_LINK_DOWN,
> -							  BOND_SLAVE_NOTIFY_LATER);
> -				netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
> -					    (bond->params.updelay - slave->delay) *
> -					    bond->params.miimon,
> -					    slave->dev->name);
> +		bond_set_slave_link_state(slave, BOND_LINK_BACK,
> +					  BOND_SLAVE_NOTIFY_LATER);
> +		slave->delay = bond->params.updelay;
>   
> -				continue;
> -			}
> +		if (slave->delay) {
> +			netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
> +				    slave->dev->name, ignore_updelay ? 0 :
> +				    bond->params.updelay * bond->params.miimon);
> +		}
> +		/*FALLTHRU*/
> +	case BOND_LINK_BACK:
> +		if (!link_state) {
> +			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
> +						  BOND_SLAVE_NOTIFY_LATER);
> +			netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
> +				    (bond->params.updelay - slave->delay) *
> +				    bond->params.miimon, slave->dev->name);
>   
> -			if (ignore_updelay)
> -				slave->delay = 0;
> +			return 0;
> +		}
>   
> -			if (slave->delay <= 0) {
> -				slave->new_link = BOND_LINK_UP;
> -				commit++;
> -				ignore_updelay = false;
> -				continue;
> -			}
> +		if (ignore_updelay)
> +			slave->delay = 0;
>   
> -			slave->delay--;
> -			break;
> +		if (slave->delay <= 0) {
> +			slave->new_link = BOND_LINK_UP;
> +			return 1;
>   		}
> +
> +		slave->delay--;
> +		break;
>   	}
>   
> +	return 0;
> +}
> +
> +static int bond_miimon_inspect(struct bonding *bond)
> +{
> +	struct list_head *iter;
> +	struct slave *slave;
> +	int commit = 0;
> +
> +	bond_for_each_slave_rcu(bond, slave, iter)
> +		commit += bond_miimon_inspect_slave(bond, slave);
> +
>   	return commit;
>   }
>   
> @@ -3016,6 +3015,9 @@ static int bond_slave_netdev_event(unsigned long event,
>   			bond_3ad_adapter_speed_duplex_changed(slave);
>   		/* Fallthrough */
>   	case NETDEV_DOWN:
> +		if (bond_miimon_inspect_slave(bond, slave))
> +			bond_miimon_commit(bond);
> +
>   		/* Refresh slave-array if applicable!
>   		 * If the setup does not use miimon or arpmon (mode-specific!),
>   		 * then these events will not cause the slave-array to be
>
>
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com


[-- Attachment #2: 0001-bonding-utilize-notifier-callbacks-to-detect-slave-l.patch --]
[-- Type: text/x-patch, Size: 6556 bytes --]

>From 62c5cf3a4aa944bac0397748d3ce1dd28b358f79 Mon Sep 17 00:00:00 2001
From: Zhu Yanjun <yanjun.zhu@windriver.com>
Date: Fri, 8 Jan 2016 11:35:19 +0800
Subject: [PATCH 1/1] bonding: utilize notifier callbacks to detect slave link
 state changes

This patch modifies bonding to utilize notifier callbacks to
detect slave link state changes.  It is intended to be used with miimon
set to zero, and does not support the updelay or downdelay options to
bonding.  It's not as complicated as it looks; most of the change set is
to break out the inner loop of bond_miimon_inspect into its own
function.

Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com>
---
 drivers/net/bonding/bond_main.c |  154 ++++++++++++++++++++-------------------
 1 file changed, 78 insertions(+), 76 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9e0f8a7..b28d6fd 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1977,101 +1977,100 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
 /*-------------------------------- Monitoring -------------------------------*/
 
 /* called with rcu_read_lock() */
-static int bond_miimon_inspect(struct bonding *bond)
+static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave)
 {
-	int link_state, commit = 0;
-	struct list_head *iter;
-	struct slave *slave;
+	int link_state;
 	bool ignore_updelay;
 
 	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
 
-	bond_for_each_slave_rcu(bond, slave, iter) {
-		slave->new_link = BOND_LINK_NOCHANGE;
-
-		link_state = bond_check_dev_link(bond, slave->dev, 0);
+	slave->new_link = BOND_LINK_NOCHANGE;
 
-		switch (slave->link) {
-		case BOND_LINK_UP:
-			if (link_state)
-				continue;
+	link_state = bond_check_dev_link(bond, slave->dev, 0);
 
-			bond_set_slave_link_state(slave, BOND_LINK_FAIL);
-			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
-					    (BOND_MODE(bond) ==
-					     BOND_MODE_ACTIVEBACKUP) ?
-					     (bond_is_active_slave(slave) ?
-					      "active " : "backup ") : "",
-					    slave->dev->name,
-					    bond->params.downdelay * bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_FAIL:
-			if (link_state) {
-				/* recovered before downdelay expired */
-				bond_set_slave_link_state(slave, BOND_LINK_UP);
-				slave->last_link_up = jiffies;
-				netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
-					    (bond->params.downdelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
-				continue;
-			}
+	switch (slave->link) {
+	case BOND_LINK_UP:
+		if (link_state)
+			return 0;
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_DOWN;
-				commit++;
-				continue;
-			}
+		bond_set_slave_link_state(slave, BOND_LINK_FAIL, BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.downdelay;
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
+				    (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ?
+				    (bond_is_active_slave(slave) ?
+			 	     "active " : "backup ") : "",
+				    slave->dev->name,
+				    bond->params.downdelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_FAIL:
+		if (link_state) {
+			/* recovered before downdelay expired */
+			bond_set_slave_link_state(slave, BOND_LINK_UP, BOND_SLAVE_NOTIFY_LATER);
+			slave->last_link_up = jiffies;
+			netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
+				    (bond->params.downdelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
+			return 0;
+		}
 
-			slave->delay--;
-			break;
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_DOWN;
+			return 1;
+		}
 
-		case BOND_LINK_DOWN:
-			if (!link_state)
-				continue;
+		slave->delay--;
+		break;
 
-			bond_set_slave_link_state(slave, BOND_LINK_BACK);
-			slave->delay = bond->params.updelay;
+	case BOND_LINK_DOWN:
+		if (!link_state)
+			return 0;
 
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
-					    slave->dev->name,
-					    ignore_updelay ? 0 :
-					    bond->params.updelay *
-					    bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_BACK:
-			if (!link_state) {
-				bond_set_slave_link_state(slave,
-							  BOND_LINK_DOWN);
-				netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
-					    (bond->params.updelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
+		bond_set_slave_link_state(slave, BOND_LINK_BACK, BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.updelay;
 
-				continue;
-			}
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
+				    slave->dev->name, ignore_updelay ? 0 :
+				    bond->params.updelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_BACK:
+		if (!link_state) {
+			bond_set_slave_link_state(slave, BOND_LINK_DOWN, BOND_SLAVE_NOTIFY_LATER);
+			netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
+				    (bond->params.updelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
 
-			if (ignore_updelay)
-				slave->delay = 0;
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_UP;
-				commit++;
-				ignore_updelay = false;
-				continue;
-			}
+		if (ignore_updelay)
+			slave->delay = 0;
 
-			slave->delay--;
-			break;
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_UP;
+			return 1;
 		}
+
+		slave->delay--;
+		break;
 	}
 
-	return commit;
+	return 0;
+}
+
+static int bond_miimon_inspect(struct bonding *bond)
+{
+	struct list_head *iter;
+	struct slave *slave;
+	int commit = 0;
+
+	bond_for_each_slave_rcu(bond, slave, iter)
+		commit += bond_miimon_inspect_slave(bond, slave);
+
+ 	return commit;
 }
 
 static void bond_miimon_commit(struct bonding *bond)
@@ -2969,6 +2968,9 @@ static int bond_slave_netdev_event(unsigned long event,
 			bond_3ad_adapter_speed_duplex_changed(slave);
 		/* Fallthrough */
 	case NETDEV_DOWN:
+		if (bond_miimon_inspect_slave(bond, slave))
+			bond_miimon_commit(bond);
+
 		/* Refresh slave-array if applicable!
 		 * If the setup does not use miimon or arpmon (mode-specific!),
 		 * then these events will not cause the slave-array to be
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
  2016-01-07 18:28               ` Tantilov, Emil S
@ 2016-01-08  6:09                 ` zhuyj
  0 siblings, 0 replies; 52+ messages in thread
From: zhuyj @ 2016-01-08  6:09 UTC (permalink / raw)
  To: Tantilov, Emil S, Michal Kubecek, Jay Vosburgh
  Cc: vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

On 01/08/2016 02:28 AM, Tantilov, Emil S wrote:
>> -----Original Message-----
>> From: zhuyj [mailto:zyjzyj2000@gmail.com]
>> Sent: Wednesday, January 06, 2016 11:47 PM
>> To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com; netdev@vger.kernel.org;
>> Shteinbock, Boris (Wind River)
>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>
>> On 01/07/2016 10:43 AM, Tantilov, Emil S wrote:
>>>> -----Original Message-----
>>>> From: zhuyj [mailto:zyjzyj2000@gmail.com]
>>>> Sent: Tuesday, January 05, 2016 7:05 PM
>>>> To: Tantilov, Emil S; Michal Kubecek; Jay Vosburgh
>>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>> netdev@vger.kernel.org;
>>>> Shteinbock, Boris (Wind River)
>>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>>
>>>> On 01/06/2016 09:26 AM, Tantilov, Emil S wrote:
>>>>>> -----Original Message-----
>>>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>> owner@vger.kernel.org]
>>>> On
>>>>>> Behalf Of zhuyj
>>>>>> Sent: Monday, December 28, 2015 1:19 AM
>>>>>> To: Michal Kubecek; Jay Vosburgh
>>>>>> Cc: vfalico@gmail.com; gospo@cumulusnetworks.com;
>>>> netdev@vger.kernel.org;
>>>>>> Shteinbock, Boris (Wind River)
>>>>>> Subject: Re: [PATCH 1/1] bonding: restrict up state in 802.3ad mode
>>>>>>
>>>>>> On 12/28/2015 04:43 PM, Michal Kubecek wrote:
>>>>>>> On Thu, Dec 17, 2015 at 01:57:16PM -0800, Jay Vosburgh wrote:
>>>>>>>> <zyjzyj2000@gmail.com> wrote:
>>>>>>>>> In 802.3ad mode, the speed and duplex is needed. But in some NIC,
>>>>>>>>> there is a time span between NIC up state and getting speed and
>>>> duplex.
>>>>>>>>> As such, sometimes a slave in 802.3ad mode is in up state without
>>>>>>>>> speed and duplex. This will make bonding in 802.3ad mode can not
>>>>>>>>> work well.
>>>>>>>>> To make bonding driver be compatible with more NICs, it is
>>>>>>>>> necessary to restrict the up state in 802.3ad mode.
>>>>>>>> 	What device is this?  It seems a bit odd that an Ethernet
>> device
>>>>>>>> can be carrier up but not have the duplex and speed available.
>>>>>>> ...
>>>>>>>> 	In general, though, bonding expects a speed or duplex change to
>>>>>>>> be announced via a NETDEV_UPDATE or NETDEV_UP notifier, which would
>>>>>>>> propagate to the 802.3ad logic.
>>>>>>>>
>>>>>>>> 	If the device here is going carrier up prior to having speed or
>>>>>>>> duplex available, then maybe it should call netdev_state_change()
>> when
>>>>>>>> the duplex and speed are available, or delay calling
>>>> netif_carrier_on().
>>>>>>> I have encountered this problem (NIC having carrier on before being
>>>> able
>>>>>>> to detect speed/duplex and driver not notifying when speed/duplex
>>>>>>> becomes available) with netxen cards earlier. But it was eventually
>>>>>>> fixed in the driver by commit 9d01412ae76f ("netxen: Fix link event
>>>>>>> handling.") so this example rather supports what you said.
>>>>>>>
>>>>>>>                                                               Michal
>>>> Kubecek
>>>>>> Thanks a lot.
>>>>>> I checked the commit 9d01412ae76f ("netxen: Fix link event
>>>>>> handling."). The symptoms are the same with mine.
>>>>>>
>>>>>> The root cause is different. In my problem, the root cause is that
>> LINKS
>>>>>> register[]  can not provide link_up and link_speed at the same time.
>>>>>> There is a time span between link_up and link_speed.
>>>>> The LINK_UP and LINK_SPEED bits in the LINKS register for ixgbe HW are
>>>> updated
>>>>> simultaneously. Do you have any proof to show the delay you are
>> referring
>>>> to
>>>>> as I am sure our HW engineers would like to know about it.
>>>> Sorry. I can not reproduce this problem locally. What I have is the
>>>> feedback from the customer.
>>> So you are assuming that there is a delay due to the issue you are
>> seeing?
>>>> Settings for eth0:
>>>>      Supported ports: [ TP ]
>>>>      Supported link modes:   100baseT/Full
>>>>                              1000baseT/Full
>>>>                              10000baseT/Full
>>>>      Supported pause frame use: No
>>>>      Supports auto-negotiation: Yes
>>>>      Advertised link modes:  100baseT/Full
>>>>                              1000baseT/Full
>>>>                              10000baseT/Full
>>>>      Advertised pause frame use: No
>>>>      Advertised auto-negotiation: Yes
>>>>      Speed: Unknown!
>>>>      Duplex: Unknown! (255)
>>>>      Port: Twisted Pair
>>>>      PHYAD: 0
>>>>      Transceiver: external
>>>>      Auto-negotiation: on
>>>>      MDI-X: Unknown
>>>>      Supports Wake-on: d
>>>>      Wake-on: d
>>>>      Current message level: 0x00000007 (7)
>>>>                     drv probe link
>>>>      Link detected: yes
>>> The speed and the link state here are reported from
>>> different sources:
>>>
>>>>      Link detected: yes
>>> Comes from a netif_carrier_ok() check. This is done via
>> ethtool_op_get_link().
>>> Only the speed is reported through the LINKS register - that is why it is
>> reported
>>> as "Unknown" - in other words link_up is false.
>>>
>>> This is a trace from the case where the bonding driver reports 0 Mbps:
>>>
>>>      kworker/u48:1-27950 [010] ....  6493.084916: ixgbe_service_task:
>> eth1: link_speed = 80, link_up = false
>>>      kworker/u48:1-27950 [011] ....  6493.184894: ixgbe_service_task:
>> eth1: link_speed = 80, link_up = false
>>>      kworker/u48:1-27950 [000] ....  6494.439883: ixgbe_service_task:
>> eth1: link_speed = 80, link_up = true
>>>      kworker/u48:1-27950 [000] ....  6494.464204: ixgbe_service_task:
>> eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>>>        kworker/0:2-1926  [000] ....  6494.464249: ixgbe_get_settings:
>> eth1: link_speed = 80, link_up = false
>>>     NetworkManager-3819  [008] ....  6494.464484: ixgbe_get_settings:
>> eth1: link_speed = 80, link_up = false
>>>      kworker/u48:1-27950 [007] ....  6494.496886: bond_mii_monitor: bond0:
>> link status definitely up for interface eth1, 0 Mbps full duplex
>>>     NetworkManager-3819  [008] ....  6494.496967: ixgbe_get_settings:
>> eth1: link_speed = 80, link_up = false
>>>      kworker/u48:1-27950 [008] ....  6495.288798: ixgbe_service_task:
>> eth1: link_speed = 80, link_up = true
>>>      kworker/u48:1-27950 [008] ....  6495.388806: ixgbe_service_task:
>> eth1: link_speed = 80, link_up = true
>>
>> Hi, Emil
>>
>> Thanks for your feedback.
>>  From your log, I think the following can explain why bonding driver can
>> not get speed.
>>
>> bonding                           ixgbe
>> .                                   .
>> .      <-----------------------   NETDEV_UP
>> .                                   .
>> bond_slave_netdev_event           NETDEV_DOWN
>> .                                   .
>> .                                   .
>> .                                   .
>> NETDEV_UP                           .
>> .              ----------------> get_settings
>>                                      .
>> speed unknown  <---------------  link_up false
>> .
>> .
>> link_up = true
>> link_speed = unknown
>>
>> In the above, ixgbe is up and bonding gets this message, then bonding
>> calls bond_slave_netdev_event while ixgbe is down.
>> In bond_slave_netdev_event, bonding call get_settings in ixgbe to get
>> link_speed. Since now ixgbe is down, so link_speed is
>> unknown. In the end, bonding get the final state of ixgbe as link_up
>> without link_speed.
>>
>> If you agree with me, would you like to help me to make tests with the
>> following patch?
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> index d681273..3efc4d8 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> @@ -285,27 +285,24 @@ static int ixgbe_get_settings(struct net_device
>> *netdev,
>>          }
>>
>>          hw->mac.ops.check_link(hw, &link_speed, &link_up, false);
>> -       if (link_up) {
>> -               switch (link_speed) {
>> -               case IXGBE_LINK_SPEED_10GB_FULL:
>> -                       ethtool_cmd_speed_set(ecmd, SPEED_10000);
>> -                       break;
>> -               case IXGBE_LINK_SPEED_2_5GB_FULL:
>> -                       ethtool_cmd_speed_set(ecmd, SPEED_2500);
>> -                       break;
>> -               case IXGBE_LINK_SPEED_1GB_FULL:
>> -                       ethtool_cmd_speed_set(ecmd, SPEED_1000);
>> -                       break;
>> -               case IXGBE_LINK_SPEED_100_FULL:
>> -                       ethtool_cmd_speed_set(ecmd, SPEED_100);
>> -                       break;
>> -               default:
>> -                       break;
>> -               }
>> -               ecmd->duplex = DUPLEX_FULL;
>> -       } else {
>> -               ethtool_cmd_speed_set(ecmd, SPEED_UNKNOWN);
>> +
>> +       ecmd->duplex = DUPLEX_FULL;
>> +       switch (link_speed) {
>> +       case IXGBE_LINK_SPEED_10GB_FULL:
>> +               ethtool_cmd_speed_set(ecmd, SPEED_10000);
>> +               break;
>> +       case IXGBE_LINK_SPEED_2_5GB_FULL:
>> +               ethtool_cmd_speed_set(ecmd, SPEED_2500);
>> +               break;
>> +       case IXGBE_LINK_SPEED_1GB_FULL:
>> +               ethtool_cmd_speed_set(ecmd, SPEED_1000);
>> +               break;
>> +       case IXGBE_LINK_SPEED_100_FULL:
>> +               ethtool_cmd_speed_set(ecmd, SPEED_100);
>> +               break;
>> +       default:
>>                  ecmd->duplex = DUPLEX_UNKNOWN;
>> +               break;
>>          }
>>
>>          return 0;
> This will break speed reporting. You cannot ignore link_up.
> The speed is only valid when the link_up bit is set.
Hi, Emil

Thanks for your reply.
But in this function ixgbe_check_mac_link_generic. The speed is reported 
whether the link_up is true or false.
I followed this function.

Thanks a lot.
Zhu Yanjun
>
> Thanks,
> Emil
>
>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC PATCH net-next] bonding: Use notifiers for slave link state detection
  2016-01-08  4:36                       ` zhuyj
@ 2016-01-08  6:12                         ` Jay Vosburgh
  2016-01-08  7:41                           ` (unknown), zyjzyj2000
  0 siblings, 1 reply; 52+ messages in thread
From: Jay Vosburgh @ 2016-01-08  6:12 UTC (permalink / raw)
  To: zhuyj; +Cc: emil.s.tantilov, mkubecek, vfalico, gospo, netdev, boris.shteinbock

zhuyj <zyjzyj2000@gmail.com> wrote:

>Hi, Jay
>
>Thank for your help.
>I made a new patch based on the latest linux kernel. Now it is in the
>attachment.
>When I run "make", the following errors will pop up.
[...]
>drivers/net/bonding/bond_main.c:1996:3: error: too many arguments to
>function ‘bond_set_slave_link_state’
>include/net/bonding.h:507:20: note: declared here

	My patch was generated against the current net-next git
repository.  I suspect you're using an older kernel; since commit

5d397061ca20 ("bonding: allow notifications for bond_set_slave_link_state")

	the bond_set_slave_link_state function has three arguments.
This commit was added 3 Dec 2015.

	For example, from your patch:

>-			bond_set_slave_link_state(slave, BOND_LINK_FAIL);
[...]
>+		bond_set_slave_link_state(slave, BOND_LINK_FAIL, BOND_SLAVE_NOTIFY_LATER);

	For your kernel version, you'll need to change the patched code
to remove the third argument to bond_set_slave_link_state.

>And I can not find notifier callbacks in the patch.

	The bond_slave_netdev_event function is bonding's notifier
callback; the patch adds a call there for NETDEV_UP, NETDEV_CHANGE and
NETDEV_DOWN events to check link state:

> 	case NETDEV_DOWN:
>+		if (bond_miimon_inspect_slave(bond, slave))
>+			bond_miimon_commit(bond);
>+

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* (unknown), 
  2016-01-08  6:12                         ` Jay Vosburgh
@ 2016-01-08  7:41                           ` zyjzyj2000
  2016-01-08  7:41                             ` [PATCH 1/1] bonding: utilize notifier callbacks to detect slave link state changes zyjzyj2000
  0 siblings, 1 reply; 52+ messages in thread
From: zyjzyj2000 @ 2016-01-08  7:41 UTC (permalink / raw)
  To: jay.vosburgh
  Cc: emil.s.tantilov, mkubecek, vfalico, gospo, netdev, boris.shteinbock


Sure. This patch is based on the latest linux kernel. Now I remove the third parameter in bond_set_slave_link_state.
I can build it successfully. Now the patch based on v4.4-rc8 is in the attachment.
If you confirm it, I will make tests with it.

Thanks a lot.
Zhu Yanjun

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/1] bonding: utilize notifier callbacks to detect slave link state changes
  2016-01-08  7:41                           ` (unknown), zyjzyj2000
@ 2016-01-08  7:41                             ` zyjzyj2000
  2016-01-08 10:18                               ` zhuyj
  0 siblings, 1 reply; 52+ messages in thread
From: zyjzyj2000 @ 2016-01-08  7:41 UTC (permalink / raw)
  To: jay.vosburgh
  Cc: emil.s.tantilov, mkubecek, vfalico, gospo, netdev, boris.shteinbock

From: Zhu Yanjun <yanjun.zhu@windriver.com>

This patch modifies bonding to utilize notifier callbacks to
detect slave link state changes.  It is intended to be used with miimon
set to zero, and does not support the updelay or downdelay options to
bonding.  It's not as complicated as it looks; most of the change set is
to break out the inner loop of bond_miimon_inspect into its own
function.

Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com>
---
 drivers/net/bonding/bond_main.c |  154 ++++++++++++++++++++-------------------
 1 file changed, 78 insertions(+), 76 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9e0f8a7..9a0e69e 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1977,101 +1977,100 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
 /*-------------------------------- Monitoring -------------------------------*/
 
 /* called with rcu_read_lock() */
-static int bond_miimon_inspect(struct bonding *bond)
+static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave)
 {
-	int link_state, commit = 0;
-	struct list_head *iter;
-	struct slave *slave;
+	int link_state;
 	bool ignore_updelay;
 
 	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
 
-	bond_for_each_slave_rcu(bond, slave, iter) {
-		slave->new_link = BOND_LINK_NOCHANGE;
-
-		link_state = bond_check_dev_link(bond, slave->dev, 0);
+	slave->new_link = BOND_LINK_NOCHANGE;
 
-		switch (slave->link) {
-		case BOND_LINK_UP:
-			if (link_state)
-				continue;
+	link_state = bond_check_dev_link(bond, slave->dev, 0);
 
-			bond_set_slave_link_state(slave, BOND_LINK_FAIL);
-			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
-					    (BOND_MODE(bond) ==
-					     BOND_MODE_ACTIVEBACKUP) ?
-					     (bond_is_active_slave(slave) ?
-					      "active " : "backup ") : "",
-					    slave->dev->name,
-					    bond->params.downdelay * bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_FAIL:
-			if (link_state) {
-				/* recovered before downdelay expired */
-				bond_set_slave_link_state(slave, BOND_LINK_UP);
-				slave->last_link_up = jiffies;
-				netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
-					    (bond->params.downdelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
-				continue;
-			}
+	switch (slave->link) {
+	case BOND_LINK_UP:
+		if (link_state)
+			return 0;
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_DOWN;
-				commit++;
-				continue;
-			}
+		bond_set_slave_link_state(slave, BOND_LINK_FAIL);
+		slave->delay = bond->params.downdelay;
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
+				    (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ?
+				    (bond_is_active_slave(slave) ?
+			 	     "active " : "backup ") : "",
+				    slave->dev->name,
+				    bond->params.downdelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_FAIL:
+		if (link_state) {
+			/* recovered before downdelay expired */
+			bond_set_slave_link_state(slave, BOND_LINK_UP);
+			slave->last_link_up = jiffies;
+			netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
+				    (bond->params.downdelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
+			return 0;
+		}
 
-			slave->delay--;
-			break;
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_DOWN;
+			return 1;
+		}
 
-		case BOND_LINK_DOWN:
-			if (!link_state)
-				continue;
+		slave->delay--;
+		break;
 
-			bond_set_slave_link_state(slave, BOND_LINK_BACK);
-			slave->delay = bond->params.updelay;
+	case BOND_LINK_DOWN:
+		if (!link_state)
+			return 0;
 
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
-					    slave->dev->name,
-					    ignore_updelay ? 0 :
-					    bond->params.updelay *
-					    bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_BACK:
-			if (!link_state) {
-				bond_set_slave_link_state(slave,
-							  BOND_LINK_DOWN);
-				netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
-					    (bond->params.updelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
+		bond_set_slave_link_state(slave, BOND_LINK_BACK);
+		slave->delay = bond->params.updelay;
 
-				continue;
-			}
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
+				    slave->dev->name, ignore_updelay ? 0 :
+				    bond->params.updelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_BACK:
+		if (!link_state) {
+			bond_set_slave_link_state(slave, BOND_LINK_DOWN);
+			netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
+				    (bond->params.updelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
 
-			if (ignore_updelay)
-				slave->delay = 0;
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_UP;
-				commit++;
-				ignore_updelay = false;
-				continue;
-			}
+		if (ignore_updelay)
+			slave->delay = 0;
 
-			slave->delay--;
-			break;
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_UP;
+			return 1;
 		}
+
+		slave->delay--;
+		break;
 	}
 
-	return commit;
+	return 0;
+}
+
+static int bond_miimon_inspect(struct bonding *bond)
+{
+	struct list_head *iter;
+	struct slave *slave;
+	int commit = 0;
+
+	bond_for_each_slave_rcu(bond, slave, iter)
+		commit += bond_miimon_inspect_slave(bond, slave);
+
+ 	return commit;
 }
 
 static void bond_miimon_commit(struct bonding *bond)
@@ -2969,6 +2968,9 @@ static int bond_slave_netdev_event(unsigned long event,
 			bond_3ad_adapter_speed_duplex_changed(slave);
 		/* Fallthrough */
 	case NETDEV_DOWN:
+		if (bond_miimon_inspect_slave(bond, slave))
+			bond_miimon_commit(bond);
+
 		/* Refresh slave-array if applicable!
 		 * If the setup does not use miimon or arpmon (mode-specific!),
 		 * then these events will not cause the slave-array to be
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: utilize notifier callbacks to detect slave link state changes
  2016-01-08  7:41                             ` [PATCH 1/1] bonding: utilize notifier callbacks to detect slave link state changes zyjzyj2000
@ 2016-01-08 10:18                               ` zhuyj
  0 siblings, 0 replies; 52+ messages in thread
From: zhuyj @ 2016-01-08 10:18 UTC (permalink / raw)
  To: jay.vosburgh
  Cc: emil.s.tantilov, mkubecek, vfalico, gospo, netdev, boris.shteinbock

Hi, Jay

I delved into your test patch. I noticed that bond_set_slave_link_state 
would call bond_netdev_notify_work.
And your patch is based on netdev notifier.

Will it result into a notifier loop? That is, bonding driver receives 
notifier, then bonding driver sends notifier.
In the end, there are more and more notifier.

How do you think about this?

Thanks a lot.
Zhu Yanjun

On 01/08/2016 03:41 PM, zyjzyj2000@gmail.com wrote:
> From: Zhu Yanjun <yanjun.zhu@windriver.com>
>
> This patch modifies bonding to utilize notifier callbacks to
> detect slave link state changes.  It is intended to be used with miimon
> set to zero, and does not support the updelay or downdelay options to
> bonding.  It's not as complicated as it looks; most of the change set is
> to break out the inner loop of bond_miimon_inspect into its own
> function.
>
> Signed-off-by: Zhu Yanjun <yanjun.zhu@windriver.com>
> ---
>   drivers/net/bonding/bond_main.c |  154 ++++++++++++++++++++-------------------
>   1 file changed, 78 insertions(+), 76 deletions(-)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 9e0f8a7..9a0e69e 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -1977,101 +1977,100 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
>   /*-------------------------------- Monitoring -------------------------------*/
>   
>   /* called with rcu_read_lock() */
> -static int bond_miimon_inspect(struct bonding *bond)
> +static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave)
>   {
> -	int link_state, commit = 0;
> -	struct list_head *iter;
> -	struct slave *slave;
> +	int link_state;
>   	bool ignore_updelay;
>   
>   	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
>   
> -	bond_for_each_slave_rcu(bond, slave, iter) {
> -		slave->new_link = BOND_LINK_NOCHANGE;
> -
> -		link_state = bond_check_dev_link(bond, slave->dev, 0);
> +	slave->new_link = BOND_LINK_NOCHANGE;
>   
> -		switch (slave->link) {
> -		case BOND_LINK_UP:
> -			if (link_state)
> -				continue;
> +	link_state = bond_check_dev_link(bond, slave->dev, 0);
>   
> -			bond_set_slave_link_state(slave, BOND_LINK_FAIL);
> -			slave->delay = bond->params.downdelay;
> -			if (slave->delay) {
> -				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
> -					    (BOND_MODE(bond) ==
> -					     BOND_MODE_ACTIVEBACKUP) ?
> -					     (bond_is_active_slave(slave) ?
> -					      "active " : "backup ") : "",
> -					    slave->dev->name,
> -					    bond->params.downdelay * bond->params.miimon);
> -			}
> -			/*FALLTHRU*/
> -		case BOND_LINK_FAIL:
> -			if (link_state) {
> -				/* recovered before downdelay expired */
> -				bond_set_slave_link_state(slave, BOND_LINK_UP);
> -				slave->last_link_up = jiffies;
> -				netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
> -					    (bond->params.downdelay - slave->delay) *
> -					    bond->params.miimon,
> -					    slave->dev->name);
> -				continue;
> -			}
> +	switch (slave->link) {
> +	case BOND_LINK_UP:
> +		if (link_state)
> +			return 0;
>   
> -			if (slave->delay <= 0) {
> -				slave->new_link = BOND_LINK_DOWN;
> -				commit++;
> -				continue;
> -			}
> +		bond_set_slave_link_state(slave, BOND_LINK_FAIL);
> +		slave->delay = bond->params.downdelay;
> +		if (slave->delay) {
> +			netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
> +				    (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ?
> +				    (bond_is_active_slave(slave) ?
> +			 	     "active " : "backup ") : "",
> +				    slave->dev->name,
> +				    bond->params.downdelay * bond->params.miimon);
> +		}
> +		/*FALLTHRU*/
> +	case BOND_LINK_FAIL:
> +		if (link_state) {
> +			/* recovered before downdelay expired */
> +			bond_set_slave_link_state(slave, BOND_LINK_UP);
> +			slave->last_link_up = jiffies;
> +			netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
> +				    (bond->params.downdelay - slave->delay) *
> +				    bond->params.miimon, slave->dev->name);
> +			return 0;
> +		}
>   
> -			slave->delay--;
> -			break;
> +		if (slave->delay <= 0) {
> +			slave->new_link = BOND_LINK_DOWN;
> +			return 1;
> +		}
>   
> -		case BOND_LINK_DOWN:
> -			if (!link_state)
> -				continue;
> +		slave->delay--;
> +		break;
>   
> -			bond_set_slave_link_state(slave, BOND_LINK_BACK);
> -			slave->delay = bond->params.updelay;
> +	case BOND_LINK_DOWN:
> +		if (!link_state)
> +			return 0;
>   
> -			if (slave->delay) {
> -				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
> -					    slave->dev->name,
> -					    ignore_updelay ? 0 :
> -					    bond->params.updelay *
> -					    bond->params.miimon);
> -			}
> -			/*FALLTHRU*/
> -		case BOND_LINK_BACK:
> -			if (!link_state) {
> -				bond_set_slave_link_state(slave,
> -							  BOND_LINK_DOWN);
> -				netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
> -					    (bond->params.updelay - slave->delay) *
> -					    bond->params.miimon,
> -					    slave->dev->name);
> +		bond_set_slave_link_state(slave, BOND_LINK_BACK);
> +		slave->delay = bond->params.updelay;
>   
> -				continue;
> -			}
> +		if (slave->delay) {
> +			netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
> +				    slave->dev->name, ignore_updelay ? 0 :
> +				    bond->params.updelay * bond->params.miimon);
> +		}
> +		/*FALLTHRU*/
> +	case BOND_LINK_BACK:
> +		if (!link_state) {
> +			bond_set_slave_link_state(slave, BOND_LINK_DOWN);
> +			netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
> +				    (bond->params.updelay - slave->delay) *
> +				    bond->params.miimon, slave->dev->name);
>   
> -			if (ignore_updelay)
> -				slave->delay = 0;
> +			return 0;
> +		}
>   
> -			if (slave->delay <= 0) {
> -				slave->new_link = BOND_LINK_UP;
> -				commit++;
> -				ignore_updelay = false;
> -				continue;
> -			}
> +		if (ignore_updelay)
> +			slave->delay = 0;
>   
> -			slave->delay--;
> -			break;
> +		if (slave->delay <= 0) {
> +			slave->new_link = BOND_LINK_UP;
> +			return 1;
>   		}
> +
> +		slave->delay--;
> +		break;
>   	}
>   
> -	return commit;
> +	return 0;
> +}
> +
> +static int bond_miimon_inspect(struct bonding *bond)
> +{
> +	struct list_head *iter;
> +	struct slave *slave;
> +	int commit = 0;
> +
> +	bond_for_each_slave_rcu(bond, slave, iter)
> +		commit += bond_miimon_inspect_slave(bond, slave);
> +
> + 	return commit;
>   }
>   
>   static void bond_miimon_commit(struct bonding *bond)
> @@ -2969,6 +2968,9 @@ static int bond_slave_netdev_event(unsigned long event,
>   			bond_3ad_adapter_speed_duplex_changed(slave);
>   		/* Fallthrough */
>   	case NETDEV_DOWN:
> +		if (bond_miimon_inspect_slave(bond, slave))
> +			bond_miimon_commit(bond);
> +
>   		/* Refresh slave-array if applicable!
>   		 * If the setup does not use miimon or arpmon (mode-specific!),
>   		 * then these events will not cause the slave-array to be

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC PATCH net-next] bonding: Use notifiers for slave link state detection
  2016-01-08  1:28                     ` [RFC PATCH net-next] bonding: Use notifiers for slave link state detection Jay Vosburgh
  2016-01-08  4:36                       ` zhuyj
@ 2016-01-09  1:35                       ` Tantilov, Emil S
  2016-01-09  2:19                         ` Jay Vosburgh
  1 sibling, 1 reply; 52+ messages in thread
From: Tantilov, Emil S @ 2016-01-09  1:35 UTC (permalink / raw)
  To: Jay Vosburgh, zyjzyj2000
  Cc: mkubecek, vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

>-----Original Message-----
>From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com]
>Sent: Thursday, January 07, 2016 5:29 PM
>Subject: [RFC PATCH net-next] bonding: Use notifiers for slave link state
>detection
>
>
>	TEST PATCH
>
>	This patch modifies bonding to utilize notifier callbacks to
>detect slave link state changes.  It is intended to be used with miimon
>set to zero, and does not support the updelay or downdelay options to
>bonding.  It's not as complicated as it looks; most of the change set is
>to break out the inner loop of bond_miimon_inspect into its own
>function.

Jay,
 
I managed to do a quick test with this patch and occasionally there is a case where
I see the bonding driver reporting link up for an interface (eth1) that is not up just yet:

[12972.741999] bonding: bond0 is being created...
[12972.761907] bond0: Setting xmit hash policy to layer3+4 (1)
[12972.761990] bond0: Setting MII monitoring interval to 0
[12972.767131] bond0: Setting LACP rate to fast (1)
[12972.767916] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
[12972.846158] bond0: Adding slave eth0
[12972.950548] pps pps0: new PPS source ptp0
[12972.950555] ixgbe 0000:01:00.0: registered PHC device on eth0
[12973.071750] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[12973.072023] bond0: Enslaving eth0 as a backup interface with an up link
[12974.122295] bond0: Adding slave eth1
[12974.227639] pps pps1: new PPS source ptp1
[12974.227645] ixgbe 0000:01:00.1: registered PHC device on eth1
[12974.349306] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[12974.349584] bond0: Enslaving eth1 as a backup interface with an up link
[12982.982797] ixgbe 0000:01:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[12983.068437] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[12983.185720] ixgbe 0000:01:00.0 eth0: NIC Link is Down
[12983.982454] ixgbe 0000:01:00.0 eth0: speed changed to 0 for port eth0
[12983.982539] bond0: link status definitely down for interface eth0, disabling it
[12983.982546] bond0: link status definitely up for interface eth1, 0 Mbps full duplex
[12983.982550] bond0: first active interface up!
[12985.213752] ixgbe 0000:01:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[12985.213970] bond0: link status definitely up for interface eth0, 10000 Mbps full duplex
[12985.213975] bond0: link status definitely up for interface eth1, 0 Mbps full duplex
[12989.195157] ixgbe 0000:01:00.1 eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX

Thanks,
Emil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC PATCH net-next] bonding: Use notifiers for slave link state detection
  2016-01-09  1:35                       ` [RFC PATCH net-next] bonding: Use notifiers for slave link state detection Tantilov, Emil S
@ 2016-01-09  2:19                         ` Jay Vosburgh
  2016-01-11  9:03                           ` zhuyj
  2016-01-13 17:03                           ` Tantilov, Emil S
  0 siblings, 2 replies; 52+ messages in thread
From: Jay Vosburgh @ 2016-01-09  2:19 UTC (permalink / raw)
  To: Tantilov, Emil S
  Cc: zyjzyj2000, mkubecek, vfalico, gospo, netdev, Shteinbock,
	Boris (Wind River)

Tantilov, Emil S <emil.s.tantilov@intel.com> wrote:

>>-----Original Message-----
>From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com]
>>Sent: Thursday, January 07, 2016 5:29 PM
>>Subject: [RFC PATCH net-next] bonding: Use notifiers for slave link state
>>detection
>>
>>
>>	TEST PATCH
>>
>>	This patch modifies bonding to utilize notifier callbacks to
>>detect slave link state changes.  It is intended to be used with miimon
>>set to zero, and does not support the updelay or downdelay options to
>>bonding.  It's not as complicated as it looks; most of the change set is
>>to break out the inner loop of bond_miimon_inspect into its own
>>function.
>
>Jay,
> 
>I managed to do a quick test with this patch and occasionally there is
>a case where I see the bonding driver reporting link up for an
>interface (eth1) that is not up just yet:
[...]
>[12985.213752] ixgbe 0000:01:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>[12985.213970] bond0: link status definitely up for interface eth0, 10000 Mbps full duplex
>[12985.213975] bond0: link status definitely up for interface eth1, 0 Mbps full duplex

	Thanks for testing; the misbehavior is because I cheaped out and
didn't break out the commit function into a "single slave" version.  The
below patch (against net-next, replacing the original patch) shouldn't
generate the erroneous additional link messages any more.

	This does generate an RCU warning, although the code actually is
safe (since the notifier callback holds RTNL); I'll sort that out next
week.

	-J


diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index cab99fd..12dd533 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2012,203 +2012,206 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
 /*-------------------------------- Monitoring -------------------------------*/
 
 /* called with rcu_read_lock() */
-static int bond_miimon_inspect(struct bonding *bond)
+static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave)
 {
-	int link_state, commit = 0;
-	struct list_head *iter;
-	struct slave *slave;
+	int link_state;
 	bool ignore_updelay;
 
 	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
 
-	bond_for_each_slave_rcu(bond, slave, iter) {
-		slave->new_link = BOND_LINK_NOCHANGE;
+	slave->new_link = BOND_LINK_NOCHANGE;
 
-		link_state = bond_check_dev_link(bond, slave->dev, 0);
+	link_state = bond_check_dev_link(bond, slave->dev, 0);
 
-		switch (slave->link) {
-		case BOND_LINK_UP:
-			if (link_state)
-				continue;
+	switch (slave->link) {
+	case BOND_LINK_UP:
+		if (link_state)
+			return 0;
 
-			bond_set_slave_link_state(slave, BOND_LINK_FAIL,
+		bond_set_slave_link_state(slave, BOND_LINK_FAIL,
+					  BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.downdelay;
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
+				    (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ?
+				    (bond_is_active_slave(slave) ?
+				     "active " : "backup ") : "",
+				    slave->dev->name,
+				    bond->params.downdelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_FAIL:
+		if (link_state) {
+			/* recovered before downdelay expired */
+			bond_set_slave_link_state(slave, BOND_LINK_UP,
 						  BOND_SLAVE_NOTIFY_LATER);
-			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
-					    (BOND_MODE(bond) ==
-					     BOND_MODE_ACTIVEBACKUP) ?
-					     (bond_is_active_slave(slave) ?
-					      "active " : "backup ") : "",
-					    slave->dev->name,
-					    bond->params.downdelay * bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_FAIL:
-			if (link_state) {
-				/* recovered before downdelay expired */
-				bond_set_slave_link_state(slave, BOND_LINK_UP,
-							  BOND_SLAVE_NOTIFY_LATER);
-				slave->last_link_up = jiffies;
-				netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
-					    (bond->params.downdelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
-				continue;
-			}
+			slave->last_link_up = jiffies;
+			netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
+				    (bond->params.downdelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_DOWN;
-				commit++;
-				continue;
-			}
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_DOWN;
+			return 1;
+		}
 
-			slave->delay--;
-			break;
+		slave->delay--;
+		break;
 
-		case BOND_LINK_DOWN:
-			if (!link_state)
-				continue;
+	case BOND_LINK_DOWN:
+		if (!link_state)
+			return 0;
 
-			bond_set_slave_link_state(slave, BOND_LINK_BACK,
-						  BOND_SLAVE_NOTIFY_LATER);
-			slave->delay = bond->params.updelay;
-
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
-					    slave->dev->name,
-					    ignore_updelay ? 0 :
-					    bond->params.updelay *
-					    bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_BACK:
-			if (!link_state) {
-				bond_set_slave_link_state(slave,
-							  BOND_LINK_DOWN,
-							  BOND_SLAVE_NOTIFY_LATER);
-				netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
-					    (bond->params.updelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
+		bond_set_slave_link_state(slave, BOND_LINK_BACK,
+					  BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.updelay;
 
-				continue;
-			}
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
+				    slave->dev->name, ignore_updelay ? 0 :
+				    bond->params.updelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_BACK:
+		if (!link_state) {
+			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
+						  BOND_SLAVE_NOTIFY_LATER);
+			netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
+				    (bond->params.updelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
 
-			if (ignore_updelay)
-				slave->delay = 0;
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_UP;
-				commit++;
-				ignore_updelay = false;
-				continue;
-			}
+		if (ignore_updelay)
+			slave->delay = 0;
 
-			slave->delay--;
-			break;
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_UP;
+			return 1;
 		}
+
+		slave->delay--;
+		break;
 	}
 
-	return commit;
+	return 0;
 }
 
-static void bond_miimon_commit(struct bonding *bond)
+static int bond_miimon_inspect(struct bonding *bond)
 {
 	struct list_head *iter;
-	struct slave *slave, *primary;
+	struct slave *slave;
+	int commit = 0;
 
-	bond_for_each_slave(bond, slave, iter) {
-		switch (slave->new_link) {
-		case BOND_LINK_NOCHANGE:
-			continue;
+	bond_for_each_slave_rcu(bond, slave, iter)
+		commit += bond_miimon_inspect_slave(bond, slave);
 
-		case BOND_LINK_UP:
-			bond_set_slave_link_state(slave, BOND_LINK_UP,
-						  BOND_SLAVE_NOTIFY_NOW);
-			slave->last_link_up = jiffies;
+	return commit;
+}
 
-			primary = rtnl_dereference(bond->primary_slave);
-			if (BOND_MODE(bond) == BOND_MODE_8023AD) {
-				/* prevent it from being the active one */
-				bond_set_backup_slave(slave);
-			} else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
-				/* make it immediately active */
-				bond_set_active_slave(slave);
-			} else if (slave != primary) {
-				/* prevent it from being the active one */
-				bond_set_backup_slave(slave);
-			}
+static void bond_miimon_commit_slave(struct bonding *bond, struct slave *slave)
+{
+	struct slave *primary;
 
-			netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n",
-				    slave->dev->name,
-				    slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
-				    slave->duplex ? "full" : "half");
+	switch (slave->new_link) {
+	case BOND_LINK_NOCHANGE:
+		return;
 
-			/* notify ad that the link status has changed */
-			if (BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_3ad_handle_link_change(slave, BOND_LINK_UP);
+	case BOND_LINK_UP:
+		bond_set_slave_link_state(slave, BOND_LINK_UP,
+					  BOND_SLAVE_NOTIFY_NOW);
+		slave->last_link_up = jiffies;
 
-			if (bond_is_lb(bond))
-				bond_alb_handle_link_change(bond, slave,
-							    BOND_LINK_UP);
+		primary = rtnl_dereference(bond->primary_slave);
+		if (BOND_MODE(bond) == BOND_MODE_8023AD) {
+			/* prevent it from being the active one */
+			bond_set_backup_slave(slave);
+		} else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
+			/* make it immediately active */
+			bond_set_active_slave(slave);
+		} else if (slave != primary) {
+			/* prevent it from being the active one */
+			bond_set_backup_slave(slave);
+		}
 
-			if (BOND_MODE(bond) == BOND_MODE_XOR)
-				bond_update_slave_arr(bond, NULL);
+		netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n",
+			    slave->dev->name,
+			    slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
+			    slave->duplex ? "full" : "half");
 
-			if (!bond->curr_active_slave || slave == primary)
-				goto do_failover;
+		/* notify ad that the link status has changed */
+		if (BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_3ad_handle_link_change(slave, BOND_LINK_UP);
 
-			continue;
+		if (bond_is_lb(bond))
+			bond_alb_handle_link_change(bond, slave, BOND_LINK_UP);
 
-		case BOND_LINK_DOWN:
-			if (slave->link_failure_count < UINT_MAX)
-				slave->link_failure_count++;
+		if (BOND_MODE(bond) == BOND_MODE_XOR)
+			bond_update_slave_arr(bond, NULL);
 
-			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
-						  BOND_SLAVE_NOTIFY_NOW);
+		if (!bond->curr_active_slave || slave == primary)
+			goto do_failover;
 
-			if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
-			    BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_set_slave_inactive_flags(slave,
-							      BOND_SLAVE_NOTIFY_NOW);
+		goto out;
 
-			netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n",
-				    slave->dev->name);
+	case BOND_LINK_DOWN:
+		if (slave->link_failure_count < UINT_MAX)
+			slave->link_failure_count++;
 
-			if (BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_3ad_handle_link_change(slave,
-							    BOND_LINK_DOWN);
+		bond_set_slave_link_state(slave, BOND_LINK_DOWN,
+					  BOND_SLAVE_NOTIFY_NOW);
 
-			if (bond_is_lb(bond))
-				bond_alb_handle_link_change(bond, slave,
-							    BOND_LINK_DOWN);
+		if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
+		    BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_set_slave_inactive_flags(slave,
+						      BOND_SLAVE_NOTIFY_NOW);
 
-			if (BOND_MODE(bond) == BOND_MODE_XOR)
-				bond_update_slave_arr(bond, NULL);
+		netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n",
+			    slave->dev->name);
 
-			if (slave == rcu_access_pointer(bond->curr_active_slave))
-				goto do_failover;
+		if (BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_3ad_handle_link_change(slave, BOND_LINK_DOWN);
 
-			continue;
+		if (bond_is_lb(bond))
+			bond_alb_handle_link_change(bond, slave, BOND_LINK_DOWN);
 
-		default:
-			netdev_err(bond->dev, "invalid new link %d on slave %s\n",
-				   slave->new_link, slave->dev->name);
-			slave->new_link = BOND_LINK_NOCHANGE;
+		if (BOND_MODE(bond) == BOND_MODE_XOR)
+			bond_update_slave_arr(bond, NULL);
 
-			continue;
-		}
+		if (slave == rcu_access_pointer(bond->curr_active_slave))
+			goto do_failover;
 
-do_failover:
-		block_netpoll_tx();
-		bond_select_active_slave(bond);
-		unblock_netpoll_tx();
+		goto out;
+
+	default:
+		netdev_err(bond->dev, "invalid new link %d on slave %s\n",
+			   slave->new_link, slave->dev->name);
+		slave->new_link = BOND_LINK_NOCHANGE;
+
+		goto out;
 	}
 
+do_failover:
+	block_netpoll_tx();
+	bond_select_active_slave(bond);
+	unblock_netpoll_tx();
+
+out:
 	bond_set_carrier(bond);
 }
 
+static void bond_miimon_commit(struct bonding *bond)
+{
+	struct list_head *iter;
+	struct slave *slave;
+
+	bond_for_each_slave(bond, slave, iter)
+		bond_miimon_commit_slave(bond, slave);
+}
+
 /* bond_mii_monitor
  *
  * Really a wrapper that splits the mii monitor into two phases: an
@@ -3016,6 +3019,9 @@ static int bond_slave_netdev_event(unsigned long event,
 			bond_3ad_adapter_speed_duplex_changed(slave);
 		/* Fallthrough */
 	case NETDEV_DOWN:
+		if (bond_miimon_inspect_slave(bond, slave))
+			bond_miimon_commit_slave(bond, slave);
+
 		/* Refresh slave-array if applicable!
 		 * If the setup does not use miimon or arpmon (mode-specific!),
 		 * then these events will not cause the slave-array to be


---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [RFC PATCH net-next] bonding: Use notifiers for slave link state detection
  2016-01-09  2:19                         ` Jay Vosburgh
@ 2016-01-11  9:03                           ` zhuyj
  2016-01-13  2:54                             ` zhuyj
  2016-01-13 17:03                           ` Tantilov, Emil S
  1 sibling, 1 reply; 52+ messages in thread
From: zhuyj @ 2016-01-11  9:03 UTC (permalink / raw)
  To: Jay Vosburgh, Tantilov, Emil S
  Cc: mkubecek, vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

Hi, Jay && Emil

I delved into the source code. This patch is based on notifiers. When a 
NETDEV_UP
notifier is received in bond_slave_netdev_event, in 
bond_miimon_inspect_slave, bond_check_dev_link
is called to detect link_state.

Because of link flap, link_state is sometimes different from NETDEV_UP. 
That is, though event is NETDEV_UP,
sometime link_state is down because of link flap.

In the following patch, if link_state is different from the event, it is 
unnecessary to make further setup.

diff --git a/drivers/net/bonding/bond_main.c 
b/drivers/net/bonding/bond_main.c
index 12dd533..1b53da0 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2012,7 +2012,8 @@ static int bond_slave_info_query(struct net_device 
*bond_dev, struct ifslave *in
  /*-------------------------------- Monitoring 
-------------------------------*/

  /* called with rcu_read_lock() */
-static int bond_miimon_inspect_slave(struct bonding *bond, struct slave 
*slave)
+static int bond_miimon_inspect_slave(struct bonding *bond, struct slave 
*slave,
+                                    unsigned long event)
  {
         int link_state;
         bool ignore_updelay;
@@ -2022,6 +2023,17 @@ static int bond_miimon_inspect_slave(struct 
bonding *bond, struct slave *slave)
         slave->new_link = BOND_LINK_NOCHANGE;

         link_state = bond_check_dev_link(bond, slave->dev, 0);
+       switch (event) {
+       case NETDEV_UP:
+               if (!link_state)
+                       return 0;
+               break;
+
+       case NETDEV_DOWN:
+               if (link_state)
+                       return 0;
+               break;
+       }

         switch (slave->link) {
         case BOND_LINK_UP:
@@ -2107,7 +2119,7 @@ static int bond_miimon_inspect(struct bonding *bond)
         int commit = 0;

         bond_for_each_slave_rcu(bond, slave, iter)
-               commit += bond_miimon_inspect_slave(bond, slave);
+               commit += bond_miimon_inspect_slave(bond, slave, 0xFF);

         return commit;
  }
@@ -3019,7 +3031,7 @@ static int bond_slave_netdev_event(unsigned long 
event,
bond_3ad_adapter_speed_duplex_changed(slave);
                 /* Fallthrough */
         case NETDEV_DOWN:
-               if (bond_miimon_inspect_slave(bond, slave))
+               if (bond_miimon_inspect_slave(bond, slave, event))
                         bond_miimon_commit_slave(bond, slave);

                 /* Refresh slave-array if applicable!

Best Regards!
Zhu Yanjun

On 01/09/2016 10:19 AM, Jay Vosburgh wrote:
> Tantilov, Emil S <emil.s.tantilov@intel.com> wrote:
>
>>> -----Original Message-----
>> From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com]
>>> Sent: Thursday, January 07, 2016 5:29 PM
>>> Subject: [RFC PATCH net-next] bonding: Use notifiers for slave link state
>>> detection
>>>
>>>
>>> 	TEST PATCH
>>>
>>> 	This patch modifies bonding to utilize notifier callbacks to
>>> detect slave link state changes.  It is intended to be used with miimon
>>> set to zero, and does not support the updelay or downdelay options to
>>> bonding.  It's not as complicated as it looks; most of the change set is
>>> to break out the inner loop of bond_miimon_inspect into its own
>>> function.
>> Jay,
>>
>> I managed to do a quick test with this patch and occasionally there is
>> a case where I see the bonding driver reporting link up for an
>> interface (eth1) that is not up just yet:
> [...]
>> [12985.213752] ixgbe 0000:01:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>> [12985.213970] bond0: link status definitely up for interface eth0, 10000 Mbps full duplex
>> [12985.213975] bond0: link status definitely up for interface eth1, 0 Mbps full duplex
> 	Thanks for testing; the misbehavior is because I cheaped out and
> didn't break out the commit function into a "single slave" version.  The
> below patch (against net-next, replacing the original patch) shouldn't
> generate the erroneous additional link messages any more.
>
> 	This does generate an RCU warning, although the code actually is
> safe (since the notifier callback holds RTNL); I'll sort that out next
> week.
>
> 	-J
>
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index cab99fd..12dd533 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2012,203 +2012,206 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
>   /*-------------------------------- Monitoring -------------------------------*/
>   
>   /* called with rcu_read_lock() */
> -static int bond_miimon_inspect(struct bonding *bond)
> +static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave)
>   {
> -	int link_state, commit = 0;
> -	struct list_head *iter;
> -	struct slave *slave;
> +	int link_state;
>   	bool ignore_updelay;
>   
>   	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
>   
> -	bond_for_each_slave_rcu(bond, slave, iter) {
> -		slave->new_link = BOND_LINK_NOCHANGE;
> +	slave->new_link = BOND_LINK_NOCHANGE;
>   
> -		link_state = bond_check_dev_link(bond, slave->dev, 0);
> +	link_state = bond_check_dev_link(bond, slave->dev, 0);
>   
> -		switch (slave->link) {
> -		case BOND_LINK_UP:
> -			if (link_state)
> -				continue;
> +	switch (slave->link) {
> +	case BOND_LINK_UP:
> +		if (link_state)
> +			return 0;
>   
> -			bond_set_slave_link_state(slave, BOND_LINK_FAIL,
> +		bond_set_slave_link_state(slave, BOND_LINK_FAIL,
> +					  BOND_SLAVE_NOTIFY_LATER);
> +		slave->delay = bond->params.downdelay;
> +		if (slave->delay) {
> +			netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
> +				    (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ?
> +				    (bond_is_active_slave(slave) ?
> +				     "active " : "backup ") : "",
> +				    slave->dev->name,
> +				    bond->params.downdelay * bond->params.miimon);
> +		}
> +		/*FALLTHRU*/
> +	case BOND_LINK_FAIL:
> +		if (link_state) {
> +			/* recovered before downdelay expired */
> +			bond_set_slave_link_state(slave, BOND_LINK_UP,
>   						  BOND_SLAVE_NOTIFY_LATER);
> -			slave->delay = bond->params.downdelay;
> -			if (slave->delay) {
> -				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
> -					    (BOND_MODE(bond) ==
> -					     BOND_MODE_ACTIVEBACKUP) ?
> -					     (bond_is_active_slave(slave) ?
> -					      "active " : "backup ") : "",
> -					    slave->dev->name,
> -					    bond->params.downdelay * bond->params.miimon);
> -			}
> -			/*FALLTHRU*/
> -		case BOND_LINK_FAIL:
> -			if (link_state) {
> -				/* recovered before downdelay expired */
> -				bond_set_slave_link_state(slave, BOND_LINK_UP,
> -							  BOND_SLAVE_NOTIFY_LATER);
> -				slave->last_link_up = jiffies;
> -				netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
> -					    (bond->params.downdelay - slave->delay) *
> -					    bond->params.miimon,
> -					    slave->dev->name);
> -				continue;
> -			}
> +			slave->last_link_up = jiffies;
> +			netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
> +				    (bond->params.downdelay - slave->delay) *
> +				    bond->params.miimon, slave->dev->name);
> +			return 0;
> +		}
>   
> -			if (slave->delay <= 0) {
> -				slave->new_link = BOND_LINK_DOWN;
> -				commit++;
> -				continue;
> -			}
> +		if (slave->delay <= 0) {
> +			slave->new_link = BOND_LINK_DOWN;
> +			return 1;
> +		}
>   
> -			slave->delay--;
> -			break;
> +		slave->delay--;
> +		break;
>   
> -		case BOND_LINK_DOWN:
> -			if (!link_state)
> -				continue;
> +	case BOND_LINK_DOWN:
> +		if (!link_state)
> +			return 0;
>   
> -			bond_set_slave_link_state(slave, BOND_LINK_BACK,
> -						  BOND_SLAVE_NOTIFY_LATER);
> -			slave->delay = bond->params.updelay;
> -
> -			if (slave->delay) {
> -				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
> -					    slave->dev->name,
> -					    ignore_updelay ? 0 :
> -					    bond->params.updelay *
> -					    bond->params.miimon);
> -			}
> -			/*FALLTHRU*/
> -		case BOND_LINK_BACK:
> -			if (!link_state) {
> -				bond_set_slave_link_state(slave,
> -							  BOND_LINK_DOWN,
> -							  BOND_SLAVE_NOTIFY_LATER);
> -				netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
> -					    (bond->params.updelay - slave->delay) *
> -					    bond->params.miimon,
> -					    slave->dev->name);
> +		bond_set_slave_link_state(slave, BOND_LINK_BACK,
> +					  BOND_SLAVE_NOTIFY_LATER);
> +		slave->delay = bond->params.updelay;
>   
> -				continue;
> -			}
> +		if (slave->delay) {
> +			netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
> +				    slave->dev->name, ignore_updelay ? 0 :
> +				    bond->params.updelay * bond->params.miimon);
> +		}
> +		/*FALLTHRU*/
> +	case BOND_LINK_BACK:
> +		if (!link_state) {
> +			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
> +						  BOND_SLAVE_NOTIFY_LATER);
> +			netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
> +				    (bond->params.updelay - slave->delay) *
> +				    bond->params.miimon, slave->dev->name);
>   
> -			if (ignore_updelay)
> -				slave->delay = 0;
> +			return 0;
> +		}
>   
> -			if (slave->delay <= 0) {
> -				slave->new_link = BOND_LINK_UP;
> -				commit++;
> -				ignore_updelay = false;
> -				continue;
> -			}
> +		if (ignore_updelay)
> +			slave->delay = 0;
>   
> -			slave->delay--;
> -			break;
> +		if (slave->delay <= 0) {
> +			slave->new_link = BOND_LINK_UP;
> +			return 1;
>   		}
> +
> +		slave->delay--;
> +		break;
>   	}
>   
> -	return commit;
> +	return 0;
>   }
>   
> -static void bond_miimon_commit(struct bonding *bond)
> +static int bond_miimon_inspect(struct bonding *bond)
>   {
>   	struct list_head *iter;
> -	struct slave *slave, *primary;
> +	struct slave *slave;
> +	int commit = 0;
>   
> -	bond_for_each_slave(bond, slave, iter) {
> -		switch (slave->new_link) {
> -		case BOND_LINK_NOCHANGE:
> -			continue;
> +	bond_for_each_slave_rcu(bond, slave, iter)
> +		commit += bond_miimon_inspect_slave(bond, slave);
>   
> -		case BOND_LINK_UP:
> -			bond_set_slave_link_state(slave, BOND_LINK_UP,
> -						  BOND_SLAVE_NOTIFY_NOW);
> -			slave->last_link_up = jiffies;
> +	return commit;
> +}
>   
> -			primary = rtnl_dereference(bond->primary_slave);
> -			if (BOND_MODE(bond) == BOND_MODE_8023AD) {
> -				/* prevent it from being the active one */
> -				bond_set_backup_slave(slave);
> -			} else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
> -				/* make it immediately active */
> -				bond_set_active_slave(slave);
> -			} else if (slave != primary) {
> -				/* prevent it from being the active one */
> -				bond_set_backup_slave(slave);
> -			}
> +static void bond_miimon_commit_slave(struct bonding *bond, struct slave *slave)
> +{
> +	struct slave *primary;
>   
> -			netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n",
> -				    slave->dev->name,
> -				    slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
> -				    slave->duplex ? "full" : "half");
> +	switch (slave->new_link) {
> +	case BOND_LINK_NOCHANGE:
> +		return;
>   
> -			/* notify ad that the link status has changed */
> -			if (BOND_MODE(bond) == BOND_MODE_8023AD)
> -				bond_3ad_handle_link_change(slave, BOND_LINK_UP);
> +	case BOND_LINK_UP:
> +		bond_set_slave_link_state(slave, BOND_LINK_UP,
> +					  BOND_SLAVE_NOTIFY_NOW);
> +		slave->last_link_up = jiffies;
>   
> -			if (bond_is_lb(bond))
> -				bond_alb_handle_link_change(bond, slave,
> -							    BOND_LINK_UP);
> +		primary = rtnl_dereference(bond->primary_slave);
> +		if (BOND_MODE(bond) == BOND_MODE_8023AD) {
> +			/* prevent it from being the active one */
> +			bond_set_backup_slave(slave);
> +		} else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
> +			/* make it immediately active */
> +			bond_set_active_slave(slave);
> +		} else if (slave != primary) {
> +			/* prevent it from being the active one */
> +			bond_set_backup_slave(slave);
> +		}
>   
> -			if (BOND_MODE(bond) == BOND_MODE_XOR)
> -				bond_update_slave_arr(bond, NULL);
> +		netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n",
> +			    slave->dev->name,
> +			    slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
> +			    slave->duplex ? "full" : "half");
>   
> -			if (!bond->curr_active_slave || slave == primary)
> -				goto do_failover;
> +		/* notify ad that the link status has changed */
> +		if (BOND_MODE(bond) == BOND_MODE_8023AD)
> +			bond_3ad_handle_link_change(slave, BOND_LINK_UP);
>   
> -			continue;
> +		if (bond_is_lb(bond))
> +			bond_alb_handle_link_change(bond, slave, BOND_LINK_UP);
>   
> -		case BOND_LINK_DOWN:
> -			if (slave->link_failure_count < UINT_MAX)
> -				slave->link_failure_count++;
> +		if (BOND_MODE(bond) == BOND_MODE_XOR)
> +			bond_update_slave_arr(bond, NULL);
>   
> -			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
> -						  BOND_SLAVE_NOTIFY_NOW);
> +		if (!bond->curr_active_slave || slave == primary)
> +			goto do_failover;
>   
> -			if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
> -			    BOND_MODE(bond) == BOND_MODE_8023AD)
> -				bond_set_slave_inactive_flags(slave,
> -							      BOND_SLAVE_NOTIFY_NOW);
> +		goto out;
>   
> -			netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n",
> -				    slave->dev->name);
> +	case BOND_LINK_DOWN:
> +		if (slave->link_failure_count < UINT_MAX)
> +			slave->link_failure_count++;
>   
> -			if (BOND_MODE(bond) == BOND_MODE_8023AD)
> -				bond_3ad_handle_link_change(slave,
> -							    BOND_LINK_DOWN);
> +		bond_set_slave_link_state(slave, BOND_LINK_DOWN,
> +					  BOND_SLAVE_NOTIFY_NOW);
>   
> -			if (bond_is_lb(bond))
> -				bond_alb_handle_link_change(bond, slave,
> -							    BOND_LINK_DOWN);
> +		if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
> +		    BOND_MODE(bond) == BOND_MODE_8023AD)
> +			bond_set_slave_inactive_flags(slave,
> +						      BOND_SLAVE_NOTIFY_NOW);
>   
> -			if (BOND_MODE(bond) == BOND_MODE_XOR)
> -				bond_update_slave_arr(bond, NULL);
> +		netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n",
> +			    slave->dev->name);
>   
> -			if (slave == rcu_access_pointer(bond->curr_active_slave))
> -				goto do_failover;
> +		if (BOND_MODE(bond) == BOND_MODE_8023AD)
> +			bond_3ad_handle_link_change(slave, BOND_LINK_DOWN);
>   
> -			continue;
> +		if (bond_is_lb(bond))
> +			bond_alb_handle_link_change(bond, slave, BOND_LINK_DOWN);
>   
> -		default:
> -			netdev_err(bond->dev, "invalid new link %d on slave %s\n",
> -				   slave->new_link, slave->dev->name);
> -			slave->new_link = BOND_LINK_NOCHANGE;
> +		if (BOND_MODE(bond) == BOND_MODE_XOR)
> +			bond_update_slave_arr(bond, NULL);
>   
> -			continue;
> -		}
> +		if (slave == rcu_access_pointer(bond->curr_active_slave))
> +			goto do_failover;
>   
> -do_failover:
> -		block_netpoll_tx();
> -		bond_select_active_slave(bond);
> -		unblock_netpoll_tx();
> +		goto out;
> +
> +	default:
> +		netdev_err(bond->dev, "invalid new link %d on slave %s\n",
> +			   slave->new_link, slave->dev->name);
> +		slave->new_link = BOND_LINK_NOCHANGE;
> +
> +		goto out;
>   	}
>   
> +do_failover:
> +	block_netpoll_tx();
> +	bond_select_active_slave(bond);
> +	unblock_netpoll_tx();
> +
> +out:
>   	bond_set_carrier(bond);
>   }
>   
> +static void bond_miimon_commit(struct bonding *bond)
> +{
> +	struct list_head *iter;
> +	struct slave *slave;
> +
> +	bond_for_each_slave(bond, slave, iter)
> +		bond_miimon_commit_slave(bond, slave);
> +}
> +
>   /* bond_mii_monitor
>    *
>    * Really a wrapper that splits the mii monitor into two phases: an
> @@ -3016,6 +3019,9 @@ static int bond_slave_netdev_event(unsigned long event,
>   			bond_3ad_adapter_speed_duplex_changed(slave);
>   		/* Fallthrough */
>   	case NETDEV_DOWN:
> +		if (bond_miimon_inspect_slave(bond, slave))
> +			bond_miimon_commit_slave(bond, slave);
> +
>   		/* Refresh slave-array if applicable!
>   		 * If the setup does not use miimon or arpmon (mode-specific!),
>   		 * then these events will not cause the slave-array to be
>
>
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com
>

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [RFC PATCH net-next] bonding: Use notifiers for slave link state detection
  2016-01-11  9:03                           ` zhuyj
@ 2016-01-13  2:54                             ` zhuyj
  0 siblings, 0 replies; 52+ messages in thread
From: zhuyj @ 2016-01-13  2:54 UTC (permalink / raw)
  To: Jay Vosburgh, Tantilov, Emil S
  Cc: mkubecek, vfalico, gospo, netdev, Shteinbock, Boris (Wind River), zhuyj

Hi, Jay && Emil

Any comments?

Best Regards!
Zhu Yanjun

On 01/11/2016 05:03 PM, zhuyj wrote:
> Hi, Jay && Emil
>
> I delved into the source code. This patch is based on notifiers. When 
> a NETDEV_UP
> notifier is received in bond_slave_netdev_event, in 
> bond_miimon_inspect_slave, bond_check_dev_link
> is called to detect link_state.
>
> Because of link flap, link_state is sometimes different from 
> NETDEV_UP. That is, though event is NETDEV_UP,
> sometime link_state is down because of link flap.
>
> In the following patch, if link_state is different from the event, it 
> is unnecessary to make further setup.
>
> diff --git a/drivers/net/bonding/bond_main.c 
> b/drivers/net/bonding/bond_main.c
> index 12dd533..1b53da0 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2012,7 +2012,8 @@ static int bond_slave_info_query(struct 
> net_device *bond_dev, struct ifslave *in
>  /*-------------------------------- Monitoring 
> -------------------------------*/
>
>  /* called with rcu_read_lock() */
> -static int bond_miimon_inspect_slave(struct bonding *bond, struct 
> slave *slave)
> +static int bond_miimon_inspect_slave(struct bonding *bond, struct 
> slave *slave,
> +                                    unsigned long event)
>  {
>         int link_state;
>         bool ignore_updelay;
> @@ -2022,6 +2023,17 @@ static int bond_miimon_inspect_slave(struct 
> bonding *bond, struct slave *slave)
>         slave->new_link = BOND_LINK_NOCHANGE;
>
>         link_state = bond_check_dev_link(bond, slave->dev, 0);
> +       switch (event) {
> +       case NETDEV_UP:
> +               if (!link_state)
> +                       return 0;
> +               break;
> +
> +       case NETDEV_DOWN:
> +               if (link_state)
> +                       return 0;
> +               break;
> +       }
>
>         switch (slave->link) {
>         case BOND_LINK_UP:
> @@ -2107,7 +2119,7 @@ static int bond_miimon_inspect(struct bonding 
> *bond)
>         int commit = 0;
>
>         bond_for_each_slave_rcu(bond, slave, iter)
> -               commit += bond_miimon_inspect_slave(bond, slave);
> +               commit += bond_miimon_inspect_slave(bond, slave, 0xFF);
>
>         return commit;
>  }
> @@ -3019,7 +3031,7 @@ static int bond_slave_netdev_event(unsigned long 
> event,
> bond_3ad_adapter_speed_duplex_changed(slave);
>                 /* Fallthrough */
>         case NETDEV_DOWN:
> -               if (bond_miimon_inspect_slave(bond, slave))
> +               if (bond_miimon_inspect_slave(bond, slave, event))
>                         bond_miimon_commit_slave(bond, slave);
>
>                 /* Refresh slave-array if applicable!
>
> Best Regards!
> Zhu Yanjun
>
> On 01/09/2016 10:19 AM, Jay Vosburgh wrote:
>> Tantilov, Emil S <emil.s.tantilov@intel.com> wrote:
>>
>>>> -----Original Message-----
>>> From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com]
>>>> Sent: Thursday, January 07, 2016 5:29 PM
>>>> Subject: [RFC PATCH net-next] bonding: Use notifiers for slave link 
>>>> state
>>>> detection
>>>>
>>>>
>>>>     TEST PATCH
>>>>
>>>>     This patch modifies bonding to utilize notifier callbacks to
>>>> detect slave link state changes.  It is intended to be used with 
>>>> miimon
>>>> set to zero, and does not support the updelay or downdelay options to
>>>> bonding.  It's not as complicated as it looks; most of the change 
>>>> set is
>>>> to break out the inner loop of bond_miimon_inspect into its own
>>>> function.
>>> Jay,
>>>
>>> I managed to do a quick test with this patch and occasionally there is
>>> a case where I see the bonding driver reporting link up for an
>>> interface (eth1) that is not up just yet:
>> [...]
>>> [12985.213752] ixgbe 0000:01:00.0 eth0: NIC Link is Up 10 Gbps, Flow 
>>> Control: RX/TX
>>> [12985.213970] bond0: link status definitely up for interface eth0, 
>>> 10000 Mbps full duplex
>>> [12985.213975] bond0: link status definitely up for interface eth1, 
>>> 0 Mbps full duplex
>>     Thanks for testing; the misbehavior is because I cheaped out and
>> didn't break out the commit function into a "single slave" version.  The
>> below patch (against net-next, replacing the original patch) shouldn't
>> generate the erroneous additional link messages any more.
>>
>>     This does generate an RCU warning, although the code actually is
>> safe (since the notifier callback holds RTNL); I'll sort that out next
>> week.
>>
>>     -J
>>
>>
>> diff --git a/drivers/net/bonding/bond_main.c 
>> b/drivers/net/bonding/bond_main.c
>> index cab99fd..12dd533 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -2012,203 +2012,206 @@ static int bond_slave_info_query(struct 
>> net_device *bond_dev, struct ifslave *in
>>   /*-------------------------------- Monitoring 
>> -------------------------------*/
>>     /* called with rcu_read_lock() */
>> -static int bond_miimon_inspect(struct bonding *bond)
>> +static int bond_miimon_inspect_slave(struct bonding *bond, struct 
>> slave *slave)
>>   {
>> -    int link_state, commit = 0;
>> -    struct list_head *iter;
>> -    struct slave *slave;
>> +    int link_state;
>>       bool ignore_updelay;
>>         ignore_updelay = !rcu_dereference(bond->curr_active_slave);
>>   -    bond_for_each_slave_rcu(bond, slave, iter) {
>> -        slave->new_link = BOND_LINK_NOCHANGE;
>> +    slave->new_link = BOND_LINK_NOCHANGE;
>>   -        link_state = bond_check_dev_link(bond, slave->dev, 0);
>> +    link_state = bond_check_dev_link(bond, slave->dev, 0);
>>   -        switch (slave->link) {
>> -        case BOND_LINK_UP:
>> -            if (link_state)
>> -                continue;
>> +    switch (slave->link) {
>> +    case BOND_LINK_UP:
>> +        if (link_state)
>> +            return 0;
>>   -            bond_set_slave_link_state(slave, BOND_LINK_FAIL,
>> +        bond_set_slave_link_state(slave, BOND_LINK_FAIL,
>> +                      BOND_SLAVE_NOTIFY_LATER);
>> +        slave->delay = bond->params.downdelay;
>> +        if (slave->delay) {
>> +            netdev_info(bond->dev, "link status down for %sinterface 
>> %s, disabling it in %d ms\n",
>> +                    (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ?
>> +                    (bond_is_active_slave(slave) ?
>> +                     "active " : "backup ") : "",
>> +                    slave->dev->name,
>> +                    bond->params.downdelay * bond->params.miimon);
>> +        }
>> +        /*FALLTHRU*/
>> +    case BOND_LINK_FAIL:
>> +        if (link_state) {
>> +            /* recovered before downdelay expired */
>> +            bond_set_slave_link_state(slave, BOND_LINK_UP,
>>                             BOND_SLAVE_NOTIFY_LATER);
>> -            slave->delay = bond->params.downdelay;
>> -            if (slave->delay) {
>> -                netdev_info(bond->dev, "link status down for 
>> %sinterface %s, disabling it in %d ms\n",
>> -                        (BOND_MODE(bond) ==
>> -                         BOND_MODE_ACTIVEBACKUP) ?
>> -                         (bond_is_active_slave(slave) ?
>> -                          "active " : "backup ") : "",
>> -                        slave->dev->name,
>> -                        bond->params.downdelay * bond->params.miimon);
>> -            }
>> -            /*FALLTHRU*/
>> -        case BOND_LINK_FAIL:
>> -            if (link_state) {
>> -                /* recovered before downdelay expired */
>> -                bond_set_slave_link_state(slave, BOND_LINK_UP,
>> -                              BOND_SLAVE_NOTIFY_LATER);
>> -                slave->last_link_up = jiffies;
>> -                netdev_info(bond->dev, "link status up again after 
>> %d ms for interface %s\n",
>> -                        (bond->params.downdelay - slave->delay) *
>> -                        bond->params.miimon,
>> -                        slave->dev->name);
>> -                continue;
>> -            }
>> +            slave->last_link_up = jiffies;
>> +            netdev_info(bond->dev, "link status up again after %d ms 
>> for interface %s\n",
>> +                    (bond->params.downdelay - slave->delay) *
>> +                    bond->params.miimon, slave->dev->name);
>> +            return 0;
>> +        }
>>   -            if (slave->delay <= 0) {
>> -                slave->new_link = BOND_LINK_DOWN;
>> -                commit++;
>> -                continue;
>> -            }
>> +        if (slave->delay <= 0) {
>> +            slave->new_link = BOND_LINK_DOWN;
>> +            return 1;
>> +        }
>>   -            slave->delay--;
>> -            break;
>> +        slave->delay--;
>> +        break;
>>   -        case BOND_LINK_DOWN:
>> -            if (!link_state)
>> -                continue;
>> +    case BOND_LINK_DOWN:
>> +        if (!link_state)
>> +            return 0;
>>   -            bond_set_slave_link_state(slave, BOND_LINK_BACK,
>> -                          BOND_SLAVE_NOTIFY_LATER);
>> -            slave->delay = bond->params.updelay;
>> -
>> -            if (slave->delay) {
>> -                netdev_info(bond->dev, "link status up for interface 
>> %s, enabling it in %d ms\n",
>> -                        slave->dev->name,
>> -                        ignore_updelay ? 0 :
>> -                        bond->params.updelay *
>> -                        bond->params.miimon);
>> -            }
>> -            /*FALLTHRU*/
>> -        case BOND_LINK_BACK:
>> -            if (!link_state) {
>> -                bond_set_slave_link_state(slave,
>> -                              BOND_LINK_DOWN,
>> -                              BOND_SLAVE_NOTIFY_LATER);
>> -                netdev_info(bond->dev, "link status down again after 
>> %d ms for interface %s\n",
>> -                        (bond->params.updelay - slave->delay) *
>> -                        bond->params.miimon,
>> -                        slave->dev->name);
>> +        bond_set_slave_link_state(slave, BOND_LINK_BACK,
>> +                      BOND_SLAVE_NOTIFY_LATER);
>> +        slave->delay = bond->params.updelay;
>>   -                continue;
>> -            }
>> +        if (slave->delay) {
>> +            netdev_info(bond->dev, "link status up for interface %s, 
>> enabling it in %d ms\n",
>> +                    slave->dev->name, ignore_updelay ? 0 :
>> +                    bond->params.updelay * bond->params.miimon);
>> +        }
>> +        /*FALLTHRU*/
>> +    case BOND_LINK_BACK:
>> +        if (!link_state) {
>> +            bond_set_slave_link_state(slave, BOND_LINK_DOWN,
>> +                          BOND_SLAVE_NOTIFY_LATER);
>> +            netdev_info(bond->dev, "link status down again after %d 
>> ms for interface %s\n",
>> +                    (bond->params.updelay - slave->delay) *
>> +                    bond->params.miimon, slave->dev->name);
>>   -            if (ignore_updelay)
>> -                slave->delay = 0;
>> +            return 0;
>> +        }
>>   -            if (slave->delay <= 0) {
>> -                slave->new_link = BOND_LINK_UP;
>> -                commit++;
>> -                ignore_updelay = false;
>> -                continue;
>> -            }
>> +        if (ignore_updelay)
>> +            slave->delay = 0;
>>   -            slave->delay--;
>> -            break;
>> +        if (slave->delay <= 0) {
>> +            slave->new_link = BOND_LINK_UP;
>> +            return 1;
>>           }
>> +
>> +        slave->delay--;
>> +        break;
>>       }
>>   -    return commit;
>> +    return 0;
>>   }
>>   -static void bond_miimon_commit(struct bonding *bond)
>> +static int bond_miimon_inspect(struct bonding *bond)
>>   {
>>       struct list_head *iter;
>> -    struct slave *slave, *primary;
>> +    struct slave *slave;
>> +    int commit = 0;
>>   -    bond_for_each_slave(bond, slave, iter) {
>> -        switch (slave->new_link) {
>> -        case BOND_LINK_NOCHANGE:
>> -            continue;
>> +    bond_for_each_slave_rcu(bond, slave, iter)
>> +        commit += bond_miimon_inspect_slave(bond, slave);
>>   -        case BOND_LINK_UP:
>> -            bond_set_slave_link_state(slave, BOND_LINK_UP,
>> -                          BOND_SLAVE_NOTIFY_NOW);
>> -            slave->last_link_up = jiffies;
>> +    return commit;
>> +}
>>   -            primary = rtnl_dereference(bond->primary_slave);
>> -            if (BOND_MODE(bond) == BOND_MODE_8023AD) {
>> -                /* prevent it from being the active one */
>> -                bond_set_backup_slave(slave);
>> -            } else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
>> -                /* make it immediately active */
>> -                bond_set_active_slave(slave);
>> -            } else if (slave != primary) {
>> -                /* prevent it from being the active one */
>> -                bond_set_backup_slave(slave);
>> -            }
>> +static void bond_miimon_commit_slave(struct bonding *bond, struct 
>> slave *slave)
>> +{
>> +    struct slave *primary;
>>   -            netdev_info(bond->dev, "link status definitely up for 
>> interface %s, %u Mbps %s duplex\n",
>> -                    slave->dev->name,
>> -                    slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
>> -                    slave->duplex ? "full" : "half");
>> +    switch (slave->new_link) {
>> +    case BOND_LINK_NOCHANGE:
>> +        return;
>>   -            /* notify ad that the link status has changed */
>> -            if (BOND_MODE(bond) == BOND_MODE_8023AD)
>> -                bond_3ad_handle_link_change(slave, BOND_LINK_UP);
>> +    case BOND_LINK_UP:
>> +        bond_set_slave_link_state(slave, BOND_LINK_UP,
>> +                      BOND_SLAVE_NOTIFY_NOW);
>> +        slave->last_link_up = jiffies;
>>   -            if (bond_is_lb(bond))
>> -                bond_alb_handle_link_change(bond, slave,
>> -                                BOND_LINK_UP);
>> +        primary = rtnl_dereference(bond->primary_slave);
>> +        if (BOND_MODE(bond) == BOND_MODE_8023AD) {
>> +            /* prevent it from being the active one */
>> +            bond_set_backup_slave(slave);
>> +        } else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
>> +            /* make it immediately active */
>> +            bond_set_active_slave(slave);
>> +        } else if (slave != primary) {
>> +            /* prevent it from being the active one */
>> +            bond_set_backup_slave(slave);
>> +        }
>>   -            if (BOND_MODE(bond) == BOND_MODE_XOR)
>> -                bond_update_slave_arr(bond, NULL);
>> +        netdev_info(bond->dev, "link status definitely up for 
>> interface %s, %u Mbps %s duplex\n",
>> +                slave->dev->name,
>> +                slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
>> +                slave->duplex ? "full" : "half");
>>   -            if (!bond->curr_active_slave || slave == primary)
>> -                goto do_failover;
>> +        /* notify ad that the link status has changed */
>> +        if (BOND_MODE(bond) == BOND_MODE_8023AD)
>> +            bond_3ad_handle_link_change(slave, BOND_LINK_UP);
>>   -            continue;
>> +        if (bond_is_lb(bond))
>> +            bond_alb_handle_link_change(bond, slave, BOND_LINK_UP);
>>   -        case BOND_LINK_DOWN:
>> -            if (slave->link_failure_count < UINT_MAX)
>> -                slave->link_failure_count++;
>> +        if (BOND_MODE(bond) == BOND_MODE_XOR)
>> +            bond_update_slave_arr(bond, NULL);
>>   -            bond_set_slave_link_state(slave, BOND_LINK_DOWN,
>> -                          BOND_SLAVE_NOTIFY_NOW);
>> +        if (!bond->curr_active_slave || slave == primary)
>> +            goto do_failover;
>>   -            if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
>> -                BOND_MODE(bond) == BOND_MODE_8023AD)
>> -                bond_set_slave_inactive_flags(slave,
>> -                                  BOND_SLAVE_NOTIFY_NOW);
>> +        goto out;
>>   -            netdev_info(bond->dev, "link status definitely down 
>> for interface %s, disabling it\n",
>> -                    slave->dev->name);
>> +    case BOND_LINK_DOWN:
>> +        if (slave->link_failure_count < UINT_MAX)
>> +            slave->link_failure_count++;
>>   -            if (BOND_MODE(bond) == BOND_MODE_8023AD)
>> -                bond_3ad_handle_link_change(slave,
>> -                                BOND_LINK_DOWN);
>> +        bond_set_slave_link_state(slave, BOND_LINK_DOWN,
>> +                      BOND_SLAVE_NOTIFY_NOW);
>>   -            if (bond_is_lb(bond))
>> -                bond_alb_handle_link_change(bond, slave,
>> -                                BOND_LINK_DOWN);
>> +        if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
>> +            BOND_MODE(bond) == BOND_MODE_8023AD)
>> +            bond_set_slave_inactive_flags(slave,
>> +                              BOND_SLAVE_NOTIFY_NOW);
>>   -            if (BOND_MODE(bond) == BOND_MODE_XOR)
>> -                bond_update_slave_arr(bond, NULL);
>> +        netdev_info(bond->dev, "link status definitely down for 
>> interface %s, disabling it\n",
>> +                slave->dev->name);
>>   -            if (slave == rcu_access_pointer(bond->curr_active_slave))
>> -                goto do_failover;
>> +        if (BOND_MODE(bond) == BOND_MODE_8023AD)
>> +            bond_3ad_handle_link_change(slave, BOND_LINK_DOWN);
>>   -            continue;
>> +        if (bond_is_lb(bond))
>> +            bond_alb_handle_link_change(bond, slave, BOND_LINK_DOWN);
>>   -        default:
>> -            netdev_err(bond->dev, "invalid new link %d on slave %s\n",
>> -                   slave->new_link, slave->dev->name);
>> -            slave->new_link = BOND_LINK_NOCHANGE;
>> +        if (BOND_MODE(bond) == BOND_MODE_XOR)
>> +            bond_update_slave_arr(bond, NULL);
>>   -            continue;
>> -        }
>> +        if (slave == rcu_access_pointer(bond->curr_active_slave))
>> +            goto do_failover;
>>   -do_failover:
>> -        block_netpoll_tx();
>> -        bond_select_active_slave(bond);
>> -        unblock_netpoll_tx();
>> +        goto out;
>> +
>> +    default:
>> +        netdev_err(bond->dev, "invalid new link %d on slave %s\n",
>> +               slave->new_link, slave->dev->name);
>> +        slave->new_link = BOND_LINK_NOCHANGE;
>> +
>> +        goto out;
>>       }
>>   +do_failover:
>> +    block_netpoll_tx();
>> +    bond_select_active_slave(bond);
>> +    unblock_netpoll_tx();
>> +
>> +out:
>>       bond_set_carrier(bond);
>>   }
>>   +static void bond_miimon_commit(struct bonding *bond)
>> +{
>> +    struct list_head *iter;
>> +    struct slave *slave;
>> +
>> +    bond_for_each_slave(bond, slave, iter)
>> +        bond_miimon_commit_slave(bond, slave);
>> +}
>> +
>>   /* bond_mii_monitor
>>    *
>>    * Really a wrapper that splits the mii monitor into two phases: an
>> @@ -3016,6 +3019,9 @@ static int bond_slave_netdev_event(unsigned 
>> long event,
>>               bond_3ad_adapter_speed_duplex_changed(slave);
>>           /* Fallthrough */
>>       case NETDEV_DOWN:
>> +        if (bond_miimon_inspect_slave(bond, slave))
>> +            bond_miimon_commit_slave(bond, slave);
>> +
>>           /* Refresh slave-array if applicable!
>>            * If the setup does not use miimon or arpmon 
>> (mode-specific!),
>>            * then these events will not cause the slave-array to be
>>
>>
>> ---
>>     -Jay Vosburgh, jay.vosburgh@canonical.com
>>
>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC PATCH net-next] bonding: Use notifiers for slave link state detection
  2016-01-09  2:19                         ` Jay Vosburgh
  2016-01-11  9:03                           ` zhuyj
@ 2016-01-13 17:03                           ` Tantilov, Emil S
  2016-01-20  5:13                             ` [PATCH 1/1] " zyjzyj2000
  2016-01-21 10:16                             ` zyjzyj2000
  1 sibling, 2 replies; 52+ messages in thread
From: Tantilov, Emil S @ 2016-01-13 17:03 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: zyjzyj2000, mkubecek, vfalico, gospo, netdev, Shteinbock,
	Boris (Wind River)

>-----Original Message-----
>From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com]
>Sent: Friday, January 08, 2016 6:20 PM
>To: Tantilov, Emil S
>Cc: zyjzyj2000@gmail.com; mkubecek@suse.cz; vfalico@gmail.com;
>gospo@cumulusnetworks.com; netdev@vger.kernel.org; Shteinbock, Boris (Wind
>River)
>Subject: Re: [RFC PATCH net-next] bonding: Use notifiers for slave link
>state detection
>
>Tantilov, Emil S <emil.s.tantilov@intel.com> wrote:
>
>>>-----Original Message-----
>>From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com]
>>>Sent: Thursday, January 07, 2016 5:29 PM
>>>Subject: [RFC PATCH net-next] bonding: Use notifiers for slave link state
>>>detection
>>>
>>>
>>>	TEST PATCH
>>>
>>>	This patch modifies bonding to utilize notifier callbacks to
>>>detect slave link state changes.  It is intended to be used with miimon
>>>set to zero, and does not support the updelay or downdelay options to
>>>bonding.  It's not as complicated as it looks; most of the change set is
>>>to break out the inner loop of bond_miimon_inspect into its own
>>>function.
>>
>>Jay,
>>
>>I managed to do a quick test with this patch and occasionally there is
>>a case where I see the bonding driver reporting link up for an
>>interface (eth1) that is not up just yet:
>[...]
>>[12985.213752] ixgbe 0000:01:00.0 eth0: NIC Link is Up 10 Gbps, Flow
>Control: RX/TX
>>[12985.213970] bond0: link status definitely up for interface eth0, 10000
>Mbps full duplex
>>[12985.213975] bond0: link status definitely up for interface eth1, 0 Mbps
>full duplex
>
>	Thanks for testing; the misbehavior is because I cheaped out and
>didn't break out the commit function into a "single slave" version.  The
>below patch (against net-next, replacing the original patch) shouldn't
>generate the erroneous additional link messages any more.
>
>	This does generate an RCU warning, although the code actually is
>safe (since the notifier callback holds RTNL); I'll sort that out next
>week.
>
>	-J

Alright, so I was able to kick off another test with this patch and it's
been running for 24 hours+ without errors. The setup I have has all kinds of
link issues, so it's a pretty good stress test.

Note that the issue that started this thread was due to the ixgbe driver 
reporting speed directly from the LINKS register, which is no longer the case,
so just using notifiers is probably not enough to get around this issue, but
this should not be a problem anymore (at least for ixgbe).

Thanks,
Emil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-13 17:03                           ` Tantilov, Emil S
@ 2016-01-20  5:13                             ` zyjzyj2000
  2016-01-20  5:13                               ` zyjzyj2000
  2016-01-21 10:16                             ` zyjzyj2000
  1 sibling, 1 reply; 52+ messages in thread
From: zyjzyj2000 @ 2016-01-20  5:13 UTC (permalink / raw)
  To: zyjzyj2000, mkubecek, vfalico, gospo, netdev, boris.shteinbock


Hi, Jay && Emil

Thanks for your hard work.

I think the similar patch is needed in linux kernel 4.4. As such, based on linux kernel 4.4, I made this patch. Please comment.

Best Regards!
Zhu Yanjun

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-20  5:13                             ` [PATCH 1/1] " zyjzyj2000
@ 2016-01-20  5:13                               ` zyjzyj2000
  0 siblings, 0 replies; 52+ messages in thread
From: zyjzyj2000 @ 2016-01-20  5:13 UTC (permalink / raw)
  To: zyjzyj2000, mkubecek, vfalico, gospo, netdev, boris.shteinbock

From: Zhu Yanjun <zyjzyj2000@gmail.com>

Bonding will utilize notifier callbacks to detect slave
link state changes. It is intended to be used with miimon
set to zero, and does not support the updelay or downdelay
options to bonding.

Because of link flap from the slave interface, if the notifier
is NETDEV_UP while the actual link state is down, it is not
necessary to continue.

Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Tested-by: Tantilov, Emil S <emil.s.tantilov@intel.com>
Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
---
 drivers/net/bonding/bond_main.c |  317 +++++++++++++++++++++------------------
 1 file changed, 170 insertions(+), 147 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 56b5605..9f67948 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2015,203 +2015,223 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
 /*-------------------------------- Monitoring -------------------------------*/
 
 /* called with rcu_read_lock() */
-static int bond_miimon_inspect(struct bonding *bond)
+static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
+				     unsigned long event)
 {
-	int link_state, commit = 0;
-	struct list_head *iter;
-	struct slave *slave;
+	int link_state;
 	bool ignore_updelay;
 
 	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
 
-	bond_for_each_slave_rcu(bond, slave, iter) {
-		slave->new_link = BOND_LINK_NOCHANGE;
+	slave->new_link = BOND_LINK_NOCHANGE;
 
-		link_state = bond_check_dev_link(bond, slave->dev, 0);
+	link_state = bond_check_dev_link(bond, slave->dev, 0);
 
-		switch (slave->link) {
-		case BOND_LINK_UP:
-			if (link_state)
-				continue;
+	/* Because of link flap from the slave interface, it is possilbe that
+	 * the notifiler is NETDEV_UP while the actual link state is down. If
+	 * so, it is not necessary to contiune.
+	 */
+	switch (event) {
+	case NETDEV_UP:
+		if (!link_state)
+			return 0;
+		break;
+
+	case NETDEV_DOWN:
+		if (link_state)
+			return 0;
+		break;
+	}
+
+	switch (slave->link) {
+	case BOND_LINK_UP:
+		if (link_state)
+			return 0;
 
-			bond_set_slave_link_state(slave, BOND_LINK_FAIL,
+		bond_set_slave_link_state(slave, BOND_LINK_FAIL,
+					  BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.downdelay;
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
+				    (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ?
+				    (bond_is_active_slave(slave) ?
+				     "active " : "backup ") : "",
+				    slave->dev->name,
+				    bond->params.downdelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_FAIL:
+		if (link_state) {
+			/* recovered before downdelay expired */
+			bond_set_slave_link_state(slave, BOND_LINK_UP,
 						  BOND_SLAVE_NOTIFY_LATER);
-			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
-					    (BOND_MODE(bond) ==
-					     BOND_MODE_ACTIVEBACKUP) ?
-					     (bond_is_active_slave(slave) ?
-					      "active " : "backup ") : "",
-					    slave->dev->name,
-					    bond->params.downdelay * bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_FAIL:
-			if (link_state) {
-				/* recovered before downdelay expired */
-				bond_set_slave_link_state(slave, BOND_LINK_UP,
-							  BOND_SLAVE_NOTIFY_LATER);
-				slave->last_link_up = jiffies;
-				netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
-					    (bond->params.downdelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
-				continue;
-			}
+			slave->last_link_up = jiffies;
+			netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
+				    (bond->params.downdelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_DOWN;
-				commit++;
-				continue;
-			}
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_DOWN;
+			return 1;
+		}
 
-			slave->delay--;
-			break;
+		slave->delay--;
+		break;
 
-		case BOND_LINK_DOWN:
-			if (!link_state)
-				continue;
+	case BOND_LINK_DOWN:
+		if (!link_state)
+			return 0;
 
-			bond_set_slave_link_state(slave, BOND_LINK_BACK,
-						  BOND_SLAVE_NOTIFY_LATER);
-			slave->delay = bond->params.updelay;
-
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
-					    slave->dev->name,
-					    ignore_updelay ? 0 :
-					    bond->params.updelay *
-					    bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_BACK:
-			if (!link_state) {
-				bond_set_slave_link_state(slave,
-							  BOND_LINK_DOWN,
-							  BOND_SLAVE_NOTIFY_LATER);
-				netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
-					    (bond->params.updelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
+		bond_set_slave_link_state(slave, BOND_LINK_BACK,
+					  BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.updelay;
 
-				continue;
-			}
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
+				    slave->dev->name, ignore_updelay ? 0 :
+				    bond->params.updelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_BACK:
+		if (!link_state) {
+			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
+						  BOND_SLAVE_NOTIFY_LATER);
+			netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
+				    (bond->params.updelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
 
-			if (ignore_updelay)
-				slave->delay = 0;
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_UP;
-				commit++;
-				ignore_updelay = false;
-				continue;
-			}
+		if (ignore_updelay)
+			slave->delay = 0;
 
-			slave->delay--;
-			break;
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_UP;
+			return 1;
 		}
+
+		slave->delay--;
+		break;
 	}
 
-	return commit;
+	return 0;
 }
 
-static void bond_miimon_commit(struct bonding *bond)
+static int bond_miimon_inspect(struct bonding *bond)
 {
 	struct list_head *iter;
-	struct slave *slave, *primary;
+	struct slave *slave;
+	int commit = 0;
 
-	bond_for_each_slave(bond, slave, iter) {
-		switch (slave->new_link) {
-		case BOND_LINK_NOCHANGE:
-			continue;
+	bond_for_each_slave_rcu(bond, slave, iter)
+		commit += bond_miimon_inspect_slave(bond, slave, 0xFF);
 
-		case BOND_LINK_UP:
-			bond_set_slave_link_state(slave, BOND_LINK_UP,
-						  BOND_SLAVE_NOTIFY_NOW);
-			slave->last_link_up = jiffies;
+	return commit;
+}
 
-			primary = rtnl_dereference(bond->primary_slave);
-			if (BOND_MODE(bond) == BOND_MODE_8023AD) {
-				/* prevent it from being the active one */
-				bond_set_backup_slave(slave);
-			} else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
-				/* make it immediately active */
-				bond_set_active_slave(slave);
-			} else if (slave != primary) {
-				/* prevent it from being the active one */
-				bond_set_backup_slave(slave);
-			}
+static void bond_miimon_commit_slave(struct bonding *bond, struct slave *slave)
+{
+	struct slave *primary;
 
-			netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n",
-				    slave->dev->name,
-				    slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
-				    slave->duplex ? "full" : "half");
+	switch (slave->new_link) {
+	case BOND_LINK_NOCHANGE:
+		return;
 
-			/* notify ad that the link status has changed */
-			if (BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_3ad_handle_link_change(slave, BOND_LINK_UP);
+	case BOND_LINK_UP:
+		bond_set_slave_link_state(slave, BOND_LINK_UP,
+					  BOND_SLAVE_NOTIFY_NOW);
+		slave->last_link_up = jiffies;
 
-			if (bond_is_lb(bond))
-				bond_alb_handle_link_change(bond, slave,
-							    BOND_LINK_UP);
+		primary = rtnl_dereference(bond->primary_slave);
+		if (BOND_MODE(bond) == BOND_MODE_8023AD) {
+			/* prevent it from being the active one */
+			bond_set_backup_slave(slave);
+		} else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
+			/* make it immediately active */
+			bond_set_active_slave(slave);
+		} else if (slave != primary) {
+			/* prevent it from being the active one */
+			bond_set_backup_slave(slave);
+		}
 
-			if (BOND_MODE(bond) == BOND_MODE_XOR)
-				bond_update_slave_arr(bond, NULL);
+		netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n",
+			    slave->dev->name,
+			    slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
+			    slave->duplex ? "full" : "half");
 
-			if (!bond->curr_active_slave || slave == primary)
-				goto do_failover;
+		/* notify ad that the link status has changed */
+		if (BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_3ad_handle_link_change(slave, BOND_LINK_UP);
 
-			continue;
+		if (bond_is_lb(bond))
+			bond_alb_handle_link_change(bond, slave, BOND_LINK_UP);
 
-		case BOND_LINK_DOWN:
-			if (slave->link_failure_count < UINT_MAX)
-				slave->link_failure_count++;
+		if (BOND_MODE(bond) == BOND_MODE_XOR)
+			bond_update_slave_arr(bond, NULL);
 
-			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
-						  BOND_SLAVE_NOTIFY_NOW);
+		if (!bond->curr_active_slave || slave == primary)
+			goto do_failover;
 
-			if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
-			    BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_set_slave_inactive_flags(slave,
-							      BOND_SLAVE_NOTIFY_NOW);
+		goto out;
 
-			netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n",
-				    slave->dev->name);
+	case BOND_LINK_DOWN:
+		if (slave->link_failure_count < UINT_MAX)
+			slave->link_failure_count++;
 
-			if (BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_3ad_handle_link_change(slave,
-							    BOND_LINK_DOWN);
+		bond_set_slave_link_state(slave, BOND_LINK_DOWN,
+					  BOND_SLAVE_NOTIFY_NOW);
 
-			if (bond_is_lb(bond))
-				bond_alb_handle_link_change(bond, slave,
-							    BOND_LINK_DOWN);
+		if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
+		    BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_set_slave_inactive_flags(slave,
+						      BOND_SLAVE_NOTIFY_NOW);
 
-			if (BOND_MODE(bond) == BOND_MODE_XOR)
-				bond_update_slave_arr(bond, NULL);
+		netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n",
+			    slave->dev->name);
 
-			if (slave == rcu_access_pointer(bond->curr_active_slave))
-				goto do_failover;
+		if (BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_3ad_handle_link_change(slave, BOND_LINK_DOWN);
 
-			continue;
+		if (bond_is_lb(bond))
+			bond_alb_handle_link_change(bond, slave, BOND_LINK_DOWN);
 
-		default:
-			netdev_err(bond->dev, "invalid new link %d on slave %s\n",
-				   slave->new_link, slave->dev->name);
-			slave->new_link = BOND_LINK_NOCHANGE;
+		if (BOND_MODE(bond) == BOND_MODE_XOR)
+			bond_update_slave_arr(bond, NULL);
 
-			continue;
-		}
+		if (slave == rcu_access_pointer(bond->curr_active_slave))
+			goto do_failover;
 
-do_failover:
-		block_netpoll_tx();
-		bond_select_active_slave(bond);
-		unblock_netpoll_tx();
+		goto out;
+
+	default:
+		netdev_err(bond->dev, "invalid new link %d on slave %s\n",
+			   slave->new_link, slave->dev->name);
+		slave->new_link = BOND_LINK_NOCHANGE;
+
+		goto out;
 	}
 
+do_failover:
+	block_netpoll_tx();
+	bond_select_active_slave(bond);
+	unblock_netpoll_tx();
+
+out:
 	bond_set_carrier(bond);
 }
 
+static void bond_miimon_commit(struct bonding *bond)
+{
+	struct list_head *iter;
+	struct slave *slave;
+
+	bond_for_each_slave(bond, slave, iter)
+		bond_miimon_commit_slave(bond, slave);
+}
+
 /* bond_mii_monitor
  *
  * Really a wrapper that splits the mii monitor into two phases: an
@@ -3019,6 +3039,9 @@ static int bond_slave_netdev_event(unsigned long event,
 			bond_3ad_adapter_speed_duplex_changed(slave);
 		/* Fallthrough */
 	case NETDEV_DOWN:
+		if (bond_miimon_inspect_slave(bond, slave, event))
+			bond_miimon_commit_slave(bond, slave);
+
 		/* Refresh slave-array if applicable!
 		 * If the setup does not use miimon or arpmon (mode-specific!),
 		 * then these events will not cause the slave-array to be
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-13 17:03                           ` Tantilov, Emil S
  2016-01-20  5:13                             ` [PATCH 1/1] " zyjzyj2000
@ 2016-01-21 10:16                             ` zyjzyj2000
  2016-01-21 10:16                               ` zyjzyj2000
  2016-01-25 16:33                               ` Tantilov, Emil S
  1 sibling, 2 replies; 52+ messages in thread
From: zyjzyj2000 @ 2016-01-21 10:16 UTC (permalink / raw)
  To: zyjzyj2000, mkubecek, vfalico, gospo, netdev, boris.shteinbock,
	jay.vosburgh, emil.s.tantilov


Hi, Jay && Emil

Thanks for your hard work. I forget to send this patch to you. Please help to review. Thanks a lot.

I think the similar patch is needed in linux kernel 4.4. As such, based on linux kernel 4.4, I made this patch. Please comment.

Best Regards!
Zhu Yanjun

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-21 10:16                             ` zyjzyj2000
@ 2016-01-21 10:16                               ` zyjzyj2000
  2016-01-25 16:37                                 ` Tantilov, Emil S
  2016-01-26  0:43                                 ` Jay Vosburgh
  2016-01-25 16:33                               ` Tantilov, Emil S
  1 sibling, 2 replies; 52+ messages in thread
From: zyjzyj2000 @ 2016-01-21 10:16 UTC (permalink / raw)
  To: zyjzyj2000, mkubecek, vfalico, gospo, netdev, boris.shteinbock,
	jay.vosburgh, emil.s.tantilov

From: Zhu Yanjun <zyjzyj2000@gmail.com>

Bonding will utilize notifier callbacks to detect slave
link state changes. It is intended to be used with miimon
set to zero, and does not support the updelay or downdelay
options to bonding.

Because of link flap from the slave interface, if the notifier
is NETDEV_UP while the actual link state is down, it is not
necessary to continue.

Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Tested-by: Tantilov, Emil S <emil.s.tantilov@intel.com>
Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
---
 drivers/net/bonding/bond_main.c |  317 +++++++++++++++++++++------------------
 1 file changed, 170 insertions(+), 147 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 56b5605..9f67948 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2015,203 +2015,223 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
 /*-------------------------------- Monitoring -------------------------------*/
 
 /* called with rcu_read_lock() */
-static int bond_miimon_inspect(struct bonding *bond)
+static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
+				     unsigned long event)
 {
-	int link_state, commit = 0;
-	struct list_head *iter;
-	struct slave *slave;
+	int link_state;
 	bool ignore_updelay;
 
 	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
 
-	bond_for_each_slave_rcu(bond, slave, iter) {
-		slave->new_link = BOND_LINK_NOCHANGE;
+	slave->new_link = BOND_LINK_NOCHANGE;
 
-		link_state = bond_check_dev_link(bond, slave->dev, 0);
+	link_state = bond_check_dev_link(bond, slave->dev, 0);
 
-		switch (slave->link) {
-		case BOND_LINK_UP:
-			if (link_state)
-				continue;
+	/* Because of link flap from the slave interface, it is possilbe that
+	 * the notifiler is NETDEV_UP while the actual link state is down. If
+	 * so, it is not necessary to contiune.
+	 */
+	switch (event) {
+	case NETDEV_UP:
+		if (!link_state)
+			return 0;
+		break;
+
+	case NETDEV_DOWN:
+		if (link_state)
+			return 0;
+		break;
+	}
+
+	switch (slave->link) {
+	case BOND_LINK_UP:
+		if (link_state)
+			return 0;
 
-			bond_set_slave_link_state(slave, BOND_LINK_FAIL,
+		bond_set_slave_link_state(slave, BOND_LINK_FAIL,
+					  BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.downdelay;
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
+				    (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ?
+				    (bond_is_active_slave(slave) ?
+				     "active " : "backup ") : "",
+				    slave->dev->name,
+				    bond->params.downdelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_FAIL:
+		if (link_state) {
+			/* recovered before downdelay expired */
+			bond_set_slave_link_state(slave, BOND_LINK_UP,
 						  BOND_SLAVE_NOTIFY_LATER);
-			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
-					    (BOND_MODE(bond) ==
-					     BOND_MODE_ACTIVEBACKUP) ?
-					     (bond_is_active_slave(slave) ?
-					      "active " : "backup ") : "",
-					    slave->dev->name,
-					    bond->params.downdelay * bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_FAIL:
-			if (link_state) {
-				/* recovered before downdelay expired */
-				bond_set_slave_link_state(slave, BOND_LINK_UP,
-							  BOND_SLAVE_NOTIFY_LATER);
-				slave->last_link_up = jiffies;
-				netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
-					    (bond->params.downdelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
-				continue;
-			}
+			slave->last_link_up = jiffies;
+			netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
+				    (bond->params.downdelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_DOWN;
-				commit++;
-				continue;
-			}
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_DOWN;
+			return 1;
+		}
 
-			slave->delay--;
-			break;
+		slave->delay--;
+		break;
 
-		case BOND_LINK_DOWN:
-			if (!link_state)
-				continue;
+	case BOND_LINK_DOWN:
+		if (!link_state)
+			return 0;
 
-			bond_set_slave_link_state(slave, BOND_LINK_BACK,
-						  BOND_SLAVE_NOTIFY_LATER);
-			slave->delay = bond->params.updelay;
-
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
-					    slave->dev->name,
-					    ignore_updelay ? 0 :
-					    bond->params.updelay *
-					    bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_BACK:
-			if (!link_state) {
-				bond_set_slave_link_state(slave,
-							  BOND_LINK_DOWN,
-							  BOND_SLAVE_NOTIFY_LATER);
-				netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
-					    (bond->params.updelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
+		bond_set_slave_link_state(slave, BOND_LINK_BACK,
+					  BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.updelay;
 
-				continue;
-			}
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
+				    slave->dev->name, ignore_updelay ? 0 :
+				    bond->params.updelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_BACK:
+		if (!link_state) {
+			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
+						  BOND_SLAVE_NOTIFY_LATER);
+			netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
+				    (bond->params.updelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
 
-			if (ignore_updelay)
-				slave->delay = 0;
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_UP;
-				commit++;
-				ignore_updelay = false;
-				continue;
-			}
+		if (ignore_updelay)
+			slave->delay = 0;
 
-			slave->delay--;
-			break;
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_UP;
+			return 1;
 		}
+
+		slave->delay--;
+		break;
 	}
 
-	return commit;
+	return 0;
 }
 
-static void bond_miimon_commit(struct bonding *bond)
+static int bond_miimon_inspect(struct bonding *bond)
 {
 	struct list_head *iter;
-	struct slave *slave, *primary;
+	struct slave *slave;
+	int commit = 0;
 
-	bond_for_each_slave(bond, slave, iter) {
-		switch (slave->new_link) {
-		case BOND_LINK_NOCHANGE:
-			continue;
+	bond_for_each_slave_rcu(bond, slave, iter)
+		commit += bond_miimon_inspect_slave(bond, slave, 0xFF);
 
-		case BOND_LINK_UP:
-			bond_set_slave_link_state(slave, BOND_LINK_UP,
-						  BOND_SLAVE_NOTIFY_NOW);
-			slave->last_link_up = jiffies;
+	return commit;
+}
 
-			primary = rtnl_dereference(bond->primary_slave);
-			if (BOND_MODE(bond) == BOND_MODE_8023AD) {
-				/* prevent it from being the active one */
-				bond_set_backup_slave(slave);
-			} else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
-				/* make it immediately active */
-				bond_set_active_slave(slave);
-			} else if (slave != primary) {
-				/* prevent it from being the active one */
-				bond_set_backup_slave(slave);
-			}
+static void bond_miimon_commit_slave(struct bonding *bond, struct slave *slave)
+{
+	struct slave *primary;
 
-			netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n",
-				    slave->dev->name,
-				    slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
-				    slave->duplex ? "full" : "half");
+	switch (slave->new_link) {
+	case BOND_LINK_NOCHANGE:
+		return;
 
-			/* notify ad that the link status has changed */
-			if (BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_3ad_handle_link_change(slave, BOND_LINK_UP);
+	case BOND_LINK_UP:
+		bond_set_slave_link_state(slave, BOND_LINK_UP,
+					  BOND_SLAVE_NOTIFY_NOW);
+		slave->last_link_up = jiffies;
 
-			if (bond_is_lb(bond))
-				bond_alb_handle_link_change(bond, slave,
-							    BOND_LINK_UP);
+		primary = rtnl_dereference(bond->primary_slave);
+		if (BOND_MODE(bond) == BOND_MODE_8023AD) {
+			/* prevent it from being the active one */
+			bond_set_backup_slave(slave);
+		} else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
+			/* make it immediately active */
+			bond_set_active_slave(slave);
+		} else if (slave != primary) {
+			/* prevent it from being the active one */
+			bond_set_backup_slave(slave);
+		}
 
-			if (BOND_MODE(bond) == BOND_MODE_XOR)
-				bond_update_slave_arr(bond, NULL);
+		netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n",
+			    slave->dev->name,
+			    slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
+			    slave->duplex ? "full" : "half");
 
-			if (!bond->curr_active_slave || slave == primary)
-				goto do_failover;
+		/* notify ad that the link status has changed */
+		if (BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_3ad_handle_link_change(slave, BOND_LINK_UP);
 
-			continue;
+		if (bond_is_lb(bond))
+			bond_alb_handle_link_change(bond, slave, BOND_LINK_UP);
 
-		case BOND_LINK_DOWN:
-			if (slave->link_failure_count < UINT_MAX)
-				slave->link_failure_count++;
+		if (BOND_MODE(bond) == BOND_MODE_XOR)
+			bond_update_slave_arr(bond, NULL);
 
-			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
-						  BOND_SLAVE_NOTIFY_NOW);
+		if (!bond->curr_active_slave || slave == primary)
+			goto do_failover;
 
-			if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
-			    BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_set_slave_inactive_flags(slave,
-							      BOND_SLAVE_NOTIFY_NOW);
+		goto out;
 
-			netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n",
-				    slave->dev->name);
+	case BOND_LINK_DOWN:
+		if (slave->link_failure_count < UINT_MAX)
+			slave->link_failure_count++;
 
-			if (BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_3ad_handle_link_change(slave,
-							    BOND_LINK_DOWN);
+		bond_set_slave_link_state(slave, BOND_LINK_DOWN,
+					  BOND_SLAVE_NOTIFY_NOW);
 
-			if (bond_is_lb(bond))
-				bond_alb_handle_link_change(bond, slave,
-							    BOND_LINK_DOWN);
+		if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
+		    BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_set_slave_inactive_flags(slave,
+						      BOND_SLAVE_NOTIFY_NOW);
 
-			if (BOND_MODE(bond) == BOND_MODE_XOR)
-				bond_update_slave_arr(bond, NULL);
+		netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n",
+			    slave->dev->name);
 
-			if (slave == rcu_access_pointer(bond->curr_active_slave))
-				goto do_failover;
+		if (BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_3ad_handle_link_change(slave, BOND_LINK_DOWN);
 
-			continue;
+		if (bond_is_lb(bond))
+			bond_alb_handle_link_change(bond, slave, BOND_LINK_DOWN);
 
-		default:
-			netdev_err(bond->dev, "invalid new link %d on slave %s\n",
-				   slave->new_link, slave->dev->name);
-			slave->new_link = BOND_LINK_NOCHANGE;
+		if (BOND_MODE(bond) == BOND_MODE_XOR)
+			bond_update_slave_arr(bond, NULL);
 
-			continue;
-		}
+		if (slave == rcu_access_pointer(bond->curr_active_slave))
+			goto do_failover;
 
-do_failover:
-		block_netpoll_tx();
-		bond_select_active_slave(bond);
-		unblock_netpoll_tx();
+		goto out;
+
+	default:
+		netdev_err(bond->dev, "invalid new link %d on slave %s\n",
+			   slave->new_link, slave->dev->name);
+		slave->new_link = BOND_LINK_NOCHANGE;
+
+		goto out;
 	}
 
+do_failover:
+	block_netpoll_tx();
+	bond_select_active_slave(bond);
+	unblock_netpoll_tx();
+
+out:
 	bond_set_carrier(bond);
 }
 
+static void bond_miimon_commit(struct bonding *bond)
+{
+	struct list_head *iter;
+	struct slave *slave;
+
+	bond_for_each_slave(bond, slave, iter)
+		bond_miimon_commit_slave(bond, slave);
+}
+
 /* bond_mii_monitor
  *
  * Really a wrapper that splits the mii monitor into two phases: an
@@ -3019,6 +3039,9 @@ static int bond_slave_netdev_event(unsigned long event,
 			bond_3ad_adapter_speed_duplex_changed(slave);
 		/* Fallthrough */
 	case NETDEV_DOWN:
+		if (bond_miimon_inspect_slave(bond, slave, event))
+			bond_miimon_commit_slave(bond, slave);
+
 		/* Refresh slave-array if applicable!
 		 * If the setup does not use miimon or arpmon (mode-specific!),
 		 * then these events will not cause the slave-array to be
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* RE: [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-21 10:16                             ` zyjzyj2000
  2016-01-21 10:16                               ` zyjzyj2000
@ 2016-01-25 16:33                               ` Tantilov, Emil S
  2016-01-25 18:00                                 ` David Miller
  1 sibling, 1 reply; 52+ messages in thread
From: Tantilov, Emil S @ 2016-01-25 16:33 UTC (permalink / raw)
  To: zyjzyj2000, mkubecek, vfalico, gospo, netdev, Shteinbock,
	Boris (Wind River),
	jay.vosburgh

>-----Original Message-----
>From: zyjzyj2000@gmail.com [mailto:zyjzyj2000@gmail.com]
>Sent: Thursday, January 21, 2016 2:16 AM
>To: zyjzyj2000@gmail.com; mkubecek@suse.cz; vfalico@gmail.com;
>gospo@cumulusnetworks.com; netdev@vger.kernel.org; Shteinbock, Boris (Wind
>River); jay.vosburgh@canonical.com; Tantilov, Emil S
>Subject: [PATCH 1/1] bonding: Use notifiers for slave link state detection
>
>
>Hi, Jay && Emil
>
>Thanks for your hard work. I forget to send this patch to you. Please help
>to review. Thanks a lot.
>
>I think the similar patch is needed in linux kernel 4.4. As such, based on
>linux kernel 4.4, I made this patch. Please comment.

The patch you are referring to has not been accepted in net-next yet.
If/when that happens you can request it to be ported to the stable tree.

Last version I tested seemed to work OK, but Jay mentioned some RCU warnings and 
I was expecting a follow up on it.

Thanks,
Emil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-21 10:16                               ` zyjzyj2000
@ 2016-01-25 16:37                                 ` Tantilov, Emil S
  2016-01-26  0:43                                 ` Jay Vosburgh
  1 sibling, 0 replies; 52+ messages in thread
From: Tantilov, Emil S @ 2016-01-25 16:37 UTC (permalink / raw)
  To: zyjzyj2000, mkubecek, vfalico, gospo, netdev, Shteinbock,
	Boris (Wind River),
	jay.vosburgh

>-----Original Message-----
>From: zyjzyj2000@gmail.com [mailto:zyjzyj2000@gmail.com]
>Sent: Thursday, January 21, 2016 2:16 AM
>To: zyjzyj2000@gmail.com; mkubecek@suse.cz; vfalico@gmail.com;
>gospo@cumulusnetworks.com; netdev@vger.kernel.org; Shteinbock, Boris (Wind
>River); jay.vosburgh@canonical.com; Tantilov, Emil S
>Subject: [PATCH 1/1] bonding: Use notifiers for slave link state detection
>
>From: Zhu Yanjun <zyjzyj2000@gmail.com>
>
>Bonding will utilize notifier callbacks to detect slave
>link state changes. It is intended to be used with miimon
>set to zero, and does not support the updelay or downdelay
>options to bonding.
>
>Because of link flap from the slave interface, if the notifier
>is NETDEV_UP while the actual link state is down, it is not
>necessary to continue.
>
>Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
>Tested-by: Tantilov, Emil S <emil.s.tantilov@intel.com>
>Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
>---
> drivers/net/bonding/bond_main.c |  317 +++++++++++++++++++++--------------
>----
> 1 file changed, 170 insertions(+), 147 deletions(-)

Just for the record - this is not a patch that I have tested.

I did run tests with the patch Jay Vosburgh submitted for introducing
notifiers and that is handled in a separate thread.

Why do you keep re-sending Jay's patches?

Thanks,
Emil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-25 16:33                               ` Tantilov, Emil S
@ 2016-01-25 18:00                                 ` David Miller
  2016-01-25 18:37                                   ` Tantilov, Emil S
  0 siblings, 1 reply; 52+ messages in thread
From: David Miller @ 2016-01-25 18:00 UTC (permalink / raw)
  To: emil.s.tantilov
  Cc: zyjzyj2000, mkubecek, vfalico, gospo, netdev, boris.shteinbock,
	jay.vosburgh

From: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
Date: Mon, 25 Jan 2016 16:33:37 +0000

> The patch you are referring to has not been accepted in net-next yet.
> If/when that happens you can request it to be ported to the stable tree.

Wrong.

If you want a patch to get submitted to stable, it must be appropriate for
and you must target it to 'net', not 'net-next'.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-25 18:00                                 ` David Miller
@ 2016-01-25 18:37                                   ` Tantilov, Emil S
  0 siblings, 0 replies; 52+ messages in thread
From: Tantilov, Emil S @ 2016-01-25 18:37 UTC (permalink / raw)
  To: David Miller
  Cc: zyjzyj2000, mkubecek, vfalico, gospo, netdev, Shteinbock,
	Boris (Wind River),
	jay.vosburgh

>-----Original Message-----
>From: David Miller [mailto:davem@davemloft.net]
>Sent: Monday, January 25, 2016 10:00 AM
>To: Tantilov, Emil S
>Cc: zyjzyj2000@gmail.com; mkubecek@suse.cz; vfalico@gmail.com;
>gospo@cumulusnetworks.com; netdev@vger.kernel.org; Shteinbock, Boris (Wind
>River); jay.vosburgh@canonical.com
>Subject: Re: [PATCH 1/1] bonding: Use notifiers for slave link state
>detection
>
>From: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
>Date: Mon, 25 Jan 2016 16:33:37 +0000
>
>> The patch you are referring to has not been accepted in net-next yet.
>> If/when that happens you can request it to be ported to the stable tree.
>
>Wrong.
>
>If you want a patch to get submitted to stable, it must be appropriate for
>and you must target it to 'net', not 'net-next'.

Yeah that came out wrong. I was just trying to point out that this patch
is a port to 4.4 kernel of a test patch from Jay that hasn't even gotten
into the net/net-next trees yet.

Thanks,
Emil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-21 10:16                               ` zyjzyj2000
  2016-01-25 16:37                                 ` Tantilov, Emil S
@ 2016-01-26  0:43                                 ` Jay Vosburgh
  2016-01-26  3:19                                   ` zhuyj
  1 sibling, 1 reply; 52+ messages in thread
From: Jay Vosburgh @ 2016-01-26  0:43 UTC (permalink / raw)
  To: zyjzyj2000
  Cc: mkubecek, vfalico, gospo, netdev, boris.shteinbock, emil.s.tantilov

<zyjzyj2000@gmail.com> wrote:

>From: Zhu Yanjun <zyjzyj2000@gmail.com>
>
>Bonding will utilize notifier callbacks to detect slave
>link state changes. It is intended to be used with miimon
>set to zero, and does not support the updelay or downdelay
>options to bonding.
>
>Because of link flap from the slave interface, if the notifier
>is NETDEV_UP while the actual link state is down, it is not
>necessary to continue.
>
>Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>

	I haven't signed off on this patch.

	I've just started some testing, but as before immediately get an
RCU warning; it looks to be coming from bond_miimon_inspect_slave();

[  316.473050] bond1: Enslaving eth1 as a backup interface with an up link
[  316.473059] 
[  316.473806] ===============================
[  316.475630] [ INFO: suspicious RCU usage. ]
[  316.477519] 4.4.0+ #38 Not tainted
[  316.479094] -------------------------------
[  316.480765] drivers/net/bonding/bond_main.c:2024 suspicious rcu_dereference_check() usage!

	This is presumably because the "case NETDEV_DOWN" call to
bond_miimon_inspect_slave does not hold RCU.  It does hold RTNL, though,
which should be safe for this usage (RTNL mutexes changes to the active
slave).  The appended patch on top of the original makes the warning go
away.

	I'm still testing the patch and have no comment about its
functionality as yet.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9f67948..e3faee9 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2014,14 +2014,14 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
 
 /*-------------------------------- Monitoring -------------------------------*/
 
-/* called with rcu_read_lock() */
+/* called with rcu_read_lock() or RTNL */
 static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
 				     unsigned long event)
 {
 	int link_state;
 	bool ignore_updelay;
 
-	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
+	ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);
 
 	slave->new_link = BOND_LINK_NOCHANGE;
 

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-26  0:43                                 ` Jay Vosburgh
@ 2016-01-26  3:19                                   ` zhuyj
  2016-01-26  6:00                                     ` Jay Vosburgh
  0 siblings, 1 reply; 52+ messages in thread
From: zhuyj @ 2016-01-26  3:19 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: mkubecek, vfalico, gospo, netdev, boris.shteinbock, emil.s.tantilov

On 01/26/2016 08:43 AM, Jay Vosburgh wrote:
> <zyjzyj2000@gmail.com> wrote:
>
>> From: Zhu Yanjun <zyjzyj2000@gmail.com>
>>
>> Bonding will utilize notifier callbacks to detect slave
>> link state changes. It is intended to be used with miimon
>> set to zero, and does not support the updelay or downdelay
>> options to bonding.
>>
>> Because of link flap from the slave interface, if the notifier
>> is NETDEV_UP while the actual link state is down, it is not
>> necessary to continue.
>>
>> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
> 	I haven't signed off on this patch.
>
> 	I've just started some testing, but as before immediately get an
> RCU warning; it looks to be coming from bond_miimon_inspect_slave();
>
> [  316.473050] bond1: Enslaving eth1 as a backup interface with an up link
> [  316.473059]
> [  316.473806] ===============================
> [  316.475630] [ INFO: suspicious RCU usage. ]
> [  316.477519] 4.4.0+ #38 Not tainted
> [  316.479094] -------------------------------
> [  316.480765] drivers/net/bonding/bond_main.c:2024 suspicious rcu_dereference_check() usage!
>
> 	This is presumably because the "case NETDEV_DOWN" call to
> bond_miimon_inspect_slave does not hold RCU.  It does hold RTNL, though,
> which should be safe for this usage (RTNL mutexes changes to the active
> slave).  The appended patch on top of the original makes the warning go
> away.
>
> 	I'm still testing the patch and have no comment about its
> functionality as yet.
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 9f67948..e3faee9 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2014,14 +2014,14 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
>   
>   /*-------------------------------- Monitoring -------------------------------*/
>   
> -/* called with rcu_read_lock() */
> +/* called with rcu_read_lock() or RTNL */
>   static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
>   				     unsigned long event)
>   {
>   	int link_state;
>   	bool ignore_updelay;
>   
> -	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
> +	ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);

Thanks a lot.
Because kernel v4.4 needs this kind of patch, I backport this patch from 
net-next to kernel v4.4.

If it is not appropriate, I will revert this patch.

Best Regards!
Zhu Yanjun

>   
>   	slave->new_link = BOND_LINK_NOCHANGE;
>   
>
> 	-J
>
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-26  3:19                                   ` zhuyj
@ 2016-01-26  6:00                                     ` Jay Vosburgh
  2016-01-26  6:26                                       ` zhuyj
                                                         ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: Jay Vosburgh @ 2016-01-26  6:00 UTC (permalink / raw)
  To: zhuyj; +Cc: mkubecek, vfalico, gospo, netdev, boris.shteinbock, emil.s.tantilov

zhuyj <zyjzyj2000@gmail.com> wrote:

>On 01/26/2016 08:43 AM, Jay Vosburgh wrote:
>> <zyjzyj2000@gmail.com> wrote:
>>
>>> From: Zhu Yanjun <zyjzyj2000@gmail.com>
>>>
>>> Bonding will utilize notifier callbacks to detect slave
>>> link state changes. It is intended to be used with miimon
>>> set to zero, and does not support the updelay or downdelay
>>> options to bonding.
>>>
>>> Because of link flap from the slave interface, if the notifier
>>> is NETDEV_UP while the actual link state is down, it is not
>>> necessary to continue.
>>>
>>> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
>> 	I haven't signed off on this patch.
>>
>> 	I've just started some testing, but as before immediately get an
>> RCU warning; it looks to be coming from bond_miimon_inspect_slave();
>>
>> [  316.473050] bond1: Enslaving eth1 as a backup interface with an up link
>> [  316.473059]
>> [  316.473806] ===============================
>> [  316.475630] [ INFO: suspicious RCU usage. ]
>> [  316.477519] 4.4.0+ #38 Not tainted
>> [  316.479094] -------------------------------
>> [  316.480765] drivers/net/bonding/bond_main.c:2024 suspicious rcu_dereference_check() usage!
>>
>> 	This is presumably because the "case NETDEV_DOWN" call to
>> bond_miimon_inspect_slave does not hold RCU.  It does hold RTNL, though,
>> which should be safe for this usage (RTNL mutexes changes to the active
>> slave).  The appended patch on top of the original makes the warning go
>> away.
>>
>> 	I'm still testing the patch and have no comment about its
>> functionality as yet.
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 9f67948..e3faee9 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -2014,14 +2014,14 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
>>     /*-------------------------------- Monitoring
>> -------------------------------*/
>>   -/* called with rcu_read_lock() */
>> +/* called with rcu_read_lock() or RTNL */
>>   static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
>>   				     unsigned long event)
>>   {
>>   	int link_state;
>>   	bool ignore_updelay;
>>   -	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
>> +	ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);
>
>Thanks a lot.
>Because kernel v4.4 needs this kind of patch, I backport this patch from
>net-next to kernel v4.4.
>
>If it is not appropriate, I will revert this patch.

	I don't understand what you mean here.

	I've tested the patch (with my above modification), and while I
seem to be hitting an unrelated bug in the ARP monitor, I believe this
patch will misbehave when the ARP monitor is running.

	For example, if arp_interval=1000 and miimon=0, the link state
notifier callback will change a slave to up should a notifier event take
place.  So, hypothetically, if a slave is "down" according to the ARP
monitor (but actually carrier up), and then experience a carrier down
then up transition, the slave would be set to "up" even though the ARP
monitor believes it to be down.

	I'm not able to induce the speedy link flap events, so I'm not
sure about this portion of the patch:

+	/* Because of link flap from the slave interface, it is possilbe that
+	 * the notifiler is NETDEV_UP while the actual link state is down. If
+	 * so, it is not necessary to contiune.
+	 */
+	switch (event) {
+	case NETDEV_UP:
+		if (!link_state)
+			return 0;
+		break;
+
+	case NETDEV_DOWN:
+		if (link_state)
+			return 0;
+		break;
+	}
+

	Unless I misunderstood, Emil's comments elsewhere suggest that
the current ixgbe driver won't cause those, though.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-26  6:00                                     ` Jay Vosburgh
@ 2016-01-26  6:26                                       ` zhuyj
  2016-01-26  6:45                                         ` zhuyj
  2016-01-27 20:00                                       ` Tantilov, Emil S
  2016-01-29  7:05                                       ` zhuyj
  2 siblings, 1 reply; 52+ messages in thread
From: zhuyj @ 2016-01-26  6:26 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: mkubecek, vfalico, gospo, netdev, boris.shteinbock,
	emil.s.tantilov, zhuyj

On 01/26/2016 02:00 PM, Jay Vosburgh wrote:
> zhuyj <zyjzyj2000@gmail.com> wrote:
>
>> On 01/26/2016 08:43 AM, Jay Vosburgh wrote:
>>> <zyjzyj2000@gmail.com> wrote:
>>>
>>>> From: Zhu Yanjun <zyjzyj2000@gmail.com>
>>>>
>>>> Bonding will utilize notifier callbacks to detect slave
>>>> link state changes. It is intended to be used with miimon
>>>> set to zero, and does not support the updelay or downdelay
>>>> options to bonding.
>>>>
>>>> Because of link flap from the slave interface, if the notifier
>>>> is NETDEV_UP while the actual link state is down, it is not
>>>> necessary to continue.
>>>>
>>>> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
>>> 	I haven't signed off on this patch.
>>>
>>> 	I've just started some testing, but as before immediately get an
>>> RCU warning; it looks to be coming from bond_miimon_inspect_slave();
>>>
>>> [  316.473050] bond1: Enslaving eth1 as a backup interface with an up link
>>> [  316.473059]
>>> [  316.473806] ===============================
>>> [  316.475630] [ INFO: suspicious RCU usage. ]
>>> [  316.477519] 4.4.0+ #38 Not tainted
>>> [  316.479094] -------------------------------
>>> [  316.480765] drivers/net/bonding/bond_main.c:2024 suspicious rcu_dereference_check() usage!
>>>
>>> 	This is presumably because the "case NETDEV_DOWN" call to
>>> bond_miimon_inspect_slave does not hold RCU.  It does hold RTNL, though,
>>> which should be safe for this usage (RTNL mutexes changes to the active
>>> slave).  The appended patch on top of the original makes the warning go
>>> away.
>>>
>>> 	I'm still testing the patch and have no comment about its
>>> functionality as yet.
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>> index 9f67948..e3faee9 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -2014,14 +2014,14 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
>>>      /*-------------------------------- Monitoring
>>> -------------------------------*/
>>>    -/* called with rcu_read_lock() */
>>> +/* called with rcu_read_lock() or RTNL */
>>>    static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
>>>    				     unsigned long event)
>>>    {
>>>    	int link_state;
>>>    	bool ignore_updelay;
>>>    -	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
>>> +	ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);
>> Thanks a lot.
>> Because kernel v4.4 needs this kind of patch, I backport this patch from
>> net-next to kernel v4.4.
>>
>> If it is not appropriate, I will revert this patch.
> 	I don't understand what you mean here.
>
> 	I've tested the patch (with my above modification), and while I
> seem to be hitting an unrelated bug in the ARP monitor, I believe this
> patch will misbehave when the ARP monitor is running.
>
> 	For example, if arp_interval=1000 and miimon=0, the link state
> notifier callback will change a slave to up should a notifier event take
> place.  So, hypothetically, if a slave is "down" according to the ARP
> monitor (but actually carrier up), and then experience a carrier down
> then up transition, the slave would be set to "up" even though the ARP
> monitor believes it to be down.
>
> 	I'm not able to induce the speedy link flap events, so I'm not
> sure about this portion of the patch:
>
> +	/* Because of link flap from the slave interface, it is possilbe that
> +	 * the notifiler is NETDEV_UP while the actual link state is down. If
> +	 * so, it is not necessary to contiune.
> +	 */
> +	switch (event) {
> +	case NETDEV_UP:
> +		if (!link_state)
> +			return 0;
> +		break;
> +
> +	case NETDEV_DOWN:
> +		if (link_state)
> +			return 0;
> +		break;
> +	}
> +
>
> 	Unless I misunderstood, Emil's comments elsewhere suggest that
> the current ixgbe driver won't cause those, though.
This patch will avoid useless configuration because of link flap.

Hi, Emil

Does the current ixgbe driver not cause link flap?

Thanks a lot.
Zhu Yanjun

>
> 	-J
>
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com
>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-26  6:26                                       ` zhuyj
@ 2016-01-26  6:45                                         ` zhuyj
  0 siblings, 0 replies; 52+ messages in thread
From: zhuyj @ 2016-01-26  6:45 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: mkubecek, vfalico, gospo, netdev, boris.shteinbock,
	emil.s.tantilov, zhuyj

On 01/26/2016 02:26 PM, zhuyj wrote:
> On 01/26/2016 02:00 PM, Jay Vosburgh wrote:
>> zhuyj <zyjzyj2000@gmail.com> wrote:
>>
>>> On 01/26/2016 08:43 AM, Jay Vosburgh wrote:
>>>> <zyjzyj2000@gmail.com> wrote:
>>>>
>>>>> From: Zhu Yanjun <zyjzyj2000@gmail.com>
>>>>>
>>>>> Bonding will utilize notifier callbacks to detect slave
>>>>> link state changes. It is intended to be used with miimon
>>>>> set to zero, and does not support the updelay or downdelay
>>>>> options to bonding.
>>>>>
>>>>> Because of link flap from the slave interface, if the notifier
>>>>> is NETDEV_UP while the actual link state is down, it is not
>>>>> necessary to continue.
>>>>>
>>>>> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
>>>>     I haven't signed off on this patch.
>>>>
>>>>     I've just started some testing, but as before immediately get an
>>>> RCU warning; it looks to be coming from bond_miimon_inspect_slave();
>>>>
>>>> [  316.473050] bond1: Enslaving eth1 as a backup interface with an 
>>>> up link
>>>> [  316.473059]
>>>> [  316.473806] ===============================
>>>> [  316.475630] [ INFO: suspicious RCU usage. ]
>>>> [  316.477519] 4.4.0+ #38 Not tainted
>>>> [  316.479094] -------------------------------
>>>> [  316.480765] drivers/net/bonding/bond_main.c:2024 suspicious 
>>>> rcu_dereference_check() usage!
>>>>
>>>>     This is presumably because the "case NETDEV_DOWN" call to
>>>> bond_miimon_inspect_slave does not hold RCU.  It does hold RTNL, 
>>>> though,
>>>> which should be safe for this usage (RTNL mutexes changes to the 
>>>> active
>>>> slave).  The appended patch on top of the original makes the 
>>>> warning go
>>>> away.
>>>>
>>>>     I'm still testing the patch and have no comment about its
>>>> functionality as yet.
>>>>
>>>> diff --git a/drivers/net/bonding/bond_main.c 
>>>> b/drivers/net/bonding/bond_main.c
>>>> index 9f67948..e3faee9 100644
>>>> --- a/drivers/net/bonding/bond_main.c
>>>> +++ b/drivers/net/bonding/bond_main.c
>>>> @@ -2014,14 +2014,14 @@ static int bond_slave_info_query(struct 
>>>> net_device *bond_dev, struct ifslave *in
>>>>      /*-------------------------------- Monitoring
>>>> -------------------------------*/
>>>>    -/* called with rcu_read_lock() */
>>>> +/* called with rcu_read_lock() or RTNL */
>>>>    static int bond_miimon_inspect_slave(struct bonding *bond, 
>>>> struct slave *slave,
>>>>                         unsigned long event)
>>>>    {
>>>>        int link_state;
>>>>        bool ignore_updelay;
>>>>    -    ignore_updelay = !rcu_dereference(bond->curr_active_slave);
>>>> +    ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);
>>> Thanks a lot.
>>> Because kernel v4.4 needs this kind of patch, I backport this patch 
>>> from
>>> net-next to kernel v4.4.
>>>
>>> If it is not appropriate, I will revert this patch.
>>     I don't understand what you mean here.
>>
>>     I've tested the patch (with my above modification), and while I
>> seem to be hitting an unrelated bug in the ARP monitor, I believe this
>> patch will misbehave when the ARP monitor is running.
>>
>>     For example, if arp_interval=1000 and miimon=0, the link state
>> notifier callback will change a slave to up should a notifier event take
>> place.  So, hypothetically, if a slave is "down" according to the ARP
>> monitor (but actually carrier up), and then experience a carrier down
>> then up transition, the slave would be set to "up" even though the ARP
>> monitor believes it to be down.
>>
>>     I'm not able to induce the speedy link flap events, so I'm not
>> sure about this portion of the patch:
>>
>> +    /* Because of link flap from the slave interface, it is possilbe 
>> that
>> +     * the notifiler is NETDEV_UP while the actual link state is 
>> down. If
>> +     * so, it is not necessary to contiune.
>> +     */
>> +    switch (event) {
>> +    case NETDEV_UP:
>> +        if (!link_state)
>> +            return 0;
>> +        break;
>> +
>> +    case NETDEV_DOWN:
>> +        if (link_state)
>> +            return 0;
>> +        break;
>> +    }
>> +
>>
>>     Unless I misunderstood, Emil's comments elsewhere suggest that
>> the current ixgbe driver won't cause those, though.
> This patch will avoid useless configuration because of link flap.
Hi, Jay

Sorry. My bad. If there is no link flap in the current ixgbe driver, this
patch is not necessary.;-)

Best Regards!
Zhu Yanjun

>
> Hi, Emil
>
> Does the current ixgbe driver not cause link flap?
>
> Thanks a lot.
> Zhu Yanjun
>
>>
>>     -J
>>
>> ---
>>     -Jay Vosburgh, jay.vosburgh@canonical.com
>>
>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-26  6:00                                     ` Jay Vosburgh
  2016-01-26  6:26                                       ` zhuyj
@ 2016-01-27 20:00                                       ` Tantilov, Emil S
  2016-01-28  8:44                                         ` zyjzyj2000
  2016-01-29  7:05                                       ` zhuyj
  2 siblings, 1 reply; 52+ messages in thread
From: Tantilov, Emil S @ 2016-01-27 20:00 UTC (permalink / raw)
  To: Jay Vosburgh, zhuyj
  Cc: mkubecek, vfalico, gospo, netdev, Shteinbock, Boris (Wind River)

>-----Original Message-----
>From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com]
>Sent: Monday, January 25, 2016 10:01 PM
>To: zhuyj
>Cc: mkubecek@suse.cz; vfalico@gmail.com; gospo@cumulusnetworks.com;
>netdev@vger.kernel.org; Shteinbock, Boris (Wind River); Tantilov, Emil S
>Subject: Re: [PATCH 1/1] bonding: Use notifiers for slave link state
>detection
>
>zhuyj <zyjzyj2000@gmail.com> wrote:
>
>>On 01/26/2016 08:43 AM, Jay Vosburgh wrote:
>>> <zyjzyj2000@gmail.com> wrote:
>>>
>>>> From: Zhu Yanjun <zyjzyj2000@gmail.com>
>>>>
>>>> Bonding will utilize notifier callbacks to detect slave
>>>> link state changes. It is intended to be used with miimon
>>>> set to zero, and does not support the updelay or downdelay
>>>> options to bonding.
>>>>
>>>> Because of link flap from the slave interface, if the notifier
>>>> is NETDEV_UP while the actual link state is down, it is not
>>>> necessary to continue.
>>>>
>>>> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
>>> 	I haven't signed off on this patch.
>>>
>>> 	I've just started some testing, but as before immediately get an
>>> RCU warning; it looks to be coming from bond_miimon_inspect_slave();
>>>
>>> [  316.473050] bond1: Enslaving eth1 as a backup interface with an up
>link
>>> [  316.473059]
>>> [  316.473806] ===============================
>>> [  316.475630] [ INFO: suspicious RCU usage. ]
>>> [  316.477519] 4.4.0+ #38 Not tainted
>>> [  316.479094] -------------------------------
>>> [  316.480765] drivers/net/bonding/bond_main.c:2024 suspicious
>rcu_dereference_check() usage!
>>>
>>> 	This is presumably because the "case NETDEV_DOWN" call to
>>> bond_miimon_inspect_slave does not hold RCU.  It does hold RTNL, though,
>>> which should be safe for this usage (RTNL mutexes changes to the active
>>> slave).  The appended patch on top of the original makes the warning go
>>> away.
>>>
>>> 	I'm still testing the patch and have no comment about its
>>> functionality as yet.
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c
>b/drivers/net/bonding/bond_main.c
>>> index 9f67948..e3faee9 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -2014,14 +2014,14 @@ static int bond_slave_info_query(struct
>net_device *bond_dev, struct ifslave *in
>>>     /*-------------------------------- Monitoring
>>> -------------------------------*/
>>>   -/* called with rcu_read_lock() */
>>> +/* called with rcu_read_lock() or RTNL */
>>>   static int bond_miimon_inspect_slave(struct bonding *bond, struct
>slave *slave,
>>>   				     unsigned long event)
>>>   {
>>>   	int link_state;
>>>   	bool ignore_updelay;
>>>   -	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
>>> +	ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);
>>
>>Thanks a lot.
>>Because kernel v4.4 needs this kind of patch, I backport this patch from
>>net-next to kernel v4.4.
>>
>>If it is not appropriate, I will revert this patch.
>
>	I don't understand what you mean here.
>
>	I've tested the patch (with my above modification), and while I
>seem to be hitting an unrelated bug in the ARP monitor, I believe this
>patch will misbehave when the ARP monitor is running.
>
>	For example, if arp_interval=1000 and miimon=0, the link state
>notifier callback will change a slave to up should a notifier event take
>place.  So, hypothetically, if a slave is "down" according to the ARP
>monitor (but actually carrier up), and then experience a carrier down
>then up transition, the slave would be set to "up" even though the ARP
>monitor believes it to be down.
>
>	I'm not able to induce the speedy link flap events, so I'm not
>sure about this portion of the patch:
>
>+	/* Because of link flap from the slave interface, it is possilbe that
>+	 * the notifiler is NETDEV_UP while the actual link state is down. If
>+	 * so, it is not necessary to contiune.
>+	 */
>+	switch (event) {
>+	case NETDEV_UP:
>+		if (!link_state)
>+			return 0;
>+		break;
>+
>+	case NETDEV_DOWN:
>+		if (link_state)
>+			return 0;
>+		break;
>+	}
>+
>
>	Unless I misunderstood, Emil's comments elsewhere suggest that
>the current ixgbe driver won't cause those, though.

I ran tests with the above checks and I can't get them to trigger either way.
So at least in my setup this patch has no effect.

Thanks,
Emil

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-27 20:00                                       ` Tantilov, Emil S
@ 2016-01-28  8:44                                         ` zyjzyj2000
  0 siblings, 0 replies; 52+ messages in thread
From: zyjzyj2000 @ 2016-01-28  8:44 UTC (permalink / raw)
  To: emil.s.tantilov, jay.vosburgh, zyjzyj2000, mkubecek, vfalico,
	gospo, netdev, boris.shteinbock

From: Zhu Yanjun <zyjzyj2000@gmail.com>

This is just a test patch. Jay and Emil helped me a lot. The original patch
is in net-next. But kernel v4.4 needs this patch, too. As such, I backport
to kernel v4.4.
 
Bonding will utilize notifier callbacks to detect slave
link state changes. It is intended to be used with miimon
set to zero, and does not support the updelay or downdelay
options to bonding.

Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
---
 drivers/net/bonding/bond_main.c |  303 ++++++++++++++++++++-------------------
 1 file changed, 155 insertions(+), 148 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 56b5605..5156ad1 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2015,203 +2015,207 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
 /*-------------------------------- Monitoring -------------------------------*/
 
 /* called with rcu_read_lock() */
-static int bond_miimon_inspect(struct bonding *bond)
+static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
+				     unsigned long event)
 {
-	int link_state, commit = 0;
-	struct list_head *iter;
-	struct slave *slave;
+	int link_state;
 	bool ignore_updelay;
 
-	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
+	ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);
 
-	bond_for_each_slave_rcu(bond, slave, iter) {
-		slave->new_link = BOND_LINK_NOCHANGE;
+	slave->new_link = BOND_LINK_NOCHANGE;
 
-		link_state = bond_check_dev_link(bond, slave->dev, 0);
+	link_state = bond_check_dev_link(bond, slave->dev, 0);
 
-		switch (slave->link) {
-		case BOND_LINK_UP:
-			if (link_state)
-				continue;
+	switch (slave->link) {
+	case BOND_LINK_UP:
+		if (link_state)
+			return 0;
 
-			bond_set_slave_link_state(slave, BOND_LINK_FAIL,
+		bond_set_slave_link_state(slave, BOND_LINK_FAIL,
+					  BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.downdelay;
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
+				    (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) ?
+				    (bond_is_active_slave(slave) ?
+				     "active " : "backup ") : "",
+				    slave->dev->name,
+				    bond->params.downdelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_FAIL:
+		if (link_state) {
+			/* recovered before downdelay expired */
+			bond_set_slave_link_state(slave, BOND_LINK_UP,
 						  BOND_SLAVE_NOTIFY_LATER);
-			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
-					    (BOND_MODE(bond) ==
-					     BOND_MODE_ACTIVEBACKUP) ?
-					     (bond_is_active_slave(slave) ?
-					      "active " : "backup ") : "",
-					    slave->dev->name,
-					    bond->params.downdelay * bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_FAIL:
-			if (link_state) {
-				/* recovered before downdelay expired */
-				bond_set_slave_link_state(slave, BOND_LINK_UP,
-							  BOND_SLAVE_NOTIFY_LATER);
-				slave->last_link_up = jiffies;
-				netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
-					    (bond->params.downdelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
-				continue;
-			}
+			slave->last_link_up = jiffies;
+			netdev_info(bond->dev, "link status up again after %d ms for interface %s\n",
+				    (bond->params.downdelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_DOWN;
-				commit++;
-				continue;
-			}
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_DOWN;
+			return 1;
+		}
 
-			slave->delay--;
-			break;
+		slave->delay--;
+		break;
 
-		case BOND_LINK_DOWN:
-			if (!link_state)
-				continue;
+	case BOND_LINK_DOWN:
+		if (!link_state)
+			return 0;
 
-			bond_set_slave_link_state(slave, BOND_LINK_BACK,
-						  BOND_SLAVE_NOTIFY_LATER);
-			slave->delay = bond->params.updelay;
-
-			if (slave->delay) {
-				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
-					    slave->dev->name,
-					    ignore_updelay ? 0 :
-					    bond->params.updelay *
-					    bond->params.miimon);
-			}
-			/*FALLTHRU*/
-		case BOND_LINK_BACK:
-			if (!link_state) {
-				bond_set_slave_link_state(slave,
-							  BOND_LINK_DOWN,
-							  BOND_SLAVE_NOTIFY_LATER);
-				netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
-					    (bond->params.updelay - slave->delay) *
-					    bond->params.miimon,
-					    slave->dev->name);
+		bond_set_slave_link_state(slave, BOND_LINK_BACK,
+					  BOND_SLAVE_NOTIFY_LATER);
+		slave->delay = bond->params.updelay;
 
-				continue;
-			}
+		if (slave->delay) {
+			netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
+				    slave->dev->name, ignore_updelay ? 0 :
+				    bond->params.updelay * bond->params.miimon);
+		}
+		/*FALLTHRU*/
+	case BOND_LINK_BACK:
+		if (!link_state) {
+			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
+						  BOND_SLAVE_NOTIFY_LATER);
+			netdev_info(bond->dev, "link status down again after %d ms for interface %s\n",
+				    (bond->params.updelay - slave->delay) *
+				    bond->params.miimon, slave->dev->name);
 
-			if (ignore_updelay)
-				slave->delay = 0;
+			return 0;
+		}
 
-			if (slave->delay <= 0) {
-				slave->new_link = BOND_LINK_UP;
-				commit++;
-				ignore_updelay = false;
-				continue;
-			}
+		if (ignore_updelay)
+			slave->delay = 0;
 
-			slave->delay--;
-			break;
+		if (slave->delay <= 0) {
+			slave->new_link = BOND_LINK_UP;
+			return 1;
 		}
+
+		slave->delay--;
+		break;
 	}
 
-	return commit;
+	return 0;
 }
 
-static void bond_miimon_commit(struct bonding *bond)
+static int bond_miimon_inspect(struct bonding *bond)
 {
 	struct list_head *iter;
-	struct slave *slave, *primary;
+	struct slave *slave;
+	int commit = 0;
 
-	bond_for_each_slave(bond, slave, iter) {
-		switch (slave->new_link) {
-		case BOND_LINK_NOCHANGE:
-			continue;
+	bond_for_each_slave_rcu(bond, slave, iter)
+		commit += bond_miimon_inspect_slave(bond, slave, 0xFF);
 
-		case BOND_LINK_UP:
-			bond_set_slave_link_state(slave, BOND_LINK_UP,
-						  BOND_SLAVE_NOTIFY_NOW);
-			slave->last_link_up = jiffies;
+	return commit;
+}
 
-			primary = rtnl_dereference(bond->primary_slave);
-			if (BOND_MODE(bond) == BOND_MODE_8023AD) {
-				/* prevent it from being the active one */
-				bond_set_backup_slave(slave);
-			} else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
-				/* make it immediately active */
-				bond_set_active_slave(slave);
-			} else if (slave != primary) {
-				/* prevent it from being the active one */
-				bond_set_backup_slave(slave);
-			}
+static void bond_miimon_commit_slave(struct bonding *bond, struct slave *slave)
+{
+	struct slave *primary;
 
-			netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n",
-				    slave->dev->name,
-				    slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
-				    slave->duplex ? "full" : "half");
+	switch (slave->new_link) {
+	case BOND_LINK_NOCHANGE:
+		return;
 
-			/* notify ad that the link status has changed */
-			if (BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_3ad_handle_link_change(slave, BOND_LINK_UP);
+	case BOND_LINK_UP:
+		bond_set_slave_link_state(slave, BOND_LINK_UP,
+					  BOND_SLAVE_NOTIFY_NOW);
+		slave->last_link_up = jiffies;
 
-			if (bond_is_lb(bond))
-				bond_alb_handle_link_change(bond, slave,
-							    BOND_LINK_UP);
+		primary = rtnl_dereference(bond->primary_slave);
+		if (BOND_MODE(bond) == BOND_MODE_8023AD) {
+			/* prevent it from being the active one */
+			bond_set_backup_slave(slave);
+		} else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
+			/* make it immediately active */
+			bond_set_active_slave(slave);
+		} else if (slave != primary) {
+			/* prevent it from being the active one */
+			bond_set_backup_slave(slave);
+		}
 
-			if (BOND_MODE(bond) == BOND_MODE_XOR)
-				bond_update_slave_arr(bond, NULL);
+		netdev_info(bond->dev, "link status definitely up for interface %s, %u Mbps %s duplex\n",
+			    slave->dev->name,
+			    slave->speed == SPEED_UNKNOWN ? 0 : slave->speed,
+			    slave->duplex ? "full" : "half");
 
-			if (!bond->curr_active_slave || slave == primary)
-				goto do_failover;
+		/* notify ad that the link status has changed */
+		if (BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_3ad_handle_link_change(slave, BOND_LINK_UP);
 
-			continue;
+		if (bond_is_lb(bond))
+			bond_alb_handle_link_change(bond, slave, BOND_LINK_UP);
 
-		case BOND_LINK_DOWN:
-			if (slave->link_failure_count < UINT_MAX)
-				slave->link_failure_count++;
+		if (BOND_MODE(bond) == BOND_MODE_XOR)
+			bond_update_slave_arr(bond, NULL);
 
-			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
-						  BOND_SLAVE_NOTIFY_NOW);
+		if (!bond->curr_active_slave || slave == primary)
+			goto do_failover;
 
-			if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
-			    BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_set_slave_inactive_flags(slave,
-							      BOND_SLAVE_NOTIFY_NOW);
+		goto out;
 
-			netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n",
-				    slave->dev->name);
+	case BOND_LINK_DOWN:
+		if (slave->link_failure_count < UINT_MAX)
+			slave->link_failure_count++;
 
-			if (BOND_MODE(bond) == BOND_MODE_8023AD)
-				bond_3ad_handle_link_change(slave,
-							    BOND_LINK_DOWN);
+		bond_set_slave_link_state(slave, BOND_LINK_DOWN,
+					  BOND_SLAVE_NOTIFY_NOW);
 
-			if (bond_is_lb(bond))
-				bond_alb_handle_link_change(bond, slave,
-							    BOND_LINK_DOWN);
+		if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
+		    BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_set_slave_inactive_flags(slave,
+						      BOND_SLAVE_NOTIFY_NOW);
 
-			if (BOND_MODE(bond) == BOND_MODE_XOR)
-				bond_update_slave_arr(bond, NULL);
+		netdev_info(bond->dev, "link status definitely down for interface %s, disabling it\n",
+			    slave->dev->name);
 
-			if (slave == rcu_access_pointer(bond->curr_active_slave))
-				goto do_failover;
+		if (BOND_MODE(bond) == BOND_MODE_8023AD)
+			bond_3ad_handle_link_change(slave, BOND_LINK_DOWN);
 
-			continue;
+		if (bond_is_lb(bond))
+			bond_alb_handle_link_change(bond, slave, BOND_LINK_DOWN);
 
-		default:
-			netdev_err(bond->dev, "invalid new link %d on slave %s\n",
-				   slave->new_link, slave->dev->name);
-			slave->new_link = BOND_LINK_NOCHANGE;
+		if (BOND_MODE(bond) == BOND_MODE_XOR)
+			bond_update_slave_arr(bond, NULL);
 
-			continue;
-		}
+		if (slave == rcu_access_pointer(bond->curr_active_slave))
+			goto do_failover;
 
-do_failover:
-		block_netpoll_tx();
-		bond_select_active_slave(bond);
-		unblock_netpoll_tx();
+		goto out;
+
+	default:
+		netdev_err(bond->dev, "invalid new link %d on slave %s\n",
+			   slave->new_link, slave->dev->name);
+		slave->new_link = BOND_LINK_NOCHANGE;
+
+		goto out;
 	}
 
+do_failover:
+	block_netpoll_tx();
+	bond_select_active_slave(bond);
+	unblock_netpoll_tx();
+
+out:
 	bond_set_carrier(bond);
 }
 
+static void bond_miimon_commit(struct bonding *bond)
+{
+	struct list_head *iter;
+	struct slave *slave;
+
+	bond_for_each_slave(bond, slave, iter)
+		bond_miimon_commit_slave(bond, slave);
+}
+
 /* bond_mii_monitor
  *
  * Really a wrapper that splits the mii monitor into two phases: an
@@ -3019,6 +3023,9 @@ static int bond_slave_netdev_event(unsigned long event,
 			bond_3ad_adapter_speed_duplex_changed(slave);
 		/* Fallthrough */
 	case NETDEV_DOWN:
+		if (bond_miimon_inspect_slave(bond, slave, event))
+			bond_miimon_commit_slave(bond, slave);
+
 		/* Refresh slave-array if applicable!
 		 * If the setup does not use miimon or arpmon (mode-specific!),
 		 * then these events will not cause the slave-array to be
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection
  2016-01-26  6:00                                     ` Jay Vosburgh
  2016-01-26  6:26                                       ` zhuyj
  2016-01-27 20:00                                       ` Tantilov, Emil S
@ 2016-01-29  7:05                                       ` zhuyj
  2 siblings, 0 replies; 52+ messages in thread
From: zhuyj @ 2016-01-29  7:05 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: mkubecek, vfalico, gospo, netdev, boris.shteinbock,
	emil.s.tantilov, zhuyj

Thanks a lot.

Maybe this patch is to miimon and notifier. Maybe it is not appropriate to arp monitor.
So the following patch will avoid arp monitor.

Thanks a lot.

Zhu Yanjun

+	/* Because of link flap from the slave interface, it is possilbe that
+	 * the notifiler is NETDEV_UP while the actual link state is down. If
+	 * so, it is not necessary to contiune.
+	 */
+	if (!bond->params.arp_interval) {
+		switch (event) {
+		case NETDEV_UP:
+			if (!link_state)
+				return 0;
+			break;
+
+		case NETDEV_DOWN:
+			if (link_state)
+				return 0;
+			break;
+		}
+	}


On 01/26/2016 02:00 PM, Jay Vosburgh wrote:
> +	/* Because of link flap from the slave interface, it is possilbe that
> +	 * the notifiler is NETDEV_UP while the actual link state is down. If
> +	 * so, it is not necessary to contiune.
> +	 */
> +	switch (event) {
> +	case NETDEV_UP:
> +		if (!link_state)
> +			return 0;
> +		break;
> +
> +	case NETDEV_DOWN:
> +		if (link_state)
> +			return 0;
> +		break;
> +	}
> +

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2016-01-29  7:05 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-17  8:03 [PATCH 1/1] bonding: restrict up state in 802.3ad mode zyjzyj2000
2015-12-17 21:57 ` Jay Vosburgh
2015-12-18  4:36   ` zyjzyj2000
2015-12-18  4:36     ` [PATCH 1/1] bonding: delay up state without speed and duplex " zyjzyj2000
2015-12-18  4:54       ` Jay Vosburgh
2015-12-18 13:37       ` Sergei Shtylyov
2015-12-28  8:43   ` [PATCH 1/1] bonding: restrict up state " Michal Kubecek
2015-12-28  9:19     ` zhuyj
2016-01-06  1:26       ` Tantilov, Emil S
2016-01-06  3:05         ` zhuyj
2016-01-07  2:43           ` Tantilov, Emil S
2016-01-07  3:33             ` zhuyj
2016-01-07  5:02               ` Tantilov, Emil S
2016-01-07  6:15                 ` zyjzyj2000
2016-01-07  6:22                   ` zhuyj
2016-01-07  6:33                   ` Jay Vosburgh
2016-01-07 15:27                     ` Tantilov, Emil S
2016-01-08  1:28                     ` [RFC PATCH net-next] bonding: Use notifiers for slave link state detection Jay Vosburgh
2016-01-08  4:36                       ` zhuyj
2016-01-08  6:12                         ` Jay Vosburgh
2016-01-08  7:41                           ` (unknown), zyjzyj2000
2016-01-08  7:41                             ` [PATCH 1/1] bonding: utilize notifier callbacks to detect slave link state changes zyjzyj2000
2016-01-08 10:18                               ` zhuyj
2016-01-09  1:35                       ` [RFC PATCH net-next] bonding: Use notifiers for slave link state detection Tantilov, Emil S
2016-01-09  2:19                         ` Jay Vosburgh
2016-01-11  9:03                           ` zhuyj
2016-01-13  2:54                             ` zhuyj
2016-01-13 17:03                           ` Tantilov, Emil S
2016-01-20  5:13                             ` [PATCH 1/1] " zyjzyj2000
2016-01-20  5:13                               ` zyjzyj2000
2016-01-21 10:16                             ` zyjzyj2000
2016-01-21 10:16                               ` zyjzyj2000
2016-01-25 16:37                                 ` Tantilov, Emil S
2016-01-26  0:43                                 ` Jay Vosburgh
2016-01-26  3:19                                   ` zhuyj
2016-01-26  6:00                                     ` Jay Vosburgh
2016-01-26  6:26                                       ` zhuyj
2016-01-26  6:45                                         ` zhuyj
2016-01-27 20:00                                       ` Tantilov, Emil S
2016-01-28  8:44                                         ` zyjzyj2000
2016-01-29  7:05                                       ` zhuyj
2016-01-25 16:33                               ` Tantilov, Emil S
2016-01-25 18:00                                 ` David Miller
2016-01-25 18:37                                   ` Tantilov, Emil S
2016-01-08  2:29                     ` [PATCH 1/1] bonding: restrict up state in 802.3ad mode zhuyj
2016-01-07  6:53                   ` Michal Kubecek
2016-01-07  7:37                     ` zhuyj
2016-01-07  7:59                       ` Michal Kubecek
2016-01-07  8:35                         ` zhuyj
2016-01-07  7:47             ` zhuyj
2016-01-07 18:28               ` Tantilov, Emil S
2016-01-08  6:09                 ` zhuyj

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.