* [PATCH net-next ] Fix time-lag of IFF_RUNNING flag consistency between vlan and real devices
@ 2011-08-26  6:02 ` Mitsuo Hayasaka
  0 siblings, 0 replies; 20+ messages in thread
From: Mitsuo Hayasaka @ 2011-08-26  6:02 UTC (permalink / raw)
  To: Patrick McHardy, David S. Miller, Eric Dumazet,
	Michał Mirosław, Tom Herbert, Jesse Gross, herbert
  Cc: netdev, linux-kernel, yrl.pp-manager.tt, Mitsuo Hayasaka

The IFF_RUNNING flag of a vlan device lags behind that of its real device
when the real device has a problem such as a lost link or a broken cable.
This degrades availability, for example by delaying failover in HA systems
that use vlans, because the problem on the real device is detected late.

Why this happens:
The flags of a network device can be checked using ioctl with SIOCGIFFLAGS.
When a vlan is used, this checks the flags of the vlan device, not those of
the real device.

Patch:
This patch adds a vlan-device check to dev_get_flags(), so that it checks
the link state of the real device even when a vlan is used.

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Tom Herbert <therbert@google.com>
Cc: Jesse Gross <jesse@nicira.com>
---

 include/linux/if_vlan.h |    2 +-
 net/core/dev.c          |    7 +++++++
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 44da482..4df4e6f 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -91,7 +91,7 @@ struct vlan_group {
 	struct rcu_head		rcu;
 };
 
-static inline int is_vlan_dev(struct net_device *dev)
+static inline int is_vlan_dev(const struct net_device *dev)
 {
         return dev->priv_flags & IFF_802_1Q_VLAN;
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index a4306f7..527e21b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4603,6 +4603,13 @@ unsigned dev_get_flags(const struct net_device *dev)
 		(dev->gflags & (IFF_PROMISC |
 				IFF_ALLMULTI));
 
+	/*
+	 * If we're trying to get flags on a vlan device
+	 * use the underlying physical device instead.
+	 */
+	if (is_vlan_dev(dev))
+		dev = vlan_dev_real_dev(dev);
+
 	if (netif_running(dev)) {
 		if (netif_oper_up(dev))
 			flags |= IFF_RUNNING;


* Re: [PATCH net-next ] Fix time-lag of IFF_RUNNING flag consistency between vlan and real devices
  2011-08-26  6:02 ` Mitsuo Hayasaka
@ 2011-08-26  6:08   ` Stephen Hemminger
  -1 siblings, 0 replies; 20+ messages in thread
From: Stephen Hemminger @ 2011-08-26  6:08 UTC (permalink / raw)
  To: Mitsuo Hayasaka
  Cc: Patrick McHardy, David S. Miller, Eric Dumazet,
	Michał Mirosław, Tom Herbert, Jesse Gross, herbert,
	netdev, linux-kernel, yrl.pp-manager.tt

On Fri, 26 Aug 2011 15:02:57 +0900
Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> wrote:

> There is a time-lag of IFF_RUNNING flag consistency between vlan and real
> devices when the real devices are in problem such as link or cable broken.
> This leads to a degradation of Availability such as a delay of failover in
> HA systems using vlan since the detection of the problem at real device is
> delayed.
> 
> Why this happens:
> Network devices' flags can be checked using ioctl with SIOCGIFFLAGS. When
> vlan technique is used, it checks the flags of vlan device, not real
> device.
> 
> Patch:
> This patch adds vlan-device check into dev_get_flags(). So, it can check
> flags of the real device even if the vlan is used.
> 
> Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
> Cc: Tom Herbert <therbert@google.com>
> Cc: Jesse Gross <jesse@nicira.com>

I don't think this is the right way to solve the problem.

The flags are supposed to propagate back from real device to vlan
via network notifications.

Just doing this for ioctl is not enough; APIs other than user space depend on this.
Also the user may have manually set different flags on vlan than on
the real device.

* Re: [PATCH net-next ] Fix time-lag of IFF_RUNNING flag consistency between vlan and real devices
  2011-08-26  6:08   ` Stephen Hemminger
@ 2011-08-26  6:45     ` Herbert Xu
  -1 siblings, 0 replies; 20+ messages in thread
From: Herbert Xu @ 2011-08-26  6:45 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Mitsuo Hayasaka, Patrick McHardy, David S. Miller, Eric Dumazet,
	Michał Mirosław, Tom Herbert, Jesse Gross, netdev,
	linux-kernel, yrl.pp-manager.tt

On Thu, Aug 25, 2011 at 11:08:59PM -0700, Stephen Hemminger wrote:
>
> Just doing this for ioctl is not enough; APIs other than user space depend on this.
> Also the user may have manually set different flags on vlan than on
> the real device.

Right, anything that tests netif_carrier_ok directly on the VLAN
device will still be delayed.

Now I remember discussing this issue in Japan.  However, I can't
recall the exact scenario in which the delay occurred.

Is the issue with the link status going down on the real device,
or the real device coming up?

IIRC we already have mechanisms in place to ensure that down events
are not delayed by linkwatch.  Of course it is possible that this
isn't working for some reason, or some other part of the system is
causing the delay.

So please clarify the scenario for us Hayasaka-san.  Also please
let us know how you measured the delay.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [PATCH net-next ] Fix time-lag of IFF_RUNNING flag consistency between vlan and real devices
  2011-08-26  6:45     ` Herbert Xu
@ 2011-08-28 13:20       ` HAYASAKA Mitsuo
  -1 siblings, 0 replies; 20+ messages in thread
From: HAYASAKA Mitsuo @ 2011-08-28 13:20 UTC (permalink / raw)
  To: Herbert Xu, Stephen Hemminger
  Cc: Patrick McHardy, David S. Miller, Eric Dumazet,
	Michał Mirosław, Tom Herbert, Jesse Gross, netdev,
	linux-kernel, yrl.pp-manager.tt

Hi Stephen and Herbert

Thank you for your comments.

(2011/08/26 15:08), Stephen Hemminger wrote:
> I don't think this is the right way to solve the problem.
>
> The flags are supposed to propagate back from real device to vlan
> via network notifications.
>
> Just doing this for ioctl is not enough; APIs other than user space depend on this.
> Also the user may have manually set different flags on vlan than on
> the real device.

I agree.
I will try another way to solve this problem, as you suggested.


(2011/08/26 15:45), Herbert Xu wrote:
> On Thu, Aug 25, 2011 at 11:08:59PM -0700, Stephen Hemminger wrote:
>> Just doing this for ioctl is not enough; APIs other than user space depend on this.
>> Also the user may have manually set different flags on vlan than on
>> the real device.
> Right, anything that tests netif_carrier_ok directly on the VLAN
> device will still be delayed.
>
> Now I remember discussing this issue in Japan.  However, I can't
> recall the exact scenario in which the delay occurred.
>
> Is the issue with the link status going down on the real device,
> or the real device coming up?
>
> IIRC we already have mechanisms in place to ensure that down events
> are not delayed by linkwatch.  Of course it is possible that this
> isn't working for some reason, or some other part of the system is
> causing the delay.
>
> So please clarify the scenario for us Hayasaka-san.  Also please
> let us know how you measured the delay.
>
> Thanks,

The issue occurs when the link goes down on the real device,
e.g. when a cable is broken or unplugged from a NIC.

I measured the delay from userspace using ioctl with SIOCGIFFLAGS,
checking whether the flag on the vlan device lags behind the flag
on the real device.

You can also check it with the script below.

-------------------------
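#!/bin/sh
# Print the flag line of both devices on every iteration.
# RealDev and VlanDev are placeholders for the actual interface names.
t=0
while :
do
	echo "$t"; t=$((t+1))
	printf 'real'; ifconfig RealDev | grep UP
	printf 'vlan'; ifconfig VlanDev | grep UP
	sleep 0.2
done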
-------------------------

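The result is shown below.
The RUNNING status of the vlan device lags behind that of the real
device: the real device loses RUNNING at sample 30, but the vlan
device does not lose it until sample 35, at least one second later
given the 0.2-second sleep between samples.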


....

19
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
20
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1  * A cable is unplugged from NIC.
21
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
22
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
23
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
24
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
25
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
26
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
27
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
28
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
29
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
30
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
31
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
32
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
33
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
34
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
35
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST MULTICAST  MTU:1500  Metric:1
36
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST MULTICAST  MTU:1500  Metric:1


Thanks.

* Re: [PATCH net-next ] Fix time-lag of IFF_RUNNING flag consistency between vlan and real devices
  2011-08-28 13:20       ` HAYASAKA Mitsuo
@ 2011-08-28 14:09         ` Eric Dumazet
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2011-08-28 14:09 UTC (permalink / raw)
  To: HAYASAKA Mitsuo
  Cc: Herbert Xu, Stephen Hemminger, Patrick McHardy, David S. Miller,
	Michał Mirosław, Tom Herbert, Jesse Gross, netdev,
	linux-kernel, yrl.pp-manager.tt

On Sunday, 28 August 2011 at 22:20 +0900, HAYASAKA Mitsuo wrote:
> Hi Stephen and Herbert
> 
> Thank you for your comments.
> 
> (2011/08/26 15:08), Stephen Hemminger wrote:
> > I don't think this is the right way to solve the problem.
> >
> > The flags are supposed to propagate back from real device to vlan
> > via network notifications.
> >
> > Just doing this for ioctl is not enough; APIs other than user space depend on this.
> > Also the user may have manually set different flags on vlan than on
> > the real device.
> 
> I agreed.
> I will try another way to solve this problem, as you said.
> 
> 
> (2011/08/26 15:45), Herbert Xu wrote:
> > On Thu, Aug 25, 2011 at 11:08:59PM -0700, Stephen Hemminger wrote:
> >> Just doing this for ioctl is not enough; APIs other than user space depend on this.
> >> Also the user may have manually set different flags on vlan than on
> >> the real device.
> > Right, anything that tests netif_carrier_ok directly on the VLAN
> > device will still be delayed.
> >
> > Now I remember discussing this issue in Japan.  However, I can't
> > recall the exact scenario in which the delay occurred.
> >
> > Is the issue with the link status going down on the real device,
> > or the real device coming up?
> >
> > IIRC we already have mechanisms in place to ensure that down events
> > are not delayed by linkwatch.  Of course it is possible that this
> > isn't working for some reason, or some other part of the system is
> > causing the delay.
> >
> > So please clarify the scenario for us Hayasaka-san.  Also please
> > let us know how you measured the delay.
> >
> > Thanks,
> 
> This issue happens when the link status is going down on the real 
> device.
> 
> ex) A cable is broken, or is unplugged from a NIC.
> 
> I measured the delay using ioctl with SIOCGIFFLAGS from userspace 
> in order to check if there is a time-lag of the flag between vlan 
> and real devices.
> 
> Also, you can check it using a script below.
> 
> -------------------------
> #!/bin/sh
> t=0
> while :
> do
> 	echo $t; t=$((t+1))
> 	echo -n real; ifconfig RealDev | grep UP
> 	echo -n vlan; ifconfig VlanDev | grep UP
> 	sleep 0.2
> done
> -------------------------
> 
> The result is shown as follows.
> It is observed that there is a time-lag of RUNNING status between 
> real and vlan devices.
> 
> 

Hi !

This reminds me of some work done in linkwatch.

Please take a look at commit e014debecd3ee3832e647 (linkwatch:
linkwatch_forget_dev() to speedup device dismantle)

And more generally, code in net/core/link_watch.c





* Re: [PATCH net-next ] Fix time-lag of IFF_RUNNING flag consistency between vlan and real devices
  2011-08-28 14:09         ` Eric Dumazet
@ 2011-08-29  6:06           ` Stephen Hemminger
  -1 siblings, 0 replies; 20+ messages in thread
From: Stephen Hemminger @ 2011-08-29  6:06 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Herbert Xu, Patrick McHardy, David S. Miller,
	Michał Mirosław, Tom Herbert, Jesse Gross, netdev,
	linux-kernel, yrl pp-manager tt, HAYASAKA Mitsuo



----- Original Message -----
> On Sunday, 28 August 2011 at 22:20 +0900, HAYASAKA Mitsuo wrote:
> > Hi Stephen and Herbert
> > 
> > Thank you for your comments.
> > 
> > (2011/08/26 15:08), Stephen Hemminger wrote:
> > > I don't think this is the right way to solve the problem.
> > >
> > > The flags are supposed to propagate back from real device to vlan
> > > via network notifications.
> > >
> > > Just doing this for ioctl is not enough; APIs other than user
> > > space depend on this.
> > > Also the user may have manually set different flags on vlan than
> > > on the real device.
> > 
> > I agreed.
> > I will try another way to solve this problem, as you said.
> > 
> > 
> > (2011/08/26 15:45), Herbert Xu wrote:
> > > On Thu, Aug 25, 2011 at 11:08:59PM -0700, Stephen Hemminger
> > > wrote:
> > >> Just doing this for ioctl is not enough; APIs other than user
> > >> space depend on this.
> > >> Also the user may have manually set different flags on vlan than
> > >> on the real device.
> > > Right, anything that tests netif_carrier_ok directly on the VLAN
> > > device will still be delayed.
> > >
> > > Now I remember discussing this issue in Japan.  However, I can't
> > > recall the exact scenario in which the delay occurred.
> > >
> > > Is the issue with the link status going down on the real device,
> > > or the real device coming up?
> > >
> > > IIRC we already have mechanisms in place to ensure that down
> > > events
> > > are not delayed by linkwatch.  Of course it is possible that this
> > > isn't working for some reason, or some other part of the system
> > > is
> > > causing the delay.
> > >
> > > So please clarify the scenario for us Hayasaka-san.  Also please
> > > let us know how you measured the delay.
> > >
> > > Thanks,
> > 
> > This issue happens when the link status is going down on the real
> > device.
> > 
> > ex) A cable is broken, or is unplugged from a NIC.
> > 
> > I measured the delay using ioctl with SIOCGIFFLAGS from userspace
> > in order to check if there is a time-lag of the flag between vlan
> > and real devices.
> > 
> > Also, you can check it using a script below.
> > 
> > -------------------------
> > #!/bin/sh
> > t=0
> > while :
> > do
> > 	echo $t; t=$((t+1))
> > 	echo -n real; ifconfig RealDev | grep UP
> > 	echo -n vlan; ifconfig VlanDev | grep UP
> > 	sleep 0.2
> > done
> > -------------------------
> > 
> > The result is shown as follows.
> > It is observed that there is a time-lag of RUNNING status between
> > real and vlan devices.
> > 
> > 
> 
> Hi !
> 
> This reminds me some work done in linkwatch
> 
> Please take a look at commit e014debecd3ee3832e647 (linkwatch:
> linkwatch_forget_dev() to speedup device dismantle)
> 
> And more generally, code in net/core/link_watch.c

Maybe the problem is specific to an Ethernet driver. Some devices poll
for link changes, and also do a manual check when an ioctl is issued.
This was mostly typical of older hardware that did not have a PHY
interrupt.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next ] Fix time-lag of IFF_RUNNING flag consistency between vlan and real devices
  2011-08-29  6:06           ` Stephen Hemminger
@ 2011-08-29  6:23             ` Eric Dumazet
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2011-08-29  6:23 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Herbert Xu, Patrick McHardy, David S. Miller,
	Michał Mirosław, Tom Herbert, Jesse Gross, netdev,
	linux-kernel, yrl pp-manager tt, HAYASAKA Mitsuo

Le dimanche 28 août 2011 à 23:06 -0700, Stephen Hemminger a écrit :
> 
> ----- Original Message -----
> > Le dimanche 28 août 2011 à 22:20 +0900, HAYASAKA Mitsuo a écrit :
> > > Hi Stephen and Herbert
> > > 
> > > Thank you for your comments.
> > > 
> > > (2011/08/26 15:08), Stephen Hemminger wrote:
> > > > I don't think this is the right way to solve the problem.
> > > >
> > > > The flags are supposed to propagate back from real device to vlan
> > > > via network notifications.
> > > >
> > > > Just doing this for ioctl is not enough, API's other than user
> > > > space depend on this.
> > > > Also the user may have manually set different flags on vlan than
> > > > on the real device.
> > > 
> > > I agreed.
> > > I will try another way to solve this problem, as you said.
> > > 
> > > 
> > > (2011/08/26 15:45), Herbert Xu wrote:
> > > > On Thu, Aug 25, 2011 at 11:08:59PM -0700, Stephen Hemminger wrote:
> > > >> Just doing this for ioctl is not enough, API's other than user
> > > >> space depend on this.
> > > >> Also the user may have manually set different flags on vlan than
> > > >> on the real device.
> > > > Right, anything that tests netif_carrier_ok directly on the VLAN
> > > > device will still be delayed.
> > > >
> > > > Now I remember discussing this issue in Japan.  However, I can't
> > > > recall the exact scenario in which the delay occurred.
> > > >
> > > > Is the issue with the link status going down on the real device,
> > > > or the real device coming up?
> > > >
> > > > IIRC we already have mechanisms in place to ensure that down
> > > > events are not delayed by linkwatch.  Of course it is possible
> > > > that this isn't working for some reason, or some other part of
> > > > the system is causing the delay.
> > > >
> > > > So please clarify the scenario for us Hayasaka-san.  Also please
> > > > let us know how you measured the delay.
> > > >
> > > > Thanks,
> > > 
> > > This issue happens when the link status is going down on the real
> > > device.
> > > 
> > > ex) A cable is broken, or is unplugged from a NIC.
> > > 
> > > I measured the delay using ioctl with SIOCGIFFLAGS from userspace
> > > in order to check if there is a time-lag of the flag between vlan
> > > and real devices.
> > > 
> > > Also, you can check it using a script below.
> > > 
> > > -------------------------
> > > #!/bin/sh
> > > t=0
> > > while :
> > > do
> > > 	echo $t; t=$((t+1))
> > > 	echo -n real; ifconfig RealDev | grep UP
> > > 	echo -n vlan; ifconfig VlanDev | grep UP
> > > 	sleep 0.2
> > > done
> > > -------------------------
> > > 
> > > The result is shown as follows.
> > > It is observed that there is a time-lag of RUNNING status between
> > > real and vlan devices.
> > > 
> > > 
> > 
> > Hi !
> > 
> > This reminds me of some work done in linkwatch.
> > 
> > Please take a look at commit e014debecd3ee3832e647 (linkwatch:
> > linkwatch_forget_dev() to speedup device dismantle)
> > 
> > And more generally, code in net/core/link_watch.c
> 
> Maybe the problem is specific to an Ethernet driver. Some devices poll
> for link changes, and also do a manual check when an ioctl is issued.
> This was mostly typical of older hardware that did not have a PHY
> interrupt.

Hmm, I just tried the script on my laptop, and reproduced the problem
with a tg3 driver, considered as a reference one ;)

The 'carrier is on' event is immediately visible on both devices, but
the 'carrier is off' event is delayed by one second.

09:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5755M Gigabit Ethernet PCI Express (rev 02)
	Subsystem: Dell Device 01f9
	Flags: bus master, fast devsel, latency 0, IRQ 45
	Memory at f1ef0000 (64-bit, non-prefetchable) [size=64K]
	Expansion ROM at <ignored> [disabled]
	Capabilities: <access denied>
	Kernel driver in use: tg3
	Kernel modules: tg3



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next ] Fix time-lag of IFF_RUNNING flag consistency between vlan and real devices
  2011-08-29  6:06           ` Stephen Hemminger
  (?)
  (?)
@ 2011-08-29  6:34           ` David Miller
  -1 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2011-08-29  6:34 UTC (permalink / raw)
  To: stephen.hemminger
  Cc: eric.dumazet, herbert, kaber, mirq-linux, therbert, jesse,
	netdev, linux-kernel, yrl.pp-manager.tt, mitsuo.hayasaka.hu

From: Stephen Hemminger <stephen.hemminger@vyatta.com>
Date: Sun, 28 Aug 2011 23:06:28 -0700 (PDT)

> This was mostly typical of older hardware that did not have a PHY
> interrupt.

Many have to poll because the PHY interrupt is simply unreliable.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH net-next] net: linkwatch: allow vlans to get carrier changes faster
  2011-08-28 14:09         ` Eric Dumazet
@ 2011-08-31  9:31           ` Eric Dumazet
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2011-08-31  9:31 UTC (permalink / raw)
  To: HAYASAKA Mitsuo
  Cc: Herbert Xu, Stephen Hemminger, Patrick McHardy, David S. Miller,
	Michał Mirosław, Tom Herbert, Jesse Gross, netdev,
	linux-kernel, yrl.pp-manager.tt

There is a time lag before a vlan device's IFF_RUNNING flag becomes
consistent with that of its real device when the real device has a
problem such as a lost link or a broken cable.

This degrades availability, for example by delaying failover in HA
systems that use vlans, since detection of the problem on the real
device is delayed.

We can avoid the linkwatch delay (~1 sec) for devices that are linked
to another device, since the delay has already been applied for the
realdev.

Based on a previous patch from Mitsuo Hayasaka.

Reported-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Patrick McHardy <kaber@trash.net>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Tom Herbert <therbert@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jesse Gross <jesse@nicira.com>
---
 net/core/link_watch.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/core/link_watch.c b/net/core/link_watch.c
index 357bd4e..c3519c6 100644
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -78,8 +78,13 @@ static void rfc2863_policy(struct net_device *dev)
 
 static bool linkwatch_urgent_event(struct net_device *dev)
 {
-	return netif_running(dev) && netif_carrier_ok(dev) &&
-		qdisc_tx_changing(dev);
+	if (!netif_running(dev))
+		return false;
+
+	if (dev->ifindex != dev->iflink)
+		return true;
+
+	return netif_carrier_ok(dev) &&	qdisc_tx_changing(dev);
 }
 
 



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next] net: linkwatch: allow vlans to get carrier changes faster
  2011-08-31  9:31           ` Eric Dumazet
@ 2011-09-01 11:53             ` HAYASAKA Mitsuo
  -1 siblings, 0 replies; 20+ messages in thread
From: HAYASAKA Mitsuo @ 2011-09-01 11:53 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Herbert Xu, Stephen Hemminger, Patrick McHardy, David S. Miller,
	Michał Mirosław, Tom Herbert, Jesse Gross, netdev,
	linux-kernel, yrl.pp-manager.tt

Hi Eric,

I confirmed that this patch resolves the time lag in IFF_RUNNING flag
consistency between vlan and real devices.

Cheers.

Tested-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>


(2011/08/31 18:31), Eric Dumazet wrote:
> There is a time lag before a vlan device's IFF_RUNNING flag becomes
> consistent with that of its real device when the real device has a
> problem such as a lost link or a broken cable.
> 
> This degrades availability, for example by delaying failover in HA
> systems that use vlans, since detection of the problem on the real
> device is delayed.
> 
> We can avoid the linkwatch delay (~1 sec) for devices that are linked
> to another device, since the delay has already been applied for the
> realdev.
> 
> Based on a previous patch from Mitsuo Hayasaka.
> 
> Reported-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
> Cc: Tom Herbert <therbert@google.com>
> Cc: Stephen Hemminger <shemminger@vyatta.com>
> Cc: Jesse Gross <jesse@nicira.com>
> ---
>  net/core/link_watch.c |    9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/link_watch.c b/net/core/link_watch.c
> index 357bd4e..c3519c6 100644
> --- a/net/core/link_watch.c
> +++ b/net/core/link_watch.c
> @@ -78,8 +78,13 @@ static void rfc2863_policy(struct net_device *dev)
>  
>  static bool linkwatch_urgent_event(struct net_device *dev)
>  {
> -	return netif_running(dev) && netif_carrier_ok(dev) &&
> -		qdisc_tx_changing(dev);
> +	if (!netif_running(dev))
> +		return false;
> +
> +	if (dev->ifindex != dev->iflink)
> +		return true;
> +
> +	return netif_carrier_ok(dev) &&	qdisc_tx_changing(dev);
>  }
>  
>  


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next] net: linkwatch: allow vlans to get carrier changes faster
  2011-08-31  9:31           ` Eric Dumazet
  (?)
  (?)
@ 2011-09-15 19:44           ` David Miller
  -1 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2011-09-15 19:44 UTC (permalink / raw)
  To: eric.dumazet
  Cc: mitsuo.hayasaka.hu, herbert, shemminger, kaber, mirq-linux,
	therbert, jesse, netdev, linux-kernel, yrl.pp-manager.tt

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 31 Aug 2011 11:31:58 +0200

> There is a time lag before a vlan device's IFF_RUNNING flag becomes
> consistent with that of its real device when the real device has a
> problem such as a lost link or a broken cable.
> 
> This degrades availability, for example by delaying failover in HA
> systems that use vlans, since detection of the problem on the real
> device is delayed.
> 
> We can avoid the linkwatch delay (~1 sec) for devices that are linked
> to another device, since the delay has already been applied for the
> realdev.
> 
> Based on a previous patch from Mitsuo Hayasaka.
> 
> Reported-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.
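
To illustrate the effect of the change to linkwatch_urgent_event(), the
old and new decisions can be modelled in user space as below. This is a
sketch under stated assumptions rather than kernel code: struct
model_dev and the functions urgent_before() and urgent_after() are
hypothetical stand-ins, with plain booleans replacing the results of
netif_running(), netif_carrier_ok() and qdisc_tx_changing(). It also
assumes that a vlan device's iflink holds the ifindex of its real
device, so that ifindex != iflink identifies a stacked device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the net_device state the check reads. */
struct model_dev {
	bool running;      /* models netif_running(dev) */
	bool carrier_ok;   /* models netif_carrier_ok(dev) */
	bool tx_changing;  /* models qdisc_tx_changing(dev) */
	int ifindex;       /* index of this device */
	int iflink;        /* index of the device it is linked to */
};

/* Decision before the patch: only "carrier on" events can be urgent. */
bool urgent_before(const struct model_dev *dev)
{
	assert(dev != NULL);
	return dev->running && dev->carrier_ok && dev->tx_changing;
}

/* Decision after the patch: stacked devices are always urgent. */
bool urgent_after(const struct model_dev *dev)
{
	assert(dev != NULL);

	if (!dev->running)
		return false;

	/* Linked to another device: that device already took the delay. */
	if (dev->ifindex != dev->iflink)
		return true;

	return dev->carrier_ok && dev->tx_changing;
}
```

In this model, a running vlan device whose carrier has just gone off is
not urgent under urgent_before() but is urgent under urgent_after(),
while a real device (ifindex equal to iflink) gets the same answer from
both functions.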

^ permalink raw reply	[flat|nested] 20+ messages in thread
