netdev.vger.kernel.org archive mirror
* bonding + arp monitoring fails if interface is a vlan
@ 2013-08-01 12:11 Santiago Garcia Mantinan
  2013-08-01 13:00 ` Erik Hugne
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Santiago Garcia Mantinan @ 2013-08-01 12:11 UTC (permalink / raw)
  To: netdev

Hi!

I'm trying to set up a bond of a couple of vlans; these vlans are different
paths to an upstream switch from a local switch.  I want to do arp
monitoring of the link so that the bonding interface knows which path
is ok and which one is broken.  If I set it up using arp monitoring and
without using vlans it works ok, and it also works if I set it up using vlans
but without arp monitoring, so the broken combination seems to be bonding +
arp monitoring + vlans. Here is a diagram:

 -------------
|Remote Switch|
 -------------
   |      |
   P      P
   A      A
   T      T
   H      H
   1      2
   |      |
 ------------
|Local switch|
 ------------
      |
      | VLAN for PATH1
      | VLAN for PATH2
      |
 Linux machine

The broken setup seems to work, but arp monitoring makes it lose the logical
link from time to time, so the bond switches to another slave if one is
available.  What I saw when monitoring this with tcpdump is that all the arp
requests were going out and all the replies were coming in, so according to
the traffic seen in tcpdump the link should have been stable.  However,
/proc/net/bonding/bond0 showed the link failure count increasing, and when
testing with just a vlan interface I was losing ping whenever the link went down.
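
One way to watch the failure counter mentioned above without reading the file by eye is a small parser. This is only a sketch: it assumes the usual `Slave Interface:` and `Link Failure Count:` lines of /proc/net/bonding/bond0, and the function name `link_failure_counts` is made up for illustration.

```python
def link_failure_counts(proc_text):
    """Map each slave name to its 'Link Failure Count' value.

    proc_text is the full text of /proc/net/bonding/<bond>.
    Assumption: per-slave sections start with a 'Slave Interface:' line.
    """
    counts = {}
    slave = None
    for line in proc_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without a 'key: value' shape
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            slave = value
        elif key == "Link Failure Count" and slave is not None:
            counts[slave] = int(value)
    return counts
```

Calling it twice a few minutes apart on the contents of /proc/net/bonding/bond0 and comparing the two dictionaries shows which slave is accumulating failures.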

I've tried this on Debian wheezy with its 3.2.46 kernel and also with the
3.10.3 version in unstable.  The tests were done on a couple of machines
using a 32-bit kernel with different nics (r8169 and skge).

I created a small lab to replicate the problem; in this setup I avoided all
the switching and directly connected the machine with bonding to another
Linux machine on which I just had eth0.1002 configured with ip 192.168.1.1.
The results were the same as in the full scenario: the link on the bonding
slave went down from time to time.

This is the configuration of the bonding interface:

auto bond0
iface bond0 inet static
        address 192.168.1.2
        netmask 255.255.255.0
        bond-slaves eth0.1002
        bond-mode active-backup
        bond-arp_validate 0
        bond-arp_interval 5000
        bond-arp_ip_target 192.168.1.1
        pre-up ip link set eth0 up || true
        pre-up ip link add link eth0 name eth0.1002 type vlan id 1002 || true
        down ip link delete eth0.1002 || true
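
For readers who do not use ifupdown, the same bonding parameters can be expressed as writes to the bonding sysfs files. The sketch below only builds the list of writes and performs none of them. The function name `bonding_sysfs_plan` is made up for illustration, the order simply mirrors the order of the kernel messages quoted in this message, and whether a given kernel accepts the writes in exactly this order (for example with the bond up or down) is an assumption that has not been checked here.

```python
def bonding_sysfs_plan(bond, slaves, mode, arp_validate, arp_interval_ms, arp_targets):
    """Return (path, value) pairs describing the sysfs writes; nothing is written.

    Assumption: the bond device already exists and exposes the standard
    /sys/class/net/<bond>/bonding/ directory.
    """
    base = f"/sys/class/net/{bond}/bonding"
    plan = [(f"{base}/arp_ip_target", f"+{target}") for target in arp_targets]
    plan.append((f"{base}/arp_interval", str(arp_interval_ms)))
    plan.append((f"{base}/mode", mode))
    plan.append((f"{base}/arp_validate", str(arp_validate)))
    plan += [(f"{base}/slaves", f"+{slave}") for slave in slaves]
    return plan


# Values taken from the interfaces stanza above; printed, not applied.
for path, value in bonding_sysfs_plan(
    "bond0", ["eth0.1002"], "active-backup", 0, 5000, ["192.168.1.1"]
):
    print(f"{value} -> {path}")
```

Printing the plan first makes it easy to compare it with what the ifupdown stanza is expected to do before applying anything by hand.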

These are the messages I was seeing on the bonding machine:

[  452.436750] bonding: bond0: adding ARP target 192.168.1.1.
[  452.436851] bonding: bond0: Setting ARP monitoring interval to 5000.
[  452.440287] bonding: bond0: setting mode to active-backup (1).
[  452.440429] bonding: bond0: setting arp_validate to none (0).
[  452.458349] bonding: bond0: Adding slave eth0.1002.
[  452.458964] bonding: bond0: making interface eth0.1002 the new active one.
[  452.458983] bonding: bond0: first active interface up!
[  452.458999] bonding: bond0: enslaving eth0.1002 as an active interface with an up link.
[  452.482560] 8021q: adding VLAN 0 to HW filter on device bond0
[  467.500143] bonding: bond0: link status definitely down for interface eth0.1002, disabling it
[  467.500193] bonding: bond0: now running without any active interface !
[  622.748102] bonding: bond0: link status definitely up for interface eth0.1002.
[  622.748122] bonding: bond0: making interface eth0.1002 the new active one.
[  622.748522] bonding: bond0: first active interface up!
[  637.772179] bonding: bond0: link status definitely down for interface eth0.1002, disabling it
[  637.772228] bonding: bond0: now running without any active interface !
[  642.780173] bonding: bond0: link status definitely up for interface eth0.1002.
[  642.780192] bonding: bond0: making interface eth0.1002 the new active one.
[  642.780603] bonding: bond0: first active interface up!
[  657.804154] bonding: bond0: link status definitely down for interface eth0.1002, disabling it
[  657.804209] bonding: bond0: now running without any active interface !
[  662.812165] bonding: bond0: link status definitely up for interface eth0.1002.
[  662.812185] bonding: bond0: making interface eth0.1002 the new active one.
[  662.812592] bonding: bond0: first active interface up!
[  677.836167] bonding: bond0: link status definitely down for interface eth0.1002, disabling it
[  677.836223] bonding: bond0: now running without any active interface !
[  682.844162] bonding: bond0: link status definitely up for interface eth0.1002.
[  682.844181] bonding: bond0: making interface eth0.1002 the new active one.
[  682.844590] bonding: bond0: first active interface up!
[  697.868153] bonding: bond0: link status definitely down for interface eth0.1002, disabling it

Like I said, running tcpdump on both Linux machines shows everything is fine:
all arp requests and replies are there, but the link goes down from time to
time.  In this setup the bond is built with just one slave, so the network is
lost when the link goes down.
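
The timing of the flaps can be read off the kernel timestamps in the log above. The following sketch extracts the "link status definitely up/down" messages and computes how long the slave stayed in each state. The names `link_events` and `state_durations` are made up for illustration, and the pairing of consecutive events assumes a single slave, as in this lab setup.

```python
import re

# Matches e.g. "[  467.500143] bonding: bond0: link status definitely down
# for interface eth0.1002, disabling it" (also when the line is wrapped).
EVENT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+bonding:\s+\S+:\s+link\s+status\s+definitely\s+"
    r"(up|down)\s+for\s+interface\s+(\S+)"
)


def link_events(log_text):
    """Return (timestamp, slave, state) tuples in log order."""
    return [
        (float(m.group(1)), m.group(3).rstrip(".,"), m.group(2))
        for m in EVENT_RE.finditer(log_text)
    ]


def state_durations(events):
    """Return (slave, state, seconds) for each period between two events."""
    durations = []
    for (t0, slave0, state0), (t1, slave1, _) in zip(events, events[1:]):
        if slave0 == slave1:  # simplistic: only pairs adjacent events
            durations.append((slave0, state0, t1 - t0))
    return durations
```

Applied to the messages above, after the first recovery the slave alternates between roughly 15 seconds up (three 5000 ms intervals) and roughly 5 seconds down (one interval).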

Some questions:

Am I doing something wrong here?
Is this setup not supported?
If it should work... can anybody reproduce this?
Is this a bug?

What should I do now?

Regards...
-- 
Manty/BestiaTester -> http://manty.net


* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-01 12:11 bonding + arp monitoring fails if interface is a vlan Santiago Garcia Mantinan
@ 2013-08-01 13:00 ` Erik Hugne
  2013-08-02  7:26   ` Santiago Garcia Mantinan
  2013-08-01 20:21 ` Veaceslav Falico
  2013-08-02 11:58 ` Nikolay Aleksandrov
  2 siblings, 1 reply; 18+ messages in thread
From: Erik Hugne @ 2013-08-01 13:00 UTC (permalink / raw)
  To: Santiago Garcia Mantinan; +Cc: netdev

On Thu, Aug 01, 2013 at 02:11:42PM +0200, Santiago Garcia Mantinan wrote:
> Hi!
> 
> I'm trying to set up a bond of a couple of vlans; these vlans are different
> paths to an upstream switch from a local switch.  I want to do arp
> monitoring of the link so that the bonding interface knows which path
> is ok and which one is broken.  If I set it up using arp monitoring and
> without using vlans it works ok, and it also works if I set it up using vlans
> but without arp monitoring, so the broken combination seems to be bonding +
> arp monitoring + vlans.

This has helped me troubleshoot various bonding problems in the past:
mount -t debugfs none /sys/kernel/debug/
ln -s /sys/kernel/debug /debug
echo -n 'module bonding +p' > /debug/dynamic_debug/control

> The broken setup seems to work, but arp monitoring makes it lose the logical
> link from time to time, so the bond switches to another slave if one is
> available.  What I saw when monitoring this with tcpdump is that all the arp
> requests were going out and all the replies were coming in, so according to
> the traffic seen in tcpdump the link should have been stable.  However,
> /proc/net/bonding/bond0 showed the link failure count increasing, and when
> testing with just a vlan interface I was losing ping whenever the link went down.

Did you sniff externally, on the native device, bond slaves or on bond0?

//E


* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-01 12:11 bonding + arp monitoring fails if interface is a vlan Santiago Garcia Mantinan
  2013-08-01 13:00 ` Erik Hugne
@ 2013-08-01 20:21 ` Veaceslav Falico
  2013-08-02  7:30   ` Santiago Garcia Mantinan
  2013-08-02 11:58 ` Nikolay Aleksandrov
  2 siblings, 1 reply; 18+ messages in thread
From: Veaceslav Falico @ 2013-08-01 20:21 UTC (permalink / raw)
  To: Santiago Garcia Mantinan; +Cc: netdev

On Thu, Aug 1, 2013 at 2:11 PM, Santiago Garcia Mantinan
<manty@manty.net> wrote:
> Hi!
...snip...
>
> This is the setup on the bonding interface.
>
> auto bond0
> iface bond0 inet static
>         address 192.168.1.2
>         netmask 255.255.255.0
>         bond-slaves eth0.1002
>         bond-mode active-backup
>         bond-arp_validate 0

Could you please try with arp_validate=1?


* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-01 13:00 ` Erik Hugne
@ 2013-08-02  7:26   ` Santiago Garcia Mantinan
  2013-08-02  9:33     ` Santiago Garcia Mantinan
  0 siblings, 1 reply; 18+ messages in thread
From: Santiago Garcia Mantinan @ 2013-08-02  7:26 UTC (permalink / raw)
  To: Erik Hugne; +Cc: netdev

2013/8/1 Erik Hugne <erik.hugne@ericsson.com>
> This has helped me troubleshoot various bonding problems in the past:
> mount -t debugfs none /sys/kernel/debug/
> ln -s /sys/kernel/debug /debug
> echo -n 'module bonding +p' > /debug/dynamic_debug/control

I'm compiling a 3.11-rc3 version with dynamic_debug enabled in order
to be able to test this.

> Did you sniff externally, on the native device, bond slaves or on bond0?

The sniffing was done on both the bonding host (eth0 device) and the
remote host, the one with just the vlan.

Regards.
-- 
Manty/BestiaTester -> http://manty.net


* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-01 20:21 ` Veaceslav Falico
@ 2013-08-02  7:30   ` Santiago Garcia Mantinan
  0 siblings, 0 replies; 18+ messages in thread
From: Santiago Garcia Mantinan @ 2013-08-02  7:30 UTC (permalink / raw)
  To: Veaceslav Falico; +Cc: netdev

2013/8/1 Veaceslav Falico <darkmag@gmail.com>:
>>         bond-slaves eth0.1002
>>         bond-mode active-backup
>>         bond-arp_validate 0
> Could you please try with arp_validate=1?

arp_validate=1 was what I tried first; then I thought that the validation
could be dropping packets, so I switched to arp_validate=0.  Both had the
same problem.

Regards.
-- 
Manty/BestiaTester -> http://manty.net


* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-02  7:26   ` Santiago Garcia Mantinan
@ 2013-08-02  9:33     ` Santiago Garcia Mantinan
  0 siblings, 0 replies; 18+ messages in thread
From: Santiago Garcia Mantinan @ 2013-08-02  9:33 UTC (permalink / raw)
  To: Erik Hugne; +Cc: netdev

2013/8/2 Santiago Garcia Mantinan <manty@manty.net>:
> 2013/8/1 Erik Hugne <erik.hugne@ericsson.com>
>> This has helped me troubleshoot various bonding problems in the past:
>> mount -t debugfs none /sys/kernel/debug/
>> ln -s /sys/kernel/debug /debug
>> echo -n 'module bonding +p' > /debug/dynamic_debug/control
>
> I'm compiling a 3.11-rc3 version with dynamic_debug enabled in order
> to be able to test this.

Done.  Running 3.11-rc3 with this debug output enabled, what I see is that
one in four of the arp probes consistently fails: three succeed and then one
fails, then again three succeed and one fails.  I got the same 1/4 ratio when
testing with an arp_interval of 2 seconds and of 5 seconds.

I'm pasting the debug output here in case it helps somebody else find out
what is going on; it hasn't helped me at all :-(

11:16:29 [ 1510.493414] bonding: event_dev: eth0, event: 15
11:16:29 [ 1510.493459] bonding: event_dev: eth0.1002, event: 10
11:16:29 [ 1510.493979] bonding: event_dev: eth0.1002, event: 5
11:16:29 [ 1510.526266] bonding: bond0: adding ARP target 192.168.1.1.
11:16:29 [ 1510.526391] bonding: bond0: Setting ARP monitoring interval to 2000.
11:16:29 [ 1510.530093] bonding: bond0: setting mode to active-backup (1).
11:16:29 [ 1510.530217] bonding: bond0: setting arp_validate to none (0).
11:16:29 [ 1510.537106] bonding: bond0: Adding slave eth0.1002.
11:16:29 [ 1510.537119] bonding: eth0.1002: ! NETIF_F_VLAN_CHALLENGED
11:16:29 [ 1510.537136] bonding: event_dev: eth0.1002, event: 14
11:16:29 [ 1510.537147] bonding: bond_dev=f5581000 slave_dev=f4dcc000
slave_dev->addr_len=6
11:16:29 [ 1510.537206] bonding: event_dev: bond0, event: 8
11:16:29 [ 1510.537213] bonding: IFF_MASTER
11:16:29 [ 1510.537244] bonding: event_dev: eth0.1002, event: 8
11:16:29 [ 1510.537283] bonding: event_dev: eth0.1002, event: 15
11:16:29 [ 1510.537308] bonding: event_dev: eth0.1002, event: d
11:16:29 [ 1510.537389] bonding: event_dev: eth0.1002, event: 1
11:16:29 [ 1510.537424] bonding: event_dev: bond0, event: b
11:16:29 [ 1510.537430] bonding: IFF_MASTER
11:16:29 [ 1510.537945] bonding: Initial state of slave_dev is BOND_LINK_UP
11:16:29 [ 1510.537955] bonding: bond0: making interface eth0.1002 the
new active one.
11:16:29 [ 1510.538032] bonding: event_dev: bond0, event: c
11:16:29 [ 1510.538039] bonding: IFF_MASTER
11:16:29 [ 1510.538051] bonding: bond0: first active interface up!
11:16:29 [ 1510.538073] bonding: bond0: enslaving eth0.1002 as an
active interface with an up link.
11:16:29 [ 1510.565482] bonding: event_dev: bond0, event: d
11:16:29 [ 1510.565490] bonding: IFF_MASTER
11:16:29 [ 1510.565564] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:29 [ 1510.565572] bonding: basa: target 192.168.1.1
11:16:29 [ 1510.565577] bonding: basa: empty vlan: arp_send
11:16:29 [ 1510.565587] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:29 [ 1510.565789] 8021q: adding VLAN 0 to HW filter on device bond0
11:16:29 [ 1510.565798] bonding: bond: bond0, vlan id 0
11:16:29 [ 1510.565803] bonding: added VLAN ID 0 on bond bond0
11:16:29 [ 1510.565873] bonding: event_dev: bond0, event: 1
11:16:29 [ 1510.565876] bonding: IFF_MASTER
11:16:30 [ 1511.492281] bonding: event_dev: bond0, event: 4
11:16:30 [ 1511.492293] bonding: IFF_MASTER
11:16:31 [ 1512.568138] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:31 [ 1512.568154] bonding: basa: target 192.168.1.1
11:16:31 [ 1512.568174] bonding: basa: rtdev == bond->dev: arp_send
11:16:31 [ 1512.568189] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:33 [ 1514.572184] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:33 [ 1514.572203] bonding: basa: target 192.168.1.1
11:16:33 [ 1514.572224] bonding: basa: rtdev == bond->dev: arp_send
11:16:33 [ 1514.572238] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:35 [ 1516.576150] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:35 [ 1516.576171] bonding: bond0: link status definitely down
for interface eth0.1002, disabling it
11:16:35 [ 1516.576225] bonding: bond0: now running without any active
interface !
11:16:35 [ 1516.576237] bonding: basa: target 192.168.1.1
11:16:35 [ 1516.576254] bonding: basa: rtdev == bond->dev: arp_send
11:16:35 [ 1516.576266] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:35 [ 1516.576512] bonding: event_dev: bond0, event: 4
11:16:35 [ 1516.576520] bonding: IFF_MASTER
11:16:37 [ 1518.580153] bonding: bond_should_notify_peers: bond bond0 slave NULL
11:16:37 [ 1518.580174] bonding: bond0: link status definitely up for
interface eth0.1002.
11:16:37 [ 1518.580187] bonding: bond0: making interface eth0.1002 the
new active one.
11:16:37 [ 1518.580232] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:37 [ 1518.580437] bonding: event_dev: bond0, event: c
11:16:37 [ 1518.580444] bonding: IFF_MASTER
11:16:37 [ 1518.580621] bonding: event_dev: bond0, event: 13
11:16:37 [ 1518.580629] bonding: IFF_MASTER
11:16:37 [ 1518.580658] bonding: bond0: first active interface up!
11:16:37 [ 1518.580673] bonding: basa: target 192.168.1.1
11:16:37 [ 1518.580696] bonding: basa: rtdev == bond->dev: arp_send
11:16:37 [ 1518.580715] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:37 [ 1518.580920] bonding: event_dev: bond0, event: 4
11:16:37 [ 1518.580927] bonding: IFF_MASTER
11:16:39 [ 1520.584150] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:39 [ 1520.584170] bonding: basa: target 192.168.1.1
11:16:39 [ 1520.584195] bonding: basa: rtdev == bond->dev: arp_send
11:16:39 [ 1520.584209] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:41 [ 1522.588153] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:41 [ 1522.588174] bonding: basa: target 192.168.1.1
11:16:41 [ 1522.588198] bonding: basa: rtdev == bond->dev: arp_send
11:16:41 [ 1522.588211] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:43 [ 1524.592161] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:43 [ 1524.592182] bonding: bond0: link status definitely down
for interface eth0.1002, disabling it
11:16:43 [ 1524.592227] bonding: bond0: now running without any active
interface !
11:16:43 [ 1524.592238] bonding: basa: target 192.168.1.1
11:16:43 [ 1524.592253] bonding: basa: rtdev == bond->dev: arp_send
11:16:43 [ 1524.592267] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:43 [ 1524.592507] bonding: event_dev: bond0, event: 4
11:16:43 [ 1524.592516] bonding: IFF_MASTER
11:16:45 [ 1526.596178] bonding: bond_should_notify_peers: bond bond0 slave NULL
11:16:45 [ 1526.596199] bonding: bond0: link status definitely up for
interface eth0.1002.
11:16:45 [ 1526.596212] bonding: bond0: making interface eth0.1002 the
new active one.
11:16:45 [ 1526.596249] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:45 [ 1526.596449] bonding: event_dev: bond0, event: c
11:16:45 [ 1526.596457] bonding: IFF_MASTER
11:16:45 [ 1526.596638] bonding: event_dev: bond0, event: 13
11:16:45 [ 1526.596646] bonding: IFF_MASTER
11:16:45 [ 1526.596676] bonding: bond0: first active interface up!
11:16:45 [ 1526.596691] bonding: basa: target 192.168.1.1
11:16:45 [ 1526.596713] bonding: basa: rtdev == bond->dev: arp_send
11:16:45 [ 1526.596730] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:45 [ 1526.596939] bonding: event_dev: bond0, event: 4
11:16:45 [ 1526.596947] bonding: IFF_MASTER
11:16:47 [ 1528.600160] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:47 [ 1528.600181] bonding: basa: target 192.168.1.1
11:16:47 [ 1528.600207] bonding: basa: rtdev == bond->dev: arp_send
11:16:47 [ 1528.600220] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:49 [ 1530.604152] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:49 [ 1530.604172] bonding: basa: target 192.168.1.1
11:16:49 [ 1530.604196] bonding: basa: rtdev == bond->dev: arp_send
11:16:49 [ 1530.604210] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:51 [ 1532.608155] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:51 [ 1532.608176] bonding: bond0: link status definitely down
for interface eth0.1002, disabling it
11:16:51 [ 1532.608232] bonding: bond0: now running without any active
interface !
11:16:51 [ 1532.608244] bonding: basa: target 192.168.1.1
11:16:51 [ 1532.608259] bonding: basa: rtdev == bond->dev: arp_send
11:16:51 [ 1532.608272] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:51 [ 1532.608505] bonding: event_dev: bond0, event: 4
11:16:51 [ 1532.608513] bonding: IFF_MASTER
11:16:53 [ 1534.612150] bonding: bond_should_notify_peers: bond bond0 slave NULL
11:16:53 [ 1534.612170] bonding: bond0: link status definitely up for
interface eth0.1002.
11:16:53 [ 1534.612183] bonding: bond0: making interface eth0.1002 the
new active one.
11:16:53 [ 1534.612220] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:53 [ 1534.612419] bonding: event_dev: bond0, event: c
11:16:53 [ 1534.612427] bonding: IFF_MASTER
11:16:53 [ 1534.612606] bonding: event_dev: bond0, event: 13
11:16:53 [ 1534.612614] bonding: IFF_MASTER
11:16:53 [ 1534.612643] bonding: bond0: first active interface up!
11:16:53 [ 1534.612658] bonding: basa: target 192.168.1.1
11:16:53 [ 1534.612676] bonding: basa: rtdev == bond->dev: arp_send
11:16:53 [ 1534.612691] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:53 [ 1534.612934] bonding: event_dev: bond0, event: 4
11:16:53 [ 1534.612942] bonding: IFF_MASTER
11:16:55 [ 1536.616158] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:55 [ 1536.616180] bonding: basa: target 192.168.1.1
11:16:55 [ 1536.616204] bonding: basa: rtdev == bond->dev: arp_send
11:16:55 [ 1536.616218] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:57 [ 1538.620145] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:57 [ 1538.620166] bonding: basa: target 192.168.1.1
11:16:57 [ 1538.620179] bonding: basa: rtdev == bond->dev: arp_send
11:16:57 [ 1538.620192] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:59 [ 1540.624155] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:16:59 [ 1540.624176] bonding: bond0: link status definitely down
for interface eth0.1002, disabling it
11:16:59 [ 1540.624227] bonding: bond0: now running without any active
interface !
11:16:59 [ 1540.624238] bonding: basa: target 192.168.1.1
11:16:59 [ 1540.624253] bonding: basa: rtdev == bond->dev: arp_send
11:16:59 [ 1540.624266] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:16:59 [ 1540.624501] bonding: event_dev: bond0, event: 4
11:16:59 [ 1540.624509] bonding: IFF_MASTER
11:17:01 [ 1542.628132] bonding: bond_should_notify_peers: bond bond0 slave NULL
11:17:01 [ 1542.628148] bonding: bond0: link status definitely up for
interface eth0.1002.
11:17:01 [ 1542.628158] bonding: bond0: making interface eth0.1002 the
new active one.
11:17:01 [ 1542.628198] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:01 [ 1542.628396] bonding: event_dev: bond0, event: c
11:17:01 [ 1542.628404] bonding: IFF_MASTER
11:17:01 [ 1542.628581] bonding: event_dev: bond0, event: 13
11:17:01 [ 1542.628589] bonding: IFF_MASTER
11:17:01 [ 1542.628619] bonding: bond0: first active interface up!
11:17:01 [ 1542.628634] bonding: basa: target 192.168.1.1
11:17:01 [ 1542.628657] bonding: basa: rtdev == bond->dev: arp_send
11:17:01 [ 1542.628675] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:01 [ 1542.628846] bonding: event_dev: bond0, event: 4
11:17:01 [ 1542.628856] bonding: IFF_MASTER
11:17:03 [ 1544.632138] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:03 [ 1544.632154] bonding: basa: target 192.168.1.1
11:17:03 [ 1544.632177] bonding: basa: rtdev == bond->dev: arp_send
11:17:03 [ 1544.632190] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:05 [ 1546.636147] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:05 [ 1546.636167] bonding: basa: target 192.168.1.1
11:17:05 [ 1546.636186] bonding: basa: rtdev == bond->dev: arp_send
11:17:05 [ 1546.636201] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:07 [ 1548.640129] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:07 [ 1548.640150] bonding: bond0: link status definitely down
for interface eth0.1002, disabling it
11:17:07 [ 1548.640208] bonding: bond0: now running without any active
interface !
11:17:07 [ 1548.640221] bonding: basa: target 192.168.1.1
11:17:07 [ 1548.640235] bonding: basa: rtdev == bond->dev: arp_send
11:17:07 [ 1548.640249] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:07 [ 1548.640524] bonding: event_dev: bond0, event: 4
11:17:07 [ 1548.640532] bonding: IFF_MASTER
11:17:09 [ 1550.644152] bonding: bond_should_notify_peers: bond bond0 slave NULL
11:17:09 [ 1550.644172] bonding: bond0: link status definitely up for
interface eth0.1002.
11:17:09 [ 1550.644184] bonding: bond0: making interface eth0.1002 the
new active one.
11:17:09 [ 1550.644226] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:09 [ 1550.644421] bonding: event_dev: bond0, event: c
11:17:09 [ 1550.644429] bonding: IFF_MASTER
11:17:09 [ 1550.644608] bonding: event_dev: bond0, event: 13
11:17:09 [ 1550.644616] bonding: IFF_MASTER
11:17:09 [ 1550.644645] bonding: bond0: first active interface up!
11:17:09 [ 1550.644659] bonding: basa: target 192.168.1.1
11:17:09 [ 1550.644683] bonding: basa: rtdev == bond->dev: arp_send
11:17:09 [ 1550.644697] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:09 [ 1550.644866] bonding: event_dev: bond0, event: 4
11:17:09 [ 1550.644875] bonding: IFF_MASTER
11:17:11 [ 1552.648128] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:11 [ 1552.648144] bonding: basa: target 192.168.1.1
11:17:11 [ 1552.648167] bonding: basa: rtdev == bond->dev: arp_send
11:17:11 [ 1552.648180] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:13 [ 1554.652132] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:13 [ 1554.652149] bonding: basa: target 192.168.1.1
11:17:13 [ 1554.652166] bonding: basa: rtdev == bond->dev: arp_send
11:17:13 [ 1554.652179] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:15 [ 1556.656131] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:15 [ 1556.656147] bonding: bond0: link status definitely down
for interface eth0.1002, disabling it
11:17:15 [ 1556.656198] bonding: bond0: now running without any active
interface !
11:17:15 [ 1556.656210] bonding: basa: target 192.168.1.1
11:17:15 [ 1556.656224] bonding: basa: rtdev == bond->dev: arp_send
11:17:15 [ 1556.656238] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:15 [ 1556.656488] bonding: event_dev: bond0, event: 4
11:17:15 [ 1556.656497] bonding: IFF_MASTER
11:17:17 [ 1558.660147] bonding: bond_should_notify_peers: bond bond0 slave NULL
11:17:17 [ 1558.660168] bonding: bond0: link status definitely up for
interface eth0.1002.
11:17:17 [ 1558.660180] bonding: bond0: making interface eth0.1002 the
new active one.
11:17:17 [ 1558.660221] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:17 [ 1558.660424] bonding: event_dev: bond0, event: c
11:17:17 [ 1558.660432] bonding: IFF_MASTER
11:17:17 [ 1558.660611] bonding: event_dev: bond0, event: 13
11:17:17 [ 1558.660619] bonding: IFF_MASTER
11:17:17 [ 1558.660649] bonding: bond0: first active interface up!
11:17:17 [ 1558.660665] bonding: basa: target 192.168.1.1
11:17:17 [ 1558.660683] bonding: basa: rtdev == bond->dev: arp_send
11:17:17 [ 1558.660699] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:17 [ 1558.660905] bonding: event_dev: bond0, event: 4
11:17:17 [ 1558.660913] bonding: IFF_MASTER
11:17:19 [ 1560.664082] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:19 [ 1560.664103] bonding: basa: target 192.168.1.1
11:17:19 [ 1560.664128] bonding: basa: rtdev == bond->dev: arp_send
11:17:19 [ 1560.664141] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:21 [ 1562.668149] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:21 [ 1562.668169] bonding: basa: target 192.168.1.1
11:17:21 [ 1562.668189] bonding: basa: rtdev == bond->dev: arp_send
11:17:21 [ 1562.668202] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:23 [ 1564.672153] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:23 [ 1564.672174] bonding: bond0: link status definitely down
for interface eth0.1002, disabling it
11:17:23 [ 1564.672230] bonding: bond0: now running without any active
interface !
11:17:23 [ 1564.672243] bonding: basa: target 192.168.1.1
11:17:23 [ 1564.672258] bonding: basa: rtdev == bond->dev: arp_send
11:17:23 [ 1564.672271] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:23 [ 1564.672512] bonding: event_dev: bond0, event: 4
11:17:23 [ 1564.672520] bonding: IFF_MASTER
11:17:25 [ 1566.676149] bonding: bond_should_notify_peers: bond bond0 slave NULL
11:17:25 [ 1566.676170] bonding: bond0: link status definitely up for
interface eth0.1002.
11:17:25 [ 1566.676182] bonding: bond0: making interface eth0.1002 the
new active one.
11:17:25 [ 1566.676222] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:25 [ 1566.676422] bonding: event_dev: bond0, event: c
11:17:25 [ 1566.676431] bonding: IFF_MASTER
11:17:25 [ 1566.676610] bonding: event_dev: bond0, event: 13
11:17:25 [ 1566.676617] bonding: IFF_MASTER
11:17:25 [ 1566.676646] bonding: bond0: first active interface up!
11:17:25 [ 1566.676660] bonding: basa: target 192.168.1.1
11:17:25 [ 1566.676678] bonding: basa: rtdev == bond->dev: arp_send
11:17:25 [ 1566.676695] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:25 [ 1566.676906] bonding: event_dev: bond0, event: 4
11:17:25 [ 1566.676914] bonding: IFF_MASTER
11:17:27 [ 1568.680155] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:27 [ 1568.680176] bonding: basa: target 192.168.1.1
11:17:27 [ 1568.680200] bonding: basa: rtdev == bond->dev: arp_send
11:17:27 [ 1568.680214] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:29 [ 1570.684149] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:29 [ 1570.684171] bonding: basa: target 192.168.1.1
11:17:29 [ 1570.684189] bonding: basa: rtdev == bond->dev: arp_send
11:17:29 [ 1570.684203] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:31 [ 1572.688152] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:31 [ 1572.688173] bonding: bond0: link status definitely down
for interface eth0.1002, disabling it
11:17:31 [ 1572.688225] bonding: bond0: now running without any active
interface !
11:17:31 [ 1572.688237] bonding: basa: target 192.168.1.1
11:17:31 [ 1572.688253] bonding: basa: rtdev == bond->dev: arp_send
11:17:31 [ 1572.688266] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:31 [ 1572.688505] bonding: event_dev: bond0, event: 4
11:17:31 [ 1572.688513] bonding: IFF_MASTER
11:17:33 [ 1574.692147] bonding: bond_should_notify_peers: bond bond0 slave NULL
11:17:33 [ 1574.692168] bonding: bond0: link status definitely up for
interface eth0.1002.
11:17:33 [ 1574.692179] bonding: bond0: making interface eth0.1002 the
new active one.
11:17:33 [ 1574.692221] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:33 [ 1574.692424] bonding: event_dev: bond0, event: c
11:17:33 [ 1574.692432] bonding: IFF_MASTER
11:17:33 [ 1574.692610] bonding: event_dev: bond0, event: 13
11:17:33 [ 1574.692618] bonding: IFF_MASTER
11:17:33 [ 1574.692648] bonding: bond0: first active interface up!
11:17:33 [ 1574.692664] bonding: basa: target 192.168.1.1
11:17:33 [ 1574.692681] bonding: basa: rtdev == bond->dev: arp_send
11:17:33 [ 1574.692697] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:33 [ 1574.692907] bonding: event_dev: bond0, event: 4
11:17:33 [ 1574.692915] bonding: IFF_MASTER
11:17:35 [ 1576.696156] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:35 [ 1576.696176] bonding: basa: target 192.168.1.1
11:17:35 [ 1576.696203] bonding: basa: rtdev == bond->dev: arp_send
11:17:35 [ 1576.696218] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
11:17:37 [ 1578.700129] bonding: bond_should_notify_peers: bond bond0
slave eth0.1002
11:17:37 [ 1578.700145] bonding: basa: target 192.168.1.1
11:17:37 [ 1578.700162] bonding: basa: rtdev == bond->dev: arp_send
11:17:37 [ 1578.700175] bonding: arp 1 on slave eth0.1002: dst
192.168.1.1 src 192.168.1.2 vid 0
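
The 1/4 ratio can be checked mechanically against the debug output above. The sketch below counts the "arp N on slave" lines between consecutive "link status definitely down" messages. The name `probes_per_cycle` is made up for illustration, and the regular expression assumes the message wording shown in this log, including lines wrapped by the mail client.

```python
import re

# One token per timestamped bonding line of interest: either an ARP probe
# being sent or a "definitely up/down" link state message.
TOKEN_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+bonding:\s+"
    r"(?:(?P<probe>arp\s+\d+\s+on\s+slave)"
    r"|\S+:\s+link\s+status\s+definitely\s+(?P<state>up|down))"
)


def probes_per_cycle(log_text):
    """Count ARP probes sent between consecutive 'definitely down' events."""
    counts = []
    current = None  # stays None until the first 'down' event is seen
    for match in TOKEN_RE.finditer(log_text):
        if match.group("state") == "down":
            if current is not None:
                counts.append(current)
            current = 0
        elif match.group("probe") and current is not None:
            current += 1
    return counts
```

In the log above consecutive "definitely down" messages are about 8 seconds apart, which is four 2000 ms intervals; counting by hand on the first complete cycle (1516.57 to 1524.59) gives four probes, matching the three-ok-then-one-fails pattern described earlier.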

Regards.
-- 
Manty/BestiaTester -> http://manty.net


* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-01 12:11 bonding + arp monitoring fails if interface is a vlan Santiago Garcia Mantinan
  2013-08-01 13:00 ` Erik Hugne
  2013-08-01 20:21 ` Veaceslav Falico
@ 2013-08-02 11:58 ` Nikolay Aleksandrov
  2013-08-02 15:49   ` Jay Vosburgh
  2013-08-04 10:45   ` Santiago Garcia Mantinan
  2 siblings, 2 replies; 18+ messages in thread
From: Nikolay Aleksandrov @ 2013-08-02 11:58 UTC (permalink / raw)
  To: Santiago Garcia Mantinan; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 2800 bytes --]

On 08/01/2013 02:11 PM, Santiago Garcia Mantinan wrote:
> Hi!
> 
> I'm trying to setup a bond of a couple of vlans, these vlans are different
> paths to an upstream switch from a local switch.  I want to do arp
> monitoring of the link in order for the bonding interface to know which path
> is ok and wich one is broken.  If I set it up using arp monitoring and
> without using vlans it works ok, it also works if I set it up using vlans
> but without arp monitoring, so the broken setup seems to be with bonding +
> arp monitoring + vlans. Here is a schema:
> 
>  -------------
> |Remote Switch|
>  -------------
>    |      |
>    P      P
>    A      A
>    T      T
>    H      H
>    1      2
>    |      |
>  ------------
> |Local switch|
>  ------------
>       |
>       | VLAN for PATH1
>       | VLAN for PATH2
>       |
>  Linux machine
> 
> The broken setup seems to work but arp monitoring makes it loose the logical
> link from time to time, thus changing to other slave if available.  What I
> saw when monitoring this with tcpdump is that all the arp requests were
> going out and that all the replies where coming in, so acording to the
> traffic seen on tcpdump the link should have been stable, but
> /proc/net/bonding/bond0 showed the link failures increasing and when testing
> with just a vlan interface I was loosing ping when the link was going down.
> 
> I've tried this on Debian wheezy with its 3.2.46 kernel and also the 3.10.3
> version in unstable, the tests where done on a couple of machines using a 32
> bits kernel with different nics (r8169 and skge).
> 
> I created a small lab to replicate the problem, on this setup I avoided all
> the switching and I directly connected the machine with bonding to another
> Linux on which I just had eth0.1002 configured with ip 192.168.1.1, the
> results where the same as in the full scenario, link on the bonding slave
> was going down from time to time.
> 
> This is the setup on the bonding interface.
> 
> auto bond0
> iface bond0 inet static
>         address 192.168.1.2
>         netmask 255.255.255.0
>         bond-slaves eth0.1002
>         bond-mode active-backup
>         bond-arp_validate 0
>         bond-arp_interval 5000
>         bond-arp_ip_target 192.168.1.1
>         pre-up ip link set eth0 up || true
>         pre-up ip link add link eth0 name eth0.1002 type vlan id 1002 || true
>         down ip link delete eth0.1002 || true
> 
I believe this is because dev_trans_start() returns 0 for 8021q devices, so
the check of whether the slave has transmitted recently gives the wrong
result, and the link flip-flops.
Please try the attached patch, it should resolve your issue (basically it gets
the dev_trans_start of the vlan's underlying device if a vlan is found).
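
To illustrate the failure mode described above, here is a minimal user-space
sketch; it is not the kernel code. It copies the wrap-safe comparison idiom
behind the kernel's time_in_range() and checks "now" against a window around
trans_start. The helper names slave_tx_recent() and demo_trans_start(), the
exact window bounds, and the numeric values are assumptions made up for the
example.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space stand-ins for the kernel's wrap-safe jiffies comparisons. */
static bool time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

static bool time_in_range(unsigned long a, unsigned long b, unsigned long c)
{
	return time_after_eq(a, b) && time_after_eq(c, a);
}

/* Hypothetical helper: did the slave transmit within the monitor window? */
bool slave_tx_recent(unsigned long now, unsigned long trans_start,
		     unsigned long delta)
{
	return time_in_range(now, trans_start - delta, trans_start + 2 * delta);
}

/* Illustrative values only: 'now' is well past boot, delta is one interval. */
void demo_trans_start(void)
{
	unsigned long now = 400000, delta = 1250;

	/* Real NIC: trans_start was refreshed when the last probe went out. */
	assert(slave_tx_recent(now, now - 100, delta));

	/* VLAN device whose trans_start is never updated and stays at 0. */
	assert(!slave_tx_recent(now, 0, delta));
}
```

With a trans_start stuck at 0 the window sits near time zero, so the check
fails no matter how recently the probe actually left through the underlying
device.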

The patch is against Linus' tree.

Cheers,
 Nik



[-- Attachment #2: bond-trans-start.patch --]
[-- Type: text/x-patch, Size: 1729 bytes --]

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 07f257d4..6aac0ae 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -665,6 +665,16 @@ static int bond_check_dev_link(struct bonding *bond,
 	return reporting ? -1 : BMSR_LSTATUS;
 }
 
+static unsigned long bond_dev_trans_start(struct net_device *dev)
+{
+        struct net_device *real_dev = dev;
+
+        if (dev->priv_flags & IFF_802_1Q_VLAN)
+                real_dev = vlan_dev_real_dev(dev);
+
+        return dev_trans_start(real_dev);
+}
+
 /*----------------------------- Multicast list ------------------------------*/
 
 /*
@@ -2750,7 +2760,7 @@ void bond_loadbalance_arp_mon(struct work_struct *work)
 	 *       so it can wait
 	 */
 	bond_for_each_slave(bond, slave, i) {
-		unsigned long trans_start = dev_trans_start(slave->dev);
+		unsigned long trans_start = bond_dev_trans_start(slave->dev);
 
 		if (slave->link != BOND_LINK_UP) {
 			if (time_in_range(jiffies,
@@ -2912,7 +2922,7 @@ static int bond_ab_arp_inspect(struct bonding *bond, int delta_in_ticks)
 		 * - (more than 2*delta since receive AND
 		 *    the bond has an IP address)
 		 */
-		trans_start = dev_trans_start(slave->dev);
+		trans_start = bond_dev_trans_start(slave->dev);
 		if (bond_is_active_slave(slave) &&
 		    (!time_in_range(jiffies,
 			trans_start - delta_in_ticks,
@@ -2947,7 +2957,7 @@ static void bond_ab_arp_commit(struct bonding *bond, int delta_in_ticks)
 			continue;
 
 		case BOND_LINK_UP:
-			trans_start = dev_trans_start(slave->dev);
+			trans_start = bond_dev_trans_start(slave->dev);
 			if ((!bond->curr_active_slave &&
 			     time_in_range(jiffies,
 					   trans_start - delta_in_ticks,

* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-02 11:58 ` Nikolay Aleksandrov
@ 2013-08-02 15:49   ` Jay Vosburgh
  2013-08-02 16:13     ` Nikolay Aleksandrov
  2013-08-04 10:45   ` Santiago Garcia Mantinan
  1 sibling, 1 reply; 18+ messages in thread
From: Jay Vosburgh @ 2013-08-02 15:49 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: Santiago Garcia Mantinan, netdev

Nikolay Aleksandrov <nikolay@redhat.com> wrote:

>On 08/01/2013 02:11 PM, Santiago Garcia Mantinan wrote:
>> Hi!
>> 
>> I'm trying to setup a bond of a couple of vlans, these vlans are different
>> paths to an upstream switch from a local switch.  I want to do arp
>> monitoring of the link in order for the bonding interface to know which path
>> is ok and wich one is broken.  If I set it up using arp monitoring and
>> without using vlans it works ok, it also works if I set it up using vlans
>> but without arp monitoring, so the broken setup seems to be with bonding +
>> arp monitoring + vlans. Here is a schema:
>> 
>>  -------------
>> |Remote Switch|
>>  -------------
>>    |      |
>>    P      P
>>    A      A
>>    T      T
>>    H      H
>>    1      2
>>    |      |
>>  ------------
>> |Local switch|
>>  ------------
>>       |
>>       | VLAN for PATH1
>>       | VLAN for PATH2
>>       |
>>  Linux machine
>> 
>> The broken setup seems to work but arp monitoring makes it loose the logical
>> link from time to time, thus changing to other slave if available.  What I
>> saw when monitoring this with tcpdump is that all the arp requests were
>> going out and that all the replies where coming in, so acording to the
>> traffic seen on tcpdump the link should have been stable, but
>> /proc/net/bonding/bond0 showed the link failures increasing and when testing
>> with just a vlan interface I was loosing ping when the link was going down.
>> 
>> I've tried this on Debian wheezy with its 3.2.46 kernel and also the 3.10.3
>> version in unstable, the tests where done on a couple of machines using a 32
>> bits kernel with different nics (r8169 and skge).
>> 
>> I created a small lab to replicate the problem, on this setup I avoided all
>> the switching and I directly connected the machine with bonding to another
>> Linux on which I just had eth0.1002 configured with ip 192.168.1.1, the
>> results where the same as in the full scenario, link on the bonding slave
>> was going down from time to time.
>> 
>> This is the setup on the bonding interface.
>> 
>> auto bond0
>> iface bond0 inet static
>>         address 192.168.1.2
>>         netmask 255.255.255.0
>>         bond-slaves eth0.1002
>>         bond-mode active-backup
>>         bond-arp_validate 0
>>         bond-arp_interval 5000
>>         bond-arp_ip_target 192.168.1.1
>>         pre-up ip link set eth0 up || true
>>         pre-up ip link add link eth0 name eth0.1002 type vlan id 1002 || true
>>         down ip link delete eth0.1002 || true
>> 
>I believe that it is because dev_trans_start() returns 0 for 8021q devices and
>so the calculations if the slave has transmitted are wrong, and the flip-flop
>happens.
>Please try the attached patch, it should resolve your issue (basically it gets
>the dev_trans_start of the vlan's underlying device if a vlan is found).
>
>The patch is against Linus' tree.
>
>Cheers,
> Nik
>
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 07f257d4..6aac0ae 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -665,6 +665,16 @@ static int bond_check_dev_link(struct bonding *bond,
> 	return reporting ? -1 : BMSR_LSTATUS;
> }
>
>+static unsigned long bond_dev_trans_start(struct net_device *dev)
>+{
>+        struct net_device *real_dev = dev;
>+
>+        if (dev->priv_flags & IFF_802_1Q_VLAN)
>+                real_dev = vlan_dev_real_dev(dev);
>+
>+        return dev_trans_start(real_dev);
>+}

	Should this handle nested VLANs?  E.g.,

static unsigned long bond_dev_trans_start(struct net_device *dev)
{
	while (dev->priv_flags & IFF_802_1Q_VLAN)
		dev = vlan_dev_real_dev(dev);

	return dev_trans_start(dev);
}

	Also, this (ARP monitoring of a VLAN slave) has likely never
worked, and therefore this change should be considered for -stable.
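
The difference between the single "if" and the "while" loop can be tried out
in user space with a toy model. Everything below is hypothetical scaffolding:
struct mock_dev is a heavily reduced stand-in for struct net_device, MOCK_VLAN
stands in for IFF_802_1Q_VLAN, and the real_dev pointer plays the role of
vlan_dev_real_dev().

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_VLAN 0x1	/* stand-in for IFF_802_1Q_VLAN */

/* Hypothetical, heavily reduced stand-in for struct net_device. */
struct mock_dev {
	unsigned int priv_flags;
	struct mock_dev *real_dev;	/* lower device, NULL for a real NIC */
	unsigned long trans_start;
};

/* Single step, as in the first version of the patch. */
unsigned long trans_start_one_level(struct mock_dev *dev)
{
	if (dev->priv_flags & MOCK_VLAN)
		dev = dev->real_dev;
	return dev->trans_start;
}

/* Loop, as in the suggestion above: keep descending while it is a VLAN. */
unsigned long trans_start_nested(struct mock_dev *dev)
{
	while (dev->priv_flags & MOCK_VLAN)
		dev = dev->real_dev;
	return dev->trans_start;
}

void demo_nested_vlan(void)
{
	struct mock_dev eth0  = { 0, NULL, 12345 };		/* real NIC */
	struct mock_dev inner = { MOCK_VLAN, &eth0, 0 };	/* VLAN on eth0 */
	struct mock_dev outer = { MOCK_VLAN, &inner, 0 };	/* VLAN on a VLAN */

	assert(trans_start_one_level(&inner) == 12345);
	assert(trans_start_one_level(&outer) == 0);	/* stops at 'inner' */
	assert(trans_start_nested(&outer) == 12345);	/* reaches eth0 */
}
```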

	-J

>+
> /*----------------------------- Multicast list ------------------------------*/
>
> /*
>@@ -2750,7 +2760,7 @@ void bond_loadbalance_arp_mon(struct work_struct *work)
> 	 *       so it can wait
> 	 */
> 	bond_for_each_slave(bond, slave, i) {
>-		unsigned long trans_start = dev_trans_start(slave->dev);
>+		unsigned long trans_start = bond_dev_trans_start(slave->dev);
>
> 		if (slave->link != BOND_LINK_UP) {
> 			if (time_in_range(jiffies,
>@@ -2912,7 +2922,7 @@ static int bond_ab_arp_inspect(struct bonding *bond, int delta_in_ticks)
> 		 * - (more than 2*delta since receive AND
> 		 *    the bond has an IP address)
> 		 */
>-		trans_start = dev_trans_start(slave->dev);
>+		trans_start = bond_dev_trans_start(slave->dev);
> 		if (bond_is_active_slave(slave) &&
> 		    (!time_in_range(jiffies,
> 			trans_start - delta_in_ticks,
>@@ -2947,7 +2957,7 @@ static void bond_ab_arp_commit(struct bonding *bond, int delta_in_ticks)
> 			continue;
>
> 		case BOND_LINK_UP:
>-			trans_start = dev_trans_start(slave->dev);
>+			trans_start = bond_dev_trans_start(slave->dev);
> 			if ((!bond->curr_active_slave &&
> 			     time_in_range(jiffies,
> 					   trans_start - delta_in_ticks,

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-02 15:49   ` Jay Vosburgh
@ 2013-08-02 16:13     ` Nikolay Aleksandrov
  0 siblings, 0 replies; 18+ messages in thread
From: Nikolay Aleksandrov @ 2013-08-02 16:13 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Santiago Garcia Mantinan, netdev

On 08/02/2013 05:49 PM, Jay Vosburgh wrote:
> Nikolay Aleksandrov <nikolay@redhat.com> wrote:
> 
>> On 08/01/2013 02:11 PM, Santiago Garcia Mantinan wrote:
>>> Hi!
>>>
>>> I'm trying to setup a bond of a couple of vlans, these vlans are different
>>> paths to an upstream switch from a local switch.  I want to do arp
>>> monitoring of the link in order for the bonding interface to know which path
>>> is ok and wich one is broken.  If I set it up using arp monitoring and
>>> without using vlans it works ok, it also works if I set it up using vlans
>>> but without arp monitoring, so the broken setup seems to be with bonding +
>>> arp monitoring + vlans. Here is a schema:
>>>
>>>  -------------
>>> |Remote Switch|
>>>  -------------
>>>    |      |
>>>    P      P
>>>    A      A
>>>    T      T
>>>    H      H
>>>    1      2
>>>    |      |
>>>  ------------
>>> |Local switch|
>>>  ------------
>>>       |
>>>       | VLAN for PATH1
>>>       | VLAN for PATH2
>>>       |
>>>  Linux machine
>>>
>>> The broken setup seems to work but arp monitoring makes it loose the logical
>>> link from time to time, thus changing to other slave if available.  What I
>>> saw when monitoring this with tcpdump is that all the arp requests were
>>> going out and that all the replies where coming in, so acording to the
>>> traffic seen on tcpdump the link should have been stable, but
>>> /proc/net/bonding/bond0 showed the link failures increasing and when testing
>>> with just a vlan interface I was loosing ping when the link was going down.
>>>
>>> I've tried this on Debian wheezy with its 3.2.46 kernel and also the 3.10.3
>>> version in unstable, the tests where done on a couple of machines using a 32
>>> bits kernel with different nics (r8169 and skge).
>>>
>>> I created a small lab to replicate the problem, on this setup I avoided all
>>> the switching and I directly connected the machine with bonding to another
>>> Linux on which I just had eth0.1002 configured with ip 192.168.1.1, the
>>> results where the same as in the full scenario, link on the bonding slave
>>> was going down from time to time.
>>>
>>> This is the setup on the bonding interface.
>>>
>>> auto bond0
>>> iface bond0 inet static
>>>         address 192.168.1.2
>>>         netmask 255.255.255.0
>>>         bond-slaves eth0.1002
>>>         bond-mode active-backup
>>>         bond-arp_validate 0
>>>         bond-arp_interval 5000
>>>         bond-arp_ip_target 192.168.1.1
>>>         pre-up ip link set eth0 up || true
>>>         pre-up ip link add link eth0 name eth0.1002 type vlan id 1002 || true
>>>         down ip link delete eth0.1002 || true
>>>
>> I believe that it is because dev_trans_start() returns 0 for 8021q devices and
>> so the calculations if the slave has transmitted are wrong, and the flip-flop
>> happens.
>> Please try the attached patch, it should resolve your issue (basically it gets
>> the dev_trans_start of the vlan's underlying device if a vlan is found).
>>
>> The patch is against Linus' tree.
>>
>> Cheers,
>> Nik
>>
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 07f257d4..6aac0ae 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -665,6 +665,16 @@ static int bond_check_dev_link(struct bonding *bond,
>> 	return reporting ? -1 : BMSR_LSTATUS;
>> }
>>
>> +static unsigned long bond_dev_trans_start(struct net_device *dev)
>> +{
>> +        struct net_device *real_dev = dev;
>> +
>> +        if (dev->priv_flags & IFF_802_1Q_VLAN)
>> +                real_dev = vlan_dev_real_dev(dev);
>> +
>> +        return dev_trans_start(real_dev);
>> +}
> 
> 	Should this handle nested VLANs?  E.g.,
> 
> static unsigned long bond_dev_trans_start(struct net_device *dev)
> {
> 	while (dev->priv_flags & IFF_802_1Q_VLAN)
> 		dev = vlan_dev_real_dev(dev);
> 
>         return dev_trans_start(dev);
> }
> 
> 	Also, this (ARP monitoring of a VLAN slave) has likely never
> worked, and therefore this change should be considered for -stable.
> 
> 	-J
> 
Yes, it should :-)
Thanks Jay, I'll re-submit it as a proper patch for -net in a bit.

Nik

* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-02 11:58 ` Nikolay Aleksandrov
  2013-08-02 15:49   ` Jay Vosburgh
@ 2013-08-04 10:45   ` Santiago Garcia Mantinan
  2013-08-05 10:26     ` Santiago Garcia Mantinan
  1 sibling, 1 reply; 18+ messages in thread
From: Santiago Garcia Mantinan @ 2013-08-04 10:45 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev

2013/8/2 Nikolay Aleksandrov <nikolay@redhat.com>:
> I believe that it is because dev_trans_start() returns 0 for 8021q devices and
> so the calculations if the slave has transmitted are wrong, and the flip-flop
> happens.
> Please try the attached patch, it should resolve your issue (basically it gets
> the dev_trans_start of the vlan's underlying device if a vlan is found).

Thanks, patched and compiling, I'll try today with my laptops and
tomorrow at the lab I had setup and then at the original machine.

I'll let you know how things go.

Regards.
-- 
Manty/BestiaTester -> http://manty.net

* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-05 10:26     ` Santiago Garcia Mantinan
@ 2013-08-05 10:26       ` Nikolay Aleksandrov
  2013-08-07  7:26         ` Santiago Garcia Mantinan
  0 siblings, 1 reply; 18+ messages in thread
From: Nikolay Aleksandrov @ 2013-08-05 10:26 UTC (permalink / raw)
  To: Santiago Garcia Mantinan; +Cc: netdev

On 08/05/2013 12:26 PM, Santiago Garcia Mantinan wrote:
> 2013/8/4 Santiago Garcia Mantinan <manty@manty.net>:
>> 2013/8/2 Nikolay Aleksandrov <nikolay@redhat.com>:
>>> I believe that it is because dev_trans_start() returns 0 for 8021q devices and
>>> so the calculations if the slave has transmitted are wrong, and the flip-flop
>>> happens.
>>> Please try the attached patch, it should resolve your issue (basically it gets
>>> the dev_trans_start of the vlan's underlying device if a vlan is found).
>>
>> Thanks, patched and compiling, I'll try today with my laptops and
>> tomorrow at the lab I had setup and then at the original machine.
>>
>> I'll let you know how things go.
> 
> Ok, initial tests seem to show that a bonding defined like I had on my
> very basic setup that I sent to the list is now working.
> 
> What doesn't seem to be working is if I set it up using bonding under
> the vlans and then doing a bond of those, I mean:
> 
> iface bond0 inet manual
>         bond-slaves eth0
>         bond-mode 802.3ad
>         bond-miimon 100
> ...
> iface bond2 inet static
>         address 192.168.1.2
>         netmask 255.255.255.0
>         bond-slaves bond0.1001 bond0.1002
>         bond-mode active-backup
>         bond-arp_validate 0
>         bond-arp_interval 2000
>         bond-arp_ip_target 192.168.1.1
> ...
> 
> Should this bond of bonds work?
> 
No, because we take the first non-vlan's interface trans_start after the patch
which in this case is a bonding interface which also doesn't update its
trans_start, i.e. bond over bond (or over vlans over bond) with arp monitoring
shouldn't work.

> I'm doing more tests to make sure that the basic eth0.1001 and
> eth0.1002 works 100% after finding that the bond of bonds wasn't
> working ok, just in case the basic was also failing, but at least the
> double bond is failing and basic bond seems to work ok.
> 
> Regards.
> 

* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-04 10:45   ` Santiago Garcia Mantinan
@ 2013-08-05 10:26     ` Santiago Garcia Mantinan
  2013-08-05 10:26       ` Nikolay Aleksandrov
  0 siblings, 1 reply; 18+ messages in thread
From: Santiago Garcia Mantinan @ 2013-08-05 10:26 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev

2013/8/4 Santiago Garcia Mantinan <manty@manty.net>:
> 2013/8/2 Nikolay Aleksandrov <nikolay@redhat.com>:
>> I believe that it is because dev_trans_start() returns 0 for 8021q devices and
>> so the calculations if the slave has transmitted are wrong, and the flip-flop
>> happens.
>> Please try the attached patch, it should resolve your issue (basically it gets
>> the dev_trans_start of the vlan's underlying device if a vlan is found).
>
> Thanks, patched and compiling, I'll try today with my laptops and
> tomorrow at the lab I had setup and then at the original machine.
>
> I'll let you know how things go.

Ok, initial tests seem to show that a bonding defined like I had on my
very basic setup that I sent to the list is now working.

What doesn't seem to be working is if I set it up using bonding under
the vlans and then doing a bond of those, I mean:

iface bond0 inet manual
        bond-slaves eth0
        bond-mode 802.3ad
        bond-miimon 100
...
iface bond2 inet static
        address 192.168.1.2
        netmask 255.255.255.0
        bond-slaves bond0.1001 bond0.1002
        bond-mode active-backup
        bond-arp_validate 0
        bond-arp_interval 2000
        bond-arp_ip_target 192.168.1.1
...

Should this bond of bonds work?

I'm doing more tests to make sure that the basic eth0.1001 and
eth0.1002 setup works 100%. After finding that the bond of bonds wasn't
working ok, I wanted to rule out that the basic one was also failing, but
so far the double bond fails and the basic bond seems to work ok.

Regards.
-- 
Manty/BestiaTester -> http://manty.net

* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-05 10:26       ` Nikolay Aleksandrov
@ 2013-08-07  7:26         ` Santiago Garcia Mantinan
  2013-08-07  7:39           ` Nikolay Aleksandrov
  0 siblings, 1 reply; 18+ messages in thread
From: Santiago Garcia Mantinan @ 2013-08-07  7:26 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev

2013/8/5 Nikolay Aleksandrov <nikolay@redhat.com>:
>>         bond-arp_validate 0
> No, because we take the first non-vlan's interface trans_start after the patch
> which in this case is a bonding interface which also doesn't update its
> trans_start, i.e. bond over bond (or over vlans over bond) with arp monitoring
> shouldn't work.

Ok, after several days of testing it seems to work ok if I go with
arp_validate 0. With arp_validate 1 the link failure count increases
from time to time. Is this expected?

Regards.
-- 
Manty/BestiaTester -> http://manty.net

* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-07  7:26         ` Santiago Garcia Mantinan
@ 2013-08-07  7:39           ` Nikolay Aleksandrov
  2013-08-07 10:44             ` Santiago Garcia Mantinan
  0 siblings, 1 reply; 18+ messages in thread
From: Nikolay Aleksandrov @ 2013-08-07  7:39 UTC (permalink / raw)
  To: Santiago Garcia Mantinan; +Cc: netdev

On 08/07/2013 09:26 AM, Santiago Garcia Mantinan wrote:
> 2013/8/5 Nikolay Aleksandrov <nikolay@redhat.com>:
>>>         bond-arp_validate 0
>> No, because we take the first non-vlan's interface trans_start after the patch
>> which in this case is a bonding interface which also doesn't update its
>> trans_start, i.e. bond over bond (or over vlans over bond) with arp monitoring
>> shouldn't work.
> 
> Ok, after several days of testing it seems to work ok if I go with
> arp_validate 0, going for arp_validate 1 would cause the link failure
> count to be increased from time to time, is this ok?
> 
> Regards.
> 
If arp_interval is changed then you have to disable the interface (e.g.
ifconfig bondX down) and enable it again (ifconfig bondX up), or set it
while the interface is down. Also there're pr_debug()s in bond_validate_arp
and bond_arp_rcv that you can enable to check if it's validated properly.

* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-07  7:39           ` Nikolay Aleksandrov
@ 2013-08-07 10:44             ` Santiago Garcia Mantinan
  2013-08-20  8:05               ` Santiago Garcia Mantinan
  0 siblings, 1 reply; 18+ messages in thread
From: Santiago Garcia Mantinan @ 2013-08-07 10:44 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev

2013/8/7 Nikolay Aleksandrov <nikolay@redhat.com>:
>> Ok, after several days of testing it seems to work ok if I go with
>> arp_validate 0, going for arp_validate 1 would cause the link failure
>> count to be increased from time to time, is this ok?
>>
> If arp_interval is changed then you have to disable the interface (e.g.
> ifconfig bondX down) and enable it again (ifconfig bondX up), or set it
> while the interface is down. Also there're pr_debug()s in bond_validate_arp
> and bond_arp_rcv that you can enable to check if it's validated properly.

I don't quite get what you mean here. If you want me to activate the
debug output and check that, please explain what I should do.

Aside from this, I've been doing some other tests, like enabling arp
validate with xor balance, and I've got a problem:

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (xor)
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 2000
ARP IP target/s (n.n.n.n form): 192.168.1.1

Slave Interface: eth1.1001
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:23:7d:30:bc:48
Slave queue ID: 0

Slave Interface: eth1.1002
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:23:7d:30:bc:48
Slave queue ID: 0

Slave Interface: eth2.1001
MII Status: down
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:23:7d:30:bc:48
Slave queue ID: 0

Slave Interface: eth2.1002
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:23:7d:30:bc:48

The arp target is only visible through vlan 1002, not vlan 1001, so
eth1.1001 and eth2.1001 should both be down; however, eth1.1001 is up.
I'm monitoring eth1.1001 and I can see the arp queries but no replies,
so... why is it up?

Is arp validation incompatible with xor balance? If so... why is
eth2.1001 set to down?

Regards.
-- 
Manty/BestiaTester -> http://manty.net

* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-07 10:44             ` Santiago Garcia Mantinan
@ 2013-08-20  8:05               ` Santiago Garcia Mantinan
  2013-08-20 10:11                 ` Nikolay Aleksandrov
  0 siblings, 1 reply; 18+ messages in thread
From: Santiago Garcia Mantinan @ 2013-08-20  8:05 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev

Hi!

Sorry it took me so long to reply back. I've been doing more tests on
xor mode and I see that arp monitoring is not working at all. I
haven't found any doc that says which modes should be compatible with
arp monitoring, maybe xor mode shouldn't be used at all.

My latest setup is a Linux machine with a couple of vlan interfaces
(eth2.1001 and eth2.1002), both with IP 192.168.1.1 (no bonding at all),
and another Linux machine running 3.11-rc3 with Nikolay's arp fix for
bonding and a bond configured like this:

iface bond0 inet static
        address 192.168.1.2
        netmask 255.255.255.0
        bond-slaves eth0.1001 eth0.1002 eth1.1001 eth1.1002
        bond-mode balance-xor
        bond-arp_validate 0
        bond-arp_interval 2000
        bond-arp_ip_target 192.168.1.1

A silly switch connects the couple of ethernets of the machine with
the bond to the interface of the not bonded machine.

What I saw was that the bonded machine didn't detect the ifconfig down
of the interfaces of the not bonded machine at all. That drove me to
the hypothesis that the bonded machine was considering its own traffic
(there was no traffic but the arp requests of the bonding) as
indication that the link was ok.

To test the hypothesis, when the not bonded machine (192.168.1.1)
which is the target for arp requests was unplugged and the bonding was
seeing all interfaces up (not detecting that the other machine was not
responding) I unplugged one of the bonded interfaces and all 4 slaves
went to down, then if I replugged it all 4 would go up.

Maybe this is something to be expected due to arp monitoring not
working with balance-xor, but I haven't found any doc saying this.

If you need the debug info for this I can send it, but the events show
nothing, as there is no event saying that the link is lost or anything
:-(

Regards.
-- 
Manty/BestiaTester -> http://manty.net

* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-20  8:05               ` Santiago Garcia Mantinan
@ 2013-08-20 10:11                 ` Nikolay Aleksandrov
  2013-08-21  7:39                   ` Santiago Garcia Mantinan
  0 siblings, 1 reply; 18+ messages in thread
From: Nikolay Aleksandrov @ 2013-08-20 10:11 UTC (permalink / raw)
  To: Santiago Garcia Mantinan; +Cc: netdev

On 08/20/2013 10:05 AM, Santiago Garcia Mantinan wrote:
> Hi!
> 
> Sorry it took me so long to reply back. I've been doing more tests on
> xor mode and I see that arp monitoring is not working at all. I
> haven't found any doc that says which modes should be compatible with
> arp monitoring, maybe xor mode shouldn't be used at all.
> 
> My last setup is a Linux with a couple of vlans both interfaces
> (eth2.1001 and eth2.1002) with IP 192.168.1.1 (no bonding at all) and
> another Linux machine with a 3.11-rc3 with Nikolay's arp fix for
> bonding and a bond configured like this:
> 
> iface bond0 inet static
>         address 192.168.1.2
>         netmask 255.255.255.0
>         bond-slaves eth0.1001 eth0.1002 eth1.1001 eth1.1002
>         bond-mode balance-xor
>         bond-arp_validate 0
>         bond-arp_interval 2000
>         bond-arp_ip_target 192.168.1.1
> 
> A silly switch connects the couple of ethernets of the machine with
> the bond to the interface of the not bonded machine.
> 
> What I saw was that the bonded machine didn't detect the ifconfig down
> of the interfaces of the not bonded machine at all. That drove me to
> the hypothesis that the bonded machine was considering its own traffic
> (there was no traffic but the arp requests of the bonding) as
> indication that the link was ok.
> 
> To test the hypothesis, when the not bonded machine (192.168.1.1)
> which is the target for arp requests was unplugged and the bonding was
> seeing all interfaces up (not detecting that the other machine was not
> responding) I unplugged one of the bonded interfaces and all 4 slaves
> went to down, then if I replugged it all 4 would go up.
> 
> Maybe this is something to be expected due to arp monitoring not
> working with balance-xor, but I haven't found any doc saying this.
> 
> If you need the debug info for this I can send it, but the events show
> nothing, as there are no event saying that link is lost or anything
> :-(
> 
> Regards.
> 
Hi,
This setup works for me, what might be wrong with your setup is that you connect
all 4 ports to a "dumb" switch, and you have the same vlans over the real
devices that are connected so they see each other's packets and the port's
last_rx gets updated so they stay up.
I tried your setup with a "smart" switch so the ports couldn't see each other
and only the one that saw 192.168.1.1 was up, and the moment 192.168.1.1 went
down - the port went down in the bonding.
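
The effect described above (a slave staying up because any received frame
refreshes last_rx) can be pictured with a toy user-space model. This is only
a sketch of the explanation, not the driver's logic: struct mock_slave, the
frame kinds, the 'validate' flag and the simplified liveness test are all
hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

enum frame_kind { FRAME_ARP_REPLY_FROM_TARGET, FRAME_OTHER };

/* Hypothetical stand-in for a bonding slave; only the receive stamp. */
struct mock_slave {
	unsigned long last_rx;
};

/*
 * Without validation every received frame refreshes last_rx; with the
 * (hypothetical) 'validate' flag set, only ARP replies from the target do.
 */
void mock_receive(struct mock_slave *s, enum frame_kind kind,
		  bool validate, unsigned long now)
{
	if (!validate || kind == FRAME_ARP_REPLY_FROM_TARGET)
		s->last_rx = now;
}

/* Simplified liveness test: something was received within 'delta' ticks. */
bool mock_slave_up(const struct mock_slave *s, unsigned long now,
		   unsigned long delta)
{
	return now - s->last_rx <= delta;
}

void demo_last_rx(void)
{
	struct mock_slave plain = { 0 }, strict = { 0 };
	unsigned long now = 10000, delta = 500;

	/* The target is silent, but a neighbouring port's frame is seen. */
	mock_receive(&plain, FRAME_OTHER, false, now);
	mock_receive(&strict, FRAME_OTHER, true, now);

	assert(mock_slave_up(&plain, now + 100, delta));	/* stays up */
	assert(!mock_slave_up(&strict, now + 100, delta));	/* goes down */
}
```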

Cheers,
 Nik

* Re: bonding + arp monitoring fails if interface is a vlan
  2013-08-20 10:11                 ` Nikolay Aleksandrov
@ 2013-08-21  7:39                   ` Santiago Garcia Mantinan
  0 siblings, 0 replies; 18+ messages in thread
From: Santiago Garcia Mantinan @ 2013-08-21  7:39 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev

Hi!

I think we have to clarify the setup...

>> iface bond0 inet static
>>         address 192.168.1.2
>>         netmask 255.255.255.0
>>         bond-slaves eth0.1001 eth0.1002 eth1.1001 eth1.1002
>>         bond-mode balance-xor
>>         bond-arp_validate 0
>>         bond-arp_interval 2000
>>         bond-arp_ip_target 192.168.1.1

> This setup works for me, what might be wrong with your setup is that you connect
> all 4 ports to a "dumb" switch,

What I have is three ports, not 4, I have two network cards on the
bonded machine and one on the not bonded machine, so I have three
ports. On the not bonded machine I configure the two vlan interfaces
over the same physical ethernet like this:
ifconfig eth2.1001 192.168.1.1
ifconfig eth2.1002 192.168.1.1

> and you have the same vlans over the real
> devices that are connected so they see each other's packets and the port's
> last_rx gets updated so they stay up.

I'd like to clarify this a bit. Reading the bonding.txt file (the
howto), especially the arp_all_targets option (which I haven't set on
my setup), one would think that an arp reply from at least one of the
specified targets has to be received for the link to be considered in
a good state, not just any traffic, especially if that traffic is
generated by your very own bonding driver. Isn't that the case?

What I'm trying to check in the real-world scenario is whether the gw,
which is at a remote location, is available, but I can have local
traffic that could be incrementing the counters.

> I tried your setup with a "smart" switch so the ports couldn't see each other
> and only the one that saw 192.168.1.1 was up, and the moment 192.168.1.1 went
> down - the port went down in the bonding.

I think that the problem here is not the "dumb" or "smart" switch. I
believe we somehow have different setups. Please, if anything about my
setup is unclear (two machines, one with the bonding config I
explained, and the other with one card and the ifconfig commands shown
above), just let me know.

My first "dumb" switch was the switch of a soho adsl wifi router, then
I tried a soho "dumb" 8-port 10/100 switch, then I tried an old
Cabletron SSR2000 where I had to define the two vlans on the three
ports and make these ports tagged ports, then I tried an Enterasys
B5 (where I also had to specify that these ports had those vlans as
egress and tagged). On the smart switches the slaves were considered
to be down while the vlans were not configured, as the switch was
dropping all traffic, but once the vlans were set up the slaves came up.

The behaviour I get is the same on "dumb" and "smart" switches: while
the eth2.1001 and eth2.1002 interfaces are up everything works as
expected, but then I run:
ifconfig eth2.1001 0.0.0.0 down
ifconfig eth2.1002 0.0.0.0 down
and the bonded machine still sees all the slaves up, even though the
tcpdump I run on eth2 on the target machine shows that all 4 requests
are arriving and none of them is being answered.
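
To compare what the bonding driver reports for each slave while the
target is down, the per-slave fields of /proc/net/bonding/bond0 can be
pulled out with a few lines of Python. This is only a minimal sketch,
assuming the usual "Slave Interface", "MII Status" and "Link Failure
Count" lines; the function name slave_status and the SAMPLE text are
made up for illustration and are not output from the machines in this
thread.

```python
def slave_status(proc_text: str) -> dict:
    """Map each slave name to its MII status and link failure count."""
    result = {}
    current = None  # name of the slave section being parsed, if any
    for line in proc_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
            result[current] = {"status": None, "failures": None}
        elif current is not None and key == "MII Status":
            result[current]["status"] = value
        elif current is not None and key == "Link Failure Count":
            result[current]["failures"] = int(value)
    return result


# Made-up excerpt in the layout of /proc/net/bonding/bond0 (illustration only).
SAMPLE = """\
MII Status: up

Slave Interface: eth0.1001
MII Status: up
Link Failure Count: 0

Slave Interface: eth1.1001
MII Status: down
Link Failure Count: 3
"""

if __name__ == "__main__":
    for name, info in slave_status(SAMPLE).items():
        print(name, info["status"], info["failures"])
```

On a real machine the input would come from
open("/proc/net/bonding/bond0").read() instead of SAMPLE. The bond-level
"MII Status" line that appears before the first slave section is
skipped on purpose.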

I have checked the counters you mentioned and indeed they are
increasing, on both "dumb" and "smart" switches (note that I haven't
defined any bond on the switch side). I believe that either switch has
to forward what comes from eth0.1001 (connected to switch port X) to
eth1.1001 (connected to switch port Y): these are broadcast messages
and I haven't defined any bonding on the switch, so it has to forward
what comes in on port X to port Y; not doing so would break broadcast
for a lot of setups. What doesn't make sense to me is the assumption
that increasing counters mean we have a good link when none of the
specified targets is replying.
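
The effect described above can be reproduced with a toy model. This is
a sketch under stated assumptions, not the kernel's code: Frame, Slave
and simulate are hypothetical names, the "validate" flag only stands
for the stricter rule one might expect from the documentation (count
nothing but arp replies from the target), and the switch is assumed to
flood each slave's broadcast arp request to the sibling slave on the
same vlan. The target never answers in this model.

```python
from dataclasses import dataclass

ARP_INTERVAL_MS = 2000        # bond-arp_interval from the quoted config
TARGET_IP = "192.168.1.1"     # bond-arp_ip_target from the quoted config
BOND_IP = "192.168.1.2"       # address of bond0 from the quoted config


@dataclass
class Frame:
    kind: str        # "arp_request", "arp_reply" or "other"
    sender_ip: str   # IP address the frame claims to come from


@dataclass
class Slave:
    name: str
    last_rx_ms: int = 0

    def receive(self, frame: Frame, now_ms: int, validate: bool) -> None:
        # Loose rule: any received frame refreshes last_rx.
        # Strict rule: only an arp reply from the target does.
        from_target = frame.kind == "arp_reply" and frame.sender_ip == TARGET_IP
        if not validate or from_target:
            self.last_rx_ms = now_ms

    def is_up(self, now_ms: int) -> bool:
        # Toy criterion: something was received within the last interval.
        return now_ms - self.last_rx_ms <= ARP_INTERVAL_MS


def simulate(validate: bool, rounds: int = 5) -> dict:
    """Target is down; each probe is flooded to the sibling slave only."""
    siblings = {
        "eth0.1001": "eth1.1001", "eth1.1001": "eth0.1001",
        "eth0.1002": "eth1.1002", "eth1.1002": "eth0.1002",
    }
    slaves = {name: Slave(name) for name in siblings}
    now = 0
    for _ in range(rounds):
        now += ARP_INTERVAL_MS
        for sender, receiver in siblings.items():
            probe = Frame("arp_request", BOND_IP)  # our own broadcast probe
            slaves[receiver].receive(probe, now, validate)
    return {name: slave.is_up(now) for name, slave in slaves.items()}


if __name__ == "__main__":
    print("any traffic counts:", simulate(validate=False))
    print("only target replies count:", simulate(validate=True))
```

With the loose rule every slave stays up because it keeps hearing its
sibling's probes; with the strict rule all of them go down once the
target stops replying, which is the behaviour I was expecting.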

I don't know what else to add to clarify what is going on. Please, if
something is not clear, ask me.

Thanks for your replies.

Regards.
-- 
Manty/BestiaTester -> http://manty.net
