netdev.vger.kernel.org archive mirror
From: Uwe Koziolek <uwe.koziolek@redknee.com>
To: Jarod Wilson <jarod@redhat.com>,
	Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: Veaceslav Falico <vfalico@gmail.com>,
	<linux-kernel@vger.kernel.org>,
	Andy Gospodarek <gospo@cumulusnetworks.com>,
	<netdev@vger.kernel.org>
Subject: Re: [PATCH] net/bonding: send arp in interval if no active slave
Date: Wed, 2 Sep 2015 01:15:33 +0200
Message-ID: <55E63195.400@redknee.com>
In-Reply-To: <55E4D35B.4090502@redhat.com>

On Tue, 01.09.2015 at 00:21 +0200 Jarod Wilson wrote:
> On 2015-08-17 4:51 PM, Uwe Koziolek wrote:
>> On Mon, Aug 17, 2015 at 09:14PM +0200, Jay Vosburgh wrote:
>>> Uwe Koziolek <uwe.koziolek@redknee.com> wrote:
>>>
>>>> On 2015-08-17 07:12 PM, Jarod Wilson wrote:
> ...
>>>>> Uwe, can you perhaps further enlighten us as to what num_grat_arp
>>>>> settings were tried that didn't help? I'm still of the mind that if
>>>>> num_grat_arp *didn't* help, we probably need to do something keyed
>>>>> off num_grat_arp.
>>>> The bonding slaves are connected to highly available switches; each
>>>> slave is connected to a different switch. When the bond starts, only
>>>> the selected slave sends one ARP request. If a matching ARP response
>>>> is received, this slave and the bond go into state up and send the
>>>> gratuitous ARPs...
>>>> But if no ARP reply is received, the next slave is selected.
>>>> With most newer switches that are not overloaded and have no other
>>>> software bugs, or with a single-switch configuration, you would get
>>>> an ARP response to the first ARP request.
>>>> But in a high-availability configuration with imperfect switches
>>>> like the HP ProCurve 54xx, and also with some Cisco models, you may
>>>> not get a response to the first ARP request.
>>>>
>>>> I have seen network snoops in which the switches do not respond to
>>>> the first ARP request on slave 1; the second ARP request was sent on
>>>> slave 2, but the response was received on slave 1, and all following
>>>> ARP requests were answered on the wrong slave for a longer time.
>>>     Could you elaborate on the exact "high availability
>>> configuration" here, including the model(s) of switch(es) involved?
>>>
>>>     Is this some kind of race between the switch or switches
>>> updating the forwarding tables and the bond flip flopping between the
>>> slaves?  E.g., source MAC from ARP sent on slave 1 is used to populate
>>> the forwarding table, but (for whatever reason) there is no reply.  ARP
>>> on slave 2 is sent (using the same source MAC, unless you set
>>> fail_over_mac), but forwarding tables still send that MAC to slave 1,
>>> so reply is sent there.
>> High availability:
>> 2 managed switches with routing capabilities have an interconnect.
>> One slave of a bonding interface is connected to the first switch, the
>> second slave is connected to the other switch.
>> The switch models are HP ProCurve 5406 and HP ProCurve 5412. As far as I
>> remember, HP E 3500 and E 3800 are also affected; for the affected
>> Cisco models I can't answer today.
>> No affected single-switch configurations were seen.
>>
>> Yes, race conditions with delayed updates of the forwarding tables are
>> a well-matching explanation for the problem.
>>
>>>> The proposed change sends up to 3 ARP requests on a down bond using
>>>> the same slave, delayed by arp_interval.
>>>> Using problematic switches, I have seen the ARP response on the right
>>>> slave at the latest on the second ARP request, so the bond goes into
>>>> state up.
>>>>
>>>> How it works:
>>>> Bonds in up state are handled at the beginning of the
>>>> bond_ab_arp_probe procedure; the other part of this procedure handles
>>>> the slave change.
>>>> The proposed change bypasses the slave change for 2 additional calls
>>>> of bond_ab_arp_probe.
>>>> Now the retries are available not only for an up bond; they are also
>>>> implemented for a down bond.
>>>     Does this delay failover or bringup on switches that are not
>>> "problematic"?  I.e., if arp_interval is, say, 1000 (1 second), will
>>> this impact failover / recovery times?
>>>
>>>     -J
>> It depends.
>> Failover times are not impacted; this is handled differently.
>> Only the transition from a down bonding interface (bond and all slaves
>> are down) to the state up can be increased by up to 2 times
>> arp_interval, if the selected interface did not come up. If
>> well-working switches are used, and everything else is also OK, there
>> is no impact.
>
> Jay, any further thoughts on this given Uwe's reply? Uwe, did you have
> a chance to get affected Cisco model numbers too?
>
The affected Cisco model was a C3750.
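
For readers following the thread, the retry behaviour quoted above (up to 3 ARP requests on the same slave of a down bond, one per arp_interval, before the next slave is selected) can be sketched roughly as below. This is a standalone illustration only, not the patch and not kernel code: struct probe_state, next_probe_slave(), extra_delay_ms() and PROBES_PER_SLAVE are hypothetical names invented for this sketch, and the real logic lives in bond_ab_arp_probe.

```c
#include <assert.h>

/* Hypothetical constant: 1 initial ARP request plus 2 retries on the
 * same slave, matching the "up to 3 arp requests" described in the thread. */
#define PROBES_PER_SLAVE 3

/* Hypothetical state for a bond that is down; not the kernel's struct bonding. */
struct probe_state {
	int cur_slave;   /* index of the slave currently being probed */
	int probes_sent; /* ARP requests sent on cur_slave since it was selected */
	int num_slaves;  /* number of slaves in the bond */
};

/* Model of one probe pass, assumed to run once per arp_interval while the
 * bond is down. Returns the index of the slave the next ARP request goes
 * out on. The slave change is skipped until PROBES_PER_SLAVE requests
 * have been sent on the current slave. */
static int next_probe_slave(struct probe_state *st)
{
	assert(st->num_slaves > 0);

	if (st->probes_sent >= PROBES_PER_SLAVE) {
		/* Retries exhausted: move on to the next slave. */
		st->cur_slave = (st->cur_slave + 1) % st->num_slaves;
		st->probes_sent = 0;
	}
	st->probes_sent++;
	return st->cur_slave;
}

/* Worst-case additional time spent on a slave that never comes up,
 * compared with sending a single ARP request per slave. */
static unsigned int extra_delay_ms(unsigned int arp_interval_ms)
{
	return (PROBES_PER_SLAVE - 1) * arp_interval_ms;
}
```

With two slaves, the first three calls to next_probe_slave() stay on slave 0 and the fourth moves to slave 1. With arp_interval set to 1000, extra_delay_ms(1000) gives 2000, which is the "up to 2 times arp_interval" increase mentioned in the quoted reply.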
