netdev.vger.kernel.org archive mirror
* Strange routing with VRF and 5.2.7+
@ 2019-09-10 22:17 Ben Greear
  2019-09-11  1:08 ` Ben Greear
  0 siblings, 1 reply; 6+ messages in thread
From: Ben Greear @ 2019-09-10 22:17 UTC (permalink / raw)
  To: netdev

Today we were testing creating 200 virtual station vdevs on ath9k, and using
VRF for the routing.

This really slows down the machine in question.

During the minutes that it takes to bring these up and configure them,
we lose network connectivity on the management port.

If I do 'ip route show', it just shows the default route out of eth0, and
the subnet route.  But, if I try to ping the gateway, I get an ICMP error
coming back from the gateway of one of the virtual stations (which should be
safely using VRFs and so not in use when I do a plain 'ping' from the shell).

I tried running tshark on eth0 in the background and running ping, and it captures
no packets leaving eth0.

After some time (during which my various scripts will be (re)configuring
vrfs and stations and related vrf routing tables and such,
but should *not* be messing with the main routing table), things
suddenly start working again.

I am curious if anyone has seen anything similar or has suggestions for more
ways to debug this.  It seems reproducible, but it is a pain to
debug.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: Strange routing with VRF and 5.2.7+
  2019-09-10 22:17 Strange routing with VRF and 5.2.7+ Ben Greear
@ 2019-09-11  1:08 ` Ben Greear
  2019-09-20 15:57   ` Ben Greear
  0 siblings, 1 reply; 6+ messages in thread
From: Ben Greear @ 2019-09-11  1:08 UTC (permalink / raw)
  To: netdev

On 9/10/19 3:17 PM, Ben Greear wrote:
> Today we were testing creating 200 virtual station vdevs on ath9k, and using
> VRF for the routing.

Looks like the same issue happens w/out VRF, but there I have oodles of routing
rules, so it is an area ripe for failure.

Will upgrade to 5.2.14+ and retest, and try 4.20 as well....

Thanks,
Ben



-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: Strange routing with VRF and 5.2.7+
  2019-09-11  1:08 ` Ben Greear
@ 2019-09-20 15:57   ` Ben Greear
  2019-09-22 19:23     ` David Ahern
  0 siblings, 1 reply; 6+ messages in thread
From: Ben Greear @ 2019-09-20 15:57 UTC (permalink / raw)
  To: netdev

On 9/10/19 6:08 PM, Ben Greear wrote:
> On 9/10/19 3:17 PM, Ben Greear wrote:
>> Today we were testing creating 200 virtual station vdevs on ath9k, and using
>> VRF for the routing.
> 
> Looks like the same issue happens w/out VRF, but there I have oodles of routing
> rules, so it is an area ripe for failure.
> 
> Will upgrade to 5.2.14+ and retest, and try 4.20 as well....

Turns out, this was ipsec (strongswan) inserting a rule that pointed to a table
that we then used for a vrf w/out realizing the rule was added.

Stopping strongswan and/or reconfiguring how routing tables are assigned
resolved the issue.
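
A clash like this is visible in 'ip rule show' rather than 'ip route show',
since the latter prints only the main table unless a table is named. Below is
a minimal Python sketch of how it could be spotted: it scans text in the
'ip rule show' format for rules whose lookup target is a table that is also
assigned to a VRF. The sample rules, the VRF name vrf_sta0, and table 220
(strongSwan's documented default; this thread does not say which table was
actually involved) are assumptions for illustration.

```python
import re

# Sample text in the format printed by `ip rule show`. The numbers are
# placeholders; 220 mimics strongSwan's default table and rule priority.
SAMPLE_RULES = """\
0:	from all lookup local
220:	from all lookup 220
1000:	from all lookup [l3mdev-table]
32766:	from all lookup main
32767:	from all lookup default
"""

# priority, selector text, and the table named after "lookup"
RULE_RE = re.compile(r"^(\d+):\s+(.*?)\blookup\s+(\S+)")


def parse_rules(text):
    """Return (priority, selector, table) tuples for each rule line."""
    rules = []
    for line in text.splitlines():
        m = RULE_RE.match(line.strip())
        if m:
            rules.append((int(m.group(1)), m.group(2).strip(), m.group(3)))
    return rules


def find_conflicts(rule_text, vrf_tables):
    """List rules that point at a table also used by a VRF.

    vrf_tables maps a (hypothetical) VRF name to its table ID.
    """
    by_table = {str(table): name for name, table in vrf_tables.items()}
    return [
        (prio, selector, table, by_table[table])
        for prio, selector, table in parse_rules(rule_text)
        if table in by_table
    ]


if __name__ == "__main__":
    for hit in find_conflicts(SAMPLE_RULES, {"vrf_sta0": 220}):
        print("rule %d (%s) looks up table %s, also used by %s" % hit)
```

On a live system the rule text would come from 'ip rule show' and the VRF
table IDs from however the VRFs were created; both are left to the caller here.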

Thanks,
Ben



-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: Strange routing with VRF and 5.2.7+
  2019-09-20 15:57   ` Ben Greear
@ 2019-09-22 19:23     ` David Ahern
  2019-09-30 18:45       ` Ben Greear
  0 siblings, 1 reply; 6+ messages in thread
From: David Ahern @ 2019-09-22 19:23 UTC (permalink / raw)
  To: Ben Greear, netdev

On 9/20/19 9:57 AM, Ben Greear wrote:
> On 9/10/19 6:08 PM, Ben Greear wrote:
>> On 9/10/19 3:17 PM, Ben Greear wrote:
>>> Today we were testing creating 200 virtual station vdevs on ath9k,
>>> and using
>>> VRF for the routing.
>>
>> Looks like the same issue happens w/out VRF, but there I have oodles
>> of routing
>> rules, so it is an area ripe for failure.
>>
>> Will upgrade to 5.2.14+ and retest, and try 4.20 as well....
> 
> Turns out, this was ipsec (strongswan) inserting a rule that pointed to
> a table
> that we then used for a vrf w/out realizing the rule was added.
> 
> Stopping strongswan and/or reconfiguring how routing tables are assigned
> resolved the issue.
> 

Hi Ben:

Since you are the pioneer with vrf and ipsec, can you add an ipsec
section with some notes to Documentation/networking/vrf.txt?



* Re: Strange routing with VRF and 5.2.7+
  2019-09-22 19:23     ` David Ahern
@ 2019-09-30 18:45       ` Ben Greear
  2019-10-14 17:33         ` Ben Greear
  0 siblings, 1 reply; 6+ messages in thread
From: Ben Greear @ 2019-09-30 18:45 UTC (permalink / raw)
  To: David Ahern, netdev

On 9/22/19 12:23 PM, David Ahern wrote:
> On 9/20/19 9:57 AM, Ben Greear wrote:
>> On 9/10/19 6:08 PM, Ben Greear wrote:
>>> On 9/10/19 3:17 PM, Ben Greear wrote:
>>>> Today we were testing creating 200 virtual station vdevs on ath9k,
>>>> and using
>>>> VRF for the routing.
>>>
>>> Looks like the same issue happens w/out VRF, but there I have oodles
>>> of routing
>>> rules, so it is an area ripe for failure.
>>>
>>> Will upgrade to 5.2.14+ and retest, and try 4.20 as well....
>>
>> Turns out, this was ipsec (strongswan) inserting a rule that pointed to
>> a table
>> that we then used for a vrf w/out realizing the rule was added.
>>
>> Stopping strongswan and/or reconfiguring how routing tables are assigned
>> resolved the issue.
>>
> 
> Hi Ben:
> 
> Since you are the pioneer with vrf and ipsec, can you add an ipsec
> section with some notes to Documentation/networking/vrf.txt?

I need to do some more testing; an initial attempt to reproduce my working
config on another system did not work properly, and I have not yet dug into
it.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: Strange routing with VRF and 5.2.7+
  2019-09-30 18:45       ` Ben Greear
@ 2019-10-14 17:33         ` Ben Greear
  0 siblings, 0 replies; 6+ messages in thread
From: Ben Greear @ 2019-10-14 17:33 UTC (permalink / raw)
  To: David Ahern, netdev

On 9/30/19 11:45 AM, Ben Greear wrote:
> On 9/22/19 12:23 PM, David Ahern wrote:
>> On 9/20/19 9:57 AM, Ben Greear wrote:
>>> On 9/10/19 6:08 PM, Ben Greear wrote:
>>>> On 9/10/19 3:17 PM, Ben Greear wrote:
>>>>> Today we were testing creating 200 virtual station vdevs on ath9k,
>>>>> and using
>>>>> VRF for the routing.
>>>>
>>>> Looks like the same issue happens w/out VRF, but there I have oodles
>>>> of routing
>>>> rules, so it is an area ripe for failure.
>>>>
>>>> Will upgrade to 5.2.14+ and retest, and try 4.20 as well....
>>>
>>> Turns out, this was ipsec (strongswan) inserting a rule that pointed to
>>> a table
>>> that we then used for a vrf w/out realizing the rule was added.
>>>
>>> Stopping strongswan and/or reconfiguring how routing tables are assigned
>>> resolved the issue.
>>>
>>
>> Hi Ben:
>>
>> Since you are the pioneer with vrf and ipsec, can you add an ipsec
>> section with some notes to Documentation/networking/vrf.txt?
> 
> I need to do some more testing; an initial attempt to reproduce my working
> config on another system did not work properly, and I have not yet dug into
> it.

I'm still grinding out the bugs...  Here is my current quandary.

In the VRF I have the 'real' device, say eth4 with IP 192.168.5.5.  This talks to
the VPN gateway device at 192.168.5.1.

When I add the xfrm, it is given the address 192.168.10.100.

I need all traffic routed out of the vrf to use the xfrm address as source IP,
except that eth4 still needs to be able to talk to the 192.168.5.1 device
(I think?)

Evidently, adding a route like the one below does the trick, at least in a
non-vrf setup, with the route in its own table that is queried after the
'local' routing table but before the others, by way of a fairly generic rule:

default via 192.168.5.1 dev enp1s0 proto static src 192.168.10.100
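
To illustrate why a table consulted early takes over the lookup, here is a
simplified Python model of policy routing: rules are walked in priority order,
and the first table holding a matching route wins (longest prefix within that
table). It ignores rule selectors and treats every rule as 'from all'. The
priority 220, the table name 'ipsec', the contents of the local and main
tables, and the 198.51.100.9 test address are made up for the sketch.

```python
import ipaddress


def lookup(dst, rules, tables):
    """Return (table, route) for dst, or None if no table has a match.

    rules:  list of (priority, table_name), all treated as "from all"
    tables: {table_name: [(prefix, route_description), ...]}
    """
    addr = ipaddress.ip_address(dst)
    for _prio, table in sorted(rules):
        matches = []
        for prefix, desc in tables.get(table, []):
            net = ipaddress.ip_network(prefix)
            if addr in net:
                matches.append((net.prefixlen, desc))
        if matches:
            # Longest prefix wins inside one table; later tables are never
            # consulted once an earlier one has produced a route.
            return table, max(matches)[1]
    return None


TABLES = {
    "local": [("192.168.5.5/32", "local 192.168.5.5 dev enp1s0")],
    "ipsec": [("0.0.0.0/0",
               "default via 192.168.5.1 dev enp1s0 src 192.168.10.100")],
    "main": [("0.0.0.0/0", "default via 192.168.5.1 dev enp1s0"),
             ("192.168.5.0/24", "192.168.5.0/24 dev enp1s0 scope link")],
}
RULES = [(0, "local"), (220, "ipsec"), (32766, "main")]

if __name__ == "__main__":
    print(lookup("198.51.100.9", RULES, TABLES))
    print(lookup("192.168.5.5", RULES, TABLES))
```

In this model every non-local destination is answered by the 'ipsec' table
before 'main' is ever consulted, so the src hint on that default route applies.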

I am guessing that in VRF world, I can get rid of the rule, and replace the
existing default route (given to eth4 when it does DHCP or is statically assigned)
with something like the above.  And, maybe I need a special route for the VPN
gateway itself as destination so that ipsec logic on eth4 can still talk to it?

(I am thinking of the case where the VPN gateway is not on the local subnet,
so we would need a special route to reach it?)
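
The guess above could be written down as the commands below. This is an
untested Python sketch that only builds the command strings and runs nothing;
whether the kernel accepts this src hint inside a VRF table, and whether the
host route is needed at all, are exactly the open questions. Table ID 10 and
the off-subnet peer 203.0.113.7 are hypothetical; eth4, 192.168.5.1 and
192.168.10.100 come from the description above.

```python
def vrf_ipsec_route_cmds(table, dev, gateway, xfrm_src, vpn_peer=None):
    """Build (but do not run) iproute2 commands for the VRF's table.

    table    -- routing table ID assigned to the VRF (hypothetical here)
    vpn_peer -- optional VPN gateway address that is not on the local subnet
    """
    cmds = [
        # Default route for the VRF, preferring the xfrm address as source.
        f"ip route replace default via {gateway} dev {dev} "
        f"src {xfrm_src} table {table}",
    ]
    if vpn_peer is not None:
        # Host route without a src hint, so traffic to the VPN gateway
        # itself keeps using the address configured on the real device.
        cmds.append(
            f"ip route replace {vpn_peer}/32 via {gateway} dev {dev} "
            f"table {table}"
        )
    return cmds


if __name__ == "__main__":
    for cmd in vrf_ipsec_route_cmds(10, "eth4", "192.168.5.1",
                                    "192.168.10.100", vpn_peer="203.0.113.7"):
        print(cmd)
```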

Any insight is welcome.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

