* Re: GRE-NAT broken - SOLVED
@ 2018-01-31  5:29 Grant Taylor
  2018-01-31  5:33 ` Grant Taylor
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: Grant Taylor @ 2018-01-31  5:29 UTC (permalink / raw)
  To: lartc


[-- Attachment #1.1: Type: text/plain, Size: 5184 bytes --]

On 01/30/2018 03:37 PM, Grant Taylor wrote:
> It seems as if something is intercepting the packets.  -  I doubt that 
> it's the NAT module, but I can't rule it out.

Well, I think I ran into the same problem in my tests.

Spoiler:  I did manage to overcome it.

I think the Connection Tracking was part of my (initial?) problem.

> Wait.  tcpdump shows that packets are entering one network interface but 
> they aren't leaving another network interface?

I was seeing this behavior too.

> That sounds like something is filtering the packets.

I think connection tracking (thus NAT) was (at least) part of the culprit.

> I feel like the kicker is that the traffic is never making it out of the 
> local system to the far side.  As such the far side never gets anything, 
> much less replies.

I don't know if this was the case for my testing or not.  I did all of 
my testing from the far side in.

> Ya, the [UNREPLIED] bothers me.  As does the fact that you aren't seeing 
> the traffic leaving the host's external interface.

The [UNREPLIED] was the kicker for me.

> I'd look more into the TRACE option (target) that you seem to have 
> enabled in the raw table.  That should give you more information about 
> the packets flowing through the kernel.

I ended up not using TRACE.

I'm not sure why I did a "conntrack -D", but as soon as I did, my long 
running ping started working.

Upon retesting I can confirm that "conntrack -D" was required to make 
things work.

Further testing and using "conntrack -L" showed that there were some 
connection tracking states that were in an [UNREPLIED] state.  I think 
that "conntrack -D" cleared the stale connections and allowed things to 
start working.
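
A narrower alternative  -  just a sketch, in case someone would rather 
not flush the entire table  -  is to touch only the GRE entries:

# conntrack -L -p gre
# conntrack -D -p gre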

> My hunch is that the packets aren't making it out onto the wire for some 
> reason.  Thus the lack of reply.

After the testing that I did, I suspect that packets did make it onto 
the wire, but were swallowed by connection tracking, thus NAT as you had 
originally thought.

> I'll see if I can't throw together a PoC in Network namespaces this 
> evening to evaluate if NATing GRE works.  -  I'd like to test NATing 
> different sets of endpoints (1:1) and NATing multiple remote endpoints 
> to one local endpoint (many:1).

I threw together a Proof of Concept using Network Namespaces, using a 
pair of OVSs (bri1 & bri2) and a pair of vEths (between R3 / R2 and R2 / 
R1).

I was able to initially establish GRE tunnels between H1 / H3 and 
H2 / H4.

After figuring out the connection tracking problem I was also able to 
bring up additional GRE tunnels between H1 / H4 and H2 / H3.

Take a look at the attached GRE-NAT.sh script.  -  I do take some 
liberties and set up aliases to make things easier.  (Read: I'm lazy and 
don't want to type any more characters than I have to.)

alias vsctl='ovs-vsctl'
alias h1='ip netns exec h1'
alias h2='ip netns exec h2'
alias h3='ip netns exec h3'
alias h4='ip netns exec h4'
alias r1='ip netns exec r1'
alias r2='ip netns exec r2'
alias r3='ip netns exec r3'

The network between:

H1 / H2 / R3 is Test-Net-1, 192.0.2.0/24
R3 / R2 is Test-Net-2, 198.51.100.0/24
R2 / R1 is Test-Net-3, 203.0.113.0/24
R1 / H3 / H4 is RFC 1918 private, 192.168.0.0/24

I addressed the GRE tunnels as RFC 1918 private, 10.<Left #>.<Right 
#>.<Device #>/24.

R3 & R1 are numbered the way that they are so that their device # 
doesn't conflict with something local.

I did manage to get the PoC to work without needing to issue the 
"conntrack -D" command by simply moving the NAT rules earlier in the 
script before I tried to establish the tunnels.
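
In other words, the order that ended up working was roughly the 
following.  This is illustrative only, not the literal contents of 
GRE-NAT.sh; the tunnel name, R1's 203.0.113.1 address, and H1 / H3 
sitting at 192.0.2.1 / 192.168.0.3 are assumptions based on the device 
numbering above.

# NAT rules on R1 go in first...
r1 iptables -t nat -A PREROUTING  -p gre -d 203.0.113.1 -j DNAT --to-destination 192.168.0.3
r1 iptables -t nat -A POSTROUTING -p gre -s 192.168.0.3 -j SNAT --to-source 203.0.113.1

# ...and only then is H3's end of the H1 / H3 tunnel brought up.
h3 ip tunnel add gre13 mode gre local 192.168.0.3 remote 192.0.2.1
h3 ip link set gre13 up
h3 ip addr add 10.1.3.3/24 dev gre13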

I can only surmise that there was some sort of bad state that connection 
tracking learned that couldn't fix itself.  -  This was sort of random 
and unpredictable, much like what you're saying.  -  It also likely has 
to do with what end talks first.

I found that I could get things to start working if I issued the 
following command:

(ip netns exec) r1 conntrack -D

Ultimately I was able to issue the following commands:

h1 ping -c 4 10.1.3.3
h1 ping -c 4 10.1.4.4

h2 ping -c 4 10.2.3.3
h2 ping -c 4 10.2.4.4

h3 ping -c 4 10.1.3.1
h3 ping -c 4 10.2.3.2

h4 ping -c 4 10.1.4.1
h4 ping -c 4 10.2.4.2

I /think/ that this is what you were wanting to do.  And, I think you 
were correct all along in that NAT ~> connection tracking was in fact 
messing with you.

Anyway, have fun with the PoC.  Ask if you have any questions about what 
/ why / how I did something.

Oh, ya, I did have the following GRE related modules loaded:

# lsmod | grep -i gre
nf_conntrack_proto_gre    16384  0
nf_nat_proto_gre          16384  0
ip_gre                    24576  0
ip_tunnel                 28672  1 ip_gre
gre                       16384  1 ip_gre

I'm running kernel 4.9.76-gentoo-r1.
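
If someone reproducing this doesn't have them loaded, loading them by 
hand should just be the following (module names taken from the lsmod 
output above; newer kernels may have some of this built in):

# modprobe nf_conntrack_proto_gre
# modprobe nf_nat_proto_gre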

> You might be onto something about the first packet.  At least as far as 
> what connection tracking sees.

I think the kicker has to do with connection tracking learning state on 
the first packet.



-- 
Grant. . . .
unix || die

[-- Attachment #1.2: GRE-NAT.png --]
[-- Type: image/png, Size: 6793 bytes --]

[-- Attachment #1.3: GRE-NAT.sh --]
[-- Type: application/x-sh, Size: 3556 bytes --]

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3982 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
@ 2018-01-31  5:33 ` Grant Taylor
  2018-02-01 10:34 ` Matthias Walther
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Grant Taylor @ 2018-01-31  5:33 UTC (permalink / raw)
  To: lartc

[-- Attachment #1: Type: text/plain, Size: 733 bytes --]

On 01/30/2018 10:29 PM, Grant Taylor wrote:
> Anyway, have fun with the PoC.  Ask if you have any questions about what 
> / why / how I did something.

Here's how I cleaned up the PoC in between tests.

# vsctl del-br bri1
# vsctl del-br bri2
# ip netns del h1
# ip netns del h2
# ip netns del h3
# ip netns del h4
# ip netns del r1
# ip netns del r2
# ip netns del r3

I actually had the following long command, again because I'm lazy.

# vim Desktop/GRE-NAT.sh && vsctl del-br bri1 && vsctl del-br bri2 && ip 
netns del h1 && ip netns del h2 && ip netns del h3 && ip netns del h4 && 
ip netns del r1 && ip netns del r2 && ip netns del r3 && source 
Desktop/GRE-NAT.sh
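
The cleanup portion of that, written as a loop (equivalent, just easier 
on the fingers):

# vsctl del-br bri1 && vsctl del-br bri2
# for ns in h1 h2 h3 h4 r1 r2 r3; do ip netns del $ns; done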



-- 
Grant. . . .
unix || die


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3982 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
  2018-01-31  5:33 ` Grant Taylor
@ 2018-02-01 10:34 ` Matthias Walther
  2018-02-01 18:31 ` Grant Taylor
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Matthias Walther @ 2018-02-01 10:34 UTC (permalink / raw)
  To: lartc

Hello Grant,

I think I missed an email, as I don't know whom you're quoting here.

So after all it's a race condition during startup. It's awesome that
you found the cause! Thanks for all the work you put into this.

Last night, I managed to make all connections work by executing
conntrack -D on the hypervisor. Awesome!

But this morning, there were some broken tunnels again. This doesn't
seem to last very long.

You wrote that I should change the order at startup. So should I just
postpone starting the VMs for a little while? Or do I need to change
the order of my iptables rules somehow?

To me this looks like a bug in the conntrack module. It shouldn't be
necessary to clean the table manually once in a while.

Bye,
Matthias


Am 31.01.2018 um 06:29 schrieb Grant Taylor:
> On 01/30/2018 03:37 PM, Grant Taylor wrote:
>> It seems as if something is intercepting the packets.  -  I doubt
>> that it's the NAT module, but I can't rule it out.
>
> Well, I think I ran into the same problem in my tests.
>
> Spoiler:  I did manage to overcome it.
>
> I think the Connection Tracking was part of my (initial?) problem.
>
>> Wait.  tcpdump shows that packets are entering one network interface
>> but they aren't leaving another network interface?
>
> I was seeing this behavior too.
>
>> That sounds like something is filtering the packets.
>
> I think connection tracking (thus NAT) was (at least) part of the
> culprit.
>
>> I feel like the kicker is that the traffic is never making it out of
>> the local system to the far side.  As such the far side never gets
>> anything, much less replies.
>
> I don't know if this was the case for my testing or not.  I did all of
> my testing from the far side in.
>
>> Ya, the [UNREPLIED] bothers me.  As does the fact that you aren't
>> seeing the traffic leaving the host's external interface.
>
> The [UNREPLIED] was the kicker for me.
>
>> I'd look more into the TRACE option (target) that you seem to have
>> enabled in the raw table.  That should give you more information
>> about the packets flowing through the kernel.
>
> I ended up not using TRACE.
>
> I'm not sure why I did a "conntrack -D", but as soon as I did, my long
> running ping started working.
>
> Upon retesting I can confirm that "conntrack -D" was required to make
> things work.
>
> Further testing and using "conntrack -L" showed that there were some
> connection tracking states that were in an [UNREPLIED] state.  I think
> that "conntrack -D" cleared the stale connections and allowed things
> to start working.
>
>> My hunch is that the packets aren't making it out onto the wire for
>> some reason.  Thus the lack of reply.
>
> After the testing that I did, I suspect that packets did make it onto
> the wire, but were swallowed by connection tracking, thus NAT as you
> had originally thought.
>
>> I'll see if I can't throw together a PoC in Network namespaces this
>> evening to evaluate if NATing GRE works.  -  I'd like to test NATing
>> different sets of endpoints (1:1) and NATing multiple remote
>> endpoints to one local endpoint (many:1).
>
> I threw together a Proof of Concept using Network Namespaces, using a
> pair of OVSs (bri1 & bri2) and a pair of vEths (between R3 / R2 and R2
> / R1).
>
> I was able to initially establish GRE tunnels between H1 / H3 and
> H2 / H4.
>
> After figuring out the connection tracking problem I was also able
> to bring up additional GRE tunnels between H1 / H4 and H2 / H3.
>
> Take a look at the attached GRE-NAT.sh script.  -  I do take some
> liberties and set up aliases to make things easier.  (Read: I'm lazy
> and don't want to type any more characters than I have to.)
>
> alias vsctl='ovs-vsctl'
> alias h1='ip netns exec h1'
> alias h2='ip netns exec h2'
> alias h3='ip netns exec h3'
> alias h4='ip netns exec h4'
> alias r1='ip netns exec r1'
> alias r2='ip netns exec r2'
> alias r3='ip netns exec r3'
>
> The network between:
>
> H1 / H2 / R3 is Test-Net-1, 192.0.2.0/24
> R3 / R2 is Test-Net-2, 198.51.100.0/24
> R2 / R1 is Test-Net-3, 203.0.113.0/24
> R1 / H3 / H4 is RFC 1918 private, 192.168.0.0/24
>
> I addressed the GRE tunnels as RFC 1918 private, 10.<Left #>.<Right
> #>.<Device #>/24.
>
> R3 & R1 are numbered the way that they are so that their device #
> doesn't conflict with something local.
>
> I did manage to get the PoC to work without needing to issue the
> "conntrack -D" command by simply moving the NAT rules earlier in the
> script before I tried to establish the tunnels.
>
> I can only surmise that there was some sort of bad state that
> connection tracking learned that couldn't fix itself.  -  This was
> sort of random and unpredictable, much like what you're saying.  -  It
> also likely has to do with what end talks first.
>
> I found that I could get things to start working if I issued the
> following command:
>
> (ip netns exec) r1 conntrack -D
>
> Ultimately I was able to issue the following commands:
>
> h1 ping -c 4 10.1.3.3
> h1 ping -c 4 10.1.4.4
>
> h2 ping -c 4 10.2.3.3
> h2 ping -c 4 10.2.4.4
>
> h3 ping -c 4 10.1.3.1
> h3 ping -c 4 10.2.3.2
>
> h4 ping -c 4 10.1.4.1
> h4 ping -c 4 10.2.4.2
>
> I /think/ that this is what you were wanting to do.  And, I think you
> were correct all along in that NAT ~> connection tracking was in fact
> messing with you.
>
> Anyway, have fun with the PoC.  Ask if you have any questions about
> what / why / how I did something.
>
> Oh, ya, I did have the following GRE related modules loaded:
>
> # lsmod | grep -i gre
> nf_conntrack_proto_gre    16384  0
> nf_nat_proto_gre          16384  0
> ip_gre                    24576  0
> ip_tunnel                 28672  1 ip_gre
> gre                       16384  1 ip_gre
>
> I'm running kernel 4.9.76-gentoo-r1.
>
>> You might be onto something about the first packet.  At least as far
>> as what connection tracking sees.
>
> I think the kicker has to do with connection tracking learning state
> on the first packet.
>
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
  2018-01-31  5:33 ` Grant Taylor
  2018-02-01 10:34 ` Matthias Walther
@ 2018-02-01 18:31 ` Grant Taylor
  2018-02-02 12:33 ` Matthias Walther
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Grant Taylor @ 2018-02-01 18:31 UTC (permalink / raw)
  To: lartc

On 02/01/2018 03:34 AM, Matthias Walther wrote:
> Hello Grant,

Hi Matthias,

> I think I missed an email, as I don't know whom you're quoting here.

I was actually quoting myself in an email I sent about 7 hours prior. 
Here's a link to it:  https://www.spinics.net/lists/lartc/msg23508.html

> So after all it's a race condition during startup. It's awesome that 
> you found the cause! Thanks for all the work you put into this.

You're welcome.

> Last night, I managed to make all connections work by executing conntrack 
> -D on the hypervisor. Awesome!

Yay!

> But this morning, there were some broken tunnels again. This doesn't 
> seem to last very long.

Hum.  :-/

> You wrote that I should change the order at startup. So should I just 
> postpone starting the VMs for a little while? Or do I need to change 
> the order of my iptables rules somehow?

I think it's an issue between when the IPTables rules are entered vs 
when the GRE tunnels are brought up.

You might not have the ability to control when GRE packets come in from 
the remote sites.  Thus connection tracking may learn about something 
before IPTables is ready.
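
One crude mitigation  -  a sketch, assuming your rules are loaded from a 
script or an iptables-restore unit at boot  -  would be to flush the 
tracking table once, immediately after the last NAT rule is in place, so 
that nothing learned "too early" survives:

# conntrack -F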

I think that you will need to do some more digging into connection 
tracking and how to interpret the output.  At least enough so that you 
can learn what is necessary to surgically add / remove entries to the 
connection tracking table.  That way you won't need to blow the entire 
connection tracking table away like "conntrack -D" does.
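
For example, something along these lines (a sketch  -  fill in the 
remote tunnel endpoint you care about):

# conntrack -D -p gre --orig-src <remote endpoint IP>
# conntrack -D -p gre --orig-dst <remote endpoint IP>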

> To me this looks like a bug in the conntrack module. It shouldn't be 
> necessary to clean the table manually once in a while.

I don't know if it's a bug in connection tracking or not.  It might 
simply be a race condition, i.e. depending on which direction CT sees 
GRE packets from first, and possibly the associated replies, possibly 
leading to an undesired state.

Note:  CT state expiration can also likely cause the "seen first" issue 
again, even after the systems have been up and the tunnels have passed 
traffic.

Try clearing the connection tracking table, and then starting a 
persistent ping through each tunnel and seeing if the tunnels stay up 
and functional.  -  I.e. constantly send traffic through the tunnels to 
make sure that the connection tracking table entries don't become stale, 
which leads to them getting purged, which means a new "first seen" 
condition again.

If the persistent ping does work, 1) you have a workaround, and 2) you 
know that it's likely CT state expiration, which means that there may be 
a tunable that can help prevent the relevant state information from 
expiring.
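
If it does turn out to be expiry, I believe the relevant knobs (with 
nf_conntrack_proto_gre loaded) are the following, with defaults of 30 
and 180 seconds.  I haven't tuned them myself, so treat this as a 
pointer rather than a recommendation:

# sysctl net.netfilter.nf_conntrack_gre_timeout
# sysctl net.netfilter.nf_conntrack_gre_timeout_stream
# sysctl -w net.netfilter.nf_conntrack_gre_timeout_stream=3600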



-- 
Grant. . . .
unix || die

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
                   ` (2 preceding siblings ...)
  2018-02-01 18:31 ` Grant Taylor
@ 2018-02-02 12:33 ` Matthias Walther
  2018-02-02 20:21 ` Grant Taylor
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Matthias Walther @ 2018-02-02 12:33 UTC (permalink / raw)
  To: lartc

Hello,

Am 01.02.2018 um 19:31 schrieb Grant Taylor:
> I think that you will need to do some more digging into connection
> tracking and how to interpret the output.  At least enough so that you
> can learn what is necessary to surgically add / remove entries to the
> connection tracking table. 
Do you have some reading material on this, besides the man page of course?

> I don't know if it's a bug in connection tracking or not.  It might
> simply be a race condition.
I'd consider a race condition in this particular case a bug. As GRE is
stateless, the NAT module needs to be capable of handling first
connection attempts from both sides.

I haven't looked at the code so far; maybe I just need another
source-NAT-based rule for GRE?

Like, don't only NAT incoming packets and learn from that how to
handle outgoing packets, but something like "Do NAT incoming GRE
packets to that IP. Do NAT outgoing GRE packets to my public IP
address with source NAT."

At my current point of understanding it just seems logical that this
might be needed to prevent race conditions.

But on second thought, the masquerading should do this already. Which
brings me back to the point that either I didn't understand something
here, or I'd consider this a bug.

>
> Try clearing the connection tracking table, and then starting a
> persistent ping through each tunnel and seeing if the tunnels stay up
> and functional.  -  I.e. constantly send traffic through the tunnels
> to make sure that the connection tracking table entries don't become
> stale, which leads to them getting purged, which means a new "first
> seen" condition again.
BGP should be holding the tunnels open with its status packets. I could
still give it a try. Test started. We'll know the results later.

Bye,
Matthias

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
                   ` (3 preceding siblings ...)
  2018-02-02 12:33 ` Matthias Walther
@ 2018-02-02 20:21 ` Grant Taylor
  2018-02-02 21:30 ` Matthias Walther
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Grant Taylor @ 2018-02-02 20:21 UTC (permalink / raw)
  To: lartc

[-- Attachment #1: Type: text/plain, Size: 2398 bytes --]

On 02/02/2018 05:33 AM, Matthias Walther wrote:
> do you have some reading material on this besides the man page of course?

Sorry, I don't have anything specific.

I suspect there's a mailing list or other support resources, other than 
LARTC, around iproute2 and connection tracking.

> I'd consider a race condition in this particular case a bug. As GRE is 
> stateless, the NAT module needs to be capable of handling first connection 
> attempts from both sides.

I think the race condition is between which side sends a GRE packet 
first, after the connection tracking state has been cleared.  I.e. the 
local inside system sending the GRE packet to the remote outside system, 
or the remote outside system sending the GRE packet to the local inside 
system.

> I haven't seen the code so far, maybe I just need another source-NAT 
> based rule for GRE?

I don't know.

Take a look at the GRE-NAT.sh script that I shared in a previous email.

> Like, don't only NAT incoming packets and learn from that how to handle 
> outgoing packets, but something like "Do NAT incoming GRE packets 
> to that IP. Do NAT outgoing GRE packets to my public IP address with 
> source NAT."

I know that other NAT implementations need NAT rules for incoming and 
outgoing traffic.  But IPTables has always managed both of those as one 
atomic unit, which handled both directions.
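
If you did want to spell both directions out explicitly anyway, it would 
look roughly like this (a sketch with placeholder addresses; with 
IPTables the conntrack entry created by the DNAT rule normally covers 
the replies already):

# iptables -t nat -A PREROUTING  -p gre -d <public IP> -j DNAT --to-destination <VM private IP>
# iptables -t nat -A POSTROUTING -p gre -s <VM private IP> -j SNAT --to-source <public IP>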

> At my current point of understanding it just seems logical, that this 
> might be needed to prevent race conditions.

I think the race is who sends packets first, not a problem in the code 
or implementation.

> But on second thought, the masquerading should do this already. Which 
> brings me back to the point that either I didn't understand something 
> here, or I'd consider this a bug.

First, compare what I think you are considering the race condition vs 
what I'm considering the race condition.

> BGP should be holding the tunnels open with its status packages. I could 
> still give it a try. Test started. We'll know the results later.

How often does BGP send packets if there aren't any updates or changes 
to advertise?  -  Cursory Google search makes me think that BGP sends a 
a keepalive (heartbeat) packet every minute.  -  I would think that 
would be often enough to keep connection tracking entries from timing out.



-- 
Grant. . . .
unix || die


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3982 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
                   ` (4 preceding siblings ...)
  2018-02-02 20:21 ` Grant Taylor
@ 2018-02-02 21:30 ` Matthias Walther
  2018-02-02 23:18 ` Grant Taylor
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Matthias Walther @ 2018-02-02 21:30 UTC (permalink / raw)
  To: lartc

Hi,

Am 02.02.2018 um 21:21 schrieb Grant Taylor:
>
>> I haven't seen the code so far, maybe I just need another source-NAT
>> based rule for GRE?
>
> I don't know.
>
> Take a look at the GRE-NAT.sh script that I shared in a previous email.
You have a SNAT rule in there.

But my masquerading rule should do the exact same thing:
-A POSTROUTING -s 192.168.10.0/24 ! -d 192.168.10.0/24 -j MASQUERADE

In both cases, whether the first packet comes from the inside or from
the outside, it should be covered. Or am I missing something here?

>
> I think the race is who sends packets first, not a problem in the code
> or implementation.
>
True, but the implementation and my configuration of the same should
handle both cases.
>
> How often does BGP send packets if there aren't any updates or changes
> to advertise?  -  A cursory Google search makes me think that BGP sends
> a keepalive (heartbeat) packet every minute.  -  I would think that
> would be often enough to keep connection tracking entries from timing
> out.
>
I'd have to look that up. So far the ping keeps the tunnels going.

Bye,
Matthias


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
                   ` (5 preceding siblings ...)
  2018-02-02 21:30 ` Matthias Walther
@ 2018-02-02 23:18 ` Grant Taylor
  2018-02-05  0:17 ` Matthias Walther
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Grant Taylor @ 2018-02-02 23:18 UTC (permalink / raw)
  To: lartc

[-- Attachment #1: Type: text/plain, Size: 1516 bytes --]

On 02/02/2018 02:30 PM, Matthias Walther wrote:
> You have a SNAT rule in there.
> 
> But my masquerading rule should do the exact same thing: -A POSTROUTING 
> -s 192.168.10.0/24 ! -d 192.168.10.0/24 -j MASQUERADE

Maybe, maybe not.

I thought you had additional globally routed IPs bound to the outside 
interface that were DNATed into the VMs.  (Have I remembered incorrectly?)

If that is the case, then MASQUERADEing will likely cause at least one 
of the tunnels to end up with the wrong GRE source IP on outgoing packets.

> Both cases, the first package from the inside and from the outside should 
> be covered. Or am I missing something here?

I think that the IPTables rules do account for packets in either 
direction to arrive first.

But I think the problem is when both ends send packets with timing that 
does not jive with state that Connection Tracking is expecting.

I /think/.

> True, but the implementation and my configuration of the same should 
> handle both cases.

I think it's a complication related to interaction between arrival 
timing and what Connection Tracking is expecting.  Hence the "UNREPLIED" 
in the output of conntrack -L.

> I'd have to look that up. So far the ping keeps the tunnels going.

Well, I think that's a good thing.  It seems like we're narrowing in on 
the problem.  The solution to said problem may be something else. 
(Unless you want to just leave persistent pings running.  }:-)



-- 
Grant. . . .
unix || die


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3982 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
                   ` (6 preceding siblings ...)
  2018-02-02 23:18 ` Grant Taylor
@ 2018-02-05  0:17 ` Matthias Walther
  2018-02-05  1:05 ` Grant Taylor
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Matthias Walther @ 2018-02-05  0:17 UTC (permalink / raw)
  To: lartc

Hi,

Am 03.02.2018 um 00:18 schrieb Grant Taylor:
>
> Maybe, maybe not.
>
> I thought you had additional globally routed IPs bound to the outside
> interface that were DNATed into the VMs.  (Have I remembered
> incorrectly?)
They're bridged through the physical interface and should not interfere
with the other packets.
>
> If that is the case, then MASQUERADEing will likely cause at least one
> of the tunnels to end up with the wrong GRE source IP on outgoing
> packets.
>
What do you mean by "at least one of the tunnels"? Could you give an
example?

In fact I do have one tunnel that is still down. I ignored it because
I thought there might be another problem with that one.
>
>> True, but the implementation and my configuration of the same should
>> handle both cases.
>
> I think it's a complication related to interaction between arrival
> timing and what Connection Tracking is expecting.  Hence the
> "UNREPLIED" in the output of conntrack -L.
How do you mean this exactly? The first packet might be incoming or
outgoing. Or are you thinking of the case that they might arrive at
(almost) the same time?
>
>> I'd have to look that up. So far the ping keeps the tunnels going.
>
> Well, I think that's a good thing.  It seems like we're narrowing in
> on the problem.  The solution to said problem may be something else.
> (Unless you want to just leave persistent pings running.  }:-)
>
The ping workaround still works :).

Bye,
Matthias


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
                   ` (7 preceding siblings ...)
  2018-02-05  0:17 ` Matthias Walther
@ 2018-02-05  1:05 ` Grant Taylor
  2018-02-05 14:00 ` Matthias Walther
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Grant Taylor @ 2018-02-05  1:05 UTC (permalink / raw)
  To: lartc

[-- Attachment #1: Type: text/plain, Size: 1821 bytes --]

On 02/04/2018 05:17 PM, Matthias Walther wrote:
> They're bridged through the physical interface and should not interfere 
> with the other packages.

I agree that the bridged packets shouldn't interfere.  But that doesn't 
account for the two VMs.  I thought that each VM had an additional IP 
bound to the outside.

Too many things.  I've lost track.

> How do you mean that with at least one of the tunnels? Could you give 
> an example?

Suppose that you have two additional IPs bound to the eth0 interface, 
which has its own IP, and that you are DNATing the traffic into the VMs.

I believe that the simple MASQUERADE will end up SNATing egress packets 
with one IP address.  I expect it will either be the IP that shows up in 
ifconfig or the first address added or the numerically lowest IP.

So, outgoing packets from at least one of the VMs will possibly be 
MASQUERADEd to the wrong IP.
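
If that were the setup, pinning each VM to its intended public IP with 
explicit SNAT rules, instead of relying on MASQUERADE to pick one, would 
look something like this sketch (addresses and the eth0 name are 
assumptions):

# iptables -t nat -A POSTROUTING -o eth0 -s <VM1 private IP> -j SNAT --to-source <public IP 1>
# iptables -t nat -A POSTROUTING -o eth0 -s <VM2 private IP> -j SNAT --to-source <public IP 2>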

> In fact I do have one tunnel, that is still down. I ignored it, because 
> I thought there might be another problem with that one.

You'll have to give details before I can speculate.

> How do you mean this exactly? The first packet might be incoming or 
> outgoing. Or are you thinking of the case that they might arrive at 
> (almost) the same time?

Yes, I'm referring to the fact that the first packet that connection 
tracking sees (after booting or being cleared) could be incoming /or/ 
outgoing.  This unpredictability has everything to do with the timing of 
the sequence of events.

> The ping workaround still works

Good.  Then we might be on to something.

Skim the man page for conntrack.  There's a way that you can get it to 
show you events as they happen.  Perhaps you can watch them and figure 
out a pattern to when things do and do not work.
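
Something like this should do it, limiting the event stream to GRE so 
that it stays readable:

# conntrack -E -p gre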



-- 
Grant. . . .
unix || die


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3982 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
                   ` (8 preceding siblings ...)
  2018-02-05  1:05 ` Grant Taylor
@ 2018-02-05 14:00 ` Matthias Walther
  2018-02-05 23:10 ` Grant Taylor
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Matthias Walther @ 2018-02-05 14:00 UTC (permalink / raw)
  To: lartc

Hi,

Am 05.02.2018 um 02:05 schrieb Grant Taylor:
> On 02/04/2018 05:17 PM, Matthias Walther wrote:
>> They're bridged through the physical interface and should not
>> interfere with the other packages.
>
> I agree that the bridged packets shouldn't interfere.  But that
> doesn't account for the two VMs.  I thought that each VM had an
> additional IP bound to the outside.
>
No, there are VMs with public IPs, which run in bridged mode, and VMs
which have only private IPs. Those are NATed on the hypervisor.

The eth0s of the VMs have no address but a public IPv4 and a public IPv6
address.
>
>> How do you mean that with at least one of the tunnels? Could you give
>> an example?
>
> Suppose that you have two additional IPs bound to the eth0 interface,
> which has its own IP, and that you are DNATing the traffic into the VMs.
>
> I believe that the simple MASQUERADE will end up SNATing egress
> packets with one IP address.  I expect it will either be the IP that
> shows up in ifconfig or the first address added or the numerically
> lowest IP.
No, all masquerading goes through the same one and only public IP on
the hypervisor.


>> In fact I do have one tunnel, that is still down. I ignored it,
>> because I thought there might be another problem with that one.
>
> You'll have to give details before I can speculate.
The ping doesn't go through, even with the correct source IP (-I
local_tunnel_ip), and resetting the connection with conntrack doesn't work.
> Yes, I'm referring to the fact that the first packet that connection
> tracking sees (after booting or being cleared) could be incoming /or/
> outgoing.  This unpredictability has everything to do with the timing
> of the sequence of events.
ACK.
> Skim the man page for conntrack.  There's a way that you can get it to
> show you events as they happen.  Perhaps you can watch them and figure
> out a pattern to when things do and do not work.
conntrack -E shows this when I delete an entry:
[DESTROY] gre      47 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
[ASSURED]
    [NEW] gre      47 30 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 [UNREPLIED] src=192.168.10.62 dst=185.66.195.0 srckey=0x0
dstkey=0x0
 [UPDATE] gre      47 30 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
 [UPDATE] gre      47 180 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
[ASSURED]

(This particular tunnel still doesn't work.)

Then I tried a working tunnel:

Set up a ping, flushed the entry, but:
root@unimatrixzero ~ # conntrack -D -s 185.66.194.1
conntrack v1.4.3 (conntrack-tools): 0 flow entries have been deleted.
root@unimatrixzero ~ # conntrack -D -d 185.66.194.1
gre      47 179 src=192.168.10.62 dst=185.66.194.1 srckey=0x0 dstkey=0x0
src=185.66.194.1 dst=176.9.38.150 srckey=0x0 dstkey=0x0 [ASSURED] mark=0
use=1
conntrack v1.4.3 (conntrack-tools): 1 flow entries have been deleted.

The non-working tunnel seems to have a conntrack entry based on the
remote IP as source. The working tunnel seems to have a conntrack entry
based on the remote IP as destination.

But it's not as simple as that. I tried to make a non-working tunnel
work: I set up a ping (one packet per second) and deleted entries until
it worked:

root@unimatrixzero ~ # conntrack -E|grep 185.66.195.0
    [NEW] gre      47 30 src=176.9.38.158 dst=185.66.195.0 srckey=0x0
dstkey=0x0 [UNREPLIED] src=185.66.195.0 dst=176.9.38.150 srckey=0x0
dstkey=0x0
[DESTROY] gre      47 src=176.9.38.158 dst=185.66.195.0 srckey=0x0
dstkey=0x0 [UNREPLIED] src=185.66.195.0 dst=176.9.38.150 srckey=0x0
dstkey=0x0
    [NEW] gre      47 30 src=176.9.38.158 dst=185.66.195.0 srckey=0x0
dstkey=0x0 [UNREPLIED] src=185.66.195.0 dst=176.9.38.150 srckey=0x0
dstkey=0x0
 [UPDATE] gre      47 29 src=176.9.38.158 dst=185.66.195.0 srckey=0x0
dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0
 [UPDATE] gre      47 180 src=176.9.38.158 dst=185.66.195.0 srckey=0x0
dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0 [ASSURED]
[DESTROY] gre      47 src=176.9.38.158 dst=185.66.195.0 srckey=0x0
dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0 [ASSURED]
[DESTROY] gre      47 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
[ASSURED]
    [NEW] gre      47 30 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 [UNREPLIED] src=192.168.10.62 dst=185.66.195.0 srckey=0x0
dstkey=0x0
 [UPDATE] gre      47 29 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
 [UPDATE] gre      47 180 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
[ASSURED]
    [NEW] gre      47 30 src=176.9.38.158 dst=185.66.195.0 srckey=0x0
dstkey=0x0 [UNREPLIED] src=185.66.195.0 dst=176.9.38.150 srckey=0x0
dstkey=0x0
[DESTROY] gre      47 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
[ASSURED]
 [UPDATE] gre      47 30 src=176.9.38.158 dst=185.66.195.0 srckey=0x0
dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0
 [UPDATE] gre      47 180 src=176.9.38.158 dst=185.66.195.0 srckey=0x0
dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0 [ASSURED]
    [NEW] gre      47 30 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 [UNREPLIED] src=192.168.10.62 dst=185.66.195.0 srckey=0x0
dstkey=0x0
 [UPDATE] gre      47 29 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
 [UPDATE] gre      47 180 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
[ASSURED]
[DESTROY] gre      47 src=176.9.38.158 dst=185.66.195.0 srckey=0x0
dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0 [ASSURED]
[DESTROY] gre      47 src=185.66.195.0 dst=176.9.38.158 srckey=0x0
dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
[ASSURED]
    [NEW] gre      47 30 src=192.168.10.62 dst=185.66.195.0 srckey=0x0
dstkey=0x0 [UNREPLIED] src=185.66.195.0 dst=176.9.38.150 srckey=0x0
dstkey=0x0
 [UPDATE] gre      47 30 src=192.168.10.62 dst=185.66.195.0 srckey=0x0
dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0
 [UPDATE] gre      47 180 src=192.168.10.62 dst=185.66.195.0 srckey=0x0
dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0 [ASSURED]


As you can see, the third entry is identical to the third-to-last one.
But the tunnel didn't work with the entry from the third line, while it
did work with the third-to-last one.

There must be something else going on.

Regards,
Matthias

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
                   ` (9 preceding siblings ...)
  2018-02-05 14:00 ` Matthias Walther
@ 2018-02-05 23:10 ` Grant Taylor
  2018-02-06 23:01 ` Matthias Walther
  2018-02-07  3:52 ` Grant Taylor
  12 siblings, 0 replies; 14+ messages in thread
From: Grant Taylor @ 2018-02-05 23:10 UTC (permalink / raw)
  To: lartc

[-- Attachment #1: Type: text/plain, Size: 6744 bytes --]

Re-sending to LARTC.

On 02/05/2018 07:00 AM, Matthias Walther wrote:
> Hi,

Hi Matthias,

> no, there are VMs with public IPs, which run in bridged mode. And VMs, 
> which have only private IPs. Those are natted on the hypervisor.
> 
> The eth0s of the VMs have no address but a public IPV4 and a public 
> IPV6 address.

Okay.

I've apparently conflated multiple things that I've been working on.

I'm sorry for the confusion.

> No, all masquerading goes through the same, one and only public ip on 
> the hypervisor.

ACK

Then MASQUERADE should be fine.

> The ping doesn't go through, even with correct source IP (-I 
> local_tunnel_ip) and resetting the connection with conntrack doesn't work.

Okay.  :-/

> conntrack -E shows this, when I delete an entry:
> [DESTROY] gre      47 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0 
> [ASSURED]
>     [NEW] gre      47 30 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 [UNREPLIED] src=192.168.10.62 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0
>  [UPDATE] gre      47 30 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
>  [UPDATE] gre      47 180 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0 
> [ASSURED]

Okay.  I see the DESTROY which corresponds to the command that you 
issued.  Then the NEW (not currently in the connection tracking table), 
the UPDATE for the reply, and the UPDATE for the reply to the reply 
going the same direction as the NEW packet, thus marking the connection 
ASSURED.

> (This particular tunnel still doesn't work.)

Okay.

> Then I tried a working tunnel:
> 
> Set up a ping, flushed the entry, but:
> root@unimatrixzero ~ # conntrack -D -s 185.66.194.1
> conntrack v1.4.3 (conntrack-tools): 0 flow entries have been deleted.
> root@unimatrixzero ~ # conntrack -D -d 185.66.194.1
> gre      47 179 src=192.168.10.62 dst=185.66.194.1 srckey=0x0 dstkey=0x0 
> src=185.66.194.1 dst=176.9.38.150 srckey=0x0 dstkey=0x0 [ASSURED] 
> mark=0 use=1
> conntrack v1.4.3 (conntrack-tools): 1 flow entries have been deleted.
> 
> The not working tunnel seems to have a conntrack entry based on the 
> remote IP as source. The working tunnel seems to have a conntrack entry 
> based on the remote IP as destination.

You might be onto something.  This may come back to the race condition 
that I was referring to.

> But it's not as simple as that. I tried to make a non-working tunnel 
> work: I set up a ping (one packet per second) and deleted entries until 
> it worked:
> 
> root@unimatrixzero ~ # conntrack -E|grep 185.66.195.0
>     [NEW] gre      47 30 src=176.9.38.158 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 [UNREPLIED] src=185.66.195.0 dst=176.9.38.150 srckey=0x0 
> dstkey=0x0
> [DESTROY] gre      47 src=176.9.38.158 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 [UNREPLIED] src=185.66.195.0 dst=176.9.38.150 srckey=0x0 
> dstkey=0x0
>     [NEW] gre      47 30 src=176.9.38.158 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 [UNREPLIED] src=185.66.195.0 dst=176.9.38.150 srckey=0x0 
> dstkey=0x0
>  [UPDATE] gre      47 29 src=176.9.38.158 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0
>  [UPDATE] gre      47 180 src=176.9.38.158 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0 
> [ASSURED]
> [DESTROY] gre      47 src=176.9.38.158 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0 
> [ASSURED]
> [DESTROY] gre      47 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0 
> [ASSURED]
>     [NEW] gre      47 30 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 [UNREPLIED] src=192.168.10.62 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0
>  [UPDATE] gre      47 29 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
>  [UPDATE] gre      47 180 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0 
> [ASSURED]
>     [NEW] gre      47 30 src=176.9.38.158 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 [UNREPLIED] src=185.66.195.0 dst=176.9.38.150 srckey=0x0 
> dstkey=0x0
> [DESTROY] gre      47 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0 
> [ASSURED]
>  [UPDATE] gre      47 30 src=176.9.38.158 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0
>  [UPDATE] gre      47 180 src=176.9.38.158 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0 
> [ASSURED]
>     [NEW] gre      47 30 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 [UNREPLIED] src=192.168.10.62 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0
>  [UPDATE] gre      47 29 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0
>  [UPDATE] gre      47 180 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0 
> [ASSURED]
> [DESTROY] gre      47 src=176.9.38.158 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0 
> [ASSURED]
> [DESTROY] gre      47 src=185.66.195.0 dst=176.9.38.158 srckey=0x0 
> dstkey=0x0 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 dstkey=0x0 
> [ASSURED]
>     [NEW] gre      47 30 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 [UNREPLIED] src=185.66.195.0 dst=176.9.38.150 srckey=0x0 
> dstkey=0x0
>  [UPDATE] gre      47 30 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0
>  [UPDATE] gre      47 180 src=192.168.10.62 dst=185.66.195.0 srckey=0x0 
> dstkey=0x0 src=185.66.195.0 dst=176.9.38.150 srckey=0x0 dstkey=0x0 
> [ASSURED]
>  
> As you can see, the third entry is identical to the third-to-last one. 
> But the tunnel didn't work with the entry from the third line, while it 
> did work with the third-to-last one.

I think I need to do some more reading on conntrack and how to manually 
manipulate the connection tracking table.

> There must be something else going on.

Maybe.

I feel like it is connection tracking related.

Are the tunnels that had the persistent ping running still working 
correctly?



-- 
Grant. . . .
unix || die



[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3982 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
                   ` (10 preceding siblings ...)
  2018-02-05 23:10 ` Grant Taylor
@ 2018-02-06 23:01 ` Matthias Walther
  2018-02-07  3:52 ` Grant Taylor
  12 siblings, 0 replies; 14+ messages in thread
From: Matthias Walther @ 2018-02-06 23:01 UTC (permalink / raw)
  To: lartc

Hi again :)

Am 06.02.2018 um 00:10 schrieb Grant Taylor:
>
>> Then I tried a working tunnel:
>>
>> Set up a ping, flushed the entry, but:
>> root@unimatrixzero ~ # conntrack -D -s 185.66.194.1
>> conntrack v1.4.3 (conntrack-tools): 0 flow entries have been deleted.
>> root@unimatrixzero ~ # conntrack -D -d 185.66.194.1
>> gre      47 179 src=192.168.10.62 dst=185.66.194.1 srckey=0x0
>> dstkey=0x0 src=185.66.194.1 dst=176.9.38.150 srckey=0x0 dstkey=0x0
>> [ASSURED] mark=0 use=1
>> conntrack v1.4.3 (conntrack-tools): 1 flow entries have been deleted.
>>
>> The not working tunnel seems to have a conntrack entry based on the
>> remote IP as source. The working tunnel seems to have a conntrack
>> entry based on the remote IP as destination.
>
> You might be onto something.  This may come back to the race condition
> that I was referring to.
Indeed. But I still don't understand why the exact same entries in the
conntrack table sometimes work and sometimes don't.
>
> Are the tunnels that had the persistent ping running still working
> correctly?
>
The ones that had a running ping didn't break down. I made all tunnels
work now by repeatedly deleting the conntrack entries until every
single tunnel came up.

For the first time since we started writing here, every single one of
the seven tunnels works at the same time. I set up pings for every
single one of them. So in theory this should be stable until the next
reboot.

One thing worth noting though: In one case, the ping went through the
tunnel correctly, but BGP couldn't establish a connection. Only after
deleting the entry a couple of times did BGP come up as well. I don't
know yet what this means.

ffrl_fra0 BGP      ffnet    up     23:43:55    Established  
ffrl_fra1 BGP      ffnet    up     2018-02-01  Established  
ffrl_ber0 BGP      ffnet    up     2018-02-05  Established  
ffrl_ber1 BGP      ffnet    up     23:36:19    Established  
ffrl_dus0 BGP      ffnet    up     23:38:22    Established  
ffrl_dus1 BGP      ffnet    up     2018-02-05  Established  
ibgp_gw02 BGP      ffnet    up     2018-02-05  Established  

As you can see, the other tunnels have been running for quite some time
now.

Bye,

Matthias

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: GRE-NAT broken - SOLVED
  2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
                   ` (11 preceding siblings ...)
  2018-02-06 23:01 ` Matthias Walther
@ 2018-02-07  3:52 ` Grant Taylor
  12 siblings, 0 replies; 14+ messages in thread
From: Grant Taylor @ 2018-02-07  3:52 UTC (permalink / raw)
  To: lartc

[-- Attachment #1: Type: text/plain, Size: 2124 bytes --]

On 02/06/2018 04:01 PM, Matthias Walther wrote:
> Hi again

Hi Matthias,

> Indeed. But I still don't understand why the exact same entries in the 
> conntrack table sometimes work and sometimes not.

I don't know.

I think you're going to need to enlist the help of someone that 
understand connection tracking better than I do.

> The ones, that had a running ping, didn't break down.

Yay.

> I made all tunnels work now by repeatedly deleting the conntrack 
> entries until every single tunnel came up.

I'm glad that it's working.

I don't like the fact that you needed to repeatedly delete connection 
tracking entries to make them work.

Did you delete all entries?  Or did you selectively delete the ones that 
weren't working?

I wonder if GRE tunnels (on Linux) have anything comparable to BGP's 
passive mode.  Maybe setting one side passive and having the other side 
initiate things might work better.

> For the first time since we started writing here, every single one of 
> the seven tunnels works at the same time. I set up pings for every 
> single one of them. So in theory this should be stable until the next reboot.

:-)

> One thing worth noting though: In one case, the ping went through the 
> tunnel correctly, but BGP couldn't establish a connection. Only after 
> deleting the entry a couple of times did BGP come up as well. I don't 
> know yet what this means.

That's really odd.

I'd hope that a tcpdump would shed some light on that situation (if it 
ever happens again).

> ffrl_fra0 BGP      ffnet    up     23:43:55    Established
> ffrl_fra1 BGP      ffnet    up     2018-02-01  Established
> ffrl_ber0 BGP      ffnet    up     2018-02-05  Established
> ffrl_ber1 BGP      ffnet    up     23:36:19    Established
> ffrl_dus0 BGP      ffnet    up     23:38:22    Established
> ffrl_dus1 BGP      ffnet    up     2018-02-05  Established
> ibgp_gw02 BGP      ffnet    up     2018-02-05  Established
> 
> As you can see, the other tunnels have been running for quite some 
> time now.

Nice.



-- 
Grant. . . .
unix || die


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3982 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-02-07  3:52 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-31  5:29 GRE-NAT broken - SOLVED Grant Taylor
2018-01-31  5:33 ` Grant Taylor
2018-02-01 10:34 ` Matthias Walther
2018-02-01 18:31 ` Grant Taylor
2018-02-02 12:33 ` Matthias Walther
2018-02-02 20:21 ` Grant Taylor
2018-02-02 21:30 ` Matthias Walther
2018-02-02 23:18 ` Grant Taylor
2018-02-05  0:17 ` Matthias Walther
2018-02-05  1:05 ` Grant Taylor
2018-02-05 14:00 ` Matthias Walther
2018-02-05 23:10 ` Grant Taylor
2018-02-06 23:01 ` Matthias Walther
2018-02-07  3:52 ` Grant Taylor
