Ivan,

I tried the SNAT idea, but I still have the issue.

Here is an example configuration for one of the nodes:

[Interface]
ListenPort = 5555
PrivateKey = ---

[Peer]
PublicKey = H09cwQeUUly2AIdTAhyr5zvzFK9bED0NYiKgJultYwE=
AllowedIPs = 10.128.2.0/23
Endpoint = 192.168.99.12:31112
PersistentKeepalive = 25

[Peer]
PublicKey = 5nC5cyDg9WZ/2R3CPEbM+fSXzsn5Yx1mX48iizdfdHU=
AllowedIPs = 10.128.0.0/23
Endpoint = 192.168.99.14:32188
PersistentKeepalive = 25

[Peer]
PublicKey = MzFg1tMaLUFC3kD9maiZZAHWywfCDyPlYF1zu6Dj30E=
AllowedIPs = 10.130.0.0/23
Endpoint = 192.168.99.13:31992
PersistentKeepalive = 25

[Peer]
PublicKey = s7/lxyvFQCxE+KBoUJ/9vpPgLZ6pTdYUCsJ/snp3mUk=
AllowedIPs = 10.129.0.0/23
Endpoint = 192.168.99.15:30305
PersistentKeepalive = 25

[Peer]
PublicKey = SuO927DbGm2h2I8hcf24LvYWglKp+4wGAuiyisin/yY=
AllowedIPs = 10.131.0.0/23
Endpoint = 192.168.99.7:31714
PersistentKeepalive = 25

[Peer]
PublicKey = a+tK21LKdsBkQNqmqdRpvS9HLpz2W8rwDijTPkXEc0Q=
AllowedIPs = 10.129.2.0/23
Endpoint = 192.168.99.6:31165
PersistentKeepalive = 25

The private IP to VIP map for the peers of this node is:

10.128.2.10  -> 192.168.99.12
10.128.1.94  -> 192.168.99.14
10.130.0.136 -> 192.168.99.13
10.129.1.158 -> 192.168.99.15
10.131.0.199 -> 192.168.99.7
10.129.2.217 -> 192.168.99.6

I created the following iptables rules:

sh-4.2# iptables -t nat -n -L
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination

Chain INPUT (policy ACCEPT)
target     prot opt source               destination
SNAT       udp  --  10.128.2.10          0.0.0.0/0            udp dpt:5555 to:192.168.99.12:5555
SNAT       udp  --  10.128.1.94          0.0.0.0/0            udp dpt:5555 to:192.168.99.14:5555
SNAT       udp  --  10.130.0.136         0.0.0.0/0            udp dpt:5555 to:192.168.99.13:5555
SNAT       udp  --  10.129.1.158         0.0.0.0/0            udp dpt:5555 to:192.168.99.15:5555
SNAT       udp  --  10.131.0.199         0.0.0.0/0            udp dpt:5555 to:192.168.99.7:5555
SNAT       udp  --  10.129.2.217         0.0.0.0/0            udp dpt:5555 to:192.168.99.6:5555

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
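
For reference, a minimal sketch of how rules like the ones listed above could be generated from the private IP to VIP map. The exact commands used are not shown in this message, so the command line below is a reconstruction from the listing (nat table, INPUT chain, UDP, destination port 5555) and should be read as an assumption. The helper name snat_commands is hypothetical, and the script only prints the commands; it does not run them.

```python
# Private IP -> VIP map, copied from the list in this message.
PEER_MAP = {
    "10.128.2.10": "192.168.99.12",
    "10.128.1.94": "192.168.99.14",
    "10.130.0.136": "192.168.99.13",
    "10.129.1.158": "192.168.99.15",
    "10.131.0.199": "192.168.99.7",
    "10.129.2.217": "192.168.99.6",
}
LISTEN_PORT = 5555  # ListenPort from the [Interface] section


def snat_commands(peer_map, port=LISTEN_PORT):
    """Return one iptables command string per peer (hypothetical helper).

    Assumption: each rule rewrites the source of incoming UDP packets
    from a peer's private IP to that peer's VIP, matching the listing.
    """
    return [
        f"iptables -t nat -A INPUT -p udp -s {private_ip} "
        f"--dport {port} -j SNAT --to-source {vip}:{port}"
        for private_ip, vip in peer_map.items()
    ]


if __name__ == "__main__":
    for command in snat_commands(PEER_MAP):
        print(command)
```

Printing the commands instead of executing them makes it easy to review the rules before applying them as root.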

After the handshake, the configuration changes to:

sh-4.2# wg
interface: sdn-tunnel
  public key: gCFgNjOpObU71Vmjub/R9KIn3MHgzXnKtrh9Tf+W628=
  private key: (hidden)
  listening port: 5555

peer: s7/lxyvFQCxE+KBoUJ/9vpPgLZ6pTdYUCsJ/snp3mUk=
  endpoint: 10.134.0.1:1033
  allowed ips: 10.129.0.0/23
  latest handshake: 31 seconds ago
  transfer: 180 B received, 452 B sent
  persistent keepalive: every 25 seconds

peer: MzFg1tMaLUFC3kD9maiZZAHWywfCDyPlYF1zu6Dj30E=
  endpoint: 10.134.0.1:1032
  allowed ips: 10.130.0.0/23
  latest handshake: 37 seconds ago
  transfer: 212 B received, 272 B sent
  persistent keepalive: every 25 seconds

peer: 5nC5cyDg9WZ/2R3CPEbM+fSXzsn5Yx1mX48iizdfdHU=
  endpoint: 10.134.0.1:1031
  allowed ips: 10.128.0.0/23
  latest handshake: 39 seconds ago
  transfer: 180 B received, 304 B sent
  persistent keepalive: every 25 seconds

peer: a+tK21LKdsBkQNqmqdRpvS9HLpz2W8rwDijTPkXEc0Q=
  endpoint: 192.168.99.6:31165
  allowed ips: 10.129.2.0/23
  latest handshake: 41 seconds ago
  transfer: 156 B received, 180 B sent
  persistent keepalive: every 25 seconds

peer: H09cwQeUUly2AIdTAhyr5zvzFK9bED0NYiKgJultYwE=
  endpoint: 192.168.99.12:31112
  allowed ips: 10.128.2.0/23
  latest handshake: 41 seconds ago
  transfer: 156 B received, 180 B sent
  persistent keepalive: every 25 seconds

peer: SuO927DbGm2h2I8hcf24LvYWglKp+4wGAuiyisin/yY=
  endpoint: 192.168.99.7:31714
  allowed ips: 10.131.0.0/23
  latest handshake: 41 seconds ago
  transfer: 156 B received, 180 B sent
  persistent keepalive: every 25 seconds

As you can see, some of the endpoint addresses have changed: the first three are no longer correct.

After I introduced the iptables rules, they changed to an IP that makes no sense to me: 10.134.0.1.
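
A small sketch of how the changed endpoints could be spotted automatically, by comparing the configured endpoints with what the running interface reports. It assumes the live state is read with `wg show sdn-tunnel endpoints`, which prints one public key and endpoint per line; the function names and the two-peer sample are illustrative only, with values taken from the output in this message.

```python
def parse_endpoints(wg_output):
    """Parse `wg show <interface> endpoints` output into {pubkey: endpoint}."""
    endpoints = {}
    for line in wg_output.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            endpoints[parts[0]] = parts[1]
    return endpoints


def roamed_peers(configured, live):
    """Return {pubkey: (configured, live)} for peers whose endpoint changed."""
    return {
        key: (configured[key], live[key])
        for key in configured
        if key in live and live[key] != configured[key]
    }


# Two of the six peers from this message, as configured in the [Peer] sections.
CONFIGURED = {
    "s7/lxyvFQCxE+KBoUJ/9vpPgLZ6pTdYUCsJ/snp3mUk=": "192.168.99.15:30305",
    "H09cwQeUUly2AIdTAhyr5zvzFK9bED0NYiKgJultYwE=": "192.168.99.12:31112",
}
# The same two peers as reported after the handshake.
SAMPLE_LIVE = (
    "s7/lxyvFQCxE+KBoUJ/9vpPgLZ6pTdYUCsJ/snp3mUk=\t10.134.0.1:1033\n"
    "H09cwQeUUly2AIdTAhyr5zvzFK9bED0NYiKgJultYwE=\t192.168.99.12:31112\n"
)

if __name__ == "__main__":
    changed = roamed_peers(CONFIGURED, parse_endpoints(SAMPLE_LIVE))
    for key, (before, after) in changed.items():
        print(f"{key}: {before} -> {after}")
```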

Finally, here are a few seconds of tcpdump output in case it helps:

sh-4.2# tcpdump -i eth0 -nn -v
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
23:07:01.045331 IP (tos 0x0, ttl 64, id 27711, offset 0, flags [none], proto UDP (17), length 60)
    10.134.0.72.5555 > 192.168.99.6.31165: UDP, length 32
23:07:01.045363 IP (tos 0x0, ttl 64, id 53835, offset 0, flags [none], proto UDP (17), length 60)
    10.134.0.72.5555 > 192.168.99.7.31714: UDP, length 32
23:07:01.045411 IP (tos 0x0, ttl 64, id 27009, offset 0, flags [none], proto UDP (17), length 60)
    10.134.0.72.5555 > 192.168.99.12.31112: UDP, length 32
23:07:02.758694 IP (tos 0x0, ttl 61, id 19309, offset 0, flags [none], proto UDP (17), length 60)
    10.134.0.1.1031 > 10.134.0.72.5555: UDP, length 32
23:07:04.053339 IP (tos 0x0, ttl 64, id 36786, offset 0, flags [none], proto UDP (17), length 60)
    10.134.0.72.5555 > 10.134.0.1.1032: UDP, length 32
23:07:07.765375 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.134.0.72 tell 10.134.0.1, length 28
23:07:07.765394 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.134.0.72 is-at 0a:58:0a:86:00:48, length 28
23:07:10.938921 IP (tos 0x0, ttl 61, id 33093, offset 0, flags [none], proto UDP (17), length 60)
    10.134.0.1.1033 > 10.134.0.72.5555: UDP, length 32
23:07:26.069271 IP (tos 0x0, ttl 64, id 37778, offset 0, flags [none], proto UDP (17), length 60)
    10.134.0.72.5555 > 192.168.99.6.31165: UDP, length 32
23:07:26.069271 IP (tos 0x0, ttl 64, id 59175, offset 0, flags [none], proto UDP (17), length 60)
    10.134.0.72.5555 > 192.168.99.7.31714: UDP, length 32
23:07:26.069303 IP (tos 0x0, ttl 64, id 49067, offset 0, flags [none], proto UDP (17), length 60)
    10.134.0.72.5555 > 192.168.99.12.31112: UDP, length 32
23:07:27.797284 IP (tos 0x0, ttl 64, id 57007, offset 0, flags [none], proto UDP (17), length 60)
    10.134.0.72.5555 > 10.134.0.1.1031: UDP, length 32
23:07:29.079935 IP (tos 0x0, ttl 61, id 18743, offset 0, flags [none], proto UDP (17), length 60)
    10.134.0.1.1032 > 10.134.0.72.5555: UDP, length 32

Thanks,

Raffaele

Raffaele Spazzoli

Senior Architect - OpenShift, Containers and PaaS Practice

Tel: +1 216-258-7717

On Sun, Sep 16, 2018 at 2:56 PM, Raffaele Spazzoli <rspazzol@redhat.com> wrote:

I'll try to make an example
cluster 1 node 1 has private IP1 and VIP1
cluster 2 node 2 has private IP2 and VIP2

Each node uses its private IP for outbound connections.
Each node can receive inbound connections on its VIP.

so the wireguard config file for node1 is going to look like:

[peer]
endpoint: VIP2:port

and for node 2:
[peer]
endpoint: VIP1:port

the problem is that after the handshake, wireguard updates the config to the following (for example for node2):
[peer]
endpoint: IP1:port

but IP2 cannot route to IP1...
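
To make the behaviour described above concrete, here is a toy model of endpoint roaming: a peer's endpoint is replaced by the source address of the latest authenticated packet received from it. This is only an illustration under that assumption, not WireGuard's actual code; the Peer class, its method name, and the placeholder strings VIP1:port and IP1:port are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    public_key: str
    endpoint: str  # "address:port" that outgoing packets are sent to

    def receive_authenticated(self, source: str) -> None:
        # Roaming: remember where the last valid packet came from
        # and send future packets there.
        self.endpoint = source


# node2's view of node1, configured with VIP1 as in the config file.
node1_on_node2 = Peer(public_key="node1-key", endpoint="VIP1:port")

# node1 sends from its private address, so node2 sees IP1 as the source.
node1_on_node2.receive_authenticated("IP1:port")

print(node1_on_node2.endpoint)  # prints "IP1:port"
```

In this model the configured VIP survives only until the first valid packet arrives from a different source address.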

I think a well-configured SNAT rule may work, although it is not elegant because it forces the clusters to exchange information about their private IPs.
This should not be needed, and in the cloud private IPs are ephemeral...

Anyway, thanks for the advice; I am going to try it in my prototype.

I still think there is a need for a better technical approach as a long-term solution.

Thanks,
Raffaele


On Sun, Sep 16, 2018 at 12:54 PM, Ivan Labáth <labawi-wg@matrix-dream.net> wrote:
Hi,

On Sun, Sep 16, 2018 at 08:21:02AM -0400, Raffaele Spazzoli wrote:
> ... then the IP that a node uses for its outbound
> connection is not the same that its peer need to use for its inbound
> connections.

Who uses what for whose connection? You lost me here.
Looks like a broken network to me. Does TCP even work?

Anyway, SNAT/DNAT should be able to fix things up, if you want to go
that route.

Regards,
Ivan