* multi-home difficulty
@ 2017-11-21 13:21 d tbsky
  2017-11-21 13:32 ` Tomas Herceg
  2017-11-21 14:15 ` Jason A. Donenfeld
  0 siblings, 2 replies; 18+ messages in thread
From: d tbsky @ 2017-11-21 13:21 UTC (permalink / raw)
  To: wireguard

Hi:
   I tested WireGuard and the speed is amazing, but when I tried to
deploy it on our production Linux firewall, I found it hard to make it
work.

   Our current Linux firewall has multiple interfaces and multiple
routing tables. Local programs get a LAN IP address and are NATed to
the correct WAN IP address when going out to the Internet.

   Since WireGuard cannot bind to a specific IP address, it sometimes
replies from the wrong IP address, and the VPN communication cannot be
established.
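
Our setup is roughly like the following (a simplified sketch; the
gateway, device names, and table number are illustrative, not our real
config):

# per-WAN routing table: traffic sourced from the WAN address uses its own gateway
ip route add default via 1.1.1.254 dev eth0 table 101
ip rule add from 1.1.1.1 table 101
# LAN clients are NATed to the WAN address on the way out
iptables -t nat -A POSTROUTING -s 172.18.0.0/16 -o eth0 -j SNAT --to-source 1.1.1.1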

for example:

config for client site: (assume wan ip is 2.2.2.2)
interface: wg0
  public key: ****
  private key: (hidden)
  listening port: 51820
peer: ****
  endpoint: 1.1.1.1:51820
  allowed ips: 0.0.0.0/0

config for server site: (assume wan ip is 1.1.1.1)
interface: wg0
  public key: ****
  private key: (hidden)
  listening port: 51820
peer: ****
  allowed ips: 0.0.0.0/0

When the client first connects to the server, at the server site I see flows like the following:
"cat /proc/net/nf_conntrack | grep 51820"

ipv4     2 udp      17 23 src=172.18.1.254 dst=2.2.2.2 sport=51820
dport=51820 packets=1 bytes=120 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1
sport=51820 dport=1085 packets=0 bytes=0 mark=1 zone=0 use=2
ipv4     2 udp      17 23 src=2.2.2.2 dst=1.1.1.1 sport=51820
dport=51820 packets=1 bytes=176 [UNREPLIED] src=1.1.1.1 dst=2.2.2.2
sport=51820 dport=51820 packets=0 bytes=0 mark=1 zone=0 use=2

So at first the client 2.2.2.2:51820 connects to the server
1.1.1.1:51820, but the server then replies from 172.18.1.254 (a LAN IP
address), and port 51820 is NATed to 1085, so the communication is
broken.

If WireGuard could bind to a specific IP address, there would be no
problem. Alternatively, if WireGuard replied from the correct IP
address (e.g. if a client connects to WireGuard at 1.1.1.1, WireGuard
should reply from 1.1.1.1), that would probably also avoid the
problem.
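
In the meantime, a stopgap I am considering is rewriting the source
address of WireGuard's reply packets with an SNAT rule (an untested
sketch; "eth0" as the WAN device is an assumption, and the addresses
are from the example above):

# force WireGuard's UDP replies leaving the WAN device to use the WAN address
iptables -t nat -A POSTROUTING -o eth0 -p udp --sport 51820 -j SNAT --to-source 1.1.1.1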

Regards,
tbskyd


* Re: multi-home difficulty
  2017-11-21 13:21 multi-home difficulty d tbsky
@ 2017-11-21 13:32 ` Tomas Herceg
  2017-11-21 14:15 ` Jason A. Donenfeld
  1 sibling, 0 replies; 18+ messages in thread
From: Tomas Herceg @ 2017-11-21 13:32 UTC (permalink / raw)
  To: wireguard


[-- Attachment #1.1: Type: text/plain, Size: 2235 bytes --]

+1 for binding only on specific IP

On 11/21/2017 02:21 PM, d tbsky wrote:
> Hi:
>    I tested WireGuard and the speed is amazing, but when I tried to
> deploy it on our production Linux firewall, I found it hard to make it
> work.
> 
>    Our current Linux firewall has multiple interfaces and multiple
> routing tables. Local programs get a LAN IP address and are NATed to
> the correct WAN IP address when going out to the Internet.
> 
>    Since WireGuard cannot bind to a specific IP address, it sometimes
> replies from the wrong IP address, and the VPN communication cannot be
> established.
> 
> for example:
> 
> config for client site: (assume wan ip is 2.2.2.2)
> interface: wg0
>   public key: ****
>   private key: (hidden)
>   listening port: 51820
> peer: ****
>   endpoint: 1.1.1.1:51820
>   allowed ips: 0.0.0.0/0
> 
> config for server site: (assume wan ip is 1.1.1.1)
> interface: wg0
>   public key: ****
>   private key: (hidden)
>   listening port: 51820
> peer: ****
>   allowed ips: 0.0.0.0/0
> 
> When the client first connects to the server, at the server site I see flows like the following:
> "cat /proc/net/nf_conntrack | grep 51820"
> 
> ipv4     2 udp      17 23 src=172.18.1.254 dst=2.2.2.2 sport=51820
> dport=51820 packets=1 bytes=120 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1
> sport=51820 dport=1085 packets=0 bytes=0 mark=1 zone=0 use=2
> ipv4     2 udp      17 23 src=2.2.2.2 dst=1.1.1.1 sport=51820
> dport=51820 packets=1 bytes=176 [UNREPLIED] src=1.1.1.1 dst=2.2.2.2
> sport=51820 dport=51820 packets=0 bytes=0 mark=1 zone=0 use=2
> 
> So at first the client 2.2.2.2:51820 connects to the server
> 1.1.1.1:51820, but the server then replies from 172.18.1.254 (a LAN IP
> address), and port 51820 is NATed to 1085, so the communication is
> broken.
> 
> If WireGuard could bind to a specific IP address, there would be no
> problem. Alternatively, if WireGuard replied from the correct IP
> address (e.g. if a client connects to WireGuard at 1.1.1.1, WireGuard
> should reply from 1.1.1.1), that would probably also avoid the
> problem.
> 
> Regards,
> tbskyd
> _______________________________________________
> WireGuard mailing list
> WireGuard@lists.zx2c4.com
> https://lists.zx2c4.com/mailman/listinfo/wireguard
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: multi-home difficulty
  2017-11-21 13:21 multi-home difficulty d tbsky
  2017-11-21 13:32 ` Tomas Herceg
@ 2017-11-21 14:15 ` Jason A. Donenfeld
  2017-11-21 14:35   ` d tbsky
  1 sibling, 1 reply; 18+ messages in thread
From: Jason A. Donenfeld @ 2017-11-21 14:15 UTC (permalink / raw)
  To: d tbsky; +Cc: WireGuard mailing list

On Tue, Nov 21, 2017 at 2:21 PM, d tbsky <tbskyd@gmail.com> wrote:
> So at first the client 2.2.2.2:51820 connects to the server
> 1.1.1.1:51820, but the server then replies from 172.18.1.254 (a LAN IP
> address), and port 51820 is NATed to 1085, so the communication is
> broken.

The server should use 1.1.1.1 to reply. If it's not, that's a bug that
I should fix. Can you give me a minimal configuration for reproducing
this setup, so that I can fix whatever issue is occurring?

Thanks,
Jason


* Re: multi-home difficulty
  2017-11-21 14:15 ` Jason A. Donenfeld
@ 2017-11-21 14:35   ` d tbsky
  2017-11-22 23:35     ` Jason A. Donenfeld
  0 siblings, 1 reply; 18+ messages in thread
From: d tbsky @ 2017-11-21 14:35 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

2017-11-21 22:15 GMT+08:00 Jason A. Donenfeld <Jason@zx2c4.com>:
> On Tue, Nov 21, 2017 at 2:21 PM, d tbsky <tbskyd@gmail.com> wrote:
>> So at first the client 2.2.2.2:51820 connects to the server
>> 1.1.1.1:51820, but the server then replies from 172.18.1.254 (a LAN IP
>> address), and port 51820 is NATed to 1085, so the communication is
>> broken.
>
> The server should use 1.1.1.1 to reply. If it's not, that's a bug that
> I should fix. Can you give me a minimal configuration for reproducing
> this setup, so that I can fix whatever issue is occurring?
>
> Thanks,
> Jason

Thanks for the quick reply. My WireGuard configuration is in the
previous mail, so I think the Linux firewall part is what you want.
There is only one special thing in our firewall config: normally,
"ip route get 8.8.8.8" resolves to a WAN IP address through the main
routing table (e.g. 1.1.1.1 in the example above), but since we have
multiple routing tables and only a few entries in the main table,
"ip route get 8.8.8.8" resolves to 172.18.1.254 (a LAN IP address) on
our firewall.

I don't know how WireGuard decides which IP address to reply from, but
it seems wrong in this situation. Maybe it decides via the main
routing table?
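
For reference, these are the commands I use to check which source
address the kernel picks and what our tables contain (standard
iproute2 commands, nothing WireGuard-specific):

# route lookup, including the source address the kernel would select
ip route get 8.8.8.8
# the policy rules and the main table, for comparison
ip rule show
ip route show table main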

Our Linux firewall runs RHEL 7.4, and the WireGuard version is
0.0.20171111 from the official repository.

Thanks a lot for the help!

Regards,
tbskyd


* Re: multi-home difficulty
  2017-11-21 14:35   ` d tbsky
@ 2017-11-22 23:35     ` Jason A. Donenfeld
  2017-11-23 17:06       ` d tbsky
  2017-11-29 11:05       ` d tbsky
  0 siblings, 2 replies; 18+ messages in thread
From: Jason A. Donenfeld @ 2017-11-22 23:35 UTC (permalink / raw)
  To: d tbsky; +Cc: WireGuard mailing list

On Tue, Nov 21, 2017 at 3:35 PM, d tbsky <tbskyd@gmail.com> wrote:
> Thanks for the quick reply. My WireGuard configuration is in the
> previous mail, so I think the Linux firewall part is what you want.

Right. So if you can give me minimal instructions on how to set up a
box that exhibits the buggy behavior you're seeing, I can try to fix
it.

Jason


* Re: multi-home difficulty
  2017-11-22 23:35     ` Jason A. Donenfeld
@ 2017-11-23 17:06       ` d tbsky
  2017-11-29 11:05       ` d tbsky
  1 sibling, 0 replies; 18+ messages in thread
From: d tbsky @ 2017-11-23 17:06 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

2017-11-23 7:35 GMT+08:00 Jason A. Donenfeld <Jason@zx2c4.com>:
> On Tue, Nov 21, 2017 at 3:35 PM, d tbsky <tbskyd@gmail.com> wrote:
>> Thanks for the quick reply. My WireGuard configuration is in the
>> previous mail, so I think the Linux firewall part is what you want.
>
> Right. So if you can give me minimal instructions on how to set up a
> box that exhibits the buggy behavior you're seeing, I can try to fix
> it.
>
> Jason

Sorry for the delay. I tried to make a minimal config that reproduces
the problem on our firewall, but it's not easy: the communication
sometimes works and sometimes fails. Then I suddenly remembered that
many years ago I had similar problems with OpenVPN. According to the
OpenVPN manual page:

 --multihome
              Configure a multi-homed UDP server.  This option needs to be
              used when a server has more than one IP address (e.g. multiple
              interfaces, or secondary IP addresses), and is not using
              --local to force binding to one specific address only.  This
              option will add some extra lookups to the packet path to
              ensure that the UDP reply packets are always sent from the
              address that the client is talking to.  This is not supported
              on all platforms, and it adds more processing, so it's not
              enabled by default.

              Note: this option is only relevant for UDP servers.

              Note 2: if you do an IPv6+IPv4 dual-stack bind on a Linux
              machine with multiple IPv4 addresses, connections to IPv4
              addresses will not work right on kernels before 3.15, due to
              missing kernel support for the IPv4-mapped case (some
              distributions have ported this to earlier kernel versions,
              though).

  I had forgotten all this. Many strange things happen if you don't
bind to a specific IP, even with "--multihome".

  Finally I made an environment for you to test. My OS is RHEL 7.4,
kernel version 3.10.0-693.5.2:

  1. Build a virtual RHEL 7.4 box and attach two virtio NICs to it (a
single NIC won't show the problem; I don't know why).
  2. Stop NetworkManager.
  3. Set up the network environment as below (skip eth0; configure
eth1 with two IP addresses):

ip addr flush dev eth1
ip addr add 10.99.1.99/24 dev eth1
ip addr add 10.99.1.100/24 dev eth1
ip link set eth1 up
ip route add default via 10.99.1.254

ip link add wg0 type wireguard
ip addr add 172.31.21.1 peer 172.31.21.2 dev wg0
wg setconf wg0 /root/server.conf
ip link set wg0 up

/root/server.conf:
[Interface]
PrivateKey = ****
ListenPort = 51820
[Peer]
PublicKey = ****
AllowedIPs = 0.0.0.0/0

    4. Set up WireGuard at the client site; client.conf:

[Interface]
PrivateKey = ****
ListenPort = 51820
[Peer]
PublicKey = ****
Endpoint = 10.99.1.100:51820
AllowedIPs = 0.0.0.0/0

    5. At the client site, run "ping 172.31.21.1".

    6. At the server site, run "modprobe nf_conntrack_ipv4; cat
/proc/net/nf_conntrack | grep 51820":

ipv4     2 udp      17 29 src=10.99.1.99 dst=10.99.20.254 sport=51820
dport=51820 [UNREPLIED] src=10.99.20.254 dst=10.99.1.99 sport=51820
dport=51820 mark=0 zone=0 use=2
ipv4     2 udp      17 29 src=10.99.20.254 dst=10.99.1.100 sport=51820
dport=51820 [UNREPLIED] src=10.99.1.100 dst=10.99.20.254 sport=51820
dport=51820 mark=0 zone=0 use=2
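
As an extra check, I also watch the handshake on the wire to see which
source address the server actually replies from (standard tcpdump; the
device name matches the setup above):

# on the server: show WireGuard's UDP traffic, no name resolution
tcpdump -ni eth1 udp port 51820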

   I don't know if you can reproduce this in your environment.
   I hope WireGuard can bind to a specific IP in the future.

Regards,
tbskyd


* Re: multi-home difficulty
  2017-11-22 23:35     ` Jason A. Donenfeld
  2017-11-23 17:06       ` d tbsky
@ 2017-11-29 11:05       ` d tbsky
  2017-11-29 13:13         ` Jason A. Donenfeld
  2017-11-29 13:51         ` Jason A. Donenfeld
  1 sibling, 2 replies; 18+ messages in thread
From: d tbsky @ 2017-11-29 11:05 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

2017-11-23 7:35 GMT+08:00 Jason A. Donenfeld <Jason@zx2c4.com>:
> On Tue, Nov 21, 2017 at 3:35 PM, d tbsky <tbskyd@gmail.com> wrote:
>> Thanks for the quick reply. My WireGuard configuration is in the
>> previous mail, so I think the Linux firewall part is what you want.
>
> Right. So if you can give me minimal instructions on how to set up a
> box that exhibits the buggy behavior you're seeing, I can try to fix
> it.
>
> Jason

Hi Jason:

    Are you still interested in the problem? Today I tried using a
multi-homed client to connect to the server, and I found that not only
the server but also the client suffers. The problem seems to be on the
RHEL kernel side, but I am not sure, since WireGuard is the only
victim I have encountered.

   I can create a virtual machine with SSH access if you want to test
these strange problems.

  By the way, would it be possible for WireGuard to bind to a specific
IP in the future? I think that would solve all the problems, but maybe
you have technical concerns.

  Thanks a lot for your help.

Regards,
tbskyd


* Re: multi-home difficulty
  2017-11-29 11:05       ` d tbsky
@ 2017-11-29 13:13         ` Jason A. Donenfeld
  2017-11-29 13:51         ` Jason A. Donenfeld
  1 sibling, 0 replies; 18+ messages in thread
From: Jason A. Donenfeld @ 2017-11-29 13:13 UTC (permalink / raw)
  To: d tbsky; +Cc: WireGuard mailing list

On Wed, Nov 29, 2017 at 12:05 PM, d tbsky <tbskyd@gmail.com> wrote:
>     Are you still interested in the problem?

Yes, patience please. Lots of things in motion at the moment.


* Re: multi-home difficulty
  2017-11-29 11:05       ` d tbsky
  2017-11-29 13:13         ` Jason A. Donenfeld
@ 2017-11-29 13:51         ` Jason A. Donenfeld
  2017-11-29 14:08           ` d tbsky
  1 sibling, 1 reply; 18+ messages in thread
From: Jason A. Donenfeld @ 2017-11-29 13:51 UTC (permalink / raw)
  To: d tbsky; +Cc: WireGuard mailing list

Hi,

I made a small script in order to reproduce this issue, but I was not
able to replicate the results. Would you spend some time with the below
code tweaking it so that it exhibits the broken behavior you're seeing?

Jason

==== script (please mind the use of literal \t) ====

#!/bin/bash
set -e

exec 3>&1
export WG_HIDE_KEYS=never
netns1="wg-test-$$-1"
netns2="wg-test-$$-2"
pretty() { echo -e "\x1b[32m\x1b[1m[+] ${1:+NS$1: }${2}\x1b[0m" >&3; }
pp() { pretty "" "$*"; "$@"; }
maybe_exec() { if [[ $BASHPID -eq $$ ]]; then "$@"; else exec "$@"; fi; }
n1() { pretty 1 "$*"; maybe_exec ip netns exec $netns1 "$@"; }
n2() { pretty 2 "$*"; maybe_exec ip netns exec $netns2 "$@"; }
ip1() { pretty 1 "ip $*"; ip -n $netns1 "$@"; }
ip2() { pretty 2 "ip $*"; ip -n $netns2 "$@"; }
sleep() { read -t "$1" -N 0 || true; }
waitiface() { pretty "${1//*-}" "wait for $2 to come up"; ip netns exec "$1" bash -c "while [[ \$(< \"/sys/class/net/$2/operstate\") != up ]]; do read -t .1 -N 0 || true; done;"; }

cleanup() {
	set +e
	exec 2>/dev/null
	ip1 link del dev wg0
	ip2 link del dev wg0
	local to_kill="$(ip netns pids $netns1) $(ip netns pids $netns2)"
	[[ -n $to_kill ]] && kill $to_kill
	pp ip netns del $netns1
	pp ip netns del $netns2
	exit
}

trap cleanup EXIT

ip netns del $netns1 2>/dev/null || true
ip netns del $netns2 2>/dev/null || true
pp ip netns add $netns1
pp ip netns add $netns2

key1="$(pp wg genkey)"
key2="$(pp wg genkey)"
pub1="$(pp wg pubkey <<<"$key1")"
pub2="$(pp wg pubkey <<<"$key2")"
psk="$(pp wg genpsk)"
[[ -n $key1 && -n $key2 && -n $psk ]]

configure_peers() {
	ip1 addr add 192.168.241.1/24 dev wg0
	ip2 addr add 192.168.241.2/24 dev wg0

	n1 wg set wg0 \
		private-key <(echo "$key1") \
		listen-port 1 \
		peer "$pub2" \
			preshared-key <(echo "$psk") \
			allowed-ips 192.168.241.2/32,fd00::2/128
	n2 wg set wg0 \
		private-key <(echo "$key2") \
		listen-port 2 \
		peer "$pub1" \
			preshared-key <(echo "$psk") \
			allowed-ips 192.168.241.1/32,fd00::1/128

	ip1 link set up dev wg0
	ip2 link set up dev wg0
}

n1 bash -c 'echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6'
n2 bash -c 'echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6'
n1 bash -c 'echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6'
n2 bash -c 'echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6'

ip1 link add dev wg0 type wireguard
ip2 link add dev wg0 type wireguard
configure_peers

ip1 link add veth1 type veth peer name veth2
ip1 link set veth2 netns $netns2

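# give veth1 two addresses so the peer can be reached at either one (the multi-homed case under test)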
ip1 addr add 10.0.0.1/24 dev veth1
ip1 addr add 10.0.0.2/24 dev veth1
ip2 addr add 10.0.0.3/24 dev veth2

ip1 link set veth1 up
ip2 link set veth2 up
waitiface $netns1 veth1
waitiface $netns2 veth2

n1 iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
n2 iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

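# first pass: point the peer at veth1's first address and verify traffic flows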
n2 wg set wg0 peer "$pub1" endpoint 10.0.0.1:1
n2 ping -W 1 -c 5 -f 192.168.241.1
[[ $(n2 wg show wg0 endpoints) == "$pub1	10.0.0.1:1" ]]

n1 conntrack -L
n2 conntrack -L

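# second pass: switch the endpoint to veth1's second address; replies should now come from 10.0.0.2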
n2 wg set wg0 peer "$pub1" endpoint 10.0.0.2:1
n2 ping -W 1 -c 5 -f 192.168.241.1
[[ $(n2 wg show wg0 endpoints) == "$pub1	10.0.0.2:1" ]]

n1 conntrack -L
n2 conntrack -L

==== output ====

[+] ip netns add wg-test-32269-1
[+] ip netns add wg-test-32269-2
[+] wg genkey
[+] wg genkey
[+] wg pubkey
[+] wg pubkey
[+] wg genpsk
[+] NS1: bash -c echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
[+] NS2: bash -c echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
[+] NS1: bash -c echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6
[+] NS2: bash -c echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6
[+] NS1: ip link add dev wg0 type wireguard
[+] NS2: ip link add dev wg0 type wireguard
[+] NS1: ip addr add 192.168.241.1/24 dev wg0
[+] NS2: ip addr add 192.168.241.2/24 dev wg0
[+] NS1: wg set wg0 private-key /dev/fd/63 listen-port 1 peer NNBvFmhApGEcgy8erS6bCLUi3+nRmg2mzV/xvek9PG0= preshared-key /dev/fd/62 allowed-ips 192.168.241.2/32,fd00::2/128
[+] NS2: wg set wg0 private-key /dev/fd/63 listen-port 2 peer nkdJlCF8z2+MH7aZV0FN9iO6UM+MUbPebADldwJmNRc= preshared-key /dev/fd/62 allowed-ips 192.168.241.1/32,fd00::1/128
[+] NS1: ip link set up dev wg0
[+] NS2: ip link set up dev wg0
[+] NS1: ip link add veth1 type veth peer name veth2
[+] NS1: ip link set veth2 netns wg-test-32269-2
[+] NS1: ip addr add 10.0.0.1/24 dev veth1
[+] NS1: ip addr add 10.0.0.2/24 dev veth1
[+] NS2: ip addr add 10.0.0.3/24 dev veth2
[+] NS1: ip link set veth1 up
[+] NS2: ip link set veth2 up
[+] NS1: wait for veth1 to come up
[+] NS2: wait for veth2 to come up
[+] NS1: iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
[+] NS2: iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
[+] NS2: wg set wg0 peer nkdJlCF8z2+MH7aZV0FN9iO6UM+MUbPebADldwJmNRc= endpoint 10.0.0.1:1
[+] NS2: ping -W 1 -c 5 -f 192.168.241.1
PING 192.168.241.1 (192.168.241.1) 56(84) bytes of data.

--- 192.168.241.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 1ms
rtt min/avg/max/mdev = 0.073/0.256/0.915/0.329 ms, ipg/ewma 0.339/0.575 ms
[+] NS2: wg show wg0 endpoints
[+] NS1: conntrack -L
icmp     1 29 src=192.168.241.2 dst=192.168.241.1 type=8 code=0 id=32322 src=192.168.241.1 dst=192.168.241.2 type=0 code=0 id=32322 mark=0 use=1
udp      17 179 src=10.0.0.3 dst=10.0.0.1 sport=2 dport=1 src=10.0.0.1 dst=10.0.0.3 sport=1 dport=2 [ASSURED] mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 2 flow entries have been shown.
[+] NS2: conntrack -L
udp      17 179 src=10.0.0.3 dst=10.0.0.1 sport=2 dport=1 src=10.0.0.1 dst=10.0.0.3 sport=1 dport=2 [ASSURED] mark=0 use=1
icmp     1 29 src=192.168.241.2 dst=192.168.241.1 type=8 code=0 id=32322 src=192.168.241.1 dst=192.168.241.2 type=0 code=0 id=32322 mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 2 flow entries have been shown.
[+] NS2: wg set wg0 peer nkdJlCF8z2+MH7aZV0FN9iO6UM+MUbPebADldwJmNRc= endpoint 10.0.0.2:1
[+] NS2: ping -W 1 -c 5 -f 192.168.241.1
PING 192.168.241.1 (192.168.241.1) 56(84) bytes of data.

--- 192.168.241.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.067/0.153/0.320/0.097 ms, ipg/ewma 0.205/0.172 ms
[+] NS2: wg show wg0 endpoints
[+] NS1: conntrack -L
udp      17 179 src=10.0.0.3 dst=10.0.0.2 sport=2 dport=1 src=10.0.0.2 dst=10.0.0.3 sport=1 dport=2 [ASSURED] mark=0 use=1
icmp     1 29 src=192.168.241.2 dst=192.168.241.1 type=8 code=0 id=32327 src=192.168.241.1 dst=192.168.241.2 type=0 code=0 id=32327 mark=0 use=1
icmp     1 29 src=192.168.241.2 dst=192.168.241.1 type=8 code=0 id=32322 src=192.168.241.1 dst=192.168.241.2 type=0 code=0 id=32322 mark=0 use=1
udp      17 179 src=10.0.0.3 dst=10.0.0.1 sport=2 dport=1 src=10.0.0.1 dst=10.0.0.3 sport=1 dport=2 [ASSURED] mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 4 flow entries have been shown.
[+] NS2: conntrack -L
icmp     1 29 src=192.168.241.2 dst=192.168.241.1 type=8 code=0 id=32327 src=192.168.241.1 dst=192.168.241.2 type=0 code=0 id=32327 mark=0 use=1
udp      17 179 src=10.0.0.3 dst=10.0.0.1 sport=2 dport=1 src=10.0.0.1 dst=10.0.0.3 sport=1 dport=2 [ASSURED] mark=0 use=1
icmp     1 29 src=192.168.241.2 dst=192.168.241.1 type=8 code=0 id=32322 src=192.168.241.1 dst=192.168.241.2 type=0 code=0 id=32322 mark=0 use=1
udp      17 179 src=10.0.0.3 dst=10.0.0.2 sport=2 dport=1 src=10.0.0.2 dst=10.0.0.3 sport=1 dport=2 [ASSURED] mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 4 flow entries have been shown.
[+] NS1: ip link del dev wg0
[+] NS2: ip link del dev wg0
[+] ip netns del wg-test-32269-1
[+] ip netns del wg-test-32269-2


* Re: multi-home difficulty
  2017-11-29 13:51         ` Jason A. Donenfeld
@ 2017-11-29 14:08           ` d tbsky
  2017-11-29 14:10             ` Jason A. Donenfeld
  0 siblings, 1 reply; 18+ messages in thread
From: d tbsky @ 2017-11-29 14:08 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

2017-11-29 21:51 GMT+08:00 Jason A. Donenfeld <Jason@zx2c4.com>:
> Hi,
>
> I made a small script in order to reproduce this issue, but I was not
> able to replicate the results. Would you spend some time with the below
> code tweaking it so that it exhibits the broken behavior you're seeing?
>
> Jason

Hi Jason:

    May I ask what kernel/distribution you are using? In my testing,
just adding an unused NIC shows the problem, so I think it's some kind
of race or conflict, or perhaps simply an RHEL 7.4 kernel bug. I don't
know how to modify the script to make the problem show up in your
environment. I will try installing a mainline kernel to see whether
the problem is the same, or I can build an RHEL 7.4 VM and let you SSH
in for testing, if you don't mind.

Regards,
tbskyd


* Re: multi-home difficulty
  2017-11-29 14:08           ` d tbsky
@ 2017-11-29 14:10             ` Jason A. Donenfeld
  2017-11-29 14:16               ` d tbsky
  0 siblings, 1 reply; 18+ messages in thread
From: Jason A. Donenfeld @ 2017-11-29 14:10 UTC (permalink / raw)
  To: d tbsky; +Cc: WireGuard mailing list

Hi tbskyd,

This is on 4.14.2. Would you confirm that this is an issue on your
kernel by actually _running that script and sending the output to the
list_? It would also be helpful to have the output of uname -a.

Jason


* Re: multi-home difficulty
  2017-11-29 14:10             ` Jason A. Donenfeld
@ 2017-11-29 14:16               ` d tbsky
  2017-11-29 14:49                 ` Jason A. Donenfeld
  0 siblings, 1 reply; 18+ messages in thread
From: d tbsky @ 2017-11-29 14:16 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

2017-11-29 22:10 GMT+08:00 Jason A. Donenfeld <Jason@zx2c4.com>:
> Hi tbskyd,
>
> This is on 4.14.2. Would you confirm that this is an issue on your
> kernel by actually _running that script and sending the output to the
> list_? It would also be helpful to have the output of uname -a.
>
> Jason

  Hi Jason:

     Sorry, I misunderstood you. You mean I should modify the script
and run it in my environment to reveal the problem?
OK, I will try to do that.


Regards,
tbskyd


* Re: multi-home difficulty
  2017-11-29 14:16               ` d tbsky
@ 2017-11-29 14:49                 ` Jason A. Donenfeld
  2017-11-30  6:15                   ` d tbsky
  2017-12-01  7:44                   ` d tbsky
  0 siblings, 2 replies; 18+ messages in thread
From: Jason A. Donenfeld @ 2017-11-29 14:49 UTC (permalink / raw)
  To: d tbsky; +Cc: WireGuard mailing list

On Wed, Nov 29, 2017 at 3:16 PM, d tbsky <tbskyd@gmail.com> wrote:
>      Sorry, I misunderstood you. You mean I should modify the script
> and run it in my environment to reveal the problem?
> OK, I will try to do that.

Take what I sent you. Run it. If it breaks, send me the output and
your kernel. If it doesn't break, mess with it until it breaks, and
then send it back to me.


* Re: multi-home difficulty
  2017-11-29 14:49                 ` Jason A. Donenfeld
@ 2017-11-30  6:15                   ` d tbsky
  2017-11-30  6:22                     ` d tbsky
  2017-12-01  7:44                   ` d tbsky
  1 sibling, 1 reply; 18+ messages in thread
From: d tbsky @ 2017-11-30  6:15 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

2017-11-29 22:49 GMT+08:00 Jason A. Donenfeld <Jason@zx2c4.com>:
> On Wed, Nov 29, 2017 at 3:16 PM, d tbsky <tbskyd@gmail.com> wrote:
>>      Sorry, I misunderstood you. You mean I should modify the script
>> and run it in my environment to reveal the problem?
>> OK, I will try to do that.
>
> Take what I sent you. Run it. If it breaks, send me the output and
> your kernel. If it doesn't break, mess with it until it breaks, and
> then send it back to me.

Hi Jason:

 "uname -a" result:

 Linux localhost.localdomain 3.10.0-693.5.2.el7.x86_64 #1 SMP Thu Oct
19 10:13:14 CDT 2017 x86_64 x86_64 x86_64 GNU/Linux

 Your original script runs fine in my environment.
 I added the following three lines before "ip1 link add veth1" to
reveal the problem:

ip1 link add dummy0 type dummy
ip1 addr add 10.0.0.10/24 dev dummy0
ip1 link set dummy0 up

===== whole script below ======
#!/bin/bash
set -e

exec 3>&1
export WG_HIDE_KEYS=never
netns1="wg-test-$$-1"
netns2="wg-test-$$-2"
pretty() { echo -e "\x1b[32m\x1b[1m[+] ${1:+NS$1: }${2}\x1b[0m" >&3; }
pp() { pretty "" "$*"; "$@"; }
maybe_exec() { if [[ $BASHPID -eq $$ ]]; then "$@"; else exec "$@"; fi; }
n1() { pretty 1 "$*"; maybe_exec ip netns exec $netns1 "$@"; }
n2() { pretty 2 "$*"; maybe_exec ip netns exec $netns2 "$@"; }
ip1() { pretty 1 "ip $*"; ip -n $netns1 "$@"; }
ip2() { pretty 2 "ip $*"; ip -n $netns2 "$@"; }
sleep() { read -t "$1" -N 0 || true; }
waitiface() { pretty "${1//*-}" "wait for $2 to come up"; ip netns exec "$1" bash -c "while [[ \$(< \"/sys/class/net/$2/operstate\") != up ]]; do read -t .1 -N 0 || true; done;"; }

cleanup() {
        set +e
        exec 2>/dev/null
        ip1 link del dev wg0
        ip2 link del dev wg0
        local to_kill="$(ip netns pids $netns1) $(ip netns pids $netns2)"
        [[ -n $to_kill ]] && kill $to_kill
        pp ip netns del $netns1
        pp ip netns del $netns2
        exit
}

trap cleanup EXIT

ip netns del $netns1 2>/dev/null || true
ip netns del $netns2 2>/dev/null || true
pp ip netns add $netns1
pp ip netns add $netns2

key1="$(pp wg genkey)"
key2="$(pp wg genkey)"
pub1="$(pp wg pubkey <<<"$key1")"
pub2="$(pp wg pubkey <<<"$key2")"
psk="$(pp wg genpsk)"
[[ -n $key1 && -n $key2 && -n $psk ]]

configure_peers() {
        ip1 addr add 192.168.241.1/24 dev wg0
        ip2 addr add 192.168.241.2/24 dev wg0

        n1 wg set wg0 \
                private-key <(echo "$key1") \
                listen-port 1 \
                peer "$pub2" \
                        preshared-key <(echo "$psk") \
                        allowed-ips 192.168.241.2/32,fd00::2/128
        n2 wg set wg0 \
                private-key <(echo "$key2") \
                listen-port 2 \
                peer "$pub1" \
                        preshared-key <(echo "$psk") \
                        allowed-ips 192.168.241.1/32,fd00::1/128

        ip1 link set up dev wg0
        ip2 link set up dev wg0
}

n1 bash -c 'echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6'
n2 bash -c 'echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6'
n1 bash -c 'echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6'
n2 bash -c 'echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6'

ip1 link add dev wg0 type wireguard
ip2 link add dev wg0 type wireguard
configure_peers

ip1 link add dummy0 type dummy
ip1 addr add 10.0.0.10/24 dev dummy0
ip1 link set dummy0 up

ip1 link add veth1 type veth peer name veth2
ip1 link set veth2 netns $netns2

ip1 addr add 10.0.0.1/24 dev veth1
ip1 addr add 10.0.0.2/24 dev veth1
ip2 addr add 10.0.0.3/24 dev veth2

ip1 link set veth1 up
ip2 link set veth2 up
waitiface $netns1 veth1
waitiface $netns2 veth2

n1 iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
n2 iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

n2 wg set wg0 peer "$pub1" endpoint 10.0.0.1:1
n2 ping -W 1 -c 5 -f 192.168.241.1
[[ $(n2 wg show wg0 endpoints) == "$pub1	10.0.0.1:1" ]]

n1 conntrack -L
n2 conntrack -L

n2 wg set wg0 peer "$pub1" endpoint 10.0.0.2:1
n2 ping -W 1 -c 5 -f 192.168.241.1
[[ $(n2 wg show wg0 endpoints) == "$pub1	10.0.0.2:1" ]]

n1 conntrack -L
n2 conntrack -L


* Re: multi-home difficulty
  2017-11-30  6:15                   ` d tbsky
@ 2017-11-30  6:22                     ` d tbsky
  2017-11-30  6:30                       ` d tbsky
  0 siblings, 1 reply; 18+ messages in thread
From: d tbsky @ 2017-11-30  6:22 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

[-- Attachment #1: Type: text/plain, Size: 1015 bytes --]

2017-11-30 14:15 GMT+08:00 d tbsky <tbskyd@gmail.com>:
> 2017-11-29 22:49 GMT+08:00 Jason A. Donenfeld <Jason@zx2c4.com>:
>> On Wed, Nov 29, 2017 at 3:16 PM, d tbsky <tbskyd@gmail.com> wrote:
>>>      Sorry, I misunderstood you. You mean I should modify the script
>>> and run it in my environment to reveal the problem?
>>> OK, I will try to do that.
>>
>> Take what I sent you. Run it. If it breaks, send me the output and
>> your kernel. If it doesn't break, mess with it until it breaks, and
>> then send it back to me.
>
> Hi Jason:
>
>  "uname -a" result:
>
>  Linux localhost.localdomain 3.10.0-693.5.2.el7.x86_64 #1 SMP Thu Oct
> 19 10:13:14 CDT 2017 x86_64 x86_64 x86_64 GNU/Linux
>
>  Your original script runs fine in my environment.
>  I added the following three lines before "ip1 link add veth1" to
> reveal the problem:
>
> ip1 link add dummy0 type dummy
> ip1 addr add 10.0.0.10/24 dev dummy0
> ip1 link set dummy0 up
>

Sorry, my fault: pasting into the email mangled the script's
whitespace (line wrapping and the literal tabs). The script is
attached as a file.

Regards,
tbskyd

[-- Attachment #2: final.sh --]
[-- Type: application/x-sh, Size: 3264 bytes --]


* Re: multi-home difficulty
  2017-11-30  6:22                     ` d tbsky
@ 2017-11-30  6:30                       ` d tbsky
  0 siblings, 0 replies; 18+ messages in thread
From: d tbsky @ 2017-11-30  6:30 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

2017-11-30 14:22 GMT+08:00 d tbsky <tbskyd@gmail.com>:
> 2017-11-30 14:15 GMT+08:00 d tbsky <tbskyd@gmail.com>:
>> 2017-11-29 22:49 GMT+08:00 Jason A. Donenfeld <Jason@zx2c4.com>:
>>> On Wed, Nov 29, 2017 at 3:16 PM, d tbsky <tbskyd@gmail.com> wrote:
>>>>      Sorry, I misunderstood you. You mean I should modify the script
>>>> and run it in my environment to reveal the problem?
>>>> OK, I will try to do that.
>>>
>>> Take what I sent you. Run it. If it breaks, send me the output and
>>> your kernel. If it doesn't break, mess with it until it breaks, and
>>> then send it back to me.
>>
>> Hi Jason:
>>
>>  "uname -a" result:
>>
>>  Linux localhost.localdomain 3.10.0-693.5.2.el7.x86_64 #1 SMP Thu Oct
>> 19 10:13:14 CDT 2017 x86_64 x86_64 x86_64 GNU/Linux
>>
>>  Your original script runs fine in my environment.
>>  I added the following three lines before "ip1 link add veth1" to
>> reveal the problem:
>>
>> ip1 link add dummy0 type dummy
>> ip1 addr add 10.0.0.10/24 dev dummy0
>> ip1 link set dummy0 up
>>
>
> Sorry, my fault: pasting into the email mangled the script's
> whitespace (line wrapping and the literal tabs). The script is
> attached as a file.
>
> Regards,
> tbskyd

Hi Jason:

   Sorry, the routing is not correct in that setup. I will add more
rules to see whether I can make a normal environment that reveals the
problem.


* Re: multi-home difficulty
  2017-11-29 14:49                 ` Jason A. Donenfeld
  2017-11-30  6:15                   ` d tbsky
@ 2017-12-01  7:44                   ` d tbsky
  2017-12-03 17:45                     ` d tbsky
  1 sibling, 1 reply; 18+ messages in thread
From: d tbsky @ 2017-12-01  7:44 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

2017-11-29 22:49 GMT+08:00 Jason A. Donenfeld <Jason@zx2c4.com>:
> On Wed, Nov 29, 2017 at 3:16 PM, d tbsky <tbskyd@gmail.com> wrote:
>>      Sorry, I misunderstood you. You mean I should modify the script
>> and run it in my environment to reveal the problem?
>> OK, I will try to do that.
>
> Take what I sent you. Run it. If it breaks, send me the output and
> your kernel. If it doesn't break, mess with it until it breaks, and
> then send it back to me.

Hi Jason:

      While testing in the netns environment, I saw something I have
never seen in the real world. The steps:

1. The client tries to connect to the multi-homed server.
2. Wait for the conntrack sessions to time out on both the client and
the server.
3. The server tries to connect to the client. The server uses the
source IP from step 1 to connect.

This means that in step 1, WireGuard remembers not only the client's
IP address but also its own source IP address, even though the source
address is not shown in the "wg wg0" user interface.
Is this assumption true? I have not seen this behavior in the real
world.
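
For completeness, this is how I watch step 3 on the wire (standard
tools; the device name is illustrative, and the flush just ensures no
stale conntrack entry masks the behavior):

# clear conntrack state, then watch which source address the server uses
conntrack -F
tcpdump -ni eth1 udp port 51820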

Regards,
tbskyd


* Re: multi-home difficulty
  2017-12-01  7:44                   ` d tbsky
@ 2017-12-03 17:45                     ` d tbsky
  0 siblings, 0 replies; 18+ messages in thread
From: d tbsky @ 2017-12-03 17:45 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

2017-12-01 15:44 GMT+08:00 d tbsky <tbskyd@gmail.com>:
> 2017-11-29 22:49 GMT+08:00 Jason A. Donenfeld <Jason@zx2c4.com>:
>> On Wed, Nov 29, 2017 at 3:16 PM, d tbsky <tbskyd@gmail.com> wrote:
>>>      Sorry, I misunderstood you. You mean I should modify the script
>>> and run it in my environment to reveal the problem?
>>> OK, I will try to do that.
>>
>> Take what I sent you. Run it. If it breaks, send me the output and
>> your kernel. If it doesn't break, mess with it until it breaks, and
>> then send it back to me.

Hi Jason:

     Sorry for bothering you again. I still cannot find the key point.
My testing environment is RHEL 7.4; I have tried kernels 3.10, 4.4,
and 4.14, with WireGuard 0.0.20171111 and 0.0.20171127.

I have three things in mind:

1. When WireGuard communication is established, it remembers its own
source IP (although "wg wg0" doesn't show it) until the next time it
changes. I don't know whether this assumption is true; could you tell
me? I also don't know whether this is a WireGuard feature or a netns
feature.

2. I built a three-netns environment to emulate a multi-homed client,
a multi-homed server, and a router between them. WireGuard works
perfectly in the netns environment.

3. In the real world the situation is strange. As I said last time, a
simple VM with two NICs (on the same host bridge) reveals the problem.
My VM looks like this:

# ip addr show

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
state UP qlen 1000
    link/ether 52:54:00:ff:29:75 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
state UP qlen 1000
    link/ether 52:54:00:31:d3:1a brd ff:ff:ff:ff:ff:ff
    inet 10.99.1.99/24 scope global eth1
       valid_lft forever preferred_lft forever
    inet 10.99.1.100/24 scope global secondary eth1
       valid_lft forever preferred_lft forever

This is the simplest config I could find that reveals the problem.
The following situations do not show the problem:
1. a single NIC
2. two NICs, but with the IPs bound to the first NIC
3. two NICs, but with the first NIC's state "down" instead of "up"

The problem is the same under kernels 3.10, 4.4, and 4.14: when the
client connects to the server IP "10.99.1.100", the server replies
from "10.99.1.99". It is a real puzzle to me, but maybe you will see
why immediately once you have the environment.
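
One detail that may matter: 10.99.1.100 is flagged "secondary" in the
"ip addr show" output above, and as far as I understand, for a socket
that is not bound to an address, the kernel's IPv4 source address
selection prefers the primary address. The lookup can be checked like
this (schematic output; the client address is the one from my earlier
conntrack trace):

# ask the kernel which source address it would pick for the client
ip route get 10.99.20.254
# expected output, roughly:
#   10.99.20.254 via 10.99.1.254 dev eth1 src 10.99.1.99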

Thanks a lot for your patience.

Regards,
tbskyd

