* [WireGuard] NAT-T Keepalives
@ 2016-07-07 16:33 Jason A. Donenfeld
  2016-07-07 16:58 ` Baptiste Jonglez
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Jason A. Donenfeld @ 2016-07-07 16:33 UTC (permalink / raw)
  To: WireGuard mailing list

Hi folks,

WireGuard is designed to be as silent as possible. We have
opportunistic encrypted keepalives, sent only when data has been
received and there is nothing to send in response, to detect if the
link is dead. This system works quite well.

However, with NAT, a mapping times out after a while. This is fine for
the case in which the system behind NAT connects up to the server,
after a period of inactivity, since then there will simply be a new
NAT mapping and roaming support takes care of it. However, if the NAT
mapping expires, and the server wants to send a packet to the client
behind NAT, then we're in trouble. This is a common setup too, when
people want to keep their PCs at home accessible via their VPN.

The most bootleg solution for this is to just run "ping $server" from
userspace. What a disgusting fix.

It seems evident that, like every other UDP protocol on the planet,
WireGuard needs to support what I'm calling "persistent keepalives". I
call it persistent to distinguish it from the encrypted ones which are
opportunistic. There are several other important differences:

a) The persistent keepalive does not need an active session and does
not need to send any encrypted data. It simply is a UDP packet to the
endpoint. The payload doesn't matter for the purpose of just keeping
the NAT mapping alive.
b) The persistent keepalive is optional. It is a configuration option
in seconds. "0" means off. "60" means send once per minute. And so
forth. By default it is off.
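
For illustration only, here is a minimal Python sketch of the proposal as
described in (a) and (b): a timer that fires a bare UDP datagram at the
endpoint, with an interval of 0 meaning off. The function name, its
parameters, and the `count` limit are made up for the example. It is not
WireGuard code, and the payload defaults to zero length only as one of the
options under discussion.

```python
import socket
import time


def send_persistent_keepalives(endpoint, interval, count, payload=b""):
    """Send `count` UDP datagrams to `endpoint`, one every `interval` seconds.

    An interval of 0 (or less) means the feature is off and nothing is sent.
    Returns the number of datagrams sent. All names here are hypothetical.
    """
    if interval <= 0:
        return 0
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for i in range(count):
            # No session and no encryption: the datagram only has to
            # pass through the NAT to refresh its mapping.
            sock.sendto(payload, endpoint)
            sent += 1
            if i + 1 < count:
                time.sleep(interval)
    return sent
```

A real implementation would run until the interface goes down rather than
for a fixed `count`; the limit is there only so the sketch terminates.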

So, now there are several things to decide:

1. What should the payload be? Should it be a single fixed byte? Or
should it be a zero length UDP packet?
2. What is an acceptable minimum interval? Every 5 seconds?
3. What is an acceptable maximum interval? 3600 seconds?
4. What is a good interval to show in documentation examples that will
work for most people?
5. Is there a good resource for real world NAT mapping timings found
in the wild?

After this feature is ironed out, I'll be pushing a new experimental
snapshot. This is currently the most visible headache of WireGuard and
I'd like to get it ironed out sooner rather than later.

What are your thoughts?

Thanks,
Jason


* Re: [WireGuard] NAT-T Keepalives
  2016-07-07 16:33 [WireGuard] NAT-T Keepalives Jason A. Donenfeld
@ 2016-07-07 16:58 ` Baptiste Jonglez
  2016-07-07 17:15   ` Jason A. Donenfeld
  2016-07-07 17:57 ` Bruno Wolff III
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Baptiste Jonglez @ 2016-07-07 16:58 UTC (permalink / raw)
  To: wireguard

Just a small note: this would not be useful just for NAT, but also for
stateful firewalls.  It is not uncommon to have a stateful firewall
directly on end hosts (I think Fedora does this by default?), and
keepalives would prevent the mapping there from expiring.

The same goes for IPv6: most home routers have a stateful firewall by
default, even though they don't do IPv6 NAT (hopefully).

On Thu, Jul 07, 2016 at 06:33:11PM +0200, Jason A. Donenfeld wrote:
> It seems evident that, like every other UDP protocol on the planet,
> WireGuard needs to support what I'm calling "persistent keepalives". I
> call it persistent to distinguish it from the encrypted ones which are
> opportunistic. There are several other important differences:
> 
> a) The persistent keepalive does not need an active session and does
> not need to send any encrypted data. It simply is a UDP packet to the
> endpoint. The payload doesn't matter for the purpose of just keeping
> the NAT mapping alive.

Why not simply use the same technique as the opportunistic keepalives, then?
(encrypted payload)

One small advantage over the "empty UDP packet" method is that it will
also refresh the "latest handshake" timer shown in wg (or whatever GUI
people will build on top of wireguard).  From a user perspective, it's
nice to know that the VPN is still alive.

> b) The persistent keepalive is optional. It is a configuration option
> in seconds. "0" means off. "60" means send once per minute. And so
> forth. By default it is off.
> 
> So, now there are several things to decide:
> 
> 1. What should the payload be? Should it be a single fixed byte? Or
> should it be a zero length UDP packet?

I wouldn't be surprised if some middleboxes dropped zero-length UDP
packets, but I don't have any data...

> 2. What is an acceptable minimum interval? Every 5 seconds?
> 3. What is an acceptable maximum interval? 3600 seconds?
> 4. What is a good interval to show in documentation examples that will
> work for most people?

30 seconds?

> 5. Is there a good resource for real world NAT mapping timings found
> in the wild?
> 
> After this feature is ironed out, I'll be pushing a new experimental
> snapshot. This is currently the most visible headache of WireGuard and
> I'd like to get it ironed out sooner rather than later.
> 
> What are your thoughts?
> 
> Thanks,
> Jason


* Re: [WireGuard] NAT-T Keepalives
  2016-07-07 16:58 ` Baptiste Jonglez
@ 2016-07-07 17:15   ` Jason A. Donenfeld
  2016-07-07 17:43     ` Alex Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Jason A. Donenfeld @ 2016-07-07 17:15 UTC (permalink / raw)
  To: Baptiste Jonglez; +Cc: wireguard

On Jul 7, 2016 6:58 PM, "Baptiste Jonglez" <baptiste@bitsofnetworks.org>
wrote:
> Why not simply use the same technique as the opportunistic keepalives, then?
> (encrypted payload)

Because when persistent-keepalive-interval > keylifetime (2 minutes), this
results in a new handshake, causing 3 packets instead of 1.

Also, why waste crypto cycles when you don't have to? If we're just trying
to appease firewalls and NATs, then let's leave the problem at that level,
not let it infect other layers.

>
> One small advantage over the "empty UDP packet" method is that it will
> also refresh the "latest handshake" timer shown in wg (or whatever GUI
> people will build on top of wireguard).  From a user perspective, it's
> nice to know that the VPN is still alive.

No way josé. (In fact I wouldn't mind removing the beloved latest handshake
field.) From the perspective of the sysadmin, wireguard must appear
stateless. Fundamental design goal.

> [0 len could be bad]
> 30 seconds?

I have the same set of speculations and concerns, but it's mostly just
imaginary without real data, as you said. Maybe I'll write into NANOG?


* Re: [WireGuard] NAT-T Keepalives
  2016-07-07 17:15   ` Jason A. Donenfeld
@ 2016-07-07 17:43     ` Alex Xu
  2016-07-07 17:44       ` Jason A. Donenfeld
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Xu @ 2016-07-07 17:43 UTC (permalink / raw)
  To: wireguard

On Thu, 7 Jul 2016 19:15:16 +0200
"Jason A. Donenfeld" <Jason@zx2c4.com> wrote as excerpted:

> On Jul 7, 2016 6:58 PM, "Baptiste Jonglez"
> <baptiste@bitsofnetworks.org> wrote:
> > [0 len could be bad]
> > 30 seconds?  
> 
> I have the same set of speculations and concerns, but it's mostly just
> imaginary without real data, as you said. Maybe I'll write into NANOG?

See RFC 4787.


* Re: [WireGuard] NAT-T Keepalives
  2016-07-07 17:43     ` Alex Xu
@ 2016-07-07 17:44       ` Jason A. Donenfeld
  0 siblings, 0 replies; 11+ messages in thread
From: Jason A. Donenfeld @ 2016-07-07 17:44 UTC (permalink / raw)
  To: Alex Xu; +Cc: WireGuard mailing list

I was just reading that actually.

https://tools.ietf.org/html/rfc4787
http://www.dcs.gla.ac.uk/publications/PAPERS/9347/2010-hgw-study.pdf
https://www.ietf.org/proceedings/78/slides/behave-8.pdf

Trying to get a good sampling of various real world routers.


* Re: [WireGuard] NAT-T Keepalives
  2016-07-07 16:33 [WireGuard] NAT-T Keepalives Jason A. Donenfeld
  2016-07-07 16:58 ` Baptiste Jonglez
@ 2016-07-07 17:57 ` Bruno Wolff III
  2016-07-08  0:55 ` Jason A. Donenfeld
  2016-07-14 10:55 ` Guus Sliepen
  3 siblings, 0 replies; 11+ messages in thread
From: Bruno Wolff III @ 2016-07-07 17:57 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

On Thu, Jul 07, 2016 at 18:33:11 +0200,
  "Jason A. Donenfeld" <Jason@zx2c4.com> wrote:
>
>The most bootleg solution for this is to just run "ping $server" from
>userspace. What a disgusting fix.

This also forces encrypted traffic, which your solution would avoid.

>1. What should the payload be? Should it be a single fixed byte? Or
>should it be a zero length UDP packet?

The packet doesn't even need to make it to the endpoint, just through 
the NAT or firewall. So you don't want something that would get blocked 
by those. I don't know of anything that would be likely to. Otherwise 
you probably want minimum resources expended by the end points producing 
and receiving the packet.

>5. Is there a good resource for real world NAT mapping timings found
>in the wild?

Our NAT defaults to one minute for random UDP ports. So 30 seconds seems 
like a reasonable time if there isn't much packet loss.

>What are your thoughts?

It sounds like a nice improvement.


* Re: [WireGuard] NAT-T Keepalives
  2016-07-07 16:33 [WireGuard] NAT-T Keepalives Jason A. Donenfeld
  2016-07-07 16:58 ` Baptiste Jonglez
  2016-07-07 17:57 ` Bruno Wolff III
@ 2016-07-08  0:55 ` Jason A. Donenfeld
  2016-07-08 11:49   ` Jason A. Donenfeld
  2016-07-14 10:55 ` Guus Sliepen
  3 siblings, 1 reply; 11+ messages in thread
From: Jason A. Donenfeld @ 2016-07-08  0:55 UTC (permalink / raw)
  To: WireGuard mailing list

On Thu, Jul 7, 2016 at 6:33 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> 1. What should the payload be? Should it be a single fixed byte? Or
> should it be a zero length UDP packet?

Zero length.

> 2. What is an acceptable minimum interval? Every 5 seconds?

Every 10 seconds, so that we can only push the timer back on sending,
and then rely on the opportunistic keepalive for making things
coherent.

> 3. What is an acceptable maximum interval? 3600 seconds?

3600 seconds.

> 4. What is a good interval to show in documentation examples that will
> work for most people?

25 seconds, based on a massive survey of different routing equipment
in the wild.
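
As a rough sketch, the answers above amount to the following check on the
configured value. The constant and function names are hypothetical, and
the bounds are simply the ones given in this message, not a statement
about what any released version enforces.

```python
MIN_INTERVAL = 10        # seconds, minimum proposed in this message
MAX_INTERVAL = 3600      # seconds, maximum proposed in this message
SUGGESTED_INTERVAL = 25  # seconds, value suggested for documentation


def parse_persistent_keepalive(value):
    """Return the interval in seconds, or None when the feature is off.

    "0" means off; anything else must lie within the proposed bounds.
    """
    seconds = int(value)
    if seconds == 0:
        return None
    if not MIN_INTERVAL <= seconds <= MAX_INTERVAL:
        raise ValueError(
            f"interval must be 0 or between {MIN_INTERVAL} and "
            f"{MAX_INTERVAL} seconds, got {seconds}"
        )
    return seconds
```

With these bounds, `parse_persistent_keepalive("25")` is accepted, while
"5" and "3601" are rejected.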

> After this feature is ironed out, I'll be pushing a new experimental
> snapshot. This is currently the most visible headache of WireGuard and
> I'd like to get it ironed out sooner rather than later.

Ongoing work lives in this branch, which I'll merge soon:
https://git.zx2c4.com/WireGuard/log/?h=persistent-keepalive


* Re: [WireGuard] NAT-T Keepalives
  2016-07-08  0:55 ` Jason A. Donenfeld
@ 2016-07-08 11:49   ` Jason A. Donenfeld
  0 siblings, 0 replies; 11+ messages in thread
From: Jason A. Donenfeld @ 2016-07-08 11:49 UTC (permalink / raw)
  To: WireGuard mailing list

I've merged this with snapshot experimental-0.0.20160708. Give it a
try and let me know how it works.

Thanks,
Jason


* Re: [WireGuard] NAT-T Keepalives
  2016-07-07 16:33 [WireGuard] NAT-T Keepalives Jason A. Donenfeld
                   ` (2 preceding siblings ...)
  2016-07-08  0:55 ` Jason A. Donenfeld
@ 2016-07-14 10:55 ` Guus Sliepen
  2016-07-15 12:07   ` Jason A. Donenfeld
  3 siblings, 1 reply; 11+ messages in thread
From: Guus Sliepen @ 2016-07-14 10:55 UTC (permalink / raw)
  To: wireguard

Some insights learned from tinc:

On Thu, Jul 07, 2016 at 06:33:11PM +0200, Jason A. Donenfeld wrote:

> a) The persistent keepalive does not need an active session and does
> not need to send any encrypted data. It simply is a UDP packet to the
> endpoint. The payload doesn't matter for the purpose of just keeping
> the NAT mapping alive.

Indeed.

> 1. What should the payload be? Should it be a single fixed byte? Or
> should it be a zero length UDP packet?

A zero-length UDP packet should be fine, although it might upset some
OSes or firewalls.

Another issue that tinc deals with is path MTU discovery. It combines
this with the heartbeat packets. While a zero-length UDP packet is
enough to keep a NAT mapping alive, the actual path between two peers
might change, and that also changes the path MTU. AFAIK WireGuard
doesn't care about this, but in case you (start to) do, you want to send
packets with the discovered MTU and perhaps a slightly bigger one too,
once in a while, to check whether the PMTU changed.

Discovering the PMTU between two peers and enforcing this inside the
tunnel helps prevent fragmentation of the outer UDP packets. This
improves performance and sometimes it's just necessary because there are
firewalls out there that block fragments.

> 2. What is an acceptable minimum interval? Every 5 seconds?
> 3. What is an acceptable maximum interval? 3600 seconds?
> 4. What is a good interval to show in documentation examples that will
> work for most people?

If you want to keep alive a NAT mapping, then experience tells me 10
seconds is something that works for virtually all NAT devices. Once you
start to go over 10 seconds, you will find there are those that will
drop the mappings. There are RFCs which tell you how a NAT device should
behave (RFC 4787 and 7857), but it's hard to find devices that follow
all these requirements. The recommended timeout for NAT devices is 5
minutes. I'm quite sure a 3600 second interval is useless in practice.

-- 
Met vriendelijke groet / with kind regards,
     Guus Sliepen <guus@tinc-vpn.org>


* Re: [WireGuard] NAT-T Keepalives
  2016-07-14 10:55 ` Guus Sliepen
@ 2016-07-15 12:07   ` Jason A. Donenfeld
  2016-07-15 15:19     ` Guus Sliepen
  0 siblings, 1 reply; 11+ messages in thread
From: Jason A. Donenfeld @ 2016-07-15 12:07 UTC (permalink / raw)
  To: Guus Sliepen; +Cc: WireGuard mailing list

Hey Guus,

On Thu, Jul 14, 2016 at 12:55 PM, Guus Sliepen <guus@tinc-vpn.org> wrote:
> Some insights learned from tinc:

Thanks very much for these!

>
> On Thu, Jul 07, 2016 at 06:33:11PM +0200, Jason A. Donenfeld wrote:
>
>> a) The persistent keepalive does not need an active session and does
>> not need to send any encrypted data. It simply is a UDP packet to the
>> endpoint. The payload doesn't matter for the purpose of just keeping
>> the NAT mapping alive.
>
> Indeed.
>
>> 1. What should the payload be? Should it be a single fixed byte? Or
>> should it be a zero length UDP packet?
>
> A zero-length UDP packet should be fine, although it might upset some
> OSes or firewalls.

In fact, we wound up switching to an encrypted keepalive, so that it
would work nicely with roaming and endpoint discovery. Now, setting
persistent-keepalive mode on will ensure that wireguard remains
"connected", while its default is to "go to sleep".


>
> Another issue that tinc deals with is path MTU discovery. It combines
> this with the heartbeat packets. While a zero-length UDP packet is
> enough to keep a NAT mapping alive, the actual path between two peers
> might change, and that also changes the path MTU. AFAIK WireGuard
> doesn't care about this, but in case you (start to) do, you want to send
> packets with the discovered MTU and perhaps a slightly bigger one too,
> once in a while, to check whether the PMTU changed.
>
> Discovering the PMTU between two peers and enforcing this inside the
> tunnel helps prevent fragmentation of the outer UDP packets. This
> improves performance and sometimes it's just necessary because there are
> firewalls out there that block fragments.

Doesn't the Linux kernel already support PMTU discovery with the usual
ICMP notifications, unless you turn it off with the sysctl knob? Have
you experimented at all with how this discovery trickles down to tinc?
I wonder if, since I'm inside the kernel, I'd have an even closer way
of integrating with the already existing mechanisms.


> If you want to keep alive a NAT mapping, then experience tells me 10
> seconds is something that works for virtually all NAT devices. Once you
> start to go over 10 seconds, you will find there are those that will
> drop the mappings. There are RFCs which tell you how a NAT device should
> behave (RFC 4787 and 7857), but it's hard to find devices that follow
> all these requirements. The recommended timeout for NAT devices is 5
> minutes. I'm quite sure a 3600 second interval is useless in practice.

Yea, I read those RFCs and then promptly found out nobody follows
them. What a disappointment.

In practice, have you seen any devices that are worse than 30 seconds?
That's about the lowest I saw.


* Re: [WireGuard] NAT-T Keepalives
  2016-07-15 12:07   ` Jason A. Donenfeld
@ 2016-07-15 15:19     ` Guus Sliepen
  0 siblings, 0 replies; 11+ messages in thread
From: Guus Sliepen @ 2016-07-15 15:19 UTC (permalink / raw)
  To: WireGuard mailing list

On Fri, Jul 15, 2016 at 02:07:54PM +0200, Jason A. Donenfeld wrote:

> > A zero-length UDP packet should be fine, although it might upset some
> > OSes or firewalls.
> 
> In fact, we wound up switching to an encrypted keepalive, so that it
> would work nicely with roaming and endpoint discovery. Now, setting
> persistent-keepalive mode on will ensure that wireguard remains
> "connected", while its default is to "go to sleep".

Great!

> > Discovering the PMTU between two peers and enforcing this inside the
> > tunnel helps prevent fragmentation of the outer UDP packets. This
> > improves performance and sometimes it's just necessary because there are
> > firewalls out there that block fragments.
> 
> Doesn't the Linux kernel already support PMTU discovery with the usual
> ICMP notifications, unless you turn it off with the sysctl knob? Have
> you experimented at all with how this discovery trickles down to tinc?
> I wonder if, since I'm inside the kernel, I'd have an even closer way
> of integrating with the already existing mechanisms.

The kernel does a limited form of PMTU discovery. It assumes the PMTU
is the same as the local network interface's MTU. When the real PMTU is
smaller, its only form of discovery is by receiving ICMP Fragmentation
needed/Packet too big messages from somewhere along the path. It can be
that these ICMP packets are blocked by firewalls along the return path.

If it receives those ICMP packets, then the next time it sends a packet
to the peer with the Don't Fragment bit set, and the packet is bigger
than the PMTU, the send() call will fail, and then you have to do some
other query to find out what the current idea of the PMTU is. Doing this
in the kernel is probably easier than in userspace, where there is no
easy, cross-platform way to get the PMTU for a given destination from an
unconnected UDP socket.

So when the send() call fails because of the PMTU, you have to generate
your own ICMP Fragmentation needed/Packet too big packet inside the
tunnel.

If you don't receive ICMP packets telling you your packets are too big,
then you have a problem. The kernel doesn't do any kind of proactive
PMTU discovery, it only reacts to those ICMP packets. What typically
happens is that if you make a TCP connection via your VPN, the initial
connection works, and as long as you don't send a lot of data at a time,
it keeps working. But as soon as you send a lot of data, the packets
will be larger than the PMTU and they get dropped without notice. The
connection then hangs indefinitely. Typically, if you log in to a remote
machine via SSH over the VPN, the connection works, and you get a login
prompt. Some commands work fine, but if you do for example "ls -lR
/usr", it will hang.

If you don't set the DF bit on the outer UDP packets, then things work
fine until you have a firewall blocking fragments along the way.

I gave a talk about this at FOSDEM in 2010:

https://tinc-vpn.org/presentations/fosdem-2010/tinc_fosdem2010_slides.pdf

I don't know what WireGuard should do. Do you want it to be very robust
or a low-level thing that people might need to tweak (ie, setting the
MTU of the wireguard interface manually)? I think the best solution is
that you keep the kernel code as simple as possible, and have a
userspace daemon take over tasks that don't require high performance. I
believe it is only necessary to have the kernel handle packets from
known, already authenticated peers. Everything it cannot handle, have
the userspace daemon deal with.

This daemon could then also do proactive PMTU discovery between peers.
Basically, tinc does this at the start of a connection by regularly
sending packets with a random size between (lower_pmtu, upper_pmtu),
where lower_pmtu is the biggest packet that has successfully been sent
and received, and upper_pmtu starts at the interface MTU, and is lowered
whenever an ICMP Fragmentation needed/Packet too big packet is received,
until the lower and upper bounds converge or a timeout occurs (after
which the lower_pmtu is used as the actual PMTU).
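
The probing loop described above could be sketched roughly as follows in
Python. This is an illustration under stated assumptions, not tinc's
actual code: the `probe` callback, the `max_probes` limit standing in for
the timeout, and the starting value of 0 for the lower bound are all
hypothetical.

```python
import random


def discover_pmtu(probe, interface_mtu, max_probes=64, rng=None):
    """Estimate the path MTU by probing with random packet sizes.

    `probe(size)` is a hypothetical callback that sends one probe of
    `size` bytes and returns a pair (acked, icmp_mtu): `acked` is True if
    the peer echoed the probe, and `icmp_mtu` is the MTU reported by an
    ICMP "Fragmentation needed/Packet too big" message, or None if no
    such message arrived.
    """
    rng = rng or random.Random()
    lower = 0              # biggest size confirmed to get through so far
    upper = interface_mtu  # starts at the local interface MTU
    for _ in range(max_probes):  # the probe limit stands in for the timeout
        if lower >= upper:       # the bounds have converged
            break
        size = rng.randint(lower + 1, upper)
        acked, icmp_mtu = probe(size)
        if acked:
            lower = size
        if icmp_mtu is not None:
            upper = min(upper, icmp_mtu)
    # On convergence or timeout, the lower bound is used as the PMTU.
    return lower
```

When ICMP messages are blocked, `upper` never moves and the loop only
stops at the probe limit, which is why the lower bound (the largest size
actually confirmed) is what gets returned.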

> In practice, have you seen any devices that are worse than 30 seconds?
> That's about the lowest I saw.

Yes; several people have reported issues with UDP connectivity, and upon
closer inspection the culprit was a NAT or stateful firewall that had a
timeout of less than 30 seconds. A heartbeat interval of 10 seconds
worked, whereas longer intervals resulted in lost UDP mappings.

I've also personally had a broadband router that did have very
reasonable timeouts, but it could only remember a very small number of
mappings. As soon as you started something that created lots of
connections (say, a torrent), it would cause it to lose other mappings
that did not have regular traffic.

-- 
Met vriendelijke groet / with kind regards,
     Guus Sliepen <guus@tinc-vpn.org>
