From: Florian Westphal <fw@strlen.de>
To: <netdev@vger.kernel.org>
Cc: aconole@redhat.com, sbrivio@redhat.com, Florian Westphal <fw@strlen.de>
Subject: [PATCH net-next 1/3] udp_tunnel: allow to turn off path mtu discovery on encap sockets
Date: Sun, 12 Jul 2020 22:07:03 +0200
Message-ID: <20200712200705.9796-2-fw@strlen.de>
In-Reply-To: <20200712200705.9796-1-fw@strlen.de>

vxlan and geneve take the to-be-transmitted skb, prepend the
encapsulation header and send the result.

Neither vxlan nor geneve can do anything about a lowered path mtu
except notifying the peer/upper dst entry.
In routed setups, vxlan takes the updated pmtu from the encap sockets'
dst entry and will notify/update the dst entry of the current skb.

Some setups, however, use vxlan as a bridge port (or openvswitch vport).

In both cases, no upper dst entry exists.

Without this patch:

1. The client sends x bytes, where x == MTU of the vxlan/geneve interface.
2. The encap header is prepended and the encap packet is passed to
   ip_output.
3. If the sk received a pmtu error in the meantime, then ip_output
   will fetch the mtu from the encap socket instead of dev->mtu.
4. ip_output emits an ICMP error to the encap socket.

Step 4 prevents the route exception from timing out, and the setup
remains in a state where the upper layer cannot send MTU-sized packets,
even though the encapsulated packet doesn't exceed the link MTU.

It appears best to configure the encap socket to never learn about path
MTU in these setups.
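
As a rough userspace analogue (not part of this patch): the per-socket
mode that ip_sock_set_mtu_discover() sets in the kernel is, as far as I
understand it, the same one that the IP_MTU_DISCOVER socket option
controls. The sketch below sets a mode on a plain UDP socket and reads it
back. The helper names open_udp_with_pmtudisc() and get_pmtudisc() are
made up for illustration, and the fallback define assumes the Linux UAPI
value for IP_PMTUDISC_OMIT.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fallback for older libc headers; 5 is assumed from the Linux UAPI. */
#ifndef IP_PMTUDISC_OMIT
#define IP_PMTUDISC_OMIT 5
#endif

/*
 * Hypothetical helper: open an IPv4 UDP socket and apply a path MTU
 * discovery mode to it. Returns the fd, or -1 on failure.
 */
static int open_udp_with_pmtudisc(int mode)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER,
		       &mode, sizeof(mode)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: read back the mode set on fd, or -1 on failure. */
static int get_pmtudisc(int fd)
{
	int mode = -1;
	socklen_t len = sizeof(mode);

	if (getsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, &len) < 0)
		return -1;
	return mode;
}
```

A caller would pass IP_PMTUDISC_DONT or IP_PMTUDISC_OMIT, the two modes
the BUILD_BUG_ON checks in this patch cover.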

The next patch adds the VXLAN configuration needed to use this.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/ipv6.h         | 7 +++++++
 include/net/udp_tunnel.h   | 2 ++
 net/ipv4/udp_tunnel_core.c | 2 ++
 net/ipv6/ip6_udp_tunnel.c  | 7 +++++++
 4 files changed, 18 insertions(+)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 5e65bf2fd32d..fa8e546546e3 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1195,6 +1195,13 @@ static inline void ip6_sock_set_recverr(struct sock *sk)
 	release_sock(sk);
 }
 
+static inline void ip6_sock_set_mtu_discover(struct sock *sk, int val)
+{
+	lock_sock(sk);
+	inet6_sk(sk)->pmtudisc = val;
+	release_sock(sk);
+}
+
 static inline int __ip6_sock_set_addr_preferences(struct sock *sk, int val)
 {
 	unsigned int pref = 0;
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index dd20ce99740c..f02be73bdae1 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -34,6 +34,8 @@ struct udp_port_cfg {
 	unsigned int		use_udp_checksums:1,
 				use_udp6_tx_checksums:1,
 				use_udp6_rx_checksums:1,
+				ip_pmtudisc:1,
+				ip_pmtudiscv:3,
 				ipv6_v6only:1;
 };
 
diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c
index 3eecba0874aa..1d20bd5b72ac 100644
--- a/net/ipv4/udp_tunnel_core.c
+++ b/net/ipv4/udp_tunnel_core.c
@@ -26,6 +26,8 @@ int udp_sock_create4(struct net *net, struct udp_port_cfg *cfg,
 		if (err < 0)
 			goto error;
 	}
+	if (cfg->ip_pmtudisc)
+		ip_sock_set_mtu_discover(sock->sk, cfg->ip_pmtudiscv);
 
 	udp_addr.sin_family = AF_INET;
 	udp_addr.sin_addr = cfg->local_ip;
diff --git a/net/ipv6/ip6_udp_tunnel.c b/net/ipv6/ip6_udp_tunnel.c
index cdc4d4ee2420..63c22252a76f 100644
--- a/net/ipv6/ip6_udp_tunnel.c
+++ b/net/ipv6/ip6_udp_tunnel.c
@@ -34,6 +34,13 @@ int udp_sock_create6(struct net *net, struct udp_port_cfg *cfg,
 		if (err < 0)
 			goto error;
 	}
+	if (cfg->ip_pmtudisc) {
+		BUILD_BUG_ON(IP_PMTUDISC_DONT != IPV6_PMTUDISC_DONT);
+		BUILD_BUG_ON(IP_PMTUDISC_OMIT != IPV6_PMTUDISC_OMIT);
+
+		ip_sock_set_mtu_discover(sock->sk, cfg->ip_pmtudiscv);
+		ip6_sock_set_mtu_discover(sock->sk, cfg->ip_pmtudiscv);
+	}
 
 	udp6_addr.sin6_family = AF_INET6;
 	memcpy(&udp6_addr.sin6_addr, &cfg->local_ip6,
-- 
2.26.2
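
For readers who want to check the two assumptions the patch relies on
outside the kernel tree, here is a minimal userspace sketch. It uses
_Static_assert as a stand-in for BUILD_BUG_ON, and a hypothetical
struct pmtudisc_cfg that mirrors only the two bitfields added to
struct udp_port_cfg. The fallback defines assume the Linux UAPI values.

```c
#include <netinet/in.h>

/* Fallbacks for older libc headers; values assumed from the Linux UAPI. */
#ifndef IP_PMTUDISC_OMIT
#define IP_PMTUDISC_OMIT 5
#endif
#ifndef IPV6_PMTUDISC_OMIT
#define IPV6_PMTUDISC_OMIT 5
#endif

/*
 * Same check as the BUILD_BUG_ON lines in udp_sock_create6(): one value
 * can be passed to both the IPv4 and the IPv6 setter only if the
 * constants agree.
 */
_Static_assert(IP_PMTUDISC_DONT == IPV6_PMTUDISC_DONT,
	       "IPv4/IPv6 DONT values differ");
_Static_assert(IP_PMTUDISC_OMIT == IPV6_PMTUDISC_OMIT,
	       "IPv4/IPv6 OMIT values differ");

/* Hypothetical mirror of the two new bitfields in struct udp_port_cfg. */
struct pmtudisc_cfg {
	unsigned int ip_pmtudisc:1,	/* set: apply ip_pmtudiscv */
		     ip_pmtudiscv:3;	/* mode; three bits hold 0..7 */
};

/* Store a mode in the 3-bit field and return what the field holds. */
static int roundtrip_mode(int mode)
{
	struct pmtudisc_cfg cfg = { .ip_pmtudisc = 1 };

	cfg.ip_pmtudiscv = (unsigned int)mode;
	return (int)cfg.ip_pmtudiscv;
}
```

Since the largest mode, IP_PMTUDISC_OMIT, is 5, it survives the round
trip through the 3-bit field unchanged.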

