netdev.vger.kernel.org archive mirror
From: "Maciej Żenczykowski" <zenczykowski@gmail.com>
To: "Maciej Żenczykowski" <maze@google.com>,
	"David S . Miller" <davem@davemloft.net>
Cc: Linux Network Development Mailing List <netdev@vger.kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	Willem de Bruijn <willemb@google.com>,
	Lorenzo Colitti <lorenzo@google.com>,
	Sunmeet Gill <sgill@quicinc.com>,
	Vinay Paradkar <vparadka@qti.qualcomm.com>,
	Tyler Wear <twear@quicinc.com>, David Ahern <dsahern@kernel.org>
Subject: [PATCH 1/2] net/ipv6: always honour route mtu during forwarding
Date: Wed,  7 Oct 2020 20:31:01 -0700
Message-ID: <20201008033102.623894-1-zenczykowski@gmail.com>

From: Maciej Żenczykowski <maze@google.com>

This matches the new ipv4 behaviour as of commit:
  commit 02a1b175b0e92d9e0fa5df3957ade8d733ceb6a0
  Author: Maciej Żenczykowski <maze@google.com>
  Date:   Wed Sep 23 13:18:15 2020 -0700

  net/ipv4: always honour route mtu during forwarding

The reasoning is similar: There doesn't seem to be any reason
why you would want to ignore route mtu.

There are two potential sources of ipv6 route mtu:
  - manually configured by NET_ADMIN: since you configured
    a route mtu explicitly, you probably know best...
  - derived from mtu information in RA messages:
    this is the network telling you what will work, and
    presumably whichever network admin configured the RA
    content knows best what the network conditions are.

One could argue that RAs can be spoofed, but if we get spoofed
RAs we're *already* screwed, and erroneous mtu information is
less dangerous than the erroneous routes themselves...
(The proper place to do RA filtering is in the switch/router.)

Additionally, a reduction from 1500 to 1280 (min ipv6 mtu) is
not very noticeable on performance (especially with gro/gso/tso),
while packets getting lost (due to rx buffer overruns) or
generating icmpv6 packet too big errors and needing to be
retransmitted is very noticeable (guaranteed impact of a full rtt).

It is pretty common to have a higher device mtu to allow receiving
large (jumbo) frames, while having some routes via that interface
(potentially including the default route to the internet) specify
a lower mtu.

There might also be use cases around xfrm/ipsec/tunnels,
especially for something like sit/6to4/6rd, where you may have one
sit device, but traffic through it will flow over different
underlying paths, and thus the mtu is per subnet and not per device.

(Note that this function does not honour pmtu, which can be spoofed
via icmpv6 messages, but see also ip6_mtu_from_fib6() which honours
pmtu for ipv6 'locked mtu' routes)

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Sunmeet Gill (Sunny) <sgill@quicinc.com>
Cc: Vinay Paradkar <vparadka@qti.qualcomm.com>
Cc: Tyler Wear <twear@quicinc.com>
Cc: David Ahern <dsahern@kernel.org>
---
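For illustration only (not part of the patch): a minimal user-space
sketch of the behaviour change, assuming a simplified stand-in for the
kernel structures. The names struct model_route, mtu_forward_old() and
mtu_forward_new() are hypothetical and exist only in this sketch; RCU
locking is omitted and IPV6_MIN_MTU is redefined locally.

```c
#include <stdbool.h>

#define IPV6_MIN_MTU 1280  /* same value as the kernel constant */

/* Hypothetical model of the fields ip6_dst_mtu_forward() consults. */
struct model_route {
	unsigned int route_mtu; /* models dst_metric_raw(dst, RTAX_MTU); 0 if unset */
	bool mtu_locked;        /* models dst_metric_locked(dst, RTAX_MTU) */
	bool has_idev;          /* models __in6_dev_get(dst->dev) != NULL */
	unsigned int dev_mtu6;  /* models idev->cnf.mtu6 */
};

/* Before the patch: the route mtu is honoured only if it is locked. */
static unsigned int mtu_forward_old(const struct model_route *rt)
{
	unsigned int mtu;

	if (rt->mtu_locked) {
		mtu = rt->route_mtu;
		if (mtu)
			return mtu;
	}

	mtu = IPV6_MIN_MTU;
	if (rt->has_idev)
		mtu = rt->dev_mtu6;
	return mtu;
}

/* After the patch: any non-zero route mtu wins, locked or not. */
static unsigned int mtu_forward_new(const struct model_route *rt)
{
	unsigned int mtu = rt->route_mtu;

	if (mtu)
		return mtu;
	return rt->has_idev ? rt->dev_mtu6 : IPV6_MIN_MTU;
}
```

With a jumbo-frame device (dev_mtu6 = 9000) and an unlocked route mtu
of 1500, mtu_forward_old() yields 9000 while mtu_forward_new() yields
1500. With no route mtu set, both fall back to the device mtu, or to
IPV6_MIN_MTU when the device has no ipv6 configuration.
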
 include/net/ip6_route.h | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index 2a5277758379..598415743f46 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -311,19 +311,13 @@ static inline bool rt6_duplicate_nexthop(struct fib6_info *a, struct fib6_info *
 static inline unsigned int ip6_dst_mtu_forward(const struct dst_entry *dst)
 {
 	struct inet6_dev *idev;
-	unsigned int mtu;
+	unsigned int mtu = dst_metric_raw(dst, RTAX_MTU);
+	if (mtu)
+		return mtu;
 
-	if (dst_metric_locked(dst, RTAX_MTU)) {
-		mtu = dst_metric_raw(dst, RTAX_MTU);
-		if (mtu)
-			return mtu;
-	}
-
-	mtu = IPV6_MIN_MTU;
 	rcu_read_lock();
 	idev = __in6_dev_get(dst->dev);
-	if (idev)
-		mtu = idev->cnf.mtu6;
+	mtu = idev ? idev->cnf.mtu6 : IPV6_MIN_MTU;
 	rcu_read_unlock();
 
 	return mtu;
-- 
2.28.0.806.g8561365e88-goog
