netdev.vger.kernel.org archive mirror
* [PATCH v3] IPv4: Tunnel: Fix effective path mtu calculation
@ 2020-06-25 22:44 Oliver Herms
  2020-06-30  6:22 ` Jakub Kicinski
  0 siblings, 1 reply; 7+ messages in thread
From: Oliver Herms @ 2020-06-25 22:44 UTC
  To: netdev; +Cc: davem, kuznet, yoshfuji, kuba

The calculation of the effective tunnel MTU, which is used to create
MTU exceptions if necessary, is currently not done correctly. This
leads to unnecessary entries in the IPv6 route cache for any
packet sent through the tunnel.

The root cause is that "dev->hard_header_len" is subtracted from the
tunnel destination's path MTU, which subtracts too much if
dev->hard_header_len is filled in. This is the case for SIT tunnels,
where hard_header_len is the underlying device's hard_header_len (e.g. 14
for Ethernet) + 20 bytes of IP header (see net/ipv6/sit.c:1091).

However, the MTU of the path is exclusive of the ethernet header
and the 20 bytes for the IP header are being subtracted separately
already. Thus hard_header_len is removed from this calculation.

For IPIP and GRE tunnels this doesn't change anything as hard_header_len
is zero in those cases anyways.

This patch also corrects the calculation of the payload's packet size.

Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
Signed-off-by: Oliver Herms <oliver.peter.herms@gmail.com>
---
 net/ipv4/ip_tunnel.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index f4f1d11eab50..66565647122d 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -492,11 +492,10 @@ static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb,
 	int mtu;
 
 	tunnel_hlen = md ? tunnel_hlen : tunnel->hlen;
-	pkt_size = skb->len - tunnel_hlen - dev->hard_header_len;
+	pkt_size = skb->len - tunnel_hlen;
 
 	if (df)
-		mtu = dst_mtu(&rt->dst) - dev->hard_header_len
-					- sizeof(struct iphdr) - tunnel_hlen;
+		mtu = dst_mtu(&rt->dst) - sizeof(struct iphdr) - tunnel_hlen;
 	else
 		mtu = skb_valid_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
 
-- 
2.25.1
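
To make the arithmetic in the description concrete, here is a small
userspace sketch in C. It is not kernel code: the helper names
effective_mtu_old() and effective_mtu_new() are hypothetical, and
IPV4_HDR_LEN merely stands in for sizeof(struct iphdr). The example
values (a 1500-byte path MTU, a SIT device over Ethernet with
hard_header_len = 14 + 20, no additional tunnel header) are assumptions
based on the example given in the description.

```c
#include <assert.h>

#define IPV4_HDR_LEN 20 /* stands in for sizeof(struct iphdr) */

/* Pre-patch formula from the df branch of tnl_update_pmtu(), modeled in userspace. */
static int effective_mtu_old(int path_mtu, int hard_header_len, int tunnel_hlen)
{
	assert(path_mtu > 0);
	return path_mtu - hard_header_len - IPV4_HDR_LEN - tunnel_hlen;
}

/* Formula proposed by the patch: hard_header_len is no longer subtracted. */
static int effective_mtu_new(int path_mtu, int tunnel_hlen)
{
	assert(path_mtu > 0);
	return path_mtu - IPV4_HDR_LEN - tunnel_hlen;
}
```

With path_mtu = 1500, hard_header_len = 34 and tunnel_hlen = 0, the old
formula yields 1446 and the new one 1480; the difference is exactly
hard_header_len.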



* Re: [PATCH v3] IPv4: Tunnel: Fix effective path mtu calculation
  2020-06-25 22:44 [PATCH v3] IPv4: Tunnel: Fix effective path mtu calculation Oliver Herms
@ 2020-06-30  6:22 ` Jakub Kicinski
  2020-06-30 10:21   ` Oliver Herms
  2020-06-30 15:51   ` Nicolas Dichtel
  0 siblings, 2 replies; 7+ messages in thread
From: Jakub Kicinski @ 2020-06-30  6:22 UTC
  To: Oliver Herms; +Cc: netdev, davem, kuznet, yoshfuji

On Fri, 26 Jun 2020 00:44:35 +0200 Oliver Herms wrote:
> The calculation of the effective tunnel MTU, which is used to create
> MTU exceptions if necessary, is currently not done correctly. This
> leads to unnecessary entries in the IPv6 route cache for any
> packet sent through the tunnel.
>
> The root cause is that "dev->hard_header_len" is subtracted from the
> tunnel destination's path MTU, which subtracts too much if
> dev->hard_header_len is filled in. This is the case for SIT tunnels,
> where hard_header_len is the underlying device's hard_header_len (e.g. 14
> for Ethernet) + 20 bytes of IP header (see net/ipv6/sit.c:1091).

It seems like SIT possibly got missed in evolution of the ip_tunnel
code? It seems to duplicate a lot of code, including pmtu checking.
Doesn't call ip_tunnel_init()...

My understanding is that for a while now tunnels are not supposed to use
dev->hard_header_len to reserve skb space, and use dev->needed_headroom, 
instead. sit uses hard_header_len and doesn't even copy needed_headroom
of the lower device.

> However, the MTU of the path is exclusive of the ethernet header
> and the 20 bytes for the IP header are being subtracted separately
> already. Thus hard_header_len is removed from this calculation.
> 
> For IPIP and GRE tunnels this doesn't change anything as
> hard_header_len is zero in those cases anyways.

This statement is definitely not true. Please see the calls to
ether_setup() in ip_gre.c, and the implementation of this function.

> This patch also corrects the calculation of the payload's packet size.
> 
> Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
> Signed-off-by: Oliver Herms <oliver.peter.herms@gmail.com>

All in all, I think it's the SIT code that needs work, not ip_tunnel.
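
The point about ether_setup() can be illustrated with a small userspace
model. ether_setup() sets dev->hard_header_len to ETH_HLEN (14), so a
tunnel device initialised through it has a nonzero hard_header_len, and
the pre-patch and post-patch pkt_size formulas then give different
results. The names fake_netdev, fake_ether_setup(), pkt_size_old() and
pkt_size_new() are hypothetical stand-ins, and the skb length and tunnel
header length used in the note below are assumed values, not
measurements.

```c
#include <assert.h>

#define ETH_HLEN 14 /* Ethernet header length, as in <linux/if_ether.h> */

/* Hypothetical stand-in for the one struct net_device field used here. */
struct fake_netdev {
	unsigned short hard_header_len;
};

/* Models only the hard_header_len assignment done by ether_setup(). */
static void fake_ether_setup(struct fake_netdev *dev)
{
	dev->hard_header_len = ETH_HLEN;
}

/* pkt_size as computed before the patch. */
static int pkt_size_old(int skb_len, int tunnel_hlen, const struct fake_netdev *dev)
{
	assert(skb_len >= tunnel_hlen + dev->hard_header_len);
	return skb_len - tunnel_hlen - dev->hard_header_len;
}

/* pkt_size as computed after the patch. */
static int pkt_size_new(int skb_len, int tunnel_hlen)
{
	assert(skb_len >= tunnel_hlen);
	return skb_len - tunnel_hlen;
}
```

For a device set up this way, an assumed skb length of 1518 and a tunnel
header of 4 bytes give 1500 with the old formula and 1514 with the new
one, so the patch would not be a no-op for such devices.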


* Re: [PATCH v3] IPv4: Tunnel: Fix effective path mtu calculation
  2020-06-30  6:22 ` Jakub Kicinski
@ 2020-06-30 10:21   ` Oliver Herms
  2020-06-30 17:27     ` Jakub Kicinski
  2020-06-30 15:51   ` Nicolas Dichtel
  1 sibling, 1 reply; 7+ messages in thread
From: Oliver Herms @ 2020-06-30 10:21 UTC
  To: Jakub Kicinski; +Cc: netdev, davem, kuznet, yoshfuji

On 30.06.20 08:22, Jakub Kicinski wrote:
> On Fri, 26 Jun 2020 00:44:35 +0200 Oliver Herms wrote:
>> The calculation of the effective tunnel MTU, which is used to create
>> MTU exceptions if necessary, is currently not done correctly. This
>> leads to unnecessary entries in the IPv6 route cache for any
>> packet sent through the tunnel.
>>
>> The root cause is that "dev->hard_header_len" is subtracted from the
>> tunnel destination's path MTU, which subtracts too much if
>> dev->hard_header_len is filled in. This is the case for SIT tunnels,
>> where hard_header_len is the underlying device's hard_header_len (e.g. 14
>> for Ethernet) + 20 bytes of IP header (see net/ipv6/sit.c:1091).
> 
> It seems like SIT possibly got missed in evolution of the ip_tunnel
> code? It seems to duplicate a lot of code, including pmtu checking.
> Doesn't call ip_tunnel_init()...

Are you open for patches cleaning this up?

> 
> My understanding is that for a while now tunnels are not supposed to use
> dev->hard_header_len to reserve skb space, and use dev->needed_headroom, 
> instead. sit uses hard_header_len and doesn't even copy needed_headroom
> of the lower device.
> 
>> However, the MTU of the path is exclusive of the ethernet header
>> and the 20 bytes for the IP header are being subtracted separately
>> already. Thus hard_header_len is removed from this calculation.
>>
>> For IPIP and GRE tunnels this doesn't change anything as
>> hard_header_len is zero in those cases anyways.
> 
> This statement is definitely not true. Please see the calls to
> ether_setup() in ip_gre.c, and the implementation of this function.

Right. I have to admit I only checked L3 tunnels, using printk
on dev->hard_header_len, which showed 0 for IPIP and GRE.

So shall I file a patch that changes hard_header_len for SIT tunnels to 0?


* Re: [PATCH v3] IPv4: Tunnel: Fix effective path mtu calculation
  2020-06-30  6:22 ` Jakub Kicinski
  2020-06-30 10:21   ` Oliver Herms
@ 2020-06-30 15:51   ` Nicolas Dichtel
  2020-06-30 17:33     ` Jakub Kicinski
  1 sibling, 1 reply; 7+ messages in thread
From: Nicolas Dichtel @ 2020-06-30 15:51 UTC
  To: Jakub Kicinski, Oliver Herms; +Cc: netdev, davem, kuznet, yoshfuji

Le 30/06/2020 à 08:22, Jakub Kicinski a écrit :
[snip]
> My understanding is that for a while now tunnels are not supposed to use
> dev->hard_header_len to reserve skb space, and use dev->needed_headroom, 
> instead. sit uses hard_header_len and doesn't even copy needed_headroom
> of the lower device.

I missed this. I was wondering why IPv6 tunnels use hard_header_len, and whether
there was a "good" reason:

$ git grep "hard_header_len.*=" net/ipv6/
net/ipv6/ip6_tunnel.c:                  dev->hard_header_len = tdev->hard_header_len + t_hlen;
net/ipv6/ip6_tunnel.c:  dev->hard_header_len = LL_MAX_HEADER + t_hlen;
net/ipv6/sit.c:         dev->hard_header_len = tdev->hard_header_len + sizeof(struct iphdr);
net/ipv6/sit.c: dev->hard_header_len    = LL_MAX_HEADER + t_hlen;

A cleanup would be nice ;-)
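
One reason such a cleanup looks feasible is that, for headroom
reservation, hard_header_len and needed_headroom appear to be
interchangeable: LL_RESERVED_SPACE() adds the two fields before
rounding. The sketch below models that in userspace. The macro body is
assumed to match the definition in include/linux/netdevice.h in kernels
of this period and was not copied from a tree; fake_netdev and
ll_reserved_space() are hypothetical names, and the value 34 (14 + 20)
is taken from the SIT-over-Ethernet example earlier in the thread.

```c
#include <assert.h>

#define HH_DATA_MOD 16 /* alignment unit, assumed from <linux/netdevice.h> */

/* Hypothetical stand-in for the two struct net_device fields used here. */
struct fake_netdev {
	unsigned short hard_header_len;
	unsigned short needed_headroom;
};

/* Userspace model of LL_RESERVED_SPACE(dev); assumed form, see lead-in. */
static int ll_reserved_space(const struct fake_netdev *dev)
{
	int sum = dev->hard_header_len + dev->needed_headroom;

	assert(sum >= 0);
	/* Round the sum down to a multiple of HH_DATA_MOD, then add one unit. */
	return (sum & ~(HH_DATA_MOD - 1)) + HH_DATA_MOD;
}
```

In this model, moving the 34 bytes from hard_header_len to
needed_headroom leaves the reserved space unchanged (48 in both cases),
while code such as tnl_update_pmtu() that reads hard_header_len directly
would then see 0.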


* Re: [PATCH v3] IPv4: Tunnel: Fix effective path mtu calculation
  2020-06-30 10:21   ` Oliver Herms
@ 2020-06-30 17:27     ` Jakub Kicinski
  0 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2020-06-30 17:27 UTC
  To: Oliver Herms; +Cc: netdev, davem, kuznet, yoshfuji

On Tue, 30 Jun 2020 12:21:14 +0200 Oliver Herms wrote:
> On 30.06.20 08:22, Jakub Kicinski wrote:
> > On Fri, 26 Jun 2020 00:44:35 +0200 Oliver Herms wrote:  
> >> The calculation of the effective tunnel MTU, which is used to create
> >> MTU exceptions if necessary, is currently not done correctly. This
> >> leads to unnecessary entries in the IPv6 route cache for any
> >> packet sent through the tunnel.
> >>
> >> The root cause is that "dev->hard_header_len" is subtracted from the
> >> tunnel destination's path MTU, which subtracts too much if
> >> dev->hard_header_len is filled in. This is the case for SIT tunnels,
> >> where hard_header_len is the underlying device's hard_header_len (e.g. 14
> >> for Ethernet) + 20 bytes of IP header (see net/ipv6/sit.c:1091).
> > 
> > It seems like SIT possibly got missed in evolution of the ip_tunnel
> > code? It seems to duplicate a lot of code, including pmtu checking.
> > Doesn't call ip_tunnel_init()...  
> 
> Are you open for patches cleaning this up?

Certainly! Maybe some of the oddities are justified, but cleanup /
re-aligning with the rest of ip_tunnels would be nice.

Not sure how much of it is qualifying as a bug, so perhaps two series
would be needed - one for net / stable with bug fixes and another of
pure cleanups for net-next?


* Re: [PATCH v3] IPv4: Tunnel: Fix effective path mtu calculation
  2020-06-30 15:51   ` Nicolas Dichtel
@ 2020-06-30 17:33     ` Jakub Kicinski
  2020-06-30 22:27       ` Nicolas Dichtel
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2020-06-30 17:33 UTC
  To: Nicolas Dichtel; +Cc: Oliver Herms, netdev, davem, kuznet, yoshfuji

On Tue, 30 Jun 2020 17:51:41 +0200 Nicolas Dichtel wrote:
> Le 30/06/2020 à 08:22, Jakub Kicinski a écrit :
> [snip]
> > My understanding is that for a while now tunnels are not supposed to use
> > dev->hard_header_len to reserve skb space, and use dev->needed_headroom, 
> > instead. sit uses hard_header_len and doesn't even copy needed_headroom
> > of the lower device.  
> 
> I missed this. I was wondering why IPv6 tunnels use hard_header_len, and whether
> there was a "good" reason:
>
> $ git grep "hard_header_len.*=" net/ipv6/
> net/ipv6/ip6_tunnel.c:                  dev->hard_header_len = tdev->hard_header_len + t_hlen;
> net/ipv6/ip6_tunnel.c:  dev->hard_header_len = LL_MAX_HEADER + t_hlen;
> net/ipv6/sit.c:         dev->hard_header_len = tdev->hard_header_len + sizeof(struct iphdr);
> net/ipv6/sit.c: dev->hard_header_len    = LL_MAX_HEADER + t_hlen;
> 
> A cleanup would be nice ;-)

I did some archaeological investigatin' yesterday, and I saw
c95b819ad75b ("gre: Use needed_headroom") which converted GRE.
Then I think Pravin used GRE as a base for better ip_tunnel infra 
and the no-hard_header_len-abuse gospel has spread to other IPv4
tunnels. AFAICT IPv6 tunnels were not as lucky, and SIT just got
missed in the IPV4 conversion..


* Re: [PATCH v3] IPv4: Tunnel: Fix effective path mtu calculation
  2020-06-30 17:33     ` Jakub Kicinski
@ 2020-06-30 22:27       ` Nicolas Dichtel
  0 siblings, 0 replies; 7+ messages in thread
From: Nicolas Dichtel @ 2020-06-30 22:27 UTC
  To: Jakub Kicinski; +Cc: Oliver Herms, netdev, davem, kuznet, yoshfuji

Le 30/06/2020 à 19:33, Jakub Kicinski a écrit :
> On Tue, 30 Jun 2020 17:51:41 +0200 Nicolas Dichtel wrote:
>> Le 30/06/2020 à 08:22, Jakub Kicinski a écrit :
>> [snip]
>>> My understanding is that for a while now tunnels are not supposed to use
>>> dev->hard_header_len to reserve skb space, and use dev->needed_headroom, 
>>> instead. sit uses hard_header_len and doesn't even copy needed_headroom
>>> of the lower device.  
>>
>> I missed this. I was wondering why IPv6 tunnels use hard_header_len, and whether
>> there was a "good" reason:
>>
>> $ git grep "hard_header_len.*=" net/ipv6/
>> net/ipv6/ip6_tunnel.c:                  dev->hard_header_len = tdev->hard_header_len + t_hlen;
>> net/ipv6/ip6_tunnel.c:  dev->hard_header_len = LL_MAX_HEADER + t_hlen;
>> net/ipv6/sit.c:         dev->hard_header_len = tdev->hard_header_len + sizeof(struct iphdr);
>> net/ipv6/sit.c: dev->hard_header_len    = LL_MAX_HEADER + t_hlen;
>>
>> A cleanup would be nice ;-)
> 
> I did some archaeological investigatin' yesterday, and I saw
> c95b819ad75b ("gre: Use needed_headroom") which converted GRE.
Thanks for the pointer.

> Then I think Pravin used GRE as a base for better ip_tunnel infra 
> and the no-hard_header_len-abuse gospel has spread to other IPv4
> tunnels. AFAICT IPv6 tunnels were not as lucky, and SIT just got
> missed in the IPV4 conversion..
Yep, I agree with you, it's probably "historical".

