All of lore.kernel.org
 help / color / mirror / Atom feed
* xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall
@ 2022-01-14 17:31 Jiri Bohac
  2022-01-14 17:40 ` [PATCH] xfrm: fix MTU regression Jiri Bohac
  2022-01-16 10:18 ` xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall Thorsten Leemhuis
  0 siblings, 2 replies; 15+ messages in thread
From: Jiri Bohac @ 2022-01-14 17:31 UTC (permalink / raw)
  To: Sabrina Dubroca, Steffen Klassert, Herbert Xu, David S. Miller
  Cc: Mike Maloney, Eric Dumazet, netdev

Hello,

our customer found that commit
b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm: xfrm_state_mtu
should return at least 1280 for ipv6") in v5.14 breaks the TCP
MSS calculation in ipsec transport mode, resulting complete
stalls of TCP connections. This happens when the (P)MTU is 1280
or slighly larger.

The desired formula for the MSS is:
	MSS = (MTU - ESP_overhead) - IP header - TCP header

However, the above patch clamps the (MTU - ESP_overhead) to a
minimum of 1280, turning the formula into
	MSS = max(MTU - ESP overhead, 1280) -  IP header - TCP header

With the (P)MTU near 1280, the calculated MSS is too large and
the resulting TCP packets never make it to the destination
because they are over the actual PMTU.

Trying to fix the exact same problem as the broken patch, which I
was unaware of, I sent an alternative patch in this thread of
April 2021:
https://lore.kernel.org/netdev/20210429170254.5grfgsz2hgy2qjhk@dwarf.suse.cz/
(note the v1 is broken and followed by v2!)

In that thread I also found other problems with
b515d2637276a3810d6595e10ab02c13bfd0b63a - in tunnel mode it
causes suboptimal double fragmentation:
https://lore.kernel.org/netdev/20210429202529.codhwpc7w6kbudug@dwarf.suse.cz/

I therefore propose to revert
b515d2637276a3810d6595e10ab02c13bfd0b63a and
apply the v2 version of my patch, which I'll re-send in reply to
this e-mail.

Thanks,

-- 
Jiri Bohac <jbohac@suse.cz>
SUSE Labs, Prague, Czechia


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] xfrm: fix MTU regression
  2022-01-14 17:31 xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall Jiri Bohac
@ 2022-01-14 17:40 ` Jiri Bohac
  2022-01-19  7:35   ` Steffen Klassert
  2022-01-16 10:18 ` xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall Thorsten Leemhuis
  1 sibling, 1 reply; 15+ messages in thread
From: Jiri Bohac @ 2022-01-14 17:40 UTC (permalink / raw)
  To: Sabrina Dubroca, Steffen Klassert, Herbert Xu, David S. Miller
  Cc: Mike Maloney, Eric Dumazet, netdev

Commit 749439bfac6e1a2932c582e2699f91d329658196 ("ipv6: fix udpv6
sendmsg crash caused by too small MTU") breaks PMTU for xfrm.

A Packet Too Big ICMPv6 message received in response to an ESP
packet will prevent all further communication through the tunnel
if the reported MTU minus the ESP overhead is smaller than 1280.

E.g. in a case of a tunnel-mode ESP with sha256/aes the overhead
is 92 bytes. Receiving a PTB with MTU of 1371 or less will result
in all further packets in the tunnel dropped. A ping through the
tunnel fails with "ping: sendmsg: Invalid argument".

Apparently the MTU on the xfrm route is smaller than 1280 and
fails the check inside ip6_setup_cork() added by 749439bf.

We found this by debugging USGv6/ipv6ready failures. Failing
tests are: "Phase-2 Interoperability Test Scenario IPsec" /
5.3.11 and 5.4.11 (Tunnel Mode: Fragmentation).

Commit b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm:
xfrm_state_mtu should return at least 1280 for ipv6") attempted
to fix this but caused another regression in TCP MSS calculations
and had to be reverted.

The patch below fixes the situation by dropping the MTU
check and instead checking for the underflows described in the
749439bf commit message.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index ff4f9ebcf7f6..171eb4ec1e67 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1402,8 +1402,6 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 		if (np->frag_size)
 			mtu = np->frag_size;
 	}
-	if (mtu < IPV6_MIN_MTU)
-		return -EINVAL;
 	cork->base.fragsize = mtu;
 	cork->base.gso_size = ipc6->gso_size;
 	cork->base.tx_flags = 0;
@@ -1465,8 +1463,6 @@ static int __ip6_append_data(struct sock *sk,
 
 	fragheaderlen = sizeof(struct ipv6hdr) + rt->rt6i_nfheader_len +
 			(opt ? opt->opt_nflen : 0);
-	maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen -
-		     sizeof(struct frag_hdr);
 
 	headersize = sizeof(struct ipv6hdr) +
 		     (opt ? opt->opt_flen + opt->opt_nflen : 0) +
@@ -1474,6 +1470,13 @@ static int __ip6_append_data(struct sock *sk,
 		      sizeof(struct frag_hdr) : 0) +
 		     rt->rt6i_nfheader_len;
 
+	if (mtu < fragheaderlen ||
+	    ((mtu - fragheaderlen) & ~7) + fragheaderlen < sizeof(struct frag_hdr))
+		goto emsgsize;
+
+	maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen -
+		     sizeof(struct frag_hdr);
+
 	/* as per RFC 7112 section 5, the entire IPv6 Header Chain must fit
 	 * the first fragment
 	 */

-- 
Jiri Bohac <jbohac@suse.cz>
SUSE Labs, Prague, Czechia


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall
  2022-01-14 17:31 xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall Jiri Bohac
  2022-01-14 17:40 ` [PATCH] xfrm: fix MTU regression Jiri Bohac
@ 2022-01-16 10:18 ` Thorsten Leemhuis
  2022-02-04  8:59   ` xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall #forregzbot Thorsten Leemhuis
  1 sibling, 1 reply; 15+ messages in thread
From: Thorsten Leemhuis @ 2022-01-16 10:18 UTC (permalink / raw)
  To: Jiri Bohac, Sabrina Dubroca, Steffen Klassert, Herbert Xu,
	David S. Miller
  Cc: Mike Maloney, Eric Dumazet, netdev, regressions

[TLDR: I'm adding this regression to regzbot, the Linux kernel
regression tracking bot; most text you find below is compiled from a few
templates paragraphs some of you might have seen already.]

Top-posting for once, to make this easy accessible to everyone.

On 14.01.22 18:31, Jiri Bohac wrote:
> 
> our customer found that commit
> b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm: xfrm_state_mtu
> should return at least 1280 for ipv6") in v5.14 breaks the TCP
> MSS calculation in ipsec transport mode, resulting complete
> stalls of TCP connections. This happens when the (P)MTU is 1280
> or slighly larger.
> 
> The desired formula for the MSS is:
> 	MSS = (MTU - ESP_overhead) - IP header - TCP header
> 
> However, the above patch clamps the (MTU - ESP_overhead) to a
> minimum of 1280, turning the formula into
> 	MSS = max(MTU - ESP overhead, 1280) -  IP header - TCP header
> 
> With the (P)MTU near 1280, the calculated MSS is too large and
> the resulting TCP packets never make it to the destination
> because they are over the actual PMTU.
> 
> Trying to fix the exact same problem as the broken patch, which I
> was unaware of, I sent an alternative patch in this thread of
> April 2021:
> https://lore.kernel.org/netdev/20210429170254.5grfgsz2hgy2qjhk@dwarf.suse.cz/
> (note the v1 is broken and followed by v2!)
> 
> In that thread I also found other problems with
> b515d2637276a3810d6595e10ab02c13bfd0b63a - in tunnel mode it
> causes suboptimal double fragmentation:
> https://lore.kernel.org/netdev/20210429202529.codhwpc7w6kbudug@dwarf.suse.cz/
> 
> I therefore propose to revert
> b515d2637276a3810d6595e10ab02c13bfd0b63a and
> apply the v2 version of my patch, which I'll re-send in reply to
> this e-mail.

Thanks for the report.

To be sure this issue doesn't fall through the cracks unnoticed, I'm
adding it to regzbot, my Linux kernel regression tracking bot:

#regzbot ^introduced b515d2637276a3810d6595e10ab02c13bfd0b63a
#regzbot title xfrm: TCP MSS calculation broken by commit b515d263,
results in TCP stall
#regzbot ignore-activity

Reminder: when fixing the issue, please add a 'Link:' tag with the URL
to the report (the parent of this mail) using the kernel.org redirector,
as explained in 'Documentation/process/submitting-patches.rst'. Regzbot
then will automatically mark the regression as resolved once the fix
lands in the appropriate tree. For more details about regzbot see footer.

Sending this to everyone that got the initial report, to make all aware
of the tracking. I also hope that messages like this motivate people to
directly get at least the regression mailing list and ideally even
regzbot involved when dealing with regressions, as messages like this
wouldn't be needed then.

Don't worry, I'll send further messages wrt to this regression just to
the lists (with a tag in the subject so people can filter them away), as
long as they are intended just for regzbot. With a bit of luck no such
messages will be needed anyway.

Ciao, Thorsten (wearing his 'Linux kernel regression tracker' hat)

P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
on my table. I can only look briefly into most of them. Unfortunately
therefore I sometimes will get things wrong or miss something important.
I hope that's not the case here; if you think it is, don't hesitate to
tell me about it in a public reply, that's in everyone's interest.

BTW, I have no personal interest in this issue, which is tracked using
regzbot, my Linux kernel regression tracking bot
(https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting
this mail to get things rolling again and hence don't need to be CC on
all further activities wrt to this regression.

---
Additional information about regzbot:

If you want to know more about regzbot, check out its web-interface, the
getting start guide, and/or the references documentation:

https://linux-regtracking.leemhuis.info/regzbot/
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md

The last two documents will explain how you can interact with regzbot
yourself if your want to.

Hint for reporters: when reporting a regression it's in your interest to
tell #regzbot about it in the report, as that will ensure the regression
gets on the radar of regzbot and the regression tracker. That's in your
interest, as they will make sure the report won't fall through the
cracks unnoticed.

Hint for developers: you normally don't need to care about regzbot once
it's involved. Fix the issue as you normally would, just remember to
include a 'Link:' tag to the report in the commit message, as explained
in Documentation/process/submitting-patches.rst
That aspect was recently was made more explicit in commit 1f57bd42b77c:
https://git.kernel.org/linus/1f57bd42b77c


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] xfrm: fix MTU regression
  2022-01-14 17:40 ` [PATCH] xfrm: fix MTU regression Jiri Bohac
@ 2022-01-19  7:35   ` Steffen Klassert
  2022-01-19  9:12     ` Jiri Bohac
  0 siblings, 1 reply; 15+ messages in thread
From: Steffen Klassert @ 2022-01-19  7:35 UTC (permalink / raw)
  To: Jiri Bohac
  Cc: Sabrina Dubroca, Herbert Xu, David S. Miller, Mike Maloney,
	Eric Dumazet, netdev

On Fri, Jan 14, 2022 at 06:40:58PM +0100, Jiri Bohac wrote:
> Commit 749439bfac6e1a2932c582e2699f91d329658196 ("ipv6: fix udpv6
> sendmsg crash caused by too small MTU") breaks PMTU for xfrm.
> 
> A Packet Too Big ICMPv6 message received in response to an ESP
> packet will prevent all further communication through the tunnel
> if the reported MTU minus the ESP overhead is smaller than 1280.
> 
> E.g. in a case of a tunnel-mode ESP with sha256/aes the overhead
> is 92 bytes. Receiving a PTB with MTU of 1371 or less will result
> in all further packets in the tunnel dropped. A ping through the
> tunnel fails with "ping: sendmsg: Invalid argument".
> 
> Apparently the MTU on the xfrm route is smaller than 1280 and
> fails the check inside ip6_setup_cork() added by 749439bf.
> 
> We found this by debugging USGv6/ipv6ready failures. Failing
> tests are: "Phase-2 Interoperability Test Scenario IPsec" /
> 5.3.11 and 5.4.11 (Tunnel Mode: Fragmentation).
> 
> Commit b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm:
> xfrm_state_mtu should return at least 1280 for ipv6") attempted
> to fix this but caused another regression in TCP MSS calculations
> and had to be reverted.
> 
> The patch below fixes the situation by dropping the MTU
> check and instead checking for the underflows described in the
> 749439bf commit message.
> 
> Signed-off-by: Jiri Bohac <jbohac@suse.cz>

Can you please add a 'Fixes:' tag so that it can be backported
to the stable trees?

Btw. this fixes a xfrm issue, but touches only generic IPv6 code.
To which tree should this patch be applied? I can take it to
the ipsec tee, but would also be ok if it is applied directly
to the net tree.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] xfrm: fix MTU regression
  2022-01-19  7:35   ` Steffen Klassert
@ 2022-01-19  9:12     ` Jiri Bohac
  2022-01-19  9:22       ` [PATCH v2] " Jiri Bohac
  2022-01-24 15:45       ` [PATCH] " Steffen Klassert
  0 siblings, 2 replies; 15+ messages in thread
From: Jiri Bohac @ 2022-01-19  9:12 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Sabrina Dubroca, Herbert Xu, David S. Miller, Mike Maloney,
	Eric Dumazet, netdev

On Wed, Jan 19, 2022 at 08:35:19AM +0100, Steffen Klassert wrote:
> Can you please add a 'Fixes:' tag so that it can be backported
> to the stable trees?

sure; I'll send a v2 with added Fixes: for the original
regression (749439bf), which will reappear once b515d263 (which
causes the current regression) is reverted. Note this patch needs
to be accompanied by the revert!

> Btw. this fixes a xfrm issue, but touches only generic IPv6 code.
> To which tree should this patch be applied? I can take it to
> the ipsec tee, but would also be ok if it is applied directly
> to the net tree.

b515d263 touches xfrm code; but being a regression maybe we want
the fastest track possible? 

Thanks,


-- 
Jiri Bohac <jbohac@suse.cz>
SUSE Labs, Prague, Czechia


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] xfrm: fix MTU regression
  2022-01-19  9:12     ` Jiri Bohac
@ 2022-01-19  9:22       ` Jiri Bohac
  2022-01-26  6:41         ` Steffen Klassert
  2022-01-24 15:45       ` [PATCH] " Steffen Klassert
  1 sibling, 1 reply; 15+ messages in thread
From: Jiri Bohac @ 2022-01-19  9:22 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Sabrina Dubroca, Herbert Xu, David S. Miller, Mike Maloney,
	Eric Dumazet, netdev

Commit 749439bfac6e1a2932c582e2699f91d329658196 ("ipv6: fix udpv6
sendmsg crash caused by too small MTU") breaks PMTU for xfrm.

A Packet Too Big ICMPv6 message received in response to an ESP
packet will prevent all further communication through the tunnel
if the reported MTU minus the ESP overhead is smaller than 1280.

E.g. in a case of a tunnel-mode ESP with sha256/aes the overhead
is 92 bytes. Receiving a PTB with MTU of 1371 or less will result
in all further packets in the tunnel dropped. A ping through the
tunnel fails with "ping: sendmsg: Invalid argument".

Apparently the MTU on the xfrm route is smaller than 1280 and
fails the check inside ip6_setup_cork() added by 749439bf.

We found this by debugging USGv6/ipv6ready failures. Failing
tests are: "Phase-2 Interoperability Test Scenario IPsec" /
5.3.11 and 5.4.11 (Tunnel Mode: Fragmentation).

Commit b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm:
xfrm_state_mtu should return at least 1280 for ipv6") attempted
to fix this but caused another regression in TCP MSS calculations
and had to be reverted.

The patch below fixes the situation by dropping the MTU
check and instead checking for the underflows described in the
749439bf commit message.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Fixes: 749439bfac6e ("ipv6: fix udpv6 sendmsg crash caused by too small MTU") 

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index ff4f9ebcf7f6..171eb4ec1e67 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1402,8 +1402,6 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 		if (np->frag_size)
 			mtu = np->frag_size;
 	}
-	if (mtu < IPV6_MIN_MTU)
-		return -EINVAL;
 	cork->base.fragsize = mtu;
 	cork->base.gso_size = ipc6->gso_size;
 	cork->base.tx_flags = 0;
@@ -1465,8 +1463,6 @@ static int __ip6_append_data(struct sock *sk,
 
 	fragheaderlen = sizeof(struct ipv6hdr) + rt->rt6i_nfheader_len +
 			(opt ? opt->opt_nflen : 0);
-	maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen -
-		     sizeof(struct frag_hdr);
 
 	headersize = sizeof(struct ipv6hdr) +
 		     (opt ? opt->opt_flen + opt->opt_nflen : 0) +
@@ -1474,6 +1470,13 @@ static int __ip6_append_data(struct sock *sk,
 		      sizeof(struct frag_hdr) : 0) +
 		     rt->rt6i_nfheader_len;
 
+	if (mtu < fragheaderlen ||
+	    ((mtu - fragheaderlen) & ~7) + fragheaderlen < sizeof(struct frag_hdr))
+		goto emsgsize;
+
+	maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen -
+		     sizeof(struct frag_hdr);
+
 	/* as per RFC 7112 section 5, the entire IPv6 Header Chain must fit
 	 * the first fragment
 	 */

-- 
Jiri Bohac <jbohac@suse.cz>
SUSE Labs, Prague, Czechia

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] xfrm: fix MTU regression
  2022-01-19  9:12     ` Jiri Bohac
  2022-01-19  9:22       ` [PATCH v2] " Jiri Bohac
@ 2022-01-24 15:45       ` Steffen Klassert
  2022-01-25  9:41         ` Jiri Bohac
  1 sibling, 1 reply; 15+ messages in thread
From: Steffen Klassert @ 2022-01-24 15:45 UTC (permalink / raw)
  To: Jiri Bohac
  Cc: Sabrina Dubroca, Herbert Xu, David S. Miller, Mike Maloney,
	Eric Dumazet, netdev

On Wed, Jan 19, 2022 at 10:12:33AM +0100, Jiri Bohac wrote:
> On Wed, Jan 19, 2022 at 08:35:19AM +0100, Steffen Klassert wrote:
> > Can you please add a 'Fixes:' tag so that it can be backported
> > to the stable trees?
> 
> sure; I'll send a v2 with added Fixes: for the original
> regression (749439bf), which will reappear once b515d263 (which
> causes the current regression) is reverted. Note this patch needs
> to be accompanied by the revert!
> 
> > Btw. this fixes a xfrm issue, but touches only generic IPv6 code.
> > To which tree should this patch be applied? I can take it to
> > the ipsec tee, but would also be ok if it is applied directly
> > to the net tree.
> 
> b515d263 touches xfrm code; but being a regression maybe we want
> the fastest track possible? 

The patch is already marked as 'awaiting upstream' in patchwork,
so I'll take it into the ipsec tree.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] xfrm: fix MTU regression
  2022-01-24 15:45       ` [PATCH] " Steffen Klassert
@ 2022-01-25  9:41         ` Jiri Bohac
  2022-01-26  6:42           ` Steffen Klassert
  0 siblings, 1 reply; 15+ messages in thread
From: Jiri Bohac @ 2022-01-25  9:41 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Sabrina Dubroca, Herbert Xu, David S. Miller, Mike Maloney,
	Eric Dumazet, netdev

On Mon, Jan 24, 2022 at 04:45:31PM +0100, Steffen Klassert wrote:
> > sure; I'll send a v2 with added Fixes: for the original
> > regression (749439bf), which will reappear once b515d263 (which
> > causes the current regression) is reverted. Note this patch needs
> > to be accompanied by the revert!
> > 
> > > Btw. this fixes a xfrm issue, but touches only generic IPv6 code.
> > > To which tree should this patch be applied? I can take it to
> > > the ipsec tee, but would also be ok if it is applied directly
> > > to the net tree.
> > 
> > b515d263 touches xfrm code; but being a regression maybe we want
> > the fastest track possible? 
> 
> The patch is already marked as 'awaiting upstream' in patchwork,
> so I'll take it into the ipsec tree.

OK, thanks! Will you also revert b515d263 in the ipsec tree?

-- 
Jiri Bohac <jbohac@suse.cz>
SUSE Labs, Prague, Czechia


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] xfrm: fix MTU regression
  2022-01-19  9:22       ` [PATCH v2] " Jiri Bohac
@ 2022-01-26  6:41         ` Steffen Klassert
  0 siblings, 0 replies; 15+ messages in thread
From: Steffen Klassert @ 2022-01-26  6:41 UTC (permalink / raw)
  To: Jiri Bohac
  Cc: Sabrina Dubroca, Herbert Xu, David S. Miller, Mike Maloney,
	Eric Dumazet, netdev

On Wed, Jan 19, 2022 at 10:22:53AM +0100, Jiri Bohac wrote:
> Commit 749439bfac6e1a2932c582e2699f91d329658196 ("ipv6: fix udpv6
> sendmsg crash caused by too small MTU") breaks PMTU for xfrm.
> 
> A Packet Too Big ICMPv6 message received in response to an ESP
> packet will prevent all further communication through the tunnel
> if the reported MTU minus the ESP overhead is smaller than 1280.
> 
> E.g. in a case of a tunnel-mode ESP with sha256/aes the overhead
> is 92 bytes. Receiving a PTB with MTU of 1371 or less will result
> in all further packets in the tunnel dropped. A ping through the
> tunnel fails with "ping: sendmsg: Invalid argument".
> 
> Apparently the MTU on the xfrm route is smaller than 1280 and
> fails the check inside ip6_setup_cork() added by 749439bf.
> 
> We found this by debugging USGv6/ipv6ready failures. Failing
> tests are: "Phase-2 Interoperability Test Scenario IPsec" /
> 5.3.11 and 5.4.11 (Tunnel Mode: Fragmentation).
> 
> Commit b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm:
> xfrm_state_mtu should return at least 1280 for ipv6") attempted
> to fix this but caused another regression in TCP MSS calculations
> and had to be reverted.
> 
> The patch below fixes the situation by dropping the MTU
> check and instead checking for the underflows described in the
> 749439bf commit message.
> 
> Signed-off-by: Jiri Bohac <jbohac@suse.cz>
> Fixes: 749439bfac6e ("ipv6: fix udpv6 sendmsg crash caused by too small MTU") 

Applied to the ipsec tree, thanks Jiri!

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] xfrm: fix MTU regression
  2022-01-25  9:41         ` Jiri Bohac
@ 2022-01-26  6:42           ` Steffen Klassert
  2022-01-26 15:00             ` [PATCH] Revert "xfrm: xfrm_state_mtu should return at least 1280 for ipv6" Jiri Bohac
  0 siblings, 1 reply; 15+ messages in thread
From: Steffen Klassert @ 2022-01-26  6:42 UTC (permalink / raw)
  To: Jiri Bohac
  Cc: Sabrina Dubroca, Herbert Xu, David S. Miller, Mike Maloney,
	Eric Dumazet, netdev

On Tue, Jan 25, 2022 at 10:41:02AM +0100, Jiri Bohac wrote:
> On Mon, Jan 24, 2022 at 04:45:31PM +0100, Steffen Klassert wrote:
> > > sure; I'll send a v2 with added Fixes: for the original
> > > regression (749439bf), which will reappear once b515d263 (which
> > > causes the current regression) is reverted. Note this patch needs
> > > to be accompanied by the revert!
> > > 
> > > > Btw. this fixes a xfrm issue, but touches only generic IPv6 code.
> > > > To which tree should this patch be applied? I can take it to
> > > > the ipsec tee, but would also be ok if it is applied directly
> > > > to the net tree.
> > > 
> > > b515d263 touches xfrm code; but being a regression maybe we want
> > > the fastest track possible? 
> > 
> > The patch is already marked as 'awaiting upstream' in patchwork,
> > so I'll take it into the ipsec tree.
> 
> OK, thanks! Will you also revert b515d263 in the ipsec tree?

You need to send a patch if you want to have this reverted.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Revert "xfrm: xfrm_state_mtu should return at least 1280 for ipv6"
  2022-01-26  6:42           ` Steffen Klassert
@ 2022-01-26 15:00             ` Jiri Bohac
  2022-02-01  6:46               ` Steffen Klassert
  0 siblings, 1 reply; 15+ messages in thread
From: Jiri Bohac @ 2022-01-26 15:00 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Sabrina Dubroca, Herbert Xu, David S. Miller, Mike Maloney,
	Eric Dumazet, netdev

This reverts commit b515d2637276a3810d6595e10ab02c13bfd0b63a.

Commit b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm: xfrm_state_mtu
should return at least 1280 for ipv6") in v5.14 breaks the TCP MSS
calculation in ipsec transport mode, resulting complete stalls of TCP
connections. This happens when the (P)MTU is 1280 or slighly larger.

The desired formula for the MSS is:
MSS = (MTU - ESP_overhead) - IP header - TCP header

However, the above commit clamps the (MTU - ESP_overhead) to a
minimum of 1280, turning the formula into
MSS = max(MTU - ESP overhead, 1280) -  IP header - TCP header

With the (P)MTU near 1280, the calculated MSS is too large and the
resulting TCP packets never make it to the destination because they
are over the actual PMTU.

The above commit also causes suboptimal double fragmentation in
xfrm tunnel mode, as described in
https://lore.kernel.org/netdev/20210429202529.codhwpc7w6kbudug@dwarf.suse.cz/

The original problem the above commit was trying to fix is now fixed
by commit 6596a0229541270fb8d38d989f91b78838e5e9da ("xfrm: fix MTU
regression").

Signed-off-by: Jiri Bohac <jbohac@suse.cz>

---
 include/net/xfrm.h    |  1 -
 net/ipv4/esp4.c       |  2 +-
 net/ipv6/esp6.c       |  2 +-
 net/xfrm/xfrm_state.c | 14 ++------------
 4 files changed, 4 insertions(+), 15 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index fdb41e8bb626..d78a294e1d0c 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1568,7 +1568,6 @@ void xfrm_sad_getinfo(struct net *net, struct xfrmk_sadinfo *si);
 void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
 u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq);
 int xfrm_init_replay(struct xfrm_state *x);
-u32 __xfrm_state_mtu(struct xfrm_state *x, int mtu);
 u32 xfrm_state_mtu(struct xfrm_state *x, int mtu);
 int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload);
 int xfrm_init_state(struct xfrm_state *x);
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 851f542928a3..e1b1d080e908 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -671,7 +671,7 @@ static int esp_output(struct xfrm_state *x, struct sk_buff *skb)
 		struct xfrm_dst *dst = (struct xfrm_dst *)skb_dst(skb);
 		u32 padto;
 
-		padto = min(x->tfcpad, __xfrm_state_mtu(x, dst->child_mtu_cached));
+		padto = min(x->tfcpad, xfrm_state_mtu(x, dst->child_mtu_cached));
 		if (skb->len < padto)
 			esp.tfclen = padto - skb->len;
 	}
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 8bb2c407b46b..7591160edce1 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -707,7 +707,7 @@ static int esp6_output(struct xfrm_state *x, struct sk_buff *skb)
 		struct xfrm_dst *dst = (struct xfrm_dst *)skb_dst(skb);
 		u32 padto;
 
-		padto = min(x->tfcpad, __xfrm_state_mtu(x, dst->child_mtu_cached));
+		padto = min(x->tfcpad, xfrm_state_mtu(x, dst->child_mtu_cached));
 		if (skb->len < padto)
 			esp.tfclen = padto - skb->len;
 	}
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index ca6bee18346d..dde82f8eb949 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -2572,7 +2572,7 @@ void xfrm_state_delete_tunnel(struct xfrm_state *x)
 }
 EXPORT_SYMBOL(xfrm_state_delete_tunnel);
 
-u32 __xfrm_state_mtu(struct xfrm_state *x, int mtu)
+u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
 {
 	const struct xfrm_type *type = READ_ONCE(x->type);
 	struct crypto_aead *aead;
@@ -2603,17 +2603,7 @@ u32 __xfrm_state_mtu(struct xfrm_state *x, int mtu)
 	return ((mtu - x->props.header_len - crypto_aead_authsize(aead) -
 		 net_adj) & ~(blksize - 1)) + net_adj - 2;
 }
-EXPORT_SYMBOL_GPL(__xfrm_state_mtu);
-
-u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
-{
-	mtu = __xfrm_state_mtu(x, mtu);
-
-	if (x->props.family == AF_INET6 && mtu < IPV6_MIN_MTU)
-		return IPV6_MIN_MTU;
-
-	return mtu;
-}
+EXPORT_SYMBOL_GPL(xfrm_state_mtu);
 
 int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload)
 {

-- 
Jiri Bohac <jbohac@suse.cz>
SUSE Labs, Prague, Czechia


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] Revert "xfrm: xfrm_state_mtu should return at least 1280 for ipv6"
  2022-01-26 15:00             ` [PATCH] Revert "xfrm: xfrm_state_mtu should return at least 1280 for ipv6" Jiri Bohac
@ 2022-02-01  6:46               ` Steffen Klassert
  2022-02-15 14:59                 ` Thorsten Leemhuis
  0 siblings, 1 reply; 15+ messages in thread
From: Steffen Klassert @ 2022-02-01  6:46 UTC (permalink / raw)
  To: Jiri Bohac
  Cc: Sabrina Dubroca, Herbert Xu, David S. Miller, Mike Maloney,
	Eric Dumazet, netdev

On Wed, Jan 26, 2022 at 04:00:18PM +0100, Jiri Bohac wrote:
> This reverts commit b515d2637276a3810d6595e10ab02c13bfd0b63a.
> 
> Commit b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm: xfrm_state_mtu
> should return at least 1280 for ipv6") in v5.14 breaks the TCP MSS
> calculation in ipsec transport mode, resulting complete stalls of TCP
> connections. This happens when the (P)MTU is 1280 or slighly larger.
> 
> The desired formula for the MSS is:
> MSS = (MTU - ESP_overhead) - IP header - TCP header
> 
> However, the above commit clamps the (MTU - ESP_overhead) to a
> minimum of 1280, turning the formula into
> MSS = max(MTU - ESP overhead, 1280) -  IP header - TCP header
> 
> With the (P)MTU near 1280, the calculated MSS is too large and the
> resulting TCP packets never make it to the destination because they
> are over the actual PMTU.
> 
> The above commit also causes suboptimal double fragmentation in
> xfrm tunnel mode, as described in
> https://lore.kernel.org/netdev/20210429202529.codhwpc7w6kbudug@dwarf.suse.cz/
> 
> The original problem the above commit was trying to fix is now fixed
> by commit 6596a0229541270fb8d38d989f91b78838e5e9da ("xfrm: fix MTU
> regression").
> 
> Signed-off-by: Jiri Bohac <jbohac@suse.cz>

Applied, thanks Jiri!

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall #forregzbot
  2022-01-16 10:18 ` xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall Thorsten Leemhuis
@ 2022-02-04  8:59   ` Thorsten Leemhuis
  0 siblings, 0 replies; 15+ messages in thread
From: Thorsten Leemhuis @ 2022-02-04  8:59 UTC (permalink / raw)
  To: regressions

For the record:

#regzbot fixed-by: a6d95c5a628a09be

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=a6d95c5a628a09be129f25d5663a7e9db8261f51

> Revert "xfrm: xfrm_state_mtu should return at least 1280 for ipv6"
> This reverts commit b515d2637276a3810d6595e10ab02c13bfd0b63a.
> 
> Commit b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm: xfrm_state_mtu
> should return at least 1280 for ipv6") in v5.14 breaks the TCP MSS
> calculation in ipsec transport mode, resulting complete stalls of TCP
> connections. This happens when the (P)MTU is 1280 or slighly larger.
> 
> The desired formula for the MSS is:
> MSS = (MTU - ESP_overhead) - IP header - TCP header
> 
> However, the above commit clamps the (MTU - ESP_overhead) to a
> minimum of 1280, turning the formula into
> MSS = max(MTU - ESP overhead, 1280) -  IP header - TCP header
> 
> With the (P)MTU near 1280, the calculated MSS is too large and the
> resulting TCP packets never make it to the destination because they
> are over the actual PMTU.
> 
> The above commit also causes suboptimal double fragmentation in
> xfrm tunnel mode, as described in
> https://lore.kernel.org/netdev/20210429202529.codhwpc7w6kbudug@dwarf.suse.cz/
> 
> The original problem the above commit was trying to fix is now fixed
> by commit 6596a0229541270fb8d38d989f91b78838e5e9da ("xfrm: fix MTU
> regression").
> 
> Signed-off-by: Jiri Bohac <jbohac@suse.cz>
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

TWIMC: this mail is primarily send for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.

On 16.01.22 11:18, Thorsten Leemhuis wrote:
> [TLDR: I'm adding this regression to regzbot, the Linux kernel
> regression tracking bot; most text you find below is compiled from a few
> templates paragraphs some of you might have seen already.]
> 
> Top-posting for once, to make this easy accessible to everyone.
> 
> On 14.01.22 18:31, Jiri Bohac wrote:
>>
>> our customer found that commit
>> b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm: xfrm_state_mtu
>> should return at least 1280 for ipv6") in v5.14 breaks the TCP
>> MSS calculation in ipsec transport mode, resulting complete
>> stalls of TCP connections. This happens when the (P)MTU is 1280
>> or slighly larger.
>>
>> The desired formula for the MSS is:
>> 	MSS = (MTU - ESP_overhead) - IP header - TCP header
>>
>> However, the above patch clamps the (MTU - ESP_overhead) to a
>> minimum of 1280, turning the formula into
>> 	MSS = max(MTU - ESP overhead, 1280) -  IP header - TCP header
>>
>> With the (P)MTU near 1280, the calculated MSS is too large and
>> the resulting TCP packets never make it to the destination
>> because they are over the actual PMTU.
>>
>> Trying to fix the exact same problem as the broken patch, which I
>> was unaware of, I sent an alternative patch in this thread of
>> April 2021:
>> https://lore.kernel.org/netdev/20210429170254.5grfgsz2hgy2qjhk@dwarf.suse.cz/
>> (note the v1 is broken and followed by v2!)
>>
>> In that thread I also found other problems with
>> b515d2637276a3810d6595e10ab02c13bfd0b63a - in tunnel mode it
>> causes suboptimal double fragmentation:
>> https://lore.kernel.org/netdev/20210429202529.codhwpc7w6kbudug@dwarf.suse.cz/
>>
>> I therefore propose to revert
>> b515d2637276a3810d6595e10ab02c13bfd0b63a and
>> apply the v2 version of my patch, which I'll re-send in reply to
>> this e-mail.
> 
> Thanks for the report.
> 
> To be sure this issue doesn't fall through the cracks unnoticed, I'm
> adding it to regzbot, my Linux kernel regression tracking bot:
> 
> #regzbot ^introduced b515d2637276a3810d6595e10ab02c13bfd0b63a
> #regzbot title xfrm: TCP MSS calculation broken by commit b515d263,
> results in TCP stall
> #regzbot ignore-activity
> 
> Reminder: when fixing the issue, please add a 'Link:' tag with the URL
> to the report (the parent of this mail) using the kernel.org redirector,
> as explained in 'Documentation/process/submitting-patches.rst'. Regzbot
> then will automatically mark the regression as resolved once the fix
> lands in the appropriate tree. For more details about regzbot see footer.
> 
> Sending this to everyone that got the initial report, to make all aware
> of the tracking. I also hope that messages like this motivate people to
> directly get at least the regression mailing list and ideally even
> regzbot involved when dealing with regressions, as messages like this
> wouldn't be needed then.
> 
> Don't worry, I'll send further messages wrt to this regression just to
> the lists (with a tag in the subject so people can filter them away), as
> long as they are intended just for regzbot. With a bit of luck no such
> messages will be needed anyway.
> 
> Ciao, Thorsten (wearing his 'Linux kernel regression tracker' hat)
> 
> P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
> on my table. I can only look briefly into most of them. Unfortunately
> therefore I sometimes will get things wrong or miss something important.
> I hope that's not the case here; if you think it is, don't hesitate to
> tell me about it in a public reply, that's in everyone's interest.
> 
> BTW, I have no personal interest in this issue, which is tracked using
> regzbot, my Linux kernel regression tracking bot
> (https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting
> this mail to get things rolling again and hence don't need to be CC on
> all further activities wrt to this regression.
> 
> ---
> Additional information about regzbot:
> 
> If you want to know more about regzbot, check out its web-interface, the
> getting start guide, and/or the references documentation:
> 
> https://linux-regtracking.leemhuis.info/regzbot/
> https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md
> https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md
> 
> The last two documents will explain how you can interact with regzbot
> yourself if your want to.
> 
> Hint for reporters: when reporting a regression it's in your interest to
> tell #regzbot about it in the report, as that will ensure the regression
> gets on the radar of regzbot and the regression tracker. That's in your
> interest, as they will make sure the report won't fall through the
> cracks unnoticed.
> 
> Hint for developers: you normally don't need to care about regzbot once
> it's involved. Fix the issue as you normally would, just remember to
> include a 'Link:' tag to the report in the commit message, as explained
> in Documentation/process/submitting-patches.rst
> That aspect was recently was made more explicit in commit 1f57bd42b77c:
> https://git.kernel.org/linus/1f57bd42b77c
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Revert "xfrm: xfrm_state_mtu should return at least 1280 for ipv6"
  2022-02-01  6:46               ` Steffen Klassert
@ 2022-02-15 14:59                 ` Thorsten Leemhuis
  2022-02-16 11:02                   ` Steffen Klassert
  0 siblings, 1 reply; 15+ messages in thread
From: Thorsten Leemhuis @ 2022-02-15 14:59 UTC (permalink / raw)
  To: Steffen Klassert, Jiri Bohac
  Cc: Sabrina Dubroca, Herbert Xu, David S. Miller, Mike Maloney,
	Eric Dumazet, netdev, Jakub Kicinski, regressions

Hi, this is your Linux kernel regression tracker speaking.

The patch discussed below is now in linux-next for about 11 days, but
not yet in the net tree afaics. Will it be merged this week? And
shouldn't this patch have these stables tags?

Cc: <stable@vger.kernel.org> # 5.14: 6596a0229541 xfrm: fix MTU regression
Cc: <stable@vger.kernel.org> # 5.14

And maybe a fixes tag, too?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I'm getting a lot of
reports on my table. I can only look briefly into most of them and lack
knowledge about most of the areas they concern. I thus unfortunately
will sometimes get things wrong or miss something important. I hope
that's not the case here; if you think it is, don't hesitate to tell me
in a public reply, it's in everyone's interest to set the public record
straight.

#regzbot poke

On 01.02.22 07:46, Steffen Klassert wrote:
> On Wed, Jan 26, 2022 at 04:00:18PM +0100, Jiri Bohac wrote:
>> This reverts commit b515d2637276a3810d6595e10ab02c13bfd0b63a.
>>
>> Commit b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm: xfrm_state_mtu
>> should return at least 1280 for ipv6") in v5.14 breaks the TCP MSS
>> calculation in ipsec transport mode, resulting complete stalls of TCP
>> connections. This happens when the (P)MTU is 1280 or slighly larger.
>>
>> The desired formula for the MSS is:
>> MSS = (MTU - ESP_overhead) - IP header - TCP header
>>
>> However, the above commit clamps the (MTU - ESP_overhead) to a
>> minimum of 1280, turning the formula into
>> MSS = max(MTU - ESP overhead, 1280) -  IP header - TCP header
>>
>> With the (P)MTU near 1280, the calculated MSS is too large and the
>> resulting TCP packets never make it to the destination because they
>> are over the actual PMTU.
>>
>> The above commit also causes suboptimal double fragmentation in
>> xfrm tunnel mode, as described in
>> https://lore.kernel.org/netdev/20210429202529.codhwpc7w6kbudug@dwarf.suse.cz/
>>
>> The original problem the above commit was trying to fix is now fixed
>> by commit 6596a0229541270fb8d38d989f91b78838e5e9da ("xfrm: fix MTU
>> regression").
>>
>> Signed-off-by: Jiri Bohac <jbohac@suse.cz>
> 
> Applied, thanks Jiri!

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Revert "xfrm: xfrm_state_mtu should return at least 1280 for ipv6"
  2022-02-15 14:59                 ` Thorsten Leemhuis
@ 2022-02-16 11:02                   ` Steffen Klassert
  0 siblings, 0 replies; 15+ messages in thread
From: Steffen Klassert @ 2022-02-16 11:02 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Jiri Bohac, Sabrina Dubroca, Herbert Xu, David S. Miller,
	Mike Maloney, Eric Dumazet, netdev, Jakub Kicinski, regressions

On Tue, Feb 15, 2022 at 03:59:38PM +0100, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker speaking.
> 
> The patch discussed below is now in linux-next for about 11 days, but
> not yet in the net tree afaics. Will it be merged this week? And

It will be merged with the next pull request for the ipsec tree
that will happen likely next week.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-02-16 11:11 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-14 17:31 xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall Jiri Bohac
2022-01-14 17:40 ` [PATCH] xfrm: fix MTU regression Jiri Bohac
2022-01-19  7:35   ` Steffen Klassert
2022-01-19  9:12     ` Jiri Bohac
2022-01-19  9:22       ` [PATCH v2] " Jiri Bohac
2022-01-26  6:41         ` Steffen Klassert
2022-01-24 15:45       ` [PATCH] " Steffen Klassert
2022-01-25  9:41         ` Jiri Bohac
2022-01-26  6:42           ` Steffen Klassert
2022-01-26 15:00             ` [PATCH] Revert "xfrm: xfrm_state_mtu should return at least 1280 for ipv6" Jiri Bohac
2022-02-01  6:46               ` Steffen Klassert
2022-02-15 14:59                 ` Thorsten Leemhuis
2022-02-16 11:02                   ` Steffen Klassert
2022-01-16 10:18 ` xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall Thorsten Leemhuis
2022-02-04  8:59   ` xfrm regression: TCP MSS calculation broken by commit b515d263, results in TCP stall #forregzbot Thorsten Leemhuis

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.