netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] tcp: refine memory limit test in tcp_fragment()
@ 2019-06-21 13:09 Eric Dumazet
  2019-06-21 14:50 ` Christoph Paasch
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Eric Dumazet @ 2019-06-21 13:09 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet, Christoph Paasch

tcp_fragment() might be called for skbs in the write queue.

Memory limits might have been exceeded because tcp_sendmsg() only
checks limits at full skb (64KB) boundaries.

Therefore, we need to make sure tcp_fragment() wont punish applications
that might have setup very low SO_SNDBUF values.

Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Christoph Paasch <cpaasch@apple.com>
---
 net/ipv4/tcp_output.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 00c01a01b547ec67c971dc25a74c9258563cf871..0ebc33d1c9e5099d163a234930e213ee35e9fbd1 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1296,7 +1296,8 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
 	if (nsize < 0)
 		nsize = 0;
 
-	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
+	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
+		     tcp_queue != TCP_FRAG_IN_WRITE_QUEUE)) {
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
 		return -ENOMEM;
 	}
-- 
2.22.0.410.gd8fdbe21b5-goog


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net] tcp: refine memory limit test in tcp_fragment()
  2019-06-21 13:09 [PATCH net] tcp: refine memory limit test in tcp_fragment() Eric Dumazet
@ 2019-06-21 14:50 ` Christoph Paasch
  2019-06-22  1:00 ` David Miller
  2019-07-03  3:27 ` Tony Lu
  2 siblings, 0 replies; 5+ messages in thread
From: Christoph Paasch @ 2019-06-21 14:50 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S . Miller, netdev, Eric Dumazet



> On Jun 21, 2019, at 6:11 AM, Eric Dumazet <edumazet@google.com> wrote:
> 
> tcp_fragment() might be called for skbs in the write queue.
> 
> Memory limits might have been exceeded because tcp_sendmsg() only
> checks limits at full skb (64KB) boundaries.
> 
> Therefore, we need to make sure tcp_fragment() wont punish applications
> that might have setup very low SO_SNDBUF values.
> 
> Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> ---
> net/ipv4/tcp_output.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)

Tested-by: Christoph Paasch <cpaasch@apple.com>


Thanks!


> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 00c01a01b547ec67c971dc25a74c9258563cf871..0ebc33d1c9e5099d163a234930e213ee35e9fbd1 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1296,7 +1296,8 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
>    if (nsize < 0)
>        nsize = 0;
> 
> -    if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
> +    if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
> +             tcp_queue != TCP_FRAG_IN_WRITE_QUEUE)) {
>        NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
>        return -ENOMEM;
>    }
> -- 
> 2.22.0.410.gd8fdbe21b5-goog
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] tcp: refine memory limit test in tcp_fragment()
  2019-06-21 13:09 [PATCH net] tcp: refine memory limit test in tcp_fragment() Eric Dumazet
  2019-06-21 14:50 ` Christoph Paasch
@ 2019-06-22  1:00 ` David Miller
  2019-07-03  3:27 ` Tony Lu
  2 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2019-06-22  1:00 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet, cpaasch

From: Eric Dumazet <edumazet@google.com>
Date: Fri, 21 Jun 2019 06:09:55 -0700

> tcp_fragment() might be called for skbs in the write queue.
> 
> Memory limits might have been exceeded because tcp_sendmsg() only
> checks limits at full skb (64KB) boundaries.
> 
> Therefore, we need to make sure tcp_fragment() wont punish applications
> that might have setup very low SO_SNDBUF values.
> 
> Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Christoph Paasch <cpaasch@apple.com>

Applied and queued up for -stable.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] tcp: refine memory limit test in tcp_fragment()
  2019-06-21 13:09 [PATCH net] tcp: refine memory limit test in tcp_fragment() Eric Dumazet
  2019-06-21 14:50 ` Christoph Paasch
  2019-06-22  1:00 ` David Miller
@ 2019-07-03  3:27 ` Tony Lu
  2019-07-03  9:19   ` Eric Dumazet
  2 siblings, 1 reply; 5+ messages in thread
From: Tony Lu @ 2019-07-03  3:27 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, netdev, Eric Dumazet, Christoph Paasch,
	oliver.yang, xlpang, dust.li

Hello Eric,

	We have applied that commit e358f4af19db ("tcp: tcp_fragment() should apply sane memory limits")
	as a hotpatch in production environment. We found that it will make
	tcp long connection reset during sending out packet when applying
	that commit. 
	
	Our applications which in A/B test have suffered that
	and made them retransmit large data, and then caused retransmission
	storm and lower the performance and increase RT.

	Therefore we discontinued to apply this hotpatch in A/B test.

	After invesgation, we found this patch already fix this issue in
	stable. Before applying this patch, we have some questions:

	1. This commit in stable hard coded a magic number 0x20000. I am
	wondering this value and if there any better solution.
	2. Is there any known or unknown side effect? If any, we could test
	it in some suspicious scenarios before testing in prod env.

	Thanks.

Cheers,
Tony Lu

On Fri, Jun 21, 2019 at 06:09:55AM -0700, Eric Dumazet wrote:
> tcp_fragment() might be called for skbs in the write queue.
> 
> Memory limits might have been exceeded because tcp_sendmsg() only
> checks limits at full skb (64KB) boundaries.
> 
> Therefore, we need to make sure tcp_fragment() wont punish applications
> that might have setup very low SO_SNDBUF values.
> 
> Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> ---
>  net/ipv4/tcp_output.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 00c01a01b547ec67c971dc25a74c9258563cf871..0ebc33d1c9e5099d163a234930e213ee35e9fbd1 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1296,7 +1296,8 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
>  	if (nsize < 0)
>  		nsize = 0;
>  
> -	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
> +	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
> +		     tcp_queue != TCP_FRAG_IN_WRITE_QUEUE)) {
>  		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
>  		return -ENOMEM;
>  	}
> -- 
> 2.22.0.410.gd8fdbe21b5-goog

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] tcp: refine memory limit test in tcp_fragment()
  2019-07-03  3:27 ` Tony Lu
@ 2019-07-03  9:19   ` Eric Dumazet
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2019-07-03  9:19 UTC (permalink / raw)
  To: Tony Lu, Eric Dumazet
  Cc: David S . Miller, netdev, Eric Dumazet, Christoph Paasch,
	oliver.yang, xlpang, dust.li



On 7/2/19 8:27 PM, Tony Lu wrote:
> Hello Eric,
> 
> 	We have applied that commit e358f4af19db ("tcp: tcp_fragment() should apply sane memory limits")
> 	as a hotpatch in production environment. We found that it will make
> 	tcp long connection reset during sending out packet when applying
> 	that commit. 
> 	
> 	Our applications which in A/B test have suffered that
> 	and made them retransmit large data, and then caused retransmission
> 	storm and lower the performance and increase RT.
> 
> 	Therefore we discontinued to apply this hotpatch in A/B test.
> 
> 	After invesgation, we found this patch already fix this issue in
> 	stable. Before applying this patch, we have some questions:
> 

Which stable version are you referring to exactly ?

> 	1. This commit in stable hard coded a magic number 0x20000. I am
> 	wondering this value and if there any better solution.

0x20000 is two times 64KB, please read the changelog for the rationale.

> 	2. Is there any known or unknown side effect? If any, we could test
> 	it in some suspicious scenarios before testing in prod env.

No known side effect.

Honestly, applications setting small SO_SNDBUF values can not expect good TCP performance anyway.


> 
> 	Thanks.
> 
> Cheers,
> Tony Lu
> 
> On Fri, Jun 21, 2019 at 06:09:55AM -0700, Eric Dumazet wrote:
>> tcp_fragment() might be called for skbs in the write queue.
>>
>> Memory limits might have been exceeded because tcp_sendmsg() only
>> checks limits at full skb (64KB) boundaries.
>>
>> Therefore, we need to make sure tcp_fragment() wont punish applications
>> that might have setup very low SO_SNDBUF values.
>>
>> Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>> Reported-by: Christoph Paasch <cpaasch@apple.com>
>> ---
>>  net/ipv4/tcp_output.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>> index 00c01a01b547ec67c971dc25a74c9258563cf871..0ebc33d1c9e5099d163a234930e213ee35e9fbd1 100644
>> --- a/net/ipv4/tcp_output.c
>> +++ b/net/ipv4/tcp_output.c
>> @@ -1296,7 +1296,8 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
>>  	if (nsize < 0)
>>  		nsize = 0;
>>  
>> -	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
>> +	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
>> +		     tcp_queue != TCP_FRAG_IN_WRITE_QUEUE)) {
>>  		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
>>  		return -ENOMEM;
>>  	}
>> -- 
>> 2.22.0.410.gd8fdbe21b5-goog

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-07-03  9:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-21 13:09 [PATCH net] tcp: refine memory limit test in tcp_fragment() Eric Dumazet
2019-06-21 14:50 ` Christoph Paasch
2019-06-22  1:00 ` David Miller
2019-07-03  3:27 ` Tony Lu
2019-07-03  9:19   ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).