linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tcp: keepalive fixes
@ 2021-01-12 19:25 Enke Chen
  2021-01-12 22:48 ` Yuchung Cheng
  2021-01-22 19:45 ` Enke Chen
  0 siblings, 2 replies; 7+ messages in thread
From: Enke Chen @ 2021-01-12 19:25 UTC (permalink / raw)
  To: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski
  Cc: netdev, linux-kernel, enkechen2020

From: Enke Chen <enchen@paloaltonetworks.com>

In this patch two issues with TCP keepalives are fixed:

1) TCP keepalive does not timeout when there are data waiting to be
   delivered and then the connection got broken. The TCP keepalive
   timeout is not evaluated in that condition.

   The fix is to remove the code that prevents TCP keepalive from
   being evaluated for timeout.

2) With the fix for #1, TCP keepalive can erroneously timeout after
   the 0-window probe kicks in. The 0-window probe counter is wrongly
   applied to TCP keepalives.

   The fix is to use the elapsed time instead of the 0-window probe
   counter in evaluating TCP keepalive timeout.

Cc: stable@vger.kernel.org
Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
---
 net/ipv4/tcp_timer.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 6c62b9ea1320..40953aa40d53 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -696,12 +696,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
 	    ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_SYN_SENT)))
 		goto out;
 
-	elapsed = keepalive_time_when(tp);
-
-	/* It is alive without keepalive 8) */
-	if (tp->packets_out || !tcp_write_queue_empty(sk))
-		goto resched;
-
 	elapsed = keepalive_time_elapsed(tp);
 
 	if (elapsed >= keepalive_time_when(tp)) {
@@ -709,16 +703,15 @@ static void tcp_keepalive_timer (struct timer_list *t)
 		 * to determine when to timeout instead.
 		 */
 		if ((icsk->icsk_user_timeout != 0 &&
-		    elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout) &&
-		    icsk->icsk_probes_out > 0) ||
+		     elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout)) ||
 		    (icsk->icsk_user_timeout == 0 &&
-		    icsk->icsk_probes_out >= keepalive_probes(tp))) {
+		     (elapsed >= keepalive_time_when(tp) +
+		      keepalive_intvl_when(tp) * keepalive_probes(tp)))) {
 			tcp_send_active_reset(sk, GFP_ATOMIC);
 			tcp_write_err(sk);
 			goto out;
 		}
 		if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
-			icsk->icsk_probes_out++;
 			elapsed = keepalive_intvl_when(tp);
 		} else {
 			/* If keepalive was lost due to local congestion,
@@ -732,8 +725,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
 	}
 
 	sk_mem_reclaim(sk);
-
-resched:
 	inet_csk_reset_keepalive_timer (sk, elapsed);
 	goto out;
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] tcp: keepalive fixes
  2021-01-12 19:25 [PATCH] tcp: keepalive fixes Enke Chen
@ 2021-01-12 22:48 ` Yuchung Cheng
  2021-01-12 22:52   ` Eric Dumazet
  2021-01-13  0:42   ` Enke Chen
  2021-01-22 19:45 ` Enke Chen
  1 sibling, 2 replies; 7+ messages in thread
From: Yuchung Cheng @ 2021-01-12 22:48 UTC (permalink / raw)
  To: Enke Chen
  Cc: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, netdev, LKML, Neal Cardwell

On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <enkechen2020@gmail.com> wrote:
>
> From: Enke Chen <enchen@paloaltonetworks.com>
>
> In this patch two issues with TCP keepalives are fixed:
>
> 1) TCP keepalive does not timeout when there are data waiting to be
>    delivered and then the connection got broken. The TCP keepalive
>    timeout is not evaluated in that condition.
hi enke
Do you have an example to demonstrate this issue -- in theory when
there is data inflight, an RTO timer should be pending (which
considers user-timeout setting). based on the user-timeout description
(man tcp), the user timeout should abort the socket per the specified
time after data commences. some data would help to understand the
issue.

>
>    The fix is to remove the code that prevents TCP keepalive from
>    being evaluated for timeout.
>
> 2) With the fix for #1, TCP keepalive can erroneously timeout after
>    the 0-window probe kicks in. The 0-window probe counter is wrongly
>    applied to TCP keepalives.
>
>    The fix is to use the elapsed time instead of the 0-window probe
>    counter in evaluating TCP keepalive timeout.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
> ---
>  net/ipv4/tcp_timer.c | 15 +++------------
>  1 file changed, 3 insertions(+), 12 deletions(-)
>
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 6c62b9ea1320..40953aa40d53 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -696,12 +696,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
>             ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_SYN_SENT)))
>                 goto out;
>
> -       elapsed = keepalive_time_when(tp);
> -
> -       /* It is alive without keepalive 8) */
> -       if (tp->packets_out || !tcp_write_queue_empty(sk))
> -               goto resched;
> -
>         elapsed = keepalive_time_elapsed(tp);
>
>         if (elapsed >= keepalive_time_when(tp)) {
> @@ -709,16 +703,15 @@ static void tcp_keepalive_timer (struct timer_list *t)
>                  * to determine when to timeout instead.
>                  */
>                 if ((icsk->icsk_user_timeout != 0 &&
> -                   elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout) &&
> -                   icsk->icsk_probes_out > 0) ||
> +                    elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout)) ||
>                     (icsk->icsk_user_timeout == 0 &&
> -                   icsk->icsk_probes_out >= keepalive_probes(tp))) {
> +                    (elapsed >= keepalive_time_when(tp) +
> +                     keepalive_intvl_when(tp) * keepalive_probes(tp)))) {
>                         tcp_send_active_reset(sk, GFP_ATOMIC);
>                         tcp_write_err(sk);
>                         goto out;
>                 }
>                 if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
> -                       icsk->icsk_probes_out++;
>                         elapsed = keepalive_intvl_when(tp);
>                 } else {
>                         /* If keepalive was lost due to local congestion,
> @@ -732,8 +725,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
>         }
>
>         sk_mem_reclaim(sk);
> -
> -resched:
>         inet_csk_reset_keepalive_timer (sk, elapsed);
>         goto out;
>
> --
> 2.29.2
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tcp: keepalive fixes
  2021-01-12 22:48 ` Yuchung Cheng
@ 2021-01-12 22:52   ` Eric Dumazet
  2021-01-13 20:06     ` Enke Chen
  2021-01-13  0:42   ` Enke Chen
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2021-01-12 22:52 UTC (permalink / raw)
  To: Yuchung Cheng
  Cc: Enke Chen, David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Jakub Kicinski, netdev, LKML, Neal Cardwell

On Tue, Jan 12, 2021 at 11:48 PM Yuchung Cheng <ycheng@google.com> wrote:
>
> On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <enkechen2020@gmail.com> wrote:
> >
> > From: Enke Chen <enchen@paloaltonetworks.com>
> >
> > In this patch two issues with TCP keepalives are fixed:
> >
> > 1) TCP keepalive does not timeout when there are data waiting to be
> >    delivered and then the connection got broken. The TCP keepalive
> >    timeout is not evaluated in that condition.
> hi enke
> Do you have an example to demonstrate this issue -- in theory when
> there is data inflight, an RTO timer should be pending (which
> considers user-timeout setting). based on the user-timeout description
> (man tcp), the user timeout should abort the socket per the specified
> time after data commences. some data would help to understand the
> issue.
>

+1

A packetdrill test would be ideal.

Also, given that there is this ongoing issue with TCP_USER_TIMEOUT,
lets not mix things
or risk added work for backports to stable versions.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tcp: keepalive fixes
  2021-01-12 22:48 ` Yuchung Cheng
  2021-01-12 22:52   ` Eric Dumazet
@ 2021-01-13  0:42   ` Enke Chen
  1 sibling, 0 replies; 7+ messages in thread
From: Enke Chen @ 2021-01-13  0:42 UTC (permalink / raw)
  To: Yuchung Cheng
  Cc: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, netdev, LKML, Neal Cardwell,
	enkechen2020

Hi, Yuchung:

I have attached the python script that reproduces the keepalive issues.
The script is a slight modification of the one written by Marek Majkowski:

https://github.com/cloudflare/cloudflare-blog/blob/master/2019-09-tcp-keepalives/test-zero.py

Please note that only the TCP keepalive is configured, and not the user timeout.

Thanks.  -- Enke

On Tue, Jan 12, 2021 at 02:48:01PM -0800, Yuchung Cheng wrote:
> On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <enkechen2020@gmail.com> wrote:
> >
> > From: Enke Chen <enchen@paloaltonetworks.com>
> >
> > In this patch two issues with TCP keepalives are fixed:
> >
> > 1) TCP keepalive does not timeout when there are data waiting to be
> >    delivered and then the connection got broken. The TCP keepalive
> >    timeout is not evaluated in that condition.
> hi enke
> Do you have an example to demonstrate this issue -- in theory when
> there is data inflight, an RTO timer should be pending (which
> considers user-timeout setting). based on the user-timeout description
> (man tcp), the user timeout should abort the socket per the specified
> time after data commences. some data would help to understand the
> issue.
> 

------
#! /usr/bin/python

import io
import os
import select
import socket
import time
import utils
import ctypes

utils.new_ns()

port = 1

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
s.bind(('127.0.0.1', port))
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)

s.listen(16)

tcpdump = utils.tcpdump_start(port)
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
c.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)
c.connect(('127.0.0.1', port))

x, _ = s.accept()

if False:
    c.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, 90*1000)

if True:
    c.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    c.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
    c.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 10)
    c.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)

time.sleep(0.2)
print("[ ] c.send()")
import fcntl
TIOCOUTQ=0x5411
c.setblocking(False)
while True:
    bytes_avail = ctypes.c_int()
    fcntl.ioctl(c.fileno(), TIOCOUTQ, bytes_avail)
    if bytes_avail.value > 64*1024:
        break
    try:
        c.send(b"A" * 16384 * 4)
    except io.BlockingIOError:
        break
c.setblocking(True)
time.sleep(0.2)
utils.ss(port)
utils.check_buffer(c)

t0 = time.time()

if True:
    utils.drop_start(dport=port)
    utils.drop_start(sport=port)

poll = select.poll()
poll.register(c, select.POLLIN)
poll.poll()

utils.ss(port)


e = c.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
print("[ ] SO_ERROR = %s" % (e,))

t1 = time.time()
print("[ ] took: %f seconds" % (t1-t0,))

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tcp: keepalive fixes
  2021-01-12 22:52   ` Eric Dumazet
@ 2021-01-13 20:06     ` Enke Chen
  2021-01-13 20:28       ` Enke Chen
  0 siblings, 1 reply; 7+ messages in thread
From: Enke Chen @ 2021-01-13 20:06 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Yuchung Cheng, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, netdev, LKML, Neal Cardwell

Hi, Eric:

Just to clarify: the issues for tcp keepalive and TCP_USER_TIMEOUT are
separate isues, and the fixes would not conflict afaik.

Thanks.  -- Enke

On Tue, Jan 12, 2021 at 11:52:43PM +0100, Eric Dumazet wrote:
> On Tue, Jan 12, 2021 at 11:48 PM Yuchung Cheng <ycheng@google.com> wrote:
> >
> > On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <enkechen2020@gmail.com> wrote:
> > >
> > > From: Enke Chen <enchen@paloaltonetworks.com>
> > >
> > > In this patch two issues with TCP keepalives are fixed:
> > >
> > > 1) TCP keepalive does not timeout when there are data waiting to be
> > >    delivered and then the connection got broken. The TCP keepalive
> > >    timeout is not evaluated in that condition.
> > hi enke
> > Do you have an example to demonstrate this issue -- in theory when
> > there is data inflight, an RTO timer should be pending (which
> > considers user-timeout setting). based on the user-timeout description
> > (man tcp), the user timeout should abort the socket per the specified
> > time after data commences. some data would help to understand the
> > issue.
> >
> 
> +1
> 
> A packetdrill test would be ideal.
> 
> Also, given that there is this ongoing issue with TCP_USER_TIMEOUT,
> lets not mix things
> or risk added work for backports to stable versions.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tcp: keepalive fixes
  2021-01-13 20:06     ` Enke Chen
@ 2021-01-13 20:28       ` Enke Chen
  0 siblings, 0 replies; 7+ messages in thread
From: Enke Chen @ 2021-01-13 20:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Yuchung Cheng, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, netdev, LKML, Neal Cardwell,
	enkechen2020

On Wed, Jan 13, 2021 at 12:06:27PM -0800, Enke Chen wrote:
> Hi, Eric:
> 
> Just to clarify: the issues for tcp keepalive and TCP_USER_TIMEOUT are
> separate isues, and the fixes would not conflict afaik.
> 
> Thanks.  -- Enke

I have posted patches for both issues, and there is no conflict between
the patches.

Thanks.  -- Enke

> 
> On Tue, Jan 12, 2021 at 11:52:43PM +0100, Eric Dumazet wrote:
> > On Tue, Jan 12, 2021 at 11:48 PM Yuchung Cheng <ycheng@google.com> wrote:
> > >
> > > On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <enkechen2020@gmail.com> wrote:
> > > >
> > > > From: Enke Chen <enchen@paloaltonetworks.com>
> > > >
> > > > In this patch two issues with TCP keepalives are fixed:
> > > >
> > > > 1) TCP keepalive does not timeout when there are data waiting to be
> > > >    delivered and then the connection got broken. The TCP keepalive
> > > >    timeout is not evaluated in that condition.
> > > hi enke
> > > Do you have an example to demonstrate this issue -- in theory when
> > > there is data inflight, an RTO timer should be pending (which
> > > considers user-timeout setting). based on the user-timeout description
> > > (man tcp), the user timeout should abort the socket per the specified
> > > time after data commences. some data would help to understand the
> > > issue.
> > >
> > 
> > +1
> > 
> > A packetdrill test would be ideal.
> > 
> > Also, given that there is this ongoing issue with TCP_USER_TIMEOUT,
> > lets not mix things
> > or risk added work for backports to stable versions.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tcp: keepalive fixes
  2021-01-12 19:25 [PATCH] tcp: keepalive fixes Enke Chen
  2021-01-12 22:48 ` Yuchung Cheng
@ 2021-01-22 19:45 ` Enke Chen
  1 sibling, 0 replies; 7+ messages in thread
From: Enke Chen @ 2021-01-22 19:45 UTC (permalink / raw)
  To: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski
  Cc: netdev, linux-kernel, Neal Cardwell, enkechen2020

Hi, Folks:

Please ignore this patch. I will split it into separate ones as suggested
off-list by Neal Cardwell <ncardwell@google.com>.

Thanks.  -- Enke

On Tue, Jan 12, 2021 at 11:25:44AM -0800, Enke Chen wrote:
> From: Enke Chen <enchen@paloaltonetworks.com>
> 
> In this patch two issues with TCP keepalives are fixed:
> 
> 1) TCP keepalive does not timeout when there are data waiting to be
>    delivered and then the connection got broken. The TCP keepalive
>    timeout is not evaluated in that condition.
> 
>    The fix is to remove the code that prevents TCP keepalive from
>    being evaluated for timeout.
> 
> 2) With the fix for #1, TCP keepalive can erroneously timeout after
>    the 0-window probe kicks in. The 0-window probe counter is wrongly
>    applied to TCP keepalives.
> 
>    The fix is to use the elapsed time instead of the 0-window probe
>    counter in evaluating TCP keepalive timeout.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
> ---
>  net/ipv4/tcp_timer.c | 15 +++------------
>  1 file changed, 3 insertions(+), 12 deletions(-)
> 
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 6c62b9ea1320..40953aa40d53 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -696,12 +696,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
>  	    ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_SYN_SENT)))
>  		goto out;
>  
> -	elapsed = keepalive_time_when(tp);
> -
> -	/* It is alive without keepalive 8) */
> -	if (tp->packets_out || !tcp_write_queue_empty(sk))
> -		goto resched;
> -
>  	elapsed = keepalive_time_elapsed(tp);
>  
>  	if (elapsed >= keepalive_time_when(tp)) {
> @@ -709,16 +703,15 @@ static void tcp_keepalive_timer (struct timer_list *t)
>  		 * to determine when to timeout instead.
>  		 */
>  		if ((icsk->icsk_user_timeout != 0 &&
> -		    elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout) &&
> -		    icsk->icsk_probes_out > 0) ||
> +		     elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout)) ||
>  		    (icsk->icsk_user_timeout == 0 &&
> -		    icsk->icsk_probes_out >= keepalive_probes(tp))) {
> +		     (elapsed >= keepalive_time_when(tp) +
> +		      keepalive_intvl_when(tp) * keepalive_probes(tp)))) {
>  			tcp_send_active_reset(sk, GFP_ATOMIC);
>  			tcp_write_err(sk);
>  			goto out;
>  		}
>  		if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
> -			icsk->icsk_probes_out++;
>  			elapsed = keepalive_intvl_when(tp);
>  		} else {
>  			/* If keepalive was lost due to local congestion,
> @@ -732,8 +725,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
>  	}
>  
>  	sk_mem_reclaim(sk);
> -
> -resched:
>  	inet_csk_reset_keepalive_timer (sk, elapsed);
>  	goto out;
>  
> -- 
> 2.29.2
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-01-22 22:40 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-12 19:25 [PATCH] tcp: keepalive fixes Enke Chen
2021-01-12 22:48 ` Yuchung Cheng
2021-01-12 22:52   ` Eric Dumazet
2021-01-13 20:06     ` Enke Chen
2021-01-13 20:28       ` Enke Chen
2021-01-13  0:42   ` Enke Chen
2021-01-22 19:45 ` Enke Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).