All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
@ 2021-01-22 10:27 Pengcheng Yang
  2021-01-22 10:53 ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Pengcheng Yang @ 2021-01-22 10:27 UTC (permalink / raw)
  To: edumazet, ncardwell, ycheng, davem; +Cc: netdev, Pengcheng Yang

When CA_STATE is in DISORDER, the TLP timer is not set when receiving
an ACK (a cumulative ACK covered out-of-order data) causes CA_STATE to
change from DISORDER to OPEN. If the sender is app-limited, it can only
wait for the RTO timer to expire and retransmit.

The reason for this is that the TLP timer is set before CA_STATE changes
in tcp_ack(), so we delay the time point of calling tcp_set_xmit_timer()
until after tcp_fastretrans_alert() returns and remove the
FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer is set.

This commit has two additional benefits:
1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
avoid spurious RTO caused by RTO timer early expires.
2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
timer is set.

Link: https://lore.kernel.org/netdev/1611139794-11254-1-git-send-email-yangpc@wangsu.com
Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Cc: Neal Cardwell <ncardwell@google.com>
---
 include/net/tcp.h       |  2 +-
 net/ipv4/tcp_input.c    | 10 ++++++----
 net/ipv4/tcp_recovery.c |  5 +++--
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 78d13c8..67f7e52 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2060,7 +2060,7 @@ static inline __u32 cookie_init_sequence(const struct tcp_request_sock_ops *ops,
 void tcp_newreno_mark_lost(struct sock *sk, bool snd_una_advanced);
 extern s32 tcp_rack_skb_timeout(struct tcp_sock *tp, struct sk_buff *skb,
 				u32 reo_wnd);
-extern void tcp_rack_mark_lost(struct sock *sk);
+extern bool tcp_rack_mark_lost(struct sock *sk);
 extern void tcp_rack_advance(struct tcp_sock *tp, u8 sacked, u32 end_seq,
 			     u64 xmit_time);
 extern void tcp_rack_reo_timeout(struct sock *sk);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c7e16b0..d0a9588 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2859,7 +2859,8 @@ static void tcp_identify_packet_loss(struct sock *sk, int *ack_flag)
 	} else if (tcp_is_rack(sk)) {
 		u32 prior_retrans = tp->retrans_out;
 
-		tcp_rack_mark_lost(sk);
+		if (tcp_rack_mark_lost(sk))
+			*ack_flag &= ~FLAG_SET_XMIT_TIMER;
 		if (prior_retrans > tp->retrans_out)
 			*ack_flag |= FLAG_LOST_RETRANS;
 	}
@@ -3815,9 +3816,6 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 
 	if (tp->tlp_high_seq)
 		tcp_process_tlp_ack(sk, ack, flag);
-	/* If needed, reset TLP/RTO timer; RACK may later override this. */
-	if (flag & FLAG_SET_XMIT_TIMER)
-		tcp_set_xmit_timer(sk);
 
 	if (tcp_ack_is_dubious(sk, flag)) {
 		if (!(flag & (FLAG_SND_UNA_ADVANCED | FLAG_NOT_DUP))) {
@@ -3830,6 +3828,10 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 				      &rexmit);
 	}
 
+	/* If needed, reset TLP/RTO timer when RACK doesn't set. */
+	if (flag & FLAG_SET_XMIT_TIMER)
+		tcp_set_xmit_timer(sk);
+
 	if ((flag & FLAG_FORWARD_PROGRESS) || !(flag & FLAG_NOT_DUP))
 		sk_dst_confirm(sk);
 
diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
index 177307a..6f1b4ac 100644
--- a/net/ipv4/tcp_recovery.c
+++ b/net/ipv4/tcp_recovery.c
@@ -96,13 +96,13 @@ static void tcp_rack_detect_loss(struct sock *sk, u32 *reo_timeout)
 	}
 }
 
-void tcp_rack_mark_lost(struct sock *sk)
+bool tcp_rack_mark_lost(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	u32 timeout;
 
 	if (!tp->rack.advanced)
-		return;
+		return false;
 
 	/* Reset the advanced flag to avoid unnecessary queue scanning */
 	tp->rack.advanced = 0;
@@ -112,6 +112,7 @@ void tcp_rack_mark_lost(struct sock *sk)
 		inet_csk_reset_xmit_timer(sk, ICSK_TIME_REO_TIMEOUT,
 					  timeout, inet_csk(sk)->icsk_rto);
 	}
+	return !!timeout;
 }
 
 /* Record the most recently (re)sent time among the (s)acked packets
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
  2021-01-22 10:27 [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN Pengcheng Yang
@ 2021-01-22 10:53 ` Eric Dumazet
  2021-01-22 14:36   ` Neal Cardwell
  2021-01-23  1:27   ` Jakub Kicinski
  0 siblings, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2021-01-22 10:53 UTC (permalink / raw)
  To: Pengcheng Yang; +Cc: Neal Cardwell, Yuchung Cheng, David Miller, netdev

On Fri, Jan 22, 2021 at 11:28 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
>
> When CA_STATE is in DISORDER, the TLP timer is not set when receiving
> an ACK (a cumulative ACK covered out-of-order data) causes CA_STATE to
> change from DISORDER to OPEN. If the sender is app-limited, it can only
> wait for the RTO timer to expire and retransmit.
>
> The reason for this is that the TLP timer is set before CA_STATE changes
> in tcp_ack(), so we delay the time point of calling tcp_set_xmit_timer()
> until after tcp_fastretrans_alert() returns and remove the
> FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer is set.
>
> This commit has two additional benefits:
> 1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
> avoid spurious RTO caused by RTO timer early expires.
> 2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
> timer is set.
>
> Link: https://lore.kernel.org/netdev/1611139794-11254-1-git-send-email-yangpc@wangsu.com
> Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> ---

This looks like a very nice patch, let me run packetdrill tests on it.

By any chance, have you cooked a packetdrill test showing the issue
(failing on unpatched kernel) ?

Thanks.

>  include/net/tcp.h       |  2 +-
>  net/ipv4/tcp_input.c    | 10 ++++++----
>  net/ipv4/tcp_recovery.c |  5 +++--
>  3 files changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 78d13c8..67f7e52 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -2060,7 +2060,7 @@ static inline __u32 cookie_init_sequence(const struct tcp_request_sock_ops *ops,
>  void tcp_newreno_mark_lost(struct sock *sk, bool snd_una_advanced);
>  extern s32 tcp_rack_skb_timeout(struct tcp_sock *tp, struct sk_buff *skb,
>                                 u32 reo_wnd);
> -extern void tcp_rack_mark_lost(struct sock *sk);
> +extern bool tcp_rack_mark_lost(struct sock *sk);
>  extern void tcp_rack_advance(struct tcp_sock *tp, u8 sacked, u32 end_seq,
>                              u64 xmit_time);
>  extern void tcp_rack_reo_timeout(struct sock *sk);
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index c7e16b0..d0a9588 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2859,7 +2859,8 @@ static void tcp_identify_packet_loss(struct sock *sk, int *ack_flag)
>         } else if (tcp_is_rack(sk)) {
>                 u32 prior_retrans = tp->retrans_out;
>
> -               tcp_rack_mark_lost(sk);
> +               if (tcp_rack_mark_lost(sk))
> +                       *ack_flag &= ~FLAG_SET_XMIT_TIMER;
>                 if (prior_retrans > tp->retrans_out)
>                         *ack_flag |= FLAG_LOST_RETRANS;
>         }
> @@ -3815,9 +3816,6 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
>
>         if (tp->tlp_high_seq)
>                 tcp_process_tlp_ack(sk, ack, flag);
> -       /* If needed, reset TLP/RTO timer; RACK may later override this. */
> -       if (flag & FLAG_SET_XMIT_TIMER)
> -               tcp_set_xmit_timer(sk);
>
>         if (tcp_ack_is_dubious(sk, flag)) {
>                 if (!(flag & (FLAG_SND_UNA_ADVANCED | FLAG_NOT_DUP))) {
> @@ -3830,6 +3828,10 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
>                                       &rexmit);
>         }
>
> +       /* If needed, reset TLP/RTO timer when RACK doesn't set. */
> +       if (flag & FLAG_SET_XMIT_TIMER)
> +               tcp_set_xmit_timer(sk);
> +
>         if ((flag & FLAG_FORWARD_PROGRESS) || !(flag & FLAG_NOT_DUP))
>                 sk_dst_confirm(sk);
>
> diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
> index 177307a..6f1b4ac 100644
> --- a/net/ipv4/tcp_recovery.c
> +++ b/net/ipv4/tcp_recovery.c
> @@ -96,13 +96,13 @@ static void tcp_rack_detect_loss(struct sock *sk, u32 *reo_timeout)
>         }
>  }
>
> -void tcp_rack_mark_lost(struct sock *sk)
> +bool tcp_rack_mark_lost(struct sock *sk)
>  {
>         struct tcp_sock *tp = tcp_sk(sk);
>         u32 timeout;
>
>         if (!tp->rack.advanced)
> -               return;
> +               return false;
>
>         /* Reset the advanced flag to avoid unnecessary queue scanning */
>         tp->rack.advanced = 0;
> @@ -112,6 +112,7 @@ void tcp_rack_mark_lost(struct sock *sk)
>                 inet_csk_reset_xmit_timer(sk, ICSK_TIME_REO_TIMEOUT,
>                                           timeout, inet_csk(sk)->icsk_rto);
>         }
> +       return !!timeout;
>  }
>
>  /* Record the most recently (re)sent time among the (s)acked packets
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
  2021-01-22 10:53 ` Eric Dumazet
@ 2021-01-22 14:36   ` Neal Cardwell
  2021-01-22 21:02     ` Yuchung Cheng
  2021-01-23  1:27   ` Jakub Kicinski
  1 sibling, 1 reply; 10+ messages in thread
From: Neal Cardwell @ 2021-01-22 14:36 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Pengcheng Yang, Yuchung Cheng, David Miller, netdev

On Fri, Jan 22, 2021 at 5:53 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Fri, Jan 22, 2021 at 11:28 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> >
> > When CA_STATE is in DISORDER, the TLP timer is not set when receiving
> > an ACK (a cumulative ACK covered out-of-order data) causes CA_STATE to
> > change from DISORDER to OPEN. If the sender is app-limited, it can only
> > wait for the RTO timer to expire and retransmit.
> >
> > The reason for this is that the TLP timer is set before CA_STATE changes
> > in tcp_ack(), so we delay the time point of calling tcp_set_xmit_timer()
> > until after tcp_fastretrans_alert() returns and remove the
> > FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer is set.
> >
> > This commit has two additional benefits:
> > 1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
> > avoid spurious RTO caused by RTO timer early expires.
> > 2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
> > timer is set.
> >
> > Link: https://lore.kernel.org/netdev/1611139794-11254-1-git-send-email-yangpc@wangsu.com
> > Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
> > Cc: Neal Cardwell <ncardwell@google.com>
> > ---
>
> This looks like a very nice patch, let me run packetdrill tests on it.
>
> By any chance, have you cooked a packetdrill test showing the issue
> (failing on unpatched kernel) ?

Thanks, Pengcheng. This patch looks good to me as well, assuming it
passes our packetdrill tests. I agree with Eric that it would be good
to have an explicit packetdrill test for this case.

neal

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
  2021-01-22 14:36   ` Neal Cardwell
@ 2021-01-22 21:02     ` Yuchung Cheng
  2021-01-23 13:58       ` Pengcheng Yang
  0 siblings, 1 reply; 10+ messages in thread
From: Yuchung Cheng @ 2021-01-22 21:02 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: Eric Dumazet, Pengcheng Yang, David Miller, netdev

On Fri, Jan 22, 2021 at 6:37 AM Neal Cardwell <ncardwell@google.com> wrote:
>
> On Fri, Jan 22, 2021 at 5:53 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Fri, Jan 22, 2021 at 11:28 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> > >
> > > When CA_STATE is in DISORDER, the TLP timer is not set when receiving
> > > an ACK (a cumulative ACK covered out-of-order data) causes CA_STATE to
> > > change from DISORDER to OPEN. If the sender is app-limited, it can only

Could you point which line of code causes the state to flip
incorrectly due to the TLP timer setting?

> > > wait for the RTO timer to expire and retransmit.
> > >
> > > The reason for this is that the TLP timer is set before CA_STATE changes
> > > in tcp_ack(), so we delay the time point of calling tcp_set_xmit_timer()
> > > until after tcp_fastretrans_alert() returns and remove the
> > > FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer is set.
> > >
> > > This commit has two additional benefits:
> > > 1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
> > > avoid spurious RTO caused by RTO timer early expires.
> > > 2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
> > > timer is set.
> > >
> > > Link: https://lore.kernel.org/netdev/1611139794-11254-1-git-send-email-yangpc@wangsu.com
> > > Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
> > > Cc: Neal Cardwell <ncardwell@google.com>
> > > ---
> >
> > This looks like a very nice patch, let me run packetdrill tests on it.
> >
> > By any chance, have you cooked a packetdrill test showing the issue
> > (failing on unpatched kernel) ?
>
> Thanks, Pengcheng. This patch looks good to me as well, assuming it
> passes our packetdrill tests. I agree with Eric that it would be good
> to have an explicit packetdrill test for this case.
>
> neal

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
  2021-01-22 10:53 ` Eric Dumazet
  2021-01-22 14:36   ` Neal Cardwell
@ 2021-01-23  1:27   ` Jakub Kicinski
  2021-01-23 14:47     ` Pengcheng Yang
  1 sibling, 1 reply; 10+ messages in thread
From: Jakub Kicinski @ 2021-01-23  1:27 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Pengcheng Yang, Neal Cardwell, Yuchung Cheng, David Miller, netdev

On Fri, 22 Jan 2021 11:53:46 +0100 Eric Dumazet wrote:
> On Fri, Jan 22, 2021 at 11:28 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> >
> > When CA_STATE is in DISORDER, the TLP timer is not set when receiving
> > an ACK (a cumulative ACK covered out-of-order data) causes CA_STATE to
> > change from DISORDER to OPEN. If the sender is app-limited, it can only
> > wait for the RTO timer to expire and retransmit.
> >
> > The reason for this is that the TLP timer is set before CA_STATE changes
> > in tcp_ack(), so we delay the time point of calling tcp_set_xmit_timer()
> > until after tcp_fastretrans_alert() returns and remove the
> > FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer is set.
> >
> > This commit has two additional benefits:
> > 1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
> > avoid spurious RTO caused by RTO timer early expires.
> > 2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
> > timer is set.
> >
> > Link: https://lore.kernel.org/netdev/1611139794-11254-1-git-send-email-yangpc@wangsu.com
> > Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
> > Cc: Neal Cardwell <ncardwell@google.com>
>
> This looks like a very nice patch, let me run packetdrill tests on it.
> 
> By any chance, have you cooked a packetdrill test showing the issue
> (failing on unpatched kernel) ?

Any guidance on backporting / fixes tag? (once the packetdrill
questions are satisfied)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
  2021-01-22 21:02     ` Yuchung Cheng
@ 2021-01-23 13:58       ` Pengcheng Yang
  2021-01-23 19:14         ` Yuchung Cheng
  0 siblings, 1 reply; 10+ messages in thread
From: Pengcheng Yang @ 2021-01-23 13:58 UTC (permalink / raw)
  To: ycheng; +Cc: davem, edumazet, ncardwell, netdev, yangpc

On Sat, Jan 23, 2021 at 5:02 AM "Yuchung Cheng" <ycheng@google.com> wrote:
>
> On Fri, Jan 22, 2021 at 6:37 AM Neal Cardwell <ncardwell@google.com> wrote:
> >
> > On Fri, Jan 22, 2021 at 5:53 AM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Fri, Jan 22, 2021 at 11:28 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> > > >
> > > > When CA_STATE is in DISORDER, the TLP timer is not set when receiving
> > > > an ACK (a cumulative ACK covered out-of-order data) causes CA_STATE to
> > > > change from DISORDER to OPEN. If the sender is app-limited, it can only
> 
> Could you point which line of code causes the state to flip
> incorrectly due to the TLP timer setting?
> 

I mean TLP timer is not set due to receiving an ACK that changes CA_STATE 
from DISORDER to OPEN.

Receive an ACK covered out-of-order data in disorder state:

tcp_ack()
|-tcp_set_xmit_timer()	// RTO timer is set instead of TLP timer
|  ...
|-tcp_fastretrans_alert() // change from disorder to open

> > > > wait for the RTO timer to expire and retransmit.
> > > >
> > > > The reason for this is that the TLP timer is set before CA_STATE changes
> > > > in tcp_ack(), so we delay the time point of calling tcp_set_xmit_timer()
> > > > until after tcp_fastretrans_alert() returns and remove the
> > > > FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer is set.
> > > >
> > > > This commit has two additional benefits:
> > > > 1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
> > > > avoid spurious RTO caused by RTO timer early expires.
> > > > 2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
> > > > timer is set.
> > > >
> > > > Link: https://lore.kernel.org/netdev/1611139794-11254-1-git-send-email-yangpc@wangsu.com
> > > > Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
> > > > Cc: Neal Cardwell <ncardwell@google.com>
> > > > ---
> > >
> > > This looks like a very nice patch, let me run packetdrill tests on it.
> > >
> > > By any chance, have you cooked a packetdrill test showing the issue
> > > (failing on unpatched kernel) ?
> >
> > Thanks, Pengcheng. This patch looks good to me as well, assuming it
> > passes our packetdrill tests. I agree with Eric that it would be good
> > to have an explicit packetdrill test for this case.
> >
> > neal

Here is a packetdrill test case:

// Enable TLP
    0 `sysctl -q net.ipv4.tcp_early_retrans=3`

// Establish a connection
   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
   +0 bind(3, ..., ...) = 0
   +0 listen(3, 1) = 0

// RTT 100ms, RTO 300ms
  +.1 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
   +0 > S. 0:0(0) ack 1 <...>
  +.1 < . 1:1(0) ack 1 win 257
   +0 accept(3, ..., ...) = 4

// Send 4 data segments
   +0 write(4, ..., 4000) = 4000
   +0 > P. 1:4001(4000) ack 1

// out-of-order: ca_state turns to disorder
  +.1 < . 1:1(0) ack 1 win 257 <sack 1001:2001,nop,nop>

// ACK covered out-of-order data: ca_state turns to open,
// but RTO timer is set instead of TLP timer and the RTO 
// timer will expire at rtx_head_time+RTO (in 200ms).
   +0 < . 1:1(0) ack 2001 win 257

// Expect to send TLP packet in 2*rtt (200ms)
+.2~+.25 > P. 3001:4001(1000) ack 1


I ran this packetdrill test case on the kernel without 
the patch applied:

tlp_timer_unset.pkt:31: error handling packet: live packet 
field tcp_seq: expected: 3001 (0xbb9) vs actual: 2001 (0x7d1)
script packet:  0.644587 P. 3001:4001(1000) ack 1 
actual packet:  0.646197 . 2001:3001(1000) ack 1 win 502 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
  2021-01-23  1:27   ` Jakub Kicinski
@ 2021-01-23 14:47     ` Pengcheng Yang
  2021-01-23 18:25       ` Neal Cardwell
  0 siblings, 1 reply; 10+ messages in thread
From: Pengcheng Yang @ 2021-01-23 14:47 UTC (permalink / raw)
  To: kuba; +Cc: davem, edumazet, ncardwell, netdev, yangpc, ycheng

On Sat, Jan 23, 2021 at 9:27 AM "Jakub Kicinski" <kuba@kernel.org> wrote:
> 
> On Fri, 22 Jan 2021 11:53:46 +0100 Eric Dumazet wrote:
> > On Fri, Jan 22, 2021 at 11:28 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> > >
> > > When CA_STATE is in DISORDER, the TLP timer is not set when receiving
> > > an ACK (a cumulative ACK covered out-of-order data) causes CA_STATE to
> > > change from DISORDER to OPEN. If the sender is app-limited, it can only
> > > wait for the RTO timer to expire and retransmit.
> > >
> > > The reason for this is that the TLP timer is set before CA_STATE changes
> > > in tcp_ack(), so we delay the time point of calling tcp_set_xmit_timer()
> > > until after tcp_fastretrans_alert() returns and remove the
> > > FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer is set.
> > >
> > > This commit has two additional benefits:
> > > 1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
> > > avoid spurious RTO caused by RTO timer early expires.
> > > 2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
> > > timer is set.
> > >
> > > Link: https://lore.kernel.org/netdev/1611139794-11254-1-git-send-email-yangpc@wangsu.com
> > > Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
> > > Cc: Neal Cardwell <ncardwell@google.com>
> >
> > This looks like a very nice patch, let me run packetdrill tests on it.
> > 
> > By any chance, have you cooked a packetdrill test showing the issue
> > (failing on unpatched kernel) ?
> 
> Any guidance on backporting / fixes tag? (once the packetdrill
> questions are satisfied)

By reading the commits, we can add:
Fixes: df92c8394e6e ("tcp: fix xmit timer to only be reset if data ACKed/SACKed")


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
  2021-01-23 14:47     ` Pengcheng Yang
@ 2021-01-23 18:25       ` Neal Cardwell
  2021-01-23 20:25         ` Jakub Kicinski
  0 siblings, 1 reply; 10+ messages in thread
From: Neal Cardwell @ 2021-01-23 18:25 UTC (permalink / raw)
  To: Pengcheng Yang
  Cc: Jakub Kicinski, David Miller, Eric Dumazet, Netdev, Yuchung Cheng

On Sat, Jan 23, 2021 at 9:15 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
>
> On Sat, Jan 23, 2021 at 9:27 AM "Jakub Kicinski" <kuba@kernel.org> wrote:
> >
> > On Fri, 22 Jan 2021 11:53:46 +0100 Eric Dumazet wrote:
> > > On Fri, Jan 22, 2021 at 11:28 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> > > >
> > > > When CA_STATE is in DISORDER, the TLP timer is not set when receiving
> > > > an ACK (a cumulative ACK covered out-of-order data) causes CA_STATE to
> > > > change from DISORDER to OPEN. If the sender is app-limited, it can only
> > > > wait for the RTO timer to expire and retransmit.
> > > >
> > > > The reason for this is that the TLP timer is set before CA_STATE changes
> > > > in tcp_ack(), so we delay the time point of calling tcp_set_xmit_timer()
> > > > until after tcp_fastretrans_alert() returns and remove the
> > > > FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer is set.
> > > >
> > > > This commit has two additional benefits:
> > > > 1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
> > > > avoid spurious RTO caused by RTO timer early expires.
> > > > 2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
> > > > timer is set.
> > > >
> > > > Link: https://lore.kernel.org/netdev/1611139794-11254-1-git-send-email-yangpc@wangsu.com
> > > > Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
> > > > Cc: Neal Cardwell <ncardwell@google.com>
> > >
> > > This looks like a very nice patch, let me run packetdrill tests on it.
> > >
> > > By any chance, have you cooked a packetdrill test showing the issue
> > > (failing on unpatched kernel) ?
> >
> > Any guidance on backporting / fixes tag? (once the packetdrill
> > questions are satisfied)
>
> By reading the commits, we can add:
> Fixes: df92c8394e6e ("tcp: fix xmit timer to only be reset if data ACKed/SACKed")

Thanks for the fix and the packetdrill test!

Eric ran our internal packetdrill test suite and all the changes in
behavior with your patch all look like fixes to me.

Acked-by: Neal Cardwell <ncardwell@google.com>

The Fixes: tag you suggest also looks good to me. (It seems like
patchwork did not pick it up from your email and add it to the commit
message automatically, BTW?)

neal

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
  2021-01-23 13:58       ` Pengcheng Yang
@ 2021-01-23 19:14         ` Yuchung Cheng
  0 siblings, 0 replies; 10+ messages in thread
From: Yuchung Cheng @ 2021-01-23 19:14 UTC (permalink / raw)
  To: Pengcheng Yang; +Cc: David Miller, Eric Dumazet, Neal Cardwell, netdev

On Sat, Jan 23, 2021 at 5:27 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
>
> On Sat, Jan 23, 2021 at 5:02 AM "Yuchung Cheng" <ycheng@google.com> wrote:
> >
> > On Fri, Jan 22, 2021 at 6:37 AM Neal Cardwell <ncardwell@google.com> wrote:
> > >
> > > On Fri, Jan 22, 2021 at 5:53 AM Eric Dumazet <edumazet@google.com> wrote:
> > > >
> > > > On Fri, Jan 22, 2021 at 11:28 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> > > > >
> > > > > When CA_STATE is in DISORDER, the TLP timer is not set when receiving
> > > > > an ACK (a cumulative ACK covered out-of-order data) causes CA_STATE to
> > > > > change from DISORDER to OPEN. If the sender is app-limited, it can only
> >
> > Could you point which line of code causes the state to flip
> > incorrectly due to the TLP timer setting?
> >
>
> I mean TLP timer is not set due to receiving an ACK that changes CA_STATE
> from DISORDER to OPEN.
Acked-by: Yuchung Cheng <ycheng@google.com>


Thanks for the clarification and the fix. The initial cmsg reads like
the timer setting is causing state flips. I'd suggest this commit
message in your next rev.

"Upon receiving a cumulative ACK that changes the congestion state
from Disorder to Open, the TLP timer is not set. If the sender is
app-limited, it can only wait for the RTO timer to expire and
retransmit.

The reason for this ... "


>
> Receive an ACK covered out-of-order data in disorder state:
>
> tcp_ack()
> |-tcp_set_xmit_timer()  // RTO timer is set instead of TLP timer
> |  ...
> |-tcp_fastretrans_alert() // change from disorder to open
>
> > > > > wait for the RTO timer to expire and retransmit.
> > > > >
> > > > > The reason for this is that the TLP timer is set before CA_STATE changes
> > > > > in tcp_ack(), so we delay the time point of calling tcp_set_xmit_timer()
> > > > > until after tcp_fastretrans_alert() returns and remove the
> > > > > FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer is set.
> > > > >
> > > > > This commit has two additional benefits:
> > > > > 1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
> > > > > avoid spurious RTO caused by RTO timer early expires.
> > > > > 2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
> > > > > timer is set.
> > > > >
> > > > > Link: https://lore.kernel.org/netdev/1611139794-11254-1-git-send-email-yangpc@wangsu.com
> > > > > Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
> > > > > Cc: Neal Cardwell <ncardwell@google.com>
> > > > > ---
> > > >
> > > > This looks like a very nice patch, let me run packetdrill tests on it.
> > > >
> > > > By any chance, have you cooked a packetdrill test showing the issue
> > > > (failing on unpatched kernel) ?
> > >
> > > Thanks, Pengcheng. This patch looks good to me as well, assuming it
> > > passes our packetdrill tests. I agree with Eric that it would be good
> > > to have an explicit packetdrill test for this case.
> > >
> > > neal
>
> Here is a packetdrill test case:
>
> // Enable TLP
>     0 `sysctl -q net.ipv4.tcp_early_retrans=3`
>
> // Establish a connection
>    +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
>    +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>    +0 bind(3, ..., ...) = 0
>    +0 listen(3, 1) = 0
>
> // RTT 100ms, RTO 300ms
>   +.1 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
>    +0 > S. 0:0(0) ack 1 <...>
>   +.1 < . 1:1(0) ack 1 win 257
>    +0 accept(3, ..., ...) = 4
>
> // Send 4 data segments
>    +0 write(4, ..., 4000) = 4000
>    +0 > P. 1:4001(4000) ack 1
>
> // out-of-order: ca_state turns to disorder
>   +.1 < . 1:1(0) ack 1 win 257 <sack 1001:2001,nop,nop>
>
> // ACK covered out-of-order data: ca_state turns to open,
> // but RTO timer is set instead of TLP timer and the RTO
> // timer will expire at rtx_head_time+RTO (in 200ms).
>    +0 < . 1:1(0) ack 2001 win 257
>
> // Expect to send TLP packet in 2*rtt (200ms)
> +.2~+.25 > P. 3001:4001(1000) ack 1
>
>
> I ran this packetdrill test case on the kernel without
> the patch applied:
>
> tlp_timer_unset.pkt:31: error handling packet: live packet
> field tcp_seq: expected: 3001 (0xbb9) vs actual: 2001 (0x7d1)
> script packet:  0.644587 P. 3001:4001(1000) ack 1
> actual packet:  0.646197 . 2001:3001(1000) ack 1 win 502
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
  2021-01-23 18:25       ` Neal Cardwell
@ 2021-01-23 20:25         ` Jakub Kicinski
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2021-01-23 20:25 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: Pengcheng Yang, David Miller, Eric Dumazet, Netdev, Yuchung Cheng

On Sat, 23 Jan 2021 13:25:23 -0500 Neal Cardwell wrote:
> On Sat, Jan 23, 2021 at 9:15 AM Pengcheng Yang <yangpc@wangsu.com> wrote:
> >
> > On Sat, Jan 23, 2021 at 9:27 AM "Jakub Kicinski" <kuba@kernel.org> wrote:  
> > >
> > > On Fri, 22 Jan 2021 11:53:46 +0100 Eric Dumazet wrote:  
> > > > On Fri, Jan 22, 2021 at 11:28 AM Pengcheng Yang <yangpc@wangsu.com> wrote:  
> > > > >
> > > > > When CA_STATE is in DISORDER, the TLP timer is not set when receiving
> > > > > an ACK (a cumulative ACK covered out-of-order data) causes CA_STATE to
> > > > > change from DISORDER to OPEN. If the sender is app-limited, it can only
> > > > > wait for the RTO timer to expire and retransmit.
> > > > >
> > > > > The reason for this is that the TLP timer is set before CA_STATE changes
> > > > > in tcp_ack(), so we delay the time point of calling tcp_set_xmit_timer()
> > > > > until after tcp_fastretrans_alert() returns and remove the
> > > > > FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer is set.
> > > > >
> > > > > This commit has two additional benefits:
> > > > > 1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
> > > > > avoid spurious RTO caused by RTO timer early expires.
> > > > > 2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
> > > > > timer is set.
> > > > >
> > > > > Link: https://lore.kernel.org/netdev/1611139794-11254-1-git-send-email-yangpc@wangsu.com
> > > > > Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
> > > > > Cc: Neal Cardwell <ncardwell@google.com>  
> > > >
> > > > This looks like a very nice patch, let me run packetdrill tests on it.
> > > >
> > > > By any chance, have you cooked a packetdrill test showing the issue
> > > > (failing on unpatched kernel) ?  
> > >
> > > Any guidance on backporting / fixes tag? (once the packetdrill
> > > questions are satisfied)  
> >
> > By reading the commits, we can add:
> > Fixes: df92c8394e6e ("tcp: fix xmit timer to only be reset if data ACKed/SACKed")  
> 
> Thanks for the fix and the packetdrill test!
> 
> Eric ran our internal packetdrill test suite and all the changes in
> behavior with your patch all look like fixes to me.
> 
> Acked-by: Neal Cardwell <ncardwell@google.com>

Thanks!

> The Fixes: tag you suggest also looks good to me. (It seems like
> patchwork did not pick it up from your email and add it to the commit
> message automatically, BTW?)

Indeed, we haven't worked that out yet in patchwork.kernel.org, but
it's no big deal, normally we can add manually. In this case seems 
like Yuchung asked for a small rewrite to the commit message so I'd
appreciate a full repost.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-01-23 20:26 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-22 10:27 [PATCH net] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN Pengcheng Yang
2021-01-22 10:53 ` Eric Dumazet
2021-01-22 14:36   ` Neal Cardwell
2021-01-22 21:02     ` Yuchung Cheng
2021-01-23 13:58       ` Pengcheng Yang
2021-01-23 19:14         ` Yuchung Cheng
2021-01-23  1:27   ` Jakub Kicinski
2021-01-23 14:47     ` Pengcheng Yang
2021-01-23 18:25       ` Neal Cardwell
2021-01-23 20:25         ` Jakub Kicinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.