All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Revert "tcp: simplify window probe aborting on USER_TIMEOUT"
@ 2021-01-09  4:38 Enke Chen
  2021-01-11 14:58 ` Neal Cardwell
  0 siblings, 1 reply; 3+ messages in thread
From: Enke Chen @ 2021-01-09  4:38 UTC (permalink / raw)
  To: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, Soheil Hassas Yeganeh,
	Neal Cardwell, Yuchung Cheng
  Cc: netdev, linux-kernel, Jonathan Maxwell, William McCall, enchen2020

From: Enke Chen <enchen@paloaltonetworks.com>

This reverts commit 9721e709fa68ef9b860c322b474cfbd1f8285b0f.

With the commit 9721e709fa68 ("tcp: simplify window probe aborting
on USER_TIMEOUT"), the TCP session does not terminate with
TCP_USER_TIMEOUT when data remain untransmitted due to zero window.

The number of unanswered zero-window probes (tcp_probes_out) is
reset to zero with incoming acks irrespective of the window size,
as described in tcp_probe_timer():

    RFC 1122 4.2.2.17 requires the sender to stay open indefinitely
    as long as the receiver continues to respond probes. We support
    this by default and reset icsk_probes_out with incoming ACKs.

This counter, however, is the wrong one to be used in calculating the
duration that the window remains closed and data remain untransmitted.
Thanks to Jonathan Maxwell <jmaxwell37@gmail.com> for diagnosing the
actual issue.

Cc: stable@vger.kernel.org
Fixes: 9721e709fa68 ("tcp: simplify window probe aborting on USER_TIMEOUT")
Reported-by: William McCall <william.mccall@gmail.com>
Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
---
 net/ipv4/tcp_timer.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 6c62b9ea1320..ad98f2ea89f1 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -346,6 +346,7 @@ static void tcp_probe_timer(struct sock *sk)
 	struct sk_buff *skb = tcp_send_head(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
 	int max_probes;
+	u32 start_ts;
 
 	if (tp->packets_out || !skb) {
 		icsk->icsk_probes_out = 0;
@@ -360,13 +361,12 @@ static void tcp_probe_timer(struct sock *sk)
 	 * corresponding system limit. We also implement similar policy when
 	 * we use RTO to probe window in tcp_retransmit_timer().
 	 */
-	if (icsk->icsk_user_timeout) {
-		u32 elapsed = tcp_model_timeout(sk, icsk->icsk_probes_out,
-						tcp_probe0_base(sk));
-
-		if (elapsed >= icsk->icsk_user_timeout)
-			goto abort;
-	}
+	start_ts = tcp_skb_timestamp(skb);
+	if (!start_ts)
+		skb->skb_mstamp_ns = tp->tcp_clock_cache;
+	else if (icsk->icsk_user_timeout &&
+		 (s32)(tcp_time_stamp(tp) - start_ts) > icsk->icsk_user_timeout)
+		goto abort;
 
 	max_probes = sock_net(sk)->ipv4.sysctl_tcp_retries2;
 	if (sock_flag(sk, SOCK_DEAD)) {
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] Revert "tcp: simplify window probe aborting on USER_TIMEOUT"
  2021-01-09  4:38 [PATCH] Revert "tcp: simplify window probe aborting on USER_TIMEOUT" Enke Chen
@ 2021-01-11 14:58 ` Neal Cardwell
  2021-01-11 23:22   ` Enke Chen
  0 siblings, 1 reply; 3+ messages in thread
From: Neal Cardwell @ 2021-01-11 14:58 UTC (permalink / raw)
  To: Enke Chen
  Cc: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, Soheil Hassas Yeganeh,
	Yuchung Cheng, Netdev, LKML, Jonathan Maxwell, William McCall,
	enchen2020

On Fri, Jan 8, 2021 at 11:38 PM Enke Chen <enkechen2020@gmail.com> wrote:
>
> From: Enke Chen <enchen@paloaltonetworks.com>
>
> This reverts commit 9721e709fa68ef9b860c322b474cfbd1f8285b0f.
>
> With the commit 9721e709fa68 ("tcp: simplify window probe aborting
> on USER_TIMEOUT"), the TCP session does not terminate with
> TCP_USER_TIMEOUT when data remain untransmitted due to zero window.
>
> The number of unanswered zero-window probes (tcp_probes_out) is
> reset to zero with incoming acks irrespective of the window size,
> as described in tcp_probe_timer():
>
>     RFC 1122 4.2.2.17 requires the sender to stay open indefinitely
>     as long as the receiver continues to respond probes. We support
>     this by default and reset icsk_probes_out with incoming ACKs.
>
> This counter, however, is the wrong one to be used in calculating the
> duration that the window remains closed and data remain untransmitted.
> Thanks to Jonathan Maxwell <jmaxwell37@gmail.com> for diagnosing the
> actual issue.
>
> Cc: stable@vger.kernel.org
> Fixes: 9721e709fa68 ("tcp: simplify window probe aborting on USER_TIMEOUT")
> Reported-by: William McCall <william.mccall@gmail.com>
> Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
> ---

I ran this revert commit through our packetdrill TCP tests, and it's
causing failures in a ZWP/USER_TIMEOUT test due to interactions with
this Jan 2019 patch:

    7f12422c4873e9b274bc151ea59cb0cdf9415cf1
    tcp: always timestamp on every skb transmission

The issue seems to be that after 7f12422c4873 the skb->skb_mstamp_ns
is set on every transmit attempt. That means that even skbs that are
not successfully transmitted have a non-zero skb_mstamp_ns. That means
that if ZWPs are repeatedly failing to be sent due to severe local
qdisc congestion, then at this point in the code the start_ts is
always only 500ms in the past (from TCP_RESOURCE_PROBE_INTERVAL =
500ms). That means that if there is severe local qdisc congestion a
USER_TIMEOUT above 500ms is a NOP, and the socket can live far past
the USER_TIMEOUT.

It seems we need a slightly different approach than the revert in this commit.

neal

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] Revert "tcp: simplify window probe aborting on USER_TIMEOUT"
  2021-01-11 14:58 ` Neal Cardwell
@ 2021-01-11 23:22   ` Enke Chen
  0 siblings, 0 replies; 3+ messages in thread
From: Enke Chen @ 2021-01-11 23:22 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, Soheil Hassas Yeganeh,
	Yuchung Cheng, Netdev, LKML, Jonathan Maxwell, William McCall,
	enkechen2020

Hi, Neal:

Thank you for testing the reverted patch, and provding the detailed analysis
of the underline issue with the original patch.

Let me go back to the simple and clean approach using a separate counter, as
we were discussing before.

-- Enke

On Mon, Jan 11, 2021 at 09:58:33AM -0500, Neal Cardwell wrote:
> On Fri, Jan 8, 2021 at 11:38 PM Enke Chen <enkechen2020@gmail.com> wrote:
> >
> > From: Enke Chen <enchen@paloaltonetworks.com>
> >
> > This reverts commit 9721e709fa68ef9b860c322b474cfbd1f8285b0f.
> >
> > With the commit 9721e709fa68 ("tcp: simplify window probe aborting
> > on USER_TIMEOUT"), the TCP session does not terminate with
> > TCP_USER_TIMEOUT when data remain untransmitted due to zero window.
> >
> > The number of unanswered zero-window probes (tcp_probes_out) is
> > reset to zero with incoming acks irrespective of the window size,
> > as described in tcp_probe_timer():
> >
> >     RFC 1122 4.2.2.17 requires the sender to stay open indefinitely
> >     as long as the receiver continues to respond probes. We support
> >     this by default and reset icsk_probes_out with incoming ACKs.
> >
> > This counter, however, is the wrong one to be used in calculating the
> > duration that the window remains closed and data remain untransmitted.
> > Thanks to Jonathan Maxwell <jmaxwell37@gmail.com> for diagnosing the
> > actual issue.
> >
> > Cc: stable@vger.kernel.org
> > Fixes: 9721e709fa68 ("tcp: simplify window probe aborting on USER_TIMEOUT")
> > Reported-by: William McCall <william.mccall@gmail.com>
> > Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
> > ---
> 
> I ran this revert commit through our packetdrill TCP tests, and it's
> causing failures in a ZWP/USER_TIMEOUT test due to interactions with
> this Jan 2019 patch:
> 
>     7f12422c4873e9b274bc151ea59cb0cdf9415cf1
>     tcp: always timestamp on every skb transmission
> 
> The issue seems to be that after 7f12422c4873 the skb->skb_mstamp_ns
> is set on every transmit attempt. That means that even skbs that are
> not successfully transmitted have a non-zero skb_mstamp_ns. That means
> that if ZWPs are repeatedly failing to be sent due to severe local
> qdisc congestion, then at this point in the code the start_ts is
> always only 500ms in the past (from TCP_RESOURCE_PROBE_INTERVAL =
> 500ms). That means that if there is severe local qdisc congestion a
> USER_TIMEOUT above 500ms is a NOP, and the socket can live far past
> the USER_TIMEOUT.
> 
> It seems we need a slightly different approach than the revert in this commit.
> 
> neal

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-01-12  0:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-09  4:38 [PATCH] Revert "tcp: simplify window probe aborting on USER_TIMEOUT" Enke Chen
2021-01-11 14:58 ` Neal Cardwell
2021-01-11 23:22   ` Enke Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.