All of lore.kernel.org
 help / color / mirror / Atom feed
From: Neal Cardwell <ncardwell@google.com>
To: Steve Ibanez <sibanez@stanford.edu>
Cc: Eric Dumazet <edumazet@google.com>,
	Yuchung Cheng <ycheng@google.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Netdev <netdev@vger.kernel.org>, Florian Westphal <fw@strlen.de>,
	Mohammad Alizadeh <alizadeh@csail.mit.edu>,
	Lawrence Brakmo <brakmo@fb.com>
Subject: Re: Linux ECN Handling
Date: Tue, 21 Nov 2017 22:46:16 -0500	[thread overview]
Message-ID: <CADVnQykDfa3o6CnX7BeLdokEd4t-mq6wEwsShLtn+spV6OkkaA@mail.gmail.com> (raw)
In-Reply-To: <CACJspmLRasn3vmmQdXdsXTJ4JmNZyWjMZdk2kf3qHao19dhJgw@mail.gmail.com>

On Tue, Nov 21, 2017 at 10:02 PM, Steve Ibanez <sibanez@stanford.edu> wrote:
> Hi Neal,
>
> I just tried out your fix for enabling TLPs in the CWR state (while
> leaving tcp_tso_should_defer() unchanged), but I'm still seeing the
> host enter long timeouts. Feel free to let me know if there is
> something else you'd like me to try.

Oh, interesting. That was surprising to me, until I re-read the TLP
code. I think the TLP code is accidentally preventing the TLP timer
from being set in cases where TSO deferral is using cwnd unused,
because of this part of the logic:

  if ((tp->snd_cwnd > tcp_packets_in_flight(tp)) &&
      !tcp_write_queue_empty(sk))
        return false;

AFAICT it would be great for the TLP timer to be set if TSO deferral
decides not to send. That way the TLP timer firing can unwedge the
connection (in only a few milliseconds in LAN cases) if TSO deferral
decides to defer sending and ACK stop arriving. Removing those 3 lines
might allow TLP to give us much of the benefit of having a timer to
unwedge things after TSO deferral, without adding any new timers or
code.

If you have time, would you be able to try the following two patches
together in your test set-up?

(1)
commit 1ade85cd788cfed0433a83da03e299f396769c73
Author: Neal Cardwell <ncardwell@google.com>
Date:   Tue Nov 21 22:33:30 2017 -0500

    tcp: allow TLP in CWR

    (Also allows TLP in disorder, though this is somewhat academic, since
    in disorder RACK will almost always override the TLP timer with a
    reorder timeout.)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 4ea79b2ad82e..deccf8070f84 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2536,11 +2536,11 @@ bool tcp_schedule_loss_probe(struct sock *sk,
bool advancing_rto)

        early_retrans = sock_net(sk)->ipv4.sysctl_tcp_early_retrans;
        /* Schedule a loss probe in 2*RTT for SACK capable connections
-        * in Open state, that are either limited by cwnd or application.
+        * not in loss recovery, that are either limited by cwnd or application.
         */
        if ((early_retrans != 3 && early_retrans != 4) ||
            !tp->packets_out || !tcp_is_sack(tp) ||
-           icsk->icsk_ca_state != TCP_CA_Open)
+           icsk->icsk_ca_state >= TCP_CA_Recovery)
                return false;

        if ((tp->snd_cwnd > tcp_packets_in_flight(tp)) &&

(2)
commit ccd377a601c14dc82826720d93afb573a388022e (HEAD ->
gnetnext8xx-tlp-recalc-rto-on-ack-fix-v4)
Author: Neal Cardwell <ncardwell@google.com>
Date:   Tue Nov 21 22:34:42 2017 -0500

    tcp: allow scheduling TLP timer if TSO deferral leaves some cwnd unused

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index deccf8070f84..1724cc2bbf1a 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2543,10 +2543,6 @@ bool tcp_schedule_loss_probe(struct sock *sk,
bool advancing_rto)
            icsk->icsk_ca_state >= TCP_CA_Recovery)
                return false;

-       if ((tp->snd_cwnd > tcp_packets_in_flight(tp)) &&
-            !tcp_write_queue_empty(sk))
-               return false;
-
        /* Probe timeout is 2*rtt. Add minimum RTO to account
         * for delayed ack when there's one outstanding packet. If no RTT
         * sample is available then probe after TCP_TIMEOUT_INIT.

Thanks!

neal

  reply	other threads:[~2017-11-22  3:46 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CACJspmLFdy9i8K=TkzXHnofyFMNoJf9HkYE7On8uG+PREc2Dqw@mail.gmail.com>
2017-10-19 12:43 ` Linux ECN Handling Florian Westphal
2017-10-23 22:15   ` Steve Ibanez
2017-10-24  1:11     ` Neal Cardwell
2017-11-06 14:08       ` Daniel Borkmann
2017-11-06 23:31         ` Steve Ibanez
2017-11-20  7:31           ` Steve Ibanez
2017-11-20 15:05             ` Neal Cardwell
     [not found]             ` <CADVnQy=q4qBpe0Ymo8dtFTYU_0x0q_XKE+ZvazLQE-ULwu7pQA@mail.gmail.com>
2017-11-20 15:40               ` Eric Dumazet
2017-11-21  5:58               ` Steve Ibanez
2017-11-21 15:01                 ` Neal Cardwell
2017-11-21 15:51                   ` Yuchung Cheng
2017-11-21 16:20                     ` Neal Cardwell
2017-11-21 16:52                       ` Eric Dumazet
2017-11-22  3:02                         ` Steve Ibanez
2017-11-22  3:46                           ` Neal Cardwell [this message]
2017-11-27 18:49                             ` Steve Ibanez
2017-12-01 16:35                               ` Neal Cardwell
2017-12-05  5:22                                 ` Steve Ibanez
2017-12-05 15:23                                   ` Neal Cardwell
2017-12-05 19:36                                     ` Steve Ibanez
2017-12-05 20:04                                       ` Neal Cardwell
2017-12-19  5:16                                         ` Steve Ibanez
2017-12-19 15:28                                           ` Neal Cardwell
2017-12-19 22:00                                             ` Steve Ibanez
2017-12-20  0:08                                               ` Neal Cardwell
2017-12-20 19:20                                                 ` Steve Ibanez
2017-12-20 20:17                                                   ` Neal Cardwell
2018-01-02  7:43                                                     ` Steve Ibanez
2018-01-02 16:27                                                       ` Neal Cardwell
2018-01-02 23:57                                                         ` Steve Ibanez
2018-01-03 19:39                                                           ` Neal Cardwell
2018-01-03 22:21                                                             ` Steve Ibanez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CADVnQykDfa3o6CnX7BeLdokEd4t-mq6wEwsShLtn+spV6OkkaA@mail.gmail.com \
    --to=ncardwell@google.com \
    --cc=alizadeh@csail.mit.edu \
    --cc=brakmo@fb.com \
    --cc=daniel@iogearbox.net \
    --cc=edumazet@google.com \
    --cc=fw@strlen.de \
    --cc=netdev@vger.kernel.org \
    --cc=sibanez@stanford.edu \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.