linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Larry Brakmo <brakmo@fb.com>,
	Yuchung Cheng <ycheng@google.com>,
	Eric Dumazet <edumazet@google.com>,
	Neal Cardwell <ncardwell@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 3.18 13/27] tcp: fix dctcp delayed ACK schedule
Date: Fri, 27 Jul 2018 12:26:50 +0200	[thread overview]
Message-ID: <20180727102649.533469606@linuxfoundation.org> (raw)
In-Reply-To: <20180727102649.049641021@linuxfoundation.org>

3.18-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yuchung Cheng <ycheng@google.com>

[ Upstream commit b0c05d0e99d98d7f0cd41efc1eeec94efdc3325d ]

Previously, when a data segment was sent an ACK was piggybacked
on the data segment without generating a CA_EVENT_NON_DELAYED_ACK
event to notify congestion control modules. So the DCTCP
ca->delayed_ack_reserved flag could incorrectly stay set when
in fact there were no delayed ACKs being reserved. This could result
in sending a special ECN notification ACK that carries an older
ACK sequence, when in fact there was no need for such an ACK.
DCTCP keeps track of the delayed ACK status with its own separate
state ca->delayed_ack_reserved. Previously it may accidentally cancel
the delayed ACK without updating this field upon sending a special
ACK that carries a older ACK sequence. This inconsistency would
lead to DCTCP receiver never acknowledging the latest data until the
sender times out and retry in some cases.

Packetdrill script (provided by Larry Brakmo)

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0

0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
0.110 < [ect0] . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4

0.200 < [ect0] . 1:1001(1000) ack 1 win 257
0.200 > [ect01] . 1:1(0) ack 1001

0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 1:2(1) ack 1001

0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 2:3(1) ack 2001

0.200 < [ect0] . 2001:3001(1000) ack 3 win 257
0.200 < [ect0] . 3001:4001(1000) ack 3 win 257
0.200 > [ect01] . 3:3(0) ack 4001

0.210 < [ce] P. 4001:4501(500) ack 3 win 257

+0.001 read(4, ..., 4500) = 4500
+0 write(4, ..., 1) = 1
+0 > [ect01] PE. 3:4(1) ack 4501

+0.010 < [ect0] W. 4501:5501(1000) ack 4 win 257
// Previously the ACK sequence below would be 4501, causing a long RTO
+0.040~+0.045 > [ect01] . 4:4(0) ack 5501   // delayed ack

+0.311 < [ect0] . 5501:6501(1000) ack 4 win 257  // More data
+0 > [ect01] . 4:4(0) ack 6501     // now acks everything

+0.500 < F. 9501:9501(0) ack 4 win 257

Reported-by: Larry Brakmo <brakmo@fb.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp_dctcp.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/net/ipv4/tcp_dctcp.c
+++ b/net/ipv4/tcp_dctcp.c
@@ -131,7 +131,8 @@ static void dctcp_ce_state_0_to_1(struct
 	/* State has changed from CE=0 to CE=1 and delayed
 	 * ACK has not sent yet.
 	 */
-	if (!ca->ce_state && ca->delayed_ack_reserved) {
+	if (!ca->ce_state &&
+	    inet_csk(sk)->icsk_ack.pending & ICSK_ACK_TIMER) {
 		u32 tmp_rcv_nxt;
 
 		/* Save current rcv_nxt. */
@@ -161,7 +162,8 @@ static void dctcp_ce_state_1_to_0(struct
 	/* State has changed from CE=1 to CE=0 and delayed
 	 * ACK has not sent yet.
 	 */
-	if (ca->ce_state && ca->delayed_ack_reserved) {
+	if (ca->ce_state &&
+	    inet_csk(sk)->icsk_ack.pending & ICSK_ACK_TIMER) {
 		u32 tmp_rcv_nxt;
 
 		/* Save current rcv_nxt. */



  parent reply	other threads:[~2018-07-27 10:27 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-27 10:26 [PATCH 3.18 00/27] 3.18.117-stable review Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 01/27] x86/MCE: Remove min interval polling limitation Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 02/27] fat: fix memory allocation failure handling of match_strdup() Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 03/27] ALSA: rawmidi: Change resized buffers atomically Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 04/27] ARC: Fix CONFIG_SWAP Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 05/27] ARC: mm: allow mprotect to make stack mappings executable Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 06/27] ipv4: Return EINVAL when ping_group_range sysctl doesnt map to user ns Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 07/27] ptp: fix missing break in switch Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 08/27] net: Dont copy pfmemalloc flag in __copy_skb_header() Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 09/27] skbuff: Unconditionally copy pfmemalloc in __skb_clone() Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 10/27] ip: hash fragments consistently Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 11/27] net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 12/27] rtnetlink: add rtnl_link_state check in rtnl_configure_link Greg Kroah-Hartman
2018-07-27 10:26 ` Greg Kroah-Hartman [this message]
2018-07-27 10:26 ` [PATCH 3.18 14/27] tcp: helpers to send special DCTCP ack Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 15/27] tcp: do not cancel delay-AcK on DCTCP special ACK Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 16/27] tcp: do not delay ACK in DCTCP upon CE status change Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 17/27] tcp: avoid collapses in tcp_prune_queue() if possible Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 18/27] tcp: detect malicious patterns in tcp_collapse_ofo_queue() Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 19/27] usb: cdc_acm: Add quirk for Castles VEGA3000 Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 20/27] usb: core: handle hub C_PORT_OVER_CURRENT condition Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 21/27] usb: gadget: f_fs: Only return delayed status when len is 0 Greg Kroah-Hartman
2018-07-27 10:26 ` [PATCH 3.18 22/27] can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK Greg Kroah-Hartman
2018-07-27 10:27 ` [PATCH 3.18 23/27] can: xilinx_can: fix device dropping off bus on RX overrun Greg Kroah-Hartman
2018-07-27 10:27 ` [PATCH 3.18 24/27] can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting Greg Kroah-Hartman
2018-07-27 10:27 ` [PATCH 3.18 25/27] can: xilinx_can: fix RX overflow interrupt not being enabled Greg Kroah-Hartman
2018-07-27 10:27 ` [PATCH 3.18 26/27] ARM: fix put_user() for gcc-8 Greg Kroah-Hartman
2018-07-27 10:27 ` [PATCH 3.18 27/27] turn off -Wattribute-alias Greg Kroah-Hartman
2018-07-27 12:21 ` [PATCH 3.18 00/27] 3.18.117-stable review Nathan Chancellor
2018-07-27 12:29   ` Greg Kroah-Hartman
2018-07-27 17:28 ` Guenter Roeck
2018-07-27 20:25 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180727102649.533469606@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=brakmo@fb.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ncardwell@google.com \
    --cc=stable@vger.kernel.org \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).