Linux-kselftest Archive on lore.kernel.org
 help / color / Atom feed
From: <sjpark@amazon.com>
To: <edumazet@google.com>, <davem@davemloft.net>
Cc: <shuah@kernel.org>, <netdev@vger.kernel.org>,
	<linux-kselftest@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<sj38.park@gmail.com>, <aams@amazon.com>,
	SeongJae Park <sjpark@amazon.de>
Subject: [PATCH 0/3] Fix reconnection latency caused by FIN/ACK handling race
Date: Fri, 31 Jan 2020 13:24:18 +0100
Message-ID: <20200131122421.23286-1-sjpark@amazon.com> (raw)

From: SeongJae Park <sjpark@amazon.de>

When closing a connection, the two acks that required to change closing
socket's status to FIN_WAIT_2 and then TIME_WAIT could be processed in
reverse order.  This is possible in RSS disabled environments such as a
connection inside a host.

For example, expected state transitions and required packets for the
disconnection will be similar to below flow.

	 00 (Process A)				(Process B)
	 01 ESTABLISHED				ESTABLISHED
	 02 close()
	 03 FIN_WAIT_1
	 04 		---FIN-->
	 05 					CLOSE_WAIT
	 06 		<--ACK---
	 07 FIN_WAIT_2
	 08 		<--FIN/ACK---
	 09 TIME_WAIT
	 10 		---ACK-->
	 11 					LAST_ACK
	 12 CLOSED				CLOSED

The acks in lines 6 and 8 are the acks.  If the line 8 packet is
processed before the line 6 packet, it will be just ignored as it is not
a expected packet, and the later process of the line 6 packet will
change the status of Process A to FIN_WAIT_2, but as it has already
handled line 8 packet, it will not go to TIME_WAIT and thus will not
send the line 10 packet to Process B.  Thus, Process B will left in
CLOSE_WAIT status, as below.

	 00 (Process A)				(Process B)
	 01 ESTABLISHED				ESTABLISHED
	 02 close()
	 03 FIN_WAIT_1
	 04 		---FIN-->
	 05 					CLOSE_WAIT
	 06 				(<--ACK---)
	 07	  			(<--FIN/ACK---)
	 08 				(fired in right order)
	 09 		<--FIN/ACK---
	 10 		<--ACK---
	 11 		(processed in reverse order)
	 12 FIN_WAIT_2

Later, if the Process B sends SYN to Process A for reconnection using
the same port, Process A will responds with an ACK for the last flow,
which has no increased sequence number.  Thus, Process A will send RST,
wait for TIMEOUT_INIT (one second in default), and then try
reconnection.  If reconnections are frequent, the one second latency
spikes can be a big problem.  Below is a tcpdump results of the problem:

    14.436259 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [S], seq 2560603644
    14.436266 IP 127.0.0.1.4242 > 127.0.0.1.45150: Flags [.], ack 5, win 512
    14.436271 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [R], seq 2541101298
    /* ONE SECOND DELAY */
    15.464613 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [S], seq 2560603644

Patchset Organization
---------------------

The first patch fix a trivial nit.  The second one fix the problem by
adjusting the resend delay of the SYN in the case.  Finally, the third
patch adds a user space test to reproduce this problem.

The patches are based on the v5.5.  You can also clone the complete git
tree:

    $ git clone git://github.com/sjp38/linux -b patches/finack_lat/v1

The web is also available:
https://github.com/sjp38/linux/tree/patches/finack_lat/v1

SeongJae Park (3):
  net/ipv4/inet_timewait_sock: Fix inconsistent comments
  tcp: Reduce SYN resend delay if a suspicous ACK is received
  selftests: net: Add FIN_ACK processing order related latency spike
    test

 net/ipv4/inet_timewait_sock.c                 |  1 +
 net/ipv4/tcp_input.c                          |  6 +-
 tools/testing/selftests/net/.gitignore        |  2 +
 tools/testing/selftests/net/Makefile          |  2 +
 tools/testing/selftests/net/fin_ack_lat.sh    | 42 ++++++++++
 .../selftests/net/fin_ack_lat_accept.c        | 49 +++++++++++
 .../selftests/net/fin_ack_lat_connect.c       | 81 +++++++++++++++++++
 7 files changed, 182 insertions(+), 1 deletion(-)
 create mode 100755 tools/testing/selftests/net/fin_ack_lat.sh
 create mode 100644 tools/testing/selftests/net/fin_ack_lat_accept.c
 create mode 100644 tools/testing/selftests/net/fin_ack_lat_connect.c

-- 
2.17.1


             reply index

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-31 12:24 sjpark [this message]
2020-01-31 12:24 ` [PATCH 1/3] net/ipv4/inet_timewait_sock: Fix inconsistent comments sjpark
2020-01-31 14:54   ` Eric Dumazet
2020-01-31 15:09     ` sjpark
2020-01-31 12:24 ` [PATCH 2/3] tcp: Reduce SYN resend delay if a suspicous ACK is received sjpark
2020-01-31 15:01   ` Eric Dumazet
2020-01-31 16:12     ` sjpark
2020-01-31 16:55       ` Eric Dumazet
2020-01-31 17:05         ` sjpark
2020-01-31 17:08           ` Eric Dumazet
2020-01-31 15:10   ` Neal Cardwell
2020-01-31 18:12     ` Eric Dumazet
2020-01-31 22:11       ` Neal Cardwell
2020-01-31 22:17         ` SeongJae Park
2020-02-01  3:55           ` Neal Cardwell
2020-02-01  6:08             ` SeongJae Park
2020-02-01 13:30               ` Neal Cardwell
2020-01-31 22:53         ` Eric Dumazet
2020-02-03 15:40           ` David Laight
2020-02-03 15:54             ` Eric Dumazet
2020-01-31 12:24 ` [PATCH 3/3] selftests: net: Add FIN_ACK processing order related latency spike test sjpark
2020-01-31 14:56   ` Eric Dumazet
2020-01-31 15:13     ` sjpark
2020-01-31 14:00 ` [PATCH 0/3] Fix reconnection latency caused by FIN/ACK handling race David Laight
2020-01-31 15:05   ` sjpark

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200131122421.23286-1-sjpark@amazon.com \
    --to=sjpark@amazon.com \
    --cc=aams@amazon.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=shuah@kernel.org \
    --cc=sj38.park@gmail.com \
    --cc=sjpark@amazon.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-kselftest Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-kselftest/0 linux-kselftest/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-kselftest linux-kselftest/ https://lore.kernel.org/linux-kselftest \
		linux-kselftest@vger.kernel.org
	public-inbox-index linux-kselftest

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kselftest


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git