All of lore.kernel.org
 help / color / mirror / Atom feed
From: Neal Cardwell <ncardwell@google.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: sjpark@amazon.com, Eric Dumazet <edumazet@google.com>,
	David Miller <davem@davemloft.net>,
	shuah@kernel.org, Netdev <netdev@vger.kernel.org>,
	linux-kselftest@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	sj38.park@gmail.com, aams@amazon.com,
	SeongJae Park <sjpark@amazon.de>,
	Yuchung Cheng <ycheng@google.com>
Subject: Re: [PATCH 2/3] tcp: Reduce SYN resend delay if a suspicous ACK is received
Date: Fri, 31 Jan 2020 17:11:35 -0500	[thread overview]
Message-ID: <CADVnQy=Z0YRPY_0bxBpsZvECgamigESNKx6_-meNW5-6_N4kww@mail.gmail.com> (raw)
In-Reply-To: <dd146bac-4e8a-4119-2d2b-ce6bf2daf7ce@gmail.com>

On Fri, Jan 31, 2020 at 1:12 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 1/31/20 7:10 AM, Neal Cardwell wrote:
> > On Fri, Jan 31, 2020 at 7:25 AM <sjpark@amazon.com> wrote:
> >>
> >> From: SeongJae Park <sjpark@amazon.de>
> >>
> >> When closing a connection, the two acks that required to change closing
> >> socket's status to FIN_WAIT_2 and then TIME_WAIT could be processed in
> >> reverse order.  This is possible in RSS disabled environments such as a
> >> connection inside a host.
> >>
> >> For example, expected state transitions and required packets for the
> >> disconnection will be similar to below flow.
> >>
> >>          00 (Process A)                         (Process B)
> >>          01 ESTABLISHED                         ESTABLISHED
> >>          02 close()
> >>          03 FIN_WAIT_1
> >>          04             ---FIN-->
> >>          05                                     CLOSE_WAIT
> >>          06             <--ACK---
> >>          07 FIN_WAIT_2
> >>          08             <--FIN/ACK---
> >>          09 TIME_WAIT
> >>          10             ---ACK-->
> >>          11                                     LAST_ACK
> >>          12 CLOSED                              CLOSED
> >
> > AFAICT this sequence is not quite what would happen, and that it would
> > be different starting in line 8, and would unfold as follows:
> >
> >           08                                     close()
> >           09                                     LAST_ACK
> >           10             <--FIN/ACK---
> >           11 TIME_WAIT
> >           12             ---ACK-->
> >           13 CLOSED                              CLOSED
> >
> >
> >> The acks in lines 6 and 8 are the acks.  If the line 8 packet is
> >> processed before the line 6 packet, it will be just ignored as it is not
> >> a expected packet,
> >
> > AFAICT that is where the bug starts.
> >
> > AFAICT, from first principles, when process A receives the FIN/ACK it
> > should move to TIME_WAIT even if it has not received a preceding ACK.
> > That's because ACKs are cumulative. So receiving a later cumulative
> > ACK conveys all the information in the previous ACKs.
> >
> > Also, consider the de facto standard state transition diagram from
> > "TCP/IP Illustrated, Volume 2: The Implementation", by Wright and
> > Stevens, e.g.:
> >
> >   https://courses.cs.washington.edu/courses/cse461/19sp/lectures/TCPIP_State_Transition_Diagram.pdf
> >
> > This first-principles analysis agrees with the Wright/Stevens diagram,
> > which says that a connection in FIN_WAIT_1 that receives a FIN/ACK
> > should move to TIME_WAIT.
> >
> > This seems like a faster and more robust solution than installing
> > special timers.
> >
> > Thoughts?
>
>
> This is orthogonal I think.
>
> No matter how hard we fix the other side, we should improve the active side.
>
> Since we send a RST, sending the SYN a few ms after the RST seems way better
> than waiting 1 second as if we received no packet at all.
>
> Receiving this ACK tells us something about networking health, no need
> to be very cautious about the next attempt.

Yes, all good points. Thanks!

> Of course, if you have a fix for the passive side, that would be nice to review !

I looked into fixing this, but my quick reading of the Linux
tcp_rcv_state_process() code is that it should behave correctly and
that a connection in FIN_WAIT_1 that receives a FIN/ACK should move to
TIME_WAIT.

SeongJae, do you happen to have a tcpdump trace of the problematic
sequence where the "process A" ends up in FIN_WAIT_2 when it should be
in TIME_WAIT?

If I have time I will try to construct a packetdrill case to verify
the behavior in this case.

thanks,
neal

>
>
>

  reply	other threads:[~2020-01-31 22:11 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-31 12:24 [PATCH 0/3] Fix reconnection latency caused by FIN/ACK handling race sjpark
2020-01-31 12:24 ` [PATCH 1/3] net/ipv4/inet_timewait_sock: Fix inconsistent comments sjpark
2020-01-31 14:54   ` Eric Dumazet
2020-01-31 15:09     ` sjpark
2020-01-31 12:24 ` [PATCH 2/3] tcp: Reduce SYN resend delay if a suspicous ACK is received sjpark
2020-01-31 15:01   ` Eric Dumazet
2020-01-31 16:12     ` sjpark
2020-01-31 16:55       ` Eric Dumazet
2020-01-31 17:05         ` sjpark
2020-01-31 17:08           ` Eric Dumazet
2020-01-31 15:10   ` Neal Cardwell
2020-01-31 18:12     ` Eric Dumazet
2020-01-31 22:11       ` Neal Cardwell [this message]
2020-01-31 22:17         ` SeongJae Park
2020-02-01  3:55           ` Neal Cardwell
2020-02-01  6:08             ` SeongJae Park
2020-02-01 13:30               ` Neal Cardwell
2020-01-31 22:53         ` Eric Dumazet
2020-02-03 15:40           ` David Laight
2020-02-03 15:54             ` Eric Dumazet
2020-01-31 12:24 ` [PATCH 3/3] selftests: net: Add FIN_ACK processing order related latency spike test sjpark
2020-01-31 14:56   ` Eric Dumazet
2020-01-31 15:13     ` sjpark
2020-01-31 14:00 ` [PATCH 0/3] Fix reconnection latency caused by FIN/ACK handling race David Laight
2020-01-31 15:05   ` sjpark

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CADVnQy=Z0YRPY_0bxBpsZvECgamigESNKx6_-meNW5-6_N4kww@mail.gmail.com' \
    --to=ncardwell@google.com \
    --cc=aams@amazon.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=shuah@kernel.org \
    --cc=sj38.park@gmail.com \
    --cc=sjpark@amazon.com \
    --cc=sjpark@amazon.de \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.