* linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
From: Jaco @ 2022-03-30  0:56 UTC
  To: LKML

[-- Attachment #1: Type: text/plain, Size: 9844 bytes --]

Dear All,

I'm seeing very strange TCP behaviour.  I disabled TCP Segmentation Offload to
try to pinpoint this more closely.

It seems the kernel is ignoring ACKs coming from the remote side in some cases.
So far this affects one of four hosts, and seemingly only traffic between that
host and Google.  (We have two emails to Google stuck on another host due to
the same issue, but several hundred others went out from that same host today.)
I also disabled selective ACKs as a test, as these are known to sometimes cause
issues for firewalls and "TCP accelerators" (or used to, at the very least).

This is an SMTP connection between ourselves and Google.  I'm going to be
selective in copying from tcpdump; the full conversation, up to the point where
I killed it because it had plainly got stuck in a loop, is attached.

Connection setup:

00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo  cookie f025dd84b6122510,nop,nop], length 0

00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0

00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp

00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0

This is pretty normal: we advertise an MSS of 8940 and the return is 1440, thus
we shouldn't send segments larger than that, and supposedly neither can they.
I still need to determine whether the larger frames below are the result of
some form of offloading or whether they really are sending >1500 byte frames.
I know those won't pass our firewalls without fragmentation, so it is probably
some form of NIC offloading (which, if it was active on the older 5.8 kernels,
did not cause problems there):

00:56:17.709905 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465979:726468395, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP

00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726468395:726470811, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP

These are the only two frames I can find that supposedly exceed the MSS values
(although they don't exceed our value).
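
As a rough check of the numbers above: every data segment in this capture carries 12 bytes of TCP options (two NOPs plus the 10-byte timestamp option), so an MSS of 1440 leaves 1428 bytes of payload, which matches the 1428-byte segments we send. The sketch below does that arithmetic and flags captured lengths above the limit. The helper names (`payload_limit`, `oversized`) are hypothetical, and the 12-byte option overhead is an assumption taken from this capture only.

```python
import re

# Assumption from this capture: nop + nop + 10-byte TCP timestamp option.
TS_OPTION_BYTES = 12

LENGTH_RE = re.compile(r"length (\d+)")


def payload_limit(mss, option_bytes=TS_OPTION_BYTES):
    """Largest payload one segment can carry once options are taken off the MSS."""
    return mss - option_bytes


def oversized(lines, mss):
    """Return the payload lengths found in tcpdump text lines that exceed the limit."""
    limit = payload_limit(mss)
    found = []
    for line in lines:
        match = LENGTH_RE.search(line)
        if match and int(match.group(1)) > limit:
            found.append(int(match.group(1)))
    return found


# Two lines from the capture, trimmed to the part after the addresses.
sample = [
    "Flags [P.], seq 726465979:726468395, ack 956634111, win 261, "
    "options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP",
    "Flags [.], seq 956634348:956635776, ack 726471919, win 446, "
    "options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP",
]

print(payload_limit(1440))      # payload room under the MSS Google advertised
print(oversized(sample, 1440))  # lengths above that limit
print(oversized(sample, 8940))  # lengths above the limit for our own MSS
```

Measured against 1440 the 2416-byte frame is flagged; measured against our advertised 8940 nothing is, which is the distinction drawn above.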

Then everything goes pretty normal for a bit.  The last data we receive from
the remote side before stuff goes wrong:

00:56:18.088725 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471823:726471919, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 96: SMTP

We ACK immediately along with the next segment:

00:56:18.088969 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956634348:956635776, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP

Hereafter there is a flurry of data that we transmit, all nicely acknowledged,
no retransmits that I can pick up (eyeballs).

Before a long sequence of TX data we get this ACK:

00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956700036, win 774, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0

We then continue to TX a sequence of:

00:56:18.576300 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956745732:956747160, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP

up to:

00:56:18.577031 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956778576:956780004, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP

Before we hit our first retransmit:

00:56:18.960078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687707386 ecr 3477430577], length 1428: SMTP

Since 956700036 is the last ACKed data, this seems correct.  I'm not sure what
timer this is based on, though: the ACK for the data just before it came in
~384ms earlier.  (It could be based on the normal time to ACK, I don't know;
this is about double the usual round-trip time currently.)
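
The ~384ms figure and the comparison with the round-trip time can be reproduced from the timestamps quoted above. This is only a sketch: it uses the SYN to SYN/ACK gap as a stand-in for the round-trip time, which is an assumption, and it says nothing about which kernel timer actually fired. `seconds_between` is a hypothetical helper name.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"  # tcpdump's default timestamp format, same-day assumed


def seconds_between(start, end):
    """Elapsed seconds between two tcpdump timestamps on the same day."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# SYN sent -> SYN/ACK received (used here as a rough RTT estimate).
rtt = seconds_between("00:56:17.055481", "00:56:17.217747")

# ACK 956700036 received -> first retransmit of seq 956700036 sent.
wait = seconds_between("00:56:18.576247", "00:56:18.960078")

print(f"handshake RTT   : {rtt * 1000:.1f} ms")
print(f"retransmit wait : {wait * 1000:.1f} ms")
print(f"ratio wait / RTT: {wait / rtt:.2f}")
```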

And then we receive this ACK (we can see that this time the kernel waited for
the ACK of this single segment):

00:56:19.126678 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956701464, win 785, options [nop,nop,TS val 3477431127 ecr 3687707386], length 0

Then we do something (in my opinion) strange by jumping back to the tail of the previous burst:

00:56:19.126735 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956780004:956781432, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP

00:56:19.126751 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956781432:956782860, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP

We then jump back and retransmit again from the just received ACK:

00:56:19.510078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687707936 ecr 3477431127], length 1428: SMTP

We then continue from there as I'd expect (slow restart); this goes pretty
normally up to:

00:56:19.997088 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956708604, win 841, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0

00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP

00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0

Up to here everything is fine.  Now things get bizarre: we just jump to a
different sequence number, one which has already been ACKed:

00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq *956707176*:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP

00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0

The remote side re-ACKs the 956710032 value, which frankly indicates that we
need to realize that the data we are transmitting has already been received,
and that we can continue by transmitting the segments from sequence number
956710032 onwards.  Instead we choose to get stuck in this sequence:

00:56:21.180080 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687709606 ecr 3477431998], length 1428: SMTP

00:56:21.342347 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477433343 ecr 3687708423], length 0

00:56:22.780101 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687711206 ecr 3477431998], length 1428: SMTP

00:56:22.942346 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477434943 ecr 3687708423], length 0

And here the connection dies.  It eventually times out, and we retry to the
next host, resulting in the same problem.

I am aware that Google is having congestion issues in the JHB area in SA
currently, and there are probably packet delays and losses somewhere along the
line between us, but this really should not stall as dead as it is here.

Looking at only the incoming ACK values, I can see they never decrease, and
we've never received an ACK > 956710032, but this is still greater than the
sequence number we are retransmitting.
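
The ordering claim can be checked mechanically. The sketch below uses only the incoming ACK values quoted in this mail, typed in by hand in capture order; it is not the full list from the attachment, and it ignores 32-bit sequence wraparound, which does not occur in this range. `never_decreases` is a hypothetical helper name.

```python
# Incoming ACK values quoted in the excerpts above, in capture order.
acks = [
    956700036,  # 00:56:18.576247
    956701464,  # 00:56:19.126678
    956708604,  # 00:56:19.997088
    956710032,  # 00:56:20.262683
    956710032,  # 00:56:20.542356 (duplicate)
    956710032,  # 00:56:21.342347 (duplicate)
    956710032,  # 00:56:22.942346 (duplicate)
]


def never_decreases(values):
    """True when each value is greater than or equal to the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


retransmitted_start = 956707176  # start of the segment we keep resending

print(never_decreases(acks))             # ACKs never go backwards
print(max(acks))                         # highest ACK seen
print(max(acks) > retransmitted_start)   # highest ACK is past the resent segment
```

The repeated 956710032 entries mean the values are not strictly increasing, only non-decreasing, which is all the argument needs.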

The first time we transmitted the frame at sequence number 956707176 was part
of the longest sequence of TX frames without a returning ACK, part of this
sequence:

...

00:56:18.414299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP

00:56:18.414302 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP

00:56:18.414316 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP

...

Google here is ACKing not only the frame we are continuously retransmitting,
but also the frame directly after ... so why would the kernel not move on to
retransmitting starting from sequence number 956710032 (which is larger than
the start sequence number of the frame we are retransmitting)?
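
To find this pattern elsewhere in a capture, one option is to walk the tcpdump text in order, track the highest ACK received from the remote side, and flag any outgoing segment that ends at or below it. The sketch below is an illustration only: the function name `stale_retransmits` is hypothetical, it assumes the default tcpdump text format with absolute sequence numbers as in this mail, and it ignores sequence number wraparound. The sample lines are taken from the excerpts above with the options field dropped for brevity.

```python
import re

SEQ_RE = re.compile(r"seq (\d+):(\d+)")
ACK_RE = re.compile(r", ack (\d+)")


def stale_retransmits(lines, local, remote):
    """Return (time, start, end, highest_ack) for local segments already ACKed."""
    highest_ack = None
    stale = []
    for line in lines:
        timestamp = line.split(" ", 1)[0]
        source = line.split(" > ", 1)[0]
        if source.endswith(remote):
            match = ACK_RE.search(line)
            if match:
                ack = int(match.group(1))
                highest_ack = ack if highest_ack is None else max(highest_ack, ack)
        elif source.endswith(local):
            match = SEQ_RE.search(line)
            if match and highest_ack is not None:
                start, end = int(match.group(1)), int(match.group(2))
                if end <= highest_ack:  # whole segment is below the cumulative ACK
                    stale.append((timestamp, start, end, highest_ack))
    return stale


LOCAL = "2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110"
REMOTE = "2a00:1450:400c:c07::1b.25"

capture = [
    f"00:56:19.997088 IP6 {REMOTE} > {LOCAL}: Flags [.], ack 956708604, win 841, length 0",
    f"00:56:19.997148 IP6 {LOCAL} > {REMOTE}: Flags [.], seq 956708604:956710032, ack 726471919, win 446, length 1428: SMTP",
    f"00:56:20.262683 IP6 {REMOTE} > {LOCAL}: Flags [.], ack 956710032, win 852, length 0",
    f"00:56:20.380076 IP6 {LOCAL} > {REMOTE}: Flags [.], seq 956707176:956708604, ack 726471919, win 446, length 1428: SMTP",
    f"00:56:20.542356 IP6 {REMOTE} > {LOCAL}: Flags [.], ack 956710032, win 852, length 0",
    f"00:56:21.180080 IP6 {LOCAL} > {REMOTE}: Flags [.], seq 956707176:956708604, ack 726471919, win 446, length 1428: SMTP",
]

for entry in stale_retransmits(capture, LOCAL, REMOTE):
    print(entry)
```

On this excerpt the two sends of 956707176:956708604 are flagged, while the earlier send of 956708604:956710032 is not, because it ends above the highest ACK seen at that point.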

Kind Regards,
Jaco

[-- Attachment #2: iewc_google.txt --]
[-- Type: text/plain, Size: 38985 bytes --]

00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo  cookie f025dd84b6122510,nop,nop], length 0
00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0
00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp
00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0
00:56:17.218754 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633780:956633803, ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 23: SMTP: EHLO uriel.iewc.co.za
00:56:17.380021 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956633803, win 256, options [nop,nop,TS val 3477429381 ecr 3687705645], length 0
00:56:17.383685 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465760:726465949, ack 956633803, win 256, options [nop,nop,TS val 3477429384 ecr 3687705645], length 189: SMTP: 250-mx.google.com at your service, [2c0f:f720:0:3:d6ae:52ff:feb8:f27b]
00:56:17.383714 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465949, win 488, options [nop,nop,TS val 3687705810 ecr 3477429384], length 0
00:56:17.383934 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633803:956633813, ack 726465949, win 488, options [nop,nop,TS val 3687705810 ecr 3477429384], length 10: SMTP: STARTTLS
00:56:17.546391 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465949:726465979, ack 956633813, win 256, options [nop,nop,TS val 3477429547 ecr 3687705810], length 30: SMTP: 220 2.0.0 Ready to start TLS
00:56:17.546430 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465979, win 488, options [nop,nop,TS val 3687705973 ecr 3477429547], length 0
00:56:17.547288 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633813:956634111, ack 726465979, win 488, options [nop,nop,TS val 3687705973 ecr 3477429547], length 298: SMTP
00:56:17.709905 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465979:726468395, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP
00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726468395:726470811, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP
00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726470811:726470898, ack 956634111, win 261, options [nop,nop,TS val 3477429711 ecr 3687705973], length 87: SMTP
00:56:17.709949 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726468395, win 470, options [nop,nop,TS val 3687706136 ecr 3477429710], length 0
00:56:17.709964 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726470811, win 452, options [nop,nop,TS val 3687706136 ecr 3477429710], length 0
00:56:17.709978 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726470898, win 452, options [nop,nop,TS val 3687706136 ecr 3477429711], length 0
00:56:17.712221 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634111:956634191, ack 726470898, win 452, options [nop,nop,TS val 3687706138 ecr 3477429711], length 80: SMTP
00:56:17.712353 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634191:956634236, ack 726470898, win 452, options [nop,nop,TS val 3687706138 ecr 3477429711], length 45: SMTP
00:56:17.874774 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956634236, win 261, options [nop,nop,TS val 3477429875 ecr 3687706138], length 0
00:56:17.875264 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726470898:726471633, ack 956634236, win 261, options [nop,nop,TS val 3477429876 ecr 3687706138], length 735: SMTP
00:56:17.904288 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634236:956634348, ack 726471633, win 447, options [nop,nop,TS val 3687706330 ecr 3477429876], length 112: SMTP
00:56:18.066936 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471633:726471728, ack 956634348, win 261, options [nop,nop,TS val 3477430067 ecr 3687706330], length 95: SMTP
00:56:18.088465 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471728:726471823, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 95: SMTP
00:56:18.088603 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726471823, win 446, options [nop,nop,TS val 3687706515 ecr 3477430067], length 0
00:56:18.088725 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471823:726471919, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 96: SMTP
00:56:18.088969 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956634348:956635776, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.088973 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956635776:956637204, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.088988 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956637204:956638632, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.088990 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956638632:956640060, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089099 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956640060:956641488, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089103 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956641488:956642916, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089117 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956642916:956644344, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089119 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956644344:956645772, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089226 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956645772:956647200, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089229 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956647200:956648628, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.251223 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956637204, win 283, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0
00:56:18.251223 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956640060, win 305, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0
00:56:18.251224 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956642916, win 328, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0
00:56:18.251224 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956645772, win 350, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0
00:56:18.251295 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956648628:956650056, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP
00:56:18.251299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956650056:956651484, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP
00:56:18.251314 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956651484:956652912, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP
00:56:18.251317 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956652912:956654340, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP
00:56:18.251378 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956648628, win 372, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0
00:56:18.251430 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956654340:956655768, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251435 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956655768:956657196, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251455 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956657196:956658624, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251458 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956658624:956660052, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251563 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956660052:956661480, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251566 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956661480:956662908, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251583 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956662908:956664336, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251585 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956664336:956665764, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251694 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956665764:956667192, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251697 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956667192:956668620, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251713 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956668620:956670048, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251716 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956670048:956671476, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251826 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956671476:956672904, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251829 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956672904:956674332, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251841 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956674332:956675760, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251844 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956675760:956677188, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956651484, win 395, options [nop,nop,TS val 3477430414 ecr 3687706677], length 0
00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956654340, win 417, options [nop,nop,TS val 3477430414 ecr 3687706677], length 0
00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956657196, win 439, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956660052, win 461, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956662908, win 484, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956665764, win 506, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413635 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956677188:956678616, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413639 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956678616:956680044, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413655 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956680044:956681472, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413657 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956681472:956682900, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413774 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956682900:956684328, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413780 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956684328:956685756, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413807 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956685756:956687184, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413810 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956687184:956688612, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413854 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956668620, win 528, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413854 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956671476, win 551, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413920 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956688612:956690040, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413920 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956674332, win 573, options [nop,nop,TS val 3477430415 ecr 3687706678], length 0
00:56:18.413920 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956677188, win 595, options [nop,nop,TS val 3477430415 ecr 3687706678], length 0
00:56:18.413924 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956690040:956691468, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413938 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956691468:956692896, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413940 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956692896:956694324, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.414048 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956694324:956695752, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414052 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956695752:956697180, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414065 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956697180:956698608, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414067 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956698608:956700036, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414174 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414177 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414190 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956702892:956704320, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414192 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956704320:956705748, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414302 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414316 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414318 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956710032:956711460, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414424 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956711460:956712888, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414427 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956712888:956714316, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414440 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956714316:956715744, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414442 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956715744:956717172, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414546 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956717172:956718600, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414550 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956718600:956720028, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414562 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956720028:956721456, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414565 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956721456:956722884, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414670 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956722884:956724312, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414673 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956724312:956725740, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414685 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956725740:956727168, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414687 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956727168:956728596, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414793 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956728596:956730024, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414796 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956730024:956731452, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414809 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956731452:956732880, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414811 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956732880:956734308, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956680044, win 618, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956682900, win 640, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956685756, win 662, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956688612, win 685, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956691468, win 707, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.576005 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956734308:956735736, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576010 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956735736:956737164, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576025 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956737164:956738592, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576028 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956738592:956740020, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576066 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956694324, win 729, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.576146 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956740020:956741448, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576152 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956741448:956742876, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576180 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956742876:956744304, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576184 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956744304:956745732, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956697180, win 752, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956700036, win 774, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.576300 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956745732:956747160, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576304 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956747160:956748588, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576325 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956748588:956750016, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576328 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956750016:956751444, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576441 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956751444:956752872, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576445 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956752872:956754300, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576467 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956754300:956755728, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576470 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956755728:956757156, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576582 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956757156:956758584, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576586 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956758584:956760012, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576606 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956760012:956761440, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576609 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956761440:956762868, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576722 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956762868:956764296, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576726 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956764296:956765724, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576746 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956765724:956767152, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576749 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956767152:956768580, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576863 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956768580:956770008, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576867 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956770008:956771436, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576889 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956771436:956772864, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576892 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956772864:956774292, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.577004 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956774292:956775720, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.577008 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956775720:956777148, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.577028 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956777148:956778576, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.577031 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956778576:956780004, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.960078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687707386 ecr 3477430577], length 1428: SMTP
00:56:19.126678 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956701464, win 785, options [nop,nop,TS val 3477431127 ecr 3687707386], length 0
00:56:19.126735 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956780004:956781432, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP
00:56:19.126751 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956781432:956782860, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP
00:56:19.510078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687707936 ecr 3477431127], length 1428: SMTP
00:56:19.672429 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956702892, win 796, options [nop,nop,TS val 3477431673 ecr 3687707936], length 0
00:56:19.672489 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956702892:956704320, ack 726471919, win 446, options [nop,nop,TS val 3687708099 ecr 3477431673], length 1428: SMTP
00:56:19.672494 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956704320:956705748, ack 726471919, win 446, options [nop,nop,TS val 3687708099 ecr 3477431673], length 1428: SMTP
00:56:19.834765 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956704320, win 807, options [nop,nop,TS val 3477431835 ecr 3687708099], length 0
00:56:19.834765 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956705748, win 818, options [nop,nop,TS val 3477431835 ecr 3687708099], length 0
00:56:19.834818 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687708261 ecr 3477431835], length 1428: SMTP
00:56:19.834846 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708261 ecr 3477431835], length 1428: SMTP
00:56:19.997087 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956707176, win 830, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0
00:56:19.997088 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956708604, win 841, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0
00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP
00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0
00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP
00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0
00:56:21.180080 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687709606 ecr 3477431998], length 1428: SMTP
00:56:21.342347 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477433343 ecr 3687708423], length 0
00:56:22.780101 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687711206 ecr 3477431998], length 1428: SMTP
00:56:22.942346 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477434943 ecr 3687708423], length 0
00:56:25.900090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687714326 ecr 3477431998], length 1428: SMTP
00:56:26.062272 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477438063 ecr 3687708423], length 0
00:56:32.620090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687721046 ecr 3477431998], length 1428: SMTP
00:56:32.782226 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477444783 ecr 3687708423], length 0
00:56:45.420093 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687733846 ecr 3477431998], length 1428: SMTP
00:56:45.581587 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477457582 ecr 3687708423], length 0
00:57:10.380083 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687758806 ecr 3477431998], length 1428: SMTP
00:57:10.542248 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477482543 ecr 3687708423], length 0
00:58:00.940090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687809366 ecr 3477431998], length 1428: SMTP
00:58:01.102342 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477533103 ecr 3687708423], length 0
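
The repeated sends of segment 956707176:956708604 that close the trace above are spaced at roughly doubling intervals. That is consistent with a retransmission timer backing off exponentially, though this is an interpretation: the trace itself does not name the timer. A minimal Python sketch makes the spacing explicit, using only the send timestamps copied from the trace. The helper name `to_seconds` is made up for illustration.

```python
# Send times of the repeated segment 956707176:956708604, copied from the trace.
SEND_TIMES = [
    "00:56:20.380076",
    "00:56:21.180080",
    "00:56:22.780101",
    "00:56:25.900090",
    "00:56:32.620090",
    "00:56:45.420093",
    "00:57:10.380083",
    "00:58:00.940090",
]


def to_seconds(stamp):
    """Convert an HH:MM:SS.ffffff tcpdump timestamp to seconds since midnight."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


times = [to_seconds(s) for s in SEND_TIMES]
# Time between each send and the one before it.
gaps = [later - earlier for earlier, later in zip(times, times[1:])]
# How much each gap grew relative to the previous gap.
ratios = [b / a for a, b in zip(gaps, gaps[1:])]

for stamp, gap in zip(SEND_TIMES[1:], gaps):
    print(f"{stamp}  +{gap:.2f}s since previous send")
```

The gaps come out at roughly 0.8, 1.6, 3.1, 6.7, 12.8, 25.0 and 50.6 seconds, so each wait is about twice the previous one.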

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30  0:56 linux 5.17.1 disregarding ACK values resulting in stalled TCP connections Jaco
@ 2022-03-30  2:01 ` Neal Cardwell
  2022-03-30  2:40   ` Eric Dumazet
  2022-03-30  2:58   ` Jaco Kroon
  0 siblings, 2 replies; 38+ messages in thread
From: Neal Cardwell @ 2022-03-30  2:01 UTC (permalink / raw)
  To: Jaco; +Cc: LKML, Netdev, Eric Dumazet, Yuchung Cheng

[-- Attachment #1: Type: text/plain, Size: 10969 bytes --]

On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote:
>
> Dear All,
>
> I'm seeing very strange TCP behaviour.  Disabled TCP Segmentation Offload to
> try and pinpoint this more closely.
>
> It seems the kernel is ignoring ACKs coming from the remote side in some cases.
> In this case, on one of four hosts, and seemingly between this one host and
> Google ... (We've had two emails to Google stuck on another host due to the same
> issue, but several hundred others passed out today on that same host).  I also
> killed selective ACKs as a test as these are known to sometimes cause issues
> for firewalls and "tcp accelerators" (or used to at the very least).
>
> SMTP connection between ourselves and Google ... I'm going to be selective in
> copying from tcpdump (full conversation up to the point where I killed it
> because it plainly got stuck in a loop is attached).
>
> Connection setup:
>
> 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo  cookie f025dd84b6122510,nop,nop], length 0
>
> 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0
>
> 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp
>
> 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0
>
> This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus
> we shouldn't send segments larger than that, and they "can't".  I need to
> determine if this is some form of offloading or they really are sending >1500
> byte frames (which I know won't pass our firewalls without fragmentation so
> probably some form of NIC offloading - which if it was active on older 5.8
> kernels did not cause problems):
>
> 00:56:17.709905 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465979:726468395, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP
>
> 00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726468395:726470811, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP
>
> These are the only two frames I can find that supposedly exceed the MSS values
> (although, they don't exceed our value).
>
> Then everything goes pretty normal for a bit.  The last data we receive from
> the remote side before stuff goes wrong:
>
> 00:56:18.088725 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471823:726471919, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 96: SMTP
>
> We ACK immediately along with the next segment:
>
> 00:56:18.088969 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956634348:956635776, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
>
> Hereafter there is a flurry of data that we transmit, all nicely acknowledged,
> no retransmits that I can pick up (eyeballs).
>
> Before a long sequence of TX data we get this ACK:
>
> 00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956700036, win 774, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
>
> We then continue to TX a sequence of:
>
> 00:56:18.576300 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956745732:956747160, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
>
> up to:
>
> 00:56:18.577031 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956778576:956780004, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
>
> Before we hit our first retransmit:
>
> 00:56:18.960078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687707386 ecr 3477430577], length 1428: SMTP
>
> Since 956700036 is the last ACKed data, this seems correct, not sure what timer
> this is based on though, the ACK for the just prior data came in ~384ms prior
> (could be based on normal time to ACK, I don't know, this is about double the
> usual round-trip-time currently).
>
> And then we receive this ACK (we can see this time the kernel waited for ACK of
> this single segment):
>
> 00:56:19.126678 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956701464, win 785, options [nop,nop,TS val 3477431127 ecr 3687707386], length 0
>
> Then we do something (in my opinion) strange by jumping back to the tail of the previous burst:
>
> 00:56:19.126735 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956780004:956781432, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP
>
> 00:56:19.126751 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956781432:956782860, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP
>
> We then jump back and retransmit again from the just received ACK:
>
> 00:56:19.510078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687707936 ecr 3477431127], length 1428: SMTP
>
> We then continue from there on as I'd expect (slow restart), this goes pretty
> normal up to:
>
> 00:56:19.997088 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956708604, win 841, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0
>
> 00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP
>
> 00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0
>
> Up to here is fine, now things get bizarre, we just jump to a different
> sequence number, which has already been ACKed:
>
> 00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq *956707176*:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP
>
> 00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0
>
> And remote side re-ACKs the 956710032 value, which frankly indicates we need to
> realize that the data we are transmitting has already been received, and we can
> continue on to transmit the segments following up on sequence number 956710032,
> instead we choose to get stuck in this sequence:
>
> 00:56:21.180080 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687709606 ecr 3477431998], length 1428: SMTP
>
> 00:56:21.342347 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477433343 ecr 3687708423], length 0
>
> 00:56:22.780101 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687711206 ecr 3477431998], length 1428: SMTP
>
> 00:56:22.942346 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477434943 ecr 3687708423], length 0
>
> And here the connection dies.  It eventually times out, and we retry to the
> next host, resulting in the same problem.
>
> I am aware that Google is having congestion issues in the JHB area in SA
> currently, and there are probably packet delays and losses somewhere along the
> line between us, but this really should not stall as dead as it is here.
>
> Looking at only the incoming ACK values, I can see they are strictly
> increasing, so we've never received an ACK > 956710032, but this is still
> greater than the value we are retransmitting.
>
> The first time we transmitted the frame at sequence number 956707176 was part
> of the longest sequence of TX frames without a returning ACK, part of this
> sequence:
>
> ...
>
> 00:56:18.414299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
>
> 00:56:18.414302 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
>
> 00:56:18.414316 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
>
> ...
>
> Google here is ACKing not only the frame we are continuously retransmitting,
> but also the frame directly after ... so why would the kernel not move on to
> retransmitting starting from sequence number 956710032 (which is larger than
> the start sequence number of the frame we are retransmitting)?
>
> Kind Regards,
> Jaco
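
The question in the report, why a segment is resent after a later cumulative ACK already covers it, can be checked mechanically against the text trace. The following is a minimal Python sketch, not part of the original report: the names `LINE_RE` and `stale_retransmits` are hypothetical, the regular expression assumes the tcpdump text layout seen in this trace, and sequence-number wraparound is ignored.

```python
import re

# Matches the tcpdump text layout used in this trace (an assumption; other
# tcpdump versions or options may print lines differently).
LINE_RE = re.compile(
    r"^(?P<ts>\S+) IP6 (?P<src>\S+) > (?P<dst>\S+): Flags \[(?P<flags>[^\]]*)\]"
    r"(?:, seq (?P<start>\d+):(?P<end>\d+))?"
    r"(?:, ack (?P<ack>\d+))?"
)


def stale_retransmits(lines, local):
    """Return sends from `local` whose data was already cumulatively ACKed.

    Sequence-number wraparound is ignored in this sketch.
    """
    highest_ack = None
    stale = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        if m["src"] == local:
            # Outgoing data segment: stale if it ends at or below the highest ACK.
            if m["start"] and highest_ack is not None and int(m["end"]) <= highest_ack:
                stale.append((m["ts"], int(m["start"]), int(m["end"]), highest_ack))
        elif m["ack"]:
            # Incoming segment: remember the highest cumulative ACK seen so far.
            ack = int(m["ack"])
            highest_ack = ack if highest_ack is None else max(highest_ack, ack)
    return stale


LOCAL = "2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110"

# Four consecutive lines copied from the trace in the report.
SAMPLE = [
    "00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP",
    "00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0",
    "00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP",
    "00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0",
]

stale = stale_retransmits(SAMPLE, LOCAL)
for ts, start, end, acked in stale:
    print(f"{ts}: resent {start}:{end} although ack {acked} was already received")
```

On these four lines the sketch flags only the 00:56:20.380076 send of 956707176:956708604, which arrives after ack 956710032 had been received.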

Thanks for the report!  I have CC-ed the netdev list, since it is
probably a better forum for this discussion.

Can you please attach (or link to) a tcpdump raw .pcap file (produced
with the -w flag)? There are a number of tools that will make this
easier to visualize and analyze if we can see the raw .pcap file. You
may want to anonymize the trace and/or capture just headers, etc (for
example, the -s flag can control how much of each packet tcpdump
grabs).

Can you please share the exact kernel version of the client machine?

Also, can you please summarize/clarify whether you think the client,
server, or both are misbehaving?

Thanks!
neal

[-- Attachment #2: iewc_google.txt --]
[-- Type: text/plain, Size: 38985 bytes --]

00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo  cookie f025dd84b6122510,nop,nop], length 0
00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0
00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp
00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0
00:56:17.218754 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633780:956633803, ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 23: SMTP: EHLO uriel.iewc.co.za
00:56:17.380021 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956633803, win 256, options [nop,nop,TS val 3477429381 ecr 3687705645], length 0
00:56:17.383685 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465760:726465949, ack 956633803, win 256, options [nop,nop,TS val 3477429384 ecr 3687705645], length 189: SMTP: 250-mx.google.com at your service, [2c0f:f720:0:3:d6ae:52ff:feb8:f27b]
00:56:17.383714 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465949, win 488, options [nop,nop,TS val 3687705810 ecr 3477429384], length 0
00:56:17.383934 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633803:956633813, ack 726465949, win 488, options [nop,nop,TS val 3687705810 ecr 3477429384], length 10: SMTP: STARTTLS
00:56:17.546391 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465949:726465979, ack 956633813, win 256, options [nop,nop,TS val 3477429547 ecr 3687705810], length 30: SMTP: 220 2.0.0 Ready to start TLS
00:56:17.546430 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465979, win 488, options [nop,nop,TS val 3687705973 ecr 3477429547], length 0
00:56:17.547288 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633813:956634111, ack 726465979, win 488, options [nop,nop,TS val 3687705973 ecr 3477429547], length 298: SMTP
00:56:17.709905 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465979:726468395, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP
00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726468395:726470811, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP
00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726470811:726470898, ack 956634111, win 261, options [nop,nop,TS val 3477429711 ecr 3687705973], length 87: SMTP
00:56:17.709949 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726468395, win 470, options [nop,nop,TS val 3687706136 ecr 3477429710], length 0
00:56:17.709964 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726470811, win 452, options [nop,nop,TS val 3687706136 ecr 3477429710], length 0
00:56:17.709978 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726470898, win 452, options [nop,nop,TS val 3687706136 ecr 3477429711], length 0
00:56:17.712221 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634111:956634191, ack 726470898, win 452, options [nop,nop,TS val 3687706138 ecr 3477429711], length 80: SMTP
00:56:17.712353 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634191:956634236, ack 726470898, win 452, options [nop,nop,TS val 3687706138 ecr 3477429711], length 45: SMTP
00:56:17.874774 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956634236, win 261, options [nop,nop,TS val 3477429875 ecr 3687706138], length 0
00:56:17.875264 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726470898:726471633, ack 956634236, win 261, options [nop,nop,TS val 3477429876 ecr 3687706138], length 735: SMTP
00:56:17.904288 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634236:956634348, ack 726471633, win 447, options [nop,nop,TS val 3687706330 ecr 3477429876], length 112: SMTP
00:56:18.066936 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471633:726471728, ack 956634348, win 261, options [nop,nop,TS val 3477430067 ecr 3687706330], length 95: SMTP
00:56:18.088465 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471728:726471823, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 95: SMTP
00:56:18.088603 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726471823, win 446, options [nop,nop,TS val 3687706515 ecr 3477430067], length 0
00:56:18.088725 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471823:726471919, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 96: SMTP
00:56:18.088969 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956634348:956635776, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.088973 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956635776:956637204, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.088988 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956637204:956638632, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.088990 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956638632:956640060, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089099 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956640060:956641488, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089103 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956641488:956642916, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089117 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956642916:956644344, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089119 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956644344:956645772, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089226 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956645772:956647200, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.089229 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956647200:956648628, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
00:56:18.251223 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956637204, win 283, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0
00:56:18.251223 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956640060, win 305, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0
00:56:18.251224 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956642916, win 328, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0
00:56:18.251224 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956645772, win 350, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0
00:56:18.251295 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956648628:956650056, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP
00:56:18.251299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956650056:956651484, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP
00:56:18.251314 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956651484:956652912, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP
00:56:18.251317 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956652912:956654340, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP
00:56:18.251378 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956648628, win 372, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0
00:56:18.251430 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956654340:956655768, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251435 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956655768:956657196, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251455 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956657196:956658624, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251458 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956658624:956660052, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251563 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956660052:956661480, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251566 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956661480:956662908, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251583 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956662908:956664336, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251585 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956664336:956665764, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251694 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956665764:956667192, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251697 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956667192:956668620, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251713 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956668620:956670048, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251716 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956670048:956671476, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251826 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956671476:956672904, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251829 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956672904:956674332, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251841 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956674332:956675760, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.251844 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956675760:956677188, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP
00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956651484, win 395, options [nop,nop,TS val 3477430414 ecr 3687706677], length 0
00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956654340, win 417, options [nop,nop,TS val 3477430414 ecr 3687706677], length 0
00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956657196, win 439, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956660052, win 461, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956662908, win 484, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956665764, win 506, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413635 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956677188:956678616, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413639 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956678616:956680044, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413655 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956680044:956681472, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413657 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956681472:956682900, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413774 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956682900:956684328, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413780 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956684328:956685756, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413807 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956685756:956687184, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413810 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956687184:956688612, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413854 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956668620, win 528, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413854 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956671476, win 551, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0
00:56:18.413920 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956688612:956690040, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413920 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956674332, win 573, options [nop,nop,TS val 3477430415 ecr 3687706678], length 0
00:56:18.413920 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956677188, win 595, options [nop,nop,TS val 3477430415 ecr 3687706678], length 0
00:56:18.413924 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956690040:956691468, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413938 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956691468:956692896, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.413940 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956692896:956694324, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP
00:56:18.414048 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956694324:956695752, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414052 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956695752:956697180, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414065 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956697180:956698608, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414067 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956698608:956700036, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414174 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414177 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414190 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956702892:956704320, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414192 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956704320:956705748, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414302 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414316 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414318 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956710032:956711460, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414424 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956711460:956712888, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414427 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956712888:956714316, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414440 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956714316:956715744, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414442 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956715744:956717172, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414546 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956717172:956718600, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414550 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956718600:956720028, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414562 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956720028:956721456, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414565 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956721456:956722884, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414670 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956722884:956724312, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414673 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956724312:956725740, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414685 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956725740:956727168, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414687 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956727168:956728596, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414793 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956728596:956730024, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414796 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956730024:956731452, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414809 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956731452:956732880, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.414811 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956732880:956734308, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP
00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956680044, win 618, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956682900, win 640, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956685756, win 662, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956688612, win 685, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956691468, win 707, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.576005 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956734308:956735736, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576010 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956735736:956737164, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576025 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956737164:956738592, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576028 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956738592:956740020, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576066 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956694324, win 729, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.576146 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956740020:956741448, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576152 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956741448:956742876, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576180 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956742876:956744304, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576184 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956744304:956745732, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956697180, win 752, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956700036, win 774, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
00:56:18.576300 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956745732:956747160, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576304 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956747160:956748588, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576325 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956748588:956750016, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576328 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956750016:956751444, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
00:56:18.576441 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956751444:956752872, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576445 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956752872:956754300, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576467 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956754300:956755728, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576470 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956755728:956757156, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576582 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956757156:956758584, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576586 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956758584:956760012, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576606 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956760012:956761440, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576609 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956761440:956762868, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576722 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956762868:956764296, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576726 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956764296:956765724, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576746 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956765724:956767152, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576749 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956767152:956768580, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576863 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956768580:956770008, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576867 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956770008:956771436, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576889 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956771436:956772864, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.576892 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956772864:956774292, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.577004 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956774292:956775720, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.577008 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956775720:956777148, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.577028 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956777148:956778576, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.577031 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956778576:956780004, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
00:56:18.960078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687707386 ecr 3477430577], length 1428: SMTP
00:56:19.126678 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956701464, win 785, options [nop,nop,TS val 3477431127 ecr 3687707386], length 0
00:56:19.126735 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956780004:956781432, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP
00:56:19.126751 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956781432:956782860, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP
00:56:19.510078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687707936 ecr 3477431127], length 1428: SMTP
00:56:19.672429 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956702892, win 796, options [nop,nop,TS val 3477431673 ecr 3687707936], length 0
00:56:19.672489 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956702892:956704320, ack 726471919, win 446, options [nop,nop,TS val 3687708099 ecr 3477431673], length 1428: SMTP
00:56:19.672494 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956704320:956705748, ack 726471919, win 446, options [nop,nop,TS val 3687708099 ecr 3477431673], length 1428: SMTP
00:56:19.834765 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956704320, win 807, options [nop,nop,TS val 3477431835 ecr 3687708099], length 0
00:56:19.834765 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956705748, win 818, options [nop,nop,TS val 3477431835 ecr 3687708099], length 0
00:56:19.834818 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687708261 ecr 3477431835], length 1428: SMTP
00:56:19.834846 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708261 ecr 3477431835], length 1428: SMTP
00:56:19.997087 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956707176, win 830, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0
00:56:19.997088 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956708604, win 841, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0
00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP
00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0
00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP
00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0
00:56:21.180080 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687709606 ecr 3477431998], length 1428: SMTP
00:56:21.342347 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477433343 ecr 3687708423], length 0
00:56:22.780101 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687711206 ecr 3477431998], length 1428: SMTP
00:56:22.942346 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477434943 ecr 3687708423], length 0
00:56:25.900090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687714326 ecr 3477431998], length 1428: SMTP
00:56:26.062272 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477438063 ecr 3687708423], length 0
00:56:32.620090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687721046 ecr 3477431998], length 1428: SMTP
00:56:32.782226 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477444783 ecr 3687708423], length 0
00:56:45.420093 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687733846 ecr 3477431998], length 1428: SMTP
00:56:45.581587 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477457582 ecr 3687708423], length 0
00:57:10.380083 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687758806 ecr 3477431998], length 1428: SMTP
00:57:10.542248 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477482543 ecr 3687708423], length 0
00:58:00.940090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687809366 ecr 3477431998], length 1428: SMTP
00:58:01.102342 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477533103 ecr 3687708423], length 0
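
The spacing of the repeated sends of seq 956707176:956708604 above looks consistent with exponential backoff of the retransmission timer. The following is a minimal sketch that checks this, using only the send timestamps copied from the trace; the helper name gaps_seconds is made up for illustration, and the code assumes the timestamps do not cross midnight.

```python
from datetime import datetime

# Send times of the repeated segment seq 956707176:956708604,
# copied from the trace above.
RETRANSMIT_TIMES = [
    "00:56:20.380076",
    "00:56:21.180080",
    "00:56:22.780101",
    "00:56:25.900090",
    "00:56:32.620090",
    "00:56:45.420093",
    "00:57:10.380083",
    "00:58:00.940090",
]


def gaps_seconds(stamps):
    """Return the gaps, in seconds, between consecutive HH:MM:SS.ffffff stamps."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


gaps = gaps_seconds(RETRANSMIT_TIMES)
# Ratio of each gap to the one before it; about 2 means the timer doubled.
ratios = [later / earlier for earlier, later in zip(gaps, gaps[1:])]

for gap in gaps:
    print(f"{gap:7.3f} s")
```

The gaps come out at roughly 0.8, 1.6, 3.1, 6.7, 12.8, 25.0 and 50.6 seconds, so each wait is about twice the previous one. That suggests the sender never treats the returning ACKs as progress and simply keeps backing off on the same segment.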


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30  2:01 ` Neal Cardwell
@ 2022-03-30  2:40   ` Eric Dumazet
  2022-03-30  2:58   ` Jaco Kroon
  1 sibling, 0 replies; 38+ messages in thread
From: Eric Dumazet @ 2022-03-30  2:40 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: Jaco, LKML, Netdev, Yuchung Cheng

On Tue, Mar 29, 2022 at 7:01 PM Neal Cardwell <ncardwell@google.com> wrote:
>
> On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote:
> >
> > Dear All,
> >
> > I'm seeing very strange TCP behaviour.  Disabled TCP Segmentation Offload to
> > try and pinpoint this more closely.
> >
> > It seems the kernel is ignoring ACKs coming from the remote side in some cases.
> > In this case, on one of four hosts, and seemingly between this one host and
> > Google ... (We have two emails to Google stuck on another host due to the same
> > issue, but several hundred others passed out today on that same host).  I also
> > killed selective ACKs as a test as these are known to sometimes cause issues
> > for firewalls and "tcp accelerators" (or used to at the very least).
> >
> > SMTP connection between ourselves and Google ... I'm going to be selective in
> > copying from tcpdump (full conversation up to the point where I killed it
> > because it plainly got stuck in a loop is attached).
> >
> > Connection setup:
> >
> > 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo  cookie f025dd84b6122510,nop,nop], length 0
> >
> > 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0
> >
> > 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp
> >
> > 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0
> >
> > This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus
> > we shouldn't send segments larger than that, and they "can't".  I need to
> > determine if this is some form of offloading or they really are sending >1500
> > byte frames (which I know won't pass our firewalls without fragmentation so
> > probably some form of NIC offloading - which if it was active on older 5.8
> > kernels did not cause problems):
> >
> > 00:56:17.709905 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465979:726468395, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP
> >
> > 00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726468395:726470811, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP
> >
> > These are the only two frames I can find that supposedly exceeds the MSS values
> > (although, they don't exceed our value).
> >
> > Then everything goes pretty normal for a bit.  The last data we receive from
> > the remote side before stuff goes wrong:
> >
> > 00:56:18.088725 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471823:726471919, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 96: SMTP
> >
> > We ACK immediately along with the next segment:
> >
> > 00:56:18.088969 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956634348:956635776, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP
> >
> > Hereafter there is a flurry of data that we transmit, all nicely acknowledged,
> > no retransmits that I can pick up (eyeballs).
> >
> > Before a long sequence of TX data we get this ACK:
> >
> > 00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956700036, win 774, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0
> >
> > We then continue to RX a sequence of:
> >
> > 00:56:18.576300 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956745732:956747160, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP
> >
> > up to:
> >
> > 00:56:18.577031 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956778576:956780004, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP
> >
> > Before we hit our first retransmit:
> >
> > 00:56:18.960078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687707386 ecr 3477430577], length 1428: SMTP
> >
> > Since 956700036 is the last ACKed data, this seems correct, not sure what timer
> > this is based on though, the ACK for the just prior data came in ~384ms prior
> > (could be based on normal time to ACK, I don't know, this is about double the
> > usual round-trip-time currently).
> >
> > And then we receive this ACK (we can see this time the kernel waited for ACK of
> > this single segment):
> >
> > 00:56:19.126678 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956701464, win 785, options [nop,nop,TS val 3477431127 ecr 3687707386], length 0
> >
> > Then we do something (in my opinion) strange by jumping back to the tail of the previous burst:
> >
> > 00:56:19.126735 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956780004:956781432, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP
> >
> > 00:56:19.126751 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956781432:956782860, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP
> >
> > We then jump back and retransmit again from the just received ACK:
> >
> > 00:56:19.510078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687707936 ecr 3477431127], length 1428: SMTP
> >
> > We then continue from there on as I'd expect (slow restart), this goes pretty
> > normal up to:
> >
> > 00:56:19.997088 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956708604, win 841, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0
> >
> > 00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP
> >
> > 00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0
> >
> > Up to here is fine, now things gets bizarre, we just jump to a different
> > sequence number, which has already been ACKed:
> >
> > 00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq *956707176*:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP
> >
> > 00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0
> >
> > And remote side re-ACKs the 956710032 value, which frankly indicates we need to
> > realize that the data we are transmitting has already been received, and we can
> > continue on to transmit the segments following up on sequence number 956710032,
> > instead we choose to get stuck in this sequence:
> >
> > 00:56:21.180080 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687709606 ecr 3477431998], length 1428: SMTP
> >
> > 00:56:21.342347 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477433343 ecr 3687708423], length 0
> >
> > 00:56:22.780101 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687711206 ecr 3477431998], length 1428: SMTP
> >
> > 00:56:22.942346 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477434943 ecr 3687708423], length 0
> >
> > And here the connection dies.  It eventually times out, and we retry to the
> > next host, resulting in the same problem.
> >
> > I am aware that Google is having congestion issues in the JHB area in SA
> > currently, and there are probably packet delays and losses somewhere along the
> > line between us, but this really should not stall as dead as it is here.
> >
> > Looking at only the incoming ACK values, I can see they are strictly
> > increasing, so we've never received an ACK > 956710032, but this is still
> > greater than the value we are retransmitting.
> >

It could be that ACK packets have a wrong checksum after some point
is reached (some bug in a firewall/middlebox).

"tcpdump -v" will tell you something about checksum errors.
And/or "nstat -az | grep TcpInCsumError"

Also, packets could be dropped in a layer like netfilter.
Make sure you do not have a rule rate limiting flows, or something like that.

> > The first time we transmitted the frame at sequence number 956707176 was part
> > of the longest sequence of TX frames without a returning ACK, part of this
> > sequence:
> >
> > ...
> >
> > 00:56:18.414299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
> >
> > 00:56:18.414302 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
> >
> > 00:56:18.414316 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
> >
> > ...
> >
> > Google here is ACKing not only the frame we are continuously retransmitting,
> > but also the frame directly after ... so why would the kernel not move on to
> > retransmitting starting from sequence number 956710032 (which is larger than
> > the start sequence number of the frame we are retransmitting)?
> >
> > Kind Regards,
> > Jaco
>
> Thanks for the report!  I have CC-ed the netdev list, since it is
> probably a better forum for this discussion.
>
> Can you please attach (or link to) a tcpdump raw .pcap file  (produced
> with the -w flag)? There are a number of tools that will make this
> easier to visualize and analyze if we can see the raw .pcap file. You
> may want to anonymize the trace and/or capture just headers, etc (for
> example, the -s flag can control how much of each packet tcpdump
> grabs).
>
> Can you please share the exact kernel version of the client machine?
>
> Also, can you please summarize/clarify whether you think the client,
> server, or both are misbehaving?
>
> Thanks!
> neal


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30  2:01 ` Neal Cardwell
  2022-03-30  2:40   ` Eric Dumazet
@ 2022-03-30  2:58   ` Jaco Kroon
  2022-03-30  3:48     ` Eric Dumazet
  1 sibling, 1 reply; 38+ messages in thread
From: Jaco Kroon @ 2022-03-30  2:58 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: LKML, Netdev, Eric Dumazet, Yuchung Cheng

[-- Attachment #1: Type: text/plain, Size: 2712 bytes --]

Hi Neal,

> Thanks for the report!  I have CC-ed the netdev list, since it is
> probably a better forum for this discussion.
Awesome thank you.
>
> Can you please attach (or link to) a tcpdump raw .pcap file  (produced
> with the -w flag)? There are a number of tools that will make this
> easier to visualize and analyze if we can see the raw .pcap file. You
> may want to anonymize the trace and/or capture just headers, etc (for
> example, the -s flag can control how much of each packet tcpdump
> grabs).

Attached.

The traffic itself should be mostly encrypted but stripped with -s100
anyway.  At this point SACK was still on.

I don't know how, or why, but this relates to TFO.  After sending the
report I followed a hunch and compared the exim logs of a successful
delivery with those of an unsuccessful one; the only difference was that
the non-working connection logged:

TFO mode sendto, no data: EINPROGRESS

and then specifically:

TCP_FASTOPEN tcpi_unacked 2

The working connections never had the latter line in the output.

The moment I set sysctl -w net.ipv4.tcp_fastopen=0 (default is 1) I
managed to flood out about 1200 emails to Google in no more than 15
minutes.
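
For readers unfamiliar with the knob: net.ipv4.tcp_fastopen is a bitmask, not a simple on/off switch. Below is a small sketch that decodes it. The bit meanings are as I understand them from the kernel's ip-sysctl documentation and should be checked against the kernel version in use; the function names are made up for this example.

```python
# Bit meanings as I understand them from the kernel's ip-sysctl
# documentation (an assumption here; verify for your kernel version).
TFO_FLAGS = {
    0x1: "client: send data in the opening SYN",
    0x2: "server: accept Fast Open on listeners that request it",
    0x4: "client: send data in SYN regardless of cookie availability",
    0x200: "server: accept data-in-SYN without a cookie option",
    0x400: "server: enable Fast Open on all listeners by default",
}


def decode_tcp_fastopen(value):
    """Return descriptions of the bits set in a tcp_fastopen sysctl value."""
    return [desc for bit, desc in TFO_FLAGS.items() if value & bit]


def read_tcp_fastopen(path="/proc/sys/net/ipv4/tcp_fastopen"):
    """Return the current sysctl value, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


current = read_tcp_fastopen()
if current is not None:
    print(current, decode_tcp_fastopen(current))
```

With the default of 1 only the client bit is set, and 0 clears every bit, which is what the workaround above does.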

In the kernel sources:  git log v5.8..v5.17 net/

Searching that log for TFO turns up only a limited number of commits
that could have broken this, and from the changelogs alone I'm not sure
whether any of them are relevant.  I'm guessing the issue possibly
relates to congestion control, in which case this is probably the most
relevant:

commit be5d1b61a2ad28c7e57fe8bfa277373e8ecffcdc
Author: Nguyen Dinh Phi <phind.uet@gmail.com>
Date:   Tue Jul 6 07:19:12 2021 +0800

    tcp: fix tcp_init_transfer() to not reset icsk_ca_initialized

Just looking at the diff, it removes an icsk->icsk_ca_initialized = 0;
the only other place this gets set to 0 is in tcp_disconnect(), and it
is set to 1 in tcp_init_congestion_control(), so I think we might have
an uninitialized variable here.  Then again, tcp_init_socket mentions
explicitly that sk_alloc sets lots of stuff to 0.  It still bugs me that
the original commit (8919a9b31eb4) felt the need to set an explicit 0 in
tcp_init_transfer().

>
> Can you please share the exact kernel version of the client machine?
Our side (client) is 5.17.1 (side that initiates TCP/IP connection), I
obviously can't comment for the Google side (server).
> Also, can you please summarize/clarify whether you think the client,
> server, or both are misbehaving?

The client is retransmitting frames for which it has already received
an ACK from the server.  In the pcap, retransmits start at frame 105,
and the first "spurious retransmission" (as Wireshark labels it) shows
up at frame 122.
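
The check behind that statement can be sketched in a few lines: track the highest cumulative ACK seen from the server and flag any client data segment that ends at or below it. This is only an illustration run over four lines copied from the text trace earlier in the thread (the port 59110 connection, not the attached pcap); the regular expression and function name are my own, and sequence-number wraparound is ignored.

```python
import re

CLIENT = "2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110"

# Four consecutive lines copied from the tcpdump text output in this thread.
SAMPLE = """\
00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP
00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0
00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP
00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0
"""

# Matches the tcpdump text format seen above; the seq range is optional
# because pure ACKs carry none.
LINE_RE = re.compile(
    r"^(\S+) IP6 (\S+) > (\S+): Flags \[[^\]]*\],"
    r"(?: seq (\d+):(\d+),)? ack (\d+),"
)


def spurious_retransmits(text, client):
    """Return (time, seq_start, seq_end) for client data already ACKed by the peer."""
    highest_ack = None  # highest cumulative ACK seen from the peer so far
    found = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp, src, _dst, seq_start, seq_end, ack = match.groups()
        if src == client:
            if (seq_end is not None and highest_ack is not None
                    and int(seq_end) <= highest_ack):
                found.append((stamp, int(seq_start), int(seq_end)))
        else:
            highest_ack = max(highest_ack or 0, int(ack))
    return found


for hit in spurious_retransmits(SAMPLE, CLIENT):
    print(hit)
```

On this sample the only segment flagged is the 00:56:20.380076 send of 956707176:956708604, which arrives after the server has already acknowledged up to 956710032.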

Kind Regards,
Jaco

[-- Attachment #2: iewc_google2.pcap --]
[-- Type: application/vnd.tcpdump.pcap, Size: 19828 bytes --]


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30  2:58   ` Jaco Kroon
@ 2022-03-30  3:48     ` Eric Dumazet
  2022-03-30  6:22       ` Jaco Kroon
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Dumazet @ 2022-03-30  3:48 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

On Tue, Mar 29, 2022 at 7:58 PM Jaco Kroon <jaco@uls.co.za> wrote:
>
> Hi Neal,
>
> > Thanks for the report!  I have CC-ed the netdev list, since it is
> > probably a better forum for this discussion.
> Awesome thank you.
> >
> > Can you please attach (or link to) a tcpdump raw .pcap file  (produced
> > with the -w flag)? There are a number of tools that will make this
> > easier to visualize and analyze if we can see the raw .pcap file. You
> > may want to anonymize the trace and/or capture just headers, etc (for
> > example, the -s flag can control how much of each packet tcpdump
> > grabs).
>
> Attached.
>
> The traffic itself should be mostly encrypted but stripped with -s100
> anyway.  At this point SACK was still on.
>
> I don't know how, or why, but this relates to TFO.  After sending report
> on a hunch (based on comparing the exim logs of a successful delivery
> compared to a non-successful) and the only difference was that the
> non-working was stating:
>
> TFO mode sendto, no data: EINPROGRESS
>
> and then specifically:
>
> TCP_FASTOPEN tcpi_unacked 2
>
> The working connections never had the latter line in the output.
>
> The moment I set sysctl -w net.ipv4.tcp_fastopen=0 (default is 1) I've
> managed to flood out about 1200 emails to google in a matter of no more
> than 15 minutes.
>
> In the kernel sources:  git log v5.8..v5.17 net/
>
> And searching for TFO only gives so many possible commits that broke
> this, just looking at changelogs I'm not sure if any of them are
> relevant.  I'm guessing the issue possibly relates to congestion
> control, as such this is probably the most relevant:
>
> commit be5d1b61a2ad28c7e57fe8bfa277373e8ecffcdc
> Author: Nguyen Dinh Phi <phind.uet@gmail.com>
> Date:   Tue Jul 6 07:19:12 2021 +0800
>
>     tcp: fix tcp_init_transfer() to not reset icsk_ca_initialized
>
> Just looking at the diff it removes a icsk->icsk_ca_initialized = 0; -
> the only other place this gets set to 0 is in tcp_disconnect() ... and
> to 1 in tcp_init_congestion_control() - so I think we might have an
> uninitialized variable here ... then again tcp_init_socket mentions
> explicitly that sk_alloc set lots of stuff to 0 - still bugs me that the
> original commit (8919a9b31eb4) felt the need to set an explicit 0 in
> tcp_init_transfer().

I do not think this commit is related to the issue you have.

I guess you could try a revert ?

Then, if you think old linux versions were ok, start a bisection ?

Thank you.

(I do not see why a successful TFO would lead to a freeze after ~70 KB
of data has been sent)
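
As a rough cross-check of the ~70 KB figure against the text trace quoted above (the port 59110 connection), the stall point can be computed from the client's initial sequence number in the SYN and the last ACK value the server keeps repeating. This is plain arithmetic on numbers copied from that trace; the variable names are only illustrative.

```python
CLIENT_ISN = 956633779      # seq in the client's SYN
FIRST_DATA_SEQ = CLIENT_ISN + 1  # the SYN itself consumes one sequence number
STUCK_ACK = 956710032       # ACK value the server repeats during the stall
RETRANSMIT_SEQ = 956707176  # start of the segment the client keeps resending

acked_bytes = STUCK_ACK - FIRST_DATA_SEQ
retransmit_offset = RETRANSMIT_SEQ - FIRST_DATA_SEQ

print(f"bytes acknowledged by the server: {acked_bytes} ({acked_bytes / 1024:.1f} KiB)")
print(f"offset of the resent segment:     {retransmit_offset}")
print(f"past 2^16 bytes:                  {acked_bytes > 2 ** 16}")
```

This gives 76252 acknowledged bytes, about 74.5 KiB, with the resent segment starting at offset 73396, so in that trace the stall sits a little beyond the first 64 KiB of client data.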

>
> >
> > Can you please share the exact kernel version of the client machine?
> Our side (client) is 5.17.1 (side that initiates TCP/IP connection), I
> obviously can't comment for the Google side (server).
> > Also, can you please summarize/clarify whether you think the client,
> > server, or both are misbehaving?
>
> client is re-transmitting frames for which it has already received an
> ACK from the server.  In pcap from frames 105 onwards one can start
> seeing retransmits, then first "spurious retransmission" as wireshark
> labels it from frames 122 onwards.
>
> Kind Regards,
> Jaco


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30  3:48     ` Eric Dumazet
@ 2022-03-30  6:22       ` Jaco Kroon
  2022-03-30 13:56         ` Neal Cardwell
  0 siblings, 1 reply; 38+ messages in thread
From: Jaco Kroon @ 2022-03-30  6:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

Hi Eric,

On 2022/03/30 05:48, Eric Dumazet wrote:
> On Tue, Mar 29, 2022 at 7:58 PM Jaco Kroon <jaco@uls.co.za> wrote:
>
> I do not think this commit is related to the issue you have.
>
> I guess you could try a revert ?
>
> Then, if you think old linux versions were ok, start a bisection ?
That'll be interesting, will see if I can reproduce on a non-production
host.
>
> Thank you.
>
> (I do not see why a successful TFO would lead to a freeze after ~70 KB
> of data has been sent)

I do actually agree with this in that it makes no sense, but disabling
TFO definitely resolved the issue for us.

Kind Regards,
Jaco



* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30  6:22       ` Jaco Kroon
@ 2022-03-30 13:56         ` Neal Cardwell
  2022-03-30 15:00           ` Jaco Kroon
  0 siblings, 1 reply; 38+ messages in thread
From: Neal Cardwell @ 2022-03-30 13:56 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng

On Wed, Mar 30, 2022 at 2:22 AM Jaco Kroon <jaco@uls.co.za> wrote:
>
> Hi Eric,
>
> On 2022/03/30 05:48, Eric Dumazet wrote:
> > On Tue, Mar 29, 2022 at 7:58 PM Jaco Kroon <jaco@uls.co.za> wrote:
> >
> > I do not think this commit is related to the issue you have.
> >
> > I guess you could try a revert ?
> >
> > Then, if you think old linux versions were ok, start a bisection ?
> That'll be interesting, will see if I can reproduce on a non-production
> host.
> >
> > Thank you.
> >
> > (I do not see why a successful TFO would lead to a freeze after ~70 KB
> > of data has been sent)
>
> I do actually agree with this in that it makes no sense, but disabling
> TFO definitely resolved the issue for us.
>
> Kind Regards,
> Jaco

Thanks for the pcap trace! That's a pretty strange trace. I agree with
Eric's theory that this looks like one or more bugs in a firewall,
middlebox, or netfilter rule. From the trace it looks like the buggy
component is sometimes dropping packets and sometimes corrupting them
so that the client's TCP stack ignores them.

Interestingly, in that trace the client SYN has a TFO option and
cookie, but no data in the SYN.

The last packet that looks sane/normal is the ACK from the SMTP server
that looks like:

00:00:00.000010 IP6 2a00:1450:4013:c16::1a.25 >
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . 6260:6260(0) ack 66263 win
774 <nop,nop,TS val 1206544341 ecr 331189186>

That's the first ACK that crosses past 2^16. Maybe that is a
coincidence, or maybe not. Perhaps the buggy firewall/middlebox/etc is
confused by the TFO option, corrupts its state, and thereafter behaves
incorrectly past the first 64 KBytes of data from the client.

In addition to checking for checksum failures, mentioned by Eric, you
could look for PAWS failures, something like:

  nstat -az | egrep  -i 'TcpInCsumError|PAWS'
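
For context on what a PAWS failure is: with TCP timestamps enabled, a receiver discards a segment whose TS val is older than the most recent one it has accepted. The sketch below is a simplified version of that comparison based on my reading of RFC 7323, not the kernel's actual code; it leaves out details such as the expiry of a stale ts_recent and special handling of RSTs, and the function name is made up.

```python
def paws_reject(seg_tsval, ts_recent):
    """Return True if seg_tsval is older than ts_recent in 32-bit serial arithmetic."""
    delta = (ts_recent - seg_tsval) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000  # reinterpret as a signed 32-bit difference
    return delta > 0


# TS val values taken from server ACKs in the text trace earlier in the thread.
print(paws_reject(3477432543, 3477432263))  # newer than ts_recent
print(paws_reject(3477431998, 3477432263))  # older than ts_recent
# Timestamp wraparound: a small value just after the wrap still counts as newer.
print(paws_reject(5, 0xFFFFFFF0))
```

Only the second call reports a rejection. A middlebox that rewrites or reorders timestamps could trigger this path on otherwise valid ACKs, which is why the PAWS counters are worth checking alongside the checksum counter.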

best,
neal


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30 13:56         ` Neal Cardwell
@ 2022-03-30 15:00           ` Jaco Kroon
  2022-03-30 16:19             ` Eric Dumazet
  0 siblings, 1 reply; 38+ messages in thread
From: Jaco Kroon @ 2022-03-30 15:00 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng

Hi,

On 2022/03/30 15:56, Neal Cardwell wrote:
> On Wed, Mar 30, 2022 at 2:22 AM Jaco Kroon <jaco@uls.co.za> wrote:
>> Hi Eric,
>>
>> On 2022/03/30 05:48, Eric Dumazet wrote:
>>> On Tue, Mar 29, 2022 at 7:58 PM Jaco Kroon <jaco@uls.co.za> wrote:
>>>
>>> I do not think this commit is related to the issue you have.
>>>
>>> I guess you could try a revert ?
>>>
>>> Then, if you think old linux versions were ok, start a bisection ?
>> That'll be interesting, will see if I can reproduce on a non-production
>> host.
>>> Thank you.
>>>
>>> (I do not see why a successful TFO would lead to a freeze after ~70 KB
>>> of data has been sent)
>> I do actually agree with this in that it makes no sense, but disabling
>> TFO definitely resolved the issue for us.
>>
>> Kind Regards,
>> Jaco
> Thanks for the pcap trace! That's a pretty strange trace. I agree with
> Eric's theory that this looks like one or more bugs in a firewall,
> middlebox, or netfilter rule. From the trace it looks like the buggy
> component is sometimes dropping packets and sometimes corrupting them
> so that the client's TCP stack ignores them.
The capture was taken on the client.  So the only firewall there is
iptables, and I redirected all -j DROP statements to a L_DROP chain
which did a -j LOG prior to -j DROP - didn't pick up any drops here.
>
> Interestingly, in that trace the client SYN has a TFO option and
> cookie, but no data in the SYN.

So this allows the SMTP server, which speaks first in the conversation
to identify itself, to respond with data in the SYN (not sure that was
actually happening, but if I recall correctly I did see it send data
prior to receiving the final ACK of the handshake).

>
> The last packet that looks sane/normal is the ACK from the SMTP server
> that looks like:
>
> 00:00:00.000010 IP6 2a00:1450:4013:c16::1a.25 >
> 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . 6260:6260(0) ack 66263 win
> 774 <nop,nop,TS val 1206544341 ecr 331189186>
>
> That's the first ACK that crosses past 2^16. Maybe that is a
> coincidence, or maybe not. Perhaps the buggy firewall/middlebox/etc is

I believe it should be because we literally had this on every single
connection going out to Google's SMTP ... probably 1/100 connections
managed to deliver an email over the connection.  Then again ... 64KB
isn't that much ...

When you state sane/normal, do you mean there is fault with the other
frames that could not be explained by packet loss in one or both of the
directions?

> confused by the TFO option, corrupts its state, and thereafter behaves
> incorrectly past the first 64 KBytes of data from the client.

The only firewalls we've got are netfilter based, and these packets have
all passed through the dedicated firewalls by the time they reach
here.  There are no middleboxes on our end, and if this was on Google's
side there would be crazy noise from everyone, not just me.  I think the
trigger is packet loss between us.  As indicated, we know they have link
congestion issues in the JHB area: it took us the better part of two
weeks to get the first line tech on their side to just query the
internal teams, and probably another week to get the response
acknowledging this (mybroadband.co.za has an article about other local
ISPs also complaining).

>
> In addition to checking for checksum failures, mentioned by Eric, you
> could look for PAWS failures, something like:
>
>   nstat -az | egrep  -i 'TcpInCsumError|PAWS'

TcpInCsumErrors                 0                  0.0
TcpExtPAWSActive                0                  0.0
TcpExtPAWSEstab                 90092              0.0
TcpExtTCPACKSkippedPAWS         81317              0.0

Not sure what these mean, but I should probably investigate; the latter
two are definitely incrementing.
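
A small sketch for pulling the non-zero counters out of nstat output like the above, so they can be sampled repeatedly and compared. The sample text is copied from this message; the function name is made up. As far as I understand the counters, TcpExtPAWSEstab counts segments rejected on established connections because of their timestamp, and TcpExtTCPACKSkippedPAWS counts duplicate ACKs that were rate limited in response to such segments, but that reading should be verified against the kernel documentation.

```python
NSTAT_OUTPUT = """\
TcpInCsumErrors                 0                  0.0
TcpExtPAWSActive                0                  0.0
TcpExtPAWSEstab                 90092              0.0
TcpExtTCPACKSkippedPAWS         81317              0.0
"""


def parse_nstat(text):
    """Parse 'name value rate' rows into a dict of counter name to integer value."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have at least a name and a numeric value; skip anything else.
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


counters = parse_nstat(NSTAT_OUTPUT)
nonzero = {name: value for name, value in counters.items() if value}

for name, value in nonzero.items():
    print(f"{name}: {value}")
```

Here the checksum counter is zero while both PAWS-related counters are not, which points away from corrupted checksums and towards timestamp-based rejection as the thing to look at next.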

Appreciate the feedback and for looking at the traces.

Kind Regards,
Jaco



* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30 15:00           ` Jaco Kroon
@ 2022-03-30 16:19             ` Eric Dumazet
  2022-03-31 15:41               ` Neal Cardwell
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Dumazet @ 2022-03-30 16:19 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

On Wed, Mar 30, 2022 at 9:04 AM Jaco Kroon <jaco@uls.co.za> wrote:
>
> Hi,
>
> On 2022/03/30 15:56, Neal Cardwell wrote:
> > On Wed, Mar 30, 2022 at 2:22 AM Jaco Kroon <jaco@uls.co.za> wrote:
> >> Hi Eric,
> >>
> >> On 2022/03/30 05:48, Eric Dumazet wrote:
> >>> On Tue, Mar 29, 2022 at 7:58 PM Jaco Kroon <jaco@uls.co.za> wrote:
> >>>
> >>> I do not think this commit is related to the issue you have.
> >>>
> >>> I guess you could try a revert ?
> >>>
> >>> Then, if you think old linux versions were ok, start a bisection ?
> >> That'll be interesting, will see if I can reproduce on a non-production
> >> host.
> >>> Thank you.
> >>>
> >>> (I do not see why a successful TFO would lead to a freeze after ~70 KB
> >>> of data has been sent)
> >> I do actually agree with this in that it makes no sense, but disabling
> >> TFO definitely resolved the issue for us.
> >>
> >> Kind Regards,
> >> Jaco
> > Thanks for the pcap trace! That's a pretty strange trace. I agree with
> > Eric's theory that this looks like one or more bugs in a firewall,
> > middlebox, or netfilter rule. From the trace it looks like the buggy
> > component is sometimes dropping packets and sometimes corrupting them
> > so that the client's TCP stack ignores them.
> The capture was taken on the client.  So the only firewall there is
> iptables, and I redirected all -j DROP statements to a L_DROP chain
> which did a -j LOG prior to -j DROP - didn't pick up any drops here.
> >
> > Interestingly, in that trace the client SYN has a TFO option and
> > cookie, but no data in the SYN.
>
> So this allows the SMTP server which in the conversation speaks first to
> identify itself to respond with data in the SYN (not sure that was
> actually happening but if I recall I did see it send data prior to
> receiving the final ACK on the handshake.
>
> >
> > The last packet that looks sane/normal is the ACK from the SMTP server
> > that looks like:
> >
> > 00:00:00.000010 IP6 2a00:1450:4013:c16::1a.25 >
> > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . 6260:6260(0) ack 66263 win
> > 774 <nop,nop,TS val 1206544341 ecr 331189186>
> >
> > That's the first ACK that crosses past 2^16. Maybe that is a
> > coincidence, or maybe not. Perhaps the buggy firewall/middlebox/etc is
>
> I believe it should be because we literally had this on every single
> connection going out to Google's SMTP ... probably 1/100 connections
> managed to deliver an email over the connection.  Then again ... 64KB
> isn't that much ...
>
> When you state sane/normal, do you mean there is fault with the other
> frames that could not be explained by packet loss in one or both of the
> directions?
>
> > confused by the TFO option, corrupts its state, and thereafter behaves
> > incorrectly past the first 64 KBytes of data from the client.
>
> Only firewalls we've got are netfilter based, and these packets all
> passed through the dedicated firewalls at least by the time they reach
> here.  No middleboxes on our end, and if this was Google's side there
> would be crazy noise be heard, not just me.  I think the trigger is
> packet loss between us (as indicated we know they have link congestion
> issues in JHB area, it took us the better part of two weeks to get the
> first line tech on their side to just query the internal teams and
> probably another week to get the response acknowledging this -
> mybroadband.co.za has an article about other local ISPs also complaining).
>
> >
> > In addition to checking for checksum failures, mentioned by Eric, you
> > could look for PAWS failures, something like:
> >
> >   nstat -az | egrep  -i 'TcpInCsumError|PAWS'
>
> TcpInCsumErrors                 0                  0.0
> TcpExtPAWSActive                0                  0.0
> TcpExtPAWSEstab                 90092              0.0
> TcpExtTCPACKSkippedPAWS         81317              0.0
>
> Not sure what these mean, but i should probably investigate, the latter
> two are definitely incrementing.
>
> Appreciate the feedback and for looking at the traces.
>

Your pcap does not show any obvious PAWS issues.
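
For anyone wanting to see at a glance which of the counters quoted above
are nonzero, here is a minimal sketch. It assumes the usual nstat layout
of "name value rate" per line; the helper names parse_nstat and nonzero
are made up for illustration, and the sample text is simply copied from
the output quoted above.

```python
# Sketch only: parse_nstat() and nonzero() are hypothetical helper names.
# The sample text is the nstat output quoted earlier in this message.
NSTAT_OUTPUT = """\
TcpInCsumErrors                 0                  0.0
TcpExtPAWSActive                0                  0.0
TcpExtPAWSEstab                 90092              0.0
TcpExtTCPACKSkippedPAWS         81317              0.0
"""


def parse_nstat(text):
    """Map counter name -> integer value, skipping header and blank lines."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # nstat prints a header line starting with '#'; ignore it.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        try:
            counters[fields[0]] = int(fields[1])
        except ValueError:
            # Not a "name value rate" line; skip it.
            continue
    return counters


def nonzero(counters):
    """Keep only the counters whose value is not zero."""
    return {name: value for name, value in counters.items() if value != 0}


counters = parse_nstat(NSTAT_OUTPUT)
print(nonzero(counters))
```

Running the same parse on two samples taken a few seconds apart and
subtracting the values would show which counters are actually moving
while a connection is stuck.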

If the host is lightly loaded, you could try the following while the
connection is being attempted or is frozen:

perf record -a -g -e skb:kfree_skb sleep 30
perf script  (or perf report)


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30 16:19             ` Eric Dumazet
@ 2022-03-31 15:41               ` Neal Cardwell
  2022-03-31 23:06                 ` Jaco Kroon
  0 siblings, 1 reply; 38+ messages in thread
From: Neal Cardwell @ 2022-03-31 15:41 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Jaco Kroon, LKML, Netdev, Yuchung Cheng

[-- Attachment #1: Type: text/plain, Size: 2126 bytes --]

On Wed, Mar 30, 2022 at 9:04 AM Jaco Kroon <jaco@uls.co.za> wrote:
...
> When you state sane/normal, do you mean there is fault with the other
> frames that could not be explained by packet loss in one or both of the
> directions?

Yes.

(1) If you look at the attached trace time/sequence plots (from
tcptrace and xplot.org) there are several behaviors that do not look
like normal congestive packet loss:

  (a) Literally *all* original transmissions (white segments in the
plot) of packets after client sequence 66263 appear lost (are not
ACKed). Congestion generally does not behave like that. But broken
firewalls/middleboxes do.
       (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-out.png )

  (b) When the client is retransmitting packets, only packets at
exactly snd_una are ACKed. The packets beyond that point are always
un-ACKed. Again sounds like a broken firewall/middlebox.
       (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-in.png )

  (c) After the client receives the server's "ack 73403", the client
ignores/drops all other incoming packets that show up in the trace.

       As Eric notes, this doesn't look like a PAWS issue. And it
doesn't look like a checksum or sequence/ACK validation issue. The
client starts ignoring ACKs between two ACKs that have correct
checksums, valid ACK numbers, and valid (identical) sequence numbers
and TS val and ecr values (here showing absolute sequence/ACK
numbers):

    (i) The client processes this ACK and uses it to advance snd_una:
    17:46:49.889911 IP6 (flowlabel 0x97427, hlim 61, next-header TCP
(6) payload length: 32) 2a00:1450:4013:c16::1a.25 >
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x7005 (correct)
2699968514:2699968514(0) ack 3451415932 win 830 <nop,nop,TS val
1206546583 ecr 331191428>

    (ii) The client ignores this ACK and all later ACKs:
    17:46:49.889912 IP6 (flowlabel 0x97427, hlim 61, next-header TCP
(6) payload length: 32) 2a00:1450:4013:c16::1a.25 >
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x6a66 (correct)
2699968514:2699968514(0) ack 3451417360 win 841 <nop,nop,TS val
1206546583 ecr 331191428>

neal

[-- Attachment #2: netdev-2022-03-29-tcp-disregarded-acks-zoomed-out.png --]
[-- Type: image/png, Size: 131216 bytes --]

[-- Attachment #3: netdev-2022-03-29-tcp-disregarded-acks-zoomed-in.png --]
[-- Type: image/png, Size: 128102 bytes --]


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-31 15:41               ` Neal Cardwell
@ 2022-03-31 23:06                 ` Jaco Kroon
  2022-04-01  0:10                   ` Eric Dumazet
  2022-04-01 14:50                   ` Neal Cardwell
  0 siblings, 2 replies; 38+ messages in thread
From: Jaco Kroon @ 2022-03-31 23:06 UTC (permalink / raw)
  To: Neal Cardwell, Eric Dumazet; +Cc: LKML, Netdev, Yuchung Cheng

Hi Neal,

This sniff was grabbed ON THE CLIENT HOST.  There is no middlebox or
anything between the sniffer and the client.  Only the firewall on the
host itself, where we've already established the traffic is NOT DISCARDED
(at least not in filter/INPUT).

Setup on our end:

2 x routers, usually each with a direct peering with Google (which is
being ignored at the moment so instead traffic is incoming via IPT over DD).

Connected via switch to

2 x firewalls, of which ONE is active (they have different networks
behind them, and could be active / standby for different networks behind
them - avoiding active-active because conntrackd is causing more trouble
than it's worth), Linux hosts, using netfilter, has been operating for
years, no recent kernel upgrades.

4 x hosts in mail cluster, one of which you're looking at here.

On 2022/03/31 17:41, Neal Cardwell wrote:
> On Wed, Mar 30, 2022 at 9:04 AM Jaco Kroon <jaco@uls.co.za> wrote:
> ...
>> When you state sane/normal, do you mean there is fault with the other
>> frames that could not be explained by packet loss in one or both of the
>> directions?
> Yes.
>
> (1) If you look at the attached trace time/sequence plots (from
> tcptrace and xplot.org) there are several behaviors that do not look
> like normal congestive packet loss:
OK.  I'm not 100% sure how these plots of yours work, but let's see if I
can follow your logic here - they mostly make sense.  A legend would
probably help.  As I understand the white dots are original transmits,
green is what has been ACKED.  R is retransmits ... what's the S? 
What's the yellow line (I'm guessing receive window as advertised by the
server)?
>
>   (a) Literally *all* original transmissions (white segments in the
> plot) of packets after client sequence 66263 appear lost (are not
> ACKed). Congestion generally does not behave like that. But broken
> firewalls/middleboxes do.
>        (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-out.png )

Agreed.  So could it be that something in the transit path towards
Google is actually dropping all of that?

As stated - I highly doubt this is on our network unless newer kernel
(on mail cluster) is doing stuff which is causing older netfilter to
drop perhaps?  But this doesn't explain why newer kernel retransmits
data for which it received an ACK.

>
>   (b) When the client is retransmitting packets, only packets at
> exactly snd_una are ACKed. The packets beyond that point are always
> un-ACKed. Again sounds like a broken firewall/middlebox.
>        (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-in.png )
No middlebox between packet sniffer and client ... client here is linux
5.17.1.  Brings me back to the only thing that could be dropping the
traffic is netfilter on the host, or the kernel doesn't like something
about the ACK, or kernel is doing something else wrong as a result of
TFO.  I'm not sure which option I like less.  Unfortunately I also use
netfilter for redirecting traffic into haproxy here so can't exactly
just switch off netfilter.
>
>   (c) After the client receives the server's "ack 73403", the client
> ignores/drops all other incoming packets that show up in the trace.

Agreed.  However, if I read your graph correctly, it gets an ACK for
frame X at ~3.8s into the connection, then for X+2 at 4s, but it keeps
retransmitting X+2, not X+1?


>
>        As Eric notes, this doesn't look like a PAWS issue. And it
> doesn't look like a checksum or sequence/ACK validation issue. The
> client starts ignoring ACKs between two ACKs that have correct
> checksums, valid ACK numbers, and valid (identical) sequence numbers
> and TS val and ecr values (here showing absolute sequence/ACK
> numbers):
I'm not familiar with PAWS here.  Assuming that the green line is ACKs,
then at around 4s we get an ACK that basically ACKs two frames in one
(which is fine from my understanding of TCP), and then the second of
these frames keeps getting retransmitted going forward, so it's almost
like the kernel ACKs the *first* of these two frames but not the second.
>
>     (i) The client processes this ACK and uses it to advance snd_una:
>     17:46:49.889911 IP6 (flowlabel 0x97427, hlim 61, next-header TCP
> (6) payload length: 32) 2a00:1450:4013:c16::1a.25 >
> 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x7005 (correct)
> 2699968514:2699968514(0) ack 3451415932 win 830 <nop,nop,TS val
> 1206546583 ecr 331191428>

>
>     (ii) The client ignores this ACK and all later ACKs:
>     17:46:49.889912 IP6 (flowlabel 0x97427, hlim 61, next-header TCP
> (6) payload length: 32) 2a00:1450:4013:c16::1a.25 >
> 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x6a66 (correct)
> 2699968514:2699968514(0) ack 3451417360 win 841 <nop,nop,TS val
> 1206546583 ecr 331191428>
>
> neal


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-31 23:06                 ` Jaco Kroon
@ 2022-04-01  0:10                   ` Eric Dumazet
  2022-04-01  0:15                     ` Florian Westphal
  2022-04-01  0:33                     ` Jaco Kroon
  2022-04-01 14:50                   ` Neal Cardwell
  1 sibling, 2 replies; 38+ messages in thread
From: Eric Dumazet @ 2022-04-01  0:10 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

On Thu, Mar 31, 2022 at 4:06 PM Jaco Kroon <jaco@uls.co.za> wrote:
>
> Hi Neal,
>
> This sniff was grabbed ON THE CLIENT HOST.  There is no middlebox or
> anything between the sniffer and the client.  Only the firewall on the
> host itself, where we've already establish the traffic is NOT DISCARDED
> (at least not in filter/INPUT).
>
> Setup on our end:
>
> 2 x routers, usually each with a direct peering with Google (which is
> being ignored at the moment so instead traffic is incoming via IPT over DD).
>
> Connected via switch to
>
> 2 x firewalls, of which ONE is active (they have different networks
> behind them, and could be active / standby for different networks behind
> them - avoiding active-active because conntrackd is causing more trouble
> than it's worth), Linux hosts, using netfilter, has been operating for
> years, no recent kernel upgrades.

Next step would be to attempt removing _all_ firewalls, especially
uncommon setups like yours.

conntrack had a bug preventing TFO deployment for a while, because
many boxes kept buggy kernel versions for years.

356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix
tcp_in_window for Fast Open


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01  0:10                   ` Eric Dumazet
@ 2022-04-01  0:15                     ` Florian Westphal
  2022-04-01 11:54                       ` Jaco Kroon
  2022-04-01  0:33                     ` Jaco Kroon
  1 sibling, 1 reply; 38+ messages in thread
From: Florian Westphal @ 2022-04-01  0:15 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Jaco Kroon, Neal Cardwell, LKML, Netdev, Yuchung Cheng

Eric Dumazet <edumazet@google.com> wrote:
> Next step would be to attempt removing _all_ firewalls, especially not
> common setups like yours.
> 
> conntrack had a bug preventing TFO deployment for a while, because
> many boxes kept buggy kernel versions for years.
> 
> 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix
> tcp_in_window for Fast Open

Jaco could also try with
net.netfilter.nf_conntrack_tcp_be_liberal=1

and, if that helps, with liberal=0 and
sysctl net.netfilter.nf_conntrack_log_invalid=6

(check dmesg/syslog/nflog).


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01  0:10                   ` Eric Dumazet
  2022-04-01  0:15                     ` Florian Westphal
@ 2022-04-01  0:33                     ` Jaco Kroon
  2022-04-01  0:41                       ` Eric Dumazet
  1 sibling, 1 reply; 38+ messages in thread
From: Jaco Kroon @ 2022-04-01  0:33 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

Hi,

On 2022/04/01 02:10, Eric Dumazet wrote:
> On Thu, Mar 31, 2022 at 4:06 PM Jaco Kroon <jaco@uls.co.za> wrote:
>> Hi Neal,
>>
>> This sniff was grabbed ON THE CLIENT HOST.  There is no middlebox or
>> anything between the sniffer and the client.  Only the firewall on the
>> host itself, where we've already establish the traffic is NOT DISCARDED
>> (at least not in filter/INPUT).
>>
>> Setup on our end:
>>
>> 2 x routers, usually each with a direct peering with Google (which is
>> being ignored at the moment so instead traffic is incoming via IPT over DD).
>>
>> Connected via switch to
>>
>> 2 x firewalls, of which ONE is active (they have different networks
>> behind them, and could be active / standby for different networks behind
>> them - avoiding active-active because conntrackd is causing more trouble
>> than it's worth), Linux hosts, using netfilter, has been operating for
>> years, no recent kernel upgrades.
> Next step would be to attempt removing _all_ firewalls, especially not
> common setups like yours.
That I'm afraid is not going to happen here.  I can't imagine what we're
doing is that uncommon.  On the host basically for INPUT drop invalid,
ACCEPT related established, accept specific ports, drop everything
else.  Other than the redirects in NAT there really isn't anything "funny".
>
> conntrack had a bug preventing TFO deployment for a while, because
> many boxes kept buggy kernel versions for years.

We don't use conntrackd, we tried many years back, but eventually we
just ended up using ucarp with /32s on the interfaces and whatever
subnet is required for the floating IP itself, combined with OSPF to
sort out the routing, that way we get to avoid asymmetric routing and
the need for conntrackd.  On FORWARD, the core firewalls basically do
some directing based on ingress and/or egress interface to determine which
ruleset to apply, again with INVALID and RELATED,ESTABLISHED rules at the head.

>
> 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix
> tcp_in_window for Fast Open

This is from Aug 9, 2013 ... our firewall's kernel isn't that old :). 
Again, the traffic was sniffed on the client side of that firewall, and
the only firewall between the sniffer and the processing part of the
kernel is the local netfilter.

I'll deploy same on a dev host we've got in the coming week and start a
bisect process.

Kind Regards,
Jaco



* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01  0:33                     ` Jaco Kroon
@ 2022-04-01  0:41                       ` Eric Dumazet
  2022-04-01  0:54                         ` Eric Dumazet
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Dumazet @ 2022-04-01  0:41 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

On Thu, Mar 31, 2022 at 5:33 PM Jaco Kroon <jaco@uls.co.za> wrote:

> I'll deploy same on a dev host we've got in the coming week and start a
> bisect process.

Thanks, this will definitely help.


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01  0:41                       ` Eric Dumazet
@ 2022-04-01  0:54                         ` Eric Dumazet
  2022-04-01 11:36                           ` Jaco Kroon
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Dumazet @ 2022-04-01  0:54 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

On Thu, Mar 31, 2022 at 5:41 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Mar 31, 2022 at 5:33 PM Jaco Kroon <jaco@uls.co.za> wrote:
>
> > I'll deploy same on a dev host we've got in the coming week and start a
> > bisect process.
>
> Thanks, this will definitely help.

One thing I noticed in your pcap is a good amount of drops, as if
Hystart was not able to stop slow-start before the drops are
happening.

TFO with one less RTT at connection establishment could be the trigger.

If you are still using cubic, please try to revert.


commit 4e1fddc98d2585ddd4792b5e44433dcee7ece001
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Nov 23 12:25:35 2021 -0800

    tcp_cubic: fix spurious Hystart ACK train detections for
not-cwnd-limited flows


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01  0:54                         ` Eric Dumazet
@ 2022-04-01 11:36                           ` Jaco Kroon
  2022-04-01 13:54                             ` Eric Dumazet
  0 siblings, 1 reply; 38+ messages in thread
From: Jaco Kroon @ 2022-04-01 11:36 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

Hi Eric,

On 2022/04/01 02:54, Eric Dumazet wrote:
> On Thu, Mar 31, 2022 at 5:41 PM Eric Dumazet <edumazet@google.com> wrote:
>> On Thu, Mar 31, 2022 at 5:33 PM Jaco Kroon <jaco@uls.co.za> wrote:
>>
>>> I'll deploy same on a dev host we've got in the coming week and start a
>>> bisect process.
>> Thanks, this will definitely help.
> One thing I noticed in your pcap is a good amount of drops, as if
> Hystart was not able to stop slow-start before the drops are
> happening.
>
> TFO with one less RTT at connection establishment could be the trigger.
>
> If you are still using cubic, please try to revert.
Sorry, I understand TCP itself a bit, but I've given up trying to
understand the various schedulers a long time ago and am just using the
defaults that the kernel provides.  How do I check what I'm using, and
how can I change that?  What is recommended at this stage?
>
>
> commit 4e1fddc98d2585ddd4792b5e44433dcee7ece001
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Tue Nov 23 12:25:35 2021 -0800
>
>     tcp_cubic: fix spurious Hystart ACK train detections for
> not-cwnd-limited flows
Ok, instead of starting with bisect, if I can reproduce in dev I'll use
this one first.

Kind Regards,
Jaco


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01  0:15                     ` Florian Westphal
@ 2022-04-01 11:54                       ` Jaco Kroon
  2022-04-01 12:09                         ` Florian Westphal
  0 siblings, 1 reply; 38+ messages in thread
From: Jaco Kroon @ 2022-04-01 11:54 UTC (permalink / raw)
  To: Florian Westphal, Eric Dumazet; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

Hi,

On 2022/04/01 02:15, Florian Westphal wrote:

Incidentally, I always find your initials to be interesting considering
(as far as I know) you work on netfilter firewall.

> Eric Dumazet <edumazet@google.com> wrote:
>> Next step would be to attempt removing _all_ firewalls, especially not
>> common setups like yours.
>>
>> conntrack had a bug preventing TFO deployment for a while, because
>> many boxes kept buggy kernel versions for years.
>>
>> 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix
>> tcp_in_window for Fast Open
> Jaco could also try with
> net.netfilter.nf_conntrack_tcp_be_liberal=1
>
> and, if that helps, with liberal=0 and
> sysctl net.netfilter.nf_conntrack_log_invalid=6
>
> (check dmesg/syslog/nflog).

Our core firewalls already had nf_conntrack_tcp_be_liberal for other
reasons (asymmetric routing combined with conntrackd left-over if I
recall), so maybe that's why it got through there ... don't exactly want
to just flip that setting though, is there a way to log if it would have
dropped anything, without actually dropping it (yet)?

Will do this first, first need to confirm that I can reproduce in a dev
environment.

Kind Regards,
Jaco




* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01 11:54                       ` Jaco Kroon
@ 2022-04-01 12:09                         ` Florian Westphal
  0 siblings, 0 replies; 38+ messages in thread
From: Florian Westphal @ 2022-04-01 12:09 UTC (permalink / raw)
  To: Jaco Kroon
  Cc: Florian Westphal, Eric Dumazet, Neal Cardwell, LKML, Netdev,
	Yuchung Cheng

Jaco Kroon <jaco@uls.co.za> wrote:
> > Eric Dumazet <edumazet@google.com> wrote:
> >> Next step would be to attempt removing _all_ firewalls, especially not
> >> common setups like yours.
> >>
> >> conntrack had a bug preventing TFO deployment for a while, because
> >> many boxes kept buggy kernel versions for years.
> >>
> >> 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix
> >> tcp_in_window for Fast Open
> > Jaco could also try with
> > net.netfilter.nf_conntrack_tcp_be_liberal=1
> >
> > and, if that helps, with liberal=0 and
> > sysctl net.netfilter.nf_conntrack_log_invalid=6
> >
> > (check dmesg/syslog/nflog).
> 
> Our core firewalls already had nf_conntrack_tcp_be_liberal for other
> reasons (asymmetric routing combined with conntrackd left-over if I
> recall), so maybe that's why it got through there ... don't exactly want
> to just flip that setting though, is there a way to log if it would have
> dropped anything, without actually dropping it (yet)?

This means conntrack doesn't tag packets as invalid EVEN if it would
consider sequence/ack out-of-window (e.g. due to a bug).

I have a hard time seeing how tcp liberal-mode conntrack would be to
blame here.

Only thing you could also check is if
net.netfilter.nf_conntrack_checksum=0 helps (but i doubt it).


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01 11:36                           ` Jaco Kroon
@ 2022-04-01 13:54                             ` Eric Dumazet
  0 siblings, 0 replies; 38+ messages in thread
From: Eric Dumazet @ 2022-04-01 13:54 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

On Fri, Apr 1, 2022 at 4:36 AM Jaco Kroon <jaco@uls.co.za> wrote:
>
> Hi Eric,
>
> On 2022/04/01 02:54, Eric Dumazet wrote:
> > On Thu, Mar 31, 2022 at 5:41 PM Eric Dumazet <edumazet@google.com> wrote:
> >> On Thu, Mar 31, 2022 at 5:33 PM Jaco Kroon <jaco@uls.co.za> wrote:
> >>
> >>> I'll deploy same on a dev host we've got in the coming week and start a
> >>> bisect process.
> >> Thanks, this will definitely help.
> > One thing I noticed in your pcap is a good amount of drops, as if
> > Hystart was not able to stop slow-start before the drops are
> > happening.
> >
> > TFO with one less RTT at connection establishment could be the trigger.
> >
> > If you are still using cubic, please try to revert.
> Sorry, I understand TCP itself a bit, but I've given up trying to
> understand the various schedulers a long time ago and am just using the
> defaults that the kernel provides.  How do I check what I'm using, and
> how can I change that?  What is recommended at this stage?

How to check: cat /proc/sys/net/ipv4/tcp_congestion_control

This is of course orthogonal to the bug we are tracking here,
but given your long RTT, I would recommend using fq packet scheduler and bbr.

# Or use mq+fq if your NIC is multiqueue and you need a good amount of
# throughput.
tc qdisc replace dev eth0 root fq

modprobe tcp_bbr   # (after enabling CONFIG_TCP_CONG_BBR=m)
echo bbr >/proc/sys/net/ipv4/tcp_congestion_control
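
To confirm what is currently in use and whether bbr is registered at all,
a small sketch like the following could be used. It only reads two files
under /proc/sys/net/ipv4; the function names and the "base" parameter are
made up for illustration (the parameter exists only so the logic can be
exercised against a scratch directory), and missing files are reported as
None rather than treated as an error.

```python
# Sketch only: read_setting() and congestion_control_summary() are
# hypothetical helper names, not part of any existing tool.
from pathlib import Path

PROC_IPV4 = Path("/proc/sys/net/ipv4")


def read_setting(name, base=PROC_IPV4):
    """Return the stripped contents of a sysctl file, or None if unreadable."""
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        return None


def congestion_control_summary(base=PROC_IPV4):
    """Report the active algorithm and the ones the kernel has registered."""
    current = read_setting("tcp_congestion_control", base)
    available = read_setting("tcp_available_congestion_control", base)
    names = available.split() if available else []
    return {
        "current": current,
        "available": names,
        "bbr_loaded": "bbr" in names,
    }


if __name__ == "__main__":
    print(congestion_control_summary())
```

If "bbr_loaded" comes back False after loading the module, the kernel was
most likely built without CONFIG_TCP_CONG_BBR.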


> >
> >
> > commit 4e1fddc98d2585ddd4792b5e44433dcee7ece001
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Tue Nov 23 12:25:35 2021 -0800
> >
> >     tcp_cubic: fix spurious Hystart ACK train detections for
> > not-cwnd-limited flows
> Ok, instead of starting with bisect, if I can reproduce in dev I'll use
> this one first.

Thanks ! (again this won't fix the bug, this is really a shot in the dark)


* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-31 23:06                 ` Jaco Kroon
  2022-04-01  0:10                   ` Eric Dumazet
@ 2022-04-01 14:50                   ` Neal Cardwell
  2022-04-01 15:39                     ` Neal Cardwell
  1 sibling, 1 reply; 38+ messages in thread
From: Neal Cardwell @ 2022-04-01 14:50 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng

On Thu, Mar 31, 2022 at 7:06 PM Jaco Kroon <jaco@uls.co.za> wrote:
>
> Hi Neal,
>
> This sniff was grabbed ON THE CLIENT HOST.  There is no middlebox or
> anything between the sniffer and the client.  Only the firewall on the
> host itself, where we've already establish the traffic is NOT DISCARDED
> (at least not in filter/INPUT).

Yes, understood. Please excuse my general use of the term
"firewalls/middleboxes" even where in some contexts it's clear the
"middleboxes" aspect of that term could not apply. :-)

> Setup on our end:
>
> 2 x routers, usually each with a direct peering with Google (which is
> being ignored at the moment so instead traffic is incoming via IPT over DD).
>
> Connected via switch to
>
> 2 x firewalls, of which ONE is active (they have different networks
> behind them, and could be active / standby for different networks behind
> them - avoiding active-active because conntrackd is causing more trouble
> than it's worth), Linux hosts, using netfilter, has been operating for
> years, no recent kernel upgrades.
>
> 4 x hosts in mail cluster, one of which you're looking at here.
>
> On 2022/03/31 17:41, Neal Cardwell wrote:
> > On Wed, Mar 30, 2022 at 9:04 AM Jaco Kroon <jaco@uls.co.za> wrote:
> > ...
> >> When you state sane/normal, do you mean there is fault with the other
> >> frames that could not be explained by packet loss in one or both of the
> >> directions?
> > Yes.
> >
> > (1) If you look at the attached trace time/sequence plots (from
> > tcptrace and xplot.org) there are several behaviors that do not look
> > like normal congestive packet loss:
> OK.  I'm not 100% sure how these plots of yours work, but let's see if I
> can follow your logic here - they mostly make sense.  A legend would
> probably help.  As I understand the white dots are original transmits,
> green is what has been ACKED.  R is retransmits ... what's the S?

"S" is "SACKed", or selectively acknowledged. The SACK blocks below
the green ACK lines are DSACK blocks, for "Duplicate SACKs",
indicating the receiver has already received that sequence range.

> What's the yellow line (I'm guessing receive window as advertised by the
> server)?

Yes, the yellow line is the right edge of the receive window of the server.

> >   (a) Literally *all* original transmissions (white segments in the
> > plot) of packets after client sequence 66263 appear lost (are not
> > ACKed). Congestion generally does not behave like that. But broken
> > firewalls/middleboxes do.
> >        (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-out.png )
>
> Agreed.  So could it be that something in the transit path towards
> Google is actually dropping all of that?

It could be. Or it could be a firewall/middlebox.

> As stated - I highly doubt this is on our network unless newer kernel
> (on mail cluster) is doing stuff which is causing older netfilter to
> drop perhaps?  But this doesn't explain why newer kernel retransmits
> data for which it received an ACK.

Yes, I agree that the biggest problem to focus on is the TCP code in
the kernel retransmitting data for which the NIC is receiving ACKs.

> >
> >   (b) When the client is retransmitting packets, only packets at
> > exactly snd_una are ACKed. The packets beyond that point are always
> > un-ACKed. Again sounds like a broken firewall/middlebox.
> >        (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-in.png )
> No middlebox between packet sniffer and client ... client here is linux
> 5.17.1.  Brings me back to the only thing that could be dropping the
> traffic is netfilter on the host, or the kernel doesn't like something
> about the ACK, or kernel is doing something else wrong as a result of
> TFO.  I'm not sure which option I like less.  Unfortunately I also use
> netfilter for redirecting traffic into haproxy here so can't exactly
> just switch off netfilter.

Given the most problematic aspect of the trace, where the client-side
TCP connection is repeatedly retransmitting packets for which ACKs are
arriving at the NIC (and captured by tcpdump), it seems some software
in your kernel is dropping packets between the network device and the
TCP layer. Given that you mention  "the only thing that could be
dropping the traffic is netfilter on the host", it seems like the
netfilter rules or software are buggy.

A guess would be that the netfilter code is getting into a bad state
due to the TFO behavior where there is a data packet arriving from the
server immediately after the SYN/ACK and just before the client sends
its first ACK:

00:00:00.000000 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590 >
2a00:1450:4013:c16::1a.25: S 3451342529:3451342529(0) win 62580 <mss
8940,sackOK,TS val 331187616 ecr 0,nop,wscale 7,Unknown Option
3472da7bfe84[|tcp]>

00:00:00.164295 IP6 2a00:1450:4013:c16::1a.25 >
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: S. 2699962254:2699962254(0)
ack 3451342530 win 65535 <mss 1440,sackOK,TS val 1206542770 ecr
331187616,nop,wscale 8>

# this one is perhaps confusing netfilter?:
00:00:00.001641 IP6 2a00:1450:4013:c16::1a.25 >
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: P. 1:89(88) ack 1 win 256
<nop,nop,TS val 1206542772 ecr 331187616>

00:00:00.000035 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590 >
2a00:1450:4013:c16::1a.25: . 1:1(0) ack 89 win 489 <nop,nop,TS val
331187782 ecr 1206542772>

00:00:00.000042 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590 >
2a00:1450:4013:c16::1a.25: P. 1:24(23) ack 89 win 489 <nop,nop,TS val
331187782 ecr 1206542772>

Re "so can't exactly just switch off netfilter", are there any other
counters or logs you can somehow check for netfilter drops?

> >
> >   (c) After the client receives the server's "ack 73403", the client
> > ignores/drops all other incoming packets that show up in the trace.
>
> Agreed.  However, if I read your graph correctly, it gets an ACK for
> frame X at ~3.8s into the connection, then for X+2 at 4s, but it keeps
> retransmitting X+2, not X+1?

At t=4s, as I discuss below, there are two ACKs that arrive
back-to-back, where the client TCP apparently processes the first but
not the second. That's why it keeps retransmitting the packet beyond
the first ACK but not beyond the second ACK.

>
> >
> >        As Eric notes, this doesn't look like a PAWS issue. And it
> > doesn't look like a checksum or sequence/ACK validation issue. The
> > client starts ignoring ACKs between two ACKs that have correct
> > checksums, valid ACK numbers, and valid (identical) sequence numbers
> > and TS val and ecr values (here showing absolute sequence/ACK
> > numbers):
> I'm not familiar with PAWS here.  Assuming that the green line is ACKs,
> then at around 4s we get an ACK that basically ACKs two frames in one
> (which is fine from my understanding of TCP), and then the second of
> these frames keeps getting retransmitted going forward, so it's almost
> like the kernel ACKs the *first* of these two frames but not the second.

Again, there are two ACKs, where the client TCP apparently processes
the first but not the second, as discussed here:

> >
> >     (i) The client processes this ACK and uses it to advance snd_una:
> >     17:46:49.889911 IP6 (flowlabel 0x97427, hlim 61, next-header TCP
> > (6) payload length: 32) 2a00:1450:4013:c16::1a.25 >
> > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x7005 (correct)
> > 2699968514:2699968514(0) ack 3451415932 win 830 <nop,nop,TS val
> > 1206546583 ecr 331191428>
>
> >
> >     (ii) The client ignores this ACK and all later ACKs:
> >     17:46:49.889912 IP6 (flowlabel 0x97427, hlim 61, next-header TCP
> > (6) payload length: 32) 2a00:1450:4013:c16::1a.25 >
> > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x6a66 (correct)
> > 2699968514:2699968514(0) ack 3451417360 win 841 <nop,nop,TS val
> > 1206546583 ecr 331191428>
> >

Here are those same two ACKs again, shown with absolute time and
relative sequence numbers, to make them easier to parse:

(i) The client processes this ACK and uses it to advance snd_una:
17:46:49.889911 IP6 2a00:1450:4013:c16::1a.25 >
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . 6260:6260(0) ack 73403 win
830 <nop,nop,TS val 1206546583 ecr 331191428>

(ii) The client ignores this ACK and all later ACKs:
17:46:49.889912 IP6 2a00:1450:4013:c16::1a.25 >
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . 6260:6260(0) ack 74831 win
841 <nop,nop,TS val 1206546583 ecr 331191428>
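
As a quick cross-check of the two ACKs above, here is a small sketch
(Python, not part of the original message) that converts the absolute
ACK numbers to relative ones using the client's initial sequence number
from the SYN, and computes the gap between them. The helper name
relative_ack is made up for illustration.

```python
MOD = 1 << 32  # TCP sequence numbers wrap at 2**32


def relative_ack(absolute_ack: int, client_isn: int) -> int:
    """Offset of an ACK number from the client's initial sequence number."""
    return (absolute_ack - client_isn) % MOD


CLIENT_ISN = 3451342529  # seq of the client's SYN in the trace above
ACK_I = 3451415932       # ACK (i): processed by the client
ACK_II = 3451417360      # ACK (ii): ignored by the client

rel_i = relative_ack(ACK_I, CLIENT_ISN)
rel_ii = relative_ack(ACK_II, CLIENT_ISN)
gap = (ACK_II - ACK_I) % MOD  # bytes newly covered by ACK (ii)

print(rel_i, rel_ii, gap)
```

The gap comes out at 1428 bytes, which would be one full-sized segment
on this connection if we assume the server's MSS of 1440 minus 12 bytes
of TCP timestamp option. So ACK (ii) covers exactly one more segment
than ACK (i).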


neal

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01 14:50                   ` Neal Cardwell
@ 2022-04-01 15:39                     ` Neal Cardwell
  2022-04-01 15:48                       ` Neal Cardwell
  2022-04-02  8:42                       ` Jaco Kroon
  0 siblings, 2 replies; 38+ messages in thread
From: Neal Cardwell @ 2022-04-01 15:39 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang

On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote:
...
> Connection setup:
>
> 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo  cookie f025dd84b6122510,nop,nop], length 0
>
> 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0
>
> 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp
>
> 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0
>
> This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus
> we shouldn't send segments larger than that, and they "can't".  I need to
> determine if this is some form of offloading or they really are sending >1500
> byte frames (which I know won't pass our firewalls without fragmentation so
> probably some form of NIC offloading - which if it was active on older 5.8
> kernels did not cause problems):

Jaco, was there some previous kernel version on these client machines
where this problem did not show up? Perhaps the v5.8 version you
mention here? Can you please share the exact version number?

If so, a hypothesis would be:

(1) There is a bug in netfilter's handling of TFO connections where
the server sends a data packet after a TFO SYNACK, before the client
ACKs anything (as we see in this trace).

This bug is perhaps similar in character to the bug fixed by Yuchung's
2013 commit that Eric mentioned:

356d7d88e088687b6578ca64601b0a2c9d145296
netfilter: nf_conntrack: fix tcp_in_window for Fast Open

(2) With kernel v5.8, TFO blackhole detection detected that in your
workload there were TFO connections that died due to apparent
blackholing (like what's shown in the trace), and dynamically disabled
TFO on your machines. This allowed mail traffic to flow, because the
netfilter bug was no longer tickled. This worked around the netfilter
bug.

(3) You upgraded your client-side machine from v5.8 to v5.17, which
has the following commit from v5.14, which disables TFO blackhole
logic by default:
  213ad73d0607 tcp: disable TFO blackhole logic by default

(4) Due to (3), the blackhole detection logic was no longer operative,
and when the netfilter bug blackholed the connection, TFO stayed
enabled. This caused mail traffic to Google to stall.

This hypothesis would explain why:
  o disabling TFO fixes this problem
  o you are seeing this with a newer kernel (and apparently not with a
kernel before v5.14?)

With this hypothesis, we need several pieces to trigger this:

(a) client side software that tries TFO to a server that supports TFO
(like the exim mail transfer agent you are using, connecting to
Google)

(b) a client-side Linux kernel running buggy netfilter code (you are
running netfilter)

(c) a client-side Linux kernel with TFO support but no blackhole
detection logic active (e.g. v5.14 or later, like your v5.17.1)

That's probably a rare combination, so would explain why we have not
had this report before.

Jaco, to provide some evidence for this hypothesis, can you please
re-enable fastopen but also enable the TFO blackhole detection that
was disabled in v5.14 (213ad73d0607), with something like:

  sysctl -w net.ipv4.tcp_fastopen=1
  sysctl -w net.ipv4.tcp_fastopen_blackhole_timeout=3600

And then after a few hours, check to see if this blackholing behavior
has been detected:
  nstat -az | grep -i blackhole
And see if TFO FastOpenActive attempts have been cut to a super-low rate:
  nstat -az | grep -i fastopenactive
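
For reference, a rough sketch (Python, not from the original message)
of how the two settings could be read back and interpreted. It assumes
the usual /proc/sys layout, that bit 0x1 of tcp_fastopen enables
client-side TFO, and that a tcp_fastopen_blackhole_timeout of 0 means
blackhole detection is off. The helper names read_sysctl and
describe_tfo are hypothetical.

```python
from pathlib import Path


def read_sysctl(name: str, root: str = "/proc/sys") -> int:
    """Read an integer sysctl such as 'net.ipv4.tcp_fastopen' via procfs."""
    return int((Path(root) / name.replace(".", "/")).read_text().split()[0])


def describe_tfo(fastopen: int, blackhole_timeout: int) -> dict:
    """Interpret the two values (assumed semantics, see the text above)."""
    return {
        "client_tfo_enabled": bool(fastopen & 0x1),
        "blackhole_detection_enabled": blackhole_timeout > 0,
    }


if __name__ == "__main__":
    try:
        state = describe_tfo(
            read_sysctl("net.ipv4.tcp_fastopen"),
            read_sysctl("net.ipv4.tcp_fastopen_blackhole_timeout"),
        )
    except (OSError, ValueError, IndexError):
        state = None  # not on Linux, or the sysctl is missing/unreadable
    print(state)
```

With the two sysctl commands above applied, both entries would be
expected to read True under these assumptions.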

thanks,
neal

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01 15:39                     ` Neal Cardwell
@ 2022-04-01 15:48                       ` Neal Cardwell
  2022-04-02  8:42                       ` Jaco Kroon
  1 sibling, 0 replies; 38+ messages in thread
From: Neal Cardwell @ 2022-04-01 15:48 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang

On Fri, Apr 1, 2022 at 11:39 AM Neal Cardwell <ncardwell@google.com> wrote:
...
> Jaco, to provide some evidence for this hypothesis, can you please
> re-enable fastopen but also enable the TFO blackhole detection that
> was disabled in v5.14 (213ad73d0607), with something like:
>
>   sysctl -w net.ipv4.tcp_fastopen=1
>   sysctl -w tcp_fastopen_blackhole_timeout=3600

I would also suggest using Florian's suggestion to log invalid
packets, so perhaps we can get a clue as to why netfilter thinks these
packets are invalid:

    sysctl net.netfilter.nf_conntrack_log_invalid=6

> And then after a few hours, check to see if this blackholing behavior
> has been detected:
>   nstat -az | grep -i blackhole
> And see if TFO FastOpenActive attempts have been cut to a super-low rate:
>   nstat -az | grep -i fastopenactive

Then I would correspondingly echo Florian's suggestion to check
dmesg/syslog/nflog to learn more about the drops.

neal

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01 15:39                     ` Neal Cardwell
  2022-04-01 15:48                       ` Neal Cardwell
@ 2022-04-02  8:42                       ` Jaco Kroon
  2022-04-02 13:20                         ` Eric Dumazet
                                           ` (2 more replies)
  1 sibling, 3 replies; 38+ messages in thread
From: Jaco Kroon @ 2022-04-02  8:42 UTC (permalink / raw)
  To: Neal Cardwell, Florian Westphal
  Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang

Hi Neal,

On 2022/04/01 17:39, Neal Cardwell wrote:
> On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote:
> ...
>> Connection setup:
>>
>> 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo  cookie f025dd84b6122510,nop,nop], length 0
>>
>> 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0
>>
>> 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp
>>
>> 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0
>>
>> This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus
>> we shouldn't send segments larger than that, and they "can't".  I need to
>> determine if this is some form of offloading or they really are sending >1500
>> byte frames (which I know won't pass our firewalls without fragmentation so
>> probably some form of NIC offloading - which if it was active on older 5.8
>> kernels did not cause problems):
> Jaco, was there some previous kernel version on these client machines
> where this problem did not show up? Perhaps the v5.8 version you
> mention here? Can you please share the exact version number?
5.8.14
>
> If so, a hypothesis would be:
>
> (1) There is a bug in netfilter's handling of TFO connections where
> the server sends a data packet after a TFO SYNACK, before the client
> ACKs anything (as we see in this trace).
>
> This bug is perhaps similar in character to the bug fixed by Yuchung's
> 2013 commit that Eric mentioned:
>
> 356d7d88e088687b6578ca64601b0a2c9d145296
> netfilter: nf_conntrack: fix tcp_in_window for Fast Open
>
> (2) With kernel v5.8, TFO blackhole detection detected that in your
> workload there were TFO connections that died due to apparent
> blackholing (like what's shown in the trace), and dynamically disabled
> TFO on your machines. This allowed mail traffic to flow, because the
> netfilter bug was no longer tickled. This worked around the netfilter
> bug.
>
> (3) You upgraded your client-side machine from v5.8 to v5.17, which
> has the following commit from v5.14, which disables TFO blackhole
> logic by default:
>   213ad73d0607 tcp: disable TFO blackhole logic by default
>
> (4) Due to (3), the blackhole detection logic was no longer operative,
> and when the netfilter bug blackholed the connection, TFO stayed
> enabled. This caused mail traffic to Google to stall.
>
> This hypothesis would explain why:
>   o disabling TFO fixes this problem
>   o you are seeing this with a newer kernel (and apparently not with a
> kernel before v5.14?)
Agreed.
>
> With this hypothesis, we need several pieces to trigger this:
>
> (a) client side software that tries TFO to a server that supports TFO
> (like the exim mail transfer agent you are using, connecting to
> Google)
>
> (b) a client-side Linux kernel running buggy netfilter code (you are
> running netfilter)
>
> (c) a client-side Linux kernel with TFO support but no blackhole
> detection logic active (e.g. v5.14 or later, like your v5.17.1)
>
> That's probably a rare combination, so would explain why we have not
> had this report before.
>
> Jaco, to provide some evidence for this hypothesis, can you please
> re-enable fastopen but also enable the TFO blackhole detection that
> was disabled in v5.14 (213ad73d0607), with something like:
>
>   sysctl -w net.ipv4.tcp_fastopen=1
>   sysctl -w tcp_fastopen_blackhole_timeout=3600

Done.

Including sysctl net.netfilter.nf_conntrack_log_invalid=6, which
generates lots of logs.  Is there something specific I should be looking
for?  I suspect these relate:

[Sat Apr  2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
(over the window of the receiver) IN= OUT=bond0
SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020
WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
GID=12
[Sat Apr  2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
(over the window of the receiver) IN= OUT=bond0
SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689941170 ACK=4200412020
WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
GID=12

(There are many more of those, and the remote side is Google in this case)

>
> And then after a few hours, check to see if this blackholing behavior
> has been detected:
>   nstat -az | grep -i blackhole
> And see if TFO FastOpenActive attempts have been cut to a super-low rate:
>   nstat -az | grep -i fastopenactive

uriel [06:10:03] ~ # nstat -az | grep -i fastopen
TcpExtTCPFastOpenActive         0                  0.0
TcpExtTCPFastOpenActiveFail     3739               0.0
TcpExtTCPFastOpenPassive        0                  0.0
TcpExtTCPFastOpenPassiveFail    0                  0.0
TcpExtTCPFastOpenListenOverflow 0                  0.0
TcpExtTCPFastOpenCookieReqd     3378               0.0
TcpExtTCPFastOpenBlackhole      0                  0.0
TcpExtTCPFastOpenPassiveAltKey  0                  0.0

uriel [09:54:54] ~ # nstat -az | grep -i fastopen
TcpExtTCPFastOpenActive         0                  0.0
TcpExtTCPFastOpenActiveFail     3742               0.0
TcpExtTCPFastOpenPassive        0                  0.0
TcpExtTCPFastOpenPassiveFail    0                  0.0
TcpExtTCPFastOpenListenOverflow 0                  0.0
TcpExtTCPFastOpenCookieReqd     3391               0.0
TcpExtTCPFastOpenBlackhole      3                  0.0
TcpExtTCPFastOpenPassiveAltKey  0                  0.0
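
To make the change between the two snapshots above easier to see, here
is a small sketch (Python, not part of the original message) that parses
"nstat -az" style output and reports the counters that changed. The
parsing assumes the three-column layout shown above; parse_nstat and
diff_counters are illustrative names, and only the non-zero rows are
copied in.

```python
def parse_nstat(text: str) -> dict:
    """Map counter name -> value from 'name value rate' lines."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip headers or anything without a numeric second column.
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def diff_counters(old: dict, new: dict) -> dict:
    """Counters whose value changed, with the size of the change."""
    return {
        name: new[name] - old.get(name, 0)
        for name in new
        if new[name] != old.get(name, 0)
    }


SNAPSHOT_0610 = """\
TcpExtTCPFastOpenActive         0                  0.0
TcpExtTCPFastOpenActiveFail     3739               0.0
TcpExtTCPFastOpenCookieReqd     3378               0.0
TcpExtTCPFastOpenBlackhole      0                  0.0
"""

SNAPSHOT_0954 = """\
TcpExtTCPFastOpenActive         0                  0.0
TcpExtTCPFastOpenActiveFail     3742               0.0
TcpExtTCPFastOpenCookieReqd     3391               0.0
TcpExtTCPFastOpenBlackhole      3                  0.0
"""

delta = diff_counters(parse_nstat(SNAPSHOT_0610), parse_nstat(SNAPSHOT_0954))
print(delta)
```

Between the two snapshots this gives three more active TFO failures,
thirteen more cookie requests, and three blackhole detections.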

I'm fairly certain that strongly supports your theory.  So I *suspect*
the next test would be something like:

Disable the blackhole detection again, let the queue build up a few
minutes until we have something from Google.  Shut down exim so we can
isolate SMTP traffic.  tcpdump again, capturing the traffic, and
correlate the FW logs with the connection?

Kind Regards,
Jaco


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-02  8:42                       ` Jaco Kroon
@ 2022-04-02 13:20                         ` Eric Dumazet
  2022-04-02 22:02                           ` Jaco Kroon
  2022-04-02 14:14                         ` Florian Westphal
  2022-04-02 16:29                         ` Neal Cardwell
  2 siblings, 1 reply; 38+ messages in thread
From: Eric Dumazet @ 2022-04-02 13:20 UTC (permalink / raw)
  To: Jaco Kroon
  Cc: Neal Cardwell, Florian Westphal, LKML, Netdev, Yuchung Cheng,
	Wei Wang, Pablo Neira Ayuso, Sven Auhagen

On Sat, Apr 2, 2022 at 1:42 AM Jaco Kroon <jaco@uls.co.za> wrote:
>
> Hi Neal,
>
> On 2022/04/01 17:39, Neal Cardwell wrote:
> > On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote:
> > ...
> >> Connection setup:
> >>
> >> 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo  cookie f025dd84b6122510,nop,nop], length 0
> >>
> >> 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0
> >>
> >> 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp
> >>
> >> 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0
> >>
> >> This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus
> >> we shouldn't send segments larger than that, and they "can't".  I need to
> >> determine if this is some form of offloading or they really are sending >1500
> >> byte frames (which I know won't pass our firewalls without fragmentation so
> >> probably some form of NIC offloading - which if it was active on older 5.8
> >> kernels did not cause problems):
> > Jaco, was there some previous kernel version on these client machines
> > where this problem did not show up? Perhaps the v5.8 version you
> > mention here? Can you please share the exact version number?
> 5.8.14
> >
> > If so, a hypothesis would be:
> >
> > (1) There is a bug in netfilter's handling of TFO connections where
> > the server sends a data packet after a TFO SYNACK, before the client
> > ACKs anything (as we see in this trace).
> >
> > This bug is perhaps similar in character to the bug fixed by Yuchung's
> > 2013 commit that Eric mentioned:
> >
> > 356d7d88e088687b6578ca64601b0a2c9d145296
> > netfilter: nf_conntrack: fix tcp_in_window for Fast Open
> >
> > (2) With kernel v5.8, TFO blackhole detection detected that in your
> > workload there were TFO connections that died due to apparent
> > blackholing (like what's shown in the trace), and dynamically disabled
> > TFO on your machines. This allowed mail traffic to flow, because the
> > netfilter bug was no longer tickled. This worked around the netfilter
> > bug.
> >
> > (3) You upgraded your client-side machine from v5.8 to v5.17, which
> > has the following commit from v5.14, which disables TFO blackhole
> > logic by default:
> >   213ad73d0607 tcp: disable TFO blackhole logic by default
> >
> > (4) Due to (3), the blackhole detection logic was no longer operative,
> > and when the netfilter bug blackholed the connection, TFO stayed
> > enabled. This caused mail traffic to Google to stall.
> >
> > This hypothesis would explain why:
> >   o disabling TFO fixes this problem
> >   o you are seeing this with a newer kernel (and apparently not with a
> > kernel before v5.14?)
> Agreed.
> >
> > With this hypothesis, we need several pieces to trigger this:
> >
> > (a) client side software that tries TFO to a server that supports TFO
> > (like the exim mail transfer agent you are using, connecting to
> > Google)
> >
> > (b) a client-side Linux kernel running buggy netfilter code (you are
> > running netfilter)
> >
> > (c) a client-side Linux kernel with TFO support but no blackhole
> > detection logic active (e.g. v5.14 or later, like your v5.17.1)
> >
> > That's probably a rare combination, so would explain why we have not
> > had this report before.
> >
> > Jaco, to provide some evidence for this hypothesis, can you please
> > re-enable fastopen but also enable the TFO blackhole detection that
> > was disabled in v5.14 (213ad73d0607), with something like:
> >
> >   sysctl -w net.ipv4.tcp_fastopen=1
> >   sysctl -w tcp_fastopen_blackhole_timeout=3600
>
> Done.
>
> Including sysctl net.netfilter.nf_conntrack_log_invalid=6- which
> generates lots of logs, something specific I should be looking for?  I
> suspect these relate:
>
> [Sat Apr  2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
> (over the window of the receiver) IN= OUT=bond0
> SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
> DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
> FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020
> WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
> GID=12
> [Sat Apr  2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
> (over the window of the receiver) IN= OUT=bond0
> SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
> DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
> FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689941170 ACK=4200412020
> WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
> GID=12
>
> (There are many more of those, and the remote side is Google in this case)
>

Great. This confirms our suspicions.

Please try the following patch that landed in 5.18-rc:

f2dd495a8d589371289981d5ed33e6873df94ecc netfilter: nf_conntrack_tcp:
preserve liberal flag in tcp options

CC netfilter folks.

Condition triggering the bug:
   before(seq, sender->td_maxend + 1),

I took a look at the code, and it is not clear whether td_maxend is
set up properly (or whether td_scale is cleared at some point when it
should not be).

Alternatively, if conntrack does not know whether the connection is
using wscale (or what the scale is), the check
"before(seq, sender->td_maxend + 1)"
should not be evaluated/used.
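
For readers less familiar with this check, here is a small sketch
(Python, not kernel code) of the wraparound-safe before() comparison and
of how an underestimated td_maxend would reject a segment that is
actually in window. The sequence and window values are invented for
illustration, and the simplified seq_within_upper_bound helper is
hypothetical: it models only the single condition quoted above, not the
full tcp_in_window() logic.

```python
MOD = 1 << 32  # sequence numbers are 32-bit and wrap around


def before(seq1: int, seq2: int) -> bool:
    """True if seq1 precedes seq2 in 32-bit sequence space (wrap-safe)."""
    return ((seq1 - seq2) % MOD) >= (1 << 31)


def seq_within_upper_bound(seq: int, td_maxend: int) -> bool:
    """Model of the quoted condition: before(seq, td_maxend + 1)."""
    return before(seq, (td_maxend + 1) % MOD)


# Hypothetical numbers: the receiver advertised win=256 with wscale 8.
end = 1000                                  # highest sequence seen so far
raw_win, wscale = 256, 8
maxend_unscaled = end + raw_win             # scale factor lost or ignored
maxend_scaled = end + (raw_win << wscale)   # scale factor applied

seq = end + 3000  # a segment well inside the scaled window

print(seq_within_upper_bound(seq, maxend_unscaled))
print(seq_within_upper_bound(seq, maxend_scaled))
```

In this made-up case the segment fails the check against the unscaled
bound and passes against the scaled one, which is the kind of mismatch
that would produce "SEQ is over the upper bound" for valid data.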

Also, I do not see where td_maxend is extended in tcp_init_sender().

The patch below is probably wrong; it is only meant to point to the
code I do not understand yet.

diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 8ec55cd72572e0cca076631e2cc1c11f0c2b86f6..950082785d61b7a2768559c7500d3aee3aaea7c2 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -456,9 +456,10 @@ static void tcp_init_sender(struct ip_ct_tcp_state *sender,
        /* SYN-ACK in reply to a SYN
         * or SYN from reply direction in simultaneous open.
         */
-       sender->td_end =
-       sender->td_maxend = end;
-       sender->td_maxwin = (win == 0 ? 1 : win);
+       sender->td_end = end;
+       sender->td_maxwin = max(win, 1U);
+       /* WIN in SYN & SYNACK is not scaled */
+       sender->td_maxend = end + sender->td_maxwin;

        tcp_options(skb, dataoff, tcph, sender);
        /* RFC 1323:

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-02  8:42                       ` Jaco Kroon
  2022-04-02 13:20                         ` Eric Dumazet
@ 2022-04-02 14:14                         ` Florian Westphal
  2022-04-02 15:57                           ` Neal Cardwell
  2022-04-02 21:51                           ` Jaco Kroon
  2022-04-02 16:29                         ` Neal Cardwell
  2 siblings, 2 replies; 38+ messages in thread
From: Florian Westphal @ 2022-04-02 14:14 UTC (permalink / raw)
  To: Jaco Kroon
  Cc: Neal Cardwell, Florian Westphal, Eric Dumazet, LKML, Netdev,
	Yuchung Cheng, Wei Wang

Jaco Kroon <jaco@uls.co.za> wrote:
> Including sysctl net.netfilter.nf_conntrack_log_invalid=6- which
> generates lots of logs, something specific I should be looking for?  I
> suspect these relate:
> 
> [Sat Apr  2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
> (over the window of the receiver) IN= OUT=bond0
> SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
> DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
> FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020
> WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
> GID=12

I thought this had "liberal mode" enabled for tcp conntrack?
The above implies it's off.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-02 14:14                         ` Florian Westphal
@ 2022-04-02 15:57                           ` Neal Cardwell
  2022-04-02 21:51                           ` Jaco Kroon
  1 sibling, 0 replies; 38+ messages in thread
From: Neal Cardwell @ 2022-04-02 15:57 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Jaco Kroon, Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang

On Sat, Apr 2, 2022 at 10:14 AM Florian Westphal <fw@strlen.de> wrote:
>
> Jaco Kroon <jaco@uls.co.za> wrote:
> > Including sysctl net.netfilter.nf_conntrack_log_invalid=6- which
> > generates lots of logs, something specific I should be looking for?  I
> > suspect these relate:
> >
> > [Sat Apr  2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
> > (over the window of the receiver) IN= OUT=bond0
> > SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
> > DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
> > FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020
> > WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
> > GID=12
>
> I thought this had "liberal mode" enabled for tcp conntrack?
> The above implies its off.

Jaco's email said: "Our core firewalls already had
nf_conntrack_tcp_be_liberal". But this log is from the client machine
itself, not the core firewall machines. AFAICT it seems the client
machine does not have "liberal mode" enabled.

neal

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-02  8:42                       ` Jaco Kroon
  2022-04-02 13:20                         ` Eric Dumazet
  2022-04-02 14:14                         ` Florian Westphal
@ 2022-04-02 16:29                         ` Neal Cardwell
  2022-04-02 16:32                           ` Eric Dumazet
  2 siblings, 1 reply; 38+ messages in thread
From: Neal Cardwell @ 2022-04-02 16:29 UTC (permalink / raw)
  To: Jaco Kroon
  Cc: Florian Westphal, Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang

On Sat, Apr 2, 2022 at 4:42 AM Jaco Kroon <jaco@uls.co.za> wrote:
>
> Hi Neal,
>
> On 2022/04/01 17:39, Neal Cardwell wrote:
> > On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote:
> > ...
> >> Connection setup:
> >>
> >> 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo  cookie f025dd84b6122510,nop,nop], length 0
> >>
> >> 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0
> >>
> >> 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp
> >>
> >> 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0
> >>
> >> This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus
> >> we shouldn't send segments larger than that, and they "can't".  I need to
> >> determine if this is some form of offloading or they really are sending >1500
> >> byte frames (which I know won't pass our firewalls without fragmentation so
> >> probably some form of NIC offloading - which if it was active on older 5.8
> >> kernels did not cause problems):
> > Jaco, was there some previous kernel version on these client machines
> > where this problem did not show up? Perhaps the v5.8 version you
> > mention here? Can you please share the exact version number?
> 5.8.14

Thanks for the client kernel version! (5.8.14)

> > If so, a hypothesis would be:
> >
> > (1) There is a bug in netfilter's handling of TFO connections where
> > the server sends a data packet after a TFO SYNACK, before the client
> > ACKs anything (as we see in this trace).
> >
> > This bug is perhaps similar in character to the bug fixed by Yuchung's
> > 2013 commit that Eric mentioned:
> >
> > 356d7d88e088687b6578ca64601b0a2c9d145296
> > netfilter: nf_conntrack: fix tcp_in_window for Fast Open
> >
> > (2) With kernel v5.8, TFO blackhole detection detected that in your
> > workload there were TFO connections that died due to apparent
> > blackholing (like what's shown in the trace), and dynamically disabled
> > TFO on your machines. This allowed mail traffic to flow, because the
> > netfilter bug was no longer tickled. This worked around the netfilter
> > bug.
> >
> > (3) You upgraded your client-side machine from v5.8 to v5.17, which
> > has the following commit from v5.14, which disables TFO blackhole
> > logic by default:
> >   213ad73d0607 tcp: disable TFO blackhole logic by default
> >
> > (4) Due to (3), the blackhole detection logic was no longer operative,
> > and when the netfilter bug blackholed the connection, TFO stayed
> > enabled. This caused mail traffic to Google to stall.
> >
> > This hypothesis would explain why:
> >   o disabling TFO fixes this problem
> >   o you are seeing this with a newer kernel (and apparently not with a
> > kernel before v5.14?)
> Agreed.
> >
> > With this hypothesis, we need several pieces to trigger this:
> >
> > (a) client side software that tries TFO to a server that supports TFO
> > (like the exim mail transfer agent you are using, connecting to
> > Google)
> >
> > (b) a client-side Linux kernel running buggy netfilter code (you are
> > running netfilter)
> >
> > (c) a client-side Linux kernel with TFO support but no blackhole
> > detection logic active (e.g. v5.14 or later, like your v5.17.1)
> >
> > That's probably a rare combination, so would explain why we have not
> > had this report before.
> >
> > Jaco, to provide some evidence for this hypothesis, can you please
> > re-enable fastopen but also enable the TFO blackhole detection that
> > was disabled in v5.14 (213ad73d0607), with something like:
> >
> >   sysctl -w net.ipv4.tcp_fastopen=1
> >   sysctl -w tcp_fastopen_blackhole_timeout=3600
>
> Done.

Thanks for running that experiment and reporting your data! That was
super-informative. So it seems like we have a working high-level
theory about what's going on and where, and we just need to pinpoint
the buggy lines in the netfilter conntrack code running on the mail
client machines.

> Including sysctl net.netfilter.nf_conntrack_log_invalid=6- which
> generates lots of logs, something specific I should be looking for?  I
> suspect these relate:
>
> [Sat Apr  2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
> (over the window of the receiver) IN= OUT=bond0
> SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
> DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
> FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020
> WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
> GID=12
> [Sat Apr  2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
> (over the window of the receiver) IN= OUT=bond0
> SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
> DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
> FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689941170 ACK=4200412020
> WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
> GID=12
>
> (There are many more of those, and the remote side is Google in this case)

FWIW those log entries indicate netfilter on the mail client machine
dropping consecutive outbound skbs with 2*MSS of payload. So that
explains the large consecutive losses of client data packets to the
e-mail server. That seems to confirm my earlier hunch that those drops
of consecutive client data packets "do not look like normal congestive
packet loss".

neal

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-02 16:29                         ` Neal Cardwell
@ 2022-04-02 16:32                           ` Eric Dumazet
  2022-04-02 18:04                             ` Neal Cardwell
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Dumazet @ 2022-04-02 16:32 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: Jaco Kroon, Florian Westphal, LKML, Netdev, Yuchung Cheng, Wei Wang

On Sat, Apr 2, 2022 at 9:29 AM Neal Cardwell <ncardwell@google.com> wrote:
>
> FWIW those log entries indicate netfilter on the mail client machine
> dropping consecutive outbound skbs with 2*MSS of payload. So that
> explains the large consecutive losses of client data packets to the
> e-mail server. That seems to confirm my earlier hunch that those drops
> of consecutive client data packets "do not look like normal congestive
> packet loss".


This also explains why we have all these tiny 2-MSS packets in the pcap.

Under normal conditions, autocorking should kick in, allowing TCP to
build bigger TSO packets.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-02 16:32                           ` Eric Dumazet
@ 2022-04-02 18:04                             ` Neal Cardwell
  2022-04-06 13:58                               ` Florian Westphal
  0 siblings, 1 reply; 38+ messages in thread
From: Neal Cardwell @ 2022-04-02 18:04 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jaco Kroon, Florian Westphal, LKML, Netdev, Yuchung Cheng, Wei Wang

On Sat, Apr 2, 2022 at 12:32 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Sat, Apr 2, 2022 at 9:29 AM Neal Cardwell <ncardwell@google.com> wrote:
> >
> > FWIW those log entries indicate netfilter on the mail client machine
> > dropping consecutive outbound skbs with 2*MSS of payload. So that
> > explains the large consecutive losses of client data packets to the
> > e-mail server. That seems to confirm my earlier hunch that those drops
> > of consecutive client data packets "do not look like normal congestive
> > packet loss".
>
>
> This also explains why we have all these tiny 2-MSS packets in the pcap.
>
> Under normal conditions, autocorking should kick in, allowing TCP to
> build bigger TSO packets.

I have not looked at the conntrack code before today, but AFAICT this
is the buggy section of nf_conntrack_proto_tcp.c:

        } else if (((state->state == TCP_CONNTRACK_SYN_SENT
                     && dir == IP_CT_DIR_ORIGINAL)
                   || (state->state == TCP_CONNTRACK_SYN_RECV
                     && dir == IP_CT_DIR_REPLY))
                   && after(end, sender->td_end)) {
                /*
                 * RFC 793: "if a TCP is reinitialized ... then it need
                 * not wait at all; it must only be sure to use sequence
                 * numbers larger than those recently used."
                 */
                sender->td_end =
                sender->td_maxend = end;
                sender->td_maxwin = (win == 0 ? 1 : win);

                tcp_options(skb, dataoff, tcph, sender);

Note that the tcp_options() function implicitly assumes it is being
called on a SYN, because it sets state->td_scale to 0 and only sets
state->td_scale to something non-zero if it sees a wscale option. So
if we ever call that on an skb that's not a SYN, we will forget that
the connection is using the wscale option.

But at this point in the code it is calling tcp_options() without
first checking that this is a SYN.
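
To illustrate the point, here is a minimal user-space sketch of that
behaviour. It is not the kernel function: struct dir_state,
parse_options_model() and the OPT_* constants are hypothetical names, and
only the "reset to 0, then set if a wscale option is present" logic
described above is modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-direction conntrack state. */
struct dir_state {
	uint8_t td_scale;	/* window scale shift learned from the SYN */
};

#define OPT_EOL    0	/* end of option list */
#define OPT_NOP    1	/* padding */
#define OPT_WSCALE 3	/* window scale, total option length 3 */

/*
 * Simplified model: start from scale 0 and only store a non-zero
 * scale if a wscale option is found in the option bytes.
 */
static void parse_options_model(const uint8_t *opt, size_t len,
				struct dir_state *st)
{
	size_t i = 0;

	st->td_scale = 0;	/* any previously learned scale is forgotten */

	while (i < len) {
		uint8_t kind = opt[i];

		if (kind == OPT_EOL)
			break;
		if (kind == OPT_NOP) {
			i++;
			continue;
		}
		/* Every other option carries a length byte; stop if malformed. */
		if (i + 1 >= len || opt[i + 1] < 2 || i + opt[i + 1] > len)
			break;
		if (kind == OPT_WSCALE && opt[i + 1] == 3) {
			st->td_scale = opt[i + 2];
			if (st->td_scale > 14)	/* largest shift TCP allows */
				st->td_scale = 14;
		}
		i += opt[i + 1];
	}
}
```

Feeding this first the SYNACK options (mss, timestamps, wscale 8) and then
the options of a plain data packet (nop, nop, timestamps) leaves td_scale
at 0 after the second call, which is the "forgetting" described above.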

For a TFO scenario like the one in the trace, where the server
sends its first data packet after the SYNACK packet and before the
client's first ACK, presumably the conntrack state machine is
(correctly) SYN_RECV, and then (incorrectly) executes this code,
including the call to tcp_options(), on this first data packet, which
has no SYN bit, and no wscale option. Thus tcp_options() zeroes out
the server's sending state td_scale and does not set it to a non-zero
value. So now conntrack thinks the server is not using the wscale
option. So when conntrack interprets future receive windows from the
server, it does not scale them (with: win <<= sender->td_scale;), so
in this scenario the estimated right edge of the server's receive
window (td_maxend) is never advanced past the roughly 64KB value
offered in the SYN. Thus when the client sends data packets beyond
64KBytes, conntrack declares them invalid and drops them, due to
failing the condition Eric noted above:

   before(seq, sender->td_maxend + 1),

This explains my previous observation that the client's original data
packet transmissions are always dropped after the first 64KBytes.
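
The arithmetic can be sketched in user space as follows. This is only a
model under stated assumptions: advance_maxend() and seq_within_maxend()
are hypothetical helpers, not conntrack functions, and the numbers
(initial window 65535, later raw window 256, wscale 8) are taken from the
trace at the start of this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-safe sequence comparisons, same idea as the kernel's
 * before()/after(). The cast relies on the usual two's complement
 * conversion that common compilers provide.
 */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

static bool seq_after(uint32_t a, uint32_t b)
{
	return seq_before(b, a);
}

/*
 * Model of how the estimated right edge moves when an ACK arrives:
 * the advertised window is shifted by the remembered scale and added
 * to the ACK number; the edge only ever moves forward.
 */
static uint32_t advance_maxend(uint32_t td_maxend, uint32_t ack,
			       uint16_t raw_win, uint8_t td_scale)
{
	uint32_t edge = ack + ((uint32_t)raw_win << td_scale);

	return seq_after(edge, td_maxend) ? edge : td_maxend;
}

/* The check quoted above: before(seq, td_maxend + 1). */
static bool seq_within_maxend(uint32_t seq, uint32_t td_maxend)
{
	return seq_before(seq, td_maxend + 1);
}
```

With td_scale 8, a raw window of 256 is 65536 bytes and the edge keeps
moving as ACKs arrive. With td_scale wrongly reset to 0, the same window
counts as 256 bytes, so ack + 256 stays below the initial edge of
roughly 64KB and data sent beyond that point fails the check.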

Someone more familiar with conntrack may have a good idea about how to
best fix this?

neal

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-02 14:14                         ` Florian Westphal
  2022-04-02 15:57                           ` Neal Cardwell
@ 2022-04-02 21:51                           ` Jaco Kroon
  1 sibling, 0 replies; 38+ messages in thread
From: Jaco Kroon @ 2022-04-02 21:51 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Neal Cardwell, Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang

Hi Florian,

On 2022/04/02 16:14, Florian Westphal wrote:
> Jaco Kroon <jaco@uls.co.za> wrote:
>> Including sysctl net.netfilter.nf_conntrack_log_invalid=6, which
>> generates lots of logs, something specific I should be looking for?  I
>> suspect these relate:
>>
>> [Sat Apr  2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
>> (over the window of the receiver) IN= OUT=bond0
>> SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
>> DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
>> FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020
>> WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
>> GID=12
> I thought this had "liberal mode" enabled for tcp conntrack?
> The above implies it's off.

We have liberal on the core firewalls, not on the endpoints ... yes, we
do double firewall :).

So the firewalls into the subnets have liberal mode (which really was an
oversight when axing conntrackd), but the servers themselves do not.

Kind Regards,
Jaco


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-02 13:20                         ` Eric Dumazet
@ 2022-04-02 22:02                           ` Jaco Kroon
  0 siblings, 0 replies; 38+ messages in thread
From: Jaco Kroon @ 2022-04-02 22:02 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Neal Cardwell, Florian Westphal, LKML, Netdev, Yuchung Cheng,
	Wei Wang, Pablo Neira Ayuso, Sven Auhagen

Hi,

On 2022/04/02 15:20, Eric Dumazet wrote:

> Great. This confirms our suspicions.
>
> Please try the following patch that landed in 5.18-rc
>
> f2dd495a8d589371289981d5ed33e6873df94ecc netfilter: nf_conntrack_tcp:
> preserve liberal flag in tcp options

Will track this down and deploy in the next day or two.  Thank you, Neal
and Florian for all the assistance!

As an aside, I would really like to engage with someone who can assist with
the known congestion w.r.t. Google services in JHB, so if you're willing,
or can put me in contact with the right people, please do contact me
directly off-list (we've alleviated the issue by upgrading our IPT but
would like to understand what is going on; I can provide ticket references).


Kind Regards,
Jaco

>
> CC netfilter folks.
>
> Condition triggering the bug :
>    before(seq, sender->td_maxend + 1),
>
> I took a look at the code, and it is not clear if td_maxend is
> properly setup (or if td_scale is cleared at some point while it
> should not)
>
> Alternatively, if conntracking does not know if the connection is
> using wscale (or what is the scale), the "before(seq,
> sender->td_maxend + 1),"
> should not be evaluated/used.
>
> Also, I do not see where td_maxend is extended in tcp_init_sender()
>
> Probably wrong patch, just to point to the code I do not understand yet.
>
> diff --git a/net/netfilter/nf_conntrack_proto_tcp.c
> b/net/netfilter/nf_conntrack_proto_tcp.c
> index 8ec55cd72572e0cca076631e2cc1c11f0c2b86f6..950082785d61b7a2768559c7500d3aee3aaea7c2
> 100644
> --- a/net/netfilter/nf_conntrack_proto_tcp.c
> +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> @@ -456,9 +456,10 @@ static void tcp_init_sender(struct ip_ct_tcp_state *sender,
>         /* SYN-ACK in reply to a SYN
>          * or SYN from reply direction in simultaneous open.
>          */
> -       sender->td_end =
> -       sender->td_maxend = end;
> -       sender->td_maxwin = (win == 0 ? 1 : win);
> +       sender->td_end = end;
> +       sender->td_maxwin = max(win, 1U);
> +       /* WIN in SYN & SYNACK is not scaled */
> +       sender->td_maxend = end + sender->td_maxwin;
>
>         tcp_options(skb, dataoff, tcph, sender);
>         /* RFC 1323:

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-02 18:04                             ` Neal Cardwell
@ 2022-04-06 13:58                               ` Florian Westphal
  2022-04-06 19:04                                 ` Jozsef Kadlecsik
  0 siblings, 1 reply; 38+ messages in thread
From: Florian Westphal @ 2022-04-06 13:58 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: Eric Dumazet, Jaco Kroon, netfilter-devel, netdev, kadlec

Neal Cardwell <ncardwell@google.com> wrote:

[ trimmed CCs, add Jozsef and nf-devel ]

Neal, Eric, thanks for debugging this problem.

> On Sat, Apr 2, 2022 at 12:32 PM Eric Dumazet <edumazet@google.com> wrote:
> > On Sat, Apr 2, 2022 at 9:29 AM Neal Cardwell <ncardwell@google.com> wrote:
> > > FWIW those log entries indicate netfilter on the mail client machine
> > > dropping consecutive outbound skbs with 2*MSS of payload. So that
> > > explains the large consecutive losses of client data packets to the
> > > e-mail server. That seems to confirm my earlier hunch that those drops
> > > of consecutive client data packets "do not look like normal congestive
> > > packet loss".
> >
> > This also explains why we have all these tiny 2-MSS packets in the pcap.
> > Under normal conditions, autocorking should kick in, allowing TCP to
> > build bigger TSO packets.
> 
> I have not looked at the conntrack code before today, but AFAICT this
> is the buggy section of  nf_conntrack_proto_tcp.c:
> 
>         } else if (((state->state == TCP_CONNTRACK_SYN_SENT
>                      && dir == IP_CT_DIR_ORIGINAL)
>                    || (state->state == TCP_CONNTRACK_SYN_RECV
>                      && dir == IP_CT_DIR_REPLY))
>                    && after(end, sender->td_end)) {
>                 /*
>                  * RFC 793: "if a TCP is reinitialized ... then it need
>                  * not wait at all; it must only be sure to use sequence
>                  * numbers larger than those recently used."
>                  */
>                 sender->td_end =
>                 sender->td_maxend = end;
>                 sender->td_maxwin = (win == 0 ? 1 : win);
> 
>                 tcp_options(skb, dataoff, tcph, sender);
> 
> Note that the tcp_options() function implicitly assumes it is being
> called on a SYN, because it sets state->td_scale to 0 and only sets
> state->td_scale to something non-zero if it sees a wscale option. So
> if we ever call that on an skb that's not a SYN, we will forget that
> the connection is using the wscale option.
>
> But at this point in the code it is calling tcp_options() without
> first checking that this is a SYN.

Yes, that's the bug: tcp_options() must not be called if the SYN bit is
not set.

> For this TFO scenario like the one in the trace, where the server
> sends its first data packet after the SYNACK packet and before the
> client's first ACK, presumably the conntrack state machine is
> (correctly) SYN_RECV, and then (incorrectly) executes this code,

Right.  Jozsef, for context, the sequence in the trace is:

S > C Flags [S], seq 3451342529, win 62580, options [mss 8940,sackOK,TS val 331187616 ecr 0,nop,wscale 7,tfo [|tcp]>
C > S Flags [S.], seq 2699962254, ack 3451342530, win 65535, options [mss 1440,sackOK,TS val 1206542770 ecr 331187616,nop,wscale 8], length 0
C > S Flags [P.], seq 1:89, ack 1, win 256, options [nop,nop,TS val 1206542772 ecr 331187616], length 88: SMTP [|smtp]

Normally, 3rd packet would be S > C, but this one is C > S.

So, packet #3 hits the 'reinit' branch which zaps wscale option.
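
As a rough illustration, the two branch conditions can be compared in a
small user-space model. Everything here is a sketch: struct pkt, the
CT_*/DIR_* enums and the helper names are hypothetical, and only the
boolean conditions are modelled (the pre-fix one quoted above, and the
SYN-gated one from the diff further down), not the surrounding
tcp_in_window() logic.

```c
#include <stdbool.h>
#include <stdint.h>

enum ct_state { CT_SYN_SENT, CT_SYN_RECV, CT_ESTABLISHED };
enum ct_dir { DIR_ORIGINAL, DIR_REPLY };

/* Hypothetical summary of the packet fields the condition looks at. */
struct pkt {
	bool syn;		/* SYN flag set? */
	enum ct_dir dir;	/* direction relative to the first SYN */
	uint32_t end;		/* sequence number just past this segment */
};

/* Wraparound-safe "a is after b", same idea as the kernel's after(). */
static bool seq_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Pre-fix condition: no check that the packet is a SYN. */
static bool old_reinit(enum ct_state state, const struct pkt *p,
		       uint32_t td_end)
{
	return ((state == CT_SYN_SENT && p->dir == DIR_ORIGINAL) ||
		(state == CT_SYN_RECV && p->dir == DIR_REPLY)) &&
	       seq_after(p->end, td_end);
}

/* SYN-gated condition: data packets can no longer reach the re-init. */
static bool syn_gated_reinit(enum ct_state state, const struct pkt *p,
			     uint32_t td_end)
{
	return p->syn && seq_after(p->end, td_end) &&
	       (state == CT_SYN_SENT || state == CT_SYN_RECV);
}
```

For packet #3 (no SYN, reply direction, state SYN_RECV, end 88 bytes past
the SYNACK's end), old_reinit() is true and syn_gated_reinit() is false.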

> Someone more familiar with conntrack may have a good idea about how to
> best fix this?

Jozsef, does this look sane to you?
It fixes the TFO capture and still passes the test case I made for
82b72cb94666b3dbd7152bb9f441b068af7a921b
("netfilter: conntrack: re-init state for retransmitted syn-ack").

diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 8ec55cd72572..90ad1c0f23b1 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -556,33 +556,24 @@ static bool tcp_in_window(struct nf_conn *ct,
 			}
 
 		}
-	} else if (((state->state == TCP_CONNTRACK_SYN_SENT
-		     && dir == IP_CT_DIR_ORIGINAL)
-		   || (state->state == TCP_CONNTRACK_SYN_RECV
-		     && dir == IP_CT_DIR_REPLY))
-		   && after(end, sender->td_end)) {
+	} else if (tcph->syn &&
+		   after(end, sender->td_end) &&
+		   (state->state == TCP_CONNTRACK_SYN_SENT ||
+		    state->state == TCP_CONNTRACK_SYN_RECV)) {
 		/*
 		 * RFC 793: "if a TCP is reinitialized ... then it need
 		 * not wait at all; it must only be sure to use sequence
 		 * numbers larger than those recently used."
-		 */
-		sender->td_end =
-		sender->td_maxend = end;
-		sender->td_maxwin = (win == 0 ? 1 : win);
-
-		tcp_options(skb, dataoff, tcph, sender);
-	} else if (tcph->syn && dir == IP_CT_DIR_REPLY &&
-		   state->state == TCP_CONNTRACK_SYN_SENT) {
-		/* Retransmitted syn-ack, or syn (simultaneous open).
 		 *
+		 * also check for retransmitted syn-ack, or syn (simultaneous open).
 		 * Re-init state for this direction, just like for the first
 		 * syn(-ack) reply, it might differ in seq, ack or tcp options.
+		 *
+		 * Check for invalid syn-ack in original direction was already done.
 		 */
 		tcp_init_sender(sender, receiver,
 				skb, dataoff, tcph,
 				end, win);
-		if (!tcph->ack)
-			return true;
 	}
 
 	if (!(tcph->ack)) {

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-06 13:58                               ` Florian Westphal
@ 2022-04-06 19:04                                 ` Jozsef Kadlecsik
  2022-04-07 10:26                                   ` Florian Westphal
  0 siblings, 1 reply; 38+ messages in thread
From: Jozsef Kadlecsik @ 2022-04-06 19:04 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Neal Cardwell, Eric Dumazet, Jaco Kroon, netfilter-devel, netdev

Hi Florian,

On Wed, 6 Apr 2022, Florian Westphal wrote:

> Neal Cardwell <ncardwell@google.com> wrote:
> 
> [ trimmed CCs, add Jozsef and nf-devel ]
> 
> Neal, Eric, thanks for debugging this problem.
> 
> > On Sat, Apr 2, 2022 at 12:32 PM Eric Dumazet <edumazet@google.com> wrote:
> > > On Sat, Apr 2, 2022 at 9:29 AM Neal Cardwell <ncardwell@google.com> wrote:
> > > > FWIW those log entries indicate netfilter on the mail client machine
> > > > dropping consecutive outbound skbs with 2*MSS of payload. So that
> > > > explains the large consecutive losses of client data packets to the
> > > > e-mail server. That seems to confirm my earlier hunch that those drops
> > > > of consecutive client data packets "do not look like normal congestive
> > > > packet loss".
> > >
> > > This also explains why we have all these tiny 2-MSS packets in the pcap.
> > > Under normal conditions, autocorking should kick in, allowing TCP to
> > > build bigger TSO packets.
> > 
> > I have not looked at the conntrack code before today, but AFAICT this
> > is the buggy section of  nf_conntrack_proto_tcp.c:
> > 
> >         } else if (((state->state == TCP_CONNTRACK_SYN_SENT
> >                      && dir == IP_CT_DIR_ORIGINAL)
> >                    || (state->state == TCP_CONNTRACK_SYN_RECV
> >                      && dir == IP_CT_DIR_REPLY))
> >                    && after(end, sender->td_end)) {
> >                 /*
> >                  * RFC 793: "if a TCP is reinitialized ... then it need
> >                  * not wait at all; it must only be sure to use sequence
> >                  * numbers larger than those recently used."
> >                  */
> >                 sender->td_end =
> >                 sender->td_maxend = end;
> >                 sender->td_maxwin = (win == 0 ? 1 : win);
> > 
> >                 tcp_options(skb, dataoff, tcph, sender);
> > 
> > Note that the tcp_options() function implicitly assumes it is being
> > called on a SYN, because it sets state->td_scale to 0 and only sets
> > state->td_scale to something non-zero if it sees a wscale option. So
> > if we ever call that on an skb that's not a SYN, we will forget that
> > the connection is using the wscale option.
> >
> > But at this point in the code it is calling tcp_options() without
> > first checking that this is a SYN.
> 
> Yes, that's the bug: tcp_options() must not be called if the SYN bit is
> not set.
> 
> > For this TFO scenario like the one in the trace, where the server
> > sends its first data packet after the SYNACK packet and before the
> > client's first ACK, presumably the conntrack state machine is
> > (correctly) SYN_RECV, and then (incorrectly) executes this code,
> 
> Right.  Jozsef, for context, the sequence in the trace is:
> 
> S > C Flags [S], seq 3451342529, win 62580, options [mss 8940,sackOK,TS val 331187616 ecr 0,nop,wscale 7,tfo [|tcp]>
> C > S Flags [S.], seq 2699962254, ack 3451342530, win 65535, options [mss 1440,sackOK,TS val 1206542770 ecr 331187616,nop,wscale 8], length 0
> C > S Flags [P.], seq 1:89, ack 1, win 256, options [nop,nop,TS val 1206542772 ecr 331187616], length 88: SMTP [|smtp]
> 
> Normally, 3rd packet would be S > C, but this one is C > S.
> 
> So, packet #3 hits the 'reinit' branch which zaps wscale option.
> 
> > Someone more familiar with conntrack may have a good idea about how to
> > best fix this?
> 
> Jozsef, does this look sane to you?
> It fixes the TFO capture and still passes the test case I made for
> 82b72cb94666b3dbd7152bb9f441b068af7a921b
> ("netfilter: conntrack: re-init state for retransmitted syn-ack").

As far as I see it'd break simultaneous open because after(end, 
sender->td_end) is called in the new condition:

> diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
> index 8ec55cd72572..90ad1c0f23b1 100644
> --- a/net/netfilter/nf_conntrack_proto_tcp.c
> +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> @@ -556,33 +556,24 @@ static bool tcp_in_window(struct nf_conn *ct,
>  			}
>  
>  		}
> -	} else if (((state->state == TCP_CONNTRACK_SYN_SENT
> -		     && dir == IP_CT_DIR_ORIGINAL)
> -		   || (state->state == TCP_CONNTRACK_SYN_RECV
> -		     && dir == IP_CT_DIR_REPLY))
> -		   && after(end, sender->td_end)) {
> +	} else if (tcph->syn &&
> +		   after(end, sender->td_end) &&
> +		   (state->state == TCP_CONNTRACK_SYN_SENT ||
> +		    state->state == TCP_CONNTRACK_SYN_RECV)) {
>  		/*
>  		 * RFC 793: "if a TCP is reinitialized ... then it need
>  		 * not wait at all; it must only be sure to use sequence
>  		 * numbers larger than those recently used."
> -		 */
> -		sender->td_end =
> -		sender->td_maxend = end;
> -		sender->td_maxwin = (win == 0 ? 1 : win);
> -
> -		tcp_options(skb, dataoff, tcph, sender);
> -	} else if (tcph->syn && dir == IP_CT_DIR_REPLY &&
> -		   state->state == TCP_CONNTRACK_SYN_SENT) {
> -		/* Retransmitted syn-ack, or syn (simultaneous open).
>  		 *
> +		 * also check for retransmitted syn-ack, or syn (simultaneous open).
>  		 * Re-init state for this direction, just like for the first
>  		 * syn(-ack) reply, it might differ in seq, ack or tcp options.
> +		 *
> +		 * Check for invalid syn-ack in original direction was already done.
>  		 */
>  		tcp_init_sender(sender, receiver,
>  				skb, dataoff, tcph,
>  				end, win);
> -		if (!tcph->ack)
> -			return true;
>  	}
>  
>  	if (!(tcph->ack)) {
> 

I'd merge the two conditions so that it'd cover both original condition 
branches:

diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 8ec55cd72572..87375ce2f995 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -556,33 +556,26 @@ static bool tcp_in_window(struct nf_conn *ct,
 			}
 
 		}
-	} else if (((state->state == TCP_CONNTRACK_SYN_SENT
-		     && dir == IP_CT_DIR_ORIGINAL)
-		   || (state->state == TCP_CONNTRACK_SYN_RECV
-		     && dir == IP_CT_DIR_REPLY))
-		   && after(end, sender->td_end)) {
+	} else if (tcph->syn &&
+		   ((after(end, sender->td_end) &&
+		     (state->state == TCP_CONNTRACK_SYN_SENT ||
+		      state->state == TCP_CONNTRACK_SYN_RECV)) ||
+		    (dir == IP_CT_DIR_REPLY &&
+		     state->state == TCP_CONNTRACK_SYN_SENT))) {
 		/*
 		 * RFC 793: "if a TCP is reinitialized ... then it need
 		 * not wait at all; it must only be sure to use sequence
 		 * numbers larger than those recently used."
-		 */
-		sender->td_end =
-		sender->td_maxend = end;
-		sender->td_maxwin = (win == 0 ? 1 : win);
-
-		tcp_options(skb, dataoff, tcph, sender);
-	} else if (tcph->syn && dir == IP_CT_DIR_REPLY &&
-		   state->state == TCP_CONNTRACK_SYN_SENT) {
-		/* Retransmitted syn-ack, or syn (simultaneous open).
 		 *
+		 * also check for retransmitted syn-ack, or syn (simultaneous open).
 		 * Re-init state for this direction, just like for the first
 		 * syn(-ack) reply, it might differ in seq, ack or tcp options.
+		 *
+		 * Check for invalid syn-ack in original direction was already done.
 		 */
 		tcp_init_sender(sender, receiver,
 				skb, dataoff, tcph,
 				end, win);
-		if (!tcph->ack)
-			return true;
 	}
 
 	if (!(tcph->ack)) {

What do you think?

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-06 19:04                                 ` Jozsef Kadlecsik
@ 2022-04-07 10:26                                   ` Florian Westphal
  2022-04-07 12:48                                     ` Jozsef Kadlecsik
  0 siblings, 1 reply; 38+ messages in thread
From: Florian Westphal @ 2022-04-07 10:26 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: Florian Westphal, Neal Cardwell, Eric Dumazet, Jaco Kroon,
	netfilter-devel, netdev

Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> I'd merge the two conditions so that it'd cover both original condition 
> branches:
> 
> diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
> index 8ec55cd72572..87375ce2f995 100644
> --- a/net/netfilter/nf_conntrack_proto_tcp.c
> +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> @@ -556,33 +556,26 @@ static bool tcp_in_window(struct nf_conn *ct,
>  			}
>  
>  		}
> -	} else if (((state->state == TCP_CONNTRACK_SYN_SENT
> -		     && dir == IP_CT_DIR_ORIGINAL)
> -		   || (state->state == TCP_CONNTRACK_SYN_RECV
> -		     && dir == IP_CT_DIR_REPLY))
> -		   && after(end, sender->td_end)) {
> +	} else if (tcph->syn &&
> +		   ((after(end, sender->td_end) &&
> +		     (state->state == TCP_CONNTRACK_SYN_SENT ||
> +		      state->state == TCP_CONNTRACK_SYN_RECV)) ||
> +		    (dir == IP_CT_DIR_REPLY &&
> +		     state->state == TCP_CONNTRACK_SYN_SENT))) {

That's what I did as well: I merged the two branches, but I made the
2nd clause stricter to also consider the after() test; it would no
longer re-init for syn-acks when the sequence did not advance.

Then, dir == IP_CT_DIR_REPLY && state == SYN_SENT is already covered
by earlier test and can be elided.

I'm fine with your version though, will you submit a patch?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-07 10:26                                   ` Florian Westphal
@ 2022-04-07 12:48                                     ` Jozsef Kadlecsik
  2022-04-21 21:14                                       ` Eric Dumazet
  0 siblings, 1 reply; 38+ messages in thread
From: Jozsef Kadlecsik @ 2022-04-07 12:48 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Neal Cardwell, Eric Dumazet, Jaco Kroon, netfilter-devel, netdev

On Thu, 7 Apr 2022, Florian Westphal wrote:

> Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > I'd merge the two conditions so that it'd cover both original condition 
> > branches:
> > 
> > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
> > index 8ec55cd72572..87375ce2f995 100644
> > --- a/net/netfilter/nf_conntrack_proto_tcp.c
> > +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> > @@ -556,33 +556,26 @@ static bool tcp_in_window(struct nf_conn *ct,
> >  			}
> >  
> >  		}
> > -	} else if (((state->state == TCP_CONNTRACK_SYN_SENT
> > -		     && dir == IP_CT_DIR_ORIGINAL)
> > -		   || (state->state == TCP_CONNTRACK_SYN_RECV
> > -		     && dir == IP_CT_DIR_REPLY))
> > -		   && after(end, sender->td_end)) {
> > +	} else if (tcph->syn &&
> > +		   ((after(end, sender->td_end) &&
> > +		     (state->state == TCP_CONNTRACK_SYN_SENT ||
> > +		      state->state == TCP_CONNTRACK_SYN_RECV)) ||
> > +		    (dir == IP_CT_DIR_REPLY &&
> > +		     state->state == TCP_CONNTRACK_SYN_SENT))) {
> 
> That's what I did as well: I merged the two branches, but I made the
> 2nd clause stricter to also consider the after() test; it would no
> longer re-init for syn-acks when the sequence did not advance.

That's perfectly fine.

But what about simultaneous syn? The TCP state is zeroed in the REPLY 
direction, so the after() test can easily be false and the state wouldn't 
be picked up. Therefore I extended the condition.
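
A small user-space sketch of why the after() test is unreliable against a
zeroed td_end, and what the extra clause changes. The helper names and
enums are hypothetical; the two conditions are modelled on the two diffs
in this thread, assuming td_end is 0 as described above.

```c
#include <stdbool.h>
#include <stdint.h>

enum ct_state { CT_SYN_SENT, CT_SYN_RECV };
enum ct_dir { DIR_ORIGINAL, DIR_REPLY };

/* Wraparound-safe "a is after b", same idea as the kernel's after(). */
static bool seq_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Stricter variant: every re-init requires the after() test to pass. */
static bool reinit_strict(bool syn, enum ct_state state, enum ct_dir dir,
			  uint32_t end, uint32_t td_end)
{
	(void)dir;	/* direction is not consulted in this variant */
	return syn && seq_after(end, td_end) &&
	       (state == CT_SYN_SENT || state == CT_SYN_RECV);
}

/* Extended variant: a reply-direction SYN in SYN_SENT always re-inits. */
static bool reinit_extended(bool syn, enum ct_state state, enum ct_dir dir,
			    uint32_t end, uint32_t td_end)
{
	return syn &&
	       ((seq_after(end, td_end) &&
		 (state == CT_SYN_SENT || state == CT_SYN_RECV)) ||
		(dir == DIR_REPLY && state == CT_SYN_SENT));
}
```

Against td_end == 0, seq_after() is true only for end values in the lower
half of the sequence space, so it depends on the peer's initial sequence
number. For example 726465676 passes, while 2699962255 (above 2^31) does
not; in the second case only the extended variant picks the SYN up.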

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-07 12:48                                     ` Jozsef Kadlecsik
@ 2022-04-21 21:14                                       ` Eric Dumazet
  2022-04-25  9:29                                         ` Florian Westphal
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Dumazet @ 2022-04-21 21:14 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: Florian Westphal, Neal Cardwell, Jaco Kroon, netfilter-devel, netdev

On Thu, Apr 7, 2022 at 5:48 AM Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
>
> On Thu, 7 Apr 2022, Florian Westphal wrote:
>
> > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > I'd merge the two conditions so that it'd cover both original condition
> > > branches:
> > >
> > > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
> > > index 8ec55cd72572..87375ce2f995 100644
> > > --- a/net/netfilter/nf_conntrack_proto_tcp.c
> > > +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> > > @@ -556,33 +556,26 @@ static bool tcp_in_window(struct nf_conn *ct,
> > >                     }
> > >
> > >             }
> > > -   } else if (((state->state == TCP_CONNTRACK_SYN_SENT
> > > -                && dir == IP_CT_DIR_ORIGINAL)
> > > -              || (state->state == TCP_CONNTRACK_SYN_RECV
> > > -                && dir == IP_CT_DIR_REPLY))
> > > -              && after(end, sender->td_end)) {
> > > +   } else if (tcph->syn &&
> > > +              ((after(end, sender->td_end) &&
> > > +                (state->state == TCP_CONNTRACK_SYN_SENT ||
> > > +                 state->state == TCP_CONNTRACK_SYN_RECV)) ||
> > > +               (dir == IP_CT_DIR_REPLY &&
> > > +                state->state == TCP_CONNTRACK_SYN_SENT))) {
> >
> > That's what I did as well: I merged the two branches, but I made the
> > 2nd clause stricter to also consider the after() test; it would no
> > longer re-init for syn-acks when the sequence did not advance.
>
> That's perfectly fine.
>
> But what about simultaneous syn? The TCP state is zeroed in the REPLY
> direction, so the after() test can easily be false and the state wouldn't
> be picked up. Therefore I extended the condition.
>

Hi Jozsef and Florian

Any updates for this issue ?

Thanks !

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-21 21:14                                       ` Eric Dumazet
@ 2022-04-25  9:29                                         ` Florian Westphal
  0 siblings, 0 replies; 38+ messages in thread
From: Florian Westphal @ 2022-04-25  9:29 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jozsef Kadlecsik, Florian Westphal, Neal Cardwell, Jaco Kroon,
	netfilter-devel, netdev

Eric Dumazet <edumazet@google.com> wrote:
> Hi Jozsef and Florian
> 
> Any updates for this issue ?

Sorry, I was away for a while.  I will send the patch formally in a few
minutes.

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2022-04-25  9:30 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
2022-03-30  0:56 linux 5.17.1 disregarding ACK values resulting in stalled TCP connections Jaco
2022-03-30  2:01 ` Neal Cardwell
2022-03-30  2:40   ` Eric Dumazet
2022-03-30  2:58   ` Jaco Kroon
2022-03-30  3:48     ` Eric Dumazet
2022-03-30  6:22       ` Jaco Kroon
2022-03-30 13:56         ` Neal Cardwell
2022-03-30 15:00           ` Jaco Kroon
2022-03-30 16:19             ` Eric Dumazet
2022-03-31 15:41               ` Neal Cardwell
2022-03-31 23:06                 ` Jaco Kroon
2022-04-01  0:10                   ` Eric Dumazet
2022-04-01  0:15                     ` Florian Westphal
2022-04-01 11:54                       ` Jaco Kroon
2022-04-01 12:09                         ` Florian Westphal
2022-04-01  0:33                     ` Jaco Kroon
2022-04-01  0:41                       ` Eric Dumazet
2022-04-01  0:54                         ` Eric Dumazet
2022-04-01 11:36                           ` Jaco Kroon
2022-04-01 13:54                             ` Eric Dumazet
2022-04-01 14:50                   ` Neal Cardwell
2022-04-01 15:39                     ` Neal Cardwell
2022-04-01 15:48                       ` Neal Cardwell
2022-04-02  8:42                       ` Jaco Kroon
2022-04-02 13:20                         ` Eric Dumazet
2022-04-02 22:02                           ` Jaco Kroon
2022-04-02 14:14                         ` Florian Westphal
2022-04-02 15:57                           ` Neal Cardwell
2022-04-02 21:51                           ` Jaco Kroon
2022-04-02 16:29                         ` Neal Cardwell
2022-04-02 16:32                           ` Eric Dumazet
2022-04-02 18:04                             ` Neal Cardwell
2022-04-06 13:58                               ` Florian Westphal
2022-04-06 19:04                                 ` Jozsef Kadlecsik
2022-04-07 10:26                                   ` Florian Westphal
2022-04-07 12:48                                     ` Jozsef Kadlecsik
2022-04-21 21:14                                       ` Eric Dumazet
2022-04-25  9:29                                         ` Florian Westphal
