* linux 5.17.1 disregarding ACK values resulting in stalled TCP connections

From: Jaco @ 2022-03-30 0:56 UTC
To: LKML

[-- Attachment #1: Type: text/plain, Size: 9844 bytes --]

Dear All,

I'm seeing very strange TCP behaviour. I disabled TCP Segmentation Offload to try to pinpoint this more closely. It seems the kernel is ignoring ACKs coming from the remote side in some cases. In this case it happens on one of four hosts, and seemingly only between this one host and Google. (We have two emails to Google stuck on another host due to the same issue, but several hundred others went out today on that same host.) I also disabled selective ACKs as a test, as these are known to sometimes cause issues for firewalls and "TCP accelerators" (or used to, at the very least).

This is an SMTP connection between ourselves and Google. I'm going to be selective in copying from tcpdump (the full conversation, up to the point where I killed it because it plainly got stuck in a loop, is attached).

Connection setup:

00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo cookie f025dd84b6122510,nop,nop], length 0
00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0
00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp
00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0

This is pretty normal: we advertise an MSS of 8940 and the return is 1440, thus we shouldn't send segments larger than that, and they "can't". I need to determine whether this is some form of offloading or whether they really are sending >1500 byte frames (which I know won't pass our firewalls without fragmentation, so it is probably some form of NIC offloading, which, if it was active on older 5.8 kernels, did not cause problems):

00:56:17.709905 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465979:726468395, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP
00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726468395:726470811, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP

These are the only two frames I can find that supposedly exceed the MSS values (although they don't exceed our value). Then everything goes pretty normally for a bit.
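
As a side note on the segment sizes above, here is a minimal sketch of the arithmetic. It assumes, without confirmation from the capture, that each 2416-byte frame is two wire segments merged by receive offload on our side. The 12-byte option size is what tcpdump shows on every data segment (nop,nop,TS). The 1280-byte path MTU is purely a guess that happens to fit the numbers, and the function names are made up for illustration.

```python
# Sizes in bytes. The IPv6 and TCP base headers are fixed; the option size
# is what the capture shows on each data segment: nop + nop + timestamp.
IPV6_HEADER = 40
TCP_HEADER = 20
TS_OPTION = 1 + 1 + 10


def payload_per_segment(mss: int, tcp_options: int = TS_OPTION) -> int:
    """Payload that fits in one segment; TCP options eat into the MSS."""
    return mss - tcp_options


def payload_for_mtu(mtu: int, tcp_options: int = TS_OPTION) -> int:
    """Payload per segment for a given IPv6 MTU (an assumed value here)."""
    return mtu - IPV6_HEADER - TCP_HEADER - tcp_options


# Our outgoing segments: the peer advertised an MSS of 1440.
ours = payload_per_segment(1440)
print("our payload per segment:", ours)

# ASSUMPTION: the sender uses the IPv6 minimum MTU of 1280 on this path.
theirs = payload_for_mtu(1280)
print("payload at MTU 1280:", theirs, "two merged:", 2 * theirs)
```

Under those assumptions the 1428-byte segments we send are exactly what an MSS of 1440 allows, and 2416 is exactly two 1208-byte payloads, which would fit the offloading explanation better than a real oversized frame.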

The last data we receive from the remote side before stuff goes wrong:

00:56:18.088725 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471823:726471919, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 96: SMTP

We ACK immediately along with the next segment:

00:56:18.088969 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956634348:956635776, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP

Hereafter there is a flurry of data that we transmit, all nicely acknowledged, with no retransmits that I can pick up (by eyeballing it). Before a long sequence of TX data we get this ACK:

00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956700036, win 774, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0

We then continue to TX a sequence of:

00:56:18.576300 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956745732:956747160, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP

up to:

00:56:18.577031 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956778576:956780004, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP

Before we hit our first retransmit:

00:56:18.960078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687707386 ecr 3477430577], length 1428: SMTP

Since 956700036 is the last ACKed data, this seems correct. I'm not sure what timer this is based on, though: the ACK for the data just prior came in ~384ms earlier (it could be based on the normal time to ACK, I don't know; this is about double the usual round-trip time currently).
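
To put numbers on the timing remark above, here is a small sketch that subtracts tcpdump timestamps copied from the capture. Treating the SYN to SYN-ACK gap as "the" round-trip time is an assumption (it is a single sample), and the helper name is made up for illustration. The sketch does not claim to identify which kernel timer fired.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M:%S.%f"


def seconds_between(start: str, end: str) -> float:
    """Difference between two tcpdump timestamps from the same day."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return delta.total_seconds()


# SYN sent -> SYN-ACK received: one RTT sample (ASSUMED to be typical).
handshake_rtt = seconds_between("00:56:17.055481", "00:56:17.217747")

# Last ACK (956700036) received -> first retransmit of that sequence number.
retransmit_gap = seconds_between("00:56:18.576247", "00:56:18.960078")

print(f"handshake RTT    : {handshake_rtt * 1000:.1f} ms")
print(f"ACK -> retransmit: {retransmit_gap * 1000:.1f} ms")
print(f"ratio            : {retransmit_gap / handshake_rtt:.2f}")
```

The same helper can be pointed at any other pair of timestamps in the attached capture.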

And then we receive this ACK (we can see that this time the kernel waited for the ACK of this single segment):

00:56:19.126678 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956701464, win 785, options [nop,nop,TS val 3477431127 ecr 3687707386], length 0

Then we do something (in my opinion) strange by jumping back to the tail of the previous burst:

00:56:19.126735 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956780004:956781432, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP
00:56:19.126751 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956781432:956782860, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP

We then jump back and retransmit again from the just received ACK:

00:56:19.510078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687707936 ecr 3477431127], length 1428: SMTP

We then continue from there on as I'd expect (slow restart). This goes pretty normally up to:

00:56:19.997088 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956708604, win 841, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0
00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP
00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0

Up to here is fine. Now things get bizarre: we just jump to a different sequence number, which has already been ACKed:

00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq *956707176*:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP
00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0

The remote side re-ACKs the 956710032 value, which frankly indicates that we need to realize the data we are transmitting has already been received, and that we can continue on to transmit the segments following sequence number 956710032. Instead we choose to get stuck in this sequence:

00:56:21.180080 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687709606 ecr 3477431998], length 1428: SMTP
00:56:21.342347 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477433343 ecr 3687708423], length 0
00:56:22.780101 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687711206 ecr 3477431998], length 1428: SMTP
00:56:22.942346 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477434943 ecr 3687708423], length 0

And here the connection dies. It eventually times out, and we retry to the next host, resulting in the same problem. I am aware that Google is having congestion issues in the JHB area in SA currently, and there are probably packet delays and losses somewhere along the line between us, but this really should not stall as dead as it does here. Looking at only the incoming ACK values, I can see they never decrease, and we've never received an ACK > 956710032, but this is still greater than the value we are retransmitting.
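
For the stuck sequence above, the spacing of the three retransmissions of 956707176 can be checked with a short sketch. The timestamps are copied from the capture and the function name is made up for illustration. Reading the doubling as exponential backoff of the retransmission timer is an interpretation, not something the capture proves.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M:%S.%f"

# Send times of the repeated segment seq 956707176:956708604.
RETRANSMITS = ["00:56:20.380076", "00:56:21.180080", "00:56:22.780101"]


def intervals(stamps):
    """Gaps in seconds between consecutive tcpdump timestamps."""
    times = [datetime.strptime(s, TIME_FORMAT) for s in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


gaps = intervals(RETRANSMITS)
for gap in gaps:
    print(f"gap: {gap:.6f} s")

# A ratio close to 2 between successive gaps is what a timer that doubles
# on every expiry would produce.
for earlier, later in zip(gaps, gaps[1:]):
    print(f"ratio: {later / earlier:.3f}")
```

If that reading is right, the sender keeps treating the segment as unacknowledged and backs off each time, even though every returning ACK already covers it.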

The first time we transmitted the frame at sequence number 956707176, it was part of the longest sequence of TX frames without a returning ACK, as part of this sequence:

...
00:56:18.414299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414302 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
00:56:18.414316 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
...

Google here is ACKing not only the frame we are continuously retransmitting, but also the frame directly after ... so why would the kernel not move on to retransmitting starting from sequence number 956710032 (which is larger than the start sequence number of the frame we are retransmitting)?
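
The check described above (is the segment we retransmit already covered by an ACK we received?) can be run mechanically over the capture. The following is only a sketch: the regular expression handles just the tcpdump line shapes seen in this capture, sequence number wraparound is ignored, and the function and constant names are made up for illustration. The four sample lines are copied from the capture.

```python
import re

# Our side of the connection, as printed by tcpdump (address.port).
LOCAL = "2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110"

# Matches lines that carry an ACK number; the seq range is optional.
LINE = re.compile(
    r"^(?P<time>\S+) IP6 (?P<src>\S+) > (?P<dst>\S+): Flags \[[^\]]*\],"
    r"(?: seq (?P<start>\d+):(?P<end>\d+),)? ack (?P<ack>\d+),"
)

CAPTURE = """\
00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP
00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0
00:56:21.180080 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687709606 ecr 3477431998], length 1428: SMTP
00:56:21.342347 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477433343 ecr 3687708423], length 0
"""


def already_acked(lines, local=LOCAL):
    """Return local data segments sent entirely below the highest ACK seen.

    Each hit is (time, seq_start, seq_end, highest_ack_so_far).
    ASSUMPTION: sequence numbers do not wrap within the capture.
    """
    highest_ack = None
    hits = []
    for line in lines:
        match = LINE.match(line)
        if match is None:
            continue  # SYN without ACK, or an unrecognised line
        if match["src"] == local:
            end = match["end"]
            if end is not None and highest_ack is not None and int(end) <= highest_ack:
                hits.append((match["time"], int(match["start"]), int(end), highest_ack))
        else:
            ack = int(match["ack"])
            highest_ack = ack if highest_ack is None else max(highest_ack, ack)
    return hits


for hit in already_acked(CAPTURE.splitlines()):
    print("already ACKed segment sent again:", hit)
```

Feeding the whole attached capture through the same function, instead of the four sample lines, should list every retransmission that falls below an ACK already received.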

Kind Regards,
Jaco

[-- Attachment #2: iewc_google.txt --]
[-- Type: text/plain, Size: 38985 bytes --]

00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo cookie f025dd84b6122510,nop,nop], length 0 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0 00:56:17.218754 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633780:956633803, ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 23: SMTP: EHLO uriel.iewc.co.za 00:56:17.380021 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956633803, win 256, options [nop,nop,TS val 3477429381 ecr 3687705645], length 0 00:56:17.383685 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465760:726465949, ack 956633803, win 256, options [nop,nop,TS val 3477429384 ecr 3687705645], length 189: SMTP: 250-mx.google.com at your service, [2c0f:f720:0:3:d6ae:52ff:feb8:f27b] 00:56:17.383714 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465949, win 488, options [nop,nop,TS val 3687705810 ecr 3477429384], length 0 00:56:17.383934 IP6
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633803:956633813, ack 726465949, win 488, options [nop,nop,TS val 3687705810 ecr 3477429384], length 10: SMTP: STARTTLS 00:56:17.546391 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465949:726465979, ack 956633813, win 256, options [nop,nop,TS val 3477429547 ecr 3687705810], length 30: SMTP: 220 2.0.0 Ready to start TLS 00:56:17.546430 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465979, win 488, options [nop,nop,TS val 3687705973 ecr 3477429547], length 0 00:56:17.547288 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633813:956634111, ack 726465979, win 488, options [nop,nop,TS val 3687705973 ecr 3477429547], length 298: SMTP 00:56:17.709905 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465979:726468395, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP 00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726468395:726470811, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP 00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726470811:726470898, ack 956634111, win 261, options [nop,nop,TS val 3477429711 ecr 3687705973], length 87: SMTP 00:56:17.709949 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726468395, win 470, options [nop,nop,TS val 3687706136 ecr 3477429710], length 0 00:56:17.709964 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726470811, win 452, options [nop,nop,TS val 3687706136 ecr 3477429710], length 0 00:56:17.709978 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 
726470898, win 452, options [nop,nop,TS val 3687706136 ecr 3477429711], length 0 00:56:17.712221 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634111:956634191, ack 726470898, win 452, options [nop,nop,TS val 3687706138 ecr 3477429711], length 80: SMTP 00:56:17.712353 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634191:956634236, ack 726470898, win 452, options [nop,nop,TS val 3687706138 ecr 3477429711], length 45: SMTP 00:56:17.874774 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956634236, win 261, options [nop,nop,TS val 3477429875 ecr 3687706138], length 0 00:56:17.875264 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726470898:726471633, ack 956634236, win 261, options [nop,nop,TS val 3477429876 ecr 3687706138], length 735: SMTP 00:56:17.904288 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634236:956634348, ack 726471633, win 447, options [nop,nop,TS val 3687706330 ecr 3477429876], length 112: SMTP 00:56:18.066936 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471633:726471728, ack 956634348, win 261, options [nop,nop,TS val 3477430067 ecr 3687706330], length 95: SMTP 00:56:18.088465 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471728:726471823, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 95: SMTP 00:56:18.088603 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726471823, win 446, options [nop,nop,TS val 3687706515 ecr 3477430067], length 0 00:56:18.088725 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471823:726471919, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 96: SMTP 00:56:18.088969 IP6 
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956634348:956635776, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.088973 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956635776:956637204, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.088988 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956637204:956638632, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.088990 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956638632:956640060, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089099 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956640060:956641488, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089103 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956641488:956642916, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089117 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956642916:956644344, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089119 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956644344:956645772, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089226 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956645772:956647200, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089229 IP6 
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956647200:956648628, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.251223 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956637204, win 283, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0 00:56:18.251223 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956640060, win 305, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0 00:56:18.251224 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956642916, win 328, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0 00:56:18.251224 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956645772, win 350, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0 00:56:18.251295 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956648628:956650056, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP 00:56:18.251299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956650056:956651484, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP 00:56:18.251314 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956651484:956652912, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP 00:56:18.251317 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956652912:956654340, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP 00:56:18.251378 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956648628, win 372, options [nop,nop,TS val 3477430252 ecr 3687706515], 
length 0 00:56:18.251430 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956654340:956655768, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251435 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956655768:956657196, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251455 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956657196:956658624, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251458 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956658624:956660052, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251563 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956660052:956661480, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251566 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956661480:956662908, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251583 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956662908:956664336, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251585 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956664336:956665764, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251694 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956665764:956667192, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251697 
IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956667192:956668620, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251713 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956668620:956670048, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251716 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956670048:956671476, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251826 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956671476:956672904, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251829 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956672904:956674332, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251841 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956674332:956675760, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251844 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956675760:956677188, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956651484, win 395, options [nop,nop,TS val 3477430414 ecr 3687706677], length 0 00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956654340, win 417, options [nop,nop,TS val 3477430414 ecr 3687706677], length 0 00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 
956657196, win 439, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956660052, win 461, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956662908, win 484, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956665764, win 506, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413635 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956677188:956678616, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413639 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956678616:956680044, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413655 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956680044:956681472, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413657 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956681472:956682900, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413774 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956682900:956684328, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413780 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956684328:956685756, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413807 IP6 
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956685756:956687184, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413810 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956687184:956688612, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413854 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956668620, win 528, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413854 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956671476, win 551, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413920 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956688612:956690040, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413920 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956674332, win 573, options [nop,nop,TS val 3477430415 ecr 3687706678], length 0 00:56:18.413920 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956677188, win 595, options [nop,nop,TS val 3477430415 ecr 3687706678], length 0 00:56:18.413924 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956690040:956691468, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413938 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956691468:956692896, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413940 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956692896:956694324, ack 726471919, win 446, options [nop,nop,TS val 
3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.414048 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956694324:956695752, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414052 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956695752:956697180, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414065 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956697180:956698608, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414067 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956698608:956700036, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414174 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414177 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414190 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956702892:956704320, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414192 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956704320:956705748, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 
3477430415], length 1428: SMTP 00:56:18.414302 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414316 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414318 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956710032:956711460, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414424 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956711460:956712888, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414427 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956712888:956714316, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414440 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956714316:956715744, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414442 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956715744:956717172, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414546 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956717172:956718600, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414550 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956718600:956720028, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 
1428: SMTP 00:56:18.414562 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956720028:956721456, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414565 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956721456:956722884, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414670 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956722884:956724312, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414673 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956724312:956725740, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414685 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956725740:956727168, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414687 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956727168:956728596, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414793 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956728596:956730024, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414796 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956730024:956731452, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414809 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956731452:956732880, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 
00:56:18.414811 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956732880:956734308, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956680044, win 618, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956682900, win 640, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956685756, win 662, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956688612, win 685, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956691468, win 707, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.576005 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956734308:956735736, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576010 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956735736:956737164, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576025 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956737164:956738592, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576028 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956738592:956740020, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 
3477430577], length 1428: SMTP 00:56:18.576066 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956694324, win 729, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.576146 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956740020:956741448, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576152 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956741448:956742876, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576180 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956742876:956744304, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576184 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956744304:956745732, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956697180, win 752, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956700036, win 774, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.576300 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956745732:956747160, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576304 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956747160:956748588, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576325 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags 
[.], seq 956748588:956750016, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576328 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956750016:956751444, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576441 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956751444:956752872, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576445 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956752872:956754300, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576467 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956754300:956755728, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576470 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956755728:956757156, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576582 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956757156:956758584, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576586 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956758584:956760012, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576606 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956760012:956761440, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576609 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 
956761440:956762868, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576722 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956762868:956764296, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576726 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956764296:956765724, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576746 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956765724:956767152, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576749 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956767152:956768580, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576863 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956768580:956770008, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576867 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956770008:956771436, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576889 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956771436:956772864, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576892 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956772864:956774292, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.577004 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956774292:956775720, ack 
726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.577008 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956775720:956777148, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.577028 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956777148:956778576, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.577031 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956778576:956780004, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.960078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687707386 ecr 3477430577], length 1428: SMTP 00:56:19.126678 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956701464, win 785, options [nop,nop,TS val 3477431127 ecr 3687707386], length 0 00:56:19.126735 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956780004:956781432, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP 00:56:19.126751 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956781432:956782860, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP 00:56:19.510078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687707936 ecr 3477431127], length 1428: SMTP 00:56:19.672429 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956702892, win 796, options [nop,nop,TS val 3477431673 ecr 3687707936], length 0 
00:56:19.672489 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956702892:956704320, ack 726471919, win 446, options [nop,nop,TS val 3687708099 ecr 3477431673], length 1428: SMTP 00:56:19.672494 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956704320:956705748, ack 726471919, win 446, options [nop,nop,TS val 3687708099 ecr 3477431673], length 1428: SMTP 00:56:19.834765 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956704320, win 807, options [nop,nop,TS val 3477431835 ecr 3687708099], length 0 00:56:19.834765 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956705748, win 818, options [nop,nop,TS val 3477431835 ecr 3687708099], length 0 00:56:19.834818 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687708261 ecr 3477431835], length 1428: SMTP 00:56:19.834846 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708261 ecr 3477431835], length 1428: SMTP 00:56:19.997087 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956707176, win 830, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0 00:56:19.997088 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956708604, win 841, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0 00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP 00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 
3687708423], length 0 00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP 00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0 00:56:21.180080 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687709606 ecr 3477431998], length 1428: SMTP 00:56:21.342347 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477433343 ecr 3687708423], length 0 00:56:22.780101 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687711206 ecr 3477431998], length 1428: SMTP 00:56:22.942346 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477434943 ecr 3687708423], length 0 00:56:25.900090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687714326 ecr 3477431998], length 1428: SMTP 00:56:26.062272 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477438063 ecr 3687708423], length 0 00:56:32.620090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687721046 ecr 3477431998], length 1428: SMTP 00:56:32.782226 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options 
[nop,nop,TS val 3477444783 ecr 3687708423], length 0 00:56:45.420093 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687733846 ecr 3477431998], length 1428: SMTP 00:56:45.581587 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477457582 ecr 3687708423], length 0 00:57:10.380083 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687758806 ecr 3477431998], length 1428: SMTP 00:57:10.542248 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477482543 ecr 3687708423], length 0 00:58:00.940090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687809366 ecr 3477431998], length 1428: SMTP 00:58:01.102342 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477533103 ecr 3687708423], length 0 ^ permalink raw reply [flat|nested] 38+ messages in thread
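The stall reported above (the client keeps retransmitting 956707176:956708604 although the peer has already ACKed 956710032) can be looked for mechanically in a text capture. The following is only a rough sketch, not anything from the original report: the function name and regexes are hypothetical, and it assumes tcpdump's default text output with absolute sequence numbers, one packet per line, as in the attached trace. It ignores sequence-number wraparound.

```python
import re

# Hypothetical helpers (not from the thread). They assume tcpdump's default
# text format with absolute sequence numbers, one packet per line.
SEG = re.compile(r"^(\S+) IP6 (\S+) > (\S+): Flags \[[^\]]*\], seq (\d+):(\d+),")
ACK = re.compile(
    r"^(\S+) IP6 (\S+) > (\S+): Flags \[[^\]]*\],(?: seq \d+(?::\d+)?,)? ack (\d+),"
)


def find_acked_retransmits(lines, client):
    """Return (timestamp, seq_start, seq_end, highest_ack) for every data
    segment sent by `client` that lies entirely at or below the highest
    cumulative ACK the peer has sent so far.

    Sequence-number wraparound is not handled in this sketch.
    """
    highest_ack = None
    hits = []
    for line in lines:
        seg = SEG.match(line)
        if seg and seg.group(2) == client:
            start, end = int(seg.group(4)), int(seg.group(5))
            if highest_ack is not None and end <= highest_ack:
                hits.append((seg.group(1), start, end, highest_ack))
            continue
        ack = ACK.match(line)
        if ack and ack.group(3) == client:
            # Packet addressed to the client: track the cumulative ACK.
            value = int(ack.group(4))
            if highest_ack is None or value > highest_ack:
                highest_ack = value
    return hits


CLIENT = "2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110"

# Five packets copied from the trace around the point where it stalls.
SAMPLE = [
    "00:56:19.997088 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956708604, win 841, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0",
    "00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP",
    "00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0",
    "00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP",
    "00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0",
]

for hit in find_acked_retransmits(SAMPLE, CLIENT):
    print(hit)
```

On the five sample packets this should flag only the 00:56:20.380076 segment. A raw .pcap analysed with a proper tool remains the better route; the sketch is only meant for a quick look at a text dump.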
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-03-30 0:56 linux 5.17.1 disregarding ACK values resulting in stalled TCP connections Jaco @ 2022-03-30 2:01 ` Neal Cardwell 2022-03-30 2:40 ` Eric Dumazet 2022-03-30 2:58 ` Jaco Kroon 0 siblings, 2 replies; 38+ messages in thread From: Neal Cardwell @ 2022-03-30 2:01 UTC (permalink / raw) To: Jaco; +Cc: LKML, Netdev, Eric Dumazet, Yuchung Cheng [-- Attachment #1: Type: text/plain, Size: 10969 bytes --] On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote: > > Dear All, > > I'm seeing very strange TCP behaviour. Disabled TCP Segmentation Offload to > try and pinpoint this more closely. > > It seems the kernel is ignoring ACKs coming from the remote side in some cases. > In this case, on one of four hosts, and seemingly between this one host and > Google ... (We've had two emails to Google stuck on another host due to the same > issue, but several hundred others passed out today on that same host). I also > killed selective ACKs as a test as these are known to sometimes cause issues > for firewalls and "tcp accelerators" (or used to at the very least). > > SMTP connection between ourselves and Google ... I'm going to be selective in > copying from tcpdump (full conversation up to the point where I killed it > because it plainly got stuck in a loop is attached). 
> > Connection setup: > > 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo cookie f025dd84b6122510,nop,nop], length 0 > > 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0 > > 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp > > 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0 > > This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus > we shouldn't send segments larger than that, and they "can't". I need to > determine if this is some form of offloading or they really are sending >1500 > byte frames (which I know won't pass our firewalls without fragmentation so > probably some form of NIC offloading - which if it was active on older 5.8 > kernels did not cause problems): > > 00:56:17.709905 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465979:726468395, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP > > 00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726468395:726470811, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP > > These are the only two frames I can find that supposedly exceed the MSS values > (although, they don't exceed our value). 
> > Then everything goes pretty normal for a bit. The last data we receive from > the remote side before stuff goes wrong: > > 00:56:18.088725 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471823:726471919, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 96: SMTP > > We ACK immediately along with the next segment: > > 00:56:18.088969 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956634348:956635776, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP > > Hereafter there is a flurry of data that we transmit, all nicely acknowledged, > no retransmits that I can pick up (eyeballs). > > Before a long sequence of TX data we get this ACK: > > 00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956700036, win 774, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 > > We then continue to TX a sequence of: > > 00:56:18.576300 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956745732:956747160, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP > > up to: > > 00:56:18.577031 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956778576:956780004, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP > > Before we hit our first retransmit: > > 00:56:18.960078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687707386 ecr 3477430577], length 1428: SMTP > > Since 956700036 is the last ACKed data, this seems correct, not sure what timer > this is based on though, the ACK for the just prior data came in ~384ms prior > (could be based on normal time to ACK, I don't know, this is about double 
the > usual round-trip-time currently). > > And then we receive this ACK (we can see this time the kernel waited for ACK of > this single segment): > > 00:56:19.126678 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956701464, win 785, options [nop,nop,TS val 3477431127 ecr 3687707386], length 0 > > Then we do something (in my opinion) strange by jumping back to the tail of the previous burst: > > 00:56:19.126735 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956780004:956781432, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP > > 00:56:19.126751 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956781432:956782860, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP > > We then jump back and retransmit again from the just received ACK: > > 00:56:19.510078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687707936 ecr 3477431127], length 1428: SMTP > > We then continue from there on as I'd expect (slow restart), this goes pretty > normal up to: > > 00:56:19.997088 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956708604, win 841, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0 > > 00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP > > 00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0 > > Up to here is fine, now things get bizarre, we just jump to a different > sequence number, which has already been 
ACKed: > > 00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq *956707176*:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP > > 00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0 > > And remote side re-ACKs the 956710032 value, which frankly indicates we need to > realize that the data we are transmitting has already been received, and we can > continue on to transmit the segments following up on sequence number 956710032, > instead we choose to get stuck in this sequence: > > 00:56:21.180080 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687709606 ecr 3477431998], length 1428: SMTP > > 00:56:21.342347 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477433343 ecr 3687708423], length 0 > > 00:56:22.780101 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687711206 ecr 3477431998], length 1428: SMTP > > 00:56:22.942346 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477434943 ecr 3687708423], length 0 > > And here the connection dies. It eventually times out, and we retry to the > next host, resulting in the same problem. > > I am aware that Google is having congestion issues in the JHB area in SA > currently, and there are probably packet delays and losses somewhere along the > line between us, but this really should not stall as dead as it is here. 
> > Looking at only the incoming ACK values, I can see they are strictly > increasing, so we've never received an ACK > 956710032, but this is still > greater than the value we are retransmitting. > > The first time we transmitted the frame at sequence number 956707176 was part > of the longest sequence of TX frames without a returning ACK, part of this > sequence: > > ... > > 00:56:18.414299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP > > 00:56:18.414302 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP > > 00:56:18.414316 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP > > ... > > Google here is ACKing not only the frame we are continuously retransmitting, > but also the frame directly after ... so why would the kernel not move on to > retransmitting starting from sequence number 956710032 (which is larger than > the start sequence number of the frame we are retransmitting)? > > Kind Regards, > Jaco Thanks for the report! I have CC-ed the netdev list, since it is probably a better forum for this discussion. Can you please attach (or link to) a tcpdump raw .pcap file (produced with the -w flag)? There are a number of tools that will make this easier to visualize and analyze if we can see the raw .pcap file. You may want to anonymize the trace and/or capture just headers, etc (for example, the -s flag can control how much of each packet tcpdump grabs). Can you please share the exact kernel version of the client machine? 
Also, can you please summarize/clarify whether you think the client, server, or both are misbehaving? Thanks! neal [-- Attachment #2: iewc_google.txt --] [-- Type: text/plain, Size: 38985 bytes --] 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo cookie f025dd84b6122510,nop,nop], length 0 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0 00:56:17.218754 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633780:956633803, ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 23: SMTP: EHLO uriel.iewc.co.za 00:56:17.380021 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956633803, win 256, options [nop,nop,TS val 3477429381 ecr 3687705645], length 0 00:56:17.383685 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465760:726465949, ack 956633803, win 256, options [nop,nop,TS val 3477429384 ecr 3687705645], length 189: SMTP: 250-mx.google.com at your service, [2c0f:f720:0:3:d6ae:52ff:feb8:f27b] 00:56:17.383714 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465949, win 488, options 
[nop,nop,TS val 3687705810 ecr 3477429384], length 0 00:56:17.383934 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633803:956633813, ack 726465949, win 488, options [nop,nop,TS val 3687705810 ecr 3477429384], length 10: SMTP: STARTTLS 00:56:17.546391 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465949:726465979, ack 956633813, win 256, options [nop,nop,TS val 3477429547 ecr 3687705810], length 30: SMTP: 220 2.0.0 Ready to start TLS 00:56:17.546430 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465979, win 488, options [nop,nop,TS val 3687705973 ecr 3477429547], length 0 00:56:17.547288 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956633813:956634111, ack 726465979, win 488, options [nop,nop,TS val 3687705973 ecr 3477429547], length 298: SMTP 00:56:17.709905 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465979:726468395, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP 00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726468395:726470811, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP 00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726470811:726470898, ack 956634111, win 261, options [nop,nop,TS val 3477429711 ecr 3687705973], length 87: SMTP 00:56:17.709949 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726468395, win 470, options [nop,nop,TS val 3687706136 ecr 3477429710], length 0 00:56:17.709964 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726470811, win 452, options [nop,nop,TS val 3687706136 ecr 3477429710], length 0 00:56:17.709978 IP6 
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726470898, win 452, options [nop,nop,TS val 3687706136 ecr 3477429711], length 0 00:56:17.712221 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634111:956634191, ack 726470898, win 452, options [nop,nop,TS val 3687706138 ecr 3477429711], length 80: SMTP 00:56:17.712353 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634191:956634236, ack 726470898, win 452, options [nop,nop,TS val 3687706138 ecr 3477429711], length 45: SMTP 00:56:17.874774 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956634236, win 261, options [nop,nop,TS val 3477429875 ecr 3687706138], length 0 00:56:17.875264 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726470898:726471633, ack 956634236, win 261, options [nop,nop,TS val 3477429876 ecr 3687706138], length 735: SMTP 00:56:17.904288 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956634236:956634348, ack 726471633, win 447, options [nop,nop,TS val 3687706330 ecr 3477429876], length 112: SMTP 00:56:18.066936 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471633:726471728, ack 956634348, win 261, options [nop,nop,TS val 3477430067 ecr 3687706330], length 95: SMTP 00:56:18.088465 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471728:726471823, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 95: SMTP 00:56:18.088603 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726471823, win 446, options [nop,nop,TS val 3687706515 ecr 3477430067], length 0 00:56:18.088725 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471823:726471919, ack 956634348, win 261, 
options [nop,nop,TS val 3477430089 ecr 3687706330], length 96: SMTP 00:56:18.088969 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956634348:956635776, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.088973 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956635776:956637204, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.088988 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956637204:956638632, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.088990 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956638632:956640060, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089099 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956640060:956641488, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089103 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956641488:956642916, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089117 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956642916:956644344, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089119 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956644344:956645772, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089226 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956645772:956647200, ack 726471919, win 446, options [nop,nop,TS val 
3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.089229 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956647200:956648628, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP 00:56:18.251223 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956637204, win 283, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0 00:56:18.251223 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956640060, win 305, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0 00:56:18.251224 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956642916, win 328, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0 00:56:18.251224 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956645772, win 350, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0 00:56:18.251295 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956648628:956650056, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP 00:56:18.251299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956650056:956651484, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP 00:56:18.251314 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956651484:956652912, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP 00:56:18.251317 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956652912:956654340, ack 726471919, win 446, options [nop,nop,TS val 3687706677 ecr 3477430252], length 1428: SMTP 00:56:18.251378 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956648628, 
win 372, options [nop,nop,TS val 3477430252 ecr 3687706515], length 0 00:56:18.251430 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956654340:956655768, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251435 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956655768:956657196, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251455 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956657196:956658624, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251458 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956658624:956660052, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251563 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956660052:956661480, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251566 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956661480:956662908, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251583 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956662908:956664336, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251585 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956664336:956665764, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251694 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956665764:956667192, ack 726471919, win 446, options [nop,nop,TS val 
3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251697 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956667192:956668620, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251713 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956668620:956670048, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251716 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956670048:956671476, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251826 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956671476:956672904, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251829 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956672904:956674332, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251841 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956674332:956675760, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.251844 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956675760:956677188, ack 726471919, win 446, options [nop,nop,TS val 3687706678 ecr 3477430252], length 1428: SMTP 00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956651484, win 395, options [nop,nop,TS val 3477430414 ecr 3687706677], length 0 00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956654340, win 417, options [nop,nop,TS val 3477430414 ecr 3687706677], length 0 00:56:18.413569 IP6 2a00:1450:400c:c07::1b.25 
> 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956657196, win 439, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956660052, win 461, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956662908, win 484, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413570 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956665764, win 506, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413635 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956677188:956678616, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413639 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956678616:956680044, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413655 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956680044:956681472, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413657 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956681472:956682900, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413774 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956682900:956684328, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413780 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956684328:956685756, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 
1428: SMTP 00:56:18.413807 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956685756:956687184, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413810 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956687184:956688612, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413854 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956668620, win 528, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413854 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956671476, win 551, options [nop,nop,TS val 3477430414 ecr 3687706678], length 0 00:56:18.413920 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956688612:956690040, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413920 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956674332, win 573, options [nop,nop,TS val 3477430415 ecr 3687706678], length 0 00:56:18.413920 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956677188, win 595, options [nop,nop,TS val 3477430415 ecr 3687706678], length 0 00:56:18.413924 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956690040:956691468, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413938 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956691468:956692896, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.413940 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956692896:956694324, ack 726471919, win 446, 
options [nop,nop,TS val 3687706840 ecr 3477430414], length 1428: SMTP 00:56:18.414048 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956694324:956695752, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414052 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956695752:956697180, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414065 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956697180:956698608, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414067 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956698608:956700036, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414174 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414177 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414190 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956702892:956704320, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414192 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956704320:956705748, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 
3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414302 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414316 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414318 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956710032:956711460, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP 00:56:18.414424 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956711460:956712888, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414427 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956712888:956714316, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414440 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956714316:956715744, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414442 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956715744:956717172, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414546 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956717172:956718600, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414550 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956718600:956720028, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 
3477430415], length 1428: SMTP 00:56:18.414562 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956720028:956721456, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414565 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956721456:956722884, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414670 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956722884:956724312, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414673 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956724312:956725740, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414685 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956725740:956727168, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414687 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956727168:956728596, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414793 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956728596:956730024, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414796 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956730024:956731452, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.414809 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956731452:956732880, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 
1428: SMTP 00:56:18.414811 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956732880:956734308, ack 726471919, win 446, options [nop,nop,TS val 3687706841 ecr 3477430415], length 1428: SMTP 00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956680044, win 618, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956682900, win 640, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956685756, win 662, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956688612, win 685, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.575940 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956691468, win 707, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.576005 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956734308:956735736, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576010 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956735736:956737164, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576025 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956737164:956738592, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576028 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956738592:956740020, ack 726471919, win 446, options [nop,nop,TS val 3687707002 
ecr 3477430577], length 1428: SMTP 00:56:18.576066 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956694324, win 729, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.576146 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956740020:956741448, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576152 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956741448:956742876, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576180 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956742876:956744304, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576184 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956744304:956745732, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956697180, win 752, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956700036, win 774, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 00:56:18.576300 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956745732:956747160, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576304 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956747160:956748588, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576325 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: 
Flags [.], seq 956748588:956750016, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576328 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956750016:956751444, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP 00:56:18.576441 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956751444:956752872, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576445 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956752872:956754300, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576467 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956754300:956755728, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576470 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956755728:956757156, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576582 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956757156:956758584, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576586 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956758584:956760012, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576606 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956760012:956761440, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576609 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 
956761440:956762868, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576722 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956762868:956764296, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576726 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956764296:956765724, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576746 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956765724:956767152, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576749 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956767152:956768580, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576863 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956768580:956770008, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576867 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956770008:956771436, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576889 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956771436:956772864, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.576892 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956772864:956774292, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.577004 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956774292:956775720, ack 
726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.577008 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956775720:956777148, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.577028 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956777148:956778576, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.577031 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956778576:956780004, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP 00:56:18.960078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687707386 ecr 3477430577], length 1428: SMTP 00:56:19.126678 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956701464, win 785, options [nop,nop,TS val 3477431127 ecr 3687707386], length 0 00:56:19.126735 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956780004:956781432, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP 00:56:19.126751 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956781432:956782860, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP 00:56:19.510078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687707936 ecr 3477431127], length 1428: SMTP 00:56:19.672429 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956702892, win 796, options [nop,nop,TS val 3477431673 ecr 3687707936], length 0 
00:56:19.672489 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956702892:956704320, ack 726471919, win 446, options [nop,nop,TS val 3687708099 ecr 3477431673], length 1428: SMTP
00:56:19.672494 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956704320:956705748, ack 726471919, win 446, options [nop,nop,TS val 3687708099 ecr 3477431673], length 1428: SMTP
00:56:19.834765 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956704320, win 807, options [nop,nop,TS val 3477431835 ecr 3687708099], length 0
00:56:19.834765 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956705748, win 818, options [nop,nop,TS val 3477431835 ecr 3687708099], length 0
00:56:19.834818 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687708261 ecr 3477431835], length 1428: SMTP
00:56:19.834846 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708261 ecr 3477431835], length 1428: SMTP
00:56:19.997087 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956707176, win 830, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0
00:56:19.997088 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956708604, win 841, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0
00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP
00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0
00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP
00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0
00:56:21.180080 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687709606 ecr 3477431998], length 1428: SMTP
00:56:21.342347 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477433343 ecr 3687708423], length 0
00:56:22.780101 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687711206 ecr 3477431998], length 1428: SMTP
00:56:22.942346 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477434943 ecr 3687708423], length 0
00:56:25.900090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687714326 ecr 3477431998], length 1428: SMTP
00:56:26.062272 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477438063 ecr 3687708423], length 0
00:56:32.620090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687721046 ecr 3477431998], length 1428: SMTP
00:56:32.782226 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477444783 ecr 3687708423], length 0
00:56:45.420093 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687733846 ecr 3477431998], length 1428: SMTP
00:56:45.581587 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477457582 ecr 3687708423], length 0
00:57:10.380083 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687758806 ecr 3477431998], length 1428: SMTP
00:57:10.542248 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477482543 ecr 3687708423], length 0
00:58:00.940090 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687809366 ecr 3477431998], length 1428: SMTP
00:58:01.102342 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477533103 ecr 3687708423], length 0

^ permalink raw reply [flat|nested] 38+ messages in thread
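The tail of the trace above shows the sender resending seq 956707176:956708604 again and again, although the peer has already ACKed 956710032. For anyone who wants to look for the same pattern in their own capture, here is a minimal sketch (not part of the original report). The names PKT, SENDER, PEER, SAMPLE and acked_retransmits are made up for illustration, the three sample lines are taken from the trace with the options field trimmed, and the code assumes a single connection in tcpdump's default text format with no sequence-number wraparound.

```python
import re

# Matches tcpdump text lines such as
# "00:56:20.380076 IP6 A.59110 > B.25: Flags [.], seq 1:2, ack 3, win ..."
# finditer() is used below, so packets need not be on separate lines.
PKT = re.compile(
    r"(?P<ts>\d\d:\d\d:\d\d\.\d+) IP6? (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[[^\]]*\],(?: seq (?P<seq>\d+)(?::(?P<end>\d+))?,)?"
    r"(?: ack (?P<ack>\d+),)?"
)

SENDER = "2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110"
PEER = "2a00:1450:400c:c07::1b.25"

# Three packets from the trace (options trimmed for brevity).
SAMPLE = (
    f"00:56:19.997148 IP6 {SENDER} > {PEER}: Flags [.], "
    "seq 956708604:956710032, ack 726471919, win 446, length 1428: SMTP\n"
    f"00:56:20.262683 IP6 {PEER} > {SENDER}: Flags [.], "
    "ack 956710032, win 852, length 0\n"
    f"00:56:20.380076 IP6 {SENDER} > {PEER}: Flags [.], "
    "seq 956707176:956708604, ack 726471919, win 446, length 1428: SMTP\n"
)


def acked_retransmits(text, sender):
    """Yield (timestamp, seq, end, highest_ack) for every data segment sent by
    `sender` that lies entirely below the highest ACK already seen from the peer.

    Assumption: sequence numbers do not wrap within the capture.
    """
    highest_ack = None
    for m in PKT.finditer(text):
        if m["src"] != sender:
            # Packet towards the sender: remember the highest cumulative ACK.
            if m["dst"] == sender and m["ack"]:
                ack = int(m["ack"])
                highest_ack = ack if highest_ack is None else max(highest_ack, ack)
        elif m["end"] and highest_ack is not None and int(m["end"]) <= highest_ack:
            # Data segment whose last byte was already acknowledged.
            yield m["ts"], int(m["seq"]), int(m["end"]), highest_ack


if __name__ == "__main__":
    for ts, seq, end, ack in acked_retransmits(SAMPLE, SENDER):
        print(f"{ts}: seq {seq}:{end} resent although peer already ACKed {ack}")
```

A retransmission of data that has not been ACKed yet, such as the one at 00:56:18.960078 earlier in the trace, is deliberately not reported; only segments that fall completely below the peer's cumulative ACK are flagged.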
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-03-30 2:01 ` Neal Cardwell @ 2022-03-30 2:40 ` Eric Dumazet 2022-03-30 2:58 ` Jaco Kroon 1 sibling, 0 replies; 38+ messages in thread From: Eric Dumazet @ 2022-03-30 2:40 UTC (permalink / raw) To: Neal Cardwell; +Cc: Jaco, LKML, Netdev, Yuchung Cheng On Tue, Mar 29, 2022 at 7:01 PM Neal Cardwell <ncardwell@google.com> wrote: > > On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote: > > > > Dear All, > > > > I'm seeing very strange TCP behaviour. Disabled TCP Segmentation Offload to > > try and pinpoint this more closely. > > > > It seems the kernel is ignoring ACKs coming from the remote side in some cases. > > In this case, on one of four hosts, and seemingly between this one host and > > Google ... (We've have two emails to google stuck on another host due to same > > issue, but several hundred others passed out today on that same host). I also > > killed selective ACKs as a test as these are known to sometimes cause issues > > for firewalls and "tcp accelerators" (or used to at the very least). > > > > SMTP connection between ourselves and Google ... I'm going to be selective in > > copying from tcpdump (full coversation up to the point where I killed it > > because it plainly got stuck in a loop is attached). 
> > > > Connection setup: > > > > 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo cookie f025dd84b6122510,nop,nop], length 0 > > > > 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0 > > > > 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp > > > > 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0 > > > > This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus > > we shouldn't send segments larger than that, and they "can't". 
I need to > > determine if this is some form of offloading or they really are sending >1500 > > byte frames (which I know won't pass our firewalls without fragmentation so > > probably some form of NIC offloading - which if it was active on older 5.8 > > kernels did not cause problems): > > > > 00:56:17.709905 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465979:726468395, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP > > > > 00:56:17.709906 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726468395:726470811, ack 956634111, win 261, options [nop,nop,TS val 3477429710 ecr 3687705973], length 2416: SMTP > > > > These are the only two frames I can find that supposedly exceeds the MSS values > > (although, they don't exceed our value). > > > > Then everything goes pretty normal for a bit. The last data we receive from > > the remote side before stuff goes wrong: > > > > 00:56:18.088725 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726471823:726471919, ack 956634348, win 261, options [nop,nop,TS val 3477430089 ecr 3687706330], length 96: SMTP > > > > We ACK immediately along with the next segment: > > > > 00:56:18.088969 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956634348:956635776, ack 726471919, win 446, options [nop,nop,TS val 3687706515 ecr 3477430089], length 1428: SMTP > > > > Hereafter there is a flurry of data that we transmit, all nicely acknowledged, > > no retransmits that I can pick up (eyeballs). 
> > > > Before a long sequence of TX data we get this ACK: > > > > 00:56:18.576247 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956700036, win 774, options [nop,nop,TS val 3477430577 ecr 3687706840], length 0 > > > > We then continue to RX a sequence of: > > > > 00:56:18.576300 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956745732:956747160, ack 726471919, win 446, options [nop,nop,TS val 3687707002 ecr 3477430577], length 1428: SMTP > > > > up to: > > > > 00:56:18.577031 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956778576:956780004, ack 726471919, win 446, options [nop,nop,TS val 3687707003 ecr 3477430577], length 1428: SMTP > > > > Before we hit our first retransmit: > > > > 00:56:18.960078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956700036:956701464, ack 726471919, win 446, options [nop,nop,TS val 3687707386 ecr 3477430577], length 1428: SMTP > > > > Since 956700036 is the last ACKed data, this seems correct, not sure what timer > > this is based on though, the ACK for the just prior data came in ~384ms prior > > (could be based on normal time to ACK, I don't know, this is about double the > > usual round-trip-time currently). 
> > > > And then we receive this ACK (we can see this time the kernel waited for ACK of > > this single segment): > > > > 00:56:19.126678 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956701464, win 785, options [nop,nop,TS val 3477431127 ecr 3687707386], length 0 > > > > Then we do something (in my opinion) strange by jumping back to the tail of the previous burst: > > > > 00:56:19.126735 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956780004:956781432, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP > > > > 00:56:19.126751 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956781432:956782860, ack 726471919, win 446, options [nop,nop,TS val 3687707553 ecr 3477431127], length 1428: SMTP > > > > We then jump back and retransmit again from the just received ACK: > > > > 00:56:19.510078 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956701464:956702892, ack 726471919, win 446, options [nop,nop,TS val 3687707936 ecr 3477431127], length 1428: SMTP > > > > We then continue from there on as I'd expect (slow restart), this goes pretty > > normal up to: > > > > 00:56:19.997088 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956708604, win 841, options [nop,nop,TS val 3477431998 ecr 3687708261], length 0 > > > > 00:56:19.997148 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687708423 ecr 3477431998], length 1428: SMTP > > > > 00:56:20.262683 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432263 ecr 3687708423], length 0 > > > > Up to here is fine, now things gets bizarre, we just jump to a different > > sequence number, which has 
already been ACKed: > > > > 00:56:20.380076 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq *956707176*:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687708806 ecr 3477431998], length 1428: SMTP > > > > 00:56:20.542356 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477432543 ecr 3687708423], length 0 > > > > And remote side re-ACKs the 956710032 value, which frankly indicates we need to > > realize that the data we are transmitting has already been received, and we can > > continue on to transmit the segments following up on sequence number 956710032, > > instead we choose to get stuck in this sequence: > > > > 00:56:21.180080 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687709606 ecr 3477431998], length 1428: SMTP > > > > 00:56:21.342347 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477433343 ecr 3687708423], length 0 > > > > 00:56:22.780101 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687711206 ecr 3477431998], length 1428: SMTP > > > > 00:56:22.942346 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [.], ack 956710032, win 852, options [nop,nop,TS val 3477434943 ecr 3687708423], length 0 > > > > And here the connection dies. It eventually times out, and we retry to the > > next host, resulting in the same problem. > > > > I am aware that Google is having congestion issues in the JHB area in SA > > currently, and there are probably packet delays and losses somewhere along the > > line between us, but this really should not stall as dead as it is here. 
> >
> > Looking at only the incoming ACK values, I can see they are strictly
> > increasing, so we've never received an ACK > 956710032, but this is still
> > greater than the value we are retransmitting.
> >

It could be that ACK packets have a wrong checksum, after some point
is reached (some bug in a firewall/middlebox)

"tcpdump -v" will tell you something about checksum errors.

And/or "nstat -az | grep TcpInCsumError"

Also, packets could be dropped in a layer like netfilter.

Make sure you do not have a rule rate limiting flows, or something like that.

> > The first time we transmitted the frame at sequence number 956707176 was part
> > of the longest sequence of TX frames without a returning ACK, part of this
> > sequence:
> >
> > ...
> >
> > 00:56:18.414299 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956705748:956707176, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
> >
> > 00:56:18.414302 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [P.], seq 956707176:956708604, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
> >
> > 00:56:18.414316 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], seq 956708604:956710032, ack 726471919, win 446, options [nop,nop,TS val 3687706840 ecr 3477430415], length 1428: SMTP
> >
> > ...
> >
> > Google here is ACKing not only the frame we are continuously retransmitting,
> > but also the frame directly after ... so why would the kernel not move on to
> > retransmitting starting from sequence number 956710032 (which is larger than
> > the start sequence number of the frame we are retransmitting)?
> >
> > Kind Regards,
> > Jaco
>
> Thanks for the report!  I have CC-ed the netdev list, since it is
> probably a better forum for this discussion.
>
> Can you please attach (or link to) a tcpdump raw .pcap file (produced
> with the -w flag)? There are a number of tools that will make this
> easier to visualize and analyze if we can see the raw .pcap file. You
> may want to anonymize the trace and/or capture just headers, etc (for
> example, the -s flag can control how much of each packet tcpdump
> grabs).
>
> Can you please share the exact kernel version of the client machine?
>
> Also, can you please summarize/clarify whether you think the client,
> server, or both are misbehaving?
>
> Thanks!
> neal

^ permalink raw reply	[flat|nested] 38+ messages in thread
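The question in the quoted report (why keep retransmitting 956707176 when
956710032 has already been ACKed?) comes down to a comparison in 32-bit
sequence space. The Python sketch below is only an illustration of that
comparison using the numbers from the report; the function names are made up
here and this is not the kernel's code.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def seq_after(a, b):
    """Return True if sequence number a is later than b, allowing for wraparound."""
    diff = (a - b) % SEQ_SPACE
    return 0 < diff < SEQ_SPACE // 2


def fully_acked(segment_end, cumulative_ack):
    """A segment is fully acknowledged once the cumulative ACK reaches its end."""
    return not seq_after(segment_end, cumulative_ack)


# Values copied from the report.
RETRANSMITTED_START = 956707176
RETRANSMITTED_END = 956708604
LATEST_ACK = 956710032

if __name__ == "__main__":
    print("segment length:", RETRANSMITTED_END - RETRANSMITTED_START)
    print("fully acked:", fully_acked(RETRANSMITTED_END, LATEST_ACK))
    print("bytes acked beyond the segment:",
          (LATEST_ACK - RETRANSMITTED_END) % SEQ_SPACE)
```

By this arithmetic the retransmitted segment lies entirely below the latest
ACK seen in the capture, so the open question is why the client's stack does
not act on those ACKs.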
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30  2:01 ` Neal Cardwell
  2022-03-30  2:40   ` Eric Dumazet
@ 2022-03-30  2:58   ` Jaco Kroon
  2022-03-30  3:48     ` Eric Dumazet
  1 sibling, 1 reply; 38+ messages in thread
From: Jaco Kroon @ 2022-03-30  2:58 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: LKML, Netdev, Eric Dumazet, Yuchung Cheng

[-- Attachment #1: Type: text/plain, Size: 2712 bytes --]

Hi Neal,

> Thanks for the report!  I have CC-ed the netdev list, since it is
> probably a better forum for this discussion.
Awesome, thank you.
>
> Can you please attach (or link to) a tcpdump raw .pcap file (produced
> with the -w flag)? There are a number of tools that will make this
> easier to visualize and analyze if we can see the raw .pcap file. You
> may want to anonymize the trace and/or capture just headers, etc (for
> example, the -s flag can control how much of each packet tcpdump
> grabs).

Attached.

The traffic itself should be mostly encrypted, but I truncated the packets
with -s100 anyway.  At this point SACK was still on.

I don't know how, or why, but this relates to TFO.  After sending the report
I followed a hunch: comparing the exim logs of a successful delivery with
those of an unsuccessful one, the only difference was that the failing
connection logged:

TFO mode sendto, no data: EINPROGRESS

and then specifically:

TCP_FASTOPEN tcpi_unacked 2

The working connections never had the latter line in the output.

The moment I set sysctl -w net.ipv4.tcp_fastopen=0 (default is 1) I managed
to flood out about 1200 emails to Google in no more than 15 minutes.

In the kernel sources:  git log v5.8..v5.17 net/

Searching that log for TFO gives only a limited number of commits that could
have broken this; just looking at the changelogs, I'm not sure any of them
are relevant.  I'm guessing the issue relates to congestion control, in which
case this is probably the most relevant:

commit be5d1b61a2ad28c7e57fe8bfa277373e8ecffcdc
Author: Nguyen Dinh Phi <phind.uet@gmail.com>
Date:   Tue Jul 6 07:19:12 2021 +0800

    tcp: fix tcp_init_transfer() to not reset icsk_ca_initialized

Just looking at the diff, it removes an icsk->icsk_ca_initialized = 0; -
the only other place this gets set to 0 is in tcp_disconnect() ... and
to 1 in tcp_init_congestion_control() - so I think we might have an
uninitialized variable here ... then again, tcp_init_sock() mentions
explicitly that sk_alloc() sets lots of stuff to 0 - it still bugs me that
the original commit (8919a9b31eb4) felt the need to set an explicit 0 in
tcp_init_transfer().

>
> Can you please share the exact kernel version of the client machine?
Our side (client) is 5.17.1 (the side that initiates the TCP/IP connection);
I obviously can't comment for the Google side (server).
> Also, can you please summarize/clarify whether you think the client,
> server, or both are misbehaving?

The client is retransmitting frames for which it has already received an
ACK from the server.  In the pcap, retransmits start from frame 105 onwards,
and the first "spurious retransmission" (as Wireshark labels it) shows up
from frame 122 onwards.

Kind Regards,
Jaco

[-- Attachment #2: iewc_google2.pcap --]
[-- Type: application/vnd.tcpdump.pcap, Size: 19828 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread
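The net.ipv4.tcp_fastopen value mentioned above is a bitmask rather than a
plain on/off switch. The Python sketch below decodes the low bits as I
understand them from the kernel's ip-sysctl documentation; the flag
descriptions are paraphrased, higher bits are deliberately left out, and the
helper names are made up for this illustration.

```python
# Low bits of net.ipv4.tcp_fastopen, paraphrased from the kernel's ip-sysctl
# documentation (assumption: higher bits exist but are not decoded here).
TFO_FLAGS = {
    0x1: "client: Fast Open enabled for outgoing connections",
    0x2: "server: Fast Open enabled for listening sockets",
    0x4: "client: send data in SYN regardless of cookie availability",
}

TFO_SYSCTL_PATH = "/proc/sys/net/ipv4/tcp_fastopen"


def decode_tcp_fastopen(value):
    """Return the descriptions of the known low bits set in value."""
    return [desc for bit, desc in sorted(TFO_FLAGS.items()) if value & bit]


def read_tcp_fastopen(path=TFO_SYSCTL_PATH):
    """Read the current sysctl value (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    for value in (0, 1, 3):
        print(value, decode_tcp_fastopen(value))
```

With the default of 1 only the client side is enabled, which matches the
outgoing SMTP connections in this report; setting 0 clears every bit.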
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30  2:58   ` Jaco Kroon
@ 2022-03-30  3:48     ` Eric Dumazet
  2022-03-30  6:22       ` Jaco Kroon
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Dumazet @ 2022-03-30  3:48 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

On Tue, Mar 29, 2022 at 7:58 PM Jaco Kroon <jaco@uls.co.za> wrote:
>
> Hi Neal,
>
> > Thanks for the report!  I have CC-ed the netdev list, since it is
> > probably a better forum for this discussion.
> Awesome thank you.
> >
> > Can you please attach (or link to) a tcpdump raw .pcap file (produced
> > with the -w flag)? There are a number of tools that will make this
> > easier to visualize and analyze if we can see the raw .pcap file. You
> > may want to anonymize the trace and/or capture just headers, etc (for
> > example, the -s flag can control how much of each packet tcpdump
> > grabs).
>
> Attached.
>
> The traffic itself should be mostly encrypted but stripped with -s100
> anyway.  At this point SACK was still on.
>
> I don't know how, or why, but this relates to TFO.  After sending report
> on a hunch (based on comparing the exim logs of a successful delivery
> compared to a non-successful) and the only difference was that the
> non-working was stating:
>
> TFO mode sendto, no data: EINPROGRESS
>
> and then specifically:
>
> TCP_FASTOPEN tcpi_unacked 2
>
> The working connections never had the latter line in the output.
>
> The moment I set sysctl -w net.ipv4.tcp_fastopen=0 (default is 1) I've
> managed to flood out about 1200 emails to google in a matter of no more
> than 15 minutes.
>
> In the kernel sources:  git log v5.8..v5.17 net/
>
> And searching for TFO only gives so many possible commits that broke
> this, just looking at changelogs I'm not sure if any of them are
> relevant.  I'm guessing the issue possibly relates to congestion
> control, as such this is probably the most relevant:
>
> commit be5d1b61a2ad28c7e57fe8bfa277373e8ecffcdc
> Author: Nguyen Dinh Phi <phind.uet@gmail.com>
> Date:   Tue Jul 6 07:19:12 2021 +0800
>
>     tcp: fix tcp_init_transfer() to not reset icsk_ca_initialized
>
> Just looking at the diff it removes a icsk->icsk_ca_initialized = 0; -
> the only other place this gets set to 0 is in tcp_disconnect() ... and
> to 1 in tcp_init_congestion_control() - so I think we might have an
> uninitialized variable here ... then again tcp_init_socket mentions
> explicitly that sk_alloc set lots of stuff to 0 - still bugs me that the
> original commit (8919a9b31eb4) felt the need to set an explicit 0 in
> tcp_init_transfer().

I do not think this commit is related to the issue you have.

I guess you could try a revert ?

Then, if you think old linux versions were ok, start a bisection ?

Thank you.

(I do not see why a successful TFO would lead to a freeze after ~70 KB
of data has been sent)

>
> >
> > Can you please share the exact kernel version of the client machine?
> Our side (client) is 5.17.1 (side that initiates TCP/IP connection), I
> obviously can't comment for the Google side (server).
> > Also, can you please summarize/clarify whether you think the client,
> > server, or both are misbehaving?
>
> client is re-transmitting frames for which it has already received an
> ACK from the server.  In pcap from frames 105 onwards one can start
> seeing retransmits, then first "spurious retransmission" as wireshark
> labels it from frames 122 onwards.
>
> Kind Regards,
> Jaco

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30  3:48     ` Eric Dumazet
@ 2022-03-30  6:22       ` Jaco Kroon
  2022-03-30 13:56         ` Neal Cardwell
  0 siblings, 1 reply; 38+ messages in thread
From: Jaco Kroon @ 2022-03-30  6:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

Hi Eric,

On 2022/03/30 05:48, Eric Dumazet wrote:
> On Tue, Mar 29, 2022 at 7:58 PM Jaco Kroon <jaco@uls.co.za> wrote:
>
> I do not think this commit is related to the issue you have.
>
> I guess you could try a revert ?
>
> Then, if you think old linux versions were ok, start a bisection ?
That'll be interesting, will see if I can reproduce on a non-production
host.
>
> Thank you.
>
> (I do not see why a successful TFO would lead to a freeze after ~70 KB
> of data has been sent)

I do actually agree with this in that it makes no sense, but disabling
TFO definitely resolved the issue for us.

Kind Regards,
Jaco

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30  6:22       ` Jaco Kroon
@ 2022-03-30 13:56         ` Neal Cardwell
  2022-03-30 15:00           ` Jaco Kroon
  0 siblings, 1 reply; 38+ messages in thread
From: Neal Cardwell @ 2022-03-30 13:56 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng

On Wed, Mar 30, 2022 at 2:22 AM Jaco Kroon <jaco@uls.co.za> wrote:
>
> Hi Eric,
>
> On 2022/03/30 05:48, Eric Dumazet wrote:
> > On Tue, Mar 29, 2022 at 7:58 PM Jaco Kroon <jaco@uls.co.za> wrote:
> >
> > I do not think this commit is related to the issue you have.
> >
> > I guess you could try a revert ?
> >
> > Then, if you think old linux versions were ok, start a bisection ?
> That'll be interesting, will see if I can reproduce on a non-production
> host.
> >
> > Thank you.
> >
> > (I do not see why a successful TFO would lead to a freeze after ~70 KB
> > of data has been sent)
>
> I do actually agree with this in that it makes no sense, but disabling
> TFO definitely resolved the issue for us.
>
> Kind Regards,
> Jaco

Thanks for the pcap trace! That's a pretty strange trace. I agree with
Eric's theory that this looks like one or more bugs in a firewall,
middlebox, or netfilter rule. From the trace it looks like the buggy
component is sometimes dropping packets and sometimes corrupting them
so that the client's TCP stack ignores them.

Interestingly, in that trace the client SYN has a TFO option and
cookie, but no data in the SYN.

The last packet that looks sane/normal is the ACK from the SMTP server
that looks like:

00:00:00.000010 IP6 2a00:1450:4013:c16::1a.25 >
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . 6260:6260(0) ack 66263 win
774 <nop,nop,TS val 1206544341 ecr 331189186>

That's the first ACK that crosses past 2^16. Maybe that is a
coincidence, or maybe not. Perhaps the buggy firewall/middlebox/etc is
confused by the TFO option, corrupts its state, and thereafter behaves
incorrectly past the first 64 KBytes of data from the client.

In addition to checking for checksum failures, mentioned by Eric, you
could look for PAWS failures, something like:

  nstat -az | egrep  -i 'TcpInCsumError|PAWS'

best,
neal

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30 13:56         ` Neal Cardwell
@ 2022-03-30 15:00           ` Jaco Kroon
  2022-03-30 16:19             ` Eric Dumazet
  0 siblings, 1 reply; 38+ messages in thread
From: Jaco Kroon @ 2022-03-30 15:00 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng

Hi,

On 2022/03/30 15:56, Neal Cardwell wrote:
> On Wed, Mar 30, 2022 at 2:22 AM Jaco Kroon <jaco@uls.co.za> wrote:
>> Hi Eric,
>>
>> On 2022/03/30 05:48, Eric Dumazet wrote:
>>> On Tue, Mar 29, 2022 at 7:58 PM Jaco Kroon <jaco@uls.co.za> wrote:
>>>
>>> I do not think this commit is related to the issue you have.
>>>
>>> I guess you could try a revert ?
>>>
>>> Then, if you think old linux versions were ok, start a bisection ?
>> That'll be interesting, will see if I can reproduce on a non-production
>> host.
>>> Thank you.
>>>
>>> (I do not see why a successful TFO would lead to a freeze after ~70 KB
>>> of data has been sent)
>> I do actually agree with this in that it makes no sense, but disabling
>> TFO definitely resolved the issue for us.
>>
>> Kind Regards,
>> Jaco
> Thanks for the pcap trace! That's a pretty strange trace. I agree with
> Eric's theory that this looks like one or more bugs in a firewall,
> middlebox, or netfilter rule. From the trace it looks like the buggy
> component is sometimes dropping packets and sometimes corrupting them
> so that the client's TCP stack ignores them.
The capture was taken on the client.  So the only firewall there is
iptables, and I redirected all -j DROP statements to an L_DROP chain
which does a -j LOG prior to the -j DROP - it didn't log any drops here.
>
> Interestingly, in that trace the client SYN has a TFO option and
> cookie, but no data in the SYN.

So this allows the SMTP server, which speaks first in the conversation to
identify itself, to respond with data in its SYN-ACK (not sure that was
actually happening, but if I recall correctly I did see it send data prior
to receiving the final ACK of the handshake).

>
> The last packet that looks sane/normal is the ACK from the SMTP server
> that looks like:
>
> 00:00:00.000010 IP6 2a00:1450:4013:c16::1a.25 >
> 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . 6260:6260(0) ack 66263 win
> 774 <nop,nop,TS val 1206544341 ecr 331189186>
>
> That's the first ACK that crosses past 2^16. Maybe that is a
> coincidence, or maybe not. Perhaps the buggy firewall/middlebox/etc is

I believe it should be because we literally had this on every single
connection going out to Google's SMTP ... probably 1/100 connections
managed to deliver an email over the connection.  Then again ... 64KB
isn't that much ...

When you state sane/normal, do you mean there is a fault with the other
frames that could not be explained by packet loss in one or both of the
directions?

> confused by the TFO option, corrupts its state, and thereafter behaves
> incorrectly past the first 64 KBytes of data from the client.

The only firewalls we've got are netfilter based, and these packets have
all passed through the dedicated firewalls by the time they reach here.
There are no middleboxes on our end, and if this were on Google's side there
would be a lot of noise about it, not just from me.  I think the trigger is
packet loss between us (as indicated we know they have link congestion
issues in JHB area, it took us the better part of two weeks to get the
first line tech on their side to just query the internal teams and
probably another week to get the response acknowledging this -
mybroadband.co.za has an article about other local ISPs also complaining).

>
> In addition to checking for checksum failures, mentioned by Eric, you
> could look for PAWS failures, something like:
>
>   nstat -az | egrep  -i 'TcpInCsumError|PAWS'

TcpInCsumErrors                 0                  0.0
TcpExtPAWSActive                0                  0.0
TcpExtPAWSEstab                 90092              0.0
TcpExtTCPACKSkippedPAWS         81317              0.0

Not sure what these mean, but I should probably investigate; the latter
two are definitely incrementing.

Appreciate the feedback, and thanks for looking at the traces.

Kind Regards,
Jaco

^ permalink raw reply	[flat|nested] 38+ messages in thread
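To keep an eye on the counters reported above, a short Python sketch can
parse nstat-style output and list the non-zero ones. The sample text is the
output pasted in the message; the helper names are made up for this
illustration. As far as I understand the counters, TcpExtPAWSEstab counts
segments on established connections that failed the timestamp (PAWS) check,
and TcpExtTCPACKSkippedPAWS counts ACKs the stack declined to send in
response to such segments, but treat that reading as an assumption.

```python
# Sample output copied from the message above.
NSTAT_OUTPUT = """\
TcpInCsumErrors                 0                  0.0
TcpExtPAWSActive                0                  0.0
TcpExtPAWSEstab                 90092              0.0
TcpExtTCPACKSkippedPAWS         81317              0.0
"""


def parse_nstat(text):
    """Parse 'name value rate' lines into a {name: value} dictionary."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the '#kernel' header that nstat prints.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        counters[fields[0]] = int(fields[1])
    return counters


def nonzero(counters):
    """Return only the counters that have a non-zero value."""
    return {name: value for name, value in counters.items() if value}


if __name__ == "__main__":
    for name, value in nonzero(parse_nstat(NSTAT_OUTPUT)).items():
        print(f"{name}: {value}")
```

Since "nstat -az" prints absolute values, taking two snapshots a few seconds
apart while a connection is stuck and comparing them would show whether these
counters move together with the stall.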
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-03-30 15:00 ` Jaco Kroon @ 2022-03-30 16:19 ` Eric Dumazet 2022-03-31 15:41 ` Neal Cardwell 0 siblings, 1 reply; 38+ messages in thread From: Eric Dumazet @ 2022-03-30 16:19 UTC (permalink / raw) To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng On Wed, Mar 30, 2022 at 9:04 AM Jaco Kroon <jaco@uls.co.za> wrote: > > Hi, > > On 2022/03/30 15:56, Neal Cardwell wrote: > > On Wed, Mar 30, 2022 at 2:22 AM Jaco Kroon <jaco@uls.co.za> wrote: > >> Hi Eric, > >> > >> On 2022/03/30 05:48, Eric Dumazet wrote: > >>> On Tue, Mar 29, 2022 at 7:58 PM Jaco Kroon <jaco@uls.co.za> wrote: > >>> > >>> I do not think this commit is related to the issue you have. > >>> > >>> I guess you could try a revert ? > >>> > >>> Then, if you think old linux versions were ok, start a bisection ? > >> That'll be interesting, will see if I can reproduce on a non-production > >> host. > >>> Thank you. > >>> > >>> (I do not see why a successful TFO would lead to a freeze after ~70 KB > >>> of data has been sent) > >> I do actually agree with this in that it makes no sense, but disabling > >> TFO definitely resolved the issue for us. > >> > >> Kind Regards, > >> Jaco > > Thanks for the pcap trace! That's a pretty strange trace. I agree with > > Eric's theory that this looks like one or more bugs in a firewall, > > middlebox, or netfilter rule. From the trace it looks like the buggy > > component is sometimes dropping packets and sometimes corrupting them > > so that the client's TCP stack ignores them. > The capture was taken on the client. So the only firewall there is > iptables, and I redirected all -j DROP statements to a L_DROP chain > which did a -j LOG prior to -j DROP - didn't pick up any drops here. > > > > Interestingly, in that trace the client SYN has a TFO option and > > cookie, but no data in the SYN. 
> > So this allows the SMTP server which in the conversation speaks first to > identify itself to respond with data in the SYN (not sure that was > actually happening but if I recall I did see it send data prior to > receiving the final ACK on the handshake. > > > > > The last packet that looks sane/normal is the ACK from the SMTP server > > that looks like: > > > > 00:00:00.000010 IP6 2a00:1450:4013:c16::1a.25 > > > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . 6260:6260(0) ack 66263 win > > 774 <nop,nop,TS val 1206544341 ecr 331189186> > > > > That's the first ACK that crosses past 2^16. Maybe that is a > > coincidence, or maybe not. Perhaps the buggy firewall/middlebox/etc is > > I believe it should be because we literally had this on every single > connection going out to Google's SMTP ... probably 1/100 connections > managed to deliver an email over the connection. Then again ... 64KB > isn't that much ... > > When you state sane/normal, do you mean there is fault with the other > frames that could not be explained by packet loss in one or both of the > directions? > > > confused by the TFO option, corrupts its state, and thereafter behaves > > incorrectly past the first 64 KBytes of data from the client. > > Only firewalls we've got are netfilter based, and these packets all > passed through the dedicated firewalls at least by the time they reach > here. No middleboxes on our end, and if this was Google's side there > would be crazy noise be heard, not just me. I think the trigger is > packet loss between us (as indicated we know they have link congestion > issues in JHB area, it took us the better part of two weeks to get the > first line tech on their side to just query the internal teams and > probably another week to get the response acknowledging this - > mybroadband.co.za has an article about other local ISPs also complaining). 
> >
> > In addition to checking for checksum failures, mentioned by Eric, you
> > could look for PAWS failures, something like:
> >
> >   nstat -az | egrep  -i 'TcpInCsumError|PAWS'
>
> TcpInCsumErrors                 0                  0.0
> TcpExtPAWSActive                0                  0.0
> TcpExtPAWSEstab                 90092              0.0
> TcpExtTCPACKSkippedPAWS         81317              0.0
>
> Not sure what these mean, but i should probably investigate, the latter
> two are definitely incrementing.
>
> Appreciate the feedback and for looking at the traces.
>

Your pcap does not show any obvious PAWS issues.

If the host is lightly loaded you could try while the connection is
attempted/frozen

perf record -a -g -e skb:kfree_skb sleep 30
perf script (or perf report)

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-03-30 16:19             ` Eric Dumazet
@ 2022-03-31 15:41               ` Neal Cardwell
  2022-03-31 23:06                 ` Jaco Kroon
  0 siblings, 1 reply; 38+ messages in thread
From: Neal Cardwell @ 2022-03-31 15:41 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Jaco Kroon, LKML, Netdev, Yuchung Cheng

[-- Attachment #1: Type: text/plain, Size: 2126 bytes --]

On Wed, Mar 30, 2022 at 9:04 AM Jaco Kroon <jaco@uls.co.za> wrote:
...
> When you state sane/normal, do you mean there is fault with the other
> frames that could not be explained by packet loss in one or both of the
> directions?

Yes.

(1) If you look at the attached trace time/sequence plots (from
tcptrace and xplot.org) there are several behaviors that do not look
like normal congestive packet loss:

  (a) Literally *all* original transmissions (white segments in the
  plot) of packets after client sequence 66263 appear lost (are not
  ACKed). Congestion generally does not behave like that. But broken
  firewalls/middleboxes do.
  (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-out.png )

  (b) When the client is retransmitting packets, only packets at
  exactly snd_una are ACKed. The packets beyond that point are always
  un-ACKed. Again sounds like a broken firewall/middlebox.
  (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-in.png )

  (c) After the client receives the server's "ack 73403", the client
  ignores/drops all other incoming packets that show up in the trace.

As Eric notes, this doesn't look like a PAWS issue. And it doesn't
look like a checksum or sequence/ACK validation issue. The client
starts ignoring ACKs between two ACKs that have correct checksums,
valid ACK numbers, and valid (identical) sequence numbers and TS val
and ecr values (here showing absolute sequence/ACK numbers):

    (i) The client processes this ACK and uses it to advance snd_una:
    17:46:49.889911 IP6 (flowlabel 0x97427, hlim 61, next-header TCP
    (6) payload length: 32) 2a00:1450:4013:c16::1a.25 >
    2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x7005 (correct)
    2699968514:2699968514(0) ack 3451415932 win 830 <nop,nop,TS val
    1206546583 ecr 331191428>

    (ii) The client ignores this ACK and all later ACKs:
    17:46:49.889912 IP6 (flowlabel 0x97427, hlim 61, next-header TCP
    (6) payload length: 32) 2a00:1450:4013:c16::1a.25 >
    2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x6a66 (correct)
    2699968514:2699968514(0) ack 3451417360 win 841 <nop,nop,TS val
    1206546583 ecr 331191428>

neal

[-- Attachment #2: netdev-2022-03-29-tcp-disregarded-acks-zoomed-out.png --]
[-- Type: image/png, Size: 131216 bytes --]

[-- Attachment #3: netdev-2022-03-29-tcp-disregarded-acks-zoomed-in.png --]
[-- Type: image/png, Size: 128102 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-03-31 15:41 ` Neal Cardwell @ 2022-03-31 23:06 ` Jaco Kroon 2022-04-01 0:10 ` Eric Dumazet 2022-04-01 14:50 ` Neal Cardwell 0 siblings, 2 replies; 38+ messages in thread From: Jaco Kroon @ 2022-03-31 23:06 UTC (permalink / raw) To: Neal Cardwell, Eric Dumazet; +Cc: LKML, Netdev, Yuchung Cheng Hi Neal, This sniff was grabbed ON THE CLIENT HOST. There is no middlebox or anything between the sniffer and the client. Only the firewall on the host itself, where we've already establish the traffic is NOT DISCARDED (at least not in filter/INPUT). Setup on our end: 2 x routers, usually each with a direct peering with Google (which is being ignored at the moment so instead traffic is incoming via IPT over DD). Connected via switch to 2 x firewalls, of which ONE is active (they have different networks behind them, and could be active / standby for different networks behind them - avoiding active-active because conntrackd is causing more trouble than it's worth), Linux hosts, using netfilter, has been operating for years, no recent kernel upgrades. 4 x hosts in mail cluster, one of which you're looking at here. On 2022/03/31 17:41, Neal Cardwell wrote: > On Wed, Mar 30, 2022 at 9:04 AM Jaco Kroon <jaco@uls.co.za> wrote: > ... >> When you state sane/normal, do you mean there is fault with the other >> frames that could not be explained by packet loss in one or both of the >> directions? > Yes. > > (1) If you look at the attached trace time/sequence plots (from > tcptrace and xplot.org) there are several behaviors that do not look > like normal congestive packet loss: OK. I'm not 100% sure how these plots of yours work, but let's see if I can follow your logic here - they mostly make sense. A legend would probably help. As I understand the white dots are original transmits, green is what has been ACKED. R is retransmits ... what's the S? 
What's the yellow line (I'm guessing receive window as advertised by the server)? > > (a) Literally *all* original transmissions (white segments in the > plot) of packets after client sequence 66263 appear lost (are not > ACKed). Congestion generally does not behave like that. But broken > firewalls/middleboxes do. > (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-out.png ) Agreed. So could it be that something in the transit path towards Google is actually dropping all of that? As stated - I highly doubt this is on our network unless newer kernel (on mail cluster) is doing stuff which is causing older netfilter to drop perhaps? But this doesn't explain why newer kernel retransmits data for which it received an ACK. > > (b) When the client is retransmitting packets, only packets at > exactly snd_una are ACKed. The packets beyond that point are always > un-ACKed. Again sounds like a broken firewall/middlebox. > (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-in.png ) No middlebox between packet sniffer and client ... client here is linux 5.17.1. Brings me back to the only thing that could be dropping the traffic is netfilter on the host, or the kernel doesn't like something about the ACK, or kernel is doing something else wrong as a result of TFO. I'm not sure which option I like less. Unfortunately I also use netfilter for redirecting traffic into haproxy here so can't exactly just switch off netfilter. > > (c) After the client receives the server's "ack 73403", the client > ignores/drops all other incoming packets that show up in the trace. Agreed. However, if I read your graph correctly, it gets an ACK for frame X at ~3.8s into the connection, then for X+2 at 4s, but it keeps retransmitting X+2, not X+1? > > As Eric notes, this doesn't look like a PAWS issue. And it > doesn't look like a checksum or sequence/ACK validation issue. 
The > client starts ignoring ACKs between two ACKs that have correct > checksums, valid ACK numbers, and valid (identical) sequence numbers > and TS val and ecr values (here showing absolute sequence/ACK > numbers): I'm not familiar with PAWS here. Assuming that the green line is ACKs, then at around 4s we get an ACK that basically ACKs two frames in one (which is fine from my understanding of TCP), and then the second of these frames keeps getting retransmitted going forward, so it's almost like the kernel ACKs the *first* of these two frames but not the second. > > (i) The client processes this ACK and uses it to advance snd_una: > 17:46:49.889911 IP6 (flowlabel 0x97427, hlim 61, next-header TCP > (6) payload length: 32) 2a00:1450:4013:c16::1a.25 > > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x7005 (correct) > 2699968514:2699968514(0) ack 3451415932 win 830 <nop,nop,TS val > 1206546583 ecr 331191428> > > (ii) The client ignores this ACK and all later ACKs: > 17:46:49.889912 IP6 (flowlabel 0x97427, hlim 61, next-header TCP > (6) payload length: 32) 2a00:1450:4013:c16::1a.25 > > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x6a66 (correct) > 2699968514:2699968514(0) ack 3451417360 win 841 <nop,nop,TS val > 1206546583 ecr 331191428> > > neal ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-03-31 23:06 ` Jaco Kroon @ 2022-04-01 0:10 ` Eric Dumazet 2022-04-01 0:15 ` Florian Westphal 2022-04-01 0:33 ` Jaco Kroon 2022-04-01 14:50 ` Neal Cardwell 1 sibling, 2 replies; 38+ messages in thread From: Eric Dumazet @ 2022-04-01 0:10 UTC (permalink / raw) To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng On Thu, Mar 31, 2022 at 4:06 PM Jaco Kroon <jaco@uls.co.za> wrote: > > Hi Neal, > > This sniff was grabbed ON THE CLIENT HOST. There is no middlebox or > anything between the sniffer and the client. Only the firewall on the > host itself, where we've already establish the traffic is NOT DISCARDED > (at least not in filter/INPUT). > > Setup on our end: > > 2 x routers, usually each with a direct peering with Google (which is > being ignored at the moment so instead traffic is incoming via IPT over DD). > > Connected via switch to > > 2 x firewalls, of which ONE is active (they have different networks > behind them, and could be active / standby for different networks behind > them - avoiding active-active because conntrackd is causing more trouble > than it's worth), Linux hosts, using netfilter, has been operating for > years, no recent kernel upgrades. Next step would be to attempt removing _all_ firewalls, especially not common setups like yours. conntrack had a bug preventing TFO deployment for a while, because many boxes kept buggy kernel versions for years. 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix tcp_in_window for Fast Open ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-01 0:10 ` Eric Dumazet @ 2022-04-01 0:15 ` Florian Westphal 2022-04-01 11:54 ` Jaco Kroon 2022-04-01 0:33 ` Jaco Kroon 1 sibling, 1 reply; 38+ messages in thread From: Florian Westphal @ 2022-04-01 0:15 UTC (permalink / raw) To: Eric Dumazet; +Cc: Jaco Kroon, Neal Cardwell, LKML, Netdev, Yuchung Cheng Eric Dumazet <edumazet@google.com> wrote: > Next step would be to attempt removing _all_ firewalls, especially not > common setups like yours. > > conntrack had a bug preventing TFO deployment for a while, because > many boxes kept buggy kernel versions for years. > > 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix > tcp_in_window for Fast Open Jaco could also try with net.netfilter.nf_conntrack_tcp_be_liberal=1 and, if that helps, with liberal=0 and sysctl net.netfilter.nf_conntrack_log_invalid=6 (check dmesg/syslog/nflog). ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01  0:15 ` Florian Westphal
@ 2022-04-01 11:54 ` Jaco Kroon
  2022-04-01 12:09 ` Florian Westphal
  0 siblings, 1 reply; 38+ messages in thread
From: Jaco Kroon @ 2022-04-01 11:54 UTC (permalink / raw)
  To: Florian Westphal, Eric Dumazet; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

Hi,

On 2022/04/01 02:15, Florian Westphal wrote:

Incidentally, I always find your initials to be interesting considering
(as far as I know) you work on netfilter firewall.

> Eric Dumazet <edumazet@google.com> wrote:
>> Next step would be to attempt removing _all_ firewalls, especially not
>> common setups like yours.
>>
>> conntrack had a bug preventing TFO deployment for a while, because
>> many boxes kept buggy kernel versions for years.
>>
>> 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix
>> tcp_in_window for Fast Open
> Jaco could also try with
> net.netfilter.nf_conntrack_tcp_be_liberal=1
>
> and, if that helps, with liberal=0 and
> sysctl net.netfilter.nf_conntrack_log_invalid=6
>
> (check dmesg/syslog/nflog).

Our core firewalls already had nf_conntrack_tcp_be_liberal for other
reasons (asymmetric routing combined with conntrackd left-over if I
recall), so maybe that's why it got through there ... don't exactly want
to just flip that setting though, is there a way to log if it would have
dropped anything, without actually dropping it (yet)?

Will do this first, but I first need to confirm that I can reproduce in
a dev environment.

Kind Regards,
Jaco

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-01 11:54 ` Jaco Kroon @ 2022-04-01 12:09 ` Florian Westphal 0 siblings, 0 replies; 38+ messages in thread From: Florian Westphal @ 2022-04-01 12:09 UTC (permalink / raw) To: Jaco Kroon Cc: Florian Westphal, Eric Dumazet, Neal Cardwell, LKML, Netdev, Yuchung Cheng Jaco Kroon <jaco@uls.co.za> wrote: > > Eric Dumazet <edumazet@google.com> wrote: > >> Next step would be to attempt removing _all_ firewalls, especially not > >> common setups like yours. > >> > >> conntrack had a bug preventing TFO deployment for a while, because > >> many boxes kept buggy kernel versions for years. > >> > >> 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix > >> tcp_in_window for Fast Open > > Jaco could also try with > > net.netfilter.nf_conntrack_tcp_be_liberal=1 > > > > and, if that helps, with liberal=0 and > > sysctl net.netfilter.nf_conntrack_log_invalid=6 > > > > (check dmesg/syslog/nflog). > > Our core firewalls already had nf_conntrack_tcp_be_liberal for other > reasons (asymmetric routing combined with conntrackd left-over if I > recall), so maybe that's why it got through there ... don't exactly want > to just flip that setting though, is there a way to log if it would have > dropped anything, without actually dropping it (yet)? This means conntrack doesn't tag packets as invalid EVEN if it would consider sequence/ack out-of-window (e.g. due to a bug). I have a hard time seeing how tcp liberal-mode conntrack would be to blame here. Only thing you could also check is if net.netfilter.nf_conntrack_checksum=0 helps (but i doubt it). ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-01 0:10 ` Eric Dumazet 2022-04-01 0:15 ` Florian Westphal @ 2022-04-01 0:33 ` Jaco Kroon 2022-04-01 0:41 ` Eric Dumazet 1 sibling, 1 reply; 38+ messages in thread From: Jaco Kroon @ 2022-04-01 0:33 UTC (permalink / raw) To: Eric Dumazet; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng Hi, On 2022/04/01 02:10, Eric Dumazet wrote: > On Thu, Mar 31, 2022 at 4:06 PM Jaco Kroon <jaco@uls.co.za> wrote: >> Hi Neal, >> >> This sniff was grabbed ON THE CLIENT HOST. There is no middlebox or >> anything between the sniffer and the client. Only the firewall on the >> host itself, where we've already establish the traffic is NOT DISCARDED >> (at least not in filter/INPUT). >> >> Setup on our end: >> >> 2 x routers, usually each with a direct peering with Google (which is >> being ignored at the moment so instead traffic is incoming via IPT over DD). >> >> Connected via switch to >> >> 2 x firewalls, of which ONE is active (they have different networks >> behind them, and could be active / standby for different networks behind >> them - avoiding active-active because conntrackd is causing more trouble >> than it's worth), Linux hosts, using netfilter, has been operating for >> years, no recent kernel upgrades. > Next step would be to attempt removing _all_ firewalls, especially not > common setups like yours. That I'm afraid is not going to happen here. I can't imagine what we're doing is that uncommon. On the host basically for INPUT drop invalid, ACCEPT related established, accept specific ports, drop everything else. Other than the redirects in NAT there really isn't anything "funny". > > conntrack had a bug preventing TFO deployment for a while, because > many boxes kept buggy kernel versions for years. 
We don't use conntrackd, we tried many years back, but eventually we just ended up using ucarp with /32s on the interfaces and whatever subnet is required for the floating IP itself, combined with OSPF to sort out the routing, that way we get to avoid asymmetric routing and the need for conntrackd. The core firewalls basically on FORWARD does some directing based on ingress and/or egress interface to determine ruleset to apply, again INVALID and RELATED,ESTABLISHED rules at the head. > > 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix > tcp_in_window for Fast Open This is from Aug 9, 2013 ... our firewall's kernel isn't that old :). Again, the traffic was sniffed on the client side of that firewall, and the only firewall between the sniffer and the processing part of the kernel is the local netfilter. I'll deploy same on a dev host we've got in the coming week and start a bisect process. Kind Regards, Jaco ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-01 0:33 ` Jaco Kroon @ 2022-04-01 0:41 ` Eric Dumazet 2022-04-01 0:54 ` Eric Dumazet 0 siblings, 1 reply; 38+ messages in thread From: Eric Dumazet @ 2022-04-01 0:41 UTC (permalink / raw) To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng On Thu, Mar 31, 2022 at 5:33 PM Jaco Kroon <jaco@uls.co.za> wrote: > I'll deploy same on a dev host we've got in the coming week and start a > bisect process. Thanks, this will definitely help. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-01 0:41 ` Eric Dumazet @ 2022-04-01 0:54 ` Eric Dumazet 2022-04-01 11:36 ` Jaco Kroon 0 siblings, 1 reply; 38+ messages in thread From: Eric Dumazet @ 2022-04-01 0:54 UTC (permalink / raw) To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng On Thu, Mar 31, 2022 at 5:41 PM Eric Dumazet <edumazet@google.com> wrote: > > On Thu, Mar 31, 2022 at 5:33 PM Jaco Kroon <jaco@uls.co.za> wrote: > > > I'll deploy same on a dev host we've got in the coming week and start a > > bisect process. > > Thanks, this will definitely help. One thing I noticed in your pcap is a good amount of drops, as if Hystart was not able to stop slow-start before the drops are happening. TFO with one less RTT at connection establishment could be the trigger. If you are still using cubic, please try to revert. commit 4e1fddc98d2585ddd4792b5e44433dcee7ece001 Author: Eric Dumazet <edumazet@google.com> Date: Tue Nov 23 12:25:35 2021 -0800 tcp_cubic: fix spurious Hystart ACK train detections for not-cwnd-limited flows ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-01 0:54 ` Eric Dumazet @ 2022-04-01 11:36 ` Jaco Kroon 2022-04-01 13:54 ` Eric Dumazet 0 siblings, 1 reply; 38+ messages in thread From: Jaco Kroon @ 2022-04-01 11:36 UTC (permalink / raw) To: Eric Dumazet; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng Hi Eric, On 2022/04/01 02:54, Eric Dumazet wrote: > On Thu, Mar 31, 2022 at 5:41 PM Eric Dumazet <edumazet@google.com> wrote: >> On Thu, Mar 31, 2022 at 5:33 PM Jaco Kroon <jaco@uls.co.za> wrote: >> >>> I'll deploy same on a dev host we've got in the coming week and start a >>> bisect process. >> Thanks, this will definitely help. > One thing I noticed in your pcap is a good amount of drops, as if > Hystart was not able to stop slow-start before the drops are > happening. > > TFO with one less RTT at connection establishment could be the trigger. > > If you are still using cubic, please try to revert. Sorry, I understand TCP itself a bit, but I've given up trying to understand the various schedulers a long time ago and am just using the defaults that the kernel provides. How do I check what I'm using, and how can I change that? What is recommended at this stage? > > > commit 4e1fddc98d2585ddd4792b5e44433dcee7ece001 > Author: Eric Dumazet <edumazet@google.com> > Date: Tue Nov 23 12:25:35 2021 -0800 > > tcp_cubic: fix spurious Hystart ACK train detections for > not-cwnd-limited flows Ok, instead of starting with bisect, if I can reproduce in dev I'll use this one first. Kind Regards, Jaco ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-01 11:36 ` Jaco Kroon
@ 2022-04-01 13:54 ` Eric Dumazet
  0 siblings, 0 replies; 38+ messages in thread
From: Eric Dumazet @ 2022-04-01 13:54 UTC (permalink / raw)
  To: Jaco Kroon; +Cc: Neal Cardwell, LKML, Netdev, Yuchung Cheng

On Fri, Apr 1, 2022 at 4:36 AM Jaco Kroon <jaco@uls.co.za> wrote:
>
> Hi Eric,
>
> On 2022/04/01 02:54, Eric Dumazet wrote:
> > On Thu, Mar 31, 2022 at 5:41 PM Eric Dumazet <edumazet@google.com> wrote:
> >> On Thu, Mar 31, 2022 at 5:33 PM Jaco Kroon <jaco@uls.co.za> wrote:
> >>
> >>> I'll deploy same on a dev host we've got in the coming week and start a
> >>> bisect process.
> >> Thanks, this will definitely help.
> > One thing I noticed in your pcap is a good amount of drops, as if
> > Hystart was not able to stop slow-start before the drops are
> > happening.
> >
> > TFO with one less RTT at connection establishment could be the trigger.
> >
> > If you are still using cubic, please try to revert.
> Sorry, I understand TCP itself a bit, but I've given up trying to
> understand the various schedulers a long time ago and am just using the
> defaults that the kernel provides.  How do I check what I'm using, and
> how can I change that?  What is recommended at this stage?

How to check:

cat /proc/sys/net/ipv4/tcp_congestion_control

This is of course orthogonal to the bug we are tracking here,
but given your long RTT, I would recommend using fq packet scheduler and bbr.

tc qd replace dev eth0 root fq   # or use mq+fq if your NIC is multi queue and you need a good amount of throughput
modprobe tcp_bbr                 # (after enabling CONFIG_TCP_CONG_BBR=m)
echo bbr >/proc/sys/net/ipv4/tcp_congestion_control

> >
> >
> > commit 4e1fddc98d2585ddd4792b5e44433dcee7ece001
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Tue Nov 23 12:25:35 2021 -0800
> >
> >     tcp_cubic: fix spurious Hystart ACK train detections for
> > not-cwnd-limited flows
> Ok, instead of starting with bisect, if I can reproduce in dev I'll use
> this one first.

Thanks !

(again this won't fix the bug, this is really a shot in the dark)

^ permalink raw reply [flat|nested] 38+ messages in thread
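Editorial sketch for the congestion-control switch described above. Writing a name to tcp_congestion_control fails if that algorithm is not loaded, so it can help to check the list in /proc/sys/net/ipv4/tcp_available_congestion_control first. The helper name `cc_available` is hypothetical, and this is only a sketch of one way to do the check.

```shell
#!/bin/sh
# cc_available NAME "LIST"
# Hypothetical helper: succeeds if congestion control NAME appears as a whole
# word in the space-separated LIST.
cc_available() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example usage on a live host (not run here; switching needs root):
#   avail="$(cat /proc/sys/net/ipv4/tcp_available_congestion_control)"
#   if cc_available bbr "$avail"; then
#       echo bbr > /proc/sys/net/ipv4/tcp_congestion_control
#   else
#       echo "bbr not listed; load the tcp_bbr module first" >&2
#   fi
```

Padding both sides with spaces keeps a partial name such as "bb" from matching "bbr".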
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-03-31 23:06 ` Jaco Kroon 2022-04-01 0:10 ` Eric Dumazet @ 2022-04-01 14:50 ` Neal Cardwell 2022-04-01 15:39 ` Neal Cardwell 1 sibling, 1 reply; 38+ messages in thread From: Neal Cardwell @ 2022-04-01 14:50 UTC (permalink / raw) To: Jaco Kroon; +Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng On Thu, Mar 31, 2022 at 7:06 PM Jaco Kroon <jaco@uls.co.za> wrote: > > Hi Neal, > > This sniff was grabbed ON THE CLIENT HOST. There is no middlebox or > anything between the sniffer and the client. Only the firewall on the > host itself, where we've already establish the traffic is NOT DISCARDED > (at least not in filter/INPUT). Yes, understood. Please excuse my general use of the term "firewalls/middleboxes" even where in some contexts it's clear the "middleboxes" aspect of that term could not apply. :-) > Setup on our end: > > 2 x routers, usually each with a direct peering with Google (which is > being ignored at the moment so instead traffic is incoming via IPT over DD). > > Connected via switch to > > 2 x firewalls, of which ONE is active (they have different networks > behind them, and could be active / standby for different networks behind > them - avoiding active-active because conntrackd is causing more trouble > than it's worth), Linux hosts, using netfilter, has been operating for > years, no recent kernel upgrades. > > 4 x hosts in mail cluster, one of which you're looking at here. > > On 2022/03/31 17:41, Neal Cardwell wrote: > > On Wed, Mar 30, 2022 at 9:04 AM Jaco Kroon <jaco@uls.co.za> wrote: > > ... > >> When you state sane/normal, do you mean there is fault with the other > >> frames that could not be explained by packet loss in one or both of the > >> directions? > > Yes. > > > > (1) If you look at the attached trace time/sequence plots (from > > tcptrace and xplot.org) there are several behaviors that do not look > > like normal congestive packet loss: > OK. 
I'm not 100% sure how these plots of yours work, but let's see if I > can follow your logic here - they mostly make sense. A legend would > probably help. As I understand the white dots are original transmits, > green is what has been ACKED. R is retransmits ... what's the S? "S" is "SACKed", or selectively acknowledged. The SACK blocks below the green ACK lines are DSACK blocks, for "Duplicate SACKs", indicating the receiver has already received that sequence range. > What's the yellow line (I'm guessing receive window as advertised by the > server)? Yes, the yellow line is the right edge of the receive window of the server. > > (a) Literally *all* original transmissions (white segments in the > > plot) of packets after client sequence 66263 appear lost (are not > > ACKed). Congestion generally does not behave like that. But broken > > firewalls/middleboxes do. > > (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-out.png ) > > Agreed. So could it be that something in the transit path towards > Google is actually dropping all of that? It could be. Or it could be a firewall/middlebox. > As stated - I highly doubt this is on our network unless newer kernel > (on mail cluster) is doing stuff which is causing older netfilter to > drop perhaps? But this doesn't explain why newer kernel retransmits > data for which it received an ACK. Yes, I agree that the biggest problem to focus on is the TCP code in the kernel retransmitting data for which the NIC is receiving ACKs. > > > > (b) When the client is retransmitting packets, only packets at > > exactly snd_una are ACKed. The packets beyond that point are always > > un-ACKed. Again sounds like a broken firewall/middlebox. > > (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-in.png ) > No middlebox between packet sniffer and client ... client here is linux > 5.17.1. 
Brings me back to the only thing that could be dropping the > traffic is netfilter on the host, or the kernel doesn't like something > about the ACK, or kernel is doing something else wrong as a result of > TFO. I'm not sure which option I like less. Unfortunately I also use > netfilter for redirecting traffic into haproxy here so can't exactly > just switch off netfilter. Given the most problematic aspect of the trace, where the client-side TCP connection is repeatedly retransmitting packets for which ACKs are arriving at the NIC (and captured by tcpdump), it seems some software in your kernel is dropping packets between the network device and the TCP layer. Given that you mention "the only thing that could be dropping the traffic is netfilter on the host", it seems like the netfilter rules or software are buggy. A guess would be that the netfilter code is getting into a bad state due to the TFO behavior where there is a data packet arriving from the server immediately after the SYN/ACK and just before the client sends its first ACK: 00:00:00.000000 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590 > 2a00:1450:4013:c16::1a.25: S 3451342529:3451342529(0) win 62580 <mss 8940,sackOK,TS val 331187616 ecr 0,nop,wscale 7,Unknown Option 3472da7bfe84[|tcp]> 00:00:00.164295 IP6 2a00:1450:4013:c16::1a.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: S. 2699962254:2699962254(0) ack 3451342530 win 65535 <mss 1440,sackOK,TS val 1206542770 ecr 331187616,nop,wscale 8> # this one is perhaps confusing netfilter?: 00:00:00.001641 IP6 2a00:1450:4013:c16::1a.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: P. 1:89(88) ack 1 win 256 <nop,nop,TS val 1206542772 ecr 331187616> 00:00:00.000035 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590 > 2a00:1450:4013:c16::1a.25: . 1:1(0) ack 89 win 489 <nop,nop,TS val 331187782 ecr 1206542772> 00:00:00.000042 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590 > 2a00:1450:4013:c16::1a.25: P. 
1:24(23) ack 89 win 489 <nop,nop,TS val 331187782 ecr 1206542772> Re "so can't exactly just switch off netfilter", are there any other counters or logs you can somehow check for netfilter drops? > > > > (c) After the client receives the server's "ack 73403", the client > > ignores/drops all other incoming packets that show up in the trace. > > Agreed. However, if I read your graph correctly, it gets an ACK for > frame X at ~3.8s into the connection, then for X+2 at 4s, but it keeps > retransmitting X+2, not X+1? At t=4s, as I discussed below there are two ACKs that arrive back-to-back, where the client TCP apparently processes the first but not the second. That's why it keeps retransmitting the packet beyond the first ACk but not beyond the second ACK. > > > > > As Eric notes, this doesn't look like a PAWS issue. And it > > doesn't look like a checksum or sequence/ACK validation issue. The > > client starts ignoring ACKs between two ACKs that have correct > > checksums, valid ACK numbers, and valid (identical) sequence numbers > > and TS val and ecr values (here showing absolute sequence/ACK > > numbers): > I'm not familiar with PAWS here. Assuming that the green line is ACKs, > then at around 4s we get an ACK that basically ACKs two frames in one > (which is fine from my understanding of TCP), and then the second of > these frames keeps getting retransmitted going forward, so it's almost > like the kernel ACKs the *first* of these two frames but not the second. Again, there are two ACKs, where the client TCP apparently processes the first but not the second, as discussed here: > > > > (i) The client processes this ACK and uses it to advance snd_una: > > 17:46:49.889911 IP6 (flowlabel 0x97427, hlim 61, next-header TCP > > (6) payload length: 32) 2a00:1450:4013:c16::1a.25 > > > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . 
cksum 0x7005 (correct) > > 2699968514:2699968514(0) ack 3451415932 win 830 <nop,nop,TS val > > 1206546583 ecr 331191428> > > > > > (ii) The client ignores this ACK and all later ACKs: > > 17:46:49.889912 IP6 (flowlabel 0x97427, hlim 61, next-header TCP > > (6) payload length: 32) 2a00:1450:4013:c16::1a.25 > > > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x6a66 (correct) > > 2699968514:2699968514(0) ack 3451417360 win 841 <nop,nop,TS val > > 1206546583 ecr 331191428> > > Here are those same two ACKs again, shown with absolute time and relative sequence numbers, to make them easier to parse: (i) The client processes this ACK and uses it to advance snd_una: 17:46:49.889911 IP6 2a00:1450:4013:c16::1a.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . 6260:6260(0) ack 73403 win 830 <nop,nop,TS val 1206546583 ecr 331191428> (ii) The client ignores this ACK and all later ACKs: 17:46:49.889912 IP6 2a00:1450:4013:c16::1a.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . 6260:6260(0) ack 74831 win 841 <nop,nop,TS val 1206546583 ecr 331191428> neal ^ permalink raw reply [flat|nested] 38+ messages in thread
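Editorial sketch relating the absolute and relative ACK numbers in the two dumps above. The relative values are the absolute ACK numbers minus the client's initial sequence number (3451342529, taken from the SYN quoted earlier in this message). The helper name `rel_ack` is hypothetical, and the arithmetic assumes a shell with 64-bit integers.

```shell
#!/bin/sh
# rel_ack ISN ABS_ACK
# Hypothetical helper: relative ACK number, i.e. the absolute ACK number
# minus the initial sequence number of the side being acknowledged.
rel_ack() {
    echo $(( $2 - $1 ))
}

isn=3451342529                        # client SYN sequence number from the trace
ack_i=$(rel_ack "$isn" 3451415932)    # ACK (i), processed by the client
ack_ii=$(rel_ack "$isn" 3451417360)   # ACK (ii), ignored by the client

echo "ACK (i):  $ack_i"               # prints "ACK (i):  73403"
echo "ACK (ii): $ack_ii"              # prints "ACK (ii): 74831"
# Bytes covered by ACK (ii) but not by ACK (i).
echo "gap:      $(( ack_ii - ack_i )) bytes"   # prints "gap:      1428 bytes"
```

The 1428-byte gap is consistent with one full-sized segment: the server's advertised MSS of 1440 minus 12 bytes of TCP timestamp option. That matches the observation that the client keeps retransmitting exactly the segment that ACK (ii) covers.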
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-01 14:50 ` Neal Cardwell @ 2022-04-01 15:39 ` Neal Cardwell 2022-04-01 15:48 ` Neal Cardwell 2022-04-02 8:42 ` Jaco Kroon 0 siblings, 2 replies; 38+ messages in thread From: Neal Cardwell @ 2022-04-01 15:39 UTC (permalink / raw) To: Jaco Kroon; +Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote: ... > Connection setup: > > 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo cookie f025dd84b6122510,nop,nop], length 0 > > 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0 > > 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp > > 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0 > > This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus > we shouldn't send segments larger than that, and they "can't". I need to > determine if this is some form of offloading or they really are sending >1500 > byte frames (which I know won't pass our firewalls without fragmentation so > probably some form of NIC offloading - which if it was active on older 5.8 > kernels did not cause problems): Jaco, was there some previous kernel version on these client machines where this problem did not show up? 
Perhaps the v5.8 version you mention here? Can you please share the exact version number? If so, a hypothesis would be: (1) There is a bug in netfilter's handling of TFO connections where the server sends a data packet after a TFO SYNACK, before the client ACKs anything (as we see in this trace). This bug is perhaps similar in character to the bug fixed by Yuchung's 2013 commit that Eric mentioned: 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix tcp_in_window for Fast Open (2) With kernel v5.8, TFO blackhole detection detected that in your workload there were TFO connections that died due to apparent blackholing (like what's shown in the trace), and dynamically disabled TFO on your machines. This allowed mail traffic to flow, because the netfilter bug was no longer tickled. This worked around the netfilter bug. (3) You upgraded your client-side machine from v5.8 to v5.17, which has the following commit from v5.14, which disables TFO blackhole logic by default: 213ad73d0607 tcp: disable TFO blackhole logic by default (4) Due to (3), the blackhole detection logic was no longer operative, and when the netfilter bug blackholed the connection, TFO stayed enabled. This caused mail traffic to Google to stall. This hypothesis would explain why: o disabling TFO fixes this problem o you are seeing this with a newer kernel (and apparently not with a kernel before v5.14?) With this hypothesis, we need several pieces to trigger this: (a) client side software that tries TFO to a server that supports TFO (like the exim mail transfer agent you are using, connecting to Google) (b) a client-side Linux kernel running buggy netfilter code (you are running netfilter) (c) a client-side Linux kernel with TFO support but no blackhole detection logic active (e.g. v5.14 or later, like your v5.17.1) That's probably a rare combination, so would explain why we have not had this report before. 
Jaco, to provide some evidence for this hypothesis, can you please re-enable fastopen but also enable the TFO blackhole detection that was disabled in v5.14 (213ad73d0607), with something like:

  sysctl -w net.ipv4.tcp_fastopen=1
  sysctl -w net.ipv4.tcp_fastopen_blackhole_timeout_sec=3600

And then after a few hours, check to see if this blackholing behavior has been detected:

  nstat -az | grep -i blackhole

And see if TFO FastOpenActive attempts have been cut to a super-low rate:

  nstat -az | grep -i fastopenactive

thanks,
neal

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-01 15:39 ` Neal Cardwell @ 2022-04-01 15:48 ` Neal Cardwell 2022-04-02 8:42 ` Jaco Kroon 1 sibling, 0 replies; 38+ messages in thread From: Neal Cardwell @ 2022-04-01 15:48 UTC (permalink / raw) To: Jaco Kroon; +Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang

On Fri, Apr 1, 2022 at 11:39 AM Neal Cardwell <ncardwell@google.com> wrote:
...
> Jaco, to provide some evidence for this hypothesis, can you please
> re-enable fastopen but also enable the TFO blackhole detection that
> was disabled in v5.14 (213ad73d0607), with something like:
>
>   sysctl -w net.ipv4.tcp_fastopen=1
>   sysctl -w net.ipv4.tcp_fastopen_blackhole_timeout_sec=3600

I would also suggest following Florian's suggestion to log invalid packets, so perhaps we can get a clue as to why netfilter thinks these packets are invalid:

  sysctl net.netfilter.nf_conntrack_log_invalid=6

> And then after a few hours, check to see if this blackholing behavior
> has been detected:
>   nstat -az | grep -i blackhole
> And see if TFO FastOpenActive attempts have been cut to a super-low rate:
>   nstat -az | grep -i fastopenactive

Then I would correspondingly echo Florian's suggestion to check dmesg/syslog/nflog to learn more about the drops.

neal

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-01 15:39 ` Neal Cardwell 2022-04-01 15:48 ` Neal Cardwell @ 2022-04-02 8:42 ` Jaco Kroon 2022-04-02 13:20 ` Eric Dumazet ` (2 more replies) 1 sibling, 3 replies; 38+ messages in thread From: Jaco Kroon @ 2022-04-02 8:42 UTC (permalink / raw) To: Neal Cardwell, Florian Westphal Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang Hi Neal, On 2022/04/01 17:39, Neal Cardwell wrote: > On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote: > ... >> Connection setup: >> >> 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo cookie f025dd84b6122510,nop,nop], length 0 >> >> 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0 >> >> 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp >> >> 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0 >> >> This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus >> we shouldn't send segments larger than that, and they "can't". 
I need to >> determine if this is some form of offloading or they really are sending >1500 >> byte frames (which I know won't pass our firewalls without fragmentation so >> probably some form of NIC offloading - which if it was active on older 5.8 >> kernels did not cause problems): > Jaco, was there some previous kernel version on these client machines > where this problem did not show up? Perhaps the v5.8 version you > mention here? Can you please share the exact version number? 5.8.14 > > If so, a hypothesis would be: > > (1) There is a bug in netfilter's handling of TFO connections where > the server sends a data packet after a TFO SYNACK, before the client > ACKs anything (as we see in this trace). > > This bug is perhaps similar in character to the bug fixed by Yuchung's > 2013 commit that Eric mentioned: > > 356d7d88e088687b6578ca64601b0a2c9d145296 > netfilter: nf_conntrack: fix tcp_in_window for Fast Open > > (2) With kernel v5.8, TFO blackhole detection detected that in your > workload there were TFO connections that died due to apparent > blackholing (like what's shown in the trace), and dynamically disabled > TFO on your machines. This allowed mail traffic to flow, because the > netfilter bug was no longer tickled. This worked around the netfilter > bug. > > (3) You upgraded your client-side machine from v5.8 to v5.17, which > has the following commit from v5.14, which disables TFO blackhole > logic by default: > 213ad73d0607 tcp: disable TFO blackhole logic by default > > (4) Due to (3), the blackhole detection logic was no longer operative, > and when the netfilter bug blackholed the connection, TFO stayed > enabled. This caused mail traffic to Google to stall. > > This hypothesis would explain why: > o disabling TFO fixes this problem > o you are seeing this with a newer kernel (and apparently not with a > kernel before v5.14?) Agreed. 
> > With this hypothesis, we need several pieces to trigger this: > > (a) client side software that tries TFO to a server that supports TFO > (like the exim mail transfer agent you are using, connecting to > Google) > > (b) a client-side Linux kernel running buggy netfilter code (you are > running netfilter) > > (c) a client-side Linux kernel with TFO support but no blackhole > detection logic active (e.g. v5.14 or later, like your v5.17.1) > > That's probably a rare combination, so would explain why we have not > had this report before. > > Jaco, to provide some evidence for this hypothesis, can you please > re-enable fastopen but also enable the TFO blackhole detection that > was disabled in v5.14 (213ad73d0607), with something like: > > sysctl -w net.ipv4.tcp_fastopen=1 > sysctl -w tcp_fastopen_blackhole_timeout=3600 Done. Including sysctl net.netfilter.nf_conntrack_log_invalid=6- which generates lots of logs, something specific I should be looking for? I suspect these relate: [Sat Apr 2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound (over the window of the receiver) IN= OUT=bond0 SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64 FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020 WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8 GID=12 [Sat Apr 2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound (over the window of the receiver) IN= OUT=bond0 SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64 FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689941170 ACK=4200412020 WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8 GID=12 (There are many more of those, and the remote side is Google in this case) > > And then after a few hours, check to see if this blackholing behavior > has been detected: > nstat -az | grep -i blackhole > And see if TFO FastOpenActive 
attempts have been cut to a super-low rate:
> nstat -az | grep -i fastopenactive

uriel [06:10:03] ~ # nstat -az | grep -i fastopen
TcpExtTCPFastOpenActive         0                  0.0
TcpExtTCPFastOpenActiveFail     3739               0.0
TcpExtTCPFastOpenPassive        0                  0.0
TcpExtTCPFastOpenPassiveFail    0                  0.0
TcpExtTCPFastOpenListenOverflow 0                  0.0
TcpExtTCPFastOpenCookieReqd     3378               0.0
TcpExtTCPFastOpenBlackhole      0                  0.0
TcpExtTCPFastOpenPassiveAltKey  0                  0.0

uriel [09:54:54] ~ # nstat -az | grep -i fastopen
TcpExtTCPFastOpenActive         0                  0.0
TcpExtTCPFastOpenActiveFail     3742               0.0
TcpExtTCPFastOpenPassive        0                  0.0
TcpExtTCPFastOpenPassiveFail    0                  0.0
TcpExtTCPFastOpenListenOverflow 0                  0.0
TcpExtTCPFastOpenCookieReqd     3391               0.0
TcpExtTCPFastOpenBlackhole      3                  0.0
TcpExtTCPFastOpenPassiveAltKey  0                  0.0

I'm fairly certain that strongly supports your theory.

So I *suspect* the next test would be something like:

Disable the blackhole detection again, let the queue build up for a few minutes until we have something from Google.
Shut down exim so we can isolate SMTP traffic.
tcpdump again, capturing the traffic, and correlate the FW logs with the connection?

Kind Regards,
Jaco

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-02 8:42 ` Jaco Kroon @ 2022-04-02 13:20 ` Eric Dumazet 2022-04-02 22:02 ` Jaco Kroon 2022-04-02 14:14 ` Florian Westphal 2022-04-02 16:29 ` Neal Cardwell 2 siblings, 1 reply; 38+ messages in thread From: Eric Dumazet @ 2022-04-02 13:20 UTC (permalink / raw) To: Jaco Kroon Cc: Neal Cardwell, Florian Westphal, LKML, Netdev, Yuchung Cheng, Wei Wang, Pablo Neira Ayuso, Sven Auhagen On Sat, Apr 2, 2022 at 1:42 AM Jaco Kroon <jaco@uls.co.za> wrote: > > Hi Neal, > > On 2022/04/01 17:39, Neal Cardwell wrote: > > On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote: > > ... > >> Connection setup: > >> > >> 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo cookie f025dd84b6122510,nop,nop], length 0 > >> > >> 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0 > >> > >> 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp > >> > >> 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0 > >> > >> This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus > >> we shouldn't send segments larger than that, and they "can't". 
I need to > >> determine if this is some form of offloading or they really are sending >1500 > >> byte frames (which I know won't pass our firewalls without fragmentation so > >> probably some form of NIC offloading - which if it was active on older 5.8 > >> kernels did not cause problems): > > Jaco, was there some previous kernel version on these client machines > > where this problem did not show up? Perhaps the v5.8 version you > > mention here? Can you please share the exact version number? > 5.8.14 > > > > If so, a hypothesis would be: > > > > (1) There is a bug in netfilter's handling of TFO connections where > > the server sends a data packet after a TFO SYNACK, before the client > > ACKs anything (as we see in this trace). > > > > This bug is perhaps similar in character to the bug fixed by Yuchung's > > 2013 commit that Eric mentioned: > > > > 356d7d88e088687b6578ca64601b0a2c9d145296 > > netfilter: nf_conntrack: fix tcp_in_window for Fast Open > > > > (2) With kernel v5.8, TFO blackhole detection detected that in your > > workload there were TFO connections that died due to apparent > > blackholing (like what's shown in the trace), and dynamically disabled > > TFO on your machines. This allowed mail traffic to flow, because the > > netfilter bug was no longer tickled. This worked around the netfilter > > bug. > > > > (3) You upgraded your client-side machine from v5.8 to v5.17, which > > has the following commit from v5.14, which disables TFO blackhole > > logic by default: > > 213ad73d0607 tcp: disable TFO blackhole logic by default > > > > (4) Due to (3), the blackhole detection logic was no longer operative, > > and when the netfilter bug blackholed the connection, TFO stayed > > enabled. This caused mail traffic to Google to stall. > > > > This hypothesis would explain why: > > o disabling TFO fixes this problem > > o you are seeing this with a newer kernel (and apparently not with a > > kernel before v5.14?) > Agreed. 
> > > > With this hypothesis, we need several pieces to trigger this: > > > > (a) client side software that tries TFO to a server that supports TFO > > (like the exim mail transfer agent you are using, connecting to > > Google) > > > > (b) a client-side Linux kernel running buggy netfilter code (you are > > running netfilter) > > > > (c) a client-side Linux kernel with TFO support but no blackhole > > detection logic active (e.g. v5.14 or later, like your v5.17.1) > > > > That's probably a rare combination, so would explain why we have not > > had this report before. > > > > Jaco, to provide some evidence for this hypothesis, can you please > > re-enable fastopen but also enable the TFO blackhole detection that > > was disabled in v5.14 (213ad73d0607), with something like: > > > > sysctl -w net.ipv4.tcp_fastopen=1 > > sysctl -w tcp_fastopen_blackhole_timeout=3600 > > Done. > > Including sysctl net.netfilter.nf_conntrack_log_invalid=6- which > generates lots of logs, something specific I should be looking for? I > suspect these relate: > > [Sat Apr 2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound > (over the window of the receiver) IN= OUT=bond0 > SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b > DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64 > FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020 > WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8 > GID=12 > [Sat Apr 2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound > (over the window of the receiver) IN= OUT=bond0 > SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b > DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64 > FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689941170 ACK=4200412020 > WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8 > GID=12 > > (There are many more of those, and the remote side is Google in this case) > Great. This confirms our suspicions. 
Please try the following patch that landed in 5.18-rc f2dd495a8d589371289981d5ed33e6873df94ecc netfilter: nf_conntrack_tcp: preserve liberal flag in tcp options CC netfilter folks. Condition triggering the bug : before(seq, sender->td_maxend + 1), I took a look at the code, and it is not clear if td_maxend is properly setup (or if td_scale is cleared at some point while it should not) Alternatively, if conntracking does not know if the connection is using wscale (or what is the scale), the "before(seq, sender->td_maxend + 1)," should not be evaluated/used. Also, I do not see where td_maxend is extended in tcp_init_sender() Probably wrong patch, just to point to the code I do not understand yet. diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index 8ec55cd72572e0cca076631e2cc1c11f0c2b86f6..950082785d61b7a2768559c7500d3aee3aaea7c2 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -456,9 +456,10 @@ static void tcp_init_sender(struct ip_ct_tcp_state *sender, /* SYN-ACK in reply to a SYN * or SYN from reply direction in simultaneous open. */ - sender->td_end = - sender->td_maxend = end; - sender->td_maxwin = (win == 0 ? 1 : win); + sender->td_end = end; + sender->td_maxwin = max(win, 1U); + /* WIN in SYN & SYNACK is not scaled */ + sender->td_maxend = end + sender->td_maxwin; tcp_options(skb, dataoff, tcph, sender); /* RFC 1323: ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-02 13:20 ` Eric Dumazet @ 2022-04-02 22:02 ` Jaco Kroon 0 siblings, 0 replies; 38+ messages in thread From: Jaco Kroon @ 2022-04-02 22:02 UTC (permalink / raw) To: Eric Dumazet Cc: Neal Cardwell, Florian Westphal, LKML, Netdev, Yuchung Cheng, Wei Wang, Pablo Neira Ayuso, Sven Auhagen

Hi,

On 2022/04/02 15:20, Eric Dumazet wrote:
> Great. This confirms our suspicions.
>
> Please try the following patch that landed in 5.18-rc
>
> f2dd495a8d589371289981d5ed33e6873df94ecc netfilter: nf_conntrack_tcp:
> preserve liberal flag in tcp options

Will track this down and deploy in the next day or two.

Thank you, Neal and Florian, for all the assistance!

As an aside, I would really like to engage with someone who can assist on the known congestion w.r.t. Google services in JHB, so if you're willing, or can get me in contact with the right people, please do contact me directly off-list (we've alleviated the issue by upgrading our IPT but would like to understand what is going on; I can provide ticket references).

Kind Regards,
Jaco

>
> CC netfilter folks.
>
> Condition triggering the bug :
> before(seq, sender->td_maxend + 1),
>
> I took a look at the code, and it is not clear if td_maxend is
> properly setup (or if td_scale is cleared at some point while it
> should not)
>
> Alternatively, if conntracking does not know if the connection is
> using wscale (or what is the scale), the "before(seq,
> sender->td_maxend + 1),"
> should not be evaluated/used.
>
> Also, I do not see where td_maxend is extended in tcp_init_sender()
>
> Probably wrong patch, just to point to the code I do not understand yet.
> > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c > b/net/netfilter/nf_conntrack_proto_tcp.c > index 8ec55cd72572e0cca076631e2cc1c11f0c2b86f6..950082785d61b7a2768559c7500d3aee3aaea7c2 > 100644 > --- a/net/netfilter/nf_conntrack_proto_tcp.c > +++ b/net/netfilter/nf_conntrack_proto_tcp.c > @@ -456,9 +456,10 @@ static void tcp_init_sender(struct ip_ct_tcp_state *sender, > /* SYN-ACK in reply to a SYN > * or SYN from reply direction in simultaneous open. > */ > - sender->td_end = > - sender->td_maxend = end; > - sender->td_maxwin = (win == 0 ? 1 : win); > + sender->td_end = end; > + sender->td_maxwin = max(win, 1U); > + /* WIN in SYN & SYNACK is not scaled */ > + sender->td_maxend = end + sender->td_maxwin; > > tcp_options(skb, dataoff, tcph, sender); > /* RFC 1323: ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-02 8:42 ` Jaco Kroon 2022-04-02 13:20 ` Eric Dumazet @ 2022-04-02 14:14 ` Florian Westphal 2022-04-02 15:57 ` Neal Cardwell 2022-04-02 21:51 ` Jaco Kroon 2022-04-02 16:29 ` Neal Cardwell 2 siblings, 2 replies; 38+ messages in thread From: Florian Westphal @ 2022-04-02 14:14 UTC (permalink / raw) To: Jaco Kroon Cc: Neal Cardwell, Florian Westphal, Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang

Jaco Kroon <jaco@uls.co.za> wrote:
> Including sysctl net.netfilter.nf_conntrack_log_invalid=6- which
> generates lots of logs, something specific I should be looking for? I
> suspect these relate:
>
> [Sat Apr 2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
> (over the window of the receiver) IN= OUT=bond0
> SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
> DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
> FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020
> WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
> GID=12

I thought this had "liberal mode" enabled for tcp conntrack?
The above implies it's off.

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-02 14:14 ` Florian Westphal @ 2022-04-02 15:57 ` Neal Cardwell 2022-04-02 21:51 ` Jaco Kroon 1 sibling, 0 replies; 38+ messages in thread From: Neal Cardwell @ 2022-04-02 15:57 UTC (permalink / raw) To: Florian Westphal Cc: Jaco Kroon, Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang

On Sat, Apr 2, 2022 at 10:14 AM Florian Westphal <fw@strlen.de> wrote:
>
> Jaco Kroon <jaco@uls.co.za> wrote:
> > Including sysctl net.netfilter.nf_conntrack_log_invalid=6- which
> > generates lots of logs, something specific I should be looking for? I
> > suspect these relate:
> >
> > [Sat Apr 2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
> > (over the window of the receiver) IN= OUT=bond0
> > SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
> > DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
> > FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020
> > WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
> > GID=12
>
> I thought this had "liberal mode" enabled for tcp conntrack?
> The above implies it's off.

Jaco's email said: "Our core firewalls already had nf_conntrack_tcp_be_liberal". But this log is from the client machine itself, not the core firewall machines. AFAICT it seems the client machine does not have "liberal mode" enabled.

neal

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-02 14:14 ` Florian Westphal 2022-04-02 15:57 ` Neal Cardwell @ 2022-04-02 21:51 ` Jaco Kroon 1 sibling, 0 replies; 38+ messages in thread From: Jaco Kroon @ 2022-04-02 21:51 UTC (permalink / raw) To: Florian Westphal Cc: Neal Cardwell, Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang

Hi Florian,

On 2022/04/02 16:14, Florian Westphal wrote:
> Jaco Kroon <jaco@uls.co.za> wrote:
>> Including sysctl net.netfilter.nf_conntrack_log_invalid=6- which
>> generates lots of logs, something specific I should be looking for? I
>> suspect these relate:
>>
>> [Sat Apr 2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound
>> (over the window of the receiver) IN= OUT=bond0
>> SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b
>> DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64
>> FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020
>> WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8
>> GID=12
> I thought this had "liberal mode" enabled for tcp conntrack?
> The above implies it's off.

We have liberal on the core firewalls, not on the endpoints ... yes, we do double firewall :).

So the firewalls into the subnets have liberal mode (which really was an oversight when axing conntrackd), but the servers themselves do not.

Kind Regards,
Jaco

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-02 8:42 ` Jaco Kroon 2022-04-02 13:20 ` Eric Dumazet 2022-04-02 14:14 ` Florian Westphal @ 2022-04-02 16:29 ` Neal Cardwell 2022-04-02 16:32 ` Eric Dumazet 2 siblings, 1 reply; 38+ messages in thread From: Neal Cardwell @ 2022-04-02 16:29 UTC (permalink / raw) To: Jaco Kroon Cc: Florian Westphal, Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang ) On Sat, Apr 2, 2022 at 4:42 AM Jaco Kroon <jaco@uls.co.za> wrote: > > Hi Neal, > > On 2022/04/01 17:39, Neal Cardwell wrote: > > On Tue, Mar 29, 2022 at 9:03 PM Jaco <jaco@uls.co.za> wrote: > > ... > >> Connection setup: > >> > >> 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo cookie f025dd84b6122510,nop,nop], length 0 > >> > >> 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0 > >> > >> 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp > >> > >> 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0 > >> > >> This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus > >> we shouldn't send segments larger than that, and they "can't". 
I need to > >> determine if this is some form of offloading or they really are sending >1500 > >> byte frames (which I know won't pass our firewalls without fragmentation so > >> probably some form of NIC offloading - which if it was active on older 5.8 > >> kernels did not cause problems): > > Jaco, was there some previous kernel version on these client machines > > where this problem did not show up? Perhaps the v5.8 version you > > mention here? Can you please share the exact version number? > 5.8.14 Thanks for the client kernel version! (5.8.14) > > If so, a hypothesis would be: > > > > (1) There is a bug in netfilter's handling of TFO connections where > > the server sends a data packet after a TFO SYNACK, before the client > > ACKs anything (as we see in this trace). > > > > This bug is perhaps similar in character to the bug fixed by Yuchung's > > 2013 commit that Eric mentioned: > > > > 356d7d88e088687b6578ca64601b0a2c9d145296 > > netfilter: nf_conntrack: fix tcp_in_window for Fast Open > > > > (2) With kernel v5.8, TFO blackhole detection detected that in your > > workload there were TFO connections that died due to apparent > > blackholing (like what's shown in the trace), and dynamically disabled > > TFO on your machines. This allowed mail traffic to flow, because the > > netfilter bug was no longer tickled. This worked around the netfilter > > bug. > > > > (3) You upgraded your client-side machine from v5.8 to v5.17, which > > has the following commit from v5.14, which disables TFO blackhole > > logic by default: > > 213ad73d0607 tcp: disable TFO blackhole logic by default > > > > (4) Due to (3), the blackhole detection logic was no longer operative, > > and when the netfilter bug blackholed the connection, TFO stayed > > enabled. This caused mail traffic to Google to stall. > > > > This hypothesis would explain why: > > o disabling TFO fixes this problem > > o you are seeing this with a newer kernel (and apparently not with a > > kernel before v5.14?) 
> Agreed. > > > > With this hypothesis, we need several pieces to trigger this: > > > > (a) client side software that tries TFO to a server that supports TFO > > (like the exim mail transfer agent you are using, connecting to > > Google) > > > > (b) a client-side Linux kernel running buggy netfilter code (you are > > running netfilter) > > > > (c) a client-side Linux kernel with TFO support but no blackhole > > detection logic active (e.g. v5.14 or later, like your v5.17.1) > > > > That's probably a rare combination, so would explain why we have not > > had this report before. > > > > Jaco, to provide some evidence for this hypothesis, can you please > > re-enable fastopen but also enable the TFO blackhole detection that > > was disabled in v5.14 (213ad73d0607), with something like: > > > > sysctl -w net.ipv4.tcp_fastopen=1 > > sysctl -w tcp_fastopen_blackhole_timeout=3600 > > Done. Thanks for running that experiment and reporting your data! That was super-informative. So it seems like we have a working high-level theory about what's going on and where, and we just need to pinpoint the buggy lines in the netfilter conntrack code running on the mail client machines. > Including sysctl net.netfilter.nf_conntrack_log_invalid=6- which > generates lots of logs, something specific I should be looking for? 
I > suspect these relate: > > [Sat Apr 2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound > (over the window of the receiver) IN= OUT=bond0 > SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b > DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64 > FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689938314 ACK=4200412020 > WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8 > GID=12 > [Sat Apr 2 10:31:53 2022] nf_ct_proto_6: SEQ is over the upper bound > (over the window of the receiver) IN= OUT=bond0 > SRC=2c0f:f720:0000:0003:d6ae:52ff:feb8:f27b > DST=2a00:1450:400c:0c08:0000:0000:0000:001a LEN=2928 TC=0 HOPLIMIT=64 > FLOWLBL=867133 PROTO=TCP SPT=48920 DPT=25 SEQ=2689941170 ACK=4200412020 > WINDOW=447 RES=0x00 ACK PSH URGP=0 OPT (0101080A2F36C1C120EDFB91) UID=8 > GID=12 > > (There are many more of those, and the remote side is Google in this case) FWIW those log entries indicate netfilter on the mail client machine dropping consecutive outbound skbs with 2*MSS of payload. So that explains the large consecutive losses of client data packets to the e-mail server. That seems to confirm my earlier hunch that those drops of consecutive client data packets "do not look like normal congestive packet loss". neal ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-02 16:29 ` Neal Cardwell @ 2022-04-02 16:32 ` Eric Dumazet 2022-04-02 18:04 ` Neal Cardwell 0 siblings, 1 reply; 38+ messages in thread From: Eric Dumazet @ 2022-04-02 16:32 UTC (permalink / raw) To: Neal Cardwell Cc: Jaco Kroon, Florian Westphal, LKML, Netdev, Yuchung Cheng, Wei Wang On Sat, Apr 2, 2022 at 9:29 AM Neal Cardwell <ncardwell@google.com> wrote: > > FWIW those log entries indicate netfilter on the mail client machine > dropping consecutive outbound skbs with 2*MSS of payload. So that > explains the large consecutive losses of client data packets to the > e-mail server. That seems to confirm my earlier hunch that those drops > of consecutive client data packets "do not look like normal congestive > packet loss". This also explains why we have all these tiny 2-MSS packets in the pcap. Under normal conditions, autocorking should kick in, allowing TCP to build bigger TSO packets. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections 2022-04-02 16:32 ` Eric Dumazet @ 2022-04-02 18:04 ` Neal Cardwell 2022-04-06 13:58 ` Florian Westphal 0 siblings, 1 reply; 38+ messages in thread From: Neal Cardwell @ 2022-04-02 18:04 UTC (permalink / raw) To: Eric Dumazet Cc: Jaco Kroon, Florian Westphal, LKML, Netdev, Yuchung Cheng, Wei Wang On Sat, Apr 2, 2022 at 12:32 PM Eric Dumazet <edumazet@google.com> wrote: > > On Sat, Apr 2, 2022 at 9:29 AM Neal Cardwell <ncardwell@google.com> wrote: > > > > FWIW those log entries indicate netfilter on the mail client machine > > dropping consecutive outbound skbs with 2*MSS of payload. So that > > explains the large consecutive losses of client data packets to the > > e-mail server. That seems to confirm my earlier hunch that those drops > > of consecutive client data packets "do not look like normal congestive > > packet loss". > > > This also explains why we have all these tiny 2-MSS packets in the pcap. > > Under normal conditions, autocorking should kick in, allowing TCP to > build bigger TSO packets. I have not looked at the conntrack code before today, but AFAICT this is the buggy section of nf_conntrack_proto_tcp.c: } else if (((state->state == TCP_CONNTRACK_SYN_SENT && dir == IP_CT_DIR_ORIGINAL) || (state->state == TCP_CONNTRACK_SYN_RECV && dir == IP_CT_DIR_REPLY)) && after(end, sender->td_end)) { /* * RFC 793: "if a TCP is reinitialized ... then it need * not wait at all; it must only be sure to use sequence * numbers larger than those recently used." */ sender->td_end = sender->td_maxend = end; sender->td_maxwin = (win == 0 ? 1 : win); tcp_options(skb, dataoff, tcph, sender); Note that the tcp_options() function implicitly assumes it is being called on a SYN, because it sets state->td_scale to 0 and only sets state->td_scale to something non-zero if it sees a wscale option. 
So if we ever call that on an skb that's not a SYN, we will forget that the connection is using the wscale option.

But at this point in the code it is calling tcp_options() without first checking that this is a SYN.

For this TFO scenario like the one in the trace, where the server sends its first data packet after the SYNACK packet and before the client's first ACK, presumably the conntrack state machine is (correctly) SYN_RECV, and then (incorrectly) executes this code, including the call to tcp_options(), on this first data packet, which has no SYN bit, and no wscale option. Thus tcp_options() zeroes out the server's sending state td_scale and does not set it to a non-zero value. So now conntrack thinks the server is not using the wscale option. So when conntrack interprets future receive windows from the server, it does not scale them (with: win <<= sender->td_scale;), so in this scenario the estimated right edge of the server's receive window (td_maxend) is never advanced past the roughly 64KB value offered in the SYNACK.

Thus when the client sends data packets beyond 64KBytes, conntrack declares them invalid and drops them, due to failing the condition Eric noted above:

  before(seq, sender->td_maxend + 1),

This explains my previous observation that the client's original data packet transmissions are always dropped after the first 64KBytes.

Someone more familiar with conntrack may have a good idea about how to best fix this?

neal

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-02 18:04 ` Neal Cardwell
@ 2022-04-06 13:58 ` Florian Westphal
  2022-04-06 19:04   ` Jozsef Kadlecsik
  0 siblings, 1 reply; 38+ messages in thread
From: Florian Westphal @ 2022-04-06 13:58 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: Eric Dumazet, Jaco Kroon, netfilter-devel, netdev, kadlec

Neal Cardwell <ncardwell@google.com> wrote:

[ trimmed CCs, add Jozsef and nf-devel ]

Neal, Eric, thanks for debugging this problem.

> On Sat, Apr 2, 2022 at 12:32 PM Eric Dumazet <edumazet@google.com> wrote:
> > On Sat, Apr 2, 2022 at 9:29 AM Neal Cardwell <ncardwell@google.com> wrote:
> > > FWIW those log entries indicate netfilter on the mail client machine
> > > dropping consecutive outbound skbs with 2*MSS of payload. So that
> > > explains the large consecutive losses of client data packets to the
> > > e-mail server. That seems to confirm my earlier hunch that those drops
> > > of consecutive client data packets "do not look like normal congestive
> > > packet loss".
> >
> > This also explains why we have all these tiny 2-MSS packets in the pcap.
> > Under normal conditions, autocorking should kick in, allowing TCP to
> > build bigger TSO packets.
>
> I have not looked at the conntrack code before today, but AFAICT this
> is the buggy section of nf_conntrack_proto_tcp.c:
>
>         } else if (((state->state == TCP_CONNTRACK_SYN_SENT
>                      && dir == IP_CT_DIR_ORIGINAL)
>                    || (state->state == TCP_CONNTRACK_SYN_RECV
>                      && dir == IP_CT_DIR_REPLY))
>                    && after(end, sender->td_end)) {
>                 /*
>                  * RFC 793: "if a TCP is reinitialized ... then it need
>                  * not wait at all; it must only be sure to use sequence
>                  * numbers larger than those recently used."
>                  */
>                 sender->td_end =
>                 sender->td_maxend = end;
>                 sender->td_maxwin = (win == 0 ? 1 : win);
>
>                 tcp_options(skb, dataoff, tcph, sender);
>
> Note that the tcp_options() function implicitly assumes it is being
> called on a SYN, because it sets state->td_scale to 0 and only sets
> state->td_scale to something non-zero if it sees a wscale option. So
> if we ever call that on an skb that's not a SYN, we will forget that
> the connection is using the wscale option.
>
> But at this point in the code it is calling tcp_options() without
> first checking that this is a SYN.

Yes, that's the bug: tcp_options() must not be called if the syn bit is
not set.

> For this TFO scenario like the one in the trace, where the server
> sends its first data packet after the SYNACK packet and before the
> client's first ACK, presumably the conntrack state machine is
> (correctly) SYN_RECV, and then (incorrectly) executes this code,

Right. Jozsef, for context, the sequence in the trace is:

S > C Flags [S], seq 3451342529, win 62580, options [mss 8940,sackOK,TS val 331187616 ecr 0,nop,wscale 7,tfo [|tcp]>
C > S Flags [S.], seq 2699962254, ack 3451342530, win 65535, options [mss 1440,sackOK,TS val 1206542770 ecr 331187616,nop,wscale 8], length 0
C > S Flags [P.], seq 1:89, ack 1, win 256, options [nop,nop,TS val 1206542772 ecr 331187616], length 88: SMTP [|smtp]

Normally, the 3rd packet would be S > C, but this one is C > S.

So, packet #3 hits the 'reinit' branch, which zaps the wscale option.

> Someone more familiar with conntrack may have a good idea about how to
> best fix this?

Jozsef, does this look sane to you?
It fixes the TFO capture and still passes the test case I made for
82b72cb94666b3dbd7152bb9f441b068af7a921b
("netfilter: conntrack: re-init state for retransmitted syn-ack").

diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 8ec55cd72572..90ad1c0f23b1 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -556,33 +556,24 @@ static bool tcp_in_window(struct nf_conn *ct,
                        }

                }
-       } else if (((state->state == TCP_CONNTRACK_SYN_SENT
-                    && dir == IP_CT_DIR_ORIGINAL)
-                  || (state->state == TCP_CONNTRACK_SYN_RECV
-                    && dir == IP_CT_DIR_REPLY))
-                  && after(end, sender->td_end)) {
+       } else if (tcph->syn &&
+                  after(end, sender->td_end) &&
+                  (state->state == TCP_CONNTRACK_SYN_SENT ||
+                   state->state == TCP_CONNTRACK_SYN_RECV)) {
                /*
                 * RFC 793: "if a TCP is reinitialized ... then it need
                 * not wait at all; it must only be sure to use sequence
                 * numbers larger than those recently used."
-                */
-               sender->td_end =
-               sender->td_maxend = end;
-               sender->td_maxwin = (win == 0 ? 1 : win);
-
-               tcp_options(skb, dataoff, tcph, sender);
-       } else if (tcph->syn && dir == IP_CT_DIR_REPLY &&
-                  state->state == TCP_CONNTRACK_SYN_SENT) {
-               /* Retransmitted syn-ack, or syn (simultaneous open).
                 *
+                * also check for retransmitted syn-ack, or syn (simultaneous open).
                 * Re-init state for this direction, just like for the first
                 * syn(-ack) reply, it might differ in seq, ack or tcp options.
+                *
+                * Check for invalid syn-ack in original direction was already done.
                 */
                tcp_init_sender(sender, receiver,
                                skb, dataoff, tcph,
                                end, win);
-               if (!tcph->ack)
-                       return true;
        }

        if (!(tcph->ack)) {

^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-06 13:58 ` Florian Westphal
@ 2022-04-06 19:04 ` Jozsef Kadlecsik
  2022-04-07 10:26   ` Florian Westphal
  0 siblings, 1 reply; 38+ messages in thread
From: Jozsef Kadlecsik @ 2022-04-06 19:04 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Neal Cardwell, Eric Dumazet, Jaco Kroon, netfilter-devel, netdev

Hi Florian,

On Wed, 6 Apr 2022, Florian Westphal wrote:

> Neal Cardwell <ncardwell@google.com> wrote:
>
> [ trimmed CCs, add Jozsef and nf-devel ]
>
> Neal, Eric, thanks for debugging this problem.
>
> > On Sat, Apr 2, 2022 at 12:32 PM Eric Dumazet <edumazet@google.com> wrote:
> > > On Sat, Apr 2, 2022 at 9:29 AM Neal Cardwell <ncardwell@google.com> wrote:
> > > > FWIW those log entries indicate netfilter on the mail client machine
> > > > dropping consecutive outbound skbs with 2*MSS of payload. So that
> > > > explains the large consecutive losses of client data packets to the
> > > > e-mail server. That seems to confirm my earlier hunch that those drops
> > > > of consecutive client data packets "do not look like normal congestive
> > > > packet loss".
> > >
> > > This also explains why we have all these tiny 2-MSS packets in the pcap.
> > > Under normal conditions, autocorking should kick in, allowing TCP to
> > > build bigger TSO packets.
> >
> > I have not looked at the conntrack code before today, but AFAICT this
> > is the buggy section of nf_conntrack_proto_tcp.c:
> >
> >         } else if (((state->state == TCP_CONNTRACK_SYN_SENT
> >                      && dir == IP_CT_DIR_ORIGINAL)
> >                    || (state->state == TCP_CONNTRACK_SYN_RECV
> >                      && dir == IP_CT_DIR_REPLY))
> >                    && after(end, sender->td_end)) {
> >                 /*
> >                  * RFC 793: "if a TCP is reinitialized ... then it need
> >                  * not wait at all; it must only be sure to use sequence
> >                  * numbers larger than those recently used."
> >                  */
> >                 sender->td_end =
> >                 sender->td_maxend = end;
> >                 sender->td_maxwin = (win == 0 ? 1 : win);
> >
> >                 tcp_options(skb, dataoff, tcph, sender);
> >
> > Note that the tcp_options() function implicitly assumes it is being
> > called on a SYN, because it sets state->td_scale to 0 and only sets
> > state->td_scale to something non-zero if it sees a wscale option. So
> > if we ever call that on an skb that's not a SYN, we will forget that
> > the connection is using the wscale option.
> >
> > But at this point in the code it is calling tcp_options() without
> > first checking that this is a SYN.
>
> Yes, that's the bug: tcp_options() must not be called if the syn bit is
> not set.
>
> > For this TFO scenario like the one in the trace, where the server
> > sends its first data packet after the SYNACK packet and before the
> > client's first ACK, presumably the conntrack state machine is
> > (correctly) SYN_RECV, and then (incorrectly) executes this code,
>
> Right. Jozsef, for context, the sequence in the trace is:
>
> S > C Flags [S], seq 3451342529, win 62580, options [mss 8940,sackOK,TS val 331187616 ecr 0,nop,wscale 7,tfo [|tcp]>
> C > S Flags [S.], seq 2699962254, ack 3451342530, win 65535, options [mss 1440,sackOK,TS val 1206542770 ecr 331187616,nop,wscale 8], length 0
> C > S Flags [P.], seq 1:89, ack 1, win 256, options [nop,nop,TS val 1206542772 ecr 331187616], length 88: SMTP [|smtp]
>
> Normally, the 3rd packet would be S > C, but this one is C > S.
>
> So, packet #3 hits the 'reinit' branch, which zaps the wscale option.
>
> > Someone more familiar with conntrack may have a good idea about how to
> > best fix this?
>
> Jozsef, does this look sane to you?
> It fixes the TFO capture and still passes the test case I made for
> 82b72cb94666b3dbd7152bb9f441b068af7a921b
> ("netfilter: conntrack: re-init state for retransmitted syn-ack").

As far as I can see, it'd break simultaneous open because
after(end, sender->td_end) is called in the new condition:

> diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
> index 8ec55cd72572..90ad1c0f23b1 100644
> --- a/net/netfilter/nf_conntrack_proto_tcp.c
> +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> @@ -556,33 +556,24 @@ static bool tcp_in_window(struct nf_conn *ct,
>                        }
>
>                }
> -       } else if (((state->state == TCP_CONNTRACK_SYN_SENT
> -                    && dir == IP_CT_DIR_ORIGINAL)
> -                  || (state->state == TCP_CONNTRACK_SYN_RECV
> -                    && dir == IP_CT_DIR_REPLY))
> -                  && after(end, sender->td_end)) {
> +       } else if (tcph->syn &&
> +                  after(end, sender->td_end) &&
> +                  (state->state == TCP_CONNTRACK_SYN_SENT ||
> +                   state->state == TCP_CONNTRACK_SYN_RECV)) {
>                /*
>                 * RFC 793: "if a TCP is reinitialized ... then it need
>                 * not wait at all; it must only be sure to use sequence
>                 * numbers larger than those recently used."
> -                */
> -               sender->td_end =
> -               sender->td_maxend = end;
> -               sender->td_maxwin = (win == 0 ? 1 : win);
> -
> -               tcp_options(skb, dataoff, tcph, sender);
> -       } else if (tcph->syn && dir == IP_CT_DIR_REPLY &&
> -                  state->state == TCP_CONNTRACK_SYN_SENT) {
> -               /* Retransmitted syn-ack, or syn (simultaneous open).
>                 *
> +                * also check for retransmitted syn-ack, or syn (simultaneous open).
>                 * Re-init state for this direction, just like for the first
>                 * syn(-ack) reply, it might differ in seq, ack or tcp options.
> +                *
> +                * Check for invalid syn-ack in original direction was already done.
>                 */
>                tcp_init_sender(sender, receiver,
>                                skb, dataoff, tcph,
>                                end, win);
> -               if (!tcph->ack)
> -                       return true;
>        }
>
>        if (!(tcph->ack)) {
>

I'd merge the two conditions so that it'd cover both original condition
branches:

diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 8ec55cd72572..87375ce2f995 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -556,33 +556,26 @@ static bool tcp_in_window(struct nf_conn *ct,
                        }

                }
-       } else if (((state->state == TCP_CONNTRACK_SYN_SENT
-                    && dir == IP_CT_DIR_ORIGINAL)
-                  || (state->state == TCP_CONNTRACK_SYN_RECV
-                    && dir == IP_CT_DIR_REPLY))
-                  && after(end, sender->td_end)) {
+       } else if (tcph->syn &&
+                  ((after(end, sender->td_end) &&
+                    (state->state == TCP_CONNTRACK_SYN_SENT ||
+                     state->state == TCP_CONNTRACK_SYN_RECV)) ||
+                   (dir == IP_CT_DIR_REPLY &&
+                    state->state == TCP_CONNTRACK_SYN_SENT))) {
                /*
                 * RFC 793: "if a TCP is reinitialized ... then it need
                 * not wait at all; it must only be sure to use sequence
                 * numbers larger than those recently used."
-                */
-               sender->td_end =
-               sender->td_maxend = end;
-               sender->td_maxwin = (win == 0 ? 1 : win);
-
-               tcp_options(skb, dataoff, tcph, sender);
-       } else if (tcph->syn && dir == IP_CT_DIR_REPLY &&
-                  state->state == TCP_CONNTRACK_SYN_SENT) {
-               /* Retransmitted syn-ack, or syn (simultaneous open).
                 *
+                * also check for retransmitted syn-ack, or syn (simultaneous open).
                 * Re-init state for this direction, just like for the first
                 * syn(-ack) reply, it might differ in seq, ack or tcp options.
+                *
+                * Check for invalid syn-ack in original direction was already done.
                 */
                tcp_init_sender(sender, receiver,
                                skb, dataoff, tcph,
                                end, win);
-               if (!tcph->ack)
-                       return true;
        }

        if (!(tcph->ack)) {

What do you think?

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply related [flat|nested] 38+ messages in thread
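The difference between the three branch conditions discussed above (the original one, Florian's, and Jozsef's merged one) may be easier to see when each is written as a plain predicate. The following is only a user-space sketch: struct pkt_ctx, the enum names and the cond_*() helpers are hypothetical, and seq_advanced stands in for the result of after(end, sender->td_end).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ct_dir   { DIR_ORIGINAL, DIR_REPLY };
enum ct_state { ST_SYN_SENT, ST_SYN_RECV, ST_ESTABLISHED };

/* The inputs the branch condition looks at, reduced to plain values. */
struct pkt_ctx {
    bool syn;            /* tcph->syn */
    bool seq_advanced;   /* stands for after(end, sender->td_end) */
    enum ct_dir dir;
    enum ct_state state;
};

/* Original first branch: no SYN check, so data segments can match. */
static bool cond_original(const struct pkt_ctx *p)
{
    assert(p != NULL);
    return ((p->state == ST_SYN_SENT && p->dir == DIR_ORIGINAL) ||
            (p->state == ST_SYN_RECV && p->dir == DIR_REPLY)) &&
           p->seq_advanced;
}

/* Florian's proposal: SYN required, and the sequence must have advanced. */
static bool cond_florian(const struct pkt_ctx *p)
{
    assert(p != NULL);
    return p->syn && p->seq_advanced &&
           (p->state == ST_SYN_SENT || p->state == ST_SYN_RECV);
}

/* Jozsef's merged condition: also accepts a reply-direction SYN in SYN_SENT. */
static bool cond_jozsef(const struct pkt_ctx *p)
{
    assert(p != NULL);
    return p->syn &&
           ((p->seq_advanced &&
             (p->state == ST_SYN_SENT || p->state == ST_SYN_RECV)) ||
            (p->dir == DIR_REPLY && p->state == ST_SYN_SENT));
}
```

For the TFO data segment (no SYN, reply direction, SYN_RECV, sequence advanced) only cond_original() is true, which is the bug. For a reply-direction SYN in SYN_SENT whose sequence does not compare as advanced, cond_florian() is false while cond_jozsef() is true, which is the simultaneous-open case Jozsef raises.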
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-06 19:04 ` Jozsef Kadlecsik
@ 2022-04-07 10:26 ` Florian Westphal
  2022-04-07 12:48   ` Jozsef Kadlecsik
  0 siblings, 1 reply; 38+ messages in thread
From: Florian Westphal @ 2022-04-07 10:26 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: Florian Westphal, Neal Cardwell, Eric Dumazet, Jaco Kroon,
      netfilter-devel, netdev

Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> I'd merge the two conditions so that it'd cover both original condition
> branches:
>
> diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
> index 8ec55cd72572..87375ce2f995 100644
> --- a/net/netfilter/nf_conntrack_proto_tcp.c
> +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> @@ -556,33 +556,26 @@ static bool tcp_in_window(struct nf_conn *ct,
>                        }
>
>                }
> -       } else if (((state->state == TCP_CONNTRACK_SYN_SENT
> -                    && dir == IP_CT_DIR_ORIGINAL)
> -                  || (state->state == TCP_CONNTRACK_SYN_RECV
> -                    && dir == IP_CT_DIR_REPLY))
> -                  && after(end, sender->td_end)) {
> +       } else if (tcph->syn &&
> +                  ((after(end, sender->td_end) &&
> +                    (state->state == TCP_CONNTRACK_SYN_SENT ||
> +                     state->state == TCP_CONNTRACK_SYN_RECV)) ||
> +                   (dir == IP_CT_DIR_REPLY &&
> +                    state->state == TCP_CONNTRACK_SYN_SENT))) {

That's what I did as well: I merged the two branches, but I made the
2nd clause stricter to also consider the after() test; it would no
longer re-init for syn-acks when the sequence did not advance.

Then, dir == IP_CT_DIR_REPLY && state == SYN_SENT is already covered
by the earlier test and can be elided.

I'm fine with your version though, will you submit a patch?

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-07 10:26 ` Florian Westphal
@ 2022-04-07 12:48 ` Jozsef Kadlecsik
  2022-04-21 21:14   ` Eric Dumazet
  0 siblings, 1 reply; 38+ messages in thread
From: Jozsef Kadlecsik @ 2022-04-07 12:48 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Neal Cardwell, Eric Dumazet, Jaco Kroon, netfilter-devel, netdev

On Thu, 7 Apr 2022, Florian Westphal wrote:

> Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > I'd merge the two conditions so that it'd cover both original condition
> > branches:
> >
> > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
> > index 8ec55cd72572..87375ce2f995 100644
> > --- a/net/netfilter/nf_conntrack_proto_tcp.c
> > +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> > @@ -556,33 +556,26 @@ static bool tcp_in_window(struct nf_conn *ct,
> >                        }
> >
> >                }
> > -       } else if (((state->state == TCP_CONNTRACK_SYN_SENT
> > -                    && dir == IP_CT_DIR_ORIGINAL)
> > -                  || (state->state == TCP_CONNTRACK_SYN_RECV
> > -                    && dir == IP_CT_DIR_REPLY))
> > -                  && after(end, sender->td_end)) {
> > +       } else if (tcph->syn &&
> > +                  ((after(end, sender->td_end) &&
> > +                    (state->state == TCP_CONNTRACK_SYN_SENT ||
> > +                     state->state == TCP_CONNTRACK_SYN_RECV)) ||
> > +                   (dir == IP_CT_DIR_REPLY &&
> > +                    state->state == TCP_CONNTRACK_SYN_SENT))) {
>
> That's what I did as well: I merged the two branches, but I made the
> 2nd clause stricter to also consider the after() test; it would no
> longer re-init for syn-acks when the sequence did not advance.

That's perfectly fine.

But what about simultaneous syn? The TCP state is zeroed in the REPLY
direction, so the after() test can easily be false and the state wouldn't
be picked up. Therefore I extended the condition.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply [flat|nested] 38+ messages in thread
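Why the after() test "can easily be false" against zeroed state follows from how TCP sequence numbers are compared modulo 2^32. The sketch below re-implements the usual wraparound-safe comparison in user space; seq_before() and seq_after() are stand-in names, and the premise that td_end is 0 for the REPLY direction is taken from the message above rather than checked against the source. The cast from uint32_t to int32_t assumes a two's complement target, which holds on all common platforms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if sequence number a comes before b, modulo 2^32. */
static bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* True if sequence number a comes after b, modulo 2^32. */
static bool seq_after(uint32_t a, uint32_t b)
{
    return seq_before(b, a);
}

/* Small self-check: the comparison keeps working across the 2^32 wrap. */
void seq_selfcheck(void)
{
    assert(seq_after(5u, 0xFFFFFFF0u));
    assert(!seq_after(0xFFFFFFF0u, 5u));
}
```

With a stored value of 0, seq_after(x, 0) is true only for x in the lower half of the sequence space. For example, the SYN-ACK sequence number 2699962254 from the trace quoted earlier in the thread does not compare as after 0, while 726465675 from the original report does, so a condition that requires after() would skip roughly half of all initial sequence numbers under that assumption.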
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-07 12:48 ` Jozsef Kadlecsik
@ 2022-04-21 21:14 ` Eric Dumazet
  2022-04-25  9:29   ` Florian Westphal
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Dumazet @ 2022-04-21 21:14 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: Florian Westphal, Neal Cardwell, Jaco Kroon, netfilter-devel, netdev

On Thu, Apr 7, 2022 at 5:48 AM Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
>
> On Thu, 7 Apr 2022, Florian Westphal wrote:
>
> > Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> > > I'd merge the two conditions so that it'd cover both original condition
> > > branches:
> > >
> > > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
> > > index 8ec55cd72572..87375ce2f995 100644
> > > --- a/net/netfilter/nf_conntrack_proto_tcp.c
> > > +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> > > @@ -556,33 +556,26 @@ static bool tcp_in_window(struct nf_conn *ct,
> > >                        }
> > >
> > >                }
> > > -       } else if (((state->state == TCP_CONNTRACK_SYN_SENT
> > > -                    && dir == IP_CT_DIR_ORIGINAL)
> > > -                  || (state->state == TCP_CONNTRACK_SYN_RECV
> > > -                    && dir == IP_CT_DIR_REPLY))
> > > -                  && after(end, sender->td_end)) {
> > > +       } else if (tcph->syn &&
> > > +                  ((after(end, sender->td_end) &&
> > > +                    (state->state == TCP_CONNTRACK_SYN_SENT ||
> > > +                     state->state == TCP_CONNTRACK_SYN_RECV)) ||
> > > +                   (dir == IP_CT_DIR_REPLY &&
> > > +                    state->state == TCP_CONNTRACK_SYN_SENT))) {
> >
> > That's what I did as well: I merged the two branches, but I made the
> > 2nd clause stricter to also consider the after() test; it would no
> > longer re-init for syn-acks when the sequence did not advance.
>
> That's perfectly fine.
>
> But what about simultaneous syn? The TCP state is zeroed in the REPLY
> direction, so the after() test can easily be false and the state wouldn't
> be picked up. Therefore I extended the condition.
>

Hi Jozsef and Florian,

Any updates on this issue?

Thanks!

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
  2022-04-21 21:14 ` Eric Dumazet
@ 2022-04-25  9:29 ` Florian Westphal
  0 siblings, 0 replies; 38+ messages in thread
From: Florian Westphal @ 2022-04-25 9:29 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jozsef Kadlecsik, Florian Westphal, Neal Cardwell, Jaco Kroon,
      netfilter-devel, netdev

Eric Dumazet <edumazet@google.com> wrote:
> Hi Jozsef and Florian,
>
> Any updates on this issue?

Sorry, I was away for a while. I will send the patch formally in a few
minutes.

^ permalink raw reply [flat|nested] 38+ messages in thread