On Mon, Nov 20, 2017 at 2:31 AM, Steve Ibanez wrote:
> Hi Folks,
>
> I wanted to check back in on this for another update and to solicit
> some more suggestions. I did a bit more digging to try and isolate the
> problem.

Going back to one of your Oct 19 trace snapshots (attached), AFAICT at
the time of the timeout there is actually almost 64KBytes
(352553398 + 1448 - 352489686 = 65160) of unacknowledged data. So there
really does seem to be a significant chunk of packets that were
in-flight that were then declared lost.

So here is a possibility: perhaps the combination of CWR+PRR plus
tcp_tso_should_defer() means that PRR can make the cwnd adjustment so
gentle that tcp_tso_should_defer() thinks we should wait for another
ACK before sending, and that ACK never comes. Breaking it down, the
potential sequence would be:

(1) tcp_write_xmit() does not send, because the CWR behavior, using
    PRR, does not leave enough cwnd for tcp_tso_should_defer() to think
    we should send (PRR was originally designed for recovery, which did
    not have TSO deferral)

(2) TLP does not fire, because we are in state CWR, not Open

(3) The only remaining option is an RTO, which fires.

In other words, the possibility is that, at the time of the stall, the
cwnd is reasonably high, but tcp_packets_in_flight() is also quite
high, so either:

(a) there is literally no unused cwnd left
    (tcp_packets_in_flight() == cwnd), or

(b) some mechanism like tcp_tso_should_defer() is deciding that there
    is not enough available cwnd for it to make sense to chop off a
    fraction of a TSO skb to send now.

One way to test that conjecture would be to disable
tcp_tso_should_defer() by adding a:

   goto send_now;

at the top of tcp_tso_should_defer().
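
To make the conjecture concrete, here is a rough user-space C sketch of
the reasoning above. It is a toy model, not kernel code: every toy_*
name is made up for illustration, and defer_min is a hypothetical
stand-in for whatever threshold tcp_tso_should_defer() effectively
applies (the real function is considerably more involved). It only
encodes the three steps as I described them, plus the sequence-number
arithmetic from the trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for the congestion-avoidance states (hypothetical names). */
enum toy_ca_state { TOY_CA_OPEN, TOY_CA_DISORDER, TOY_CA_CWR,
                    TOY_CA_RECOVERY, TOY_CA_LOSS };

/* What the modeled sender ends up waiting on. */
enum toy_outcome { TOY_SENDS_NOW, TOY_WAITS_FOR_TLP, TOY_WAITS_FOR_RTO };

struct toy_sender {
    enum toy_ca_state ca_state;
    uint32_t cwnd;       /* congestion window, in packets */
    uint32_t in_flight;  /* what tcp_packets_in_flight() would report */
    uint32_t defer_min;  /* ASSUMPTION: spare cwnd the deferral check wants;
                            0 models the "goto send_now" experiment */
};

/* Bytes sent but not yet ACKed: end of the last segment minus snd_una.
 * Unsigned 32-bit arithmetic wraps modulo 2^32, as TCP sequence numbers do. */
static uint32_t toy_unacked_bytes(uint32_t last_seq, uint32_t last_len,
                                  uint32_t snd_una)
{
    return last_seq + last_len - snd_una;
}

/* Walk through steps (1)-(3) of the conjectured stall. */
static enum toy_outcome toy_next_event(const struct toy_sender *s)
{
    assert(s != NULL);

    /* Unused cwnd, clamped at zero if in_flight exceeds cwnd. */
    uint32_t spare = s->cwnd > s->in_flight ? s->cwnd - s->in_flight : 0;

    /* (1) Send only if some cwnd is unused and the deferral check agrees. */
    if (spare > 0 && spare >= s->defer_min)
        return TOY_SENDS_NOW;

    /* (2) Per the analysis above, TLP is only available in state Open. */
    if (s->ca_state == TOY_CA_OPEN)
        return TOY_WAITS_FOR_TLP;

    /* (3) Nothing else is left, so the sender sits until the RTO fires. */
    return TOY_WAITS_FOR_RTO;
}
```

In this model, a sender in CWR with cwnd 10, 8 packets in flight and
defer_min 4 ends up waiting for the RTO, while the same sender with
defer_min 0 sends immediately; that is case (b). With cwnd 10 and 10
packets in flight it waits for the RTO regardless of defer_min, which
is case (a), so the goto send_now experiment should help tell the two
cases apart.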

If disabling the deferral doesn't prevent the freezes, then I would
recommend adding printks or other instrumentation to tcp_write_xmit()
to log:

- time
- ca_state
- cwnd
- ssthresh
- tcp_packets_in_flight()
- the reason for breaking out of the tcp_write_xmit() loop (TSO
  deferral, no packets left, tcp_snd_wnd_test, tcp_nagle_test, etc.)

cheers,
neal