From: Neal Cardwell
Subject: Re: Linux ECN Handling
Date: Wed, 3 Jan 2018 14:39:29 -0500
To: Steve Ibanez
Cc: Eric Dumazet, Yuchung Cheng, Daniel Borkmann, Netdev, Florian Westphal, Mohammad Alizadeh, Lawrence Brakmo
References: <20171019124312.GE16796@breakpoint.cc> <5A006CF6.1020608@iogearbox.net>

On Tue, Jan 2, 2018 at 6:57 PM, Steve Ibanez wrote:
> Hi Neal,
>
> Sorry, my last email was incorrect. It turns out the default tcp
> congestion control alg that was being used on my client machines was
> cubic instead of dctcp. That is why tp->processing_cwr field was never
> set in the tcp_rcv_established function. I've changed the default back
> to dctcp on all of my machines.
>
> I am now logging the value of tp->rcv_nxt at the top of the
> tcp_transmit_skb() function for all CWR segments. I see that during
> normal operation, the value of tp->rcv_nxt is equal to the SeqNo in
> the CWR segment + length of the CWR segment.

OK, thanks. That makes sense.

This part I didn't understand:

> However, for the unACKed
> CWR segment, the value of tp->rcv_nxt is just equal to the SeqNo in
> the CWR segment (i.e. not incremented by the length). And I see that
> by the time the tcp_ack_snd_check() function is executed, tp->rcv_nxt
> has been incremented by the length of the unACKed CWR segment.

I would have thought that for the processing of the skb that has the
CWR, the sequence would be:

(1) "...the tcp_ack_snd_check() function is executed, tp->rcv_nxt has
    been incremented by the length of the unACKed CWR segment"

(2) then we send the ACK, and the instrumentation at the top of the
    tcp_transmit_skb() function logs that rcv_nxt value (which "has
    been incremented by the length of the unACKed CWR segment").

But you are saying "for the unACKed CWR segment, the value of
tp->rcv_nxt is just equal to the SeqNo in the CWR segment (i.e. not
incremented by the length)", which does not seem to match my
prediction in (2). Apparently I am mis-understanding the sequence.
Perhaps you can help clear it up for me? :-)

Is it possible that the case where you see "tp->rcv_nxt is just equal
to the SeqNo in the CWR segment" is a log line that was logged while
processing the skb that precedes the skb with the CWR?

> The tcp_transmit_skb() function sets the outgoing segment's ack_seq to
> be tp->rcv_next:
>
>     th->ack_seq = htonl(tp->rcv_nxt);
>
> So I think the rcv_nxt field is supposed to be incremented before
> reaching tcp_transmit_skb(). Can you see any reason as to why this
> field would not be incremented for CWR segments sometimes?

No, so far I haven't been able to think of a reason why rcv_nxt would
not be incremented for in-order CWR-marked segments...

cheers,
neal
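
For readers following the thread, the expected ordering in (1) and (2) can be
sketched as a small user-space model. This is not kernel code and makes no
claim about where the reported behaviour comes from. The names toy_sock,
toy_hdr, toy_rcv_in_order(), toy_transmit_ack() and toy_process_segment() are
hypothetical stand-ins invented for illustration. Only the assignment
ack_seq = htonl(rcv_nxt) is taken from the message. The sketch assumes an
in-order segment advances rcv_nxt by its payload length before any ACK is
built.

```c
#include <arpa/inet.h>  /* htonl(), ntohl() */
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the receiver state. */
struct toy_sock {
    uint32_t rcv_nxt;   /* next sequence number expected from the peer */
};

/* Hypothetical, heavily simplified stand-in for a TCP segment header. */
struct toy_hdr {
    uint32_t seq;       /* sequence number, host byte order for simplicity */
    uint32_t len;       /* payload length in bytes */
    uint32_t ack_seq;   /* ACK number, network byte order as on the wire */
};

/* Step (1): accept an in-order segment and advance rcv_nxt by its length.
 * Returns 1 if the segment was in order, 0 otherwise (state unchanged). */
static int toy_rcv_in_order(struct toy_sock *tp, const struct toy_hdr *in)
{
    if (in->seq != tp->rcv_nxt)
        return 0;
    tp->rcv_nxt = in->seq + in->len;   /* uint32_t arithmetic wraps like seq space */
    return 1;
}

/* Step (2): build the ACK from the already-advanced rcv_nxt,
 * mirroring the quoted line th->ack_seq = htonl(tp->rcv_nxt). */
static void toy_transmit_ack(const struct toy_sock *tp, struct toy_hdr *out)
{
    out->seq = 0;
    out->len = 0;
    out->ack_seq = htonl(tp->rcv_nxt);
}

/* Runs (1) then (2) for one in-order segment and returns the ACK number
 * in host byte order. */
static uint32_t toy_process_segment(struct toy_sock *tp, const struct toy_hdr *in)
{
    struct toy_hdr ack;
    int ok = toy_rcv_in_order(tp, in);

    assert(ok);   /* this helper is only meant for in-order segments */
    (void)ok;
    toy_transmit_ack(tp, &ack);
    return ntohl(ack.ack_seq);
}
```

Under this model, a segment with seq 1000 and length 1448 arriving while
rcv_nxt is 1000 produces an ACK number of 2448. An ACK number equal to the
segment's own seq would only appear if the ACK were built before step (1) ran
for that segment, which is the possibility raised in the question above.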