All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Neal Cardwell <ncardwell@google.com>
Cc: David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	Yuchung Cheng <ycheng@google.com>,
	Nandita Dukkipati <nanditad@google.com>,
	Van Jacobson <vanj@google.com>
Subject: Re: [PATCH v2 net-next] tcp: remove a bogus TSO split
Date: Fri, 13 Dec 2013 10:38:46 -0800	[thread overview]
Message-ID: <1386959926.19078.169.camel@edumazet-glaptop2.roam.corp.google.com> (raw)
In-Reply-To: <CADVnQy=ayiUVjXPscUkONDOGhfQ7xDyqn=KFzrndjw2mrPufrw@mail.gmail.com>

On Fri, 2013-12-13 at 13:17 -0500, Neal Cardwell wrote:

> It occurred to me after staring at this code a little longer that
> although the Nagle test is already done before the call to
> tcp_mss_split_point(), the tcp_nagle_check() is only run if
> tso_segs==1 and is only checking whether skb->len < mss_now, so the
> Nagle code currently is implicitly assuming that if there is an skb
> that is not an exact multiple of MSS, then the tcp_mss_split_point()
> will chop off the odd bytes at the end and we'll loop back to the top
> of the tcp_write_xmit() loop to make a Nagle decision on an skb that
> has tso_segs==1.
> 
> So if we just make this improvement to tcp_mss_split_point() and the
> adjustment to tcp_minshall_update() (as in patch v2), we will still
> have broken Nagle behavior in a scenario like:
> 
> - suppose MSS=1460 and Nagle is enabled (TCP_NODELAY is not set)
> 
> - suppose there is an outstanding un-ACKed small packet, so that
> tcp_minshall_check() returns true
> 
> - user writes 2000 bytes
> 
> - tso_segs is 2, so we do not call tcp_nagle_test()
> 
> - new version of tcp_mss_split_point() sends out the full 2000 bytes,
>   leading to having 2 packets smaller than an MSS un-ACKed in the network,
>   violating the invariant the minshall/nagle code is trying to maintain
>   or having only one such packet un-ACKed in the network.
> 
> Previously, tcp_mss_split_point() would have split off the 2000-1460
> bytes into a new skb, we would have looped back and executed the
> tcp_nagle_test() code and decided not to send that small sub-MSS
> 2000-1460 packet.
> 

Hmm, it seems we need to refactor a bit.

tcp_snd_test() seems to not care of TSO being partial.

  reply	other threads:[~2013-12-13 18:38 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-12 19:28 [PATCH net-next] tcp: remove a bogus TSO split Eric Dumazet
2013-12-13 14:15 ` Neal Cardwell
2013-12-13 14:54   ` Eric Dumazet
2013-12-13 16:22     ` Neal Cardwell
2013-12-13 17:54       ` Eric Dumazet
2013-12-13 16:58   ` David Laight
2013-12-13 17:56     ` Eric Dumazet
2013-12-13 18:13 ` [PATCH v2 " Eric Dumazet
2013-12-13 18:17   ` Neal Cardwell
2013-12-13 18:38     ` Eric Dumazet [this message]
2013-12-13 21:51   ` [PATCH v3 net-next] tcp: refine TSO splits Eric Dumazet
2013-12-14  1:59     ` Neal Cardwell
2013-12-14  2:05       ` Eric Dumazet
2013-12-17 20:15         ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1386959926.19078.169.camel@edumazet-glaptop2.roam.corp.google.com \
    --to=eric.dumazet@gmail.com \
    --cc=davem@davemloft.net \
    --cc=nanditad@google.com \
    --cc=ncardwell@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=vanj@google.com \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.