All of lore.kernel.org
 help / color / mirror / Atom feed
From: Neal Cardwell <ncardwell@google.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	Yuchung Cheng <ycheng@google.com>,
	Nandita Dukkipati <nanditad@google.com>,
	Van Jacobson <vanj@google.com>
Subject: Re: [PATCH v2 net-next] tcp: remove a bogus TSO split
Date: Fri, 13 Dec 2013 13:17:32 -0500	[thread overview]
Message-ID: <CADVnQy=ayiUVjXPscUkONDOGhfQ7xDyqn=KFzrndjw2mrPufrw@mail.gmail.com> (raw)
In-Reply-To: <1386958426.19078.162.camel@edumazet-glaptop2.roam.corp.google.com>

On Fri, Dec 13, 2013 at 1:13 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> While investigating performance problems on small RPC workloads,
> I noticed linux TCP stack was always splitting the last TSO skb
> into two parts (skbs). One being a multiple of MSS, and a small one
> with the Push flag. This split is done even if TCP_NODELAY is set.
>
...
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Nandita Dukkipati <nanditad@google.com>
> Cc: Van Jacobson <vanj@google.com>
> ---
> v2: changed tcp_minshall_update() as Neal pointed out.

It occurred to me after staring at this code a little longer that
although the Nagle test is already done before the call to
tcp_mss_split_point(), the tcp_nagle_check() is only run if
tso_segs==1 and is only checking whether skb->len < mss_now, so the
Nagle code currently is implicitly assuming that if there is an skb
that is not an exact multiple of MSS, then the tcp_mss_split_point()
will chop off the odd bytes at the end and we'll loop back to the top
of the tcp_write_xmit() loop to make a Nagle decision on an skb that
has tso_segs==1.

So if we just make this improvement to tcp_mss_split_point() and the
adjustment to tcp_minshall_update() (as in patch v2), we will still
have broken Nagle behavior in a scenario like:

- suppose MSS=1460 and Nagle is enabled (TCP_NODELAY is not set)

- suppose there is an outstanding un-ACKed small packet, so that
tcp_minshall_check() returns true

- user writes 2000 bytes

- tso_segs is 2, so we do not call tcp_nagle_test()

- new version of tcp_mss_split_point() sends out the full 2000 bytes,
  leading to having 2 packets smaller than an MSS un-ACKed in the network,
  violating the invariant the minshall/nagle code is trying to maintain
  or having only one such packet un-ACKed in the network.

Previously, tcp_mss_split_point() would have split off the 2000-1460
bytes into a new skb, we would have looped back and executed the
tcp_nagle_test() code and decided not to send that small sub-MSS
2000-1460 packet.

neal

  reply	other threads:[~2013-12-13 18:17 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-12 19:28 [PATCH net-next] tcp: remove a bogus TSO split Eric Dumazet
2013-12-13 14:15 ` Neal Cardwell
2013-12-13 14:54   ` Eric Dumazet
2013-12-13 16:22     ` Neal Cardwell
2013-12-13 17:54       ` Eric Dumazet
2013-12-13 16:58   ` David Laight
2013-12-13 17:56     ` Eric Dumazet
2013-12-13 18:13 ` [PATCH v2 " Eric Dumazet
2013-12-13 18:17   ` Neal Cardwell [this message]
2013-12-13 18:38     ` Eric Dumazet
2013-12-13 21:51   ` [PATCH v3 net-next] tcp: refine TSO splits Eric Dumazet
2013-12-14  1:59     ` Neal Cardwell
2013-12-14  2:05       ` Eric Dumazet
2013-12-17 20:15         ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CADVnQy=ayiUVjXPscUkONDOGhfQ7xDyqn=KFzrndjw2mrPufrw@mail.gmail.com' \
    --to=ncardwell@google.com \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=nanditad@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=vanj@google.com \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.