From: Dean <seattleplus@gmail.com>
To: "J.Bruce Fields" <bfields@citi.umich.edu>
Cc: NeilBrown <neilb@suse.de>,
Olga Kornievskaia <aglo@citi.umich.edu>,
NFS <linux-nfs@vger.kernel.org>
Subject: Re: Is tcp autotuning really what NFS wants?
Date: Wed, 10 Jul 2013 10:33:09 -0700
Message-ID: <51DD9AD5.1030508@gmail.com>
In-Reply-To: <20130710022735.GI8281@fieldses.org>
> This could significantly limit the amount of parallelism that can be
> achieved for a single TCP connection (and given that the
> Linux client strongly prefers a single connection now, this could
> become more of an issue).
I understand the simplicity of using a single TCP connection, but
performance-wise it is definitely not the way to go on WAN links. When
even a minuscule amount of packet loss is added to the link (<0.001%
packet loss), the TCP buffer collapses and performance drops
significantly (especially on 10GigE WAN links). I think new TCP
congestion-control algorithms could help the problem somewhat, but
nothing available today makes much of a difference vs. CUBIC.
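As a rough illustration of how little loss it takes, here is a
back-of-the-envelope sketch using the Mathis et al. approximation for a
single loss-limited TCP flow. Note the assumptions: that model describes
Reno-style congestion control rather than CUBIC, and the MSS, RTT, and
link values below are hypothetical, not measurements from this thread.

```python
import math


def loss_limited_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate steady-state rate (bits/s) of one Reno-style TCP flow.

    Mathis et al. approximation: rate ~= (MSS / RTT) * C / sqrt(p),
    with C = sqrt(3/2). This is a model, not a measurement.
    """
    if loss_rate <= 0 or rtt_s <= 0:
        raise ValueError("loss_rate and rtt_s must be positive")
    c = math.sqrt(3 / 2)
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)


# Hypothetical WAN path: 1460-byte MSS, 50 ms RTT, 0.001% packet loss.
LINK_BPS = 10e9  # 10GigE
rate = loss_limited_throughput_bps(1460, 0.050, 1e-5)
print(f"single flow: {rate / 1e6:.1f} Mbit/s "
      f"({100 * rate / LINK_BPS:.2f}% of the link)")
```

Under these assumptions a single flow lands roughly two orders of
magnitude below the 10GigE line rate, regardless of how large the
buffer is allowed to grow.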
Using multiple TCP connections allows better saturation of the link,
since when packet loss occurs on one stream, the other streams can fill
the void. Today, the only solutions are to scale up the number of
physical clients, which has high coordination overhead, or to use a WAN
accelerator such as Bitspeed or Riverbed (which come with their own
issues such as extra hardware, cost, etc.).
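To put a number on the multi-connection argument, the same kind of
estimate can be scaled to N parallel streams. This is only a sketch: it
assumes each stream sees independent loss, that a Reno-style (Mathis)
model applies, and that the streams do not interfere with each other
until the link itself is full. All parameter values are hypothetical.

```python
import math


def per_stream_bps(mss_bytes, rtt_s, loss_rate):
    """Mathis-style estimate of one loss-limited flow, in bits/s."""
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)


def aggregate_bps(n_streams, link_bps, mss_bytes, rtt_s, loss_rate):
    """Aggregate rate of n independent streams, capped at the link rate."""
    total = n_streams * per_stream_bps(mss_bytes, rtt_s, loss_rate)
    return min(total, link_bps)


def streams_to_fill(link_bps, mss_bytes, rtt_s, loss_rate):
    """Smallest stream count whose summed estimate reaches the link rate."""
    return math.ceil(link_bps / per_stream_bps(mss_bytes, rtt_s, loss_rate))


# Hypothetical path: 10GigE, 1460-byte MSS, 50 ms RTT, 0.001% loss.
link, mss, rtt, loss = 10e9, 1460, 0.050, 1e-5
for n in (1, 8, 64):
    gbps = aggregate_bps(n, link, mss, rtt, loss) / 1e9
    print(f"{n:3d} streams: {gbps:.2f} Gbit/s")
print("streams needed to fill the link:", streams_to_fill(link, mss, rtt, loss))
```

The model ignores shared-bottleneck effects, so it overstates the
benefit at high stream counts, but it shows why one connection cannot
recover from loss as quickly as several can.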
> It does make a difference on high bandwidth-product networks (something
> people have also hit). I'd rather not regress there and also would
> rather not require manual tuning for something we should be able to get
> right automatically.
Prior to this patch, the TCP buffer was fixed at such a small size
(especially for writes) that the idea of parallelism was moot anyway.
Whatever the TCP buffer negotiates to now is definitely bigger than what
was there beforehand, which I think is borne out by the fact that no
performance regression was found.
Regressing to the old way is a death knell for any system with a
delay of >1ms or a bandwidth of >1GigE, so I definitely hope we never go
there. Of course, now that autoscaling allows the TCP buffer to grow to
reasonable values to achieve good performance for 10+GigE and WAN links,
if we can improve the parallelism/stability even further, that would be
great.
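For reference, the buffer a connection needs to keep a link busy is the
bandwidth-delay product, and a fixed buffer caps throughput at
buffer/RTT. The sketch below works through that arithmetic; the 64 KiB
fixed buffer and the 50 ms RTT are hypothetical values chosen for
illustration, not the sizes the old code actually used.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to fill the pipe."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(buffer_bytes, rtt_s):
    """Upper bound on throughput when at most buffer_bytes are in flight."""
    return buffer_bytes * 8 / rtt_s


# Thresholds from the message: 1 ms of delay on a 1GigE link.
print(f"1GigE @ 1 ms:   {bdp_bytes(1e9, 0.001) / 1024:.0f} KiB needed")

# Hypothetical WAN case: 10GigE with a 50 ms round trip.
print(f"10GigE @ 50 ms: {bdp_bytes(10e9, 0.050) / 1e6:.1f} MB needed")

# Hypothetical fixed 64 KiB buffer on that same WAN path.
capped = window_limited_bps(64 * 1024, 0.050)
print(f"64 KiB buffer @ 50 ms: {capped / 1e6:.1f} Mbit/s at most")
```

Any fixed buffer smaller than the bandwidth-delay product leaves the
link idle for part of every round trip, which is the case autotuning
avoids.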
Dean