From: Dean <seattleplus@gmail.com>
To: "J.Bruce Fields" <bfields@citi.umich.edu>
Cc: NeilBrown <neilb@suse.de>,
	Olga Kornievskaia <aglo@citi.umich.edu>,
	NFS <linux-nfs@vger.kernel.org>
Subject: Re: Is tcp autotuning really what NFS wants?
Date: Wed, 10 Jul 2013 10:33:09 -0700	[thread overview]
Message-ID: <51DD9AD5.1030508@gmail.com> (raw)
In-Reply-To: <20130710022735.GI8281@fieldses.org>

> This could significantly limit the amount of parallelism that can be
> achieved for a single TCP connection (and given that the Linux
> client strongly prefers a single connection now, this could become
> more of an issue).

I understand the simplicity of using a single TCP connection, but
performance-wise it is definitely not the way to go on WAN links.
When even a minuscule amount of packet loss is added to the link
(<0.001% packet loss), the TCP congestion window collapses and
performance drops significantly (especially on 10GigE WAN links).  I
think new TCP congestion-control algorithms could help the problem
somewhat, but nothing available today makes much of a difference vs.
cubic.
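
As a rough illustration of why even tiny loss rates hurt so badly on
a long path, the classic Mathis model estimates steady-state TCP
throughput as MSS * C / (RTT * sqrt(p)).  A minimal sketch in Python,
assuming a 50 ms WAN round trip and the Reno-style constant
C ~= 1.22 (the numbers are illustrative, not measurements):

# Steady-state TCP throughput per the Mathis model:
#   throughput ~= (MSS * C) / (RTT * sqrt(p))
# All numbers below are illustrative assumptions, not measurements.
import math

MSS = 1460     # bytes, a typical Ethernet MSS
C = 1.22       # Mathis constant for Reno-style loss recovery
rtt = 0.050    # seconds, an assumed 50 ms WAN round trip

for loss in (1e-6, 1e-5, 1e-4, 1e-3):
    bps = (MSS * 8 * C) / (rtt * math.sqrt(loss))
    print("loss=%.0e  ~%4.0f Mbit/s" % (loss, bps / 1e6))

Even a loss rate of 1e-5 (0.001%) caps a single stream at roughly
90 Mbit/s on that path, two orders of magnitude below 10GigE.  Cubic
recovers faster than Reno, but the overall shape is the same.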

Using multiple TCP connections allows better saturation of the link,
since when packet loss occurs on one stream, the other streams can
fill the void.  Today, the only alternatives are to scale up the
number of physical clients, which has high coordination overhead, or
to use a WAN accelerator such as Bitspeed or Riverbed (which comes
with its own issues such as extra hardware, cost, etc.).
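
By the same back-of-the-envelope model, N independent streams each
detect and recover from loss separately, so to first order the
aggregate scales with N until the link saturates.  A hypothetical
continuation of the sketch above, with the same assumed numbers:

# N independent streams each recover from loss separately, so to
# first order the aggregate is N times the single-stream Mathis
# estimate, capped by the link rate.  Same assumed numbers as above.
import math

MSS, C, rtt, loss, link = 1460, 1.22, 0.050, 1e-5, 10e9
single = (MSS * 8 * C) / (rtt * math.sqrt(loss))  # bits/s per stream

for n in (1, 4, 16, 64):
    print("%3d streams  ~%5.2f Gbit/s" % (n, min(n * single, link) / 1e9))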


> It does make a difference on high bandwidth-delay product networks
> (something people have also hit).  I'd rather not regress there and
> also would rather not require manual tuning for something we should
> be able to get right automatically.

Prior to this patch, the TCP buffer was fixed to such a small size
(especially for writes) that the idea of parallelism was moot anyway.
Whatever size the TCP buffer negotiates to now is definitely bigger
than what was there beforehand, which I think is borne out by the
fact that no performance regression was found.
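
For context on why the old fixed size was so limiting: a sender can
only keep one buffer's worth of data in flight, so throughput is
capped at roughly buffer / RTT, and filling a path takes a buffer of
at least bandwidth * delay.  A minimal sketch with assumed numbers
(the buffer sizes are hypothetical, not the exact values the old
code used):

# A window-limited TCP sender is capped at ~buffer / RTT, and
# saturating a path needs a buffer of at least bandwidth * delay.
# Buffer sizes below are hypothetical, for illustration only.
rtt = 0.050      # assumed 50 ms WAN round trip
link = 10e9      # 10GigE

bdp = link / 8 * rtt    # bytes in flight needed to fill the link
print("BDP: %.0f MB" % (bdp / 1e6))   # ~62 MB

for buf in (256 * 1024, 4 * 1024 * 1024, int(bdp)):
    print("%6.2f MB buffer -> ~%6.3f Gbit/s max"
          % (buf / 1e6, buf * 8 / rtt / 1e9))

With autotuning the kernel can instead grow the socket buffers toward
the net.ipv4.tcp_rmem / tcp_wmem maxima as the connection needs them,
rather than pinning them at a small fixed size.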

Regressing back to the old way would be a death knell for any system
with a delay of >1ms or a bandwidth of >1GigE, so I definitely hope
we never go there.  Of course, now that autotuning allows the TCP
buffer to grow large enough to achieve good performance on 10+GigE
and WAN links, if we can improve the parallelism/stability even
further, that would be great.
Dean
