From: Daire Byrne <daire@dneg.com>
To: linux-nfs <linux-nfs@vger.kernel.org>
Subject: parallel file create rates (+high latency)
Date: Sun, 23 Jan 2022 23:53:08 +0000
Message-ID: <CAPt2mGOaRsKOiL_wuSK_D5oYYnn0R-pvVsZc5HYGdEbT2FngtQ@mail.gmail.com>

Hi,

I've been experimenting a bit more with high-latency NFSv4.2 (200ms).
I've noticed a difference in file creation rates when parallel
processes on a single client mount create files in multiple
directories compared to creating them all in one shared directory.

If I start 100 processes on the same client creating unique files in a
single shared directory (with 200ms latency), the rate of new file
creates is limited to around 3 files per second. Something like this:

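# add latency to the client
sudo tc qdisc replace dev eth0 root netem delay 200ms

sudo mount -o vers=4.2,nocto,actimeo=3600 server:/data /tmp/data
# the shared target directory has to exist before the creates start
mkdir -p /tmp/data/dir1
# 100 parallel touch processes; -t echoes each command so pv can count them
for x in {1..10000}; do
    echo /tmp/data/dir1/touch.$x
done | xargs -P 100 -I X -t touch X 2>&1 | pv -l -a > /dev/null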

It's a similar (slow) result for NFSv3. If we run it again just to
update the existing files, it's a lot faster because of the
nocto,actimeo and open file caching (32 files/s).

Then if I switch it so that each process on the client creates
hundreds of files in a unique directory per process, the aggregate
file create rate increases to 32 per second. For NFSv3 it's 162
aggregate new files per second. So much better parallelism is possible
when the creates are spread across multiple remote directories on the
same client.
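The per-directory variant described above might look something like the
following sketch. The helper name create_per_dir, the dirN naming and the
argument order are assumptions for illustration, not the exact commands
behind the numbers quoted here. It starts one worker per directory and
each worker only ever creates files in its own directory:

```shell
#!/bin/sh
# Hypothetical helper (not from the original test):
#   create_per_dir BASE NPROC NFILES
# Starts NPROC parallel workers. Worker N creates NFILES empty files
# in BASE/dirN, so no two workers share a directory.
create_per_dir() {
    base=$1
    nproc=$2
    nfiles=$3
    seq 1 "$nproc" | xargs -P "$nproc" -I {} sh -c '
        # $1 = base path, $2 = worker index, $3 = files per worker
        dir=$1/dir$2
        mkdir -p "$dir"
        x=1
        while [ "$x" -le "$3" ]; do
            touch "$dir/touch.$x"
            x=$((x + 1))
        done
    ' _ "$base" {} "$nfiles"
}
```

Something like "time create_per_dir /tmp/data 100 100" would then give
an aggregate rate to compare against the single shared directory run
(10000 files divided by the elapsed seconds).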

If I then take the slow 3 creates per second example again and instead
use 10 client hosts (all with 200ms latency) and set them all creating
in the same remote server directory, then we get 3 x 10 = 30 creates
per second.

So we can get some parallel file create performance in the same
remote directory, just not from a single client running multiple
processes. That makes me think it's more of a client limitation
than a server locking issue?
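As a rough sanity check on that guess, here is a back-of-envelope
sketch. It assumes (and this is only an assumption, not something
measured here) that creates in one directory are fully serialized on
the client and that each create costs a whole number of network round
trips. The function name serial_rate is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: serial_rate RTT_SECONDS ROUND_TRIPS
# Prints the upper bound on creates per second if every create in the
# directory has to wait for the previous one to finish.
# Assumption: rate = 1 / (RTT * round trips per create).
serial_rate() {
    awk -v rtt="$1" -v trips="$2" \
        'BEGIN { printf "%.2f\n", 1 / (rtt * trips) }'
}
```

With a 200ms round trip, "serial_rate 0.2 1" prints 5.00 and
"serial_rate 0.2 2" prints 2.50. The observed ~3 creates per second
from one client falls between the two, which would at least be
consistent with serialized creates taking one to two round trips each.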

My interest in this (as always) is because while having hundreds of
processes creating files in the same directory might not be a common
workload, it is if you are re-exporting a filesystem and multiple
clients are creating new files for writing. For example a batch job
creating files in a common output directory.

Re-exporting is a useful way of caching mostly read-heavy workloads,
but performance then suffers for these metadata-heavy or write-heavy
workloads. The parallel performance (nfsd threads) through a single
client mountpoint just can't compete with clients connected directly
to the originating server.

Does anyone have any idea what the specific bottlenecks are here for
parallel file creates from a single client to a single directory?

Cheers,

Daire
