From: Daire Byrne <daire@dneg.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: parallel file create rates (+high latency)
Date: Mon, 24 Jan 2022 20:10:07 +0000
Message-ID: <CAPt2mGOCn5OaeZm24+zh92qRcWTF8h-H2WXqScz9RMfo4r_-Qw@mail.gmail.com>
In-Reply-To: <20220124193759.GA4975@fieldses.org>

On Mon, 24 Jan 2022 at 19:38, J. Bruce Fields <bfields@fieldses.org> wrote:
>
> On Sun, Jan 23, 2022 at 11:53:08PM +0000, Daire Byrne wrote:
> > I've been experimenting a bit more with high latency NFSv4.2 (200ms).
> > I've noticed a difference between the file creation rates when you
> > have parallel processes running against a single client mount creating
> > files in multiple directories compared to in one shared directory.
>
> The Linux VFS requires an exclusive lock on the directory while you're
> creating a file.

Right. So when I mounted the same server/dir multiple times using
namespaces, all I was really doing was making the VFS *think* I wanted
locks on different directories even though the remote server directory
was actually the same?

> So, if L is the time in seconds required to create a single file, you're
> never going to be able to create more than 1/L files per second, because
> there's no parallelism.

And things like directory delegations can't help with this kind of
workload? I guess you can't batch directory locks or file creates.
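
As a rough illustration of the 1/L bound quoted above, here is a minimal
sketch of an idealized model: creates within one directory are fully
serialized, and creates in different directories can overlap. The function
name, the worker count, and the min(directories, workers) simplification
are assumptions made for illustration, not measurements from this thread.

```python
def max_create_rate(latency_s: float, directories: int, workers: int) -> float:
    """Upper bound on creates per second in the idealized model.

    Assumption: each directory allows one create in flight at a time
    (the exclusive directory lock), so the number of concurrent creates
    is limited by whichever is smaller: directories or workers.
    """
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    in_flight = min(directories, workers)
    return in_flight / latency_s


# 200 ms per create, 16 parallel processes (hypothetical worker count).
one_dir = max_create_rate(0.2, directories=1, workers=16)
many_dirs = max_create_rate(0.2, directories=8, workers=16)
print(f"one shared directory: {one_dir:.1f} creates/s")
print(f"eight directories:    {many_dirs:.1f} creates/s")
```

In this model, adding processes does nothing for a single shared directory;
only more directories (or more independent lock domains) raise the ceiling.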

> So, it's not surprising you'd get a higher rate when creating in
> multiple directories.
>
> Also, that lock's taken on both client and server.  So it makes sense
> that you might get a little more parallelism from multiple clients.
>
> So the usual advice is just to try to get that latency number as low as
> possible, by using a low-latency network and storage that can commit
> very quickly.  (An NFS server isn't permitted to reply to the RPC
> creating the new file until the new file actually hits stable storage.)
>
> Are you really seeing 200ms in production?

Yeah, it's just a (crazy) test for now. This is the latency between two
of our offices. Running batch jobs over this kind of latency with an
NFS re-export server doing all the caching works surprisingly well.

It's just these file creations that are the deal breaker. A batch job
might create 100,000+ files in a single directory across many clients.
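
To put numbers on that, here is a back-of-the-envelope sketch. It assumes
each create costs exactly one 200ms round trip and that creates in the
single directory are fully serialized; the real per-create time L is
probably higher once server commit time is included, so treat this as a
lower bound on the wall-clock time. The function name is hypothetical.

```python
def single_dir_create_time_s(files: int, latency_ms: int) -> float:
    """Seconds needed to create `files` files one at a time.

    Assumption: one create in flight at a time, each taking latency_ms.
    """
    return files * latency_ms / 1000


total_s = single_dir_create_time_s(100_000, 200)
print(f"{total_s:.0f} s, about {total_s / 3600:.1f} hours")
```

Under those assumptions, 100,000 creates take 20,000 seconds, roughly five
and a half hours, no matter how many clients are issuing them.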

Maybe many containerised re-export servers in round-robin with a
common cache is the only way to get more directory locks and file
creates in flight at the same time.

Cheers,

Daire

Thread overview: 36+ messages
2022-01-23 23:53 parallel file create rates (+high latency) Daire Byrne
2022-01-24 13:52 ` Daire Byrne
2022-01-24 19:37 ` J. Bruce Fields
2022-01-24 20:10   ` Daire Byrne [this message]
2022-01-24 20:50     ` J. Bruce Fields
2022-01-25 12:52       ` Daire Byrne
2022-01-25 13:59         ` J. Bruce Fields
2022-01-25 15:24           ` Daire Byrne
2022-01-25 15:30           ` Chuck Lever III
2022-01-25 21:50             ` Patrick Goetz
2022-01-25 21:58               ` Chuck Lever III
2022-01-25 21:59               ` Bruce Fields
2022-01-25 22:11                 ` Patrick Goetz
2022-01-25 22:41                   ` Daire Byrne
2022-01-25 23:01                     ` Patrick Goetz
2022-01-25 23:25                       ` Daire Byrne
2022-01-25 21:15   ` Patrick Goetz
2022-01-25 21:20     ` J. Bruce Fields
2022-01-26  0:02       ` NeilBrown
2022-01-26  0:28         ` Daire Byrne
2022-01-26  2:57         ` J. Bruce Fields
2022-02-08 18:48           ` Daire Byrne
2022-02-10 18:19             ` Daire Byrne
2022-02-11 15:59               ` J. Bruce Fields
2022-02-17 19:50                 ` Daire Byrne
2022-02-18  7:46                   ` NeilBrown
2022-02-21 13:59                     ` Daire Byrne
2022-04-25 13:00                       ` Daire Byrne
2022-04-25 13:22                         ` J. Bruce Fields
2022-04-25 15:24                           ` Daire Byrne
2022-04-25 16:02                             ` J. Bruce Fields
2022-04-25 16:47                               ` Daire Byrne
2022-04-26  1:36                                 ` NeilBrown
2022-04-26 12:29                                   ` Daire Byrne
2022-04-28  5:46                                     ` NeilBrown
2022-04-29  7:55                                       ` Daire Byrne
