From: Trond Myklebust <trondmy@hammerspace.com>
To: "bcodding@redhat.com" <bcodding@redhat.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v9 14/27] NFS: Improve heuristic for readdirplus
Date: Thu, 10 Mar 2022 20:15:04 +0000
Message-ID: <28d6a094ddca6e4e6c15e055ec3ef6b10d57cbd3.camel@hammerspace.com>
In-Reply-To: <A7BBBBF2-768E-487C-A890-7E5AF1D40027@redhat.com>

On Wed, 2022-03-09 at 12:39 -0500, Benjamin Coddington wrote:
> On 27 Feb 2022, at 18:12, trondmy@kernel.org wrote:
> 
> > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > 
> > The heuristic for readdirplus is designed to try to detect 'ls -l' and
> > similar patterns. It does so by looking for cache hit/miss patterns in
> > both the attribute cache and in the dcache of the files in a given
> > directory, and then sets a flag for the readdirplus code to interpret.
> > 
> > The problem with this approach is that a single attribute or dcache miss
> > can cause the NFS code to force a refresh of the attributes for the
> > entire set of files contained in the directory.
> > 
> > To be able to make a more nuanced decision, let's sample the number of
> > hits and misses in the set of open directory descriptors. That allows us
> > to set thresholds at which we start preferring READDIRPLUS over regular
> > READDIR, or at which we start to force a re-read of the remaining
> > readdir cache using READDIRPLUS.
> 
> I like this patch very much.
> 
> The heuristic doesn't kick in until "ls -l" makes its second call into
> nfs_readdir(), and for my filenames with 8 chars, that means that there are
> about 5800 GETATTRs generated before we clean the cache to do more
> READDIRPLUS.  That's a large number to compound on connection latency.
> 
> We've already got some complaints that folks' 2nd "ls -l" takes "so much
> longer" after 1a34c8c9a49e.
> 
> Can we possibly limit our first pass through nfs_readdir() so that the
> heuristic takes effect sooner?
> 

The problem is really that 'ls' (or possibly glibc) is passing in a
pretty huge buffer to the getdents() system call.

On my setup, that buffer appears to be 80K in size. So what happens is
that we get that first getdents() call, and so we fill the 80K buffer
with as many files as will fit. That can quickly run into several
thousand entries, if the filenames are relatively short.
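
As a rough back-of-the-envelope check of that claim, here is a small
sketch. It assumes each returned record has the struct linux_dirent64
layout described in getdents64(2): a 19-byte fixed header (d_ino,
d_off, d_reclen, d_type), the name, a NUL terminator, all padded to a
multiple of 8 bytes. The 80K buffer size is the figure observed above;
the helper names are made up for illustration.

```python
# Estimate how many directory entries fit in one getdents64() buffer.
# Assumption: records follow the struct linux_dirent64 layout, i.e.
# 8 (d_ino) + 8 (d_off) + 2 (d_reclen) + 1 (d_type) = 19 header bytes,
# then the name plus a NUL byte, rounded up to an 8-byte boundary.

DIRENT64_HEADER = 19  # bytes before d_name
ALIGNMENT = 8         # records are padded to a multiple of this


def record_len(name_len: int) -> int:
    """Size in bytes of one record for a name of name_len characters."""
    raw = DIRENT64_HEADER + name_len + 1  # +1 for the NUL terminator
    return (raw + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT


def entries_per_buffer(buf_size: int, name_len: int) -> int:
    """Upper bound on the records that fit in a buffer of buf_size bytes."""
    return buf_size // record_len(name_len)


if __name__ == "__main__":
    buf_size = 80 * 1024  # the buffer size observed in the setup above
    for name_len in (8, 16, 32, 64):
        print(name_len, record_len(name_len),
              entries_per_buffer(buf_size, name_len))
```

Under those assumptions an 8-character name costs 32 bytes per record,
so a single 80K call can return on the order of 2500 entries. This only
models the userspace record size; it says nothing about how the NFS
client fills its own readdir cache.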

Then 'ls' goes through the contents and does a stat() (or a statx()) on
each entry, and so we record the statistics. However, that means those
first several thousand entries are indeed going to use cached data, or
force GETATTR to go on the wire. We only start using forced readdirplus
on the second pass.
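
The sequence above can be sketched as a toy model. This is not the
kernel logic. It assumes the worst case where every stat() in the first
batch misses the cache and costs one GETATTR, and that readdirplus is
forced from the second getdents() call onwards. The function name, the
first_call_cap parameter and the numbers are hypothetical, chosen only
to show how the size of the first batch bounds the GETATTR count.

```python
# Toy model of "ls -l": getdents() returns a batch, then every entry in
# that batch is stat()ed before the next getdents() call.
# Assumptions (not taken from the kernel code):
#   - every stat() in the first batch costs one on-the-wire GETATTR;
#   - from the second getdents() call on, readdirplus is forced, so
#     later batches arrive with attributes and cost no GETATTRs.


def simulate_ls_l(total_files, entries_per_call, first_call_cap=None):
    """Return (getattrs_on_wire, getdents_calls) for one directory scan."""
    getattrs = 0
    calls = 0
    remaining = total_files
    readdirplus_forced = False

    while remaining > 0:
        limit = entries_per_call
        if calls == 0 and first_call_cap is not None:
            # Hypothetical knob: return fewer entries on the first call.
            limit = min(limit, first_call_cap)
        batch = min(remaining, limit)

        if not readdirplus_forced:
            getattrs += batch  # each stat() goes on the wire

        calls += 1
        remaining -= batch
        readdirplus_forced = True  # heuristic trips before the next call

    return getattrs, calls


if __name__ == "__main__":
    # 10000 files, 2560 entries per full batch (illustrative values).
    print(simulate_ls_l(10000, 2560))
    print(simulate_ls_l(10000, 2560, first_call_cap=128))
```

In this model the GETATTR count equals the size of the first batch, so
capping that batch trades a few extra getdents() calls for fewer
GETATTRs. It does not answer what the cap should be.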

Yes, I suppose we could make getdents() ignore the buffer size and
just return fewer entries. However, what's the "right" size in that
case? More to the point, how much pain are we going to accept before we
give up trying these assorted heuristics, and just define a readdirplus()
system call modelled on statx()?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



