All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Benjamin Coddington" <bcodding@redhat.com>
To: "Trond Myklebust" <trondmy@hammerspace.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH 4/5] NFS: Further optimisations for 'ls -l'
Date: Wed, 29 Sep 2021 16:21:23 -0400	[thread overview]
Message-ID: <F6D17359-3190-4A67-9DF7-08BCE61BE075@redhat.com> (raw)
In-Reply-To: <f09a7c00b70d51a442542dec1e1ba98289ad612c.camel@hammerspace.com>

On 29 Sep 2021, at 12:23, Trond Myklebust wrote:

> On Wed, 2021-09-29 at 10:56 -0400, Benjamin Coddington wrote:
>> On 29 Sep 2021, at 9:49, trondmy@kernel.org wrote:
>>
>>> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>>>
>>> If a user is doing 'ls -l', we have a heuristic in GETATTR that
>>> tells
>>> the readdir code to try to use READDIRPLUS in order to refresh the
>>> inode
>>> attributes. In certain cirumstances, we also try to invalidate the
>>> remaining directory entries in order to ensure this refresh.
>>>
>>> If there are multiple readers of the directory, we probably should
>>> avoid
>>> invalidating the page cache, since the heuristic breaks down in
>>> that
>>> situation anyway.
>>
>> Hi Trond, I'm curious about the motivation for this work because
>> we're often
>> managing expectations about performance between various workloads. 
>> What
>> does the workload look like that prompted you to make this
>> optimization?
>>
>> I'm also interested because I have some work that improves readdir
>> performance for multiple readers on large directories, but is rotting
>> because things have been "good enough" for folks over here.
>>
>
> Are you talking about this patch specifically, or the series in
> general?

A bit of both - the first two patches didn't really make sense to me, but I
get it now.  Thanks.

> The general motivation for the series is that in certain circumstances
> involving a read-only scenario (no changes to the directory) and
> multiple processes we're seeing a regression w.r.t. older kernels.
>
> The motivation for this particular patch is twofold:
>    1. I want to get rid of the cached page_index in the inode and
>       replace it with something less persistent (inodes are forever,
>       unlike file descriptors).
>    2. The heuristic in question is designed to only work in the single
>       process/single threaded case.

Make sense to me now.

I'm wondering if this might be creating a READDIR op amplification for N
readers sharing the directory's fd because one process can be dropping the
cache, which artificially deflates desc->page_index for a given dir_cookie.
That lower page_index gets reused on the next pass dropping the cache, and
it'll keep using the bottom pages and doing READDIRs.  That wasn't possible
before because we only invalidated past nfsi->page_index.

Maybe an unlikely case (but I think some http servers could behave this
way), and I'd have to test it to be sure.  I can try to do that unless you
think its not possible or concerning.

Ben


  reply	other threads:[~2021-09-29 20:22 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-29 13:49 [PATCH 1/5] NFS: Don't set NFS_INO_DATA_INVAL_DEFER and NFS_INO_INVALID_DATA trondmy
2021-09-29 13:49 ` [PATCH 2/5] NFS: Ignore the directory size when marking for revalidation trondmy
2021-09-29 13:49   ` [PATCH 3/5] NFS: Fix up nfs_readdir_inode_mapping_valid() trondmy
2021-09-29 13:49     ` [PATCH 4/5] NFS: Further optimisations for 'ls -l' trondmy
2021-09-29 13:49       ` [PATCH 5/5] NFS: Fix dentry verifier races trondmy
2021-09-29 14:56       ` [PATCH 4/5] NFS: Further optimisations for 'ls -l' Benjamin Coddington
2021-09-29 16:23         ` Trond Myklebust
2021-09-29 20:21           ` Benjamin Coddington [this message]
2021-09-29 20:35             ` Trond Myklebust
2021-09-30 14:29               ` Benjamin Coddington
2021-09-30 14:51                 ` Trond Myklebust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=F6D17359-3190-4A67-9DF7-08BCE61BE075@redhat.com \
    --to=bcodding@redhat.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.