All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Benjamin Coddington" <bcodding@redhat.com>
To: "Trond Myklebust" <trondmy@primarydata.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 2/3] NFS: Be more targeted about readdirplus use when doing lookup/revalidation
Date: Fri, 02 Dec 2016 08:56:33 -0500	[thread overview]
Message-ID: <60326804-0548-43C5-94B4-97FE1AE4B16B@redhat.com> (raw)
In-Reply-To: <1480625258.10526.1.camel@primarydata.com>

On 1 Dec 2016, at 15:47, Trond Myklebust wrote:

> On Wed, 2016-11-30 at 14:09 -0500, Benjamin Coddington wrote:
>> .. this one breaks things again:
>>
>> On 19 Nov 2016, at 11:54, Trond Myklebust wrote:
>>
>>>  static
>>>  void nfs_advise_use_readdirplus(struct inode *dir)
>>>  {
>>> -	set_bit(NFS_INO_ADVISE_RDPLUS, &NFS_I(dir)->flags);
>>> +	struct nfs_inode *nfsi = NFS_I(dir);
>>> +
>>> +	if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS) &&
>>> +	    !list_empty(&nfsi->open_files)) {
>>> +		set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags);
>>> +		invalidate_mapping_pages(dir->i_mapping, 0, -1);
>>> +	}
>>>  }
>>
>> So every time advise_use_readdirplus it drops the mapping.. but what 
>> about
>> subsequent calls into nfs_readdir() to get the next batch of
>> entries?  
>> For
>> the ls -l case, we want to keep setting NFS_INO_ADVISE_RDPLUS, but we
>> shouldn't start over filling the mapping from the beginning again.
>
> How do I ensure that the readdir isn't being served from cache, if I
> don't invalidate the mapping?

The way it is now.  Isn't advise_use_readdirplus() just a marker to say
"continue to use readdirplus" after nfs_force_use_readdirplus() has
invalidated the mapping?

> The intention of the patch is to ensure that we only call this on a dcache
> or inode attribute cache miss.

I think it also needs to be called when we detect if a child of the
directory is looked up/revalidated in order to set NFS_INO_ADVISE_RDPLUS
again.  Otherwise we lose the optimization in commit 311324ad1713.

Maybe I am just really confused..

>>> @@ -1211,9 +1220,6 @@ static int nfs_lookup_revalidate(struct
>>> dentry 
>>> *dentry, unsigned int flags)
>>>  out_set_verifier:
>>>  	nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
>>>   out_valid:
>>> -	/* Success: notify readdir to use READDIRPLUS */
>>> -	nfs_advise_use_readdirplus(dir);
>>> - out_valid_noent:
>>>  	if (flags & LOOKUP_RCU) {
>>>  		if (parent != ACCESS_ONCE(dentry->d_parent))
>>>  			return -ECHILD;
>>
>>
>> Now when listing with `ls -l`:  the second call into nfs_readdir()
>> to 
>> get
>> the next batch of entries will not have NFS_INO_ADVISE_RDPLUS.
>>
>> I think this removes nfs_advise_use_readdirplus(dir) from the common 
>> "goto
>> out_valid" path through nfs_lookup_revalidate() (the block with the 
>> 'iff'
>> typo).
>>
>
> Actually, 'iff' is intentionally used there as the common shorthand for
> 'if and only if' (https://www.merriam-webster.com/dictionary/iff).

Ah, and now I see it used everywhere.  Thank you for filling in that hole
in my head.

> As I said above, the point is to only trigger READDIRPLUS when we know
> the dcache or the inode cache needs revalidation. Otherwise we want to
> use the less expensive READDIR.

Not if using ls -l.  With this patch, an ls -l of a directory of 2000
entries follows this pattern:

- nfs_readdir() returns first 1024 entries obtained with READDIRPLUS
- nfs_readdir() returns last 976 entries with READDIR
- stat()/LOOKUPs are done on those 976
- ls -l does its final getdents() on the last position, but we've now
  invalidated the pages, so finally:
- nfs_readdir() is called with position 2000, and we do READDIRPLUS for
  _every_ entry all over again.

This seems to break commit 311324ad1713's optimization, since now we'll send
RPC to read the entire directory twice for ls -l.

> I'm open for suggestions as to how we can improve that heuristic.

OK.  Right now I've got nothing to suggest, but I'll spend some time thinking
about it.

Ben

  reply	other threads:[~2016-12-02 13:56 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-19 16:54 [PATCH 0/3] Address readdirplus performance Trond Myklebust
2016-11-19 16:54 ` [PATCH 1/3] NFS: Fix a performance regression in readdir Trond Myklebust
2016-11-19 16:54   ` [PATCH 2/3] NFS: Be more targeted about readdirplus use when doing lookup/revalidation Trond Myklebust
2016-11-19 16:54     ` [PATCH 3/3] NFS: Replace nfs_force_use_readdirplus() with nfs_advise_use_readdirplus() Trond Myklebust
2016-11-30 19:09     ` [PATCH 2/3] NFS: Be more targeted about readdirplus use when doing lookup/revalidation Benjamin Coddington
2016-12-01 20:47       ` Trond Myklebust
2016-12-02 13:56         ` Benjamin Coddington [this message]
2016-12-02 14:32           ` Trond Myklebust
2016-11-30 19:08   ` [PATCH 1/3] NFS: Fix a performance regression in readdir Benjamin Coddington

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=60326804-0548-43C5-94B4-97FE1AE4B16B@redhat.com \
    --to=bcodding@redhat.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@primarydata.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.