Re: Lookup revalidation for OPEN_CLAIM_FH

From: "Benjamin Coddington" <bcodding@redhat.com>
To: "Trond Myklebust" <trondmy@hammerspace.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: Lookup revalidation for OPEN_CLAIM_FH
Date: Thu, 16 Jan 2020 10:13:47 -0500	[thread overview]
Message-ID: <C69931E7-7465-4662-91AC-C74609A4CDB2@redhat.com> (raw)
In-Reply-To: <7eae4162d7c8a85bbb7fddab3a818472ec2ebc54.camel@hammerspace.com>

On 16 Jan 2020, at 9:35, Trond Myklebust wrote:

> On Thu, 2020-01-16 at 08:51 -0500, Benjamin Coddington wrote:
>> Hi Trond,
>>
>> I'd like to fix up lookup revalidation for v4.1+ when the client is
>> using
>> OPEN_CLAIM_FH.  The fixes a while back for Stan Hu's case do not seem
>> to
>> improve things for v4.1, and actually make the behavior a bit worse
>> since we
>> no longer pass through nfs_lookup_verify_inode(), which would catch
>> the
>> cases where nlink == 0.
>>
>> Would you accept work to _always_ revalidate the dentry's parent for
>> CLAIM_FH?  Alternatively, it seems that CLAIM_NULL would be
>> preferable for
>> this case, though I don't know how the client would know when to
>> decide
>> between them.
>>
>> Here's a simple reproducer for convenience, I think we've already all
>> agreed
>> that the behavior we want is for the second open by `cat` to reflect
>> the
>> results of the move on the server, or at least eventually later opens
>> would
>> revalidate the dentry:
>>
>> #!/bin/bash
>>
>> set -o xtrace
>> vers=4.1
>>
>> exportfs -ua
>> exportfs -o rw,sec=sys,no_root_squash *:/exports
>>
>> mkdir /mnt/localhost || true
>>
>> rm -f /exports/file{1,2}
>>
>> echo this is file 1 > /exports/file1
>> echo this is file 2 > /exports/file2
>>
>> mount -t nfs -ov$vers,sec=sys localhost:/exports /mnt/localhost
>>
>> tail -f /mnt/localhost/file1 &
>> sleep 1
>>
>> # this is file 1
>> cat /mnt/localhost/file1
>>
>> # overwrite the file on the server:
>> mv -f /exports/file2 /exports/file1
>>
>> # this is file 2
>> cat /mnt/localhost/file1
>>
>> killall tail
>> # this is file 2
>> cat /mnt/localhost/file1
>> umount /mnt/localhost
>>
>>
>> Switching the $vers variable between v4.0 and v4.1 in this script
>> shows the
>> difference in behavior.
>>
>
> If somebody needs stronger lookup cache revalidation, then that's what
> they have the 'lookupcache=none' mount option for. We have these
> 'lookupcache' mount options in order to allow users to tailor the
> caching behaviour (on a per-mount basis) should the default behaviour
> be insufficiently strict.
>
> Since your testcase doesn't use that mount option, I don't see what it
> is proving other than what we already know about the default lookup
> caching: namely that it sacrifices some accuracy in the interest of
> file open performance.

Thanks for the look.  The testcase only provides a comparison of different
behavior between v4.0 and v4.1, which is due to our use of CLAIM_NULL vs
CLAIM_FH.

Indeed, setting lookupcache=none give real-time updates.  With default
lookupcache, after the directory attributes time out, the client will
likewise get the namespace right.  So, this isn't a major problem, but we do
have some  QE folks that are unhappy about it.

Can we improve things a bit for v4.1 without sacrificing performance?  I
can't think of a reason to not go back to CLAIM_NULL in nfs4_file_open().
Maybe it is a bit more work on the server to have to do one extra lookup per
open, but we'll end up with the right file each time.

It makes sense to keep CLAIM_FH for recovery, but why keep it for regular
opens?

Ben