On Mon, Nov 16 2020, Trond Myklebust wrote:

> On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
>> On Mon, Nov 16 2020, Trond Myklebust wrote:
>>
>> > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
>> > >
>> > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
>> > > nfs_lookup_revalidate()") an error from nfs_lookup_verify_inode()
>> > > other than -ESTALE would result in nfs_lookup_revalidate()
>> > > returning that error code (-ESTALE is mapped to zero).
>> > > Since that commit, all errors result in zero being returned.
>> > >
>> > > When nfs_lookup_revalidate() returns zero, the dentry is
>> > > invalidated and, significantly, if the dentry is a directory that
>> > > is mounted on, that mountpoint is lost.
>> > >
>> > > If you:
>> > >  - mount an NFS filesystem which contains a directory
>> > >  - mount something (e.g. tmpfs) on that directory
>> > >  - use iptables (or scissors) to block traffic to the server
>> > >  - ls -l the-mounted-on-directory
>> > >  - interrupt the 'ls -l'
>> > > you will find that the directory has been unmounted.
>> > >
>> > > This can be fixed by returning the actual error code from
>> > > nfs_lookup_verify_inode() rather than zero (except for -ESTALE).
>> > >
>> > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
>> > > Signed-off-by: NeilBrown
>> > > ---
>> > >  fs/nfs/dir.c | 8 +++++---
>> > >  1 file changed, 5 insertions(+), 3 deletions(-)
>> > >
>> > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> > > index cb52db9a0cfb..d24acf556e9e 100644
>> > > --- a/fs/nfs/dir.c
>> > > +++ b/fs/nfs/dir.c
>> > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
>> > >                          unsigned int flags)
>> > >  {
>> > >         struct inode *inode;
>> > > -       int error;
>> > > +       int error = 0;
>> > >
>> > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>> > >         inode = d_inode(dentry);
>> > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
>> > >             nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU)) {
>> > >                 error = nfs_lookup_verify_inode(inode, flags);
>> > >                 if (error) {
>> > > -                       if (error == -ESTALE)
>> > > +                       if (error == -ESTALE) {
>> > >                                 nfs_zap_caches(dir);
>> > > +                               error = 0;
>> > > +                       }
>> > >                         goto out_bad;
>> > >                 }
>> > >                 nfs_advise_use_readdirplus(dir);
>> > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
>> > >  out_bad:
>> > >         if (flags & LOOKUP_RCU)
>> > >                 return -ECHILD;
>> > > -       return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
>> > > +       return nfs_lookup_revalidate_done(dir, dentry, inode, error);
>> >
>> > Which errors do we actually need to return here? As far as I can tell,
>> > the only errors that nfs_lookup_verify_inode() is supposed to return are
>> > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETIMEDOUT.
>> >
>> > Why would it be better to return those errors rather than just a 0 when
>> > we need to invalidate the inode, particularly since we already have a
>> > special case in nfs_lookup_revalidate_done() when the dentry is root?
>>
>> ERESTARTSYS is the error that easily causes problems.
>>
>> Returning 0 causes d_invalidate() to be called which is quite heavy
>> handed in mountpoints.
>
> My point is that it shouldn't get returned for mountpoints. See
> nfs_lookup_revalidate_done().

nfs_lookup_revalidate_done() only checks IS_ROOT(), and while many
mountpoints are IS_ROOT(), not all are (--bind easily makes others).

But that isn't even really relevant here.  The dentry being revalidated
is the underlying directory - that something else is mounted on.
step_into() which follows mount points is called in walk_component()
*after* lookup_fast or lookup_slow which will have revalidated the
dentry.

NeilBrown

>
>> So it is only reasonable to return 0 when we have unambiguous
>> confirmation from the server that the object no longer exists.
>> ESTALE is unambiguous. EIO might be unambiguous.  ERESTARTSYS, ENOMEM,
>> ETIMEDOUT are transient and don't justify d_invalidate() being
>> called.
>>
>> (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are
>> clearly invalid.")
>>  fixed much the same bug 3 years ago).
>>
>> Thanks,
>> NeilBrown
>>
>>
>> >
>> > >  }
>> > >
>> > >  static int
>> >
>> > --
>> > Trond Myklebust
>> > Linux NFS client maintainer, Hammerspace
>> > trond.myklebust@hammerspace.com
>
> --
> Trond Myklebust
> CTO, Hammerspace Inc
> 4984 El Camino Real, Suite 208
> Los Altos, CA 94022
>
> www.hammer.space
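
For reference, the error-mapping policy argued for in this thread (only
-ESTALE justifies invalidating the dentry; every other error is passed
back to the caller) can be modelled in user space roughly as below.  This
is only an illustrative sketch, not code from fs/nfs/dir.c: the helper
name revalidate_result_for() is hypothetical, and ERESTARTSYS is defined
by hand because it is kernel-internal and missing from the userspace
<errno.h> (512 is its value in include/linux/errno.h).

```c
#include <errno.h>

/* Kernel-internal errno, not exported to userspace; value taken from
 * include/linux/errno.h.  Defined here only so the sketch compiles. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * Hypothetical helper (does not exist in the kernel): map a non-zero
 * result of nfs_lookup_verify_inode() to the value handed on at out_bad.
 *
 *   0          -> the dentry should be invalidated
 *   negative   -> the error is propagated, the dentry is left alone
 */
static int revalidate_result_for(int verify_error)
{
	/* -ESTALE is unambiguous: the server says the object is gone. */
	if (verify_error == -ESTALE)
		return 0;

	/* ERESTARTSYS, ENOMEM, ETIMEDOUT, ... may be transient, so they
	 * are reported rather than turned into an invalidation. */
	return verify_error;
}
```

With this mapping, an interrupted 'ls -l' (-ERESTARTSYS) yields an error
for that lookup instead of a zero return, so d_invalidate() is not
triggered and anything mounted on the directory stays mounted.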