From: Trond Myklebust <trondmy@hammerspace.com>
To: "neilb@suse.de" <neilb@suse.de>,
	"anna.schumaker@netapp.com" <anna.schumaker@netapp.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] NFS: only invalidate dentrys that are clearly invalid.
Date: Mon, 16 Nov 2020 05:32:31 +0000
Message-ID: <0673647d9d70f31a02c74da713e5343ac3918835.camel@hammerspace.com>
In-Reply-To: <87tutpopfj.fsf@notabene.neil.brown.name>

On Mon, 2020-11-16 at 16:12 +1100, NeilBrown wrote:
> On Mon, Nov 16 2020, Trond Myklebust wrote:
> 
> > On Mon, 2020-11-16 at 16:00 +1100, NeilBrown wrote:
> > > On Mon, Nov 16 2020, Trond Myklebust wrote:
> > > 
> > > > On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
> > > > > On Mon, Nov 16 2020, Trond Myklebust wrote:
> > > > > 
> > > > > > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
> > > > > > > 
> > > > > > > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
> > > > > > > > nfs_lookup_revalidate()"), an error from
> > > > > > > > nfs_lookup_verify_inode() other than -ESTALE would result in
> > > > > > > > nfs_lookup_revalidate() returning that error code (-ESTALE is
> > > > > > > > mapped to zero).
> > > > > > > > Since that commit, all errors result in zero being returned.
> > > > > > > 
> > > > > > > > When nfs_lookup_revalidate() returns zero, the dentry is
> > > > > > > > invalidated and, significantly, if the dentry is a directory
> > > > > > > > that is mounted on, that mountpoint is lost.
> > > > > > > 
> > > > > > > If you:
> > > > > > >  - mount an NFS filesystem which contains a directory
> > > > > > >  - mount something (e.g. tmpfs) on that directory
> > > > > > > >  - use iptables (or scissors) to block traffic to the server
> > > > > > >  - ls -l the-mounted-on-directory
> > > > > > >  - interrupt the 'ls -l'
> > > > > > > you will find that the directory has been unmounted.
> > > > > > > 
> > > > > > > > This can be fixed by returning the actual error code from
> > > > > > > > nfs_lookup_verify_inode() rather than zero (except for -ESTALE).
> > > > > > > 
> > > > > > > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
> > > > > > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > > > > > ---
> > > > > > >  fs/nfs/dir.c | 8 +++++---
> > > > > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > > > > > index cb52db9a0cfb..d24acf556e9e 100644
> > > > > > > --- a/fs/nfs/dir.c
> > > > > > > +++ b/fs/nfs/dir.c
> > > > > > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct
> > > > > > > inode
> > > > > > > *dir,
> > > > > > > struct dentry *dentry,
> > > > > > >                          unsigned int flags)
> > > > > > >  {
> > > > > > >         struct inode *inode;
> > > > > > > -       int error;
> > > > > > > +       int error = 0;
> > > > > > >  
> > > > > > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
> > > > > > >         inode = d_inode(dentry);
> > > > > > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct
> > > > > > > inode
> > > > > > > *dir,
> > > > > > > struct dentry *dentry,
> > > > > > >             nfs_check_verifier(dir, dentry, flags &
> > > > > > > LOOKUP_RCU))
> > > > > > > {
> > > > > > >                 error = nfs_lookup_verify_inode(inode,
> > > > > > > flags);
> > > > > > >                 if (error) {
> > > > > > > -                       if (error == -ESTALE)
> > > > > > > +                       if (error == -ESTALE) {
> > > > > > >                                 nfs_zap_caches(dir);
> > > > > > > +                               error = 0;
> > > > > > > +                       }
> > > > > > >                         goto out_bad;
> > > > > > >                 }
> > > > > > >                 nfs_advise_use_readdirplus(dir);
> > > > > > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct
> > > > > > > inode
> > > > > > > *dir,
> > > > > > > struct dentry *dentry,
> > > > > > >  out_bad:
> > > > > > >         if (flags & LOOKUP_RCU)
> > > > > > >                 return -ECHILD;
> > > > > > > > -       return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
> > > > > > > > +       return nfs_lookup_revalidate_done(dir, dentry, inode, error);
> > > > > > 
> > > > > > > Which errors do we actually need to return here? As far as I
> > > > > > > can tell, the only errors that nfs_lookup_verify_inode() is
> > > > > > > supposed to return are ENOMEM, ESTALE, ECHILD, and possibly EIO
> > > > > > > or ETIMEDOUT.
> > > > > > 
> > > > > > > Why would it be better to return those errors rather than just
> > > > > > > a 0 when we need to invalidate the inode, particularly since we
> > > > > > > already have a special case in nfs_lookup_revalidate_done() when
> > > > > > > the dentry is root?
> > > > > 
> > > > > ERESTARTSYS is the error that easily causes problems.
> > > > > 
> > > > > > Returning 0 causes d_invalidate() to be called, which is quite
> > > > > > heavy-handed on mountpoints.
> > > > 
> > > > My point is that it shouldn't get returned for mountpoints. See
> > > > nfs_lookup_revalidate_done().
> > > 
> > > > nfs_lookup_revalidate_done() only checks IS_ROOT(), and while many
> > > > mountpoints are IS_ROOT(), not all are (--bind easily makes others).
> > > > 
> > > > But that isn't even really relevant here.  The dentry being
> > > > revalidated is the underlying directory - that something else is
> > > > mounted on.  step_into() which follows mount points is called in
> > > > walk_component() *after* lookup_fast or lookup_slow which will have
> > > > revalidated the dentry.
> > 
> > So then why is it not sufficient to just add a check for
> > d_mountpoint()? This is a revalidation, not a new lookup.
> > 
> 
> I guess you could do that.
> But why would you want to call d_invalidate() just because a signal was
> received, or a memory allocation failed?

Why would I care about the error return from nfs_lookup_verify_inode()?
This is a revalidation, and so sometimes the error returned is not
transient, but is persistent (e.g. EIO/ETIMEDOUT if the server is
down). In those cases, I still want to be able to do things like
unmount the filesystem.
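
For reference, the two return policies being argued over can be modelled in user space. This is only a sketch under stated assumptions, not the kernel code: the function names are hypothetical, MODEL_ERESTARTSYS is a local stand-in for the kernel-internal ERESTARTSYS (which user-space <errno.h> does not define), and a return value of 0 stands for "invalidate the dentry", as the patch description puts it.

```c
#include <assert.h>
#include <errno.h>

/*
 * Assumption: stand-in for the kernel-internal ERESTARTSYS, which is not
 * exported through user-space <errno.h>. 512 mirrors include/linux/errno.h.
 */
#define MODEL_ERESTARTSYS 512

/*
 * Hypothetical model of the behaviour since 5ceb9d7fdaaf: whatever error
 * nfs_lookup_verify_inode() reported, the revalidation result is 0,
 * which the caller treats as "this dentry is invalid".
 */
int revalidate_result_current(int verify_error)
{
	assert(verify_error < 0);	/* only the failure path is modelled */
	(void)verify_error;
	return 0;
}

/*
 * Hypothetical model of the proposed patch: only -ESTALE is mapped to 0;
 * every other error is handed back to the caller unchanged.
 */
int revalidate_result_patched(int verify_error)
{
	assert(verify_error < 0);	/* only the failure path is modelled */
	if (verify_error == -ESTALE)
		return 0;
	return verify_error;
}
```

Under the patched policy an interrupted revalidation (-ERESTARTSYS) or a failed allocation (-ENOMEM) is reported as an error instead of an invalidation. The same holds for a persistent -EIO or -ETIMEDOUT from a dead server, which is the case the reply above is concerned with.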

> 
> NeilBrown
> 
> 
> > > 
> > > NeilBrown
> > > 
> > > 
> > > > 
> > > > > > So it is only reasonable to return 0 when we have unambiguous
> > > > > > confirmation from the server that the object no longer exists.
> > > > > > ESTALE is unambiguous. EIO might be unambiguous.  ERESTARTSYS,
> > > > > > ENOMEM, ETIMEDOUT are transient and don't justify d_invalidate()
> > > > > > being called.
> > > > > 
> > > > > > (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are
> > > > > > clearly invalid.") fixed much the same bug 3 years ago).
> > > > >  
> > > > > Thanks,
> > > > > NeilBrown
> > > > > 
> > > > > 
> > > > > > 
> > > > > > >  }
> > > > > > >  
> > > > > > >  static int
> > > > > > 
> > > > > > -- 
> > > > > > Trond Myklebust
> > > > > > Linux NFS client maintainer, Hammerspace
> > > > > > trond.myklebust@hammerspace.com
> > > > 
> > > > -- 
> > > > Trond Myklebust
> > > > CTO, Hammerspace Inc
> > > > 4984 El Camino Real, Suite 208
> > > > Los Altos, CA 94022
> > > > www.hammer.space
> > 
> > -- 
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > trond.myklebust@hammerspace.com

-- 
Trond Myklebust
CTO, Hammerspace Inc
4984 El Camino Real, Suite 208
Los Altos, CA 94022
www.hammer.space

