linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] NFS: only invalidate dentrys that are clearly invalid.
@ 2020-11-16  2:59 NeilBrown
  2020-11-16  4:27 ` Trond Myklebust
  0 siblings, 1 reply; 10+ messages in thread
From: NeilBrown @ 2020-11-16  2:59 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker; +Cc: linux-nfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2152 bytes --]


Prior to commit 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
and error from nfs_lookup_verify_inode() other than -ESTALE would result
in nfs_lookup_revalidate() returning that error code (-ESTALE is mapped
to zero).
Since that commit, all errors result in zero being returned.

When nfs_lookup_revalidate() returns zero, the dentry is invalidated
and, significantly, if the dentry is a directory that is mounted on,
that mountpoint is lost.

If you:
 - mount an NFS filesystem which contains a directory
 - mount something (e.g. tmpfs) on that directory
 - use iptables (or scissors) to block traffic to the server
 - ls -l the-mounted-on-directory
 - interrupt the 'ls -l'
you will find that the directory has been unmounted.

This can be fixed by returning the actual error code from
nfs_lookup_verify_inode() rather then zero (except for -ESTALE).

Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfs/dir.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index cb52db9a0cfb..d24acf556e9e 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
 			 unsigned int flags)
 {
 	struct inode *inode;
-	int error;
+	int error = 0;
 
 	nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
 	inode = d_inode(dentry);
@@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
 	    nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU)) {
 		error = nfs_lookup_verify_inode(inode, flags);
 		if (error) {
-			if (error == -ESTALE)
+			if (error == -ESTALE) {
 				nfs_zap_caches(dir);
+				error = 0;
+			}
 			goto out_bad;
 		}
 		nfs_advise_use_readdirplus(dir);
@@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
 out_bad:
 	if (flags & LOOKUP_RCU)
 		return -ECHILD;
-	return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
+	return nfs_lookup_revalidate_done(dir, dentry, inode, error);
 }
 
 static int
-- 
2.29.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 853 bytes --]

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] NFS: only invalidate dentrys that are clearly invalid.
  2020-11-16  2:59 [PATCH] NFS: only invalidate dentrys that are clearly invalid NeilBrown
@ 2020-11-16  4:27 ` Trond Myklebust
  2020-11-16  4:43   ` NeilBrown
  0 siblings, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2020-11-16  4:27 UTC (permalink / raw)
  To: neilb, anna.schumaker; +Cc: linux-nfs, linux-kernel

On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
> 
> Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
> nfs_lookup_revalidate()")
> and error from nfs_lookup_verify_inode() other than -ESTALE would
> result
> in nfs_lookup_revalidate() returning that error code (-ESTALE is
> mapped
> to zero).
> Since that commit, all errors result in zero being returned.
> 
> When nfs_lookup_revalidate() returns zero, the dentry is invalidated
> and, significantly, if the dentry is a directory that is mounted on,
> that mountpoint is lost.
> 
> If you:
>  - mount an NFS filesystem which contains a directory
>  - mount something (e.g. tmpfs) on that directory
>  - use iptables (or scissors) to block traffic to the server
>  - ls -l the-mounted-on-directory
>  - interrupt the 'ls -l'
> you will find that the directory has been unmounted.
> 
> This can be fixed by returning the actual error code from
> nfs_lookup_verify_inode() rather then zero (except for -ESTALE).
> 
> Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfs/dir.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index cb52db9a0cfb..d24acf556e9e 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
> struct dentry *dentry,
>                          unsigned int flags)
>  {
>         struct inode *inode;
> -       int error;
> +       int error = 0;
>  
>         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>         inode = d_inode(dentry);
> @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode *dir,
> struct dentry *dentry,
>             nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU)) {
>                 error = nfs_lookup_verify_inode(inode, flags);
>                 if (error) {
> -                       if (error == -ESTALE)
> +                       if (error == -ESTALE) {
>                                 nfs_zap_caches(dir);
> +                               error = 0;
> +                       }
>                         goto out_bad;
>                 }
>                 nfs_advise_use_readdirplus(dir);
> @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
> struct dentry *dentry,
>  out_bad:
>         if (flags & LOOKUP_RCU)
>                 return -ECHILD;
> -       return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
> +       return nfs_lookup_revalidate_done(dir, dentry, inode, error);

Which errors do we actually need to return here? As far as I can tell,
the only errors that nfs_lookup_verify_inode() is supposed to return is
ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.

Why would it be better to return those errors rather than just a 0 when
we need to invalidate the inode, particularly since we already have a
special case in nfs_lookup_revalidate_done() when the dentry is root?

>  }
>  
>  static int

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] NFS: only invalidate dentrys that are clearly invalid.
  2020-11-16  4:27 ` Trond Myklebust
@ 2020-11-16  4:43   ` NeilBrown
  2020-11-16  4:50     ` Trond Myklebust
  0 siblings, 1 reply; 10+ messages in thread
From: NeilBrown @ 2020-11-16  4:43 UTC (permalink / raw)
  To: Trond Myklebust, anna.schumaker; +Cc: linux-nfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4190 bytes --]

On Mon, Nov 16 2020, Trond Myklebust wrote:

> On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
>> 
>> Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
>> nfs_lookup_revalidate()")
>> and error from nfs_lookup_verify_inode() other than -ESTALE would
>> result
>> in nfs_lookup_revalidate() returning that error code (-ESTALE is
>> mapped
>> to zero).
>> Since that commit, all errors result in zero being returned.
>> 
>> When nfs_lookup_revalidate() returns zero, the dentry is invalidated
>> and, significantly, if the dentry is a directory that is mounted on,
>> that mountpoint is lost.
>> 
>> If you:
>>  - mount an NFS filesystem which contains a directory
>>  - mount something (e.g. tmpfs) on that directory
>>  - use iptables (or scissors) to block traffic to the server
>>  - ls -l the-mounted-on-directory
>>  - interrupt the 'ls -l'
>> you will find that the directory has been unmounted.
>> 
>> This can be fixed by returning the actual error code from
>> nfs_lookup_verify_inode() rather then zero (except for -ESTALE).
>> 
>> Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
>> Signed-off-by: NeilBrown <neilb@suse.de>
>> ---
>>  fs/nfs/dir.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>> 
>> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> index cb52db9a0cfb..d24acf556e9e 100644
>> --- a/fs/nfs/dir.c
>> +++ b/fs/nfs/dir.c
>> @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> struct dentry *dentry,
>>                          unsigned int flags)
>>  {
>>         struct inode *inode;
>> -       int error;
>> +       int error = 0;
>>  
>>         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>>         inode = d_inode(dentry);
>> @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> struct dentry *dentry,
>>             nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU)) {
>>                 error = nfs_lookup_verify_inode(inode, flags);
>>                 if (error) {
>> -                       if (error == -ESTALE)
>> +                       if (error == -ESTALE) {
>>                                 nfs_zap_caches(dir);
>> +                               error = 0;
>> +                       }
>>                         goto out_bad;
>>                 }
>>                 nfs_advise_use_readdirplus(dir);
>> @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> struct dentry *dentry,
>>  out_bad:
>>         if (flags & LOOKUP_RCU)
>>                 return -ECHILD;
>> -       return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
>> +       return nfs_lookup_revalidate_done(dir, dentry, inode, error);
>
> Which errors do we actually need to return here? As far as I can tell,
> the only errors that nfs_lookup_verify_inode() is supposed to return is
> ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
>
> Why would it be better to return those errors rather than just a 0 when
> we need to invalidate the inode, particularly since we already have a
> special case in nfs_lookup_revalidate_done() when the dentry is root?

ERESTARTSYS is the error that easily causes problems.

Returning 0 causes d_invalidate() to be called which is quite heavy
handed in mountpoints.
So it is only reasonable to return 0 when we have unambiguous
confirmation from the server that the object no longer exists.  ESTALE
is unambiguous. EIO might be unambiguous.  ERESTARTSYS, ENOMEM,
ETIMEDOUT are transient and don't justify d_invalidate() being called.

(BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are clearly invalid.")
 fixed much the same bug 3 years ago).
 
Thanks,
NeilBrown


>
>>  }
>>  
>>  static int
>
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 853 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] NFS: only invalidate dentrys that are clearly invalid.
  2020-11-16  4:43   ` NeilBrown
@ 2020-11-16  4:50     ` Trond Myklebust
  2020-11-16  5:00       ` NeilBrown
  0 siblings, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2020-11-16  4:50 UTC (permalink / raw)
  To: neilb, anna.schumaker; +Cc: linux-nfs, linux-kernel

On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
> On Mon, Nov 16 2020, Trond Myklebust wrote:
> 
> > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
> > > 
> > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
> > > nfs_lookup_revalidate()")
> > > and error from nfs_lookup_verify_inode() other than -ESTALE would
> > > result
> > > in nfs_lookup_revalidate() returning that error code (-ESTALE is
> > > mapped
> > > to zero).
> > > Since that commit, all errors result in zero being returned.
> > > 
> > > When nfs_lookup_revalidate() returns zero, the dentry is
> > > invalidated
> > > and, significantly, if the dentry is a directory that is mounted
> > > on,
> > > that mountpoint is lost.
> > > 
> > > If you:
> > >  - mount an NFS filesystem which contains a directory
> > >  - mount something (e.g. tmpfs) on that directory
> > >  - use iptables (or scissors) to block traffic to the server
> > >  - ls -l the-mounted-on-directory
> > >  - interrupt the 'ls -l'
> > > you will find that the directory has been unmounted.
> > > 
> > > This can be fixed by returning the actual error code from
> > > nfs_lookup_verify_inode() rather then zero (except for -ESTALE).
> > > 
> > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
> > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > ---
> > >  fs/nfs/dir.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > index cb52db9a0cfb..d24acf556e9e 100644
> > > --- a/fs/nfs/dir.c
> > > +++ b/fs/nfs/dir.c
> > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
> > > struct dentry *dentry,
> > >                          unsigned int flags)
> > >  {
> > >         struct inode *inode;
> > > -       int error;
> > > +       int error = 0;
> > >  
> > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
> > >         inode = d_inode(dentry);
> > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode
> > > *dir,
> > > struct dentry *dentry,
> > >             nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU))
> > > {
> > >                 error = nfs_lookup_verify_inode(inode, flags);
> > >                 if (error) {
> > > -                       if (error == -ESTALE)
> > > +                       if (error == -ESTALE) {
> > >                                 nfs_zap_caches(dir);
> > > +                               error = 0;
> > > +                       }
> > >                         goto out_bad;
> > >                 }
> > >                 nfs_advise_use_readdirplus(dir);
> > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
> > > struct dentry *dentry,
> > >  out_bad:
> > >         if (flags & LOOKUP_RCU)
> > >                 return -ECHILD;
> > > -       return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
> > > +       return nfs_lookup_revalidate_done(dir, dentry, inode,
> > > error);
> > 
> > Which errors do we actually need to return here? As far as I can
> > tell,
> > the only errors that nfs_lookup_verify_inode() is supposed to
> > return is
> > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
> > 
> > Why would it be better to return those errors rather than just a 0
> > when
> > we need to invalidate the inode, particularly since we already have
> > a
> > special case in nfs_lookup_revalidate_done() when the dentry is
> > root?
> 
> ERESTARTSYS is the error that easily causes problems.
> 
> Returning 0 causes d_invalidate() to be called which is quite heavy
> handed in mountpoints.

My point is that it shouldn't get returned for mountpoints. See
nfs_lookup_revalidate_done().

> So it is only reasonable to return 0 when we have unambiguous
> confirmation from the server that the object no longer exists. 
> ESTALE
> is unambiguous. EIO might be unambiguous.  ERESTARTSYS, ENOMEM,
> ETIMEDOUT are transient and don't justify d_invalidate() being
> called.
> 
> (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are
> clearly invalid.")
>  fixed much the same bug 3 years ago).
>  
> Thanks,
> NeilBrown
> 
> 
> > 
> > >  }
> > >  
> > >  static int
> > 
> > -- 
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > trond.myklebust@hammerspace.com

-- 
Trond Myklebust
CTO, Hammerspace Inc
4984 El Camino Real, Suite 208
Los Altos, CA 94022
​
www.hammer.space


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] NFS: only invalidate dentrys that are clearly invalid.
  2020-11-16  4:50     ` Trond Myklebust
@ 2020-11-16  5:00       ` NeilBrown
  2020-11-16  5:07         ` Trond Myklebust
  0 siblings, 1 reply; 10+ messages in thread
From: NeilBrown @ 2020-11-16  5:00 UTC (permalink / raw)
  To: Trond Myklebust, anna.schumaker; +Cc: linux-nfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5444 bytes --]

On Mon, Nov 16 2020, Trond Myklebust wrote:

> On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
>> On Mon, Nov 16 2020, Trond Myklebust wrote:
>> 
>> > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
>> > > 
>> > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
>> > > nfs_lookup_revalidate()")
>> > > and error from nfs_lookup_verify_inode() other than -ESTALE would
>> > > result
>> > > in nfs_lookup_revalidate() returning that error code (-ESTALE is
>> > > mapped
>> > > to zero).
>> > > Since that commit, all errors result in zero being returned.
>> > > 
>> > > When nfs_lookup_revalidate() returns zero, the dentry is
>> > > invalidated
>> > > and, significantly, if the dentry is a directory that is mounted
>> > > on,
>> > > that mountpoint is lost.
>> > > 
>> > > If you:
>> > >  - mount an NFS filesystem which contains a directory
>> > >  - mount something (e.g. tmpfs) on that directory
>> > >  - use iptables (or scissors) to block traffic to the server
>> > >  - ls -l the-mounted-on-directory
>> > >  - interrupt the 'ls -l'
>> > > you will find that the directory has been unmounted.
>> > > 
>> > > This can be fixed by returning the actual error code from
>> > > nfs_lookup_verify_inode() rather then zero (except for -ESTALE).
>> > > 
>> > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
>> > > Signed-off-by: NeilBrown <neilb@suse.de>
>> > > ---
>> > >  fs/nfs/dir.c | 8 +++++---
>> > >  1 file changed, 5 insertions(+), 3 deletions(-)
>> > > 
>> > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> > > index cb52db9a0cfb..d24acf556e9e 100644
>> > > --- a/fs/nfs/dir.c
>> > > +++ b/fs/nfs/dir.c
>> > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> > > struct dentry *dentry,
>> > >                          unsigned int flags)
>> > >  {
>> > >         struct inode *inode;
>> > > -       int error;
>> > > +       int error = 0;
>> > >  
>> > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>> > >         inode = d_inode(dentry);
>> > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode
>> > > *dir,
>> > > struct dentry *dentry,
>> > >             nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU))
>> > > {
>> > >                 error = nfs_lookup_verify_inode(inode, flags);
>> > >                 if (error) {
>> > > -                       if (error == -ESTALE)
>> > > +                       if (error == -ESTALE) {
>> > >                                 nfs_zap_caches(dir);
>> > > +                               error = 0;
>> > > +                       }
>> > >                         goto out_bad;
>> > >                 }
>> > >                 nfs_advise_use_readdirplus(dir);
>> > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> > > struct dentry *dentry,
>> > >  out_bad:
>> > >         if (flags & LOOKUP_RCU)
>> > >                 return -ECHILD;
>> > > -       return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
>> > > +       return nfs_lookup_revalidate_done(dir, dentry, inode,
>> > > error);
>> > 
>> > Which errors do we actually need to return here? As far as I can
>> > tell,
>> > the only errors that nfs_lookup_verify_inode() is supposed to
>> > return is
>> > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
>> > 
>> > Why would it be better to return those errors rather than just a 0
>> > when
>> > we need to invalidate the inode, particularly since we already have
>> > a
>> > special case in nfs_lookup_revalidate_done() when the dentry is
>> > root?
>> 
>> ERESTARTSYS is the error that easily causes problems.
>> 
>> Returning 0 causes d_invalidate() to be called which is quite heavy
>> handed in mountpoints.
>
> My point is that it shouldn't get returned for mountpoints. See
> nfs_lookup_revalidate_done().

nfs_lookup_revalidate_done() only checks IS_ROOT(), and while many
mountpoints are IS_ROOT(), not all are (--bind easily makes others).

But that isn't even really relevant here.  The dentry being revalidated
is the underlying directory - that something else is mounted on.
step_into() which follows mount points is called in walk_component()
*after* lookup_fast or lookup_slow which will have revalidated the
dentry.

NeilBrown


>
>> So it is only reasonable to return 0 when we have unambiguous
>> confirmation from the server that the object no longer exists. 
>> ESTALE
>> is unambiguous. EIO might be unambiguous.  ERESTARTSYS, ENOMEM,
>> ETIMEDOUT are transient and don't justify d_invalidate() being
>> called.
>> 
>> (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are
>> clearly invalid.")
>>  fixed much the same bug 3 years ago).
>>  
>> Thanks,
>> NeilBrown
>> 
>> 
>> > 
>> > >  }
>> > >  
>> > >  static int
>> > 
>> > -- 
>> > Trond Myklebust
>> > Linux NFS client maintainer, Hammerspace
>> > trond.myklebust@hammerspace.com
>
> -- 
> Trond Myklebust
> CTO, Hammerspace Inc
> 4984 El Camino Real, Suite 208
> Los Altos, CA 94022
> ​
> www.hammer.space

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 853 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] NFS: only invalidate dentrys that are clearly invalid.
  2020-11-16  5:00       ` NeilBrown
@ 2020-11-16  5:07         ` Trond Myklebust
  2020-11-16  5:12           ` NeilBrown
  0 siblings, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2020-11-16  5:07 UTC (permalink / raw)
  To: neilb, anna.schumaker; +Cc: linux-nfs, linux-kernel

On Mon, 2020-11-16 at 16:00 +1100, NeilBrown wrote:
> On Mon, Nov 16 2020, Trond Myklebust wrote:
> 
> > On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
> > > On Mon, Nov 16 2020, Trond Myklebust wrote:
> > > 
> > > > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
> > > > > 
> > > > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
> > > > > nfs_lookup_revalidate()")
> > > > > and error from nfs_lookup_verify_inode() other than -ESTALE
> > > > > would
> > > > > result
> > > > > in nfs_lookup_revalidate() returning that error code (-ESTALE
> > > > > is
> > > > > mapped
> > > > > to zero).
> > > > > Since that commit, all errors result in zero being returned.
> > > > > 
> > > > > When nfs_lookup_revalidate() returns zero, the dentry is
> > > > > invalidated
> > > > > and, significantly, if the dentry is a directory that is
> > > > > mounted
> > > > > on,
> > > > > that mountpoint is lost.
> > > > > 
> > > > > If you:
> > > > >  - mount an NFS filesystem which contains a directory
> > > > >  - mount something (e.g. tmpfs) on that directory
> > > > >  - use iptables (or scissors) to block traffic to the server
> > > > >  - ls -l the-mounted-on-directory
> > > > >  - interrupt the 'ls -l'
> > > > > you will find that the directory has been unmounted.
> > > > > 
> > > > > This can be fixed by returning the actual error code from
> > > > > nfs_lookup_verify_inode() rather then zero (except for -
> > > > > ESTALE).
> > > > > 
> > > > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
> > > > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > > > ---
> > > > >  fs/nfs/dir.c | 8 +++++---
> > > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > > > index cb52db9a0cfb..d24acf556e9e 100644
> > > > > --- a/fs/nfs/dir.c
> > > > > +++ b/fs/nfs/dir.c
> > > > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode
> > > > > *dir,
> > > > > struct dentry *dentry,
> > > > >                          unsigned int flags)
> > > > >  {
> > > > >         struct inode *inode;
> > > > > -       int error;
> > > > > +       int error = 0;
> > > > >  
> > > > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
> > > > >         inode = d_inode(dentry);
> > > > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode
> > > > > *dir,
> > > > > struct dentry *dentry,
> > > > >             nfs_check_verifier(dir, dentry, flags &
> > > > > LOOKUP_RCU))
> > > > > {
> > > > >                 error = nfs_lookup_verify_inode(inode,
> > > > > flags);
> > > > >                 if (error) {
> > > > > -                       if (error == -ESTALE)
> > > > > +                       if (error == -ESTALE) {
> > > > >                                 nfs_zap_caches(dir);
> > > > > +                               error = 0;
> > > > > +                       }
> > > > >                         goto out_bad;
> > > > >                 }
> > > > >                 nfs_advise_use_readdirplus(dir);
> > > > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode
> > > > > *dir,
> > > > > struct dentry *dentry,
> > > > >  out_bad:
> > > > >         if (flags & LOOKUP_RCU)
> > > > >                 return -ECHILD;
> > > > > -       return nfs_lookup_revalidate_done(dir, dentry, inode,
> > > > > 0);
> > > > > +       return nfs_lookup_revalidate_done(dir, dentry, inode,
> > > > > error);
> > > > 
> > > > Which errors do we actually need to return here? As far as I
> > > > can
> > > > tell,
> > > > the only errors that nfs_lookup_verify_inode() is supposed to
> > > > return is
> > > > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
> > > > 
> > > > Why would it be better to return those errors rather than just
> > > > a 0
> > > > when
> > > > we need to invalidate the inode, particularly since we already
> > > > have
> > > > a
> > > > special case in nfs_lookup_revalidate_done() when the dentry is
> > > > root?
> > > 
> > > ERESTARTSYS is the error that easily causes problems.
> > > 
> > > Returning 0 causes d_invalidate() to be called which is quite
> > > heavy
> > > handed in mountpoints.
> > 
> > My point is that it shouldn't get returned for mountpoints. See
> > nfs_lookup_revalidate_done().
> 
> nfs_lookup_revalidate_done() only checks IS_ROOT(), and while many
> mountpoints are IS_ROOT(), not all are (--bind easily makes others).
> 
> But that isn't even really relevant here.  The dentry being
> revalidated
> is the underlying directory - that something else is mounted on.
> step_into() which follows mount points is called in walk_component()
> *after* lookup_fast or lookup_slow which will have revalidated the
> dentry.

So then why is it not sufficient to just add a check for
d_mountpoint()? This is a revalidation, not a new lookup.

> 
> NeilBrown
> 
> 
> > 
> > > So it is only reasonable to return 0 when we have unambiguous
> > > confirmation from the server that the object no longer exists. 
> > > ESTALE
> > > is unambiguous. EIO might be unambiguous.  ERESTARTSYS, ENOMEM,
> > > ETIMEDOUT are transient and don't justify d_invalidate() being
> > > called.
> > > 
> > > (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are
> > > clearly invalid.")
> > >  fixed much the same bug 3 years ago).
> > >  
> > > Thanks,
> > > NeilBrown
> > > 
> > > 
> > > > 
> > > > >  }
> > > > >  
> > > > >  static int
> > > > 
> > > > -- 
> > > > Trond Myklebust
> > > > Linux NFS client maintainer, Hammerspace
> > > > trond.myklebust@hammerspace.com
> > 
> > -- 
> > Trond Myklebust
> > CTO, Hammerspace Inc
> > 4984 El Camino Real, Suite 208
> > Los Altos, CA 94022
> > ​
> > www.hammer.space

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] NFS: only invalidate dentrys that are clearly invalid.
  2020-11-16  5:07         ` Trond Myklebust
@ 2020-11-16  5:12           ` NeilBrown
  2020-11-16  5:32             ` Trond Myklebust
  0 siblings, 1 reply; 10+ messages in thread
From: NeilBrown @ 2020-11-16  5:12 UTC (permalink / raw)
  To: Trond Myklebust, anna.schumaker; +Cc: linux-nfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 6642 bytes --]

On Mon, Nov 16 2020, Trond Myklebust wrote:

> On Mon, 2020-11-16 at 16:00 +1100, NeilBrown wrote:
>> On Mon, Nov 16 2020, Trond Myklebust wrote:
>> 
>> > On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
>> > > On Mon, Nov 16 2020, Trond Myklebust wrote:
>> > > 
>> > > > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
>> > > > > 
>> > > > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
>> > > > > nfs_lookup_revalidate()")
>> > > > > and error from nfs_lookup_verify_inode() other than -ESTALE
>> > > > > would
>> > > > > result
>> > > > > in nfs_lookup_revalidate() returning that error code (-ESTALE
>> > > > > is
>> > > > > mapped
>> > > > > to zero).
>> > > > > Since that commit, all errors result in zero being returned.
>> > > > > 
>> > > > > When nfs_lookup_revalidate() returns zero, the dentry is
>> > > > > invalidated
>> > > > > and, significantly, if the dentry is a directory that is
>> > > > > mounted
>> > > > > on,
>> > > > > that mountpoint is lost.
>> > > > > 
>> > > > > If you:
>> > > > >  - mount an NFS filesystem which contains a directory
>> > > > >  - mount something (e.g. tmpfs) on that directory
>> > > > >  - use iptables (or scissors) to block traffic to the server
>> > > > >  - ls -l the-mounted-on-directory
>> > > > >  - interrupt the 'ls -l'
>> > > > > you will find that the directory has been unmounted.
>> > > > > 
>> > > > > This can be fixed by returning the actual error code from
>> > > > > nfs_lookup_verify_inode() rather then zero (except for -
>> > > > > ESTALE).
>> > > > > 
>> > > > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
>> > > > > Signed-off-by: NeilBrown <neilb@suse.de>
>> > > > > ---
>> > > > >  fs/nfs/dir.c | 8 +++++---
>> > > > >  1 file changed, 5 insertions(+), 3 deletions(-)
>> > > > > 
>> > > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> > > > > index cb52db9a0cfb..d24acf556e9e 100644
>> > > > > --- a/fs/nfs/dir.c
>> > > > > +++ b/fs/nfs/dir.c
>> > > > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode
>> > > > > *dir,
>> > > > > struct dentry *dentry,
>> > > > >                          unsigned int flags)
>> > > > >  {
>> > > > >         struct inode *inode;
>> > > > > -       int error;
>> > > > > +       int error = 0;
>> > > > >  
>> > > > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>> > > > >         inode = d_inode(dentry);
>> > > > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode
>> > > > > *dir,
>> > > > > struct dentry *dentry,
>> > > > >             nfs_check_verifier(dir, dentry, flags &
>> > > > > LOOKUP_RCU))
>> > > > > {
>> > > > >                 error = nfs_lookup_verify_inode(inode,
>> > > > > flags);
>> > > > >                 if (error) {
>> > > > > -                       if (error == -ESTALE)
>> > > > > +                       if (error == -ESTALE) {
>> > > > >                                 nfs_zap_caches(dir);
>> > > > > +                               error = 0;
>> > > > > +                       }
>> > > > >                         goto out_bad;
>> > > > >                 }
>> > > > >                 nfs_advise_use_readdirplus(dir);
>> > > > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode
>> > > > > *dir,
>> > > > > struct dentry *dentry,
>> > > > >  out_bad:
>> > > > >         if (flags & LOOKUP_RCU)
>> > > > >                 return -ECHILD;
>> > > > > -       return nfs_lookup_revalidate_done(dir, dentry, inode,
>> > > > > 0);
>> > > > > +       return nfs_lookup_revalidate_done(dir, dentry, inode,
>> > > > > error);
>> > > > 
>> > > > Which errors do we actually need to return here? As far as I
>> > > > can
>> > > > tell,
>> > > > the only errors that nfs_lookup_verify_inode() is supposed to
>> > > > return is
>> > > > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
>> > > > 
>> > > > Why would it be better to return those errors rather than just
>> > > > a 0
>> > > > when
>> > > > we need to invalidate the inode, particularly since we already
>> > > > have
>> > > > a
>> > > > special case in nfs_lookup_revalidate_done() when the dentry is
>> > > > root?
>> > > 
>> > > ERESTARTSYS is the error that easily causes problems.
>> > > 
>> > > Returning 0 causes d_invalidate() to be called which is quite
>> > > heavy
>> > > handed in mountpoints.
>> > 
>> > My point is that it shouldn't get returned for mountpoints. See
>> > nfs_lookup_revalidate_done().
>> 
>> nfs_lookup_revalidate_done() only checks IS_ROOT(), and while many
>> mountpoints are IS_ROOT(), not all are (--bind easily makes others).
>> 
>> But that isn't even really relevant here.  The dentry being
>> revalidated
>> is the underlying directory - that something else is mounted on.
>> step_into() which follows mount points is called in walk_component()
>> *after* lookup_fast or lookup_slow which will have revalidated the
>> dentry.
>
> So then why is it not sufficient to just add a check for
> d_mountpoint()? This is a revalidation, not a new lookup.
>

I guess you could do that.
But why would you want to call d_invalidate() just because a signal was
received, or a memory allocation failed?

NeilBrown


>> 
>> NeilBrown
>> 
>> 
>> > 
>> > > So it is only reasonable to return 0 when we have unambiguous
>> > > confirmation from the server that the object no longer exists. 
>> > > ESTALE
>> > > is unambiguous. EIO might be unambiguous.  ERESTARTSYS, ENOMEM,
>> > > ETIMEDOUT are transient and don't justify d_invalidate() being
>> > > called.
>> > > 
>> > > (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are
>> > > clearly invalid.")
>> > >  fixed much the same bug 3 years ago).
>> > >  
>> > > Thanks,
>> > > NeilBrown
>> > > 
>> > > 
>> > > > 
>> > > > >  }
>> > > > >  
>> > > > >  static int
>> > > > 
>> > > > -- 
>> > > > Trond Myklebust
>> > > > Linux NFS client maintainer, Hammerspace
>> > > > trond.myklebust@hammerspace.com
>> > 
>> > -- 
>> > Trond Myklebust
>> > CTO, Hammerspace Inc
>> > 4984 El Camino Real, Suite 208
>> > Los Altos, CA 94022
>> > ​
>> > www.hammer.space
>
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 853 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] NFS: only invalidate dentrys that are clearly invalid.
  2020-11-16  5:12           ` NeilBrown
@ 2020-11-16  5:32             ` Trond Myklebust
  2020-11-16  6:08               ` NeilBrown
  0 siblings, 1 reply; 10+ messages in thread
From: Trond Myklebust @ 2020-11-16  5:32 UTC (permalink / raw)
  To: neilb, anna.schumaker; +Cc: linux-nfs, linux-kernel

On Mon, 2020-11-16 at 16:12 +1100, NeilBrown wrote:
> On Mon, Nov 16 2020, Trond Myklebust wrote:
> 
> > On Mon, 2020-11-16 at 16:00 +1100, NeilBrown wrote:
> > > On Mon, Nov 16 2020, Trond Myklebust wrote:
> > > 
> > > > On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
> > > > > On Mon, Nov 16 2020, Trond Myklebust wrote:
> > > > > 
> > > > > > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
> > > > > > > 
> > > > > > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
> > > > > > > nfs_lookup_revalidate()")
> > > > > > > and error from nfs_lookup_verify_inode() other than -
> > > > > > > ESTALE
> > > > > > > would
> > > > > > > result
> > > > > > > in nfs_lookup_revalidate() returning that error code (-
> > > > > > > ESTALE
> > > > > > > is
> > > > > > > mapped
> > > > > > > to zero).
> > > > > > > Since that commit, all errors result in zero being
> > > > > > > returned.
> > > > > > > 
> > > > > > > When nfs_lookup_revalidate() returns zero, the dentry is
> > > > > > > invalidated
> > > > > > > and, significantly, if the dentry is a directory that is
> > > > > > > mounted
> > > > > > > on,
> > > > > > > that mountpoint is lost.
> > > > > > > 
> > > > > > > If you:
> > > > > > >  - mount an NFS filesystem which contains a directory
> > > > > > >  - mount something (e.g. tmpfs) on that directory
> > > > > > >  - use iptables (or scissors) to block traffic to the
> > > > > > > server
> > > > > > >  - ls -l the-mounted-on-directory
> > > > > > >  - interrupt the 'ls -l'
> > > > > > > you will find that the directory has been unmounted.
> > > > > > > 
> > > > > > > This can be fixed by returning the actual error code from
> > > > > > > nfs_lookup_verify_inode() rather then zero (except for -
> > > > > > > ESTALE).
> > > > > > > 
> > > > > > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor
> > > > > > > nfs_lookup_revalidate()")
> > > > > > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > > > > > ---
> > > > > > >  fs/nfs/dir.c | 8 +++++---
> > > > > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > > > > > index cb52db9a0cfb..d24acf556e9e 100644
> > > > > > > --- a/fs/nfs/dir.c
> > > > > > > +++ b/fs/nfs/dir.c
> > > > > > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct
> > > > > > > inode
> > > > > > > *dir,
> > > > > > > struct dentry *dentry,
> > > > > > >                          unsigned int flags)
> > > > > > >  {
> > > > > > >         struct inode *inode;
> > > > > > > -       int error;
> > > > > > > +       int error = 0;
> > > > > > >  
> > > > > > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
> > > > > > >         inode = d_inode(dentry);
> > > > > > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct
> > > > > > > inode
> > > > > > > *dir,
> > > > > > > struct dentry *dentry,
> > > > > > >             nfs_check_verifier(dir, dentry, flags &
> > > > > > > LOOKUP_RCU))
> > > > > > > {
> > > > > > >                 error = nfs_lookup_verify_inode(inode,
> > > > > > > flags);
> > > > > > >                 if (error) {
> > > > > > > -                       if (error == -ESTALE)
> > > > > > > +                       if (error == -ESTALE) {
> > > > > > >                                 nfs_zap_caches(dir);
> > > > > > > +                               error = 0;
> > > > > > > +                       }
> > > > > > >                         goto out_bad;
> > > > > > >                 }
> > > > > > >                 nfs_advise_use_readdirplus(dir);
> > > > > > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct
> > > > > > > inode
> > > > > > > *dir,
> > > > > > > struct dentry *dentry,
> > > > > > >  out_bad:
> > > > > > >         if (flags & LOOKUP_RCU)
> > > > > > >                 return -ECHILD;
> > > > > > > -       return nfs_lookup_revalidate_done(dir, dentry,
> > > > > > > inode,
> > > > > > > 0);
> > > > > > > +       return nfs_lookup_revalidate_done(dir, dentry,
> > > > > > > inode,
> > > > > > > error);
> > > > > > 
> > > > > > Which errors do we actually need to return here? As far as
> > > > > > I
> > > > > > can
> > > > > > tell,
> > > > > > the only errors that nfs_lookup_verify_inode() is supposed
> > > > > > to
> > > > > > return is
> > > > > > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
> > > > > > 
> > > > > > Why would it be better to return those errors rather than
> > > > > > just
> > > > > > a 0
> > > > > > when
> > > > > > we need to invalidate the inode, particularly since we
> > > > > > already
> > > > > > have
> > > > > > a
> > > > > > special case in nfs_lookup_revalidate_done() when the
> > > > > > dentry is
> > > > > > root?
> > > > > 
> > > > > ERESTARTSYS is the error that easily causes problems.
> > > > > 
> > > > > Returning 0 causes d_invalidate() to be called which is quite
> > > > > heavy
> > > > > handed in mountpoints.
> > > > 
> > > > My point is that it shouldn't get returned for mountpoints. See
> > > > nfs_lookup_revalidate_done().
> > > 
> > > nfs_lookup_revalidate_done() only checks IS_ROOT(), and while
> > > many
> > > mountpoints are IS_ROOT(), not all are (--bind easily makes
> > > others).
> > > 
> > > But that isn't even really relevant here.  The dentry being
> > > revalidated
> > > is the underlying directory - that something else is mounted on.
> > > step_into() which follows mount points is called in
> > > walk_component()
> > > *after* lookup_fast or lookup_slow which will have revalidated
> > > the
> > > dentry.
> > 
> > So then why is it not sufficient to just add a check for
> > d_mountpoint()? This is a revalidation, not a new lookup.
> > 
> 
> I guess you could do that.
> But why would you want to call d_invalidate() just because a signal
> was
> received, or a memory allocation failed?

Why would I care about the error return from nfs_lookup_verify_inode()?
This is a revalidation, and so sometimes the error returned is not
transient, but is persistent (e.g. EIO/ETIMEDOUT if the server is
down). In those cases, I still want to be able to do things like
unmount the filesystem.

> 
> NeilBrown
> 
> 
> > > 
> > > NeilBrown
> > > 
> > > 
> > > > 
> > > > > So it is only reasonable to return 0 when we have unambiguous
> > > > > confirmation from the server that the object no longer
> > > > > exists. 
> > > > > ESTALE
> > > > > is unambiguous. EIO might be unambiguous.  ERESTARTSYS,
> > > > > ENOMEM,
> > > > > ETIMEDOUT are transient and don't justify d_invalidate()
> > > > > being
> > > > > called.
> > > > > 
> > > > > (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that
> > > > > are
> > > > > clearly invalid.")
> > > > >  fixed much the same bug 3 years ago).
> > > > >  
> > > > > Thanks,
> > > > > NeilBrown
> > > > > 
> > > > > 
> > > > > > 
> > > > > > >  }
> > > > > > >  
> > > > > > >  static int
> > > > > > 
> > > > > > -- 
> > > > > > Trond Myklebust
> > > > > > Linux NFS client maintainer, Hammerspace
> > > > > > trond.myklebust@hammerspace.com
> > > > 
> > > > -- 
> > > > Trond Myklebust
> > > > CTO, Hammerspace Inc
> > > > 4984 El Camino Real, Suite 208
> > > > Los Altos, CA 94022
> > > > ​
> > > > www.hammer.space
> > 
> > -- 
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > trond.myklebust@hammerspace.com

-- 
Trond Myklebust
CTO, Hammerspace Inc
4984 El Camino Real, Suite 208
Los Altos, CA 94022
​
www.hammer.space


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] NFS: only invalidate dentrys that are clearly invalid.
  2020-11-16  5:32             ` Trond Myklebust
@ 2020-11-16  6:08               ` NeilBrown
  0 siblings, 0 replies; 10+ messages in thread
From: NeilBrown @ 2020-11-16  6:08 UTC (permalink / raw)
  To: Trond Myklebust, anna.schumaker; +Cc: linux-nfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 8983 bytes --]

On Mon, Nov 16 2020, Trond Myklebust wrote:

> On Mon, 2020-11-16 at 16:12 +1100, NeilBrown wrote:
>> On Mon, Nov 16 2020, Trond Myklebust wrote:
>> 
>> > On Mon, 2020-11-16 at 16:00 +1100, NeilBrown wrote:
>> > > On Mon, Nov 16 2020, Trond Myklebust wrote:
>> > > 
>> > > > On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
>> > > > > On Mon, Nov 16 2020, Trond Myklebust wrote:
>> > > > > 
>> > > > > > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
>> > > > > > > 
>> > > > > > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
>> > > > > > > nfs_lookup_revalidate()")
>> > > > > > > and error from nfs_lookup_verify_inode() other than -
>> > > > > > > ESTALE
>> > > > > > > would
>> > > > > > > result
>> > > > > > > in nfs_lookup_revalidate() returning that error code (-
>> > > > > > > ESTALE
>> > > > > > > is
>> > > > > > > mapped
>> > > > > > > to zero).
>> > > > > > > Since that commit, all errors result in zero being
>> > > > > > > returned.
>> > > > > > > 
>> > > > > > > When nfs_lookup_revalidate() returns zero, the dentry is
>> > > > > > > invalidated
>> > > > > > > and, significantly, if the dentry is a directory that is
>> > > > > > > mounted
>> > > > > > > on,
>> > > > > > > that mountpoint is lost.
>> > > > > > > 
>> > > > > > > If you:
>> > > > > > >  - mount an NFS filesystem which contains a directory
>> > > > > > >  - mount something (e.g. tmpfs) on that directory
>> > > > > > >  - use iptables (or scissors) to block traffic to the
>> > > > > > > server
>> > > > > > >  - ls -l the-mounted-on-directory
>> > > > > > >  - interrupt the 'ls -l'
>> > > > > > > you will find that the directory has been unmounted.
>> > > > > > > 
>> > > > > > > This can be fixed by returning the actual error code from
>> > > > > > > nfs_lookup_verify_inode() rather then zero (except for -
>> > > > > > > ESTALE).
>> > > > > > > 
>> > > > > > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor
>> > > > > > > nfs_lookup_revalidate()")
>> > > > > > > Signed-off-by: NeilBrown <neilb@suse.de>
>> > > > > > > ---
>> > > > > > >  fs/nfs/dir.c | 8 +++++---
>> > > > > > >  1 file changed, 5 insertions(+), 3 deletions(-)
>> > > > > > > 
>> > > > > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> > > > > > > index cb52db9a0cfb..d24acf556e9e 100644
>> > > > > > > --- a/fs/nfs/dir.c
>> > > > > > > +++ b/fs/nfs/dir.c
>> > > > > > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct
>> > > > > > > inode
>> > > > > > > *dir,
>> > > > > > > struct dentry *dentry,
>> > > > > > >                          unsigned int flags)
>> > > > > > >  {
>> > > > > > >         struct inode *inode;
>> > > > > > > -       int error;
>> > > > > > > +       int error = 0;
>> > > > > > >  
>> > > > > > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>> > > > > > >         inode = d_inode(dentry);
>> > > > > > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct
>> > > > > > > inode
>> > > > > > > *dir,
>> > > > > > > struct dentry *dentry,
>> > > > > > >             nfs_check_verifier(dir, dentry, flags &
>> > > > > > > LOOKUP_RCU))
>> > > > > > > {
>> > > > > > >                 error = nfs_lookup_verify_inode(inode,
>> > > > > > > flags);
>> > > > > > >                 if (error) {
>> > > > > > > -                       if (error == -ESTALE)
>> > > > > > > +                       if (error == -ESTALE) {
>> > > > > > >                                 nfs_zap_caches(dir);
>> > > > > > > +                               error = 0;
>> > > > > > > +                       }
>> > > > > > >                         goto out_bad;
>> > > > > > >                 }
>> > > > > > >                 nfs_advise_use_readdirplus(dir);
>> > > > > > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct
>> > > > > > > inode
>> > > > > > > *dir,
>> > > > > > > struct dentry *dentry,
>> > > > > > >  out_bad:
>> > > > > > >         if (flags & LOOKUP_RCU)
>> > > > > > >                 return -ECHILD;
>> > > > > > > -       return nfs_lookup_revalidate_done(dir, dentry,
>> > > > > > > inode,
>> > > > > > > 0);
>> > > > > > > +       return nfs_lookup_revalidate_done(dir, dentry,
>> > > > > > > inode,
>> > > > > > > error);
>> > > > > > 
>> > > > > > Which errors do we actually need to return here? As far as
>> > > > > > I
>> > > > > > can
>> > > > > > tell,
>> > > > > > the only errors that nfs_lookup_verify_inode() is supposed
>> > > > > > to
>> > > > > > return is
>> > > > > > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
>> > > > > > 
>> > > > > > Why would it be better to return those errors rather than
>> > > > > > just
>> > > > > > a 0
>> > > > > > when
>> > > > > > we need to invalidate the inode, particularly since we
>> > > > > > already
>> > > > > > have
>> > > > > > a
>> > > > > > special case in nfs_lookup_revalidate_done() when the
>> > > > > > dentry is
>> > > > > > root?
>> > > > > 
>> > > > > ERESTARTSYS is the error that easily causes problems.
>> > > > > 
>> > > > > Returning 0 causes d_invalidate() to be called which is quite
>> > > > > heavy
>> > > > > handed in mountpoints.
>> > > > 
>> > > > My point is that it shouldn't get returned for mountpoints. See
>> > > > nfs_lookup_revalidate_done().
>> > > 
>> > > nfs_lookup_revalidate_done() only checks IS_ROOT(), and while
>> > > many
>> > > mountpoints are IS_ROOT(), not all are (--bind easily makes
>> > > others).
>> > > 
>> > > But that isn't even really relevant here.  The dentry being
>> > > revalidated
>> > > is the underlying directory - that something else is mounted on.
>> > > step_into() which follows mount points is called in
>> > > walk_component()
>> > > *after* lookup_fast or lookup_slow which will have revalidated
>> > > the
>> > > dentry.
>> > 
>> > So then why is it not sufficient to just add a check for
>> > d_mountpoint()? This is a revalidation, not a new lookup.
>> > 
>> 
>> I guess you could do that.
>> But why would you want to call d_invalidate() just because a signal
>> was
>> received, or a memory allocation failed?
>
> Why would I care about the error return from nfs_lookup_verify_inode()?

?? Why wouldn't you?  What am I missing?

> This is a revalidation, and so sometimes the error returned is not
> transient, but is persistent (e.g. EIO/ETIMEDOUT if the server is
> down). In those cases, I still want to be able to do things like
> unmount the filesystem.

Why do you think that a server being down is a persistent error?
Servers come back up.

I don't have a particular opinion on how EIO and ETIMEDOUT should be
handled as I don't have a clear idea of what underlying issue produces
them.

I do know that ESTALE means the directory doesn't exist any more, so the
dentry should be invalidated, whether it is mounted on or not.
I do know that ERESTARTSYS means that a fatal signal was received by the
current process, so it is inappropriate to invalidate the dentry.

So I'm certain that we need to be able to handle different error codes
differently.  Maybe EIO should be treated the same as ESTALE.  Probably
ENOMEM should be handled like ERESTARTSYS....

NeilBrown


>
>> 
>> NeilBrown
>> 
>> 
>> > > 
>> > > NeilBrown
>> > > 
>> > > 
>> > > > 
>> > > > > So it is only reasonable to return 0 when we have unambiguous
>> > > > > confirmation from the server that the object no longer
>> > > > > exists. 
>> > > > > ESTALE
>> > > > > is unambiguous. EIO might be unambiguous.  ERESTARTSYS,
>> > > > > ENOMEM,
>> > > > > ETIMEDOUT are transient and don't justify d_invalidate()
>> > > > > being
>> > > > > called.
>> > > > > 
>> > > > > (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that
>> > > > > are
>> > > > > clearly invalid.")
>> > > > >  fixed much the same bug 3 years ago).
>> > > > >  
>> > > > > Thanks,
>> > > > > NeilBrown
>> > > > > 
>> > > > > 
>> > > > > > 
>> > > > > > >  }
>> > > > > > >  
>> > > > > > >  static int
>> > > > > > 
>> > > > > > -- 
>> > > > > > Trond Myklebust
>> > > > > > Linux NFS client maintainer, Hammerspace
>> > > > > > trond.myklebust@hammerspace.com
>> > > > 
>> > > > -- 
>> > > > Trond Myklebust
>> > > > CTO, Hammerspace Inc
>> > > > 4984 El Camino Real, Suite 208
>> > > > Los Altos, CA 94022
>> > > > ​
>> > > > www.hammer.space
>> > 
>> > -- 
>> > Trond Myklebust
>> > Linux NFS client maintainer, Hammerspace
>> > trond.myklebust@hammerspace.com
>
> -- 
> Trond Myklebust
> CTO, Hammerspace Inc
> 4984 El Camino Real, Suite 208
> Los Altos, CA 94022
> ​
> www.hammer.space

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 853 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] NFS: only invalidate dentrys that are clearly invalid.
@ 2017-07-05  2:22 NeilBrown
  0 siblings, 0 replies; 10+ messages in thread
From: NeilBrown @ 2017-07-05  2:22 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker; +Cc: linux-nfs, linux-fsdevel, LKML

[-- Attachment #1: Type: text/plain, Size: 3708 bytes --]


Since commit bafc9b754f75 ("vfs: More precise tests in d_invalidate")
in v3.18, a return of '0' from ->d_revalidate() will cause the dentry
to be invalidated even if it has filesystems mounted on or it or on a
descendant.  The mounted filesystem is unmounted.

This means we need to be careful not to return 0 unless the directory
referred to truly is invalid.  So -ESTALE or -ENOENT should invalidate
the directory.  Other errors such a -EPERM or -ERESTARTSYS should be
returned from ->d_revalidate() so they are propagated to the caller.

A particular problem can be demonstrated by:

1/ mount an NFS filesystem using NFSv3 on /mnt
2/ mount any other filesystem on /mnt/foo
3/ ls /mnt/foo
4/ turn off network, or otherwise make the server unable to respond
5/ ls /mnt/foo &
6/ cat /proc/$!/stack # note that nfs_lookup_revalidate is in the call stack
7/ kill -9 $! # this results in -ERESTARTSYS being returned
8/ observe that /mnt/foo has been unmounted.

This patch changes nfs_lookup_revalidate() to only treat
  -ESTALE from nfs_lookup_verify_inode() and
  -ESTALE or -ENOENT from ->lookup()
as indicating an invalid inode.  Other errors are returned.

Also nfs_check_inode_attributes() is changed to return -ESTALE rather
than -EIO.  This is consistent with the error returned in similar
circumstances from nfs_update_inode().

As this bug allows any user to unmount a filesystem mounted on an NFS
filesystem, this fix is suitable for stable kernels.

Fixes: bafc9b754f75 ("vfs: More precise tests in d_invalidate")
Cc: stable@vger.kernel.org (v3.18+)
Signed-off-by: NeilBrown <neilb@suse.com>
---
 fs/nfs/dir.c   | 12 ++++++++----
 fs/nfs/inode.c |  4 ++--
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 4f0706bd387f..727c5a53417a 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1115,11 +1115,13 @@ static int nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags)
 	/* Force a full look up iff the parent directory has changed */
 	if (!nfs_is_exclusive_create(dir, flags) &&
 	    nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU)) {
-
-		if (nfs_lookup_verify_inode(inode, flags)) {
+		error = nfs_lookup_verify_inode(inode, flags);
+		if (error) {
 			if (flags & LOOKUP_RCU)
 				return -ECHILD;
-			goto out_zap_parent;
+			if (error == -ESTALE)
+				goto out_zap_parent;
+			goto out_error;
 		}
 		nfs_advise_use_readdirplus(dir);
 		goto out_valid;
@@ -1144,8 +1146,10 @@ static int nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags)
 	trace_nfs_lookup_revalidate_enter(dir, dentry, flags);
 	error = NFS_PROTO(dir)->lookup(dir, &dentry->d_name, fhandle, fattr, label);
 	trace_nfs_lookup_revalidate_exit(dir, dentry, flags, error);
-	if (error)
+	if (error == -ESTALE || error == -ENOENT)
 		goto out_bad;
+	if (error)
+		goto out_error;
 	if (nfs_compare_fh(NFS_FH(inode), fhandle))
 		goto out_bad;
 	if ((error = nfs_refresh_inode(inode, fattr)) != 0)
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 5751c1310177..7e7a894601b9 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1321,9 +1321,9 @@ static int nfs_check_inode_attributes(struct inode *inode, struct nfs_fattr *fat
 		return 0;
 	/* Has the inode gone and changed behind our back? */
 	if ((fattr->valid & NFS_ATTR_FATTR_FILEID) && nfsi->fileid != fattr->fileid)
-		return -EIO;
+		return -ESTALE;
 	if ((fattr->valid & NFS_ATTR_FATTR_TYPE) && (inode->i_mode & S_IFMT) != (fattr->mode & S_IFMT))
-		return -EIO;
+		return -ESTALE;
 
 	if (!nfs_file_has_buffered_writers(nfsi)) {
 		/* Verify a few of the more important attributes */
-- 
2.12.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-11-16  6:09 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-16  2:59 [PATCH] NFS: only invalidate dentrys that are clearly invalid NeilBrown
2020-11-16  4:27 ` Trond Myklebust
2020-11-16  4:43   ` NeilBrown
2020-11-16  4:50     ` Trond Myklebust
2020-11-16  5:00       ` NeilBrown
2020-11-16  5:07         ` Trond Myklebust
2020-11-16  5:12           ` NeilBrown
2020-11-16  5:32             ` Trond Myklebust
2020-11-16  6:08               ` NeilBrown
  -- strict thread matches above, loose matches on Subject: below --
2017-07-05  2:22 NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).