linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] NFSv4: open() should try lease recovery on NFS4ERR_EXPIRED
@ 2019-12-16 17:54 Robert Milkowski
  2019-12-16 18:36 ` Robert Milkowski
  2019-12-16 18:39 ` Trond Myklebust
  0 siblings, 2 replies; 3+ messages in thread
From: Robert Milkowski @ 2019-12-16 17:54 UTC (permalink / raw)
  To: linux-nfs
  Cc: 'Trond Myklebust', 'Anna Schumaker',
	linux-kernel, linux-nfs

From: Robert Milkowski <rmilkowski@gmail.com>

Currently, if an nfs server returns NFS4ERR_EXPIRED to open,
we return EIO to applications without even trying to recover.

Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
---
 fs/nfs/nfs4proc.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index aad65dd..04e6a13 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -481,6 +481,10 @@ static int nfs4_do_handle_exception(struct nfs_server
*server,
 						stateid);
 				goto wait_on_recovery;
 			}
+			if (state == NULL) {
+				nfs4_schedule_lease_recovery(clp);
+				goto wait_on_recovery;
+			}
 			/* Fall through */
 		case -NFS4ERR_OPENMODE:
 			if (inode) {
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* RE: [PATCH] NFSv4: open() should try lease recovery on NFS4ERR_EXPIRED
  2019-12-16 17:54 [PATCH] NFSv4: open() should try lease recovery on NFS4ERR_EXPIRED Robert Milkowski
@ 2019-12-16 18:36 ` Robert Milkowski
  2019-12-16 18:39 ` Trond Myklebust
  1 sibling, 0 replies; 3+ messages in thread
From: Robert Milkowski @ 2019-12-16 18:36 UTC (permalink / raw)
  To: linux-nfs
  Cc: 'Trond Myklebust', 'Anna Schumaker',
	linux-kernel, linux-nfs

Hi,


Currently if nfsv4 lease expires on a server Linux client will return EIO to
open and won't try to recover.

For example:

$ date; strace -t -e trace=open head /mnt/3/var/log/vmware-vmsvc.log
>/dev/null
Mon 16 Dec 17:37:16 GMT 2019
...
17:37:16 open("/mnt/3/var/log/vmware-vmsvc.log", O_RDONLY) = -1 EIO
(Input/output error)
...


The network/nfs traffic:

2638 17:37:16.286642918  10.50.2.57 -> 10.50.2.59  NFS 246 V4 Call ACCESS
FH: 0x62d40c52, [Check: RD LU MD XT DL]
2639 17:37:16.286929287  10.50.2.59 -> 10.50.2.57  NFS 194 V4 Reply (Call In
2638) ACCESS, [Access Denied: MD XT DL], [Allowed: RD LU]
2640 17:37:16.287295942  10.50.2.57 -> 10.50.2.59  NFS 258 V4 Call ACCESS
FH: 0xeddb7439, [Check: RD LU MD XT DL]
2641 17:37:16.287448815  10.50.2.59 -> 10.50.2.57  NFS 194 V4 Reply (Call In
2640) ACCESS, [Access Denied: MD XT DL], [Allowed: RD LU]
2642 17:37:16.287666576  10.50.2.57 -> 10.50.2.59  NFS 266 V4 Call ACCESS
FH: 0x8503e2cd, [Check: RD LU MD XT DL]
2643 17:37:16.287786851  10.50.2.59 -> 10.50.2.57  NFS 194 V4 Reply (Call In
2642) ACCESS, [Access Denied: MD XT DL], [Allowed: RD LU]
2644 17:37:16.287984808  10.50.2.57 -> 10.50.2.59  NFS 350 V4 Call OPEN DH:
0x8503e2cd/vmware-vmsvc.log
2645 17:37:16.288194276  10.50.2.59 -> 10.50.2.57  NFS 122 V4 Reply (Call In
2644) OPEN Status: NFS4ERR_EXPIRED


Both Linux and Solaris NFS servers return NFS4ERR_EXPIRED if nfsv4 lease
expire.
Recent ONTAP versions return NFS4ERR_STALE_CLIENTID which is handled
correctly.

The patch changes handling of NFS4ERR_EXPIRED error to also try to recover.

The issue can be triggered by exporting over NFSv4 a filesystem with a
sub-filesystem,
for example by exporting / and /var (assuming /var is a separate
filesystem).

How to reproduce the issue:

On an NFS server run:

# cat /etc/exports
/ *(rw,sync)
/var *(rw,sync)


On a Linux NFS client:

# mount -o vers=4 10.50.2.59:/ /mnt/3

$ date; strace -t -e trace=open head /mnt/3/var/log/vmware-vmsvc.log
>/dev/null
Mon 16 Dec 17:28:45 GMT 2019
...
17:28:45 open("/mnt/3/var/log/vmware-vmsvc.log", O_RDONLY) = 3
...


Now, on the NFS client run:

# while [ 1 ]; do date; umount /mnt/3/var; ls /mnt/3/var >/dev/null; sleep
10; done
...

Because of another bug (see email with subject: [PATCH] NFSv4:
nfs4_do_fsinfo() should not do implicit lease renewals)
the above while loop will prevent the NFS client from sending RENEW
operations to server as it currently assumes an implicit
lease renewal which I believe is not compliant with the RFC. If you now wait
long enough for the NFS server to expire the lease,
and then try to open a file (with no other NFS activity) it should result in
server sending NFS4ERR_EXPIRED and the NFS LINUX client
will return EIO to an application as shown above.


This also gets triggered if a sub-mount in automatically unmounted after
nfs_mountpoint_expiry_timeout, then shortly after there is some
access to it which will trigger for it to be automatically mounted again
which will delay the client in sending the RENEW operation.
This results  in a short window during which open() will return EIO to any
files from the file server.

This was tested with 5.5.0-rc2 and the provided patch is applied on top of
the 5.5.0-rc2 as well.


Best regards,
 Robert Milkowski


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] NFSv4: open() should try lease recovery on NFS4ERR_EXPIRED
  2019-12-16 17:54 [PATCH] NFSv4: open() should try lease recovery on NFS4ERR_EXPIRED Robert Milkowski
  2019-12-16 18:36 ` Robert Milkowski
@ 2019-12-16 18:39 ` Trond Myklebust
  1 sibling, 0 replies; 3+ messages in thread
From: Trond Myklebust @ 2019-12-16 18:39 UTC (permalink / raw)
  To: linux-nfs, rmilkowski; +Cc: anna.schumaker, linux-kernel

On Mon, 2019-12-16 at 17:54 +0000, Robert Milkowski wrote:
> From: Robert Milkowski <rmilkowski@gmail.com>
> 
> Currently, if an nfs server returns NFS4ERR_EXPIRED to open,
> we return EIO to applications without even trying to recover.
> 
> Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> ---
>  fs/nfs/nfs4proc.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index aad65dd..04e6a13 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -481,6 +481,10 @@ static int nfs4_do_handle_exception(struct
> nfs_server
> *server,
>  						stateid);
>  				goto wait_on_recovery;
>  			}
> +			if (state == NULL) {
> +				nfs4_schedule_lease_recovery(clp);
> +				goto wait_on_recovery;
> +			}
>  			/* Fall through */
>  		case -NFS4ERR_OPENMODE:
>  			if (inode) {

Ugh. If we need a state==NULL for NFSv4.0, then let's get rid of the
switch() fall through in the above case, and rather add a delegation
return to the EXPIRED handling.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-12-16 18:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-16 17:54 [PATCH] NFSv4: open() should try lease recovery on NFS4ERR_EXPIRED Robert Milkowski
2019-12-16 18:36 ` Robert Milkowski
2019-12-16 18:39 ` Trond Myklebust

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).