linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals
@ 2019-12-16 17:38 Robert Milkowski
  2019-12-16 18:24 ` Trond Myklebust
  2019-12-16 18:36 ` Robert Milkowski
  0 siblings, 2 replies; 9+ messages in thread
From: Robert Milkowski @ 2019-12-16 17:38 UTC (permalink / raw)
  To: linux-nfs
  Cc: 'Trond Myklebust', 'Anna Schumaker',
	linux-kernel, linux-nfs

From: Robert Milkowski <rmilkowski@gmail.com>

Currently, each time nfs4_do_fsinfo() is called it will do an implicit
NFS4 lease renewal, which is not compliant with the NFS4 specification.
This can result in a lease being expired by NFS server which will then
return NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.

Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
---
 fs/nfs/nfs4proc.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 76d3716..aad65dd 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5019,16 +5019,19 @@ static int nfs4_do_fsinfo(struct nfs_server *server,
struct nfs_fh *fhandle, str
 	struct nfs4_exception exception = {
 		.interruptible = true,
 	};
-	unsigned long now = jiffies;
+	unsigned long last_renewal = jiffies;
 	int err;
 
 	do {
 		err = _nfs4_do_fsinfo(server, fhandle, fsinfo);
 		trace_nfs4_fsinfo(server, fhandle, fsinfo->fattr, err);
 		if (err == 0) {
+			/* no implicit lease renewal allowed here */
+			if (server->nfs_client->cl_last_renewal != 0)
+				last_renewal =
server->nfs_client->cl_last_renewal;
 			nfs4_set_lease_period(server->nfs_client,
 					fsinfo->lease_time * HZ,
-					now);
+					last_renewal);
 			break;
 		}
 		err = nfs4_handle_exception(server, err, &exception);
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals
  2019-12-16 17:38 [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals Robert Milkowski
@ 2019-12-16 18:24 ` Trond Myklebust
  2019-12-16 18:43   ` Robert Milkowski
  2019-12-16 18:36 ` Robert Milkowski
  1 sibling, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2019-12-16 18:24 UTC (permalink / raw)
  To: linux-nfs, rmilkowski; +Cc: anna.schumaker, linux-kernel

On Mon, 2019-12-16 at 17:38 +0000, Robert Milkowski wrote:
> From: Robert Milkowski <rmilkowski@gmail.com>
> 
> Currently, each time nfs4_do_fsinfo() is called it will do an
> implicit
> NFS4 lease renewal, which is not compliant with the NFS4
> specification.
> This can result in a lease being expired by NFS server which will
> then
> return NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
> 
> Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> ---
>  fs/nfs/nfs4proc.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 76d3716..aad65dd 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5019,16 +5019,19 @@ static int nfs4_do_fsinfo(struct nfs_server
> *server,
> struct nfs_fh *fhandle, str
>  	struct nfs4_exception exception = {
>  		.interruptible = true,
>  	};
> -	unsigned long now = jiffies;
> +	unsigned long last_renewal = jiffies;
>  	int err;
>  
>  	do {
>  		err = _nfs4_do_fsinfo(server, fhandle, fsinfo);
>  		trace_nfs4_fsinfo(server, fhandle, fsinfo->fattr, err);
>  		if (err == 0) {
> +			/* no implicit lease renewal allowed here */
> +			if (server->nfs_client->cl_last_renewal != 0)
> +				last_renewal =
> server->nfs_client->cl_last_renewal;
>  			nfs4_set_lease_period(server->nfs_client,
>  					fsinfo->lease_time * HZ,
> -					now);
> +					last_renewal);
>  			break;
>  		}
>  		err = nfs4_handle_exception(server, err, &exception);

NACK. The above argument only applies to legacy minor version 0 setups,
and does not apply to NFSv4.1 or newer.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals
  2019-12-16 17:38 [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals Robert Milkowski
  2019-12-16 18:24 ` Trond Myklebust
@ 2019-12-16 18:36 ` Robert Milkowski
  2019-12-16 18:41   ` Robert Milkowski
  1 sibling, 1 reply; 9+ messages in thread
From: Robert Milkowski @ 2019-12-16 18:36 UTC (permalink / raw)
  To: linux-nfs
  Cc: 'Trond Myklebust', 'Anna Schumaker',
	linux-kernel, linux-nfs

Hi,

If a sub-filesystem (nfsv4 mirror mount) gets unmounted and then mounted
again (by accessing it) the nfs4_do_fsinfo() function is called,
which currently assumes implicit lease renewal. I believe this is no
compliant with the RFC.

I've managed to trigger the issue by two different methods:

1) in prod

If there is an NFSv4 filesystem mounted with sub-mounts (similar setup as
below), after nfs_mountpoint_expiry_timeout of inactivity, 
Each submount will be unmounted. If now it gets accessed again it will be
automatically mounted again which will result currently
in implicit lease renewal on the client side which in turn can result in a
relatively small window where the client thinks its lease
is still valid while an nfs server has already expired the lease.

2) manual unmount

# cat /etc/exports
/ *(rw,sync)
/var *(rw,sync)


On a Linux NFS client:

# mount -o vers=4 10.50.2.59:/ /mnt/3
$ head /mnt/3/var/log/vmware-vmsvc.log >/dev/null
$ df -h | tail -2
10.50.2.59:/                                                  29G   25G
2.3G  92% /mnt/3
10.50.2.59:/var                                               20G  2.2G
16G  13% /mnt/3/var

# while [ 1 ]; do date; umount /mnt/3/var; ls /mnt/3/var >/dev/null; sleep
10; done
...

By constantly unmounting the sub-filesystem (/var) and then accessing it so
it gets mounted again (which triggers the nfs4_do_fsinfo()),
the cl_last_renewal is set to now on the client which prevents RENEW
operations from being send, and eventually the NFS server
will expire the lease (common defaults are 60s on Linux and 90s on Solaris
servers).

In testing I confirmed that both Linux and Solaris NFSv4 servers will not do
an implicit lease renewal in this case
(nfs4_do_fsinfo() results in GETATTR operations being send), in which case
the lease might expire and both Linux and Solaris
NFS servers will return NFS4ERR_EXPIRED.

The error is not handled correctly either and will result in EIO propagated
to an application issuing open().
See my other email with subject: [PATCH] NFSv4: open() should try lease
recovery on NFS4ERR_EXPIRED
which contains a fix for the NFS4ERR_EXPIRED handling.

This patch however fixes the issue with implicit lease renewal by not
setting the cl_last_renewal to now,
unless there is no lease set yet.

This was tested with 5.5.0-rc2 and the provided patch is applied on top of
the 5.5.0-rc2 as well.


btw: Recent ONTAP versions return NFS4ERR_STALE_CLIENTID which is handled
correctly - Linux client will try to renew its lease and if successful it
will retry open().
     


Best regards,
 Robert Milkowski



^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals
  2019-12-16 18:36 ` Robert Milkowski
@ 2019-12-16 18:41   ` Robert Milkowski
  0 siblings, 0 replies; 9+ messages in thread
From: Robert Milkowski @ 2019-12-16 18:41 UTC (permalink / raw)
  To: linux-nfs
  Cc: 'Trond Myklebust', 'Anna Schumaker',
	linux-kernel, linux-nfs


>btw: Recent ONTAP versions return NFS4ERR_STALE_CLIENTID which is handled
correctly - Linux client will try to renew its lease and if successful it
will retry open().
 
One more comment here, although the issue won't manifest to applications
with NetApp fileservers in this case, it is still wrong for the client to
end up in such a state.
Also, you will need to wait longer (keep unmounting/mounting) with NetApp
filers as they seem to have implemented courtesy locks/delayed lease
expiration as per the RFC.
This means to reproduce the issue against NetApp you need to wait at least
2x grace period + 1 minute).


    
Best regards,
 Robert Milkowski




^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals
  2019-12-16 18:24 ` Trond Myklebust
@ 2019-12-16 18:43   ` Robert Milkowski
  2019-12-16 18:58     ` Trond Myklebust
  2019-12-16 18:58     ` Chuck Lever
  0 siblings, 2 replies; 9+ messages in thread
From: Robert Milkowski @ 2019-12-16 18:43 UTC (permalink / raw)
  To: 'Trond Myklebust', linux-nfs; +Cc: anna.schumaker, linux-kernel



> From: Trond Myklebust <trondmy@hammerspace.com> 
...
>NACK. The above argument only applies to legacy minor version 0 setups, and does not apply to NFSv4.1 or newer.

Correct. However many sites still use v4.0.


Best regards,
 Robert Milkowski


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals
  2019-12-16 18:43   ` Robert Milkowski
@ 2019-12-16 18:58     ` Trond Myklebust
  2019-12-17 18:12       ` Robert Milkowski
  2019-12-16 18:58     ` Chuck Lever
  1 sibling, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2019-12-16 18:58 UTC (permalink / raw)
  To: linux-nfs, rmilkowski; +Cc: anna.schumaker, linux-kernel

-On Mon, 2019-12-16 at 18:43 +0000, Robert Milkowski wrote:
> > From: Trond Myklebust <trondmy@hammerspace.com> 
> ...
> > NACK. The above argument only applies to legacy minor version 0
> > setups, and does not apply to NFSv4.1 or newer.
> 
> Correct. However many sites still use v4.0.
> 

That's not a good reason to break code that works just fine for
NFSv4.1.

It would be better to move the initialisation of clp->cl_last_renewal
into nfs4_init_clientid() and nfs41_init_clientid() (after the calls to
nfs4_proc_setclientid_confirm() and nfs4_proc_create_session()
respectively).

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals
  2019-12-16 18:43   ` Robert Milkowski
  2019-12-16 18:58     ` Trond Myklebust
@ 2019-12-16 18:58     ` Chuck Lever
  1 sibling, 0 replies; 9+ messages in thread
From: Chuck Lever @ 2019-12-16 18:58 UTC (permalink / raw)
  To: Robert Milkowski
  Cc: Linux NFS Mailing List, Trond Myklebust, Anna Schumaker,
	Linux Kernel Mailing List



> On Dec 16, 2019, at 1:43 PM, Robert Milkowski <rmilkowski@gmail.com> wrote:
> 
> 
> 
>> From: Trond Myklebust <trondmy@hammerspace.com> 
> ...
>> NACK. The above argument only applies to legacy minor version 0 setups, and does not apply to NFSv4.1 or newer.
> 
> Correct. However many sites still use v4.0.

v4.1 and newer actually do renew the lease with the FSINFO COMPOUND,
since the COMPOUND starts with a SEQUENCE operation.

For v4.0, you can get the equivalent behavior by adding a RENEW to
the end of the FSINFO COMPOUND.


--
Chuck Lever




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals
  2019-12-16 18:58     ` Trond Myklebust
@ 2019-12-17 18:12       ` Robert Milkowski
  2019-12-18 14:48         ` Robert Milkowski
  0 siblings, 1 reply; 9+ messages in thread
From: Robert Milkowski @ 2019-12-17 18:12 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: chuck.lever, linux-nfs, anna.schumaker, linux-kernel

On Mon, 16 Dec 2019 at 18:58, Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> -On Mon, 2019-12-16 at 18:43 +0000, Robert Milkowski wrote:
> > > From: Trond Myklebust <trondmy@hammerspace.com>
> > ...
> > > NACK. The above argument only applies to legacy minor version 0
> > > setups, and does not apply to NFSv4.1 or newer.
> >
> > Correct. However many sites still use v4.0.
> >
>
> That's not a good reason to break code that works just fine for
> NFSv4.1.
>

Of course not, that's not what I meant and I misunderstood your nack too.

> It would be better to move the initialisation of clp->cl_last_renewal
> into nfs4_init_clientid() and nfs41_init_clientid() (after the calls to
> nfs4_proc_setclientid_confirm() and nfs4_proc_create_session()
> respectively).
>

This could be done but this is potentially a separate change, as in
nfs4_do_fsinfo() we still need to
make sure we do not implicitly renew lease for v4.0, so I think the
patch needs to be modified as:

...
+                       /* no implicit lease renewal allowed here for v4.0 */
+                       if (server->nfs_client->cl_minorversion == 0
&& server->nfs_client->cl_last_renewal != 0)
+                               last_renewal =
server->nfs_client->cl_last_renewal;
                        nfs4_set_lease_period(server->nfs_client,
                                        fsinfo->lease_time * HZ,
-                                       now);
+                                       last_renewal);
...

This way it won't affect newer nfs versions (I don't think it affected
them in any bad way, still it was unintended and not correct).
Agree with the above change?

btw: Chuck, thanks for the hint regarding the SEQUENCE op on v4.1+

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals
  2019-12-17 18:12       ` Robert Milkowski
@ 2019-12-18 14:48         ` Robert Milkowski
  0 siblings, 0 replies; 9+ messages in thread
From: Robert Milkowski @ 2019-12-18 14:48 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: chuck.lever, linux-nfs, anna.schumaker, linux-kernel

On Tue, 17 Dec 2019 at 18:12, Robert Milkowski <rmilkowski@gmail.com> wrote:
>
> On Mon, 16 Dec 2019 at 18:58, Trond Myklebust <trondmy@hammerspace.com> wrote:
> >
> > It would be better to move the initialisation of clp->cl_last_renewal
> > into nfs4_init_clientid() and nfs41_init_clientid() (after the calls to
> > nfs4_proc_setclientid_confirm() and nfs4_proc_create_session()
> > respectively).
> >
>
> This could be done but this is potentially a separate change, as in
> nfs4_do_fsinfo() we still need to
> make sure we do not implicitly renew lease for v4.0, so I think the
> patch needs to be modified as:

I took a look at the client initialization and the
nfs4_init_clientid() already calls nfs4_setup_state_renewal(),
which in turn calls nfs4_proc_get_lease_time(), however that might
return with an error in which case it won't set the cl_last_renewal,
the code is:

static int nfs4_setup_state_renewal(struct nfs_client *clp)
...
        status = nfs4_proc_get_lease_time(clp, &fsinfo);
        if (status == 0) {
                nfs4_set_lease_period(clp, fsinfo.lease_time * HZ, now);
...

In my environment with nfsv4+krb the nfs4_proc_get_lease_time()
returns with -10016 (NFS4ERR_WRONGSEC) during initial mount (which
after a quick scan of the rfc seems that might be ok initially).
Also for v4.1 do_renew_lease() is called by nfs41_sequence_process()
multiple times during mount before we even reach nfs4_do_fsinfo() so
for 4.1+ it never gets cl_last_renewal==0, at least not under this
circumstances.

There might be other cases where nfs4_do_fsinfo() will get
clp->cl_last_renewal=0 for nfsv4.0, and since the nfs4_do_fsinfo()
already initializes the cl_last_renewal record, plus as Chuck pointed
out in case of v4.1+
it is expected to implicitly renew the lease anyway, I would rather
focus on fixing only the reported issue here for now, which is the
incorrect implicit renewal in fsinfo for v4.0, and leave improvements
to client initialization for another day.

I will send an updated patch shortly which doesn't impact v4.1+.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-12-18 14:48 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-16 17:38 [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals Robert Milkowski
2019-12-16 18:24 ` Trond Myklebust
2019-12-16 18:43   ` Robert Milkowski
2019-12-16 18:58     ` Trond Myklebust
2019-12-17 18:12       ` Robert Milkowski
2019-12-18 14:48         ` Robert Milkowski
2019-12-16 18:58     ` Chuck Lever
2019-12-16 18:36 ` Robert Milkowski
2019-12-16 18:41   ` Robert Milkowski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).