linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
@ 2019-12-30 15:20 Robert Milkowski
  2019-12-30 15:36 ` Chuck Lever
  0 siblings, 1 reply; 11+ messages in thread
From: Robert Milkowski @ 2019-12-30 15:20 UTC (permalink / raw)
  To: linux-nfs
  Cc: 'Chuck Lever', 'Trond Myklebust',
	'Anna Schumaker',
	linux-kernel

From: Robert Milkowski <rmilkowski@gmail.com>

Currently, each time nfs4_do_fsinfo() is called it will do an implicit
NFS4 lease renewal, which is not compliant with the NFS4 specification.
This can result in a lease being expired by an NFS server.

Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
introduced implicit client lease renewal in nfs4_do_fsinfo(),
which can result in the NFSv4.0 lease to expire on a server side,
and servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.

This can easily be reproduced by frequently unmounting a sub-mount,
then stat'ing it to get it mounted again, which will delay or even
completely prevent client from sending RENEW operations if no other
NFS operations are issued. Eventually nfs server will expire client's
lease and return an error on file access or next RENEW.

This can also happen when a sub-mount is automatically unmounted
due to inactivity (after nfs_mountpoint_expiry_timeout), then it is
mounted again via stat(). This can result in a short window during
which client's lease will expire on a server but not on a client.
This specific case was observed on production systems.

This patch makes an explicit lease renewal instead of an implicit one,
by adding RENEW to a compound operation issued by nfs4_do_fsinfo(),
similarly to NFSv4.1 which adds SEQUENCE operation.

Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
---
 fs/nfs/nfs4proc.c       |  4 ++++
 fs/nfs/nfs4xdr.c        | 13 +++++++++++--
 include/linux/nfs_xdr.h |  3 +++
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 76d3716..6d075f0 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -4998,12 +4998,16 @@ static int nfs4_proc_statfs(struct nfs_server *server, struct nfs_fh *fhandle, s
 static int _nfs4_do_fsinfo(struct nfs_server *server, struct nfs_fh *fhandle,
 		struct nfs_fsinfo *fsinfo)
 {
+	struct nfs_client *clp = server->nfs_client;
 	struct nfs4_fsinfo_arg args = {
 		.fh = fhandle,
 		.bitmask = server->attr_bitmask,
+		.clientid = clp->cl_clientid,
+		.renew = nfs4_has_session(clp) ? 0 : 1,		/* append RENEW */
 	};
 	struct nfs4_fsinfo_res res = {
 		.fsinfo = fsinfo,
+		.renew = nfs4_has_session(clp) ? 0 : 1,
 	};
 	struct rpc_message msg = {
 		.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_FSINFO],
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 936c577..0ce9a10 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -555,11 +555,13 @@ static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req,
 #define NFS4_enc_fsinfo_sz	(compound_encode_hdr_maxsz + \
 				encode_sequence_maxsz + \
 				encode_putfh_maxsz + \
-				encode_fsinfo_maxsz)
+				encode_fsinfo_maxsz + \
+				encode_renew_maxsz)
 #define NFS4_dec_fsinfo_sz	(compound_decode_hdr_maxsz + \
 				decode_sequence_maxsz + \
 				decode_putfh_maxsz + \
-				decode_fsinfo_maxsz)
+				decode_fsinfo_maxsz + \
+				decode_renew_maxsz)
 #define NFS4_enc_renew_sz	(compound_encode_hdr_maxsz + \
 				encode_renew_maxsz)
 #define NFS4_dec_renew_sz	(compound_decode_hdr_maxsz + \
@@ -2646,6 +2648,8 @@ static void nfs4_xdr_enc_fsinfo(struct rpc_rqst *req, struct xdr_stream *xdr,
 	encode_sequence(xdr, &args->seq_args, &hdr);
 	encode_putfh(xdr, args->fh, &hdr);
 	encode_fsinfo(xdr, args->bitmask, &hdr);
+	if (args->renew)
+		encode_renew(xdr, args->clientid, &hdr);
 	encode_nops(&hdr);
 }
 
@@ -6778,6 +6782,11 @@ static int nfs4_xdr_dec_fsinfo(struct rpc_rqst *req, struct xdr_stream *xdr,
 		status = decode_putfh(xdr);
 	if (!status)
 		status = decode_fsinfo(xdr, res->fsinfo);
+	if (status)
+		goto out;
+	if (res->renew)
+		status = decode_renew(xdr);
+out:
 	return status;
 }
 
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 72d5695..49bd673 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1025,11 +1025,14 @@ struct nfs4_fsinfo_arg {
 	struct nfs4_sequence_args	seq_args;
 	const struct nfs_fh *		fh;
 	const u32 *			bitmask;
+	clientid4			clientid;
+	unsigned char			renew:1;
 };
 
 struct nfs4_fsinfo_res {
 	struct nfs4_sequence_res	seq_res;
 	struct nfs_fsinfo	       *fsinfo;
+	unsigned char			renew:1;
 };
 
 struct nfs4_getattr_arg {
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
  2019-12-30 15:20 [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals Robert Milkowski
@ 2019-12-30 15:36 ` Chuck Lever
  2020-01-20 17:55   ` Robert Milkowski
  0 siblings, 1 reply; 11+ messages in thread
From: Chuck Lever @ 2019-12-30 15:36 UTC (permalink / raw)
  To: Robert Milkowski
  Cc: Linux NFS Mailing List, Trond Myklebust, Anna Schumaker, linux-kernel



> On Dec 30, 2019, at 10:20 AM, Robert Milkowski <rmilkowski@gmail.com> wrote:
> 
> From: Robert Milkowski <rmilkowski@gmail.com>
> 
> Currently, each time nfs4_do_fsinfo() is called it will do an implicit
> NFS4 lease renewal, which is not compliant with the NFS4 specification.
> This can result in a lease being expired by an NFS server.
> 
> Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
> introduced implicit client lease renewal in nfs4_do_fsinfo(),
> which can result in the NFSv4.0 lease to expire on a server side,
> and servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
> 
> This can easily be reproduced by frequently unmounting a sub-mount,
> then stat'ing it to get it mounted again, which will delay or even
> completely prevent client from sending RENEW operations if no other
> NFS operations are issued. Eventually nfs server will expire client's
> lease and return an error on file access or next RENEW.
> 
> This can also happen when a sub-mount is automatically unmounted
> due to inactivity (after nfs_mountpoint_expiry_timeout), then it is
> mounted again via stat(). This can result in a short window during
> which client's lease will expire on a server but not on a client.
> This specific case was observed on production systems.
> 
> This patch makes an explicit lease renewal instead of an implicit one,
> by adding RENEW to a compound operation issued by nfs4_do_fsinfo(),
> similarly to NFSv4.1 which adds SEQUENCE operation.
> 
> Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
> Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>


> ---
> fs/nfs/nfs4proc.c       |  4 ++++
> fs/nfs/nfs4xdr.c        | 13 +++++++++++--
> include/linux/nfs_xdr.h |  3 +++
> 3 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 76d3716..6d075f0 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -4998,12 +4998,16 @@ static int nfs4_proc_statfs(struct nfs_server *server, struct nfs_fh *fhandle, s
> static int _nfs4_do_fsinfo(struct nfs_server *server, struct nfs_fh *fhandle,
> 		struct nfs_fsinfo *fsinfo)
> {
> +	struct nfs_client *clp = server->nfs_client;
> 	struct nfs4_fsinfo_arg args = {
> 		.fh = fhandle,
> 		.bitmask = server->attr_bitmask,
> +		.clientid = clp->cl_clientid,
> +		.renew = nfs4_has_session(clp) ? 0 : 1,		/* append RENEW */
> 	};
> 	struct nfs4_fsinfo_res res = {
> 		.fsinfo = fsinfo,
> +		.renew = nfs4_has_session(clp) ? 0 : 1,
> 	};
> 	struct rpc_message msg = {
> 		.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_FSINFO],
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index 936c577..0ce9a10 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -555,11 +555,13 @@ static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req,
> #define NFS4_enc_fsinfo_sz	(compound_encode_hdr_maxsz + \
> 				encode_sequence_maxsz + \
> 				encode_putfh_maxsz + \
> -				encode_fsinfo_maxsz)
> +				encode_fsinfo_maxsz + \
> +				encode_renew_maxsz)
> #define NFS4_dec_fsinfo_sz	(compound_decode_hdr_maxsz + \
> 				decode_sequence_maxsz + \
> 				decode_putfh_maxsz + \
> -				decode_fsinfo_maxsz)
> +				decode_fsinfo_maxsz + \
> +				decode_renew_maxsz)
> #define NFS4_enc_renew_sz	(compound_encode_hdr_maxsz + \
> 				encode_renew_maxsz)
> #define NFS4_dec_renew_sz	(compound_decode_hdr_maxsz + \
> @@ -2646,6 +2648,8 @@ static void nfs4_xdr_enc_fsinfo(struct rpc_rqst *req, struct xdr_stream *xdr,
> 	encode_sequence(xdr, &args->seq_args, &hdr);
> 	encode_putfh(xdr, args->fh, &hdr);
> 	encode_fsinfo(xdr, args->bitmask, &hdr);
> +	if (args->renew)
> +		encode_renew(xdr, args->clientid, &hdr);
> 	encode_nops(&hdr);
> }
> 
> @@ -6778,6 +6782,11 @@ static int nfs4_xdr_dec_fsinfo(struct rpc_rqst *req, struct xdr_stream *xdr,
> 		status = decode_putfh(xdr);
> 	if (!status)
> 		status = decode_fsinfo(xdr, res->fsinfo);
> +	if (status)
> +		goto out;
> +	if (res->renew)
> +		status = decode_renew(xdr);
> +out:
> 	return status;
> }
> 
> diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> index 72d5695..49bd673 100644
> --- a/include/linux/nfs_xdr.h
> +++ b/include/linux/nfs_xdr.h
> @@ -1025,11 +1025,14 @@ struct nfs4_fsinfo_arg {
> 	struct nfs4_sequence_args	seq_args;
> 	const struct nfs_fh *		fh;
> 	const u32 *			bitmask;
> +	clientid4			clientid;
> +	unsigned char			renew:1;
> };
> 
> struct nfs4_fsinfo_res {
> 	struct nfs4_sequence_res	seq_res;
> 	struct nfs_fsinfo	       *fsinfo;
> +	unsigned char			renew:1;
> };
> 
> struct nfs4_getattr_arg {
> -- 
> 1.8.3.1
> 
> 

--
Chuck Lever




^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
  2019-12-30 15:36 ` Chuck Lever
@ 2020-01-20 17:55   ` Robert Milkowski
  2020-01-22 19:10     ` Schumaker, Anna
  0 siblings, 1 reply; 11+ messages in thread
From: Robert Milkowski @ 2020-01-20 17:55 UTC (permalink / raw)
  To: 'Chuck Lever', 'Trond Myklebust'
  Cc: 'Linux NFS Mailing List', 'Trond Myklebust',
	'Anna Schumaker',
	linux-kernel



> -----Original Message-----
> From: Chuck Lever <chuck.lever@oracle.com>
> Sent: 30 December 2019 15:37
> To: Robert Milkowski <rmilkowski@gmail.com>
> Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>; Trond Myklebust
> <trond.myklebust@hammerspace.com>; Anna Schumaker
> <anna.schumaker@netapp.com>; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit
> lease renewals
> 
> 
> 
> > On Dec 30, 2019, at 10:20 AM, Robert Milkowski <rmilkowski@gmail.com>
> wrote:
> >
> > From: Robert Milkowski <rmilkowski@gmail.com>
> >
> > Currently, each time nfs4_do_fsinfo() is called it will do an implicit
> > NFS4 lease renewal, which is not compliant with the NFS4
> specification.
> > This can result in a lease being expired by an NFS server.
> >
> > Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
> > introduced implicit client lease renewal in nfs4_do_fsinfo(), which
> > can result in the NFSv4.0 lease to expire on a server side, and
> > servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
> >
> > This can easily be reproduced by frequently unmounting a sub-mount,
> > then stat'ing it to get it mounted again, which will delay or even
> > completely prevent client from sending RENEW operations if no other
> > NFS operations are issued. Eventually nfs server will expire client's
> > lease and return an error on file access or next RENEW.
> >
> > This can also happen when a sub-mount is automatically unmounted due
> > to inactivity (after nfs_mountpoint_expiry_timeout), then it is
> > mounted again via stat(). This can result in a short window during
> > which client's lease will expire on a server but not on a client.
> > This specific case was observed on production systems.
> >
> > This patch makes an explicit lease renewal instead of an implicit one,
> > by adding RENEW to a compound operation issued by nfs4_do_fsinfo(),
> > similarly to NFSv4.1 which adds SEQUENCE operation.
> >
> > Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
> > Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> 
> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> 
> 

How do we progress it further?

-- 
Robert Milkowski




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
  2020-01-20 17:55   ` Robert Milkowski
@ 2020-01-22 19:10     ` Schumaker, Anna
  2020-01-22 19:55       ` Robert Milkowski
  2020-01-23 19:08       ` Trond Myklebust
  0 siblings, 2 replies; 11+ messages in thread
From: Schumaker, Anna @ 2020-01-22 19:10 UTC (permalink / raw)
  To: rmilkowski, chuck.lever, trondmy; +Cc: linux-nfs, linux-kernel, trond.myklebust

Hi Robert,

On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote:
> > -----Original Message-----
> > From: Chuck Lever <chuck.lever@oracle.com>
> > Sent: 30 December 2019 15:37
> > To: Robert Milkowski <rmilkowski@gmail.com>
> > Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>; Trond Myklebust
> > <trond.myklebust@hammerspace.com>; Anna Schumaker
> > <anna.schumaker@netapp.com>; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit
> > lease renewals
> > 
> > 
> > 
> > > On Dec 30, 2019, at 10:20 AM, Robert Milkowski <rmilkowski@gmail.com>
> > wrote:
> > > From: Robert Milkowski <rmilkowski@gmail.com>
> > > 
> > > Currently, each time nfs4_do_fsinfo() is called it will do an implicit
> > > NFS4 lease renewal, which is not compliant with the NFS4
> > specification.
> > > This can result in a lease being expired by an NFS server.
> > > 
> > > Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
> > > introduced implicit client lease renewal in nfs4_do_fsinfo(), which
> > > can result in the NFSv4.0 lease to expire on a server side, and
> > > servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
> > > 
> > > This can easily be reproduced by frequently unmounting a sub-mount,
> > > then stat'ing it to get it mounted again, which will delay or even
> > > completely prevent client from sending RENEW operations if no other
> > > NFS operations are issued. Eventually nfs server will expire client's
> > > lease and return an error on file access or next RENEW.
> > > 
> > > This can also happen when a sub-mount is automatically unmounted due
> > > to inactivity (after nfs_mountpoint_expiry_timeout), then it is
> > > mounted again via stat(). This can result in a short window during
> > > which client's lease will expire on a server but not on a client.
> > > This specific case was observed on production systems.
> > > 
> > > This patch makes an explicit lease renewal instead of an implicit one,
> > > by adding RENEW to a compound operation issued by nfs4_do_fsinfo(),
> > > similarly to NFSv4.1 which adds SEQUENCE operation.
> > > 
> > > Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
> > > Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> > 
> > Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> > 
> > 
> 
> How do we progress it further?

Thanks for following up! I have the patch included in my linux-next branch for
the next merge window.

Anna

> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
  2020-01-22 19:10     ` Schumaker, Anna
@ 2020-01-22 19:55       ` Robert Milkowski
  2020-01-23 19:08       ` Trond Myklebust
  1 sibling, 0 replies; 11+ messages in thread
From: Robert Milkowski @ 2020-01-22 19:55 UTC (permalink / raw)
  To: 'Schumaker, Anna', chuck.lever, trondmy
  Cc: linux-nfs, linux-kernel, trond.myklebust



> -----Original Message-----
> From: Schumaker, Anna <Anna.Schumaker@netapp.com>
> Sent: 22 January 2020 19:11
> To: rmilkowski@gmail.com; chuck.lever@oracle.com; trondmy@hammerspace.com
> Cc: linux-nfs@vger.kernel.org; linux-kernel@vger.kernel.org;
> trond.myklebust@hammerspace.com
> Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit
> lease renewals
> 
> Hi Robert,
> 
> On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote:
> > > -----Original Message-----
> > > From: Chuck Lever <chuck.lever@oracle.com>
> > > Sent: 30 December 2019 15:37
> > > To: Robert Milkowski <rmilkowski@gmail.com>
> > > Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>; Trond
> > > Myklebust <trond.myklebust@hammerspace.com>; Anna Schumaker
> > > <anna.schumaker@netapp.com>; linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do
> > > implicit lease renewals
> > >
> > >
> > >
> > > > On Dec 30, 2019, at 10:20 AM, Robert Milkowski
> > > > <rmilkowski@gmail.com>
> > > wrote:
> > > > From: Robert Milkowski <rmilkowski@gmail.com>
> > > >
> > > > Currently, each time nfs4_do_fsinfo() is called it will do an
> > > > implicit
> > > > NFS4 lease renewal, which is not compliant with the NFS4
> > > specification.
> > > > This can result in a lease being expired by an NFS server.
> > > >
> > > > Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
> > > > introduced implicit client lease renewal in nfs4_do_fsinfo(),
> > > > which can result in the NFSv4.0 lease to expire on a server side,
> > > > and servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
> > > >
> > > > This can easily be reproduced by frequently unmounting a
> > > > sub-mount, then stat'ing it to get it mounted again, which will
> > > > delay or even completely prevent client from sending RENEW
> > > > operations if no other NFS operations are issued. Eventually nfs
> > > > server will expire client's lease and return an error on file access
> or next RENEW.
> > > >
> > > > This can also happen when a sub-mount is automatically unmounted
> > > > due to inactivity (after nfs_mountpoint_expiry_timeout), then it
> > > > is mounted again via stat(). This can result in a short window
> > > > during which client's lease will expire on a server but not on a
> client.
> > > > This specific case was observed on production systems.
> > > >
> > > > This patch makes an explicit lease renewal instead of an implicit
> > > > one, by adding RENEW to a compound operation issued by
> > > > nfs4_do_fsinfo(), similarly to NFSv4.1 which adds SEQUENCE
> operation.
> > > >
> > > > Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
> > > > Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> > >
> > > Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> > >
> > >
> >
> > How do we progress it further?
> 
> Thanks for following up! I have the patch included in my linux-next branch
> for the next merge window.
> 
> Anna

Nice. Thanks!


-- 
Robert Milkowski



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
  2020-01-22 19:10     ` Schumaker, Anna
  2020-01-22 19:55       ` Robert Milkowski
@ 2020-01-23 19:08       ` Trond Myklebust
  2020-01-27 14:45         ` Robert Milkowski
  1 sibling, 1 reply; 11+ messages in thread
From: Trond Myklebust @ 2020-01-23 19:08 UTC (permalink / raw)
  To: rmilkowski, Anna.Schumaker, chuck.lever; +Cc: linux-nfs, linux-kernel

On Wed, 2020-01-22 at 19:10 +0000, Schumaker, Anna wrote:
> Hi Robert,
> 
> On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote:
> > > -----Original Message-----
> > > From: Chuck Lever <chuck.lever@oracle.com>
> > > Sent: 30 December 2019 15:37
> > > To: Robert Milkowski <rmilkowski@gmail.com>
> > > Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>; Trond
> > > Myklebust
> > > <trond.myklebust@hammerspace.com>; Anna Schumaker
> > > <anna.schumaker@netapp.com>; linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do
> > > implicit
> > > lease renewals
> > > 
> > > 
> > > 
> > > > On Dec 30, 2019, at 10:20 AM, Robert Milkowski <
> > > > rmilkowski@gmail.com>
> > > wrote:
> > > > From: Robert Milkowski <rmilkowski@gmail.com>
> > > > 
> > > > Currently, each time nfs4_do_fsinfo() is called it will do an
> > > > implicit
> > > > NFS4 lease renewal, which is not compliant with the NFS4
> > > specification.
> > > > This can result in a lease being expired by an NFS server.
> > > > 
> > > > Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> > > > leases")
> > > > introduced implicit client lease renewal in nfs4_do_fsinfo(),
> > > > which
> > > > can result in the NFSv4.0 lease to expire on a server side, and
> > > > servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
> > > > 
> > > > This can easily be reproduced by frequently unmounting a sub-
> > > > mount,
> > > > then stat'ing it to get it mounted again, which will delay or
> > > > even
> > > > completely prevent client from sending RENEW operations if no
> > > > other
> > > > NFS operations are issued. Eventually nfs server will expire
> > > > client's
> > > > lease and return an error on file access or next RENEW.
> > > > 
> > > > This can also happen when a sub-mount is automatically
> > > > unmounted due
> > > > to inactivity (after nfs_mountpoint_expiry_timeout), then it is
> > > > mounted again via stat(). This can result in a short window
> > > > during
> > > > which client's lease will expire on a server but not on a
> > > > client.
> > > > This specific case was observed on production systems.
> > > > 
> > > > This patch makes an explicit lease renewal instead of an
> > > > implicit one,
> > > > by adding RENEW to a compound operation issued by
> > > > nfs4_do_fsinfo(),
> > > > similarly to NFSv4.1 which adds SEQUENCE operation.
> > > > 
> > > > Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> > > > leases")
> > > > Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> > > 
> > > Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> > > 
> > > 
> > 
> > How do we progress it further?
> 
> Thanks for following up! I have the patch included in my linux-next
> branch for
> the next merge window.

NACK. This is the wrong way to solve the problem. Lease renewal in
NFSv4 does not need to be tied to fsinfo. It creates an unnecessary
extra error condition that has absolutely nothing to do with the
functionality of retrieving per-filesystem attributes.

All that needs to be done here is to move the setting of clp-
>cl_last_renewal _out_ of nfs4_set_lease_period(), and just have
nfs4_proc_setclientid_confirm() and nfs4_update_session() call
do_renew_lease().

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
  2020-01-23 19:08       ` Trond Myklebust
@ 2020-01-27 14:45         ` Robert Milkowski
  2020-01-27 15:05           ` Chuck Lever
  2020-01-27 15:54           ` Trond Myklebust
  0 siblings, 2 replies; 11+ messages in thread
From: Robert Milkowski @ 2020-01-27 14:45 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Anna.Schumaker, chuck.lever, linux-nfs, linux-kernel

On Thu, 23 Jan 2020 at 19:08, Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> On Wed, 2020-01-22 at 19:10 +0000, Schumaker, Anna wrote:
> > Hi Robert,
> >
> > On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote:
> > > > -----Original Message-----
> > > > From: Chuck Lever <chuck.lever@oracle.com>
> > > > Sent: 30 December 2019 15:37
> > > > To: Robert Milkowski <rmilkowski@gmail.com>
> > > > Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>; Trond
> > > > Myklebust
> > > > <trond.myklebust@hammerspace.com>; Anna Schumaker
> > > > <anna.schumaker@netapp.com>; linux-kernel@vger.kernel.org
> > > > Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do
> > > > implicit
> > > > lease renewals
> > > >
> > > >
> > > >
> > > > > On Dec 30, 2019, at 10:20 AM, Robert Milkowski <
> > > > > rmilkowski@gmail.com>
> > > > wrote:
> > > > > From: Robert Milkowski <rmilkowski@gmail.com>
> > > > >
> > > > > Currently, each time nfs4_do_fsinfo() is called it will do an
> > > > > implicit
> > > > > NFS4 lease renewal, which is not compliant with the NFS4
> > > > specification.
> > > > > This can result in a lease being expired by an NFS server.
> > > > >
> > > > > Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> > > > > leases")
> > > > > introduced implicit client lease renewal in nfs4_do_fsinfo(),
> > > > > which
> > > > > can result in the NFSv4.0 lease to expire on a server side, and
> > > > > servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
> > > > >
> > > > > This can easily be reproduced by frequently unmounting a sub-
> > > > > mount,
> > > > > then stat'ing it to get it mounted again, which will delay or
> > > > > even
> > > > > completely prevent client from sending RENEW operations if no
> > > > > other
> > > > > NFS operations are issued. Eventually nfs server will expire
> > > > > client's
> > > > > lease and return an error on file access or next RENEW.
> > > > >
> > > > > This can also happen when a sub-mount is automatically
> > > > > unmounted due
> > > > > to inactivity (after nfs_mountpoint_expiry_timeout), then it is
> > > > > mounted again via stat(). This can result in a short window
> > > > > during
> > > > > which client's lease will expire on a server but not on a
> > > > > client.
> > > > > This specific case was observed on production systems.
> > > > >
> > > > > This patch makes an explicit lease renewal instead of an
> > > > > implicit one,
> > > > > by adding RENEW to a compound operation issued by
> > > > > nfs4_do_fsinfo(),
> > > > > similarly to NFSv4.1 which adds SEQUENCE operation.
> > > > >
> > > > > Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> > > > > leases")
> > > > > Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> > > >
> > > > Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> > > >
> > > >
> > >
> > > How do we progress it further?
> >
> > Thanks for following up! I have the patch included in my linux-next
> > branch for
> > the next merge window.
>
> NACK. This is the wrong way to solve the problem. Lease renewal in
> NFSv4 does not need to be tied to fsinfo. It creates an unnecessary
> extra error condition that has absolutely nothing to do with the
> functionality of retrieving per-filesystem attributes.

Well, we also do it for NFSv4.1+ with the SEQUENCE operation being
send by fsinfo, and as Chuck pointed out
it makes sense to do similarly for 4.0, which is what this patch does.

Or as per the v2 version of the patch, not do the implicit RENEW for
4.0, which was a simpler patch,
but then not in-line with 4.1+.

>
> All that needs to be done here is to move the setting of clp-
> >cl_last_renewal _out_ of nfs4_set_lease_period(), and just have
> nfs4_proc_setclientid_confirm() and nfs4_update_session() call
> do_renew_lease().
>

This would also require nfs4_setup_state_renewal() to call
do_renew_lease() I think - at least it currently calls
nfs4_set_lease_period().
Also, iirc fsinfo() not setting cl_last_renewal leads to
cl_last_renewal initialization issues under some circumstances.

Then the RFC 7530 in section 16.34.5 states:
"SETCLIENTID_CONFIRM does not establish or renew a lease.", so calling
do_renew_lease() from nfs4_setclientid_confirm() doesn't seem to be
ok.

I'm not sure if is is valid to do implicit lease renewal in
nfs4_update_session() either...


Anyway, the patch would be something like (haven't tested it yet):

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 76d3716..a7af864 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5019,16 +5019,13 @@ static int nfs4_do_fsinfo(struct nfs_server
*server, struct nfs_fh *fhandle, str
        struct nfs4_exception exception = {
                .interruptible = true,
        };
-       unsigned long now = jiffies;
        int err;

        do {
                err = _nfs4_do_fsinfo(server, fhandle, fsinfo);
                trace_nfs4_fsinfo(server, fhandle, fsinfo->fattr, err);
                if (err == 0) {
-                       nfs4_set_lease_period(server->nfs_client,
-                                       fsinfo->lease_time * HZ,
-                                       now);
+                       nfs4_set_lease_period(server->nfs_client,
fsinfo->lease_time * HZ)
                        break;
                }
                err = nfs4_handle_exception(server, err, &exception);
@@ -6146,6 +6143,10 @@ int nfs4_proc_setclientid_confirm(struct nfs_client *clp,
                clp->cl_clientid);
        status = rpc_call_sync(clp->cl_rpcclient, &msg,
                               RPC_TASK_TIMEOUT | RPC_TASK_NO_ROUND_ROBIN);
+       if(status == 0) {
+               unsigned long now = jiffies;
+               do_renew_lease(clp, now);
+       }
        trace_nfs4_setclientid_confirm(clp, status);
        dprintk("NFS reply setclientid_confirm: %d\n", status);
        return status;
@@ -8590,6 +8591,8 @@ static void nfs4_update_session(struct
nfs4_session *session,
        if (res->flags & SESSION4_BACK_CHAN)
                memcpy(&session->bc_attrs, &res->bc_attrs,
                                sizeof(session->bc_attrs));
+       unsigned long now = jiffies;
+       do_renew_lease(session->clp, now);
 }

 static int _nfs4_proc_create_session(struct nfs_client *clp,
diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c
index 6ea431b..ff876dd 100644
--- a/fs/nfs/nfs4renewd.c
+++ b/fs/nfs/nfs4renewd.c
@@ -138,15 +138,12 @@
  *
  * @clp: pointer to nfs_client
  * @lease: new value for lease period
- * @lastrenewed: time at which lease was last renewed
  */
 void nfs4_set_lease_period(struct nfs_client *clp,
-               unsigned long lease,
-               unsigned long lastrenewed)
+               unsigned long lease)
 {
        spin_lock(&clp->cl_lock);
        clp->cl_lease_time = lease;
-       clp->cl_last_renewal = lastrenewed;
        spin_unlock(&clp->cl_lock);

        /* Cap maximum reconnect timeout at 1/2 lease period */
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 3455232..d7b02fd 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -102,7 +102,8 @@ static int nfs4_setup_state_renewal(struct nfs_client *clp)
        now = jiffies;
        status = nfs4_proc_get_lease_time(clp, &fsinfo);
        if (status == 0) {
-               nfs4_set_lease_period(clp, fsinfo.lease_time * HZ, now);
+               nfs4_set_lease_period(clp, fsinfo.lease_time * HZ);
+               nfs4_do_renew_lease(clp, now);
                nfs4_schedule_state_renewal(clp);
        }



But it potentially has issues, as just pointed out, mainly with not
being compliant with rfc again.

Also see the comments above about the SEQUENCE.
Trond - ?

Chuck, Anna - which way do you prefer, the one proposed by Chuck or
the one now proposed by Trond?

Or perhaps my v2 patch which is least invasive and just stops doing
implicit renewals for 4.0 if cl_last_renewal is already set.

I personally think the way proposed by Chuck is fully compliant with
RFC and in-line with what we currently do for 4.1 via SEQUENCE,
so perhaps the best option. ?

-- 
Robert Milkowski

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
  2020-01-27 14:45         ` Robert Milkowski
@ 2020-01-27 15:05           ` Chuck Lever
  2020-01-27 15:34             ` Robert Milkowski
  2020-01-27 15:54           ` Trond Myklebust
  1 sibling, 1 reply; 11+ messages in thread
From: Chuck Lever @ 2020-01-27 15:05 UTC (permalink / raw)
  To: Robert Milkowski
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List, linux-kernel



> On Jan 27, 2020, at 9:45 AM, Robert Milkowski <rmilkowski@gmail.com> wrote:
> 
> On Thu, 23 Jan 2020 at 19:08, Trond Myklebust <trondmy@hammerspace.com> wrote:
>> 
>> On Wed, 2020-01-22 at 19:10 +0000, Schumaker, Anna wrote:
>>> Hi Robert,
>>> 
>>> On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote:
>>>>> -----Original Message-----
>>>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>>> Sent: 30 December 2019 15:37
>>>>> To: Robert Milkowski <rmilkowski@gmail.com>
>>>>> Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>; Trond
>>>>> Myklebust
>>>>> <trond.myklebust@hammerspace.com>; Anna Schumaker
>>>>> <anna.schumaker@netapp.com>; linux-kernel@vger.kernel.org
>>>>> Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do
>>>>> implicit
>>>>> lease renewals
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Dec 30, 2019, at 10:20 AM, Robert Milkowski <
>>>>>> rmilkowski@gmail.com>
>>>>> wrote:
>>>>>> From: Robert Milkowski <rmilkowski@gmail.com>
>>>>>> 
>>>>>> Currently, each time nfs4_do_fsinfo() is called it will do an
>>>>>> implicit
>>>>>> NFS4 lease renewal, which is not compliant with the NFS4
>>>>> specification.
>>>>>> This can result in a lease being expired by an NFS server.
>>>>>> 
>>>>>> Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
>>>>>> leases")
>>>>>> introduced implicit client lease renewal in nfs4_do_fsinfo(),
>>>>>> which
>>>>>> can result in the NFSv4.0 lease to expire on a server side, and
>>>>>> servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
>>>>>> 
>>>>>> This can easily be reproduced by frequently unmounting a sub-
>>>>>> mount,
>>>>>> then stat'ing it to get it mounted again, which will delay or
>>>>>> even
>>>>>> completely prevent client from sending RENEW operations if no
>>>>>> other
>>>>>> NFS operations are issued. Eventually nfs server will expire
>>>>>> client's
>>>>>> lease and return an error on file access or next RENEW.
>>>>>> 
>>>>>> This can also happen when a sub-mount is automatically
>>>>>> unmounted due
>>>>>> to inactivity (after nfs_mountpoint_expiry_timeout), then it is
>>>>>> mounted again via stat(). This can result in a short window
>>>>>> during
>>>>>> which client's lease will expire on a server but not on a
>>>>>> client.
>>>>>> This specific case was observed on production systems.
>>>>>> 
>>>>>> This patch makes an explicit lease renewal instead of an
>>>>>> implicit one,
>>>>>> by adding RENEW to a compound operation issued by
>>>>>> nfs4_do_fsinfo(),
>>>>>> similarly to NFSv4.1 which adds SEQUENCE operation.
>>>>>> 
>>>>>> Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
>>>>>> leases")
>>>>>> Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
>>>>> 
>>>>> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
>>>>> 
>>>>> 
>>>> 
>>>> How do we progress it further?
>>> 
>>> Thanks for following up! I have the patch included in my linux-next
>>> branch for
>>> the next merge window.
>> 
>> NACK. This is the wrong way to solve the problem. Lease renewal in
>> NFSv4 does not need to be tied to fsinfo. It creates an unnecessary
>> extra error condition that has absolutely nothing to do with the
>> functionality of retrieving per-filesystem attributes.
> 
> Well, we also do it for NFSv4.1+ with the SEQUENCE operation being
> send by fsinfo, and as Chuck pointed out
> it makes sense to do similarly for 4.0, which is what this patch does.

I did say that.

However, I can see that for NFSv4.1+, the client code handling the
SEQUENCE response will update cl_last_renewal. It does not need to
be done in the fsinfo code.

The NFSv4.0 behavior should be correct if cl_last_renewal is not
updated. That should force the client to send a separate RENEW
operation so that both the client and server agree that the lease
is active.

If I understand Trond correctly?


> Or as per the v2 version of the patch, not do the implicit RENEW for
> 4.0, which was a simpler patch,
> but then not in-line with 4.1+.
> 
>> 
>> All that needs to be done here is to move the setting of clp-
>>> cl_last_renewal _out_ of nfs4_set_lease_period(), and just have
>> nfs4_proc_setclientid_confirm() and nfs4_update_session() call
>> do_renew_lease().
>> 
> 
> This would also require nfs4_setup_state_renewal() to call
> do_renew_lease() I think - at least it currently calls
> nfs4_set_lease_period().
> Also, iirc fsinfo() not setting cl_last_renewal leads to
> cl_last_renewal initialization issues under some circumstances.
> 
> Then the RFC 7530 in section 16.34.5 states:
> "SETCLIENTID_CONFIRM does not establish or renew a lease.", so calling
> do_renew_lease() from nfs4_setclientid_confirm() doesn't seem to be
> ok.
> 
> I'm not sure if is is valid to do implicit lease renewal in
> nfs4_update_session() either...
> 
> 
> Anyway, the patch would be something like (haven't tested it yet):
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 76d3716..a7af864 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5019,16 +5019,13 @@ static int nfs4_do_fsinfo(struct nfs_server
> *server, struct nfs_fh *fhandle, str
>        struct nfs4_exception exception = {
>                .interruptible = true,
>        };
> -       unsigned long now = jiffies;
>        int err;
> 
>        do {
>                err = _nfs4_do_fsinfo(server, fhandle, fsinfo);
>                trace_nfs4_fsinfo(server, fhandle, fsinfo->fattr, err);
>                if (err == 0) {
> -                       nfs4_set_lease_period(server->nfs_client,
> -                                       fsinfo->lease_time * HZ,
> -                                       now);
> +                       nfs4_set_lease_period(server->nfs_client,
> fsinfo->lease_time * HZ)
>                        break;
>                }
>                err = nfs4_handle_exception(server, err, &exception);
> @@ -6146,6 +6143,10 @@ int nfs4_proc_setclientid_confirm(struct nfs_client *clp,
>                clp->cl_clientid);
>        status = rpc_call_sync(clp->cl_rpcclient, &msg,
>                               RPC_TASK_TIMEOUT | RPC_TASK_NO_ROUND_ROBIN);
> +       if(status == 0) {
> +               unsigned long now = jiffies;
> +               do_renew_lease(clp, now);
> +       }
>        trace_nfs4_setclientid_confirm(clp, status);
>        dprintk("NFS reply setclientid_confirm: %d\n", status);
>        return status;
> @@ -8590,6 +8591,8 @@ static void nfs4_update_session(struct
> nfs4_session *session,
>        if (res->flags & SESSION4_BACK_CHAN)
>                memcpy(&session->bc_attrs, &res->bc_attrs,
>                                sizeof(session->bc_attrs));
> +       unsigned long now = jiffies;
> +       do_renew_lease(session->clp, now);
> }
> 
> static int _nfs4_proc_create_session(struct nfs_client *clp,
> diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c
> index 6ea431b..ff876dd 100644
> --- a/fs/nfs/nfs4renewd.c
> +++ b/fs/nfs/nfs4renewd.c
> @@ -138,15 +138,12 @@
>  *
>  * @clp: pointer to nfs_client
>  * @lease: new value for lease period
> - * @lastrenewed: time at which lease was last renewed
>  */
> void nfs4_set_lease_period(struct nfs_client *clp,
> -               unsigned long lease,
> -               unsigned long lastrenewed)
> +               unsigned long lease)
> {
>        spin_lock(&clp->cl_lock);
>        clp->cl_lease_time = lease;
> -       clp->cl_last_renewal = lastrenewed;
>        spin_unlock(&clp->cl_lock);
> 
>        /* Cap maximum reconnect timeout at 1/2 lease period */
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 3455232..d7b02fd 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -102,7 +102,8 @@ static int nfs4_setup_state_renewal(struct nfs_client *clp)
>        now = jiffies;
>        status = nfs4_proc_get_lease_time(clp, &fsinfo);
>        if (status == 0) {
> -               nfs4_set_lease_period(clp, fsinfo.lease_time * HZ, now);
> +               nfs4_set_lease_period(clp, fsinfo.lease_time * HZ);
> +               nfs4_do_renew_lease(clp, now);
>                nfs4_schedule_state_renewal(clp);
>        }
> 
> 
> 
> But it potentially has issues, as just pointed out, mainly with not
> being compliant with rfc again.
> 
> Also see the comments above about the SEQUENCE.
> Trond - ?
> 
> Chuck, Anna - which way do you prefer, the one proposed by Chuck or
> the one now proposed by Trond?
> 
> Or perhaps my v2 patch which is least invasive and just stops doing
> implicit renewals for 4.0 if cl_last_renewal is already set.
> 
> I personally think the way proposed by Chuck is fully compliant with
> RFC and in-line with what we currently do for 4.1 via SEQUENCE,
> so perhaps the best option. ?
> 
> -- 
> Robert Milkowski

--
Chuck Lever




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
  2020-01-27 15:05           ` Chuck Lever
@ 2020-01-27 15:34             ` Robert Milkowski
  0 siblings, 0 replies; 11+ messages in thread
From: Robert Milkowski @ 2020-01-27 15:34 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List, linux-kernel

On Mon, 27 Jan 2020 at 15:05, Chuck Lever <chuck.lever@oracle.com> wrote:
>
>
>
> > On Jan 27, 2020, at 9:45 AM, Robert Milkowski <rmilkowski@gmail.com> wrote:
> >
> > On Thu, 23 Jan 2020 at 19:08, Trond Myklebust <trondmy@hammerspace.com> wrote:
> >>
> >> On Wed, 2020-01-22 at 19:10 +0000, Schumaker, Anna wrote:
> >>> Hi Robert,
> >>>
> >>> On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote:
> >>>>> -----Original Message-----
> >>>>> From: Chuck Lever <chuck.lever@oracle.com>
> >>>>> Sent: 30 December 2019 15:37
> >>>>> To: Robert Milkowski <rmilkowski@gmail.com>
> >>>>> Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>; Trond
> >>>>> Myklebust
> >>>>> <trond.myklebust@hammerspace.com>; Anna Schumaker
> >>>>> <anna.schumaker@netapp.com>; linux-kernel@vger.kernel.org
> >>>>> Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do
> >>>>> implicit
> >>>>> lease renewals
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On Dec 30, 2019, at 10:20 AM, Robert Milkowski <
> >>>>>> rmilkowski@gmail.com>
> >>>>> wrote:
> >>>>>> From: Robert Milkowski <rmilkowski@gmail.com>
> >>>>>>
> >>>>>> Currently, each time nfs4_do_fsinfo() is called it will do an
> >>>>>> implicit
> >>>>>> NFS4 lease renewal, which is not compliant with the NFS4
> >>>>> specification.
> >>>>>> This can result in a lease being expired by an NFS server.
> >>>>>>
> >>>>>> Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> >>>>>> leases")
> >>>>>> introduced implicit client lease renewal in nfs4_do_fsinfo(),
> >>>>>> which
> >>>>>> can result in the NFSv4.0 lease to expire on a server side, and
> >>>>>> servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
> >>>>>>
> >>>>>> This can easily be reproduced by frequently unmounting a sub-
> >>>>>> mount,
> >>>>>> then stat'ing it to get it mounted again, which will delay or
> >>>>>> even
> >>>>>> completely prevent client from sending RENEW operations if no
> >>>>>> other
> >>>>>> NFS operations are issued. Eventually nfs server will expire
> >>>>>> client's
> >>>>>> lease and return an error on file access or next RENEW.
> >>>>>>
> >>>>>> This can also happen when a sub-mount is automatically
> >>>>>> unmounted due
> >>>>>> to inactivity (after nfs_mountpoint_expiry_timeout), then it is
> >>>>>> mounted again via stat(). This can result in a short window
> >>>>>> during
> >>>>>> which client's lease will expire on a server but not on a
> >>>>>> client.
> >>>>>> This specific case was observed on production systems.
> >>>>>>
> >>>>>> This patch makes an explicit lease renewal instead of an
> >>>>>> implicit one,
> >>>>>> by adding RENEW to a compound operation issued by
> >>>>>> nfs4_do_fsinfo(),
> >>>>>> similarly to NFSv4.1 which adds SEQUENCE operation.
> >>>>>>
> >>>>>> Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> >>>>>> leases")
> >>>>>> Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> >>>>>
> >>>>> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> >>>>>
> >>>>>
> >>>>
> >>>> How do we progress it further?
> >>>
> >>> Thanks for following up! I have the patch included in my linux-next
> >>> branch for
> >>> the next merge window.
> >>
> >> NACK. This is the wrong way to solve the problem. Lease renewal in
> >> NFSv4 does not need to be tied to fsinfo. It creates an unnecessary
> >> extra error condition that has absolutely nothing to do with the
> >> functionality of retrieving per-filesystem attributes.
> >
> > Well, we also do it for NFSv4.1+ with the SEQUENCE operation being
> > send by fsinfo, and as Chuck pointed out
> > it makes sense to do similarly for 4.0, which is what this patch does.
>
> I did say that.
>
> However, I can see that for NFSv4.1+, the client code handling the
> SEQUENCE response will update cl_last_renewal. It does not need to
> be done in the fsinfo code.

I think it is the case, yes.

>
> The NFSv4.0 behavior should be correct if cl_last_renewal is not
> updated. That should force the client to send a separate RENEW
> operation so that both the client and server agree that the lease
> is active.
>

I was thinking about removing the call to update the last renewal
entirely in do_fsinfo(),
however as briefly discussed back in Dec there is an issue with
cl_last_renewal initialization on initial mount in 4.0
I observed it to be 0, as nfs4_setup_state_renewal() calls
nfs4_proc_get_lease_time() on initial mount and if no error it calls
nfs4_set_lease_period().
However with sec=krb the call to nfs4_proc_get_lease_time() returns
NFS4ERR_WRONGSEC) during initial mount (which seems to be ok), which
results in not setting cl_last_renewal,
which iirc prevented scheduling RENEW operations altogether. As
discussed then, this is really a separate issue which should be fixed
separately.
Once fixed then fsinfo() shouldn't need to set cl_last_renewal at all,
still sending RENEW in fsinfo() seems like a good idea to make it more
inline with what we do for 4.1.

> If I understand Trond correctly?
>

The problem with what Trond proposed is that it seems to go against
the rfc, although it should fix the initialization issue.

-- 
Robert Milkowski

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
  2020-01-27 14:45         ` Robert Milkowski
  2020-01-27 15:05           ` Chuck Lever
@ 2020-01-27 15:54           ` Trond Myklebust
  2020-01-27 17:26             ` Robert Milkowski
  1 sibling, 1 reply; 11+ messages in thread
From: Trond Myklebust @ 2020-01-27 15:54 UTC (permalink / raw)
  To: rmilkowski; +Cc: linux-nfs, Anna.Schumaker, linux-kernel, chuck.lever

On Mon, 2020-01-27 at 14:45 +0000, Robert Milkowski wrote:
> On Thu, 23 Jan 2020 at 19:08, Trond Myklebust <
> trondmy@hammerspace.com> wrote:
> > On Wed, 2020-01-22 at 19:10 +0000, Schumaker, Anna wrote:
> > > Hi Robert,
> > > 
> > > On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote:
> > > > > -----Original Message-----
> > > > > From: Chuck Lever <chuck.lever@oracle.com>
> > > > > Sent: 30 December 2019 15:37
> > > > > To: Robert Milkowski <rmilkowski@gmail.com>
> > > > > Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>; Trond
> > > > > Myklebust
> > > > > <trond.myklebust@hammerspace.com>; Anna Schumaker
> > > > > <anna.schumaker@netapp.com>; linux-kernel@vger.kernel.org
> > > > > Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not
> > > > > do
> > > > > implicit
> > > > > lease renewals
> > > > > 
> > > > > 
> > > > > 
> > > > > > On Dec 30, 2019, at 10:20 AM, Robert Milkowski <
> > > > > > rmilkowski@gmail.com>
> > > > > wrote:
> > > > > > From: Robert Milkowski <rmilkowski@gmail.com>
> > > > > > 
> > > > > > Currently, each time nfs4_do_fsinfo() is called it will do
> > > > > > an
> > > > > > implicit
> > > > > > NFS4 lease renewal, which is not compliant with the NFS4
> > > > > specification.
> > > > > > This can result in a lease being expired by an NFS server.
> > > > > > 
> > > > > > Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> > > > > > leases")
> > > > > > introduced implicit client lease renewal in
> > > > > > nfs4_do_fsinfo(),
> > > > > > which
> > > > > > can result in the NFSv4.0 lease to expire on a server side,
> > > > > > and
> > > > > > servers returning NFS4ERR_EXPIRED or
> > > > > > NFS4ERR_STALE_CLIENTID.
> > > > > > 
> > > > > > This can easily be reproduced by frequently unmounting a
> > > > > > sub-
> > > > > > mount,
> > > > > > then stat'ing it to get it mounted again, which will delay
> > > > > > or
> > > > > > even
> > > > > > completely prevent client from sending RENEW operations if
> > > > > > no
> > > > > > other
> > > > > > NFS operations are issued. Eventually nfs server will
> > > > > > expire
> > > > > > client's
> > > > > > lease and return an error on file access or next RENEW.
> > > > > > 
> > > > > > This can also happen when a sub-mount is automatically
> > > > > > unmounted due
> > > > > > to inactivity (after nfs_mountpoint_expiry_timeout), then
> > > > > > it is
> > > > > > mounted again via stat(). This can result in a short window
> > > > > > during
> > > > > > which client's lease will expire on a server but not on a
> > > > > > client.
> > > > > > This specific case was observed on production systems.
> > > > > > 
> > > > > > This patch makes an explicit lease renewal instead of an
> > > > > > implicit one,
> > > > > > by adding RENEW to a compound operation issued by
> > > > > > nfs4_do_fsinfo(),
> > > > > > similarly to NFSv4.1 which adds SEQUENCE operation.
> > > > > > 
> > > > > > Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> > > > > > leases")
> > > > > > Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> > > > > 
> > > > > Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> > > > > 
> > > > > 
> > > > 
> > > > How do we progress it further?
> > > 
> > > Thanks for following up! I have the patch included in my linux-
> > > next
> > > branch for
> > > the next merge window.
> > 
> > NACK. This is the wrong way to solve the problem. Lease renewal in
> > NFSv4 does not need to be tied to fsinfo. It creates an unnecessary
> > extra error condition that has absolutely nothing to do with the
> > functionality of retrieving per-filesystem attributes.
> 
> Well, we also do it for NFSv4.1+ with the SEQUENCE operation being
> send by fsinfo, and as Chuck pointed out
> it makes sense to do similarly for 4.0, which is what this patch
> does.

That's a bogus argument. The NFSv4.x SEQUENCE is mandated by the
protocol in order to manage the only once semantics via the session
slot mechanism. While it does implicitly renew the lease, that's not at
all the reason why it is present in this particular RPC call.

> Or as per the v2 version of the patch, not do the implicit RENEW for
> 4.0, which was a simpler patch,
> but then not in-line with 4.1+.
> 
> > All that needs to be done here is to move the setting of clp-
> > > cl_last_renewal _out_ of nfs4_set_lease_period(), and just have
> > nfs4_proc_setclientid_confirm() and nfs4_update_session() call
> > do_renew_lease().
> > 
> 
> This would also require nfs4_setup_state_renewal() to call
> do_renew_lease() I think - at least it currently calls
> nfs4_set_lease_period().

This is the mechanism we're replacing. There is no need for a call to
do_renew_lease() in nfs4_setup_state_renewal().

> Also, iirc fsinfo() not setting cl_last_renewal leads to
> cl_last_renewal initialization issues under some circumstances.
> 
> Then the RFC 7530 in section 16.34.5 states:
> "SETCLIENTID_CONFIRM does not establish or renew a lease.", so
> calling
> do_renew_lease() from nfs4_setclientid_confirm() doesn't seem to be
> ok.

So just move the do_renew_lease() to nfs4_proc_setclientid(). The
successful combination SETCLIENTID+SETCLIENTID_CONFIRM definitely
_does_ establish a lease.

> I'm not sure if is is valid to do implicit lease renewal in
> nfs4_update_session() either...

Ditto. Move to the exchange_id call.

> 
> Anyway, the patch would be something like (haven't tested it yet):
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 76d3716..a7af864 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5019,16 +5019,13 @@ static int nfs4_do_fsinfo(struct nfs_server
> *server, struct nfs_fh *fhandle, str
>         struct nfs4_exception exception = {
>                 .interruptible = true,
>         };
> -       unsigned long now = jiffies;
>         int err;
> 
>         do {
>                 err = _nfs4_do_fsinfo(server, fhandle, fsinfo);
>                 trace_nfs4_fsinfo(server, fhandle, fsinfo->fattr,
> err);
>                 if (err == 0) {
> -                       nfs4_set_lease_period(server->nfs_client,
> -                                       fsinfo->lease_time * HZ,
> -                                       now);
> +                       nfs4_set_lease_period(server->nfs_client,
> fsinfo->lease_time * HZ)
>                         break;
>                 }
>                 err = nfs4_handle_exception(server, err, &exception);
> @@ -6146,6 +6143,10 @@ int nfs4_proc_setclientid_confirm(struct
> nfs_client *clp,
>                 clp->cl_clientid);
>         status = rpc_call_sync(clp->cl_rpcclient, &msg,
>                                RPC_TASK_TIMEOUT |
> RPC_TASK_NO_ROUND_ROBIN);
> +       if(status == 0) {
> +               unsigned long now = jiffies;

The variable 'now' needs to be set before the RPC call in order to
avoid overestimating the remaining lease period.

> +               do_renew_lease(clp, now);
> +       }
>         trace_nfs4_setclientid_confirm(clp, status);
>         dprintk("NFS reply setclientid_confirm: %d\n", status);
>         return status;
> @@ -8590,6 +8591,8 @@ static void nfs4_update_session(struct
> nfs4_session *session,
>         if (res->flags & SESSION4_BACK_CHAN)
>                 memcpy(&session->bc_attrs, &res->bc_attrs,
>                                 sizeof(session->bc_attrs));
> +       unsigned long now = jiffies;
> +       do_renew_lease(session->clp, now);
>  }
> 
>  static int _nfs4_proc_create_session(struct nfs_client *clp,
> diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c
> index 6ea431b..ff876dd 100644
> --- a/fs/nfs/nfs4renewd.c
> +++ b/fs/nfs/nfs4renewd.c
> @@ -138,15 +138,12 @@
>   *
>   * @clp: pointer to nfs_client
>   * @lease: new value for lease period
> - * @lastrenewed: time at which lease was last renewed
>   */
>  void nfs4_set_lease_period(struct nfs_client *clp,
> -               unsigned long lease,
> -               unsigned long lastrenewed)
> +               unsigned long lease)
>  {
>         spin_lock(&clp->cl_lock);
>         clp->cl_lease_time = lease;
> -       clp->cl_last_renewal = lastrenewed;
>         spin_unlock(&clp->cl_lock);
> 
>         /* Cap maximum reconnect timeout at 1/2 lease period */
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 3455232..d7b02fd 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -102,7 +102,8 @@ static int nfs4_setup_state_renewal(struct
> nfs_client *clp)
>         now = jiffies;
>         status = nfs4_proc_get_lease_time(clp, &fsinfo);
>         if (status == 0) {
> -               nfs4_set_lease_period(clp, fsinfo.lease_time * HZ,
> now);
> +               nfs4_set_lease_period(clp, fsinfo.lease_time * HZ);
> +               nfs4_do_renew_lease(clp, now);

As stated above, this call to nfs4_do_renew_lease() is not needed.

>                 nfs4_schedule_state_renewal(clp);
>         }
> 
> 
> 
> But it potentially has issues, as just pointed out, mainly with not
> being compliant with rfc again.

See above.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
  2020-01-27 15:54           ` Trond Myklebust
@ 2020-01-27 17:26             ` Robert Milkowski
  0 siblings, 0 replies; 11+ messages in thread
From: Robert Milkowski @ 2020-01-27 17:26 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs, Anna.Schumaker, linux-kernel, chuck.lever

> >
> > > All that needs to be done here is to move the setting of clp-
> > > > cl_last_renewal _out_ of nfs4_set_lease_period(), and just have
> > > nfs4_proc_setclientid_confirm() and nfs4_update_session() call
> > > do_renew_lease().
> > >
> >
> > This would also require nfs4_setup_state_renewal() to call
> > do_renew_lease() I think - at least it currently calls
> > nfs4_set_lease_period().
>
> This is the mechanism we're replacing. There is no need for a call to
> do_renew_lease() in nfs4_setup_state_renewal().

ok

>
> > Also, iirc fsinfo() not setting cl_last_renewal leads to
> > cl_last_renewal initialization issues under some circumstances.
> >
> > Then the RFC 7530 in section 16.34.5 states:
> > "SETCLIENTID_CONFIRM does not establish or renew a lease.", so
> > calling
> > do_renew_lease() from nfs4_setclientid_confirm() doesn't seem to be
> > ok.
>
> So just move the do_renew_lease() to nfs4_proc_setclientid(). The
> successful combination SETCLIENTID+SETCLIENTID_CONFIRM definitely
> _does_ establish a lease.
>

ok

> > I'm not sure if is is valid to do implicit lease renewal in
> > nfs4_update_session() either...
>
> Ditto. Move to the exchange_id call.
>

ok

> >
> > Anyway, the patch would be something like (haven't tested it yet):
> >
...
> > @@ -6146,6 +6143,10 @@ int nfs4_proc_setclientid_confirm(struct
> > nfs_client *clp,
> >                 clp->cl_clientid);
> >         status = rpc_call_sync(clp->cl_rpcclient, &msg,
> >                                RPC_TASK_TIMEOUT |
> > RPC_TASK_NO_ROUND_ROBIN);
> > +       if(status == 0) {
> > +               unsigned long now = jiffies;
>
> The variable 'now' needs to be set before the RPC call in order to
> avoid overestimating the remaining lease period.
>

yes, thx.

I will get a new patch tested and submitted here this week.

-- 
Robert Milkowski

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-01-27 17:26 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-30 15:20 [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals Robert Milkowski
2019-12-30 15:36 ` Chuck Lever
2020-01-20 17:55   ` Robert Milkowski
2020-01-22 19:10     ` Schumaker, Anna
2020-01-22 19:55       ` Robert Milkowski
2020-01-23 19:08       ` Trond Myklebust
2020-01-27 14:45         ` Robert Milkowski
2020-01-27 15:05           ` Chuck Lever
2020-01-27 15:34             ` Robert Milkowski
2020-01-27 15:54           ` Trond Myklebust
2020-01-27 17:26             ` Robert Milkowski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).