linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC v3 0/5] un-deprecate nfsdcld
@ 2019-03-26 22:06 Scott Mayhew
  2019-03-26 22:06 ` [PATCH RFC v3 1/5] nfsd: make nfs4_client_reclaim use an xdr_netobj instead of a fixed char array Scott Mayhew
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Scott Mayhew @ 2019-03-26 22:06 UTC (permalink / raw)
  To: bfields, jlayton; +Cc: linux-nfs

When nfsdcld was released, it was quickly deprecated in favor of the
nfsdcltrack usermodehelper, so as to not require another running daemon.
That prevents NFSv4 clients from reclaiming locks from nfsd's running in
containers, since neither nfsdcltrack nor the legacy client tracking
code work in containers.  These patches un-deprecate the use of nfsdcld
for NFSv4 client tracking.

These patches are intended to go alongside some nfs-utils patches that
introduce an enhancement that allows nfsd to "slurp" up the client
records during client tracking initialization and store them internally
in hash table.  This enables nfsd to check whether an NFSv4 client is
allowed to reclaim without having to do an upcall to nfsdcld.  It also
allows nfsd to decide to end the v4 grace period early if the number of
RECLAIM_COMPLETE operations it has received from "known" clients is
equal to the number of entries in the hash table.  It also allows nfsd
to skip the v4 grace period altogether if it knows there are no clients
allowed to reclaim.

There is a fallback to allow nfsd to continue to work with older nfsdcld
daemons in the event that any are out in the wild (unlikely).
Everything should work fine except nfsd will not be able to exit the
grace period early or skip the grace period altogether.

v3:
- nfs4_state_start_net() now calls nfsd4_end_grace() instead of
  open-coding it
- Removed some unnecessary initializations of nr_reclaim_complete
- Removed dec_reclaim_complete() altogether.  If we're calling
  expire_client() as a result of receiving a DESTROY_CLIENTID or
  SETCLIENTID_CONFIRM then we wouldn't want to decrement the count.  And
  laundromat thread never expires clients due to loss of lease until the
  grace period is over (the laundromat gets rescheduled by 1 second as
  long as clients are reclaiming during that 1 second, for up to 2x the
  original grace period in total).  So dec_reclaim_complete() is
  unnecessary.
- Changed the preference order of client tracking methods.  The new
  order is 1) new nfdscld, 2) old nfsdcld, 3) nfsdcltrack, 4) legacy
  v4recovery directory.
- Added some special handling of legacy records sent by nfsdcld in the
  GraceStart downcalls.  nfsdcld will do a one-time "upgrade" from old
  tracking methods.  No changes are needed to accomodate nfsdcltrack
  records, but legacy records will be prefixed with the string "hash:", 
  and in the event that we do have legacy records in the
  reclaim_str_hashtbl we may need to do a second lookup using the hash
  in the event that a lookup using the client name string fails.

v2:
- Addressed some coding style issues in nfsd4_create_clid_dir() &
  nfsd4_remove_clid_dir()

Scott Mayhew (5):
  nfsd: make nfs4_client_reclaim use an xdr_netobj instead of a fixed
    char array
  nfsd: un-deprecate nfsdcld
  nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
  nfsd: re-order client tracking method selection
  nfsd: handle legacy client tracking records sent by nfsdcld

 fs/nfsd/netns.h               |   3 +
 fs/nfsd/nfs4recover.c         | 436 +++++++++++++++++++++++++++++++---
 fs/nfsd/nfs4state.c           |  63 +++--
 fs/nfsd/nfsctl.c              |   1 +
 fs/nfsd/state.h               |   8 +-
 include/uapi/linux/nfsd/cld.h |   1 +
 6 files changed, 457 insertions(+), 55 deletions(-)

-- 
2.17.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH RFC v3 1/5] nfsd: make nfs4_client_reclaim use an xdr_netobj instead of a fixed char array
  2019-03-26 22:06 [PATCH RFC v3 0/5] un-deprecate nfsdcld Scott Mayhew
@ 2019-03-26 22:06 ` Scott Mayhew
  2019-03-26 22:06 ` [PATCH RFC v3 2/5] nfsd: un-deprecate nfsdcld Scott Mayhew
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Scott Mayhew @ 2019-03-26 22:06 UTC (permalink / raw)
  To: bfields, jlayton; +Cc: linux-nfs

This will allow the reclaim_str_hashtbl to store either the recovery
directory names used by the legacy client tracking code or the full
client strings used by the nfsdcld client tracking code.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
 fs/nfsd/nfs4recover.c | 109 +++++++++++++++++++++++++++++++++++-------
 fs/nfsd/nfs4state.c   |  30 ++++++------
 fs/nfsd/state.h       |   8 ++--
 3 files changed, 110 insertions(+), 37 deletions(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 5188f9f70c78..2243b909b407 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -169,13 +169,34 @@ legacy_recdir_name_error(struct nfs4_client *clp, int error)
 	}
 }
 
+static void
+__nfsd4_create_reclaim_record_grace(struct nfs4_client *clp,
+		const char *dname, int len, struct nfsd_net *nn)
+{
+	struct xdr_netobj name;
+	struct nfs4_client_reclaim *crp;
+
+	name.data = kmemdup(dname, len, GFP_KERNEL);
+	if (!name.data) {
+		dprintk("%s: failed to allocate memory for name.data!\n",
+			__func__);
+		return;
+	}
+	name.len = len;
+	crp = nfs4_client_to_reclaim(name, nn);
+	if (!crp) {
+		kfree(name.data);
+		return;
+	}
+	crp->cr_clp = clp;
+}
+
 static void
 nfsd4_create_clid_dir(struct nfs4_client *clp)
 {
 	const struct cred *original_cred;
 	char dname[HEXDIR_LEN];
 	struct dentry *dir, *dentry;
-	struct nfs4_client_reclaim *crp;
 	int status;
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 
@@ -221,11 +242,9 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
 out_unlock:
 	inode_unlock(d_inode(dir));
 	if (status == 0) {
-		if (nn->in_grace) {
-			crp = nfs4_client_to_reclaim(dname, nn);
-			if (crp)
-				crp->cr_clp = clp;
-		}
+		if (nn->in_grace)
+			__nfsd4_create_reclaim_record_grace(clp, dname,
+					HEXDIR_LEN, nn);
 		vfs_fsync(nn->rec_file, 0);
 	} else {
 		printk(KERN_ERR "NFSD: failed to write recovery record"
@@ -345,11 +364,30 @@ nfsd4_unlink_clid_dir(char *name, int namlen, struct nfsd_net *nn)
 	return status;
 }
 
+static void
+__nfsd4_remove_reclaim_record_grace(const char *dname, int len,
+		struct nfsd_net *nn)
+{
+	struct xdr_netobj name;
+	struct nfs4_client_reclaim *crp;
+
+	name.data = kmemdup(dname, len, GFP_KERNEL);
+	if (!name.data) {
+		dprintk("%s: failed to allocate memory for name.data!\n",
+			__func__);
+		return;
+	}
+	name.len = len;
+	crp = nfsd4_find_reclaim_client(name, nn);
+	kfree(name.data);
+	if (crp)
+		nfs4_remove_reclaim_record(crp, nn);
+}
+
 static void
 nfsd4_remove_clid_dir(struct nfs4_client *clp)
 {
 	const struct cred *original_cred;
-	struct nfs4_client_reclaim *crp;
 	char dname[HEXDIR_LEN];
 	int status;
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
@@ -374,12 +412,9 @@ nfsd4_remove_clid_dir(struct nfs4_client *clp)
 	nfs4_reset_creds(original_cred);
 	if (status == 0) {
 		vfs_fsync(nn->rec_file, 0);
-		if (nn->in_grace) {
-			/* remove reclaim record */
-			crp = nfsd4_find_reclaim_client(dname, nn);
-			if (crp)
-				nfs4_remove_reclaim_record(crp, nn);
-		}
+		if (nn->in_grace)
+			__nfsd4_remove_reclaim_record_grace(dname,
+					HEXDIR_LEN, nn);
 	}
 out_drop_write:
 	mnt_drop_write_file(nn->rec_file);
@@ -393,14 +428,31 @@ static int
 purge_old(struct dentry *parent, struct dentry *child, struct nfsd_net *nn)
 {
 	int status;
+	struct xdr_netobj name;
 
-	if (nfs4_has_reclaimed_state(child->d_name.name, nn))
+	if (child->d_name.len != HEXDIR_LEN - 1) {
+		printk("%s: illegal name %pd in recovery directory\n",
+				__func__, child);
+		/* Keep trying; maybe the others are OK: */
 		return 0;
+	}
+	name.data = kmemdup_nul(child->d_name.name, child->d_name.len, GFP_KERNEL);
+	if (!name.data) {
+		dprintk("%s: failed to allocate memory for name.data!\n",
+			__func__);
+		goto out;
+	}
+	name.len = HEXDIR_LEN;
+	if (nfs4_has_reclaimed_state(name, nn))
+		goto out_free;
 
 	status = vfs_rmdir(d_inode(parent), child);
 	if (status)
 		printk("failed to remove client recovery directory %pd\n",
 				child);
+out_free:
+	kfree(name.data);
+out:
 	/* Keep trying, success or failure: */
 	return 0;
 }
@@ -430,13 +482,24 @@ nfsd4_recdir_purge_old(struct nfsd_net *nn)
 static int
 load_recdir(struct dentry *parent, struct dentry *child, struct nfsd_net *nn)
 {
+	struct xdr_netobj name;
+
 	if (child->d_name.len != HEXDIR_LEN - 1) {
-		printk("nfsd4: illegal name %pd in recovery directory\n",
-				child);
+		printk("%s: illegal name %pd in recovery directory\n",
+				__func__, child);
 		/* Keep trying; maybe the others are OK: */
 		return 0;
 	}
-	nfs4_client_to_reclaim(child->d_name.name, nn);
+	name.data = kmemdup_nul(child->d_name.name, child->d_name.len, GFP_KERNEL);
+	if (!name.data) {
+		dprintk("%s: failed to allocate memory for name.data!\n",
+			__func__);
+		goto out;
+	}
+	name.len = HEXDIR_LEN;
+	if (!nfs4_client_to_reclaim(name, nn))
+		kfree(name.data);
+out:
 	return 0;
 }
 
@@ -616,6 +679,7 @@ nfsd4_check_legacy_client(struct nfs4_client *clp)
 	char dname[HEXDIR_LEN];
 	struct nfs4_client_reclaim *crp;
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+	struct xdr_netobj name;
 
 	/* did we already find that this client is stable? */
 	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
@@ -628,13 +692,22 @@ nfsd4_check_legacy_client(struct nfs4_client *clp)
 	}
 
 	/* look for it in the reclaim hashtable otherwise */
-	crp = nfsd4_find_reclaim_client(dname, nn);
+	name.data = kmemdup(dname, HEXDIR_LEN, GFP_KERNEL);
+	if (!name.data) {
+		dprintk("%s: failed to allocate memory for name.data!\n",
+			__func__);
+		goto out_enoent;
+	}
+	name.len = HEXDIR_LEN;
+	crp = nfsd4_find_reclaim_client(name, nn);
+	kfree(name.data);
 	if (crp) {
 		set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
 		crp->cr_clp = clp;
 		return 0;
 	}
 
+out_enoent:
 	return -ENOENT;
 }
 
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 6a45fb00c5fc..898f1c8c66ca 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1058,9 +1058,9 @@ static unsigned int clientid_hashval(u32 id)
 	return id & CLIENT_HASH_MASK;
 }
 
-static unsigned int clientstr_hashval(const char *name)
+static unsigned int clientstr_hashval(struct xdr_netobj name)
 {
-	return opaque_hashval(name, 8) & CLIENT_HASH_MASK;
+	return opaque_hashval(name.data, 8) & CLIENT_HASH_MASK;
 }
 
 /*
@@ -2039,11 +2039,6 @@ compare_blob(const struct xdr_netobj *o1, const struct xdr_netobj *o2)
 	return memcmp(o1->data, o2->data, o1->len);
 }
 
-static int same_name(const char *n1, const char *n2)
-{
-	return 0 == memcmp(n1, n2, HEXDIR_LEN);
-}
-
 static int
 same_verf(nfs4_verifier *v1, nfs4_verifier *v2)
 {
@@ -6449,7 +6444,7 @@ alloc_reclaim(void)
 }
 
 bool
-nfs4_has_reclaimed_state(const char *name, struct nfsd_net *nn)
+nfs4_has_reclaimed_state(struct xdr_netobj name, struct nfsd_net *nn)
 {
 	struct nfs4_client_reclaim *crp;
 
@@ -6459,20 +6454,24 @@ nfs4_has_reclaimed_state(const char *name, struct nfsd_net *nn)
 
 /*
  * failure => all reset bets are off, nfserr_no_grace...
+ *
+ * The caller is responsible for freeing name.data if NULL is returned (it
+ * will be freed in nfs4_remove_reclaim_record in the normal case).
  */
 struct nfs4_client_reclaim *
-nfs4_client_to_reclaim(const char *name, struct nfsd_net *nn)
+nfs4_client_to_reclaim(struct xdr_netobj name, struct nfsd_net *nn)
 {
 	unsigned int strhashval;
 	struct nfs4_client_reclaim *crp;
 
-	dprintk("NFSD nfs4_client_to_reclaim NAME: %.*s\n", HEXDIR_LEN, name);
+	dprintk("NFSD nfs4_client_to_reclaim NAME: %.*s\n", name.len, name.data);
 	crp = alloc_reclaim();
 	if (crp) {
 		strhashval = clientstr_hashval(name);
 		INIT_LIST_HEAD(&crp->cr_strhash);
 		list_add(&crp->cr_strhash, &nn->reclaim_str_hashtbl[strhashval]);
-		memcpy(crp->cr_recdir, name, HEXDIR_LEN);
+		crp->cr_name.data = name.data;
+		crp->cr_name.len = name.len;
 		crp->cr_clp = NULL;
 		nn->reclaim_str_hashtbl_size++;
 	}
@@ -6483,6 +6482,7 @@ void
 nfs4_remove_reclaim_record(struct nfs4_client_reclaim *crp, struct nfsd_net *nn)
 {
 	list_del(&crp->cr_strhash);
+	kfree(crp->cr_name.data);
 	kfree(crp);
 	nn->reclaim_str_hashtbl_size--;
 }
@@ -6506,16 +6506,16 @@ nfs4_release_reclaim(struct nfsd_net *nn)
 /*
  * called from OPEN, CLAIM_PREVIOUS with a new clientid. */
 struct nfs4_client_reclaim *
-nfsd4_find_reclaim_client(const char *recdir, struct nfsd_net *nn)
+nfsd4_find_reclaim_client(struct xdr_netobj name, struct nfsd_net *nn)
 {
 	unsigned int strhashval;
 	struct nfs4_client_reclaim *crp = NULL;
 
-	dprintk("NFSD: nfs4_find_reclaim_client for recdir %s\n", recdir);
+	dprintk("NFSD: nfs4_find_reclaim_client for name %.*s\n", name.len, name.data);
 
-	strhashval = clientstr_hashval(recdir);
+	strhashval = clientstr_hashval(name);
 	list_for_each_entry(crp, &nn->reclaim_str_hashtbl[strhashval], cr_strhash) {
-		if (same_name(crp->cr_recdir, recdir)) {
+		if (compare_blob(&crp->cr_name, &name) == 0) {
 			return crp;
 		}
 	}
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 396c76755b03..3e5b7dd604bd 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -367,7 +367,7 @@ struct nfs4_client {
 struct nfs4_client_reclaim {
 	struct list_head	cr_strhash;	/* hash by cr_name */
 	struct nfs4_client	*cr_clp;	/* pointer to associated clp */
-	char			cr_recdir[HEXDIR_LEN]; /* recover dir */
+	struct xdr_netobj	cr_name;	/* recovery dir name */
 };
 
 /* A reasonable value for REPLAY_ISIZE was estimated as follows:  
@@ -619,7 +619,7 @@ void nfs4_put_stid(struct nfs4_stid *s);
 void nfs4_inc_and_copy_stateid(stateid_t *dst, struct nfs4_stid *stid);
 void nfs4_remove_reclaim_record(struct nfs4_client_reclaim *, struct nfsd_net *);
 extern void nfs4_release_reclaim(struct nfsd_net *);
-extern struct nfs4_client_reclaim *nfsd4_find_reclaim_client(const char *recdir,
+extern struct nfs4_client_reclaim *nfsd4_find_reclaim_client(struct xdr_netobj name,
 							struct nfsd_net *nn);
 extern __be32 nfs4_check_open_reclaim(clientid_t *clid,
 		struct nfsd4_compound_state *cstate, struct nfsd_net *nn);
@@ -634,9 +634,9 @@ extern void nfsd4_destroy_callback_queue(void);
 extern void nfsd4_shutdown_callback(struct nfs4_client *);
 extern void nfsd4_shutdown_copy(struct nfs4_client *clp);
 extern void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp);
-extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(const char *name,
+extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(struct xdr_netobj name,
 							struct nfsd_net *nn);
-extern bool nfs4_has_reclaimed_state(const char *name, struct nfsd_net *nn);
+extern bool nfs4_has_reclaimed_state(struct xdr_netobj name, struct nfsd_net *nn);
 
 struct nfs4_file *find_file(struct knfsd_fh *fh);
 void put_nfs4_file(struct nfs4_file *fi);
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH RFC v3 2/5] nfsd: un-deprecate nfsdcld
  2019-03-26 22:06 [PATCH RFC v3 0/5] un-deprecate nfsdcld Scott Mayhew
  2019-03-26 22:06 ` [PATCH RFC v3 1/5] nfsd: make nfs4_client_reclaim use an xdr_netobj instead of a fixed char array Scott Mayhew
@ 2019-03-26 22:06 ` Scott Mayhew
  2019-04-05 18:54   ` J. Bruce Fields
  2019-03-26 22:06 ` [PATCH RFC v3 3/5] nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld Scott Mayhew
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Scott Mayhew @ 2019-03-26 22:06 UTC (permalink / raw)
  To: bfields, jlayton; +Cc: linux-nfs

When nfsdcld was released, it was quickly deprecated in favor of the
nfsdcltrack usermodehelper, so as to not require another running daemon.
That prevents NFSv4 clients from reclaiming locks from nfsd's running in
containers, since neither nfsdcltrack nor the legacy client tracking
code work in containers.

This commit un-deprecates the use of nfsdcld, with one twist: we will
populate the reclaim_str_hashtbl on startup.

During client tracking initialization, do an upcall ("GraceStart") to
nfsdcld to get a list of clients from the database.  nfsdcld will do
one downcall with a status of -EINPROGRESS for each client record in
the database, which in turn will cause an nfs4_client_reclaim to be
added to the reclaim_str_hashtbl.  When complete, nfsdcld will do a
final downcall with a status of 0.

This will save nfsd from having to do an upcall to the daemon during
nfs4_check_open_reclaim() processing.

Even though nfsdcld was quickly deprecated, there is a very small chance
of old nfsdcld daemons running in the wild.  These will respond to the
new "GraceStart" upcall with -EOPNOTSUPP, in which case we will log a
message and fall back to the original nfsdcld tracking ops (now called
nfsd4_cld_tracking_ops_v0).

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
 fs/nfsd/nfs4recover.c         | 225 ++++++++++++++++++++++++++++++++--
 include/uapi/linux/nfsd/cld.h |   1 +
 2 files changed, 218 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 2243b909b407..89c2a27956d0 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -779,6 +779,34 @@ cld_pipe_upcall(struct rpc_pipe *pipe, struct cld_msg *cmsg)
 	return ret;
 }
 
+static ssize_t
+__cld_pipe_inprogress_downcall(const struct cld_msg __user *cmsg,
+		struct nfsd_net *nn)
+{
+	uint8_t cmd;
+	struct xdr_netobj name;
+	uint16_t namelen;
+
+	if (get_user(cmd, &cmsg->cm_cmd)) {
+		dprintk("%s: error when copying cmd from userspace", __func__);
+		return -EFAULT;
+	}
+	if (cmd == Cld_GraceStart) {
+		if (get_user(namelen, &cmsg->cm_u.cm_name.cn_len))
+			return -EFAULT;
+		name.data = memdup_user(&cmsg->cm_u.cm_name.cn_id, namelen);
+		if (IS_ERR_OR_NULL(name.data))
+			return -EFAULT;
+		name.len = namelen;
+		if (!nfs4_client_to_reclaim(name, nn)) {
+			kfree(name.data);
+			return -EFAULT;
+		}
+		return sizeof(*cmsg);
+	}
+	return -EFAULT;
+}
+
 static ssize_t
 cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
 {
@@ -788,6 +816,7 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
 	struct nfsd_net *nn = net_generic(file_inode(filp)->i_sb->s_fs_info,
 						nfsd_net_id);
 	struct cld_net *cn = nn->cld_net;
+	int16_t status;
 
 	if (mlen != sizeof(*cmsg)) {
 		dprintk("%s: got %zu bytes, expected %zu\n", __func__, mlen,
@@ -801,13 +830,24 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
 		return -EFAULT;
 	}
 
+	/*
+	 * copy the status so we know whether to remove the upcall from the
+	 * list (for -EINPROGRESS, we just want to make sure the xid is
+	 * valid, not remove the upcall from the list)
+	 */
+	if (get_user(status, &cmsg->cm_status)) {
+		dprintk("%s: error when copying status from userspace", __func__);
+		return -EFAULT;
+	}
+
 	/* walk the list and find corresponding xid */
 	cup = NULL;
 	spin_lock(&cn->cn_lock);
 	list_for_each_entry(tmp, &cn->cn_list, cu_list) {
 		if (get_unaligned(&tmp->cu_msg.cm_xid) == xid) {
 			cup = tmp;
-			list_del_init(&cup->cu_list);
+			if (status != -EINPROGRESS)
+				list_del_init(&cup->cu_list);
 			break;
 		}
 	}
@@ -819,6 +859,9 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
 		return -EINVAL;
 	}
 
+	if (status == -EINPROGRESS)
+		return __cld_pipe_inprogress_downcall(cmsg, nn);
+
 	if (copy_from_user(&cup->cu_msg, src, mlen) != 0)
 		return -EFAULT;
 
@@ -1065,9 +1108,14 @@ nfsd4_cld_remove(struct nfs4_client *clp)
 				"record from stable storage: %d\n", ret);
 }
 
-/* Check for presence of a record, and update its timestamp */
+/*
+ * For older nfsdcld's that do not allow us to "slurp" the clients
+ * from the tracking database during startup.
+ *
+ * Check for presence of a record, and update its timestamp
+ */
 static int
-nfsd4_cld_check(struct nfs4_client *clp)
+nfsd4_cld_check_v0(struct nfs4_client *clp)
 {
 	int ret;
 	struct cld_upcall *cup;
@@ -1100,8 +1148,61 @@ nfsd4_cld_check(struct nfs4_client *clp)
 	return ret;
 }
 
+/*
+ * For newer nfsdcld's that allow us to "slurp" the clients
+ * from the tracking database during startup.
+ *
+ * Check for presence of a record in the reclaim_str_hashtbl
+ */
+static int
+nfsd4_cld_check(struct nfs4_client *clp)
+{
+	struct nfs4_client_reclaim *crp;
+	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+
+	/* did we already find that this client is stable? */
+	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
+		return 0;
+
+	/* look for it in the reclaim hashtable otherwise */
+	crp = nfsd4_find_reclaim_client(clp->cl_name, nn);
+	if (crp) {
+		crp->cr_clp = clp;
+		return 0;
+	}
+
+	return -ENOENT;
+}
+
+static int
+nfsd4_cld_grace_start(struct nfsd_net *nn)
+{
+	int ret;
+	struct cld_upcall *cup;
+	struct cld_net *cn = nn->cld_net;
+
+	cup = alloc_cld_upcall(cn);
+	if (!cup) {
+		ret = -ENOMEM;
+		goto out_err;
+	}
+
+	cup->cu_msg.cm_cmd = Cld_GraceStart;
+	ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
+	if (!ret)
+		ret = cup->cu_msg.cm_status;
+
+	free_cld_upcall(cup);
+out_err:
+	if (ret)
+		dprintk("%s: Unable to get clients from userspace: %d\n",
+			__func__, ret);
+	return ret;
+}
+
+/* For older nfsdcld's that need cm_gracetime */
 static void
-nfsd4_cld_grace_done(struct nfsd_net *nn)
+nfsd4_cld_grace_done_v0(struct nfsd_net *nn)
 {
 	int ret;
 	struct cld_upcall *cup;
@@ -1125,11 +1226,118 @@ nfsd4_cld_grace_done(struct nfsd_net *nn)
 		printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
 }
 
-static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
+/*
+ * For newer nfsdcld's that do not need cm_gracetime.  We also need to call
+ * nfs4_release_reclaim() to clear out the reclaim_str_hashtbl.
+ */
+static void
+nfsd4_cld_grace_done(struct nfsd_net *nn)
+{
+	int ret;
+	struct cld_upcall *cup;
+	struct cld_net *cn = nn->cld_net;
+
+	cup = alloc_cld_upcall(cn);
+	if (!cup) {
+		ret = -ENOMEM;
+		goto out_err;
+	}
+
+	cup->cu_msg.cm_cmd = Cld_GraceDone;
+	ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
+	if (!ret)
+		ret = cup->cu_msg.cm_status;
+
+	free_cld_upcall(cup);
+out_err:
+	nfs4_release_reclaim(nn);
+	if (ret)
+		printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
+}
+
+static int
+nfs4_cld_state_init(struct net *net)
+{
+	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+	int i;
+
+	nn->reclaim_str_hashtbl = kmalloc_array(CLIENT_HASH_SIZE,
+						sizeof(struct list_head),
+						GFP_KERNEL);
+	if (!nn->reclaim_str_hashtbl)
+		return -ENOMEM;
+
+	for (i = 0; i < CLIENT_HASH_SIZE; i++)
+		INIT_LIST_HEAD(&nn->reclaim_str_hashtbl[i]);
+	nn->reclaim_str_hashtbl_size = 0;
+
+	return 0;
+}
+
+static void
+nfs4_cld_state_shutdown(struct net *net)
+{
+	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+
+	kfree(nn->reclaim_str_hashtbl);
+}
+
+static int
+nfsd4_cld_tracking_init(struct net *net)
+{
+	int status;
+	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+
+	status = nfs4_cld_state_init(net);
+	if (status)
+		return status;
+
+	status = nfsd4_init_cld_pipe(net);
+	if (status)
+		goto err_shutdown;
+
+	status = nfsd4_cld_grace_start(nn);
+	if (status) {
+		if (status == -EOPNOTSUPP)
+			printk(KERN_WARNING "NFSD: Please upgrade nfsdcld.\n");
+		nfs4_release_reclaim(nn);
+		goto err_remove;
+	}
+	return 0;
+
+err_remove:
+	nfsd4_remove_cld_pipe(net);
+err_shutdown:
+	nfs4_cld_state_shutdown(net);
+	return status;
+}
+
+static void
+nfsd4_cld_tracking_exit(struct net *net)
+{
+	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+
+	nfs4_release_reclaim(nn);
+	nfsd4_remove_cld_pipe(net);
+	nfs4_cld_state_shutdown(net);
+}
+
+/* For older nfsdcld's */
+static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops_v0 = {
 	.init		= nfsd4_init_cld_pipe,
 	.exit		= nfsd4_remove_cld_pipe,
 	.create		= nfsd4_cld_create,
 	.remove		= nfsd4_cld_remove,
+	.check		= nfsd4_cld_check_v0,
+	.grace_done	= nfsd4_cld_grace_done_v0,
+};
+
+/* For newer nfsdcld's */
+static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
+	.init		= nfsd4_cld_tracking_init,
+	.exit		= nfsd4_cld_tracking_exit,
+	.create		= nfsd4_cld_create,
+	.remove		= nfsd4_cld_remove,
 	.check		= nfsd4_cld_check,
 	.grace_done	= nfsd4_cld_grace_done,
 };
@@ -1514,9 +1722,10 @@ nfsd4_client_tracking_init(struct net *net)
 
 	/* Finally, try to use nfsdcld */
 	nn->client_tracking_ops = &nfsd4_cld_tracking_ops;
-	printk(KERN_WARNING "NFSD: the nfsdcld client tracking upcall will be "
-			"removed in 3.10. Please transition to using "
-			"nfsdcltrack.\n");
+	status = nn->client_tracking_ops->init(net);
+	if (!status)
+		return status;
+	nn->client_tracking_ops = &nfsd4_cld_tracking_ops_v0;
 do_init:
 	status = nn->client_tracking_ops->init(net);
 	if (status) {
diff --git a/include/uapi/linux/nfsd/cld.h b/include/uapi/linux/nfsd/cld.h
index f8f5cccad749..b1e9de4f07d5 100644
--- a/include/uapi/linux/nfsd/cld.h
+++ b/include/uapi/linux/nfsd/cld.h
@@ -36,6 +36,7 @@ enum cld_command {
 	Cld_Remove,		/* remove record of this cm_id */
 	Cld_Check,		/* is this cm_id allowed? */
 	Cld_GraceDone,		/* grace period is complete */
+	Cld_GraceStart,
 };
 
 /* representation of long-form NFSv4 client ID */
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH RFC v3 3/5] nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
  2019-03-26 22:06 [PATCH RFC v3 0/5] un-deprecate nfsdcld Scott Mayhew
  2019-03-26 22:06 ` [PATCH RFC v3 1/5] nfsd: make nfs4_client_reclaim use an xdr_netobj instead of a fixed char array Scott Mayhew
  2019-03-26 22:06 ` [PATCH RFC v3 2/5] nfsd: un-deprecate nfsdcld Scott Mayhew
@ 2019-03-26 22:06 ` Scott Mayhew
  2019-03-26 22:06 ` [PATCH RFC v3 4/5] nfsd: re-order client tracking method selection Scott Mayhew
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Scott Mayhew @ 2019-03-26 22:06 UTC (permalink / raw)
  To: bfields, jlayton; +Cc: linux-nfs

When using nfsdcld for NFSv4 client tracking, track the number of
RECLAIM_COMPLETE operations we receive from "known" clients to help in
deciding if we can lift the grace period early (or whether we need to
start a v4 grace period at all).

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
 fs/nfsd/netns.h       |  3 +++
 fs/nfsd/nfs4recover.c |  3 +++
 fs/nfsd/nfs4state.c   | 33 ++++++++++++++++++++++++++++++++-
 fs/nfsd/nfsctl.c      |  1 +
 4 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 32cb8c027483..b42aaa22fba2 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -104,6 +104,9 @@ struct nfsd_net {
 	time_t nfsd4_grace;
 	bool somebody_reclaimed;
 
+	bool track_reclaim_completes;
+	atomic_t nr_reclaim_complete;
+
 	bool nfsd_net_up;
 	bool lockd_up;
 
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 89c2a27956d0..09e3c9352778 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -1270,6 +1270,8 @@ nfs4_cld_state_init(struct net *net)
 	for (i = 0; i < CLIENT_HASH_SIZE; i++)
 		INIT_LIST_HEAD(&nn->reclaim_str_hashtbl[i]);
 	nn->reclaim_str_hashtbl_size = 0;
+	nn->track_reclaim_completes = true;
+	atomic_set(&nn->nr_reclaim_complete, 0);
 
 	return 0;
 }
@@ -1279,6 +1281,7 @@ nfs4_cld_state_shutdown(struct net *net)
 {
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
+	nn->track_reclaim_completes = false;
 	kfree(nn->reclaim_str_hashtbl);
 }
 
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 898f1c8c66ca..4b96d41ab695 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -77,6 +77,7 @@ static u64 current_sessionid = 1;
 /* forward declarations */
 static bool check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner);
 static void nfs4_free_ol_stateid(struct nfs4_stid *stid);
+void nfsd4_end_grace(struct nfsd_net *nn);
 
 /* Locking: */
 
@@ -1988,6 +1989,22 @@ destroy_client(struct nfs4_client *clp)
 	__destroy_client(clp);
 }
 
+static void inc_reclaim_complete(struct nfs4_client *clp)
+{
+	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+
+	if (!nn->track_reclaim_completes)
+		return;
+	if (!nfsd4_find_reclaim_client(clp->cl_name, nn))
+		return;
+	if (atomic_inc_return(&nn->nr_reclaim_complete) ==
+			nn->reclaim_str_hashtbl_size) {
+		printk(KERN_INFO "NFSD: all clients done reclaiming, ending NFSv4 grace period (net %x)\n",
+				clp->net->ns.inum);
+		nfsd4_end_grace(nn);
+	}
+}
+
 static void expire_client(struct nfs4_client *clp)
 {
 	unhash_client(clp);
@@ -3339,6 +3356,7 @@ nfsd4_reclaim_complete(struct svc_rqst *rqstp,
 
 	status = nfs_ok;
 	nfsd4_client_record_create(cstate->session->se_client);
+	inc_reclaim_complete(cstate->session->se_client);
 out:
 	return status;
 }
@@ -4698,7 +4716,6 @@ nfsd4_end_grace(struct nfsd_net *nn)
 	if (nn->grace_ended)
 		return;
 
-	dprintk("NFSD: end of grace period\n");
 	nn->grace_ended = true;
 	/*
 	 * If the server goes down again right now, an NFSv4
@@ -4734,6 +4751,10 @@ static bool clients_still_reclaiming(struct nfsd_net *nn)
 	unsigned long double_grace_period_end = nn->boot_time +
 						2 * nn->nfsd4_lease;
 
+	if (nn->track_reclaim_completes &&
+			atomic_read(&nn->nr_reclaim_complete) ==
+			nn->reclaim_str_hashtbl_size)
+		return false;
 	if (!nn->somebody_reclaimed)
 		return false;
 	nn->somebody_reclaimed = false;
@@ -4764,6 +4785,7 @@ nfs4_laundromat(struct nfsd_net *nn)
 		new_timeo = 0;
 		goto out;
 	}
+	dprintk("NFSD: end of grace period\n");
 	nfsd4_end_grace(nn);
 	INIT_LIST_HEAD(&reaplist);
 	spin_lock(&nn->client_lock);
@@ -7253,10 +7275,19 @@ nfs4_state_start_net(struct net *net)
 		return ret;
 	locks_start_grace(net, &nn->nfsd4_manager);
 	nfsd4_client_tracking_init(net);
+	if (nn->track_reclaim_completes && nn->reclaim_str_hashtbl_size == 0)
+		goto skip_grace;
 	printk(KERN_INFO "NFSD: starting %ld-second grace period (net %x)\n",
 	       nn->nfsd4_grace, net->ns.inum);
 	queue_delayed_work(laundry_wq, &nn->laundromat_work, nn->nfsd4_grace * HZ);
 	return 0;
+
+skip_grace:
+	printk(KERN_INFO "NFSD: no clients to reclaim, skipping NFSv4 grace period (net %x)\n",
+			net->ns.inum);
+	queue_delayed_work(laundry_wq, &nn->laundromat_work, nn->nfsd4_lease * HZ);
+	nfsd4_end_grace(nn);
+	return 0;
 }
 
 /* initialization to perform when the nfsd service is started: */
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index f2feb2d11bae..6667d3a4f839 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1242,6 +1242,7 @@ static __net_init int nfsd_init_net(struct net *net)
 	nn->nfsd4_lease = 90;	/* default lease time */
 	nn->nfsd4_grace = 90;
 	nn->somebody_reclaimed = false;
+	nn->track_reclaim_completes = false;
 	nn->clverifier_counter = prandom_u32();
 	nn->clientid_counter = prandom_u32();
 	nn->s2s_cp_cl_id = nn->clientid_counter++;
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH RFC v3 4/5] nfsd: re-order client tracking method selection
  2019-03-26 22:06 [PATCH RFC v3 0/5] un-deprecate nfsdcld Scott Mayhew
                   ` (2 preceding siblings ...)
  2019-03-26 22:06 ` [PATCH RFC v3 3/5] nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld Scott Mayhew
@ 2019-03-26 22:06 ` Scott Mayhew
  2019-04-05 20:40   ` J. Bruce Fields
  2019-03-26 22:06 ` [PATCH RFC v3 5/5] nfsd: handle legacy client tracking records sent by nfsdcld Scott Mayhew
  2019-03-28 21:17 ` [PATCH RFC v3 0/5] un-deprecate nfsdcld J. Bruce Fields
  5 siblings, 1 reply; 13+ messages in thread
From: Scott Mayhew @ 2019-03-26 22:06 UTC (permalink / raw)
  To: bfields, jlayton; +Cc: linux-nfs

The new order is first nfsdcld, then the UMH upcall, and finally the
legacy tracking method.  Added some printk's to the tracking
initialization functions so it's clear which tracking method was
ultimately selected.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
 fs/nfsd/nfs4recover.c | 80 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 65 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 09e3c9352778..d60001da1dbf 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -628,6 +628,7 @@ nfsd4_legacy_tracking_init(struct net *net)
 	status = nfsd4_load_reboot_recovery_data(net);
 	if (status)
 		goto err;
+	printk("NFSD: Using legacy client tracking operations.\n");
 	return 0;
 
 err:
@@ -937,7 +938,7 @@ nfsd4_cld_unregister_net(struct net *net, struct rpc_pipe *pipe)
 
 /* Initialize rpc_pipefs pipe for communication with client tracking daemon */
 static int
-nfsd4_init_cld_pipe(struct net *net)
+__nfsd4_init_cld_pipe(struct net *net)
 {
 	int ret;
 	struct dentry *dentry;
@@ -980,6 +981,17 @@ nfsd4_init_cld_pipe(struct net *net)
 	return ret;
 }
 
+static int
+nfsd4_init_cld_pipe(struct net *net)
+{
+	int status;
+
+	status = __nfsd4_init_cld_pipe(net);
+	if (!status)
+		printk("NFSD: Using old nfsdcld client tracking operations.\n");
+	return status;
+}
+
 static void
 nfsd4_remove_cld_pipe(struct net *net)
 {
@@ -1285,27 +1297,55 @@ nfs4_cld_state_shutdown(struct net *net)
 	kfree(nn->reclaim_str_hashtbl);
 }
 
+static bool
+cld_running(struct nfsd_net *nn)
+{
+	struct cld_net *cn = nn->cld_net;
+	struct rpc_pipe *pipe = cn->cn_pipe;
+
+	return pipe->nreaders || pipe->nwriters;
+}
+
 static int
 nfsd4_cld_tracking_init(struct net *net)
 {
 	int status;
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+	bool running;
+	int retries = 10;
 
 	status = nfs4_cld_state_init(net);
 	if (status)
 		return status;
 
-	status = nfsd4_init_cld_pipe(net);
+	status = __nfsd4_init_cld_pipe(net);
 	if (status)
 		goto err_shutdown;
 
+	/*
+	 * rpc pipe upcalls take 30 seconds to time out, so we don't want to
+	 * queue an upcall unless we know that nfsdcld is running (because we
+	 * want this to fail fast so that nfsd4_client_tracking_init() can try
+	 * the next client tracking method).  nfsdcld should already be running
+	 * before nfsd is started, so the wait here is for nfsdcld to open the
+	 * pipefs file we just created.
+	 */
+	while (!(running = cld_running(nn)) && retries--)
+		msleep(100);
+
+	if (!running) {
+		status = -ETIMEDOUT;
+		goto err_remove;
+	}
+
 	status = nfsd4_cld_grace_start(nn);
 	if (status) {
 		if (status == -EOPNOTSUPP)
 			printk(KERN_WARNING "NFSD: Please upgrade nfsdcld.\n");
 		nfs4_release_reclaim(nn);
 		goto err_remove;
-	}
+	} else
+		printk("NFSD: Using nfsdcld client tracking operations.\n");
 	return 0;
 
 err_remove:
@@ -1552,6 +1592,8 @@ nfsd4_umh_cltrack_init(struct net *net)
 
 	ret = nfsd4_umh_cltrack_upcall("init", NULL, grace_start, NULL);
 	kfree(grace_start);
+	if (!ret)
+		printk("NFSD: Using UMH upcall client tracking operations.\n");
 	return ret;
 }
 
@@ -1701,9 +1743,20 @@ nfsd4_client_tracking_init(struct net *net)
 	if (nn->client_tracking_ops)
 		goto do_init;
 
+	/* First, try to use nfsdcld */
+	nn->client_tracking_ops = &nfsd4_cld_tracking_ops;
+	status = nn->client_tracking_ops->init(net);
+	if (!status)
+		return status;
+	if (status != -ETIMEDOUT) {
+		nn->client_tracking_ops = &nfsd4_cld_tracking_ops_v0;
+		status = nn->client_tracking_ops->init(net);
+		if (!status)
+			return status;
+	}
+
 	/*
-	 * First, try a UMH upcall. It should succeed or fail quickly, so
-	 * there's little harm in trying that first.
+	 * Next, try the UMH upcall.
 	 */
 	nn->client_tracking_ops = &nfsd4_umh_tracking_ops;
 	status = nn->client_tracking_ops->init(net);
@@ -1711,26 +1764,23 @@ nfsd4_client_tracking_init(struct net *net)
 		return status;
 
 	/*
-	 * See if the recoverydir exists and is a directory. If it is,
-	 * then use the legacy ops.
+	 * Finally, See if the recoverydir exists and is a directory.
+	 * If it is, then use the legacy ops.
 	 */
 	nn->client_tracking_ops = &nfsd4_legacy_tracking_ops;
 	status = kern_path(nfs4_recoverydir(), LOOKUP_FOLLOW, &path);
 	if (!status) {
 		status = d_is_dir(path.dentry);
 		path_put(&path);
-		if (status)
-			goto do_init;
+		if (!status) {
+			status = -EINVAL;
+			goto out;
+		}
 	}
 
-	/* Finally, try to use nfsdcld */
-	nn->client_tracking_ops = &nfsd4_cld_tracking_ops;
-	status = nn->client_tracking_ops->init(net);
-	if (!status)
-		return status;
-	nn->client_tracking_ops = &nfsd4_cld_tracking_ops_v0;
 do_init:
 	status = nn->client_tracking_ops->init(net);
+out:
 	if (status) {
 		printk(KERN_WARNING "NFSD: Unable to initialize client "
 				    "recovery tracking! (%d)\n", status);
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH RFC v3 5/5] nfsd: handle legacy client tracking records sent by nfsdcld
  2019-03-26 22:06 [PATCH RFC v3 0/5] un-deprecate nfsdcld Scott Mayhew
                   ` (3 preceding siblings ...)
  2019-03-26 22:06 ` [PATCH RFC v3 4/5] nfsd: re-order client tracking method selection Scott Mayhew
@ 2019-03-26 22:06 ` Scott Mayhew
  2019-03-28 21:17 ` [PATCH RFC v3 0/5] un-deprecate nfsdcld J. Bruce Fields
  5 siblings, 0 replies; 13+ messages in thread
From: Scott Mayhew @ 2019-03-26 22:06 UTC (permalink / raw)
  To: bfields, jlayton; +Cc: linux-nfs

The new nfsdcld will do a one-time "upgrade" where it searches for
records from nfsdcltrack and the legacy tracking during startup.
Legacy records will be prefixed with the string "hash:", which we need
to strip off before adding to the reclaim_str_hashtbl.  When legacy
records are encountered, set the new cn_has_legacy flag in the cld_net.
When this flag is set, if the search for a reclaim record based on the
client name string fails, then do a second search based on the hash of
the name string.

Note that if there are legacy records then the grace period will not
be lifted early via the tracking of RECLAIM_COMPLETEs.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
 fs/nfsd/nfs4recover.c | 39 +++++++++++++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index d60001da1dbf..3a123c88120a 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -731,6 +731,7 @@ struct cld_net {
 	spinlock_t		 cn_lock;
 	struct list_head	 cn_list;
 	unsigned int		 cn_xid;
+	bool			 cn_has_legacy;
 };
 
 struct cld_upcall {
@@ -787,6 +788,7 @@ __cld_pipe_inprogress_downcall(const struct cld_msg __user *cmsg,
 	uint8_t cmd;
 	struct xdr_netobj name;
 	uint16_t namelen;
+	struct cld_net *cn = nn->cld_net;
 
 	if (get_user(cmd, &cmsg->cm_cmd)) {
 		dprintk("%s: error when copying cmd from userspace", __func__);
@@ -799,6 +801,11 @@ __cld_pipe_inprogress_downcall(const struct cld_msg __user *cmsg,
 		if (IS_ERR_OR_NULL(name.data))
 			return -EFAULT;
 		name.len = namelen;
+		if (name.len > 5 && memcmp(name.data, "hash:", 5) == 0) {
+			name.len = name.len - 5;
+			memmove(name.data, name.data + 5, name.len);
+			cn->cn_has_legacy = true;
+		}
 		if (!nfs4_client_to_reclaim(name, nn)) {
 			kfree(name.data);
 			return -EFAULT;
@@ -969,6 +976,7 @@ __nfsd4_init_cld_pipe(struct net *net)
 	}
 
 	cn->cn_pipe->dentry = dentry;
+	cn->cn_has_legacy = false;
 	nn->cld_net = cn;
 	return 0;
 
@@ -1171,6 +1179,10 @@ nfsd4_cld_check(struct nfs4_client *clp)
 {
 	struct nfs4_client_reclaim *crp;
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+	struct cld_net *cn = nn->cld_net;
+	int status;
+	char dname[HEXDIR_LEN];
+	struct xdr_netobj name;
 
 	/* did we already find that this client is stable? */
 	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
@@ -1178,12 +1190,31 @@ nfsd4_cld_check(struct nfs4_client *clp)
 
 	/* look for it in the reclaim hashtable otherwise */
 	crp = nfsd4_find_reclaim_client(clp->cl_name, nn);
-	if (crp) {
-		crp->cr_clp = clp;
-		return 0;
-	}
+	if (crp)
+		goto found;
+
+	if (cn->cn_has_legacy) {
+		status = nfs4_make_rec_clidname(dname, &clp->cl_name);
+		if (status)
+			return -ENOENT;
+
+		name.data = kmemdup(dname, HEXDIR_LEN, GFP_KERNEL);
+		if (!name.data) {
+			dprintk("%s: failed to allocate memory for name.data!\n",
+				__func__);
+			return -ENOENT;
+		}
+		name.len = HEXDIR_LEN;
+		crp = nfsd4_find_reclaim_client(name, nn);
+		kfree(name.data);
+		if (crp)
+			goto found;
 
+	}
 	return -ENOENT;
+found:
+	crp->cr_clp = clp;
+	return 0;
 }
 
 static int
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v3 0/5] un-deprecate nfsdcld
  2019-03-26 22:06 [PATCH RFC v3 0/5] un-deprecate nfsdcld Scott Mayhew
                   ` (4 preceding siblings ...)
  2019-03-26 22:06 ` [PATCH RFC v3 5/5] nfsd: handle legacy client tracking records sent by nfsdcld Scott Mayhew
@ 2019-03-28 21:17 ` J. Bruce Fields
  2019-03-29 12:16   ` Scott Mayhew
  5 siblings, 1 reply; 13+ messages in thread
From: J. Bruce Fields @ 2019-03-28 21:17 UTC (permalink / raw)
  To: Scott Mayhew; +Cc: jlayton, linux-nfs

Thanks!  It looks good to me on a first skim, but I probably won't get a
proper look at it till next week.

You say "RFC"--are there still known issue that you're working on?

--b.

On Tue, Mar 26, 2019 at 06:06:25PM -0400, Scott Mayhew wrote:
> When nfsdcld was released, it was quickly deprecated in favor of the
> nfsdcltrack usermodehelper, so as to not require another running daemon.
> That prevents NFSv4 clients from reclaiming locks from nfsd's running in
> containers, since neither nfsdcltrack nor the legacy client tracking
> code work in containers.  These patches un-deprecate the use of nfsdcld
> for NFSv4 client tracking.
> 
> These patches are intended to go alongside some nfs-utils patches that
> introduce an enhancement that allows nfsd to "slurp" up the client
> records during client tracking initialization and store them internally
> in hash table.  This enables nfsd to check whether an NFSv4 client is
> allowed to reclaim without having to do an upcall to nfsdcld.  It also
> allows nfsd to decide to end the v4 grace period early if the number of
> RECLAIM_COMPLETE operations it has received from "known" clients is
> equal to the number of entries in the hash table.  It also allows nfsd
> to skip the v4 grace period altogether if it knows there are no clients
> allowed to reclaim.
> 
> There is a fallback to allow nfsd to continue to work with older nfsdcld
> daemons in the event that any are out in the wild (unlikely).
> Everything should work fine except nfsd will not be able to exit the
> grace period early or skip the grace period altogether.
> 
> v3:
> - nfs4_state_start_net() now calls nfsd4_end_grace() instead of
>   open-coding it
> - Removed some unnecessary initializations of nr_reclaim_complete
> - Removed dec_reclaim_complete() altogether.  If we're calling
>   expire_client() as a result of receiving a DESTROY_CLIENTID or
>   SETCLIENTID_CONFIRM then we wouldn't want to decrement the count.  And
>   laundromat thread never expires clients due to loss of lease until the
>   grace period is over (the laundromat gets rescheduled by 1 second as
>   long as clients are reclaiming during that 1 second, for up to 2x the
>   original grace period in total).  So dec_reclaim_complete() is
>   unnecessary.
> - Changed the preference order of client tracking methods.  The new
>   order is 1) new nfdscld, 2) old nfsdcld, 3) nfsdcltrack, 4) legacy
>   v4recovery directory.
> - Added some special handling of legacy records sent by nfsdcld in the
>   GraceStart downcalls.  nfsdcld will do a one-time "upgrade" from old
>   tracking methods.  No changes are needed to accomodate nfsdcltrack
>   records, but legacy records will be prefixed with the string "hash:", 
>   and in the event that we do have legacy records in the
>   reclaim_str_hashtbl we may need to do a second lookup using the hash
>   in the event that a lookup using the client name string fails.
> 
> v2:
> - Addressed some coding style issues in nfsd4_create_clid_dir() &
>   nfsd4_remove_clid_dir()
> 
> Scott Mayhew (5):
>   nfsd: make nfs4_client_reclaim use an xdr_netobj instead of a fixed
>     char array
>   nfsd: un-deprecate nfsdcld
>   nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
>   nfsd: re-order client tracking method selection
>   nfsd: handle legacy client tracking records sent by nfsdcld
> 
>  fs/nfsd/netns.h               |   3 +
>  fs/nfsd/nfs4recover.c         | 436 +++++++++++++++++++++++++++++++---
>  fs/nfsd/nfs4state.c           |  63 +++--
>  fs/nfsd/nfsctl.c              |   1 +
>  fs/nfsd/state.h               |   8 +-
>  include/uapi/linux/nfsd/cld.h |   1 +
>  6 files changed, 457 insertions(+), 55 deletions(-)
> 
> -- 
> 2.17.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v3 0/5] un-deprecate nfsdcld
  2019-03-28 21:17 ` [PATCH RFC v3 0/5] un-deprecate nfsdcld J. Bruce Fields
@ 2019-03-29 12:16   ` Scott Mayhew
  2019-04-05 20:43     ` J. Bruce Fields
  0 siblings, 1 reply; 13+ messages in thread
From: Scott Mayhew @ 2019-03-29 12:16 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: jlayton, linux-nfs

On Thu, 28 Mar 2019, J. Bruce Fields wrote:

> Thanks!  It looks good to me on a first skim, but I probably won't get a
> proper look at it till next week.
> 
> You say "RFC"--are there still known issue that you're working on?

No, I just added RFC because of the new stuff I added for handling the
legacy records.  Plus I wasn't sure if you would have a problem with the
server not lifting the grace period early if there were legacy records.
I could do it if I moved some code out of nfs4recover.c but I chose not
to since it should really be a one-time thing and the original legacy
tracking didn't lift the grace period early anyway.

-Scott
> 
> --b.
> 
> On Tue, Mar 26, 2019 at 06:06:25PM -0400, Scott Mayhew wrote:
> > When nfsdcld was released, it was quickly deprecated in favor of the
> > nfsdcltrack usermodehelper, so as to not require another running daemon.
> > That prevents NFSv4 clients from reclaiming locks from nfsd's running in
> > containers, since neither nfsdcltrack nor the legacy client tracking
> > code work in containers.  These patches un-deprecate the use of nfsdcld
> > for NFSv4 client tracking.
> > 
> > These patches are intended to go alongside some nfs-utils patches that
> > introduce an enhancement that allows nfsd to "slurp" up the client
> > records during client tracking initialization and store them internally
> > in hash table.  This enables nfsd to check whether an NFSv4 client is
> > allowed to reclaim without having to do an upcall to nfsdcld.  It also
> > allows nfsd to decide to end the v4 grace period early if the number of
> > RECLAIM_COMPLETE operations it has received from "known" clients is
> > equal to the number of entries in the hash table.  It also allows nfsd
> > to skip the v4 grace period altogether if it knows there are no clients
> > allowed to reclaim.
> > 
> > There is a fallback to allow nfsd to continue to work with older nfsdcld
> > daemons in the event that any are out in the wild (unlikely).
> > Everything should work fine except nfsd will not be able to exit the
> > grace period early or skip the grace period altogether.
> > 
> > v3:
> > - nfs4_state_start_net() now calls nfsd4_end_grace() instead of
> >   open-coding it
> > - Removed some unnecessary initializations of nr_reclaim_complete
> > - Removed dec_reclaim_complete() altogether.  If we're calling
> >   expire_client() as a result of receiving a DESTROY_CLIENTID or
> >   SETCLIENTID_CONFIRM then we wouldn't want to decrement the count.  And
> >   laundromat thread never expires clients due to loss of lease until the
> >   grace period is over (the laundromat gets rescheduled by 1 second as
> >   long as clients are reclaiming during that 1 second, for up to 2x the
> >   original grace period in total).  So dec_reclaim_complete() is
> >   unnecessary.
> > - Changed the preference order of client tracking methods.  The new
> >   order is 1) new nfdscld, 2) old nfsdcld, 3) nfsdcltrack, 4) legacy
> >   v4recovery directory.
> > - Added some special handling of legacy records sent by nfsdcld in the
> >   GraceStart downcalls.  nfsdcld will do a one-time "upgrade" from old
> >   tracking methods.  No changes are needed to accomodate nfsdcltrack
> >   records, but legacy records will be prefixed with the string "hash:", 
> >   and in the event that we do have legacy records in the
> >   reclaim_str_hashtbl we may need to do a second lookup using the hash
> >   in the event that a lookup using the client name string fails.
> > 
> > v2:
> > - Addressed some coding style issues in nfsd4_create_clid_dir() &
> >   nfsd4_remove_clid_dir()
> > 
> > Scott Mayhew (5):
> >   nfsd: make nfs4_client_reclaim use an xdr_netobj instead of a fixed
> >     char array
> >   nfsd: un-deprecate nfsdcld
> >   nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
> >   nfsd: re-order client tracking method selection
> >   nfsd: handle legacy client tracking records sent by nfsdcld
> > 
> >  fs/nfsd/netns.h               |   3 +
> >  fs/nfsd/nfs4recover.c         | 436 +++++++++++++++++++++++++++++++---
> >  fs/nfsd/nfs4state.c           |  63 +++--
> >  fs/nfsd/nfsctl.c              |   1 +
> >  fs/nfsd/state.h               |   8 +-
> >  include/uapi/linux/nfsd/cld.h |   1 +
> >  6 files changed, 457 insertions(+), 55 deletions(-)
> > 
> > -- 
> > 2.17.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v3 2/5] nfsd: un-deprecate nfsdcld
  2019-03-26 22:06 ` [PATCH RFC v3 2/5] nfsd: un-deprecate nfsdcld Scott Mayhew
@ 2019-04-05 18:54   ` J. Bruce Fields
  2019-04-05 19:26     ` Scott Mayhew
  0 siblings, 1 reply; 13+ messages in thread
From: J. Bruce Fields @ 2019-04-05 18:54 UTC (permalink / raw)
  To: Scott Mayhew; +Cc: jlayton, linux-nfs

On Tue, Mar 26, 2019 at 06:06:27PM -0400, Scott Mayhew wrote:
> When nfsdcld was released, it was quickly deprecated in favor of the
> nfsdcltrack usermodehelper, so as to not require another running daemon.
> That prevents NFSv4 clients from reclaiming locks from nfsd's running in
> containers, since neither nfsdcltrack nor the legacy client tracking
> code work in containers.
> 
> This commit un-deprecates the use of nfsdcld, with one twist: we will
> populate the reclaim_str_hashtbl on startup.
> 
> During client tracking initialization, do an upcall ("GraceStart") to
> nfsdcld to get a list of clients from the database.  nfsdcld will do
> one downcall with a status of -EINPROGRESS for each client record in
> the database, which in turn will cause an nfs4_client_reclaim to be
> added to the reclaim_str_hashtbl.  When complete, nfsdcld will do a
> final downcall with a status of 0.
> 
> This will save nfsd from having to do an upcall to the daemon during
> nfs4_check_open_reclaim() processing.
> 
> Even though nfsdcld was quickly deprecated, there is a very small chance
> of old nfsdcld daemons running in the wild.  These will respond to the
> new "GraceStart" upcall with -EOPNOTSUPP, in which case we will log a
> message and fall back to the original nfsdcld tracking ops (now called
> nfsd4_cld_tracking_ops_v0).
> 
> Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> ---
>  fs/nfsd/nfs4recover.c         | 225 ++++++++++++++++++++++++++++++++--
>  include/uapi/linux/nfsd/cld.h |   1 +
>  2 files changed, 218 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> index 2243b909b407..89c2a27956d0 100644
> --- a/fs/nfsd/nfs4recover.c
> +++ b/fs/nfsd/nfs4recover.c
> @@ -779,6 +779,34 @@ cld_pipe_upcall(struct rpc_pipe *pipe, struct cld_msg *cmsg)
>  	return ret;
>  }
>  
> +static ssize_t
> +__cld_pipe_inprogress_downcall(const struct cld_msg __user *cmsg,
> +		struct nfsd_net *nn)
> +{
> +	uint8_t cmd;
> +	struct xdr_netobj name;
> +	uint16_t namelen;
> +
> +	if (get_user(cmd, &cmsg->cm_cmd)) {
> +		dprintk("%s: error when copying cmd from userspace", __func__);
> +		return -EFAULT;
> +	}
> +	if (cmd == Cld_GraceStart) {
> +		if (get_user(namelen, &cmsg->cm_u.cm_name.cn_len))
> +			return -EFAULT;
> +		name.data = memdup_user(&cmsg->cm_u.cm_name.cn_id, namelen);
> +		if (IS_ERR_OR_NULL(name.data))
> +			return -EFAULT;
> +		name.len = namelen;
> +		if (!nfs4_client_to_reclaim(name, nn)) {
> +			kfree(name.data);
> +			return -EFAULT;
> +		}
> +		return sizeof(*cmsg);
> +	}
> +	return -EFAULT;
> +}
> +
>  static ssize_t
>  cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
>  {
> @@ -788,6 +816,7 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
>  	struct nfsd_net *nn = net_generic(file_inode(filp)->i_sb->s_fs_info,
>  						nfsd_net_id);
>  	struct cld_net *cn = nn->cld_net;
> +	int16_t status;
>  
>  	if (mlen != sizeof(*cmsg)) {
>  		dprintk("%s: got %zu bytes, expected %zu\n", __func__, mlen,
> @@ -801,13 +830,24 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
>  		return -EFAULT;
>  	}
>  
> +	/*
> +	 * copy the status so we know whether to remove the upcall from the
> +	 * list (for -EINPROGRESS, we just want to make sure the xid is
> +	 * valid, not remove the upcall from the list)
> +	 */
> +	if (get_user(status, &cmsg->cm_status)) {
> +		dprintk("%s: error when copying status from userspace", __func__);
> +		return -EFAULT;
> +	}
> +
>  	/* walk the list and find corresponding xid */
>  	cup = NULL;
>  	spin_lock(&cn->cn_lock);
>  	list_for_each_entry(tmp, &cn->cn_list, cu_list) {
>  		if (get_unaligned(&tmp->cu_msg.cm_xid) == xid) {
>  			cup = tmp;
> -			list_del_init(&cup->cu_list);
> +			if (status != -EINPROGRESS)
> +				list_del_init(&cup->cu_list);
>  			break;
>  		}
>  	}
> @@ -819,6 +859,9 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
>  		return -EINVAL;
>  	}
>  
> +	if (status == -EINPROGRESS)
> +		return __cld_pipe_inprogress_downcall(cmsg, nn);
> +
>  	if (copy_from_user(&cup->cu_msg, src, mlen) != 0)
>  		return -EFAULT;
>  
> @@ -1065,9 +1108,14 @@ nfsd4_cld_remove(struct nfs4_client *clp)
>  				"record from stable storage: %d\n", ret);
>  }
>  
> -/* Check for presence of a record, and update its timestamp */
> +/*
> + * For older nfsdcld's that do not allow us to "slurp" the clients
> + * from the tracking database during startup.
> + *
> + * Check for presence of a record, and update its timestamp
> + */
>  static int
> -nfsd4_cld_check(struct nfs4_client *clp)
> +nfsd4_cld_check_v0(struct nfs4_client *clp)
>  {
>  	int ret;
>  	struct cld_upcall *cup;
> @@ -1100,8 +1148,61 @@ nfsd4_cld_check(struct nfs4_client *clp)
>  	return ret;
>  }
>  
> +/*
> + * For newer nfsdcld's that allow us to "slurp" the clients
> + * from the tracking database during startup.
> + *
> + * Check for presence of a record in the reclaim_str_hashtbl
> + */
> +static int
> +nfsd4_cld_check(struct nfs4_client *clp)
> +{
> +	struct nfs4_client_reclaim *crp;
> +	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
> +
> +	/* did we already find that this client is stable? */
> +	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
> +		return 0;
> +
> +	/* look for it in the reclaim hashtable otherwise */
> +	crp = nfsd4_find_reclaim_client(clp->cl_name, nn);
> +	if (crp) {
> +		crp->cr_clp = clp;
> +		return 0;
> +	}
> +
> +	return -ENOENT;

Does this need to set NFSD4_CLIENT_STABLE as well, like
nfsd4_cld_check_v0 does?

--b.

> +}
> +
> +static int
> +nfsd4_cld_grace_start(struct nfsd_net *nn)
> +{
> +	int ret;
> +	struct cld_upcall *cup;
> +	struct cld_net *cn = nn->cld_net;
> +
> +	cup = alloc_cld_upcall(cn);
> +	if (!cup) {
> +		ret = -ENOMEM;
> +		goto out_err;
> +	}
> +
> +	cup->cu_msg.cm_cmd = Cld_GraceStart;
> +	ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
> +	if (!ret)
> +		ret = cup->cu_msg.cm_status;
> +
> +	free_cld_upcall(cup);
> +out_err:
> +	if (ret)
> +		dprintk("%s: Unable to get clients from userspace: %d\n",
> +			__func__, ret);
> +	return ret;
> +}
> +
> +/* For older nfsdcld's that need cm_gracetime */
>  static void
> -nfsd4_cld_grace_done(struct nfsd_net *nn)
> +nfsd4_cld_grace_done_v0(struct nfsd_net *nn)
>  {
>  	int ret;
>  	struct cld_upcall *cup;
> @@ -1125,11 +1226,118 @@ nfsd4_cld_grace_done(struct nfsd_net *nn)
>  		printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
>  }
>  
> -static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
> +/*
> + * For newer nfsdcld's that do not need cm_gracetime.  We also need to call
> + * nfs4_release_reclaim() to clear out the reclaim_str_hashtbl.
> + */
> +static void
> +nfsd4_cld_grace_done(struct nfsd_net *nn)
> +{
> +	int ret;
> +	struct cld_upcall *cup;
> +	struct cld_net *cn = nn->cld_net;
> +
> +	cup = alloc_cld_upcall(cn);
> +	if (!cup) {
> +		ret = -ENOMEM;
> +		goto out_err;
> +	}
> +
> +	cup->cu_msg.cm_cmd = Cld_GraceDone;
> +	ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
> +	if (!ret)
> +		ret = cup->cu_msg.cm_status;
> +
> +	free_cld_upcall(cup);
> +out_err:
> +	nfs4_release_reclaim(nn);
> +	if (ret)
> +		printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
> +}
> +
> +static int
> +nfs4_cld_state_init(struct net *net)
> +{
> +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> +	int i;
> +
> +	nn->reclaim_str_hashtbl = kmalloc_array(CLIENT_HASH_SIZE,
> +						sizeof(struct list_head),
> +						GFP_KERNEL);
> +	if (!nn->reclaim_str_hashtbl)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < CLIENT_HASH_SIZE; i++)
> +		INIT_LIST_HEAD(&nn->reclaim_str_hashtbl[i]);
> +	nn->reclaim_str_hashtbl_size = 0;
> +
> +	return 0;
> +}
> +
> +static void
> +nfs4_cld_state_shutdown(struct net *net)
> +{
> +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> +
> +	kfree(nn->reclaim_str_hashtbl);
> +}
> +
> +static int
> +nfsd4_cld_tracking_init(struct net *net)
> +{
> +	int status;
> +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> +
> +	status = nfs4_cld_state_init(net);
> +	if (status)
> +		return status;
> +
> +	status = nfsd4_init_cld_pipe(net);
> +	if (status)
> +		goto err_shutdown;
> +
> +	status = nfsd4_cld_grace_start(nn);
> +	if (status) {
> +		if (status == -EOPNOTSUPP)
> +			printk(KERN_WARNING "NFSD: Please upgrade nfsdcld.\n");
> +		nfs4_release_reclaim(nn);
> +		goto err_remove;
> +	}
> +	return 0;
> +
> +err_remove:
> +	nfsd4_remove_cld_pipe(net);
> +err_shutdown:
> +	nfs4_cld_state_shutdown(net);
> +	return status;
> +}
> +
> +static void
> +nfsd4_cld_tracking_exit(struct net *net)
> +{
> +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> +
> +	nfs4_release_reclaim(nn);
> +	nfsd4_remove_cld_pipe(net);
> +	nfs4_cld_state_shutdown(net);
> +}
> +
> +/* For older nfsdcld's */
> +static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops_v0 = {
>  	.init		= nfsd4_init_cld_pipe,
>  	.exit		= nfsd4_remove_cld_pipe,
>  	.create		= nfsd4_cld_create,
>  	.remove		= nfsd4_cld_remove,
> +	.check		= nfsd4_cld_check_v0,
> +	.grace_done	= nfsd4_cld_grace_done_v0,
> +};
> +
> +/* For newer nfsdcld's */
> +static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
> +	.init		= nfsd4_cld_tracking_init,
> +	.exit		= nfsd4_cld_tracking_exit,
> +	.create		= nfsd4_cld_create,
> +	.remove		= nfsd4_cld_remove,
>  	.check		= nfsd4_cld_check,
>  	.grace_done	= nfsd4_cld_grace_done,
>  };
> @@ -1514,9 +1722,10 @@ nfsd4_client_tracking_init(struct net *net)
>  
>  	/* Finally, try to use nfsdcld */
>  	nn->client_tracking_ops = &nfsd4_cld_tracking_ops;
> -	printk(KERN_WARNING "NFSD: the nfsdcld client tracking upcall will be "
> -			"removed in 3.10. Please transition to using "
> -			"nfsdcltrack.\n");
> +	status = nn->client_tracking_ops->init(net);
> +	if (!status)
> +		return status;
> +	nn->client_tracking_ops = &nfsd4_cld_tracking_ops_v0;
>  do_init:
>  	status = nn->client_tracking_ops->init(net);
>  	if (status) {
> diff --git a/include/uapi/linux/nfsd/cld.h b/include/uapi/linux/nfsd/cld.h
> index f8f5cccad749..b1e9de4f07d5 100644
> --- a/include/uapi/linux/nfsd/cld.h
> +++ b/include/uapi/linux/nfsd/cld.h
> @@ -36,6 +36,7 @@ enum cld_command {
>  	Cld_Remove,		/* remove record of this cm_id */
>  	Cld_Check,		/* is this cm_id allowed? */
>  	Cld_GraceDone,		/* grace period is complete */
> +	Cld_GraceStart,
>  };
>  
>  /* representation of long-form NFSv4 client ID */
> -- 
> 2.17.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v3 2/5] nfsd: un-deprecate nfsdcld
  2019-04-05 18:54   ` J. Bruce Fields
@ 2019-04-05 19:26     ` Scott Mayhew
  2019-04-05 20:11       ` J. Bruce Fields
  0 siblings, 1 reply; 13+ messages in thread
From: Scott Mayhew @ 2019-04-05 19:26 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: jlayton, linux-nfs

On Fri, 05 Apr 2019, J. Bruce Fields wrote:

> On Tue, Mar 26, 2019 at 06:06:27PM -0400, Scott Mayhew wrote:
> > When nfsdcld was released, it was quickly deprecated in favor of the
> > nfsdcltrack usermodehelper, so as to not require another running daemon.
> > That prevents NFSv4 clients from reclaiming locks from nfsd's running in
> > containers, since neither nfsdcltrack nor the legacy client tracking
> > code work in containers.
> > 
> > This commit un-deprecates the use of nfsdcld, with one twist: we will
> > populate the reclaim_str_hashtbl on startup.
> > 
> > During client tracking initialization, do an upcall ("GraceStart") to
> > nfsdcld to get a list of clients from the database.  nfsdcld will do
> > one downcall with a status of -EINPROGRESS for each client record in
> > the database, which in turn will cause an nfs4_client_reclaim to be
> > added to the reclaim_str_hashtbl.  When complete, nfsdcld will do a
> > final downcall with a status of 0.
> > 
> > This will save nfsd from having to do an upcall to the daemon during
> > nfs4_check_open_reclaim() processing.
> > 
> > Even though nfsdcld was quickly deprecated, there is a very small chance
> > of old nfsdcld daemons running in the wild.  These will respond to the
> > new "GraceStart" upcall with -EOPNOTSUPP, in which case we will log a
> > message and fall back to the original nfsdcld tracking ops (now called
> > nfsd4_cld_tracking_ops_v0).
> > 
> > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > ---
> >  fs/nfsd/nfs4recover.c         | 225 ++++++++++++++++++++++++++++++++--
> >  include/uapi/linux/nfsd/cld.h |   1 +
> >  2 files changed, 218 insertions(+), 8 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > index 2243b909b407..89c2a27956d0 100644
> > --- a/fs/nfsd/nfs4recover.c
> > +++ b/fs/nfsd/nfs4recover.c
> > @@ -779,6 +779,34 @@ cld_pipe_upcall(struct rpc_pipe *pipe, struct cld_msg *cmsg)
> >  	return ret;
> >  }
> >  
> > +static ssize_t
> > +__cld_pipe_inprogress_downcall(const struct cld_msg __user *cmsg,
> > +		struct nfsd_net *nn)
> > +{
> > +	uint8_t cmd;
> > +	struct xdr_netobj name;
> > +	uint16_t namelen;
> > +
> > +	if (get_user(cmd, &cmsg->cm_cmd)) {
> > +		dprintk("%s: error when copying cmd from userspace", __func__);
> > +		return -EFAULT;
> > +	}
> > +	if (cmd == Cld_GraceStart) {
> > +		if (get_user(namelen, &cmsg->cm_u.cm_name.cn_len))
> > +			return -EFAULT;
> > +		name.data = memdup_user(&cmsg->cm_u.cm_name.cn_id, namelen);
> > +		if (IS_ERR_OR_NULL(name.data))
> > +			return -EFAULT;
> > +		name.len = namelen;
> > +		if (!nfs4_client_to_reclaim(name, nn)) {
> > +			kfree(name.data);
> > +			return -EFAULT;
> > +		}
> > +		return sizeof(*cmsg);
> > +	}
> > +	return -EFAULT;
> > +}
> > +
> >  static ssize_t
> >  cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
> >  {
> > @@ -788,6 +816,7 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
> >  	struct nfsd_net *nn = net_generic(file_inode(filp)->i_sb->s_fs_info,
> >  						nfsd_net_id);
> >  	struct cld_net *cn = nn->cld_net;
> > +	int16_t status;
> >  
> >  	if (mlen != sizeof(*cmsg)) {
> >  		dprintk("%s: got %zu bytes, expected %zu\n", __func__, mlen,
> > @@ -801,13 +830,24 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
> >  		return -EFAULT;
> >  	}
> >  
> > +	/*
> > +	 * copy the status so we know whether to remove the upcall from the
> > +	 * list (for -EINPROGRESS, we just want to make sure the xid is
> > +	 * valid, not remove the upcall from the list)
> > +	 */
> > +	if (get_user(status, &cmsg->cm_status)) {
> > +		dprintk("%s: error when copying status from userspace", __func__);
> > +		return -EFAULT;
> > +	}
> > +
> >  	/* walk the list and find corresponding xid */
> >  	cup = NULL;
> >  	spin_lock(&cn->cn_lock);
> >  	list_for_each_entry(tmp, &cn->cn_list, cu_list) {
> >  		if (get_unaligned(&tmp->cu_msg.cm_xid) == xid) {
> >  			cup = tmp;
> > -			list_del_init(&cup->cu_list);
> > +			if (status != -EINPROGRESS)
> > +				list_del_init(&cup->cu_list);
> >  			break;
> >  		}
> >  	}
> > @@ -819,6 +859,9 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
> >  		return -EINVAL;
> >  	}
> >  
> > +	if (status == -EINPROGRESS)
> > +		return __cld_pipe_inprogress_downcall(cmsg, nn);
> > +
> >  	if (copy_from_user(&cup->cu_msg, src, mlen) != 0)
> >  		return -EFAULT;
> >  
> > @@ -1065,9 +1108,14 @@ nfsd4_cld_remove(struct nfs4_client *clp)
> >  				"record from stable storage: %d\n", ret);
> >  }
> >  
> > -/* Check for presence of a record, and update its timestamp */
> > +/*
> > + * For older nfsdcld's that do not allow us to "slurp" the clients
> > + * from the tracking database during startup.
> > + *
> > + * Check for presence of a record, and update its timestamp
> > + */
> >  static int
> > -nfsd4_cld_check(struct nfs4_client *clp)
> > +nfsd4_cld_check_v0(struct nfs4_client *clp)
> >  {
> >  	int ret;
> >  	struct cld_upcall *cup;
> > @@ -1100,8 +1148,61 @@ nfsd4_cld_check(struct nfs4_client *clp)
> >  	return ret;
> >  }
> >  
> > +/*
> > + * For newer nfsdcld's that allow us to "slurp" the clients
> > + * from the tracking database during startup.
> > + *
> > + * Check for presence of a record in the reclaim_str_hashtbl
> > + */
> > +static int
> > +nfsd4_cld_check(struct nfs4_client *clp)
> > +{
> > +	struct nfs4_client_reclaim *crp;
> > +	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
> > +
> > +	/* did we already find that this client is stable? */
> > +	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
> > +		return 0;
> > +
> > +	/* look for it in the reclaim hashtable otherwise */
> > +	crp = nfsd4_find_reclaim_client(clp->cl_name, nn);
> > +	if (crp) {
> > +		crp->cr_clp = clp;
> > +		return 0;
> > +	}
> > +
> > +	return -ENOENT;
> 
> Does this need to set NFSD4_CLIENT_STABLE as well, like
> nfsd4_cld_check_v0 does?

No - all we've checked here is the reclaim_str_hashtbl, which is
populated from the table for the recovery epoch in nfsdcld's database.
The stable flag gets set once we've done a 'create' upcall, which causes
nfsdcld to write a record to the table for the current epoch.  If the
stable flag is already set then we don't even do that upcall.

-Scott
> 
> --b.
> 
> > +}
> > +
> > +static int
> > +nfsd4_cld_grace_start(struct nfsd_net *nn)
> > +{
> > +	int ret;
> > +	struct cld_upcall *cup;
> > +	struct cld_net *cn = nn->cld_net;
> > +
> > +	cup = alloc_cld_upcall(cn);
> > +	if (!cup) {
> > +		ret = -ENOMEM;
> > +		goto out_err;
> > +	}
> > +
> > +	cup->cu_msg.cm_cmd = Cld_GraceStart;
> > +	ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
> > +	if (!ret)
> > +		ret = cup->cu_msg.cm_status;
> > +
> > +	free_cld_upcall(cup);
> > +out_err:
> > +	if (ret)
> > +		dprintk("%s: Unable to get clients from userspace: %d\n",
> > +			__func__, ret);
> > +	return ret;
> > +}
> > +
> > +/* For older nfsdcld's that need cm_gracetime */
> >  static void
> > -nfsd4_cld_grace_done(struct nfsd_net *nn)
> > +nfsd4_cld_grace_done_v0(struct nfsd_net *nn)
> >  {
> >  	int ret;
> >  	struct cld_upcall *cup;
> > @@ -1125,11 +1226,118 @@ nfsd4_cld_grace_done(struct nfsd_net *nn)
> >  		printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
> >  }
> >  
> > -static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
> > +/*
> > + * For newer nfsdcld's that do not need cm_gracetime.  We also need to call
> > + * nfs4_release_reclaim() to clear out the reclaim_str_hashtbl.
> > + */
> > +static void
> > +nfsd4_cld_grace_done(struct nfsd_net *nn)
> > +{
> > +	int ret;
> > +	struct cld_upcall *cup;
> > +	struct cld_net *cn = nn->cld_net;
> > +
> > +	cup = alloc_cld_upcall(cn);
> > +	if (!cup) {
> > +		ret = -ENOMEM;
> > +		goto out_err;
> > +	}
> > +
> > +	cup->cu_msg.cm_cmd = Cld_GraceDone;
> > +	ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
> > +	if (!ret)
> > +		ret = cup->cu_msg.cm_status;
> > +
> > +	free_cld_upcall(cup);
> > +out_err:
> > +	nfs4_release_reclaim(nn);
> > +	if (ret)
> > +		printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
> > +}
> > +
> > +static int
> > +nfs4_cld_state_init(struct net *net)
> > +{
> > +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > +	int i;
> > +
> > +	nn->reclaim_str_hashtbl = kmalloc_array(CLIENT_HASH_SIZE,
> > +						sizeof(struct list_head),
> > +						GFP_KERNEL);
> > +	if (!nn->reclaim_str_hashtbl)
> > +		return -ENOMEM;
> > +
> > +	for (i = 0; i < CLIENT_HASH_SIZE; i++)
> > +		INIT_LIST_HEAD(&nn->reclaim_str_hashtbl[i]);
> > +	nn->reclaim_str_hashtbl_size = 0;
> > +
> > +	return 0;
> > +}
> > +
> > +static void
> > +nfs4_cld_state_shutdown(struct net *net)
> > +{
> > +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > +
> > +	kfree(nn->reclaim_str_hashtbl);
> > +}
> > +
> > +static int
> > +nfsd4_cld_tracking_init(struct net *net)
> > +{
> > +	int status;
> > +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > +
> > +	status = nfs4_cld_state_init(net);
> > +	if (status)
> > +		return status;
> > +
> > +	status = nfsd4_init_cld_pipe(net);
> > +	if (status)
> > +		goto err_shutdown;
> > +
> > +	status = nfsd4_cld_grace_start(nn);
> > +	if (status) {
> > +		if (status == -EOPNOTSUPP)
> > +			printk(KERN_WARNING "NFSD: Please upgrade nfsdcld.\n");
> > +		nfs4_release_reclaim(nn);
> > +		goto err_remove;
> > +	}
> > +	return 0;
> > +
> > +err_remove:
> > +	nfsd4_remove_cld_pipe(net);
> > +err_shutdown:
> > +	nfs4_cld_state_shutdown(net);
> > +	return status;
> > +}
> > +
> > +static void
> > +nfsd4_cld_tracking_exit(struct net *net)
> > +{
> > +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > +
> > +	nfs4_release_reclaim(nn);
> > +	nfsd4_remove_cld_pipe(net);
> > +	nfs4_cld_state_shutdown(net);
> > +}
> > +
> > +/* For older nfsdcld's */
> > +static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops_v0 = {
> >  	.init		= nfsd4_init_cld_pipe,
> >  	.exit		= nfsd4_remove_cld_pipe,
> >  	.create		= nfsd4_cld_create,
> >  	.remove		= nfsd4_cld_remove,
> > +	.check		= nfsd4_cld_check_v0,
> > +	.grace_done	= nfsd4_cld_grace_done_v0,
> > +};
> > +
> > +/* For newer nfsdcld's */
> > +static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
> > +	.init		= nfsd4_cld_tracking_init,
> > +	.exit		= nfsd4_cld_tracking_exit,
> > +	.create		= nfsd4_cld_create,
> > +	.remove		= nfsd4_cld_remove,
> >  	.check		= nfsd4_cld_check,
> >  	.grace_done	= nfsd4_cld_grace_done,
> >  };
> > @@ -1514,9 +1722,10 @@ nfsd4_client_tracking_init(struct net *net)
> >  
> >  	/* Finally, try to use nfsdcld */
> >  	nn->client_tracking_ops = &nfsd4_cld_tracking_ops;
> > -	printk(KERN_WARNING "NFSD: the nfsdcld client tracking upcall will be "
> > -			"removed in 3.10. Please transition to using "
> > -			"nfsdcltrack.\n");
> > +	status = nn->client_tracking_ops->init(net);
> > +	if (!status)
> > +		return status;
> > +	nn->client_tracking_ops = &nfsd4_cld_tracking_ops_v0;
> >  do_init:
> >  	status = nn->client_tracking_ops->init(net);
> >  	if (status) {
> > diff --git a/include/uapi/linux/nfsd/cld.h b/include/uapi/linux/nfsd/cld.h
> > index f8f5cccad749..b1e9de4f07d5 100644
> > --- a/include/uapi/linux/nfsd/cld.h
> > +++ b/include/uapi/linux/nfsd/cld.h
> > @@ -36,6 +36,7 @@ enum cld_command {
> >  	Cld_Remove,		/* remove record of this cm_id */
> >  	Cld_Check,		/* is this cm_id allowed? */
> >  	Cld_GraceDone,		/* grace period is complete */
> > +	Cld_GraceStart,
> >  };
> >  
> >  /* representation of long-form NFSv4 client ID */
> > -- 
> > 2.17.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v3 2/5] nfsd: un-deprecate nfsdcld
  2019-04-05 19:26     ` Scott Mayhew
@ 2019-04-05 20:11       ` J. Bruce Fields
  0 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2019-04-05 20:11 UTC (permalink / raw)
  To: Scott Mayhew; +Cc: jlayton, linux-nfs

On Fri, Apr 05, 2019 at 03:26:53PM -0400, Scott Mayhew wrote:
> On Fri, 05 Apr 2019, J. Bruce Fields wrote:
> 
> > On Tue, Mar 26, 2019 at 06:06:27PM -0400, Scott Mayhew wrote:
> > > When nfsdcld was released, it was quickly deprecated in favor of the
> > > nfsdcltrack usermodehelper, so as to not require another running daemon.
> > > That prevents NFSv4 clients from reclaiming locks from nfsd's running in
> > > containers, since neither nfsdcltrack nor the legacy client tracking
> > > code work in containers.
> > > 
> > > This commit un-deprecates the use of nfsdcld, with one twist: we will
> > > populate the reclaim_str_hashtbl on startup.
> > > 
> > > During client tracking initialization, do an upcall ("GraceStart") to
> > > nfsdcld to get a list of clients from the database.  nfsdcld will do
> > > one downcall with a status of -EINPROGRESS for each client record in
> > > the database, which in turn will cause an nfs4_client_reclaim to be
> > > added to the reclaim_str_hashtbl.  When complete, nfsdcld will do a
> > > final downcall with a status of 0.
> > > 
> > > This will save nfsd from having to do an upcall to the daemon during
> > > nfs4_check_open_reclaim() processing.
> > > 
> > > Even though nfsdcld was quickly deprecated, there is a very small chance
> > > of old nfsdcld daemons running in the wild.  These will respond to the
> > > new "GraceStart" upcall with -EOPNOTSUPP, in which case we will log a
> > > message and fall back to the original nfsdcld tracking ops (now called
> > > nfsd4_cld_tracking_ops_v0).
> > > 
> > > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > > ---
> > >  fs/nfsd/nfs4recover.c         | 225 ++++++++++++++++++++++++++++++++--
> > >  include/uapi/linux/nfsd/cld.h |   1 +
> > >  2 files changed, 218 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > > index 2243b909b407..89c2a27956d0 100644
> > > --- a/fs/nfsd/nfs4recover.c
> > > +++ b/fs/nfsd/nfs4recover.c
> > > @@ -779,6 +779,34 @@ cld_pipe_upcall(struct rpc_pipe *pipe, struct cld_msg *cmsg)
> > >  	return ret;
> > >  }
> > >  
> > > +static ssize_t
> > > +__cld_pipe_inprogress_downcall(const struct cld_msg __user *cmsg,
> > > +		struct nfsd_net *nn)
> > > +{
> > > +	uint8_t cmd;
> > > +	struct xdr_netobj name;
> > > +	uint16_t namelen;
> > > +
> > > +	if (get_user(cmd, &cmsg->cm_cmd)) {
> > > +		dprintk("%s: error when copying cmd from userspace", __func__);
> > > +		return -EFAULT;
> > > +	}
> > > +	if (cmd == Cld_GraceStart) {
> > > +		if (get_user(namelen, &cmsg->cm_u.cm_name.cn_len))
> > > +			return -EFAULT;
> > > +		name.data = memdup_user(&cmsg->cm_u.cm_name.cn_id, namelen);
> > > +		if (IS_ERR_OR_NULL(name.data))
> > > +			return -EFAULT;
> > > +		name.len = namelen;
> > > +		if (!nfs4_client_to_reclaim(name, nn)) {
> > > +			kfree(name.data);
> > > +			return -EFAULT;
> > > +		}
> > > +		return sizeof(*cmsg);
> > > +	}
> > > +	return -EFAULT;
> > > +}
> > > +
> > >  static ssize_t
> > >  cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
> > >  {
> > > @@ -788,6 +816,7 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
> > >  	struct nfsd_net *nn = net_generic(file_inode(filp)->i_sb->s_fs_info,
> > >  						nfsd_net_id);
> > >  	struct cld_net *cn = nn->cld_net;
> > > +	int16_t status;
> > >  
> > >  	if (mlen != sizeof(*cmsg)) {
> > >  		dprintk("%s: got %zu bytes, expected %zu\n", __func__, mlen,
> > > @@ -801,13 +830,24 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
> > >  		return -EFAULT;
> > >  	}
> > >  
> > > +	/*
> > > +	 * copy the status so we know whether to remove the upcall from the
> > > +	 * list (for -EINPROGRESS, we just want to make sure the xid is
> > > +	 * valid, not remove the upcall from the list)
> > > +	 */
> > > +	if (get_user(status, &cmsg->cm_status)) {
> > > +		dprintk("%s: error when copying status from userspace", __func__);
> > > +		return -EFAULT;
> > > +	}
> > > +
> > >  	/* walk the list and find corresponding xid */
> > >  	cup = NULL;
> > >  	spin_lock(&cn->cn_lock);
> > >  	list_for_each_entry(tmp, &cn->cn_list, cu_list) {
> > >  		if (get_unaligned(&tmp->cu_msg.cm_xid) == xid) {
> > >  			cup = tmp;
> > > -			list_del_init(&cup->cu_list);
> > > +			if (status != -EINPROGRESS)
> > > +				list_del_init(&cup->cu_list);
> > >  			break;
> > >  		}
> > >  	}
> > > @@ -819,6 +859,9 @@ cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
> > >  		return -EINVAL;
> > >  	}
> > >  
> > > +	if (status == -EINPROGRESS)
> > > +		return __cld_pipe_inprogress_downcall(cmsg, nn);
> > > +
> > >  	if (copy_from_user(&cup->cu_msg, src, mlen) != 0)
> > >  		return -EFAULT;
> > >  
> > > @@ -1065,9 +1108,14 @@ nfsd4_cld_remove(struct nfs4_client *clp)
> > >  				"record from stable storage: %d\n", ret);
> > >  }
> > >  
> > > -/* Check for presence of a record, and update its timestamp */
> > > +/*
> > > + * For older nfsdcld's that do not allow us to "slurp" the clients
> > > + * from the tracking database during startup.
> > > + *
> > > + * Check for presence of a record, and update its timestamp
> > > + */
> > >  static int
> > > -nfsd4_cld_check(struct nfs4_client *clp)
> > > +nfsd4_cld_check_v0(struct nfs4_client *clp)
> > >  {
> > >  	int ret;
> > >  	struct cld_upcall *cup;
> > > @@ -1100,8 +1148,61 @@ nfsd4_cld_check(struct nfs4_client *clp)
> > >  	return ret;
> > >  }
> > >  
> > > +/*
> > > + * For newer nfsdcld's that allow us to "slurp" the clients
> > > + * from the tracking database during startup.
> > > + *
> > > + * Check for presence of a record in the reclaim_str_hashtbl
> > > + */
> > > +static int
> > > +nfsd4_cld_check(struct nfs4_client *clp)
> > > +{
> > > +	struct nfs4_client_reclaim *crp;
> > > +	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
> > > +
> > > +	/* did we already find that this client is stable? */
> > > +	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
> > > +		return 0;
> > > +
> > > +	/* look for it in the reclaim hashtable otherwise */
> > > +	crp = nfsd4_find_reclaim_client(clp->cl_name, nn);
> > > +	if (crp) {
> > > +		crp->cr_clp = clp;
> > > +		return 0;
> > > +	}
> > > +
> > > +	return -ENOENT;
> > 
> > Does this need to set NFSD4_CLIENT_STABLE as well, like
> > nfsd4_cld_check_v0 does?
> 
> No - all we've checked here is the reclaim_str_hashtbl, which is
> populated from the table for the recovery epoch in nfsdcld's database.
> The stable flag gets set once we've done a 'create' upcall, which causes
> nfsdcld to write a record to the table for the current epoch.  If the
> stable flag is already set then we don't even do that upcall.

OK, sorry, I think I get it now....

--b.

> 
> -Scott
> > 
> > --b.
> > 
> > > +}
> > > +
> > > +static int
> > > +nfsd4_cld_grace_start(struct nfsd_net *nn)
> > > +{
> > > +	int ret;
> > > +	struct cld_upcall *cup;
> > > +	struct cld_net *cn = nn->cld_net;
> > > +
> > > +	cup = alloc_cld_upcall(cn);
> > > +	if (!cup) {
> > > +		ret = -ENOMEM;
> > > +		goto out_err;
> > > +	}
> > > +
> > > +	cup->cu_msg.cm_cmd = Cld_GraceStart;
> > > +	ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
> > > +	if (!ret)
> > > +		ret = cup->cu_msg.cm_status;
> > > +
> > > +	free_cld_upcall(cup);
> > > +out_err:
> > > +	if (ret)
> > > +		dprintk("%s: Unable to get clients from userspace: %d\n",
> > > +			__func__, ret);
> > > +	return ret;
> > > +}
> > > +
> > > +/* For older nfsdcld's that need cm_gracetime */
> > >  static void
> > > -nfsd4_cld_grace_done(struct nfsd_net *nn)
> > > +nfsd4_cld_grace_done_v0(struct nfsd_net *nn)
> > >  {
> > >  	int ret;
> > >  	struct cld_upcall *cup;
> > > @@ -1125,11 +1226,118 @@ nfsd4_cld_grace_done(struct nfsd_net *nn)
> > >  		printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
> > >  }
> > >  
> > > -static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
> > > +/*
> > > + * For newer nfsdcld's that do not need cm_gracetime.  We also need to call
> > > + * nfs4_release_reclaim() to clear out the reclaim_str_hashtbl.
> > > + */
> > > +static void
> > > +nfsd4_cld_grace_done(struct nfsd_net *nn)
> > > +{
> > > +	int ret;
> > > +	struct cld_upcall *cup;
> > > +	struct cld_net *cn = nn->cld_net;
> > > +
> > > +	cup = alloc_cld_upcall(cn);
> > > +	if (!cup) {
> > > +		ret = -ENOMEM;
> > > +		goto out_err;
> > > +	}
> > > +
> > > +	cup->cu_msg.cm_cmd = Cld_GraceDone;
> > > +	ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
> > > +	if (!ret)
> > > +		ret = cup->cu_msg.cm_status;
> > > +
> > > +	free_cld_upcall(cup);
> > > +out_err:
> > > +	nfs4_release_reclaim(nn);
> > > +	if (ret)
> > > +		printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
> > > +}
> > > +
> > > +static int
> > > +nfs4_cld_state_init(struct net *net)
> > > +{
> > > +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > > +	int i;
> > > +
> > > +	nn->reclaim_str_hashtbl = kmalloc_array(CLIENT_HASH_SIZE,
> > > +						sizeof(struct list_head),
> > > +						GFP_KERNEL);
> > > +	if (!nn->reclaim_str_hashtbl)
> > > +		return -ENOMEM;
> > > +
> > > +	for (i = 0; i < CLIENT_HASH_SIZE; i++)
> > > +		INIT_LIST_HEAD(&nn->reclaim_str_hashtbl[i]);
> > > +	nn->reclaim_str_hashtbl_size = 0;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static void
> > > +nfs4_cld_state_shutdown(struct net *net)
> > > +{
> > > +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > > +
> > > +	kfree(nn->reclaim_str_hashtbl);
> > > +}
> > > +
> > > +static int
> > > +nfsd4_cld_tracking_init(struct net *net)
> > > +{
> > > +	int status;
> > > +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > > +
> > > +	status = nfs4_cld_state_init(net);
> > > +	if (status)
> > > +		return status;
> > > +
> > > +	status = nfsd4_init_cld_pipe(net);
> > > +	if (status)
> > > +		goto err_shutdown;
> > > +
> > > +	status = nfsd4_cld_grace_start(nn);
> > > +	if (status) {
> > > +		if (status == -EOPNOTSUPP)
> > > +			printk(KERN_WARNING "NFSD: Please upgrade nfsdcld.\n");
> > > +		nfs4_release_reclaim(nn);
> > > +		goto err_remove;
> > > +	}
> > > +	return 0;
> > > +
> > > +err_remove:
> > > +	nfsd4_remove_cld_pipe(net);
> > > +err_shutdown:
> > > +	nfs4_cld_state_shutdown(net);
> > > +	return status;
> > > +}
> > > +
> > > +static void
> > > +nfsd4_cld_tracking_exit(struct net *net)
> > > +{
> > > +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > > +
> > > +	nfs4_release_reclaim(nn);
> > > +	nfsd4_remove_cld_pipe(net);
> > > +	nfs4_cld_state_shutdown(net);
> > > +}
> > > +
> > > +/* For older nfsdcld's */
> > > +static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops_v0 = {
> > >  	.init		= nfsd4_init_cld_pipe,
> > >  	.exit		= nfsd4_remove_cld_pipe,
> > >  	.create		= nfsd4_cld_create,
> > >  	.remove		= nfsd4_cld_remove,
> > > +	.check		= nfsd4_cld_check_v0,
> > > +	.grace_done	= nfsd4_cld_grace_done_v0,
> > > +};
> > > +
> > > +/* For newer nfsdcld's */
> > > +static const struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
> > > +	.init		= nfsd4_cld_tracking_init,
> > > +	.exit		= nfsd4_cld_tracking_exit,
> > > +	.create		= nfsd4_cld_create,
> > > +	.remove		= nfsd4_cld_remove,
> > >  	.check		= nfsd4_cld_check,
> > >  	.grace_done	= nfsd4_cld_grace_done,
> > >  };
> > > @@ -1514,9 +1722,10 @@ nfsd4_client_tracking_init(struct net *net)
> > >  
> > >  	/* Finally, try to use nfsdcld */
> > >  	nn->client_tracking_ops = &nfsd4_cld_tracking_ops;
> > > -	printk(KERN_WARNING "NFSD: the nfsdcld client tracking upcall will be "
> > > -			"removed in 3.10. Please transition to using "
> > > -			"nfsdcltrack.\n");
> > > +	status = nn->client_tracking_ops->init(net);
> > > +	if (!status)
> > > +		return status;
> > > +	nn->client_tracking_ops = &nfsd4_cld_tracking_ops_v0;
> > >  do_init:
> > >  	status = nn->client_tracking_ops->init(net);
> > >  	if (status) {
> > > diff --git a/include/uapi/linux/nfsd/cld.h b/include/uapi/linux/nfsd/cld.h
> > > index f8f5cccad749..b1e9de4f07d5 100644
> > > --- a/include/uapi/linux/nfsd/cld.h
> > > +++ b/include/uapi/linux/nfsd/cld.h
> > > @@ -36,6 +36,7 @@ enum cld_command {
> > >  	Cld_Remove,		/* remove record of this cm_id */
> > >  	Cld_Check,		/* is this cm_id allowed? */
> > >  	Cld_GraceDone,		/* grace period is complete */
> > > +	Cld_GraceStart,
> > >  };
> > >  
> > >  /* representation of long-form NFSv4 client ID */
> > > -- 
> > > 2.17.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v3 4/5] nfsd: re-order client tracking method selection
  2019-03-26 22:06 ` [PATCH RFC v3 4/5] nfsd: re-order client tracking method selection Scott Mayhew
@ 2019-04-05 20:40   ` J. Bruce Fields
  0 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2019-04-05 20:40 UTC (permalink / raw)
  To: Scott Mayhew; +Cc: jlayton, linux-nfs

On Tue, Mar 26, 2019 at 06:06:29PM -0400, Scott Mayhew wrote:
> The new order is first nfsdcld, then the UMH upcall, and finally the
> legacy tracking method.  Added some printk's to the tracking

I'm hesitant to add printk's (as opposed to dprintk's or tracepoints),
but this is just one line on nfsd startup, and maybe it's helpful
figuring out what's going on if for example nfsdcld isn't starting when
it should.  OK.

--b.

> initialization functions so it's clear which tracking method was
> ultimately selected.
> 
> Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> ---
>  fs/nfsd/nfs4recover.c | 80 +++++++++++++++++++++++++++++++++++--------
>  1 file changed, 65 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> index 09e3c9352778..d60001da1dbf 100644
> --- a/fs/nfsd/nfs4recover.c
> +++ b/fs/nfsd/nfs4recover.c
> @@ -628,6 +628,7 @@ nfsd4_legacy_tracking_init(struct net *net)
>  	status = nfsd4_load_reboot_recovery_data(net);
>  	if (status)
>  		goto err;
> +	printk("NFSD: Using legacy client tracking operations.\n");
>  	return 0;
>  
>  err:
> @@ -937,7 +938,7 @@ nfsd4_cld_unregister_net(struct net *net, struct rpc_pipe *pipe)
>  
>  /* Initialize rpc_pipefs pipe for communication with client tracking daemon */
>  static int
> -nfsd4_init_cld_pipe(struct net *net)
> +__nfsd4_init_cld_pipe(struct net *net)
>  {
>  	int ret;
>  	struct dentry *dentry;
> @@ -980,6 +981,17 @@ nfsd4_init_cld_pipe(struct net *net)
>  	return ret;
>  }
>  
> +static int
> +nfsd4_init_cld_pipe(struct net *net)
> +{
> +	int status;
> +
> +	status = __nfsd4_init_cld_pipe(net);
> +	if (!status)
> +		printk("NFSD: Using old nfsdcld client tracking operations.\n");
> +	return status;
> +}
> +
>  static void
>  nfsd4_remove_cld_pipe(struct net *net)
>  {
> @@ -1285,27 +1297,55 @@ nfs4_cld_state_shutdown(struct net *net)
>  	kfree(nn->reclaim_str_hashtbl);
>  }
>  
> +static bool
> +cld_running(struct nfsd_net *nn)
> +{
> +	struct cld_net *cn = nn->cld_net;
> +	struct rpc_pipe *pipe = cn->cn_pipe;
> +
> +	return pipe->nreaders || pipe->nwriters;
> +}
> +
>  static int
>  nfsd4_cld_tracking_init(struct net *net)
>  {
>  	int status;
>  	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> +	bool running;
> +	int retries = 10;
>  
>  	status = nfs4_cld_state_init(net);
>  	if (status)
>  		return status;
>  
> -	status = nfsd4_init_cld_pipe(net);
> +	status = __nfsd4_init_cld_pipe(net);
>  	if (status)
>  		goto err_shutdown;
>  
> +	/*
> +	 * rpc pipe upcalls take 30 seconds to time out, so we don't want to
> +	 * queue an upcall unless we know that nfsdcld is running (because we
> +	 * want this to fail fast so that nfsd4_client_tracking_init() can try
> +	 * the next client tracking method).  nfsdcld should already be running
> +	 * before nfsd is started, so the wait here is for nfsdcld to open the
> +	 * pipefs file we just created.
> +	 */
> +	while (!(running = cld_running(nn)) && retries--)
> +		msleep(100);
> +
> +	if (!running) {
> +		status = -ETIMEDOUT;
> +		goto err_remove;
> +	}
> +
>  	status = nfsd4_cld_grace_start(nn);
>  	if (status) {
>  		if (status == -EOPNOTSUPP)
>  			printk(KERN_WARNING "NFSD: Please upgrade nfsdcld.\n");
>  		nfs4_release_reclaim(nn);
>  		goto err_remove;
> -	}
> +	} else
> +		printk("NFSD: Using nfsdcld client tracking operations.\n");
>  	return 0;
>  
>  err_remove:
> @@ -1552,6 +1592,8 @@ nfsd4_umh_cltrack_init(struct net *net)
>  
>  	ret = nfsd4_umh_cltrack_upcall("init", NULL, grace_start, NULL);
>  	kfree(grace_start);
> +	if (!ret)
> +		printk("NFSD: Using UMH upcall client tracking operations.\n");
>  	return ret;
>  }
>  
> @@ -1701,9 +1743,20 @@ nfsd4_client_tracking_init(struct net *net)
>  	if (nn->client_tracking_ops)
>  		goto do_init;
>  
> +	/* First, try to use nfsdcld */
> +	nn->client_tracking_ops = &nfsd4_cld_tracking_ops;
> +	status = nn->client_tracking_ops->init(net);
> +	if (!status)
> +		return status;
> +	if (status != -ETIMEDOUT) {
> +		nn->client_tracking_ops = &nfsd4_cld_tracking_ops_v0;
> +		status = nn->client_tracking_ops->init(net);
> +		if (!status)
> +			return status;
> +	}
> +
>  	/*
> -	 * First, try a UMH upcall. It should succeed or fail quickly, so
> -	 * there's little harm in trying that first.
> +	 * Next, try the UMH upcall.
>  	 */
>  	nn->client_tracking_ops = &nfsd4_umh_tracking_ops;
>  	status = nn->client_tracking_ops->init(net);
> @@ -1711,26 +1764,23 @@ nfsd4_client_tracking_init(struct net *net)
>  		return status;
>  
>  	/*
> -	 * See if the recoverydir exists and is a directory. If it is,
> -	 * then use the legacy ops.
> +	 * Finally, See if the recoverydir exists and is a directory.
> +	 * If it is, then use the legacy ops.
>  	 */
>  	nn->client_tracking_ops = &nfsd4_legacy_tracking_ops;
>  	status = kern_path(nfs4_recoverydir(), LOOKUP_FOLLOW, &path);
>  	if (!status) {
>  		status = d_is_dir(path.dentry);
>  		path_put(&path);
> -		if (status)
> -			goto do_init;
> +		if (!status) {
> +			status = -EINVAL;
> +			goto out;
> +		}
>  	}
>  
> -	/* Finally, try to use nfsdcld */
> -	nn->client_tracking_ops = &nfsd4_cld_tracking_ops;
> -	status = nn->client_tracking_ops->init(net);
> -	if (!status)
> -		return status;
> -	nn->client_tracking_ops = &nfsd4_cld_tracking_ops_v0;
>  do_init:
>  	status = nn->client_tracking_ops->init(net);
> +out:
>  	if (status) {
>  		printk(KERN_WARNING "NFSD: Unable to initialize client "
>  				    "recovery tracking! (%d)\n", status);
> -- 
> 2.17.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v3 0/5] un-deprecate nfsdcld
  2019-03-29 12:16   ` Scott Mayhew
@ 2019-04-05 20:43     ` J. Bruce Fields
  0 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2019-04-05 20:43 UTC (permalink / raw)
  To: Scott Mayhew; +Cc: jlayton, linux-nfs

On Fri, Mar 29, 2019 at 08:16:09AM -0400, Scott Mayhew wrote:
> On Thu, 28 Mar 2019, J. Bruce Fields wrote:
> > Thanks!  It looks good to me on a first skim, but I probably won't get a
> > proper look at it till next week.
> > 
> > You say "RFC"--are there still known issue that you're working on?
> 
> No, I just added RFC because of the new stuff I added for handling the
> legacy records.  Plus I wasn't sure if you would have a problem with the
> server not lifting the grace period early if there were legacy records.
> I could do it if I moved some code out of nfs4recover.c but I chose not
> to since it should really be a one-time thing and the original legacy
> tracking didn't lift the grace period early anyway.

Yeah, I don't think that case is worth spending time optimizing.

The kernel patches look OK to me.  I'm inclined to target them for the
next merge window (5.2).  If Jeff has the time to take a look, it'd also
be good to know if he has any more concerns.

--b.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-04-05 20:43 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-26 22:06 [PATCH RFC v3 0/5] un-deprecate nfsdcld Scott Mayhew
2019-03-26 22:06 ` [PATCH RFC v3 1/5] nfsd: make nfs4_client_reclaim use an xdr_netobj instead of a fixed char array Scott Mayhew
2019-03-26 22:06 ` [PATCH RFC v3 2/5] nfsd: un-deprecate nfsdcld Scott Mayhew
2019-04-05 18:54   ` J. Bruce Fields
2019-04-05 19:26     ` Scott Mayhew
2019-04-05 20:11       ` J. Bruce Fields
2019-03-26 22:06 ` [PATCH RFC v3 3/5] nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld Scott Mayhew
2019-03-26 22:06 ` [PATCH RFC v3 4/5] nfsd: re-order client tracking method selection Scott Mayhew
2019-04-05 20:40   ` J. Bruce Fields
2019-03-26 22:06 ` [PATCH RFC v3 5/5] nfsd: handle legacy client tracking records sent by nfsdcld Scott Mayhew
2019-03-28 21:17 ` [PATCH RFC v3 0/5] un-deprecate nfsdcld J. Bruce Fields
2019-03-29 12:16   ` Scott Mayhew
2019-04-05 20:43     ` J. Bruce Fields

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).