* [PATCH v5 0/5] nfsd: overhaul the client name tracking code
@ 2012-02-01 15:44 Jeff Layton
  2012-02-01 15:44 ` [PATCH v5 1/5] nfsd: add nfsd4_client_tracking_ops struct and a way to set it Jeff Layton
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Jeff Layton @ 2012-02-01 15:44 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

This is the fifth iteration of this patchset. At this point, I've
decided just to focus on what's needed to replace the upcall for the
single-server case. It's possible that we'll need to revise the upcall
format later to handle clustered configurations, but until we know
specifically what we'll need, it's probably simpler not to worry about
it.

An earlier version of this patchset can be viewed here. That set also
contains a more comprehensive description of the rationale for this:

    http://www.spinics.net/lists/linux-nfs/msg26324.html

The main changes since the last posting are:

- the "remove" upcall has been put back. As Bruce rightly pointed out
  in review of the last set, it is necessary.

- the "init" upcall has been removed. For now, we'll just keep using
  get_seconds() everywhere. Until we have a clear need for upcalling
  to get a boot generation value, I'll keep the changes minimal.

- the upcall format has lost some fields in the union; specifically, the
  cm_index and cm_generation fields, since they aren't used.

I'd like to see this considered for inclusion in 3.4.

Jeff Layton (5):
  nfsd: add nfsd4_client_tracking_ops struct and a way to set it
  sunrpc: create nfsd dir in rpc_pipefs
  nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
  nfsd: add a header describing upcall to nfsdcld
  nfsd: add the infrastructure to handle the cld upcall

 fs/nfsd/nfs4callback.c   |   12 +-
 fs/nfsd/nfs4recover.c    |  485 +++++++++++++++++++++++++++++++++++++++++++++-
 fs/nfsd/nfs4state.c      |   46 ++---
 fs/nfsd/state.h          |   22 ++-
 include/linux/nfsd/cld.h |   56 ++++++
 net/sunrpc/rpc_pipe.c    |    5 +
 6 files changed, 572 insertions(+), 54 deletions(-)
 create mode 100644 include/linux/nfsd/cld.h

-- 
1.7.7.6


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v5 1/5] nfsd: add nfsd4_client_tracking_ops struct and a way to set it
  2012-02-01 15:44 [PATCH v5 0/5] nfsd: overhaul the client name tracking code Jeff Layton
@ 2012-02-01 15:44 ` Jeff Layton
  2012-02-02 22:45   ` J. Bruce Fields
  2012-02-01 15:44 ` [PATCH v5 2/5] sunrpc: create nfsd dir in rpc_pipefs Jeff Layton
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Jeff Layton @ 2012-02-01 15:44 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Abstract out the mechanism that we use to track clients into a set of
client name tracking functions.

This gives us a mechanism to plug in a new set of client tracking
functions without disturbing the callers. It also gives us a way to
decide on what tracking scheme to use at runtime.

For now, this just looks like pointless abstraction, but later we'll
add a new alternate scheme for tracking clients on stable storage.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/nfsd/nfs4recover.c |  123 +++++++++++++++++++++++++++++++++++++++++++++----
 fs/nfsd/nfs4state.c   |   46 +++++++------------
 fs/nfsd/state.h       |   14 +++--
 3 files changed, 138 insertions(+), 45 deletions(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 0b3e875..3fbf1f4 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -43,9 +43,20 @@
 
 #define NFSDDBG_FACILITY                NFSDDBG_PROC
 
+/* Declarations */
+struct nfsd4_client_tracking_ops {
+	int (*init)(void);
+	void (*exit)(void);
+	void (*create)(struct nfs4_client *);
+	void (*remove)(struct nfs4_client *);
+	int (*check)(struct nfs4_client *);
+	void (*grace_done)(time_t);
+};
+
 /* Globals */
 static struct file *rec_file;
 static char user_recovery_dirname[PATH_MAX] = "/var/lib/nfs/v4recovery";
+static struct nfsd4_client_tracking_ops *client_tracking_ops;
 
 static int
 nfs4_save_creds(const struct cred **original_creds)
@@ -117,7 +128,8 @@ out_no_tfm:
 	return status;
 }
 
-void nfsd4_create_clid_dir(struct nfs4_client *clp)
+static void
+nfsd4_create_clid_dir(struct nfs4_client *clp)
 {
 	const struct cred *original_cred;
 	char *dname = clp->cl_recdir;
@@ -265,7 +277,7 @@ out_unlock:
 	return status;
 }
 
-void
+static void
 nfsd4_remove_clid_dir(struct nfs4_client *clp)
 {
 	const struct cred *original_cred;
@@ -292,7 +304,6 @@ out:
 	if (status)
 		printk("NFSD: Failed to remove expired client state directory"
 				" %.*s\n", HEXDIR_LEN, clp->cl_recdir);
-	return;
 }
 
 static int
@@ -311,8 +322,8 @@ purge_old(struct dentry *parent, struct dentry *child)
 	return 0;
 }
 
-void
-nfsd4_recdir_purge_old(void) {
+static void
+nfsd4_recdir_purge_old(time_t boot_time __attribute__ ((unused))) {
 	int status;
 
 	if (!rec_file)
@@ -343,7 +354,7 @@ load_recdir(struct dentry *parent, struct dentry *child)
 	return 0;
 }
 
-int
+static int
 nfsd4_recdir_load(void) {
 	int status;
 
@@ -361,8 +372,8 @@ nfsd4_recdir_load(void) {
  * Hold reference to the recovery directory.
  */
 
-void
-nfsd4_init_recdir()
+static int
+nfsd4_init_recdir(void)
 {
 	const struct cred *original_cred;
 	int status;
@@ -377,20 +388,37 @@ nfsd4_init_recdir()
 		printk("NFSD: Unable to change credentials to find recovery"
 		       " directory: error %d\n",
 		       status);
-		return;
+		return status;
 	}
 
 	rec_file = filp_open(user_recovery_dirname, O_RDONLY | O_DIRECTORY, 0);
 	if (IS_ERR(rec_file)) {
 		printk("NFSD: unable to find recovery directory %s\n",
 				user_recovery_dirname);
+		status = PTR_ERR(rec_file);
 		rec_file = NULL;
 	}
 
 	nfs4_reset_creds(original_cred);
+	return status;
 }
 
-void
+static int
+nfsd4_load_reboot_recovery_data(void)
+{
+	int status;
+
+	nfs4_lock_state();
+	status = nfsd4_init_recdir();
+	if (!status)
+		status = nfsd4_recdir_load();
+	nfs4_unlock_state();
+	if (status)
+		printk(KERN_ERR "NFSD: Failure reading reboot recovery data\n");
+	return status;
+}
+
+static void
 nfsd4_shutdown_recdir(void)
 {
 	if (!rec_file)
@@ -425,3 +453,78 @@ nfs4_recoverydir(void)
 {
 	return user_recovery_dirname;
 }
+
+static int
+nfsd4_check_legacy_client(struct nfs4_client *clp)
+{
+	if (nfsd4_find_reclaim_client(clp) != NULL)
+		return 0;
+	else
+		return -ENOENT;
+}
+
+static struct nfsd4_client_tracking_ops nfsd4_legacy_tracking_ops = {
+	.init		= nfsd4_load_reboot_recovery_data,
+	.exit		= nfsd4_shutdown_recdir,
+	.create		= nfsd4_create_clid_dir,
+	.remove		= nfsd4_remove_clid_dir,
+	.check		= nfsd4_check_legacy_client,
+	.grace_done	= nfsd4_recdir_purge_old,
+};
+
+int
+nfsd4_client_tracking_init(void)
+{
+	int status;
+
+	client_tracking_ops = &nfsd4_legacy_tracking_ops;
+
+	if (!client_tracking_ops->init)
+		return 0;
+
+	status = client_tracking_ops->init();
+	if (status) {
+		printk(KERN_WARNING "NFSD: Unable to initialize client "
+				    "recovery tracking! (%d)\n", status);
+		client_tracking_ops = NULL;
+	}
+	return status;
+}
+
+void
+nfsd4_client_tracking_exit(void)
+{
+	if (client_tracking_ops && client_tracking_ops->exit)
+		client_tracking_ops->exit();
+	client_tracking_ops = NULL;
+}
+
+void
+nfsd4_client_record_create(struct nfs4_client *clp)
+{
+	if (client_tracking_ops && client_tracking_ops->create)
+		client_tracking_ops->create(clp);
+}
+
+void
+nfsd4_client_record_remove(struct nfs4_client *clp)
+{
+	if (client_tracking_ops && client_tracking_ops->remove)
+		client_tracking_ops->remove(clp);
+}
+
+int
+nfsd4_client_record_check(struct nfs4_client *clp)
+{
+	if (client_tracking_ops && client_tracking_ops->check)
+		return client_tracking_ops->check(clp);
+
+	return -EOPNOTSUPP;
+}
+
+void
+nfsd4_grace_done(time_t boot_time)
+{
+	if (client_tracking_ops && client_tracking_ops->grace_done)
+		client_tracking_ops->grace_done(boot_time);
+}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e8c98f0..971a117 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2045,7 +2045,7 @@ nfsd4_reclaim_complete(struct svc_rqst *rqstp, struct nfsd4_compound_state *csta
 		goto out;
 
 	status = nfs_ok;
-	nfsd4_create_clid_dir(cstate->session->se_client);
+	nfsd4_client_record_create(cstate->session->se_client);
 out:
 	nfs4_unlock_state();
 	return status;
@@ -2240,7 +2240,7 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
 			conf = find_confirmed_client_by_str(unconf->cl_recdir,
 							    hash);
 			if (conf) {
-				nfsd4_remove_clid_dir(conf);
+				nfsd4_client_record_remove(conf);
 				expire_client(conf);
 			}
 			move_to_confirmed(unconf);
@@ -3066,7 +3066,7 @@ static void
 nfsd4_end_grace(void)
 {
 	dprintk("NFSD: end of grace period\n");
-	nfsd4_recdir_purge_old();
+	nfsd4_grace_done(boot_time);
 	locks_end_grace(&nfsd4_manager);
 	/*
 	 * Now that every NFSv4 client has had the chance to recover and
@@ -3115,7 +3115,7 @@ nfs4_laundromat(void)
 		clp = list_entry(pos, struct nfs4_client, cl_lru);
 		dprintk("NFSD: purging unused client (clientid %08x)\n",
 			clp->cl_clientid.cl_id);
-		nfsd4_remove_clid_dir(clp);
+		nfsd4_client_record_remove(clp);
 		expire_client(clp);
 	}
 	spin_lock(&recall_lock);
@@ -3539,7 +3539,7 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	dprintk("NFSD: %s: success, seqid=%d stateid=" STATEID_FMT "\n",
 		__func__, oc->oc_seqid, STATEID_VAL(&stp->st_stid.sc_stateid));
 
-	nfsd4_create_clid_dir(oo->oo_owner.so_client);
+	nfsd4_client_record_create(oo->oo_owner.so_client);
 	status = nfs_ok;
 out:
 	if (!cstate->replay_owner)
@@ -4397,19 +4397,13 @@ nfs4_release_reclaim(void)
 
 /*
  * called from OPEN, CLAIM_PREVIOUS with a new clientid. */
-static struct nfs4_client_reclaim *
-nfs4_find_reclaim_client(clientid_t *clid)
+struct nfs4_client_reclaim *
+nfsd4_find_reclaim_client(struct nfs4_client *clp)
 {
 	unsigned int strhashval;
-	struct nfs4_client *clp;
 	struct nfs4_client_reclaim *crp = NULL;
 
 
-	/* find clientid in conf_id_hashtbl */
-	clp = find_confirmed_client(clid);
-	if (clp == NULL)
-		return NULL;
-
 	dprintk("NFSD: nfs4_find_reclaim_client for %.*s with recdir %s\n",
 		            clp->cl_name.len, clp->cl_name.data,
 			    clp->cl_recdir);
@@ -4430,7 +4424,14 @@ nfs4_find_reclaim_client(clientid_t *clid)
 __be32
 nfs4_check_open_reclaim(clientid_t *clid)
 {
-	return nfs4_find_reclaim_client(clid) ? nfs_ok : nfserr_reclaim_bad;
+	struct nfs4_client *clp;
+
+	/* find clientid in conf_id_hashtbl */
+	clp = find_confirmed_client(clid);
+	if (clp == NULL)
+		return nfserr_reclaim_bad;
+
+	return nfsd4_client_record_check(clp) ? nfserr_reclaim_bad : nfs_ok;
 }
 
 #ifdef CONFIG_NFSD_FAULT_INJECTION
@@ -4577,19 +4578,6 @@ nfs4_state_init(void)
 	reclaim_str_hashtbl_size = 0;
 }
 
-static void
-nfsd4_load_reboot_recovery_data(void)
-{
-	int status;
-
-	nfs4_lock_state();
-	nfsd4_init_recdir();
-	status = nfsd4_recdir_load();
-	nfs4_unlock_state();
-	if (status)
-		printk("NFSD: Failure reading reboot recovery data\n");
-}
-
 /*
  * Since the lifetime of a delegation isn't limited to that of an open, a
  * client may quite reasonably hang on to a delegation as long as it has
@@ -4642,7 +4630,7 @@ out_free_laundry:
 int
 nfs4_state_start(void)
 {
-	nfsd4_load_reboot_recovery_data();
+	nfsd4_client_tracking_init();
 	return __nfs4_state_start();
 }
 
@@ -4676,7 +4664,7 @@ __nfs4_state_shutdown(void)
 		unhash_delegation(dp);
 	}
 
-	nfsd4_shutdown_recdir();
+	nfsd4_client_tracking_exit();
 }
 
 void
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index ffb5df1..8f539ee 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -463,6 +463,7 @@ extern __be32 nfs4_preprocess_stateid_op(struct nfsd4_compound_state *cstate,
 extern void nfs4_lock_state(void);
 extern void nfs4_unlock_state(void);
 extern int nfs4_in_grace(void);
+extern struct nfs4_client_reclaim *nfsd4_find_reclaim_client(struct nfs4_client *crp);
 extern __be32 nfs4_check_open_reclaim(clientid_t *clid);
 extern void nfs4_free_openowner(struct nfs4_openowner *);
 extern void nfs4_free_lockowner(struct nfs4_lockowner *);
@@ -477,16 +478,17 @@ extern void nfsd4_destroy_callback_queue(void);
 extern void nfsd4_shutdown_callback(struct nfs4_client *);
 extern void nfs4_put_delegation(struct nfs4_delegation *dp);
 extern __be32 nfs4_make_rec_clidname(char *clidname, struct xdr_netobj *clname);
-extern void nfsd4_init_recdir(void);
-extern int nfsd4_recdir_load(void);
-extern void nfsd4_shutdown_recdir(void);
 extern int nfs4_client_to_reclaim(const char *name);
 extern int nfs4_has_reclaimed_state(const char *name, bool use_exchange_id);
-extern void nfsd4_recdir_purge_old(void);
-extern void nfsd4_create_clid_dir(struct nfs4_client *clp);
-extern void nfsd4_remove_clid_dir(struct nfs4_client *clp);
 extern void release_session_client(struct nfsd4_session *);
 extern __be32 nfs4_validate_stateid(struct nfs4_client *, stateid_t *);
 extern void nfsd4_purge_closed_stateid(struct nfs4_stateowner *);
 
+/* nfs4recover operations */
+extern int nfsd4_client_tracking_init(void);
+extern void nfsd4_client_tracking_exit(void);
+extern void nfsd4_client_record_create(struct nfs4_client *clp);
+extern void nfsd4_client_record_remove(struct nfs4_client *clp);
+extern int nfsd4_client_record_check(struct nfs4_client *clp);
+extern void nfsd4_grace_done(time_t boot_time);
 #endif   /* NFSD4_STATE_H */
-- 
1.7.7.6


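The dispatch pattern this patch introduces -- a tracking-ops table chosen
once at init time, with wrappers that tolerate a missing table -- can be
sketched in plain userspace C. All names below are illustrative stand-ins,
not the kernel's symbols:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct nfs4_client */
struct client {
	int id;
	int recorded;
};

/* Mirrors the shape of nfsd4_client_tracking_ops: one table per scheme */
struct tracking_ops {
	int  (*init)(void);
	void (*create)(struct client *);
	int  (*check)(struct client *);
};

static int  legacy_init(void) { return 0; }
static void legacy_create(struct client *c) { c->recorded = 1; }
static int  legacy_check(struct client *c) { return c->recorded ? 0 : -ENOENT; }

static struct tracking_ops legacy_ops = {
	.init	= legacy_init,
	.create	= legacy_create,
	.check	= legacy_check,
};

/* Chosen once at startup; callers never see which scheme is active */
static struct tracking_ops *ops;

static int tracking_init(void)
{
	ops = &legacy_ops;
	return ops->init ? ops->init() : 0;
}

static void record_create(struct client *c)
{
	if (ops && ops->create)
		ops->create(c);
}

static int record_check(struct client *c)
{
	if (ops && ops->check)
		return ops->check(c);
	return -EOPNOTSUPP;
}
```

Callers only ever use the wrappers, so a second scheme can later be
plugged in by pointing the table at a different implementation.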

* [PATCH v5 2/5] sunrpc: create nfsd dir in rpc_pipefs
  2012-02-01 15:44 [PATCH v5 0/5] nfsd: overhaul the client name tracking code Jeff Layton
  2012-02-01 15:44 ` [PATCH v5 1/5] nfsd: add nfsd4_client_tracking_ops struct and a way to set it Jeff Layton
@ 2012-02-01 15:44 ` Jeff Layton
  2012-02-01 15:44 ` [PATCH v5 3/5] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field Jeff Layton
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Jeff Layton @ 2012-02-01 15:44 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Add a new top-level dir in rpc_pipefs to hold the pipe for the clientid
upcall.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 net/sunrpc/rpc_pipe.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 63a7a7a..3dc2c6e 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -986,6 +986,7 @@ enum {
 	RPCAUTH_statd,
 	RPCAUTH_nfsd4_cb,
 	RPCAUTH_cache,
+	RPCAUTH_nfsd,
 	RPCAUTH_RootEOF
 };
 
@@ -1018,6 +1019,10 @@ static const struct rpc_filelist files[] = {
 		.name = "cache",
 		.mode = S_IFDIR | S_IRUGO | S_IXUGO,
 	},
+	[RPCAUTH_nfsd] = {
+		.name = "nfsd",
+		.mode = S_IFDIR | S_IRUGO | S_IXUGO,
+	},
 };
 
 static int
-- 
1.7.7.6



* [PATCH v5 3/5] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
  2012-02-01 15:44 [PATCH v5 0/5] nfsd: overhaul the client name tracking code Jeff Layton
  2012-02-01 15:44 ` [PATCH v5 1/5] nfsd: add nfsd4_client_tracking_ops struct and a way to set it Jeff Layton
  2012-02-01 15:44 ` [PATCH v5 2/5] sunrpc: create nfsd dir in rpc_pipefs Jeff Layton
@ 2012-02-01 15:44 ` Jeff Layton
  2012-02-03 19:35   ` J. Bruce Fields
  2012-02-01 15:44 ` [PATCH v5 4/5] nfsd: add a header describing upcall to nfsdcld Jeff Layton
  2012-02-01 15:44 ` [PATCH v5 5/5] nfsd: add the infrastructure to handle the cld upcall Jeff Layton
  4 siblings, 1 reply; 16+ messages in thread
From: Jeff Layton @ 2012-02-01 15:44 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

We'll need a way to flag the nfs4_client as already being recorded on
stable storage so that we don't continually upcall.

The cl_cb_flags field is only using 2 bits right now, so repurpose that
to a generic flags field. Rename NFSD4_CLIENT_KILL to
NFSD4_CLIENT_CB_KILL to make it evident that it's part of the callback
flags. Add a mask that we can use for existing checks that look to see
whether any flags are set, so that the new STABLE flag doesn't
interfere.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/nfsd/nfs4callback.c |   12 ++++++------
 fs/nfsd/state.h        |    8 +++++---
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 6f3ebb4..730e2cc 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -756,7 +756,7 @@ void nfsd4_probe_callback(struct nfs4_client *clp)
 {
 	/* XXX: atomicity?  Also, should we be using cl_cb_flags? */
 	clp->cl_cb_state = NFSD4_CB_UNKNOWN;
-	set_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_cb_flags);
+	set_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
 	do_probe_callback(clp);
 }
 
@@ -915,7 +915,7 @@ void nfsd4_destroy_callback_queue(void)
 /* must be called under the state lock */
 void nfsd4_shutdown_callback(struct nfs4_client *clp)
 {
-	set_bit(NFSD4_CLIENT_KILL, &clp->cl_cb_flags);
+	set_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
 	/*
 	 * Note this won't actually result in a null callback;
 	 * instead, nfsd4_do_callback_rpc() will detect the killed
@@ -966,15 +966,15 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
 		svc_xprt_put(clp->cl_cb_conn.cb_xprt);
 		clp->cl_cb_conn.cb_xprt = NULL;
 	}
-	if (test_bit(NFSD4_CLIENT_KILL, &clp->cl_cb_flags))
+	if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
 		return;
 	spin_lock(&clp->cl_lock);
 	/*
 	 * Only serialized callback code is allowed to clear these
 	 * flags; main nfsd code can only set them:
 	 */
-	BUG_ON(!clp->cl_cb_flags);
-	clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_cb_flags);
+	BUG_ON(!(clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK));
+	clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
 	memcpy(&conn, &cb->cb_clp->cl_cb_conn, sizeof(struct nfs4_cb_conn));
 	c = __nfsd4_find_backchannel(clp);
 	if (c) {
@@ -1000,7 +1000,7 @@ void nfsd4_do_callback_rpc(struct work_struct *w)
 	struct nfs4_client *clp = cb->cb_clp;
 	struct rpc_clnt *clnt;
 
-	if (clp->cl_cb_flags)
+	if (clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK)
 		nfsd4_process_cb_update(cb);
 
 	clnt = clp->cl_cb_client;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 8f539ee..0f9a0ac 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -250,9 +250,11 @@ struct nfs4_client {
 
 	/* for v4.0 and v4.1 callbacks: */
 	struct nfs4_cb_conn	cl_cb_conn;
-#define NFSD4_CLIENT_CB_UPDATE	1
-#define NFSD4_CLIENT_KILL	2
-	unsigned long		cl_cb_flags;
+#define NFSD4_CLIENT_STABLE		(0)	/* client on stable storage */
+#define NFSD4_CLIENT_CB_UPDATE		(1)
+#define NFSD4_CLIENT_CB_KILL		(2)
+#define NFSD4_CLIENT_CB_FLAG_MASK	(0x6)
+	unsigned long		cl_flags;
 	struct rpc_clnt		*cl_cb_client;
 	u32			cl_cb_ident;
 #define NFSD4_CB_UP		0
-- 
1.7.7.6


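The relationship between the bit numbers and the mask is easy to check in
isolation: bits 1 and 2 are the callback bits, so 0x6 covers exactly
those, and the new STABLE bit (bit 0) stays outside it. A userspace
sketch with stand-ins for the kernel's set_bit/test_bit:

```c
#include <assert.h>

/* Bit numbers as in the patch; the mask covers only the callback bits */
#define NFSD4_CLIENT_STABLE		0	/* client on stable storage */
#define NFSD4_CLIENT_CB_UPDATE		1
#define NFSD4_CLIENT_CB_KILL		2
#define NFSD4_CLIENT_CB_FLAG_MASK	0x6

/* Userspace stand-ins for the kernel's set_bit/test_bit */
static void set_flag(unsigned long *flags, int bit)
{
	*flags |= 1UL << bit;
}

static int test_flag(unsigned long flags, int bit)
{
	return !!(flags & (1UL << bit));
}
```

Setting STABLE leaves the masked region untouched, so the existing
"any callback flags set?" checks keep working unmodified.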

* [PATCH v5 4/5] nfsd: add a header describing upcall to nfsdcld
  2012-02-01 15:44 [PATCH v5 0/5] nfsd: overhaul the client name tracking code Jeff Layton
                   ` (2 preceding siblings ...)
  2012-02-01 15:44 ` [PATCH v5 3/5] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field Jeff Layton
@ 2012-02-01 15:44 ` Jeff Layton
  2012-02-01 15:44 ` [PATCH v5 5/5] nfsd: add the infrastructure to handle the cld upcall Jeff Layton
  4 siblings, 0 replies; 16+ messages in thread
From: Jeff Layton @ 2012-02-01 15:44 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

The daemon takes a versioned binary struct, which should allow us to
revise the struct later if it becomes necessary.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 include/linux/nfsd/cld.h |   56 ++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 56 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/nfsd/cld.h

diff --git a/include/linux/nfsd/cld.h b/include/linux/nfsd/cld.h
new file mode 100644
index 0000000..c478642
--- /dev/null
+++ b/include/linux/nfsd/cld.h
@@ -0,0 +1,56 @@
+/*
+ * Upcall description for nfsdcld communication
+ *
+ * Copyright (c) 2011 Red Hat, Inc.
+ * Author(s): Jeff Layton <jlayton@redhat.com>
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#ifndef _NFSD_CLD_H
+#define _NFSD_CLD_H
+
+/* latest upcall version available */
+#define CLD_UPCALL_VERSION 1
+
+/* defined by RFC3530 */
+#define NFS4_OPAQUE_LIMIT 1024
+
+enum cld_command {
+	Cld_Create,		/* create a record for this cm_id */
+	Cld_Remove,		/* remove record of this cm_id */
+	Cld_Check,		/* is this cm_id allowed? */
+	Cld_GraceDone,		/* grace period is complete */
+};
+
+/* representation of long-form NFSv4 client ID */
+struct cld_name {
+	uint16_t	cn_len;				/* length of cm_id */
+	unsigned char	cn_id[NFS4_OPAQUE_LIMIT];	/* client-provided */
+} __attribute__((packed));
+
+/* message struct for communication with userspace */
+struct cld_msg {
+	uint8_t		cm_vers;		/* upcall version */
+	uint8_t		cm_cmd;			/* upcall command */
+	int16_t		cm_status;		/* return code */
+	uint32_t	cm_xid;			/* transaction id */
+	union {
+		int64_t		cm_gracetime;	/* grace period start time */
+		struct cld_name	cm_name;
+	} __attribute__((packed)) cm_u;
+} __attribute__((packed));
+
+#endif /* !_NFSD_CLD_H */
-- 
1.7.7.6


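A userspace daemon on the other end of the pipe reads exactly one packed
struct per upcall, fills in cm_status, and writes the message back (patch
5's cld_pipe_downcall is the kernel side of that). The layout can be
sanity-checked by mirroring the header in userspace. Re-declaring the
structs here is only for illustration -- real code would include the
header -- and GCC/Clang packed-attribute semantics are assumed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLD_UPCALL_VERSION 1
#define NFS4_OPAQUE_LIMIT 1024

/* Userspace mirror of the structs in include/linux/nfsd/cld.h */
struct cld_name {
	uint16_t	cn_len;
	unsigned char	cn_id[NFS4_OPAQUE_LIMIT];
} __attribute__((packed));

struct cld_msg {
	uint8_t		cm_vers;
	uint8_t		cm_cmd;
	int16_t		cm_status;
	uint32_t	cm_xid;
	union {
		int64_t		cm_gracetime;
		struct cld_name	cm_name;
	} __attribute__((packed)) cm_u;
} __attribute__((packed));
```

With everything packed there is no padding: an 8-byte fixed header
followed by a 1026-byte union, so each read/write on the pipe moves 1034
bytes regardless of which command is in flight.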

* [PATCH v5 5/5] nfsd: add the infrastructure to handle the cld upcall
  2012-02-01 15:44 [PATCH v5 0/5] nfsd: overhaul the client name tracking code Jeff Layton
                   ` (3 preceding siblings ...)
  2012-02-01 15:44 ` [PATCH v5 4/5] nfsd: add a header describing upcall to nfsdcld Jeff Layton
@ 2012-02-01 15:44 ` Jeff Layton
  2012-02-03 22:57   ` J. Bruce Fields
  4 siblings, 1 reply; 16+ messages in thread
From: Jeff Layton @ 2012-02-01 15:44 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

...and add a mechanism for switching between the "legacy" tracker and
the new one. The decision is made by looking to see whether the
v4recoverydir exists. If it does, then the legacy client tracker is
used.

If it doesn't, then the kernel will create a "cld" pipe in rpc_pipefs.
That pipe is used to talk to a daemon for handling the upcall.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/nfsd/nfs4recover.c |  364 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 363 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 3fbf1f4..592fe3d 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -1,5 +1,6 @@
 /*
 *  Copyright (c) 2004 The Regents of the University of Michigan.
+*  Copyright (c) 2011 Jeff Layton <jlayton@redhat.com>
 *  All rights reserved.
 *
 *  Andy Adamson <andros@citi.umich.edu>
@@ -36,6 +37,10 @@
 #include <linux/namei.h>
 #include <linux/crypto.h>
 #include <linux/sched.h>
+#include <linux/fs.h>
+#include <linux/sunrpc/rpc_pipe_fs.h>
+#include <linux/sunrpc/clnt.h>
+#include <linux/nfsd/cld.h>
 
 #include "nfsd.h"
 #include "state.h"
@@ -472,12 +477,369 @@ static struct nfsd4_client_tracking_ops nfsd4_legacy_tracking_ops = {
 	.grace_done	= nfsd4_recdir_purge_old,
 };
 
+/* Globals */
+#define NFSD_PIPE_DIR		"/nfsd"
+
+static struct dentry *cld_pipe;
+
+/* list of cld_msg's that are currently in use */
+static DEFINE_SPINLOCK(cld_lock);
+static LIST_HEAD(cld_list);
+static unsigned int cld_xid;
+
+struct cld_upcall {
+	struct list_head	cu_list;
+	struct task_struct	*cu_task;
+	struct cld_msg	cu_msg;
+};
+
+static int
+__cld_pipe_upcall(struct cld_msg *cmsg)
+{
+	int ret;
+	struct rpc_pipe_msg msg;
+	struct inode *inode = cld_pipe->d_inode;
+
+	memset(&msg, 0, sizeof(msg));
+	msg.data = cmsg;
+	msg.len = sizeof(*cmsg);
+
+	/*
+	 * Set task state before we queue the upcall. That prevents
+	 * wake_up_process in the downcall from racing with schedule.
+	 */
+	set_current_state(TASK_UNINTERRUPTIBLE);
+	ret = rpc_queue_upcall(inode, &msg);
+	if (ret < 0) {
+		set_current_state(TASK_RUNNING);
+		goto out;
+	}
+
+	schedule();
+	set_current_state(TASK_RUNNING);
+
+	if (msg.errno < 0)
+		ret = msg.errno;
+out:
+	return ret;
+}
+
+static int
+cld_pipe_upcall(struct cld_msg *cmsg)
+{
+	int ret;
+
+	/*
+	 * -EAGAIN occurs when the pipe is closed and reopened while there are
+	 *  upcalls queued.
+	 */
+	do {
+		ret = __cld_pipe_upcall(cmsg);
+	} while (ret == -EAGAIN);
+
+	return ret;
+}
+
+static ssize_t
+cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
+{
+	struct cld_upcall *tmp, *cup;
+	struct cld_msg *cmsg = (struct cld_msg *)src;
+	uint32_t xid;
+
+	if (mlen != sizeof(*cmsg)) {
+		dprintk("%s: got %zu bytes, expected %zu\n", __func__, mlen,
+			sizeof(*cmsg));
+		return -EINVAL;
+	}
+
+	/* copy just the xid so we can try to find that */
+	if (copy_from_user(&xid, &cmsg->cm_xid, sizeof(xid)) != 0) {
+		dprintk("%s: error when copying xid from userspace", __func__);
+		return -EFAULT;
+	}
+
+	/* walk the list and find corresponding xid */
+	cup = NULL;
+	spin_lock(&cld_lock);
+	list_for_each_entry(tmp, &cld_list, cu_list) {
+		if (get_unaligned(&tmp->cu_msg.cm_xid) == xid) {
+			cup = tmp;
+			list_del_init(&cup->cu_list);
+			break;
+		}
+	}
+	spin_unlock(&cld_lock);
+
+	/* couldn't find upcall? */
+	if (!cup) {
+		dprintk("%s: couldn't find upcall -- xid=%u\n", __func__,
+			xid);
+		return -EINVAL;
+	}
+
+	if (copy_from_user(&cup->cu_msg, src, mlen) != 0)
+		return -EFAULT;
+
+	wake_up_process(cup->cu_task);
+	return mlen;
+}
+
+static void
+cld_pipe_destroy_msg(struct rpc_pipe_msg *msg)
+{
+	struct cld_msg *cmsg = msg->data;
+	struct cld_upcall *cup = container_of(cmsg, struct cld_upcall,
+						 cu_msg);
+
+	/* errno >= 0 means we got a downcall */
+	if (msg->errno >= 0)
+		return;
+
+	wake_up_process(cup->cu_task);
+}
+
+static const struct rpc_pipe_ops cld_upcall_ops = {
+	.upcall		= rpc_pipe_generic_upcall,
+	.downcall	= cld_pipe_downcall,
+	.destroy_msg	= cld_pipe_destroy_msg,
+};
+
+/* Initialize rpc_pipefs pipe for communication with client tracking daemon */
+static int
+nfsd4_init_cld_pipe(void)
+{
+	int ret;
+	struct path path;
+	struct vfsmount *mnt;
+
+	if (cld_pipe)
+		return 0;
+
+	mnt = rpc_get_mount();
+	if (IS_ERR(mnt))
+		return PTR_ERR(mnt);
+
+	ret = vfs_path_lookup(mnt->mnt_root, mnt, NFSD_PIPE_DIR, 0, &path);
+	if (ret)
+		goto err;
+
+	cld_pipe = rpc_mkpipe(path.dentry, "cld", NULL,
+				&cld_upcall_ops, RPC_PIPE_WAIT_FOR_OPEN);
+	path_put(&path);
+	if (!IS_ERR(cld_pipe))
+		return 0;
+
+	ret = PTR_ERR(cld_pipe);
+err:
+	rpc_put_mount();
+	printk(KERN_ERR "NFSD: unable to create nfsdcld upcall pipe (%d)\n",
+			ret);
+	return ret;
+}
+
+static void
+nfsd4_remove_cld_pipe(void)
+{
+	int ret;
+
+	ret = rpc_unlink(cld_pipe);
+	if (ret)
+		printk(KERN_ERR "NFSD: error removing cld pipe: %d\n", ret);
+	cld_pipe = NULL;
+	rpc_put_mount();
+}
+
+static struct cld_upcall *
+alloc_cld_upcall(void)
+{
+	struct cld_upcall *new, *tmp;
+
+	new = kzalloc(sizeof(*new), GFP_KERNEL);
+	if (!new)
+		return new;
+
+	/* FIXME: hard cap on number in flight? */
+restart_search:
+	spin_lock(&cld_lock);
+	list_for_each_entry(tmp, &cld_list, cu_list) {
+		if (tmp->cu_msg.cm_xid == cld_xid) {
+			cld_xid++;
+			spin_unlock(&cld_lock);
+			goto restart_search;
+		}
+	}
+	new->cu_task = current;
+	new->cu_msg.cm_vers = CLD_UPCALL_VERSION;
+	put_unaligned(cld_xid++, &new->cu_msg.cm_xid);
+	list_add(&new->cu_list, &cld_list);
+	spin_unlock(&cld_lock);
+
+	dprintk("%s: allocated xid %u\n", __func__, new->cu_msg.cm_xid);
+
+	return new;
+}
+
+static void
+free_cld_upcall(struct cld_upcall *victim)
+{
+	spin_lock(&cld_lock);
+	list_del(&victim->cu_list);
+	spin_unlock(&cld_lock);
+	kfree(victim);
+}
+
+/* Ask daemon to create a new record */
+static void
+nfsd4_cld_create(struct nfs4_client *clp)
+{
+	int ret;
+	struct cld_upcall *cup;
+
+	/* Don't upcall if it's already stored */
+	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
+		return;
+
+	cup = alloc_cld_upcall();
+	if (!cup) {
+		ret = -ENOMEM;
+		goto out_err;
+	}
+
+	cup->cu_msg.cm_cmd = Cld_Create;
+	cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
+	memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
+			clp->cl_name.len);
+
+	ret = cld_pipe_upcall(&cup->cu_msg);
+	if (!ret) {
+		ret = cup->cu_msg.cm_status;
+		set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
+	}
+
+	free_cld_upcall(cup);
+out_err:
+	if (ret)
+		printk(KERN_ERR "NFSD: Unable to create client "
+				"record on stable storage: %d\n", ret);
+}
+
+/* Ask daemon to remove a record */
+static void
+nfsd4_cld_remove(struct nfs4_client *clp)
+{
+	int ret;
+	struct cld_upcall *cup;
+
+	/* Don't upcall if it's already removed */
+	if (!test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
+		return;
+
+	cup = alloc_cld_upcall();
+	if (!cup) {
+		ret = -ENOMEM;
+		goto out_err;
+	}
+
+	cup->cu_msg.cm_cmd = Cld_Remove;
+	cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
+	memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
+			clp->cl_name.len);
+
+	ret = cld_pipe_upcall(&cup->cu_msg);
+	if (!ret) {
+		ret = cup->cu_msg.cm_status;
+		clear_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
+	}
+
+	free_cld_upcall(cup);
+out_err:
+	if (ret)
+		printk(KERN_ERR "NFSD: Unable to remove client "
+				"record from stable storage: %d\n", ret);
+}
+/* Check for presence of a record, and update its timestamp */
+static int
+nfsd4_cld_check(struct nfs4_client *clp)
+{
+	int ret;
+	struct cld_upcall *cup;
+
+	/* Don't upcall if one was already stored during this grace pd */
+	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
+		return 0;
+
+	cup = alloc_cld_upcall();
+	if (!cup) {
+		printk(KERN_ERR "NFSD: Unable to check client record on "
+				"stable storage: %d\n", -ENOMEM);
+		return -ENOMEM;
+	}
+
+	cup->cu_msg.cm_cmd = Cld_Check;
+	cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
+	memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
+			clp->cl_name.len);
+
+	ret = cld_pipe_upcall(&cup->cu_msg);
+	if (!ret) {
+		ret = cup->cu_msg.cm_status;
+		set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
+	}
+
+	free_cld_upcall(cup);
+	return ret;
+}
+
+static void
+nfsd4_cld_grace_done(time_t boot_time)
+{
+	int ret;
+	struct cld_upcall *cup;
+
+	cup = alloc_cld_upcall();
+	if (!cup) {
+		ret = -ENOMEM;
+		goto out_err;
+	}
+
+	cup->cu_msg.cm_cmd = Cld_GraceDone;
+	cup->cu_msg.cm_u.cm_gracetime = (int64_t)boot_time;
+	ret = cld_pipe_upcall(&cup->cu_msg);
+	if (!ret)
+		ret = cup->cu_msg.cm_status;
+
+	free_cld_upcall(cup);
+out_err:
+	if (ret)
+		printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
+}
+
+static struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
+	.init		= nfsd4_init_cld_pipe,
+	.exit		= nfsd4_remove_cld_pipe,
+	.create		= nfsd4_cld_create,
+	.remove		= nfsd4_cld_remove,
+	.check		= nfsd4_cld_check,
+	.grace_done	= nfsd4_cld_grace_done,
+};
+
 int
 nfsd4_client_tracking_init(void)
 {
 	int status;
+	struct path path;
 
-	client_tracking_ops = &nfsd4_legacy_tracking_ops;
+	if (!client_tracking_ops) {
+		client_tracking_ops = &nfsd4_cld_tracking_ops;
+		status = kern_path(nfs4_recoverydir(), LOOKUP_FOLLOW, &path);
+		if (!status) {
+			if (S_ISDIR(path.dentry->d_inode->i_mode))
+				client_tracking_ops =
+						&nfsd4_legacy_tracking_ops;
+			path_put(&path);
+		}
+	}
 
 	if (!client_tracking_ops->init)
 		return 0;
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 1/5] nfsd: add nfsd4_client_tracking_ops struct and a way to set it
  2012-02-01 15:44 ` [PATCH v5 1/5] nfsd: add nfsd4_client_tracking_ops struct and a way to set it Jeff Layton
@ 2012-02-02 22:45   ` J. Bruce Fields
  2012-02-03 19:22     ` Jeff Layton
  0 siblings, 1 reply; 16+ messages in thread
From: J. Bruce Fields @ 2012-02-02 22:45 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-nfs

Just nits:

On Wed, Feb 01, 2012 at 10:44:08AM -0500, Jeff Layton wrote:
> +static struct nfsd4_client_tracking_ops nfsd4_legacy_tracking_ops = {
> +	.init		= nfsd4_load_reboot_recovery_data,
> +	.exit		= nfsd4_shutdown_recdir,
> +	.create		= nfsd4_create_clid_dir,
> +	.remove		= nfsd4_remove_clid_dir,
> +	.check		= nfsd4_check_legacy_client,
> +	.grace_done	= nfsd4_recdir_purge_old,
> +};
> +
> +int
> +nfsd4_client_tracking_init(void)
> +{
> +	int status;
> +
> +	client_tracking_ops = &nfsd4_legacy_tracking_ops;
> +
> +	if (!client_tracking_ops->init)
> +		return 0;

Is that check necessary?

> +
> +	status = client_tracking_ops->init();
> +	if (status) {
> +		printk(KERN_WARNING "NFSD: Unable to initialize client "
> +				    "recovery tracking! (%d)\n", status);
> +		client_tracking_ops = NULL;
> +	}
> +	return status;
> +}
> +
> +void
> +nfsd4_client_tracking_exit(void)
> +{
> +	if (client_tracking_ops && client_tracking_ops->exit)

In general I don't see the point of handling the case where one of the
ops is NULL.  If there was some implementation that really wanted that
we could get the same effect by defining a default no-op implementation
they could use, but I don't think there actually is?

> +int
> +nfsd4_client_record_check(struct nfs4_client *clp)
> +{
> +	if (client_tracking_ops && client_tracking_ops->check)
> +		return client_tracking_ops->check(clp);
> +
> +	return -EOPNOTSUPP;
> +}
> +
> +void
> +nfsd4_grace_done(time_t boot_time)

That name's a bit generic--but I don't have a better suggestion.
(nfsd4_record_grace_done()?)

Looks basically fine, though.

--b.



* Re: [PATCH v5 1/5] nfsd: add nfsd4_client_tracking_ops struct and a way to set it
  2012-02-02 22:45   ` J. Bruce Fields
@ 2012-02-03 19:22     ` Jeff Layton
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Layton @ 2012-02-03 19:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Thu, 2 Feb 2012 17:45:41 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> Just nits:
> 
> On Wed, Feb 01, 2012 at 10:44:08AM -0500, Jeff Layton wrote:
> > +static struct nfsd4_client_tracking_ops nfsd4_legacy_tracking_ops = {
> > +	.init		= nfsd4_load_reboot_recovery_data,
> > +	.exit		= nfsd4_shutdown_recdir,
> > +	.create		= nfsd4_create_clid_dir,
> > +	.remove		= nfsd4_remove_clid_dir,
> > +	.check		= nfsd4_check_legacy_client,
> > +	.grace_done	= nfsd4_recdir_purge_old,
> > +};
> > +
> > +int
> > +nfsd4_client_tracking_init(void)
> > +{
> > +	int status;
> > +
> > +	client_tracking_ops = &nfsd4_legacy_tracking_ops;
> > +
> > +	if (!client_tracking_ops->init)
> > +		return 0;
> 
> Is that check necessary?
> 

No, see below...

> > +
> > +	status = client_tracking_ops->init();
> > +	if (status) {
> > +		printk(KERN_WARNING "NFSD: Unable to initialize client "
> > +				    "recovery tracking! (%d)\n", status);
> > +		client_tracking_ops = NULL;
> > +	}
> > +	return status;
> > +}
> > +
> > +void
> > +nfsd4_client_tracking_exit(void)
> > +{
> > +	if (client_tracking_ops && client_tracking_ops->exit)
> 
> In general I don't see the point of handling the case where one of the
> ops is NULL.  If there was some implementation that really wanted that
> we could get the same effect by defining a default no-op implementation
> they could use, but I don't think there actually is?
> 

In an earlier iteration of this patchset, some of these fields were
NULL pointers, but in the current version they're all populated. I
can remove those checks if you'd prefer.

> > +int
> > +nfsd4_client_record_check(struct nfs4_client *clp)
> > +{
> > +	if (client_tracking_ops && client_tracking_ops->check)
> > +		return client_tracking_ops->check(clp);
> > +
> > +	return -EOPNOTSUPP;
> > +}
> > +
> > +void
> > +nfsd4_grace_done(time_t boot_time)
> 
> That name's a bit generic--but I don't have a better suggestion.
> (nfsd4_record_grace_done()?)
> 

Ok, that sounds reasonable. I'll rename it when I respin this patch in
the next week or so.

Thanks,
-- 
Jeff Layton <jlayton@redhat.com>


* Re: [PATCH v5 3/5] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
  2012-02-01 15:44 ` [PATCH v5 3/5] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field Jeff Layton
@ 2012-02-03 19:35   ` J. Bruce Fields
  2012-02-04 12:21     ` Jeff Layton
  2012-02-08 21:00     ` Jeff Layton
  0 siblings, 2 replies; 16+ messages in thread
From: J. Bruce Fields @ 2012-02-03 19:35 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-nfs

On Wed, Feb 01, 2012 at 10:44:10AM -0500, Jeff Layton wrote:
> We'll need a way to flag the nfs4_client as already being recorded on
> stable storage so that we don't continually upcall.

Looks like the code currently uses cl_firststate for that purpose.

I'm perfectly open to the argument that using a u32 to store a boolean
is rather silly, and can't see any problem with doing it this way
instead, but then let's remove cl_firststate while we're at it.

--b.

> 
> The cl_cb_flags field is only using 2 bits right now, so repurpose that
> to a generic flags field. Rename NFSD4_CLIENT_KILL to
> NFSD4_CLIENT_CB_KILL to make it evident that it's part of the callback
> flags. Add a mask that we can use for existing checks that look to see
> whether any flags are set, so that the new STABLE flag doesn't
> interfere.
> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  fs/nfsd/nfs4callback.c |   12 ++++++------
>  fs/nfsd/state.h        |    8 +++++---
>  2 files changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 6f3ebb4..730e2cc 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -756,7 +756,7 @@ void nfsd4_probe_callback(struct nfs4_client *clp)
>  {
>  	/* XXX: atomicity?  Also, should we be using cl_cb_flags? */
>  	clp->cl_cb_state = NFSD4_CB_UNKNOWN;
> -	set_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_cb_flags);
> +	set_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
>  	do_probe_callback(clp);
>  }
>  
> @@ -915,7 +915,7 @@ void nfsd4_destroy_callback_queue(void)
>  /* must be called under the state lock */
>  void nfsd4_shutdown_callback(struct nfs4_client *clp)
>  {
> -	set_bit(NFSD4_CLIENT_KILL, &clp->cl_cb_flags);
> +	set_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
>  	/*
>  	 * Note this won't actually result in a null callback;
>  	 * instead, nfsd4_do_callback_rpc() will detect the killed
> @@ -966,15 +966,15 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
>  		svc_xprt_put(clp->cl_cb_conn.cb_xprt);
>  		clp->cl_cb_conn.cb_xprt = NULL;
>  	}
> -	if (test_bit(NFSD4_CLIENT_KILL, &clp->cl_cb_flags))
> +	if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
>  		return;
>  	spin_lock(&clp->cl_lock);
>  	/*
>  	 * Only serialized callback code is allowed to clear these
>  	 * flags; main nfsd code can only set them:
>  	 */
> -	BUG_ON(!clp->cl_cb_flags);
> -	clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_cb_flags);
> +	BUG_ON(!(clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK));
> +	clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
>  	memcpy(&conn, &cb->cb_clp->cl_cb_conn, sizeof(struct nfs4_cb_conn));
>  	c = __nfsd4_find_backchannel(clp);
>  	if (c) {
> @@ -1000,7 +1000,7 @@ void nfsd4_do_callback_rpc(struct work_struct *w)
>  	struct nfs4_client *clp = cb->cb_clp;
>  	struct rpc_clnt *clnt;
>  
> -	if (clp->cl_cb_flags)
> +	if (clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK)
>  		nfsd4_process_cb_update(cb);
>  
>  	clnt = clp->cl_cb_client;
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 8f539ee..0f9a0ac 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -250,9 +250,11 @@ struct nfs4_client {
>  
>  	/* for v4.0 and v4.1 callbacks: */
>  	struct nfs4_cb_conn	cl_cb_conn;
> -#define NFSD4_CLIENT_CB_UPDATE	1
> -#define NFSD4_CLIENT_KILL	2
> -	unsigned long		cl_cb_flags;
> +#define NFSD4_CLIENT_STABLE		(0)	/* client on stable storage */
> +#define NFSD4_CLIENT_CB_UPDATE		(1)
> +#define NFSD4_CLIENT_CB_KILL		(2)
> +#define NFSD4_CLIENT_CB_FLAG_MASK	(0x6)
> +	unsigned long		cl_flags;
>  	struct rpc_clnt		*cl_cb_client;
>  	u32			cl_cb_ident;
>  #define NFSD4_CB_UP		0
> -- 
> 1.7.7.6
> 


* Re: [PATCH v5 5/5] nfsd: add the infrastructure to handle the cld upcall
  2012-02-01 15:44 ` [PATCH v5 5/5] nfsd: add the infrastructure to handle the cld upcall Jeff Layton
@ 2012-02-03 22:57   ` J. Bruce Fields
  2012-02-04 11:49     ` Jeff Layton
  0 siblings, 1 reply; 16+ messages in thread
From: J. Bruce Fields @ 2012-02-03 22:57 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-nfs

On Wed, Feb 01, 2012 at 10:44:12AM -0500, Jeff Layton wrote:
> ...and add a mechanism for switching between the "legacy" tracker and
> the new one. The decision is made by looking to see whether the
> v4recoverydir exists. If it does, then the legacy client tracker is
> used.
> 
> If it's not, then the kernel will create a "cld" pipe in rpc_pipefs.
> That pipe is used to talk to a daemon for handling the upcall.
> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  fs/nfsd/nfs4recover.c |  364 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 363 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> index 3fbf1f4..592fe3d 100644
> --- a/fs/nfsd/nfs4recover.c
> +++ b/fs/nfsd/nfs4recover.c
> @@ -1,5 +1,6 @@
>  /*
>  *  Copyright (c) 2004 The Regents of the University of Michigan.
> +*  Copyright (c) 2011 Jeff Layton <jlayton@redhat.com>
>  *  All rights reserved.
>  *
>  *  Andy Adamson <andros@citi.umich.edu>
> @@ -36,6 +37,10 @@
>  #include <linux/namei.h>
>  #include <linux/crypto.h>
>  #include <linux/sched.h>
> +#include <linux/fs.h>
> +#include <linux/sunrpc/rpc_pipe_fs.h>
> +#include <linux/sunrpc/clnt.h>
> +#include <linux/nfsd/cld.h>
>  
>  #include "nfsd.h"
>  #include "state.h"
> @@ -472,12 +477,369 @@ static struct nfsd4_client_tracking_ops nfsd4_legacy_tracking_ops = {
>  	.grace_done	= nfsd4_recdir_purge_old,
>  };
>  
> +/* Globals */
> +#define NFSD_PIPE_DIR		"/nfsd"
> +
> +static struct dentry *cld_pipe;
> +
> +/* list of cld_msg's that are currently in use */
> +static DEFINE_SPINLOCK(cld_lock);
> +static LIST_HEAD(cld_list);
> +static unsigned int cld_xid;
> +
> +struct cld_upcall {
> +	struct list_head	cu_list;
> +	struct task_struct	*cu_task;
> +	struct cld_msg	cu_msg;
> +};
> +
> +static int
> +__cld_pipe_upcall(struct cld_msg *cmsg)
> +{
> +	int ret;
> +	struct rpc_pipe_msg msg;
> +	struct inode *inode = cld_pipe->d_inode;
> +
> +	memset(&msg, 0, sizeof(msg));
> +	msg.data = cmsg;
> +	msg.len = sizeof(*cmsg);
> +
> +	/*
> +	 * Set task state before we queue the upcall. That prevents
> +	 * wake_up_process in the downcall from racing with schedule.
> +	 */
> +	set_current_state(TASK_UNINTERRUPTIBLE);

Is there a risk of nfsd being left unkillable if the daemon dies or
becomes unresponsive?

> +	ret = rpc_queue_upcall(inode, &msg);
> +	if (ret < 0) {
> +		set_current_state(TASK_RUNNING);
> +		goto out;
> +	}
> +
> +	schedule();
> +	set_current_state(TASK_RUNNING);
> +
> +	if (msg.errno < 0)
> +		ret = msg.errno;
> +out:
> +	return ret;
> +}
> +
> +static int
> +cld_pipe_upcall(struct cld_msg *cmsg)
> +{
> +	int ret;
> +
> +	/*
> +	 * -EAGAIN occurs when pipe is closed and reopened while there are
> +	 *  upcalls queued.
> +	 */
> +	do {
> +		ret = __cld_pipe_upcall(cmsg);
> +	} while (ret == -EAGAIN);
> +
> +	return ret;
> +}
> +
> +static ssize_t
> +cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
> +{
> +	struct cld_upcall *tmp, *cup;
> +	struct cld_msg *cmsg = (struct cld_msg *)src;
> +	uint32_t xid;
> +
> +	if (mlen != sizeof(*cmsg)) {
> +		dprintk("%s: got %zu bytes, expected %zu\n", __func__, mlen,
> +			sizeof(*cmsg));
> +		return -EINVAL;
> +	}
> +
> +	/* copy just the xid so we can try to find that */
> +	if (copy_from_user(&xid, &cmsg->cm_xid, sizeof(xid)) != 0) {
> +		dprintk("%s: error when copying xid from userspace", __func__);
> +		return -EFAULT;
> +	}
> +
> +	/* walk the list and find corresponding xid */
> +	cup = NULL;
> +	spin_lock(&cld_lock);
> +	list_for_each_entry(tmp, &cld_list, cu_list) {
> +		if (get_unaligned(&tmp->cu_msg.cm_xid) == xid) {
> +			cup = tmp;
> +			list_del_init(&cup->cu_list);
> +			break;
> +		}
> +	}
> +	spin_unlock(&cld_lock);
> +
> +	/* couldn't find upcall? */
> +	if (!cup) {
> +		dprintk("%s: couldn't find upcall -- xid=%u\n", __func__,
> +			cup->cu_msg.cm_xid);
> +		return -EINVAL;
> +	}
> +
> +	if (copy_from_user(&cup->cu_msg, src, mlen) != 0)
> +		return -EFAULT;
> +
> +	wake_up_process(cup->cu_task);
> +	return mlen;
> +}
> +
> +static void
> +cld_pipe_destroy_msg(struct rpc_pipe_msg *msg)
> +{
> +	struct cld_msg *cmsg = msg->data;
> +	struct cld_upcall *cup = container_of(cmsg, struct cld_upcall,
> +						 cu_msg);
> +
> +	/* errno >= 0 means we got a downcall */
> +	if (msg->errno >= 0)
> +		return;
> +
> +	wake_up_process(cup->cu_task);
> +}
> +
> +static const struct rpc_pipe_ops cld_upcall_ops = {
> +	.upcall		= rpc_pipe_generic_upcall,
> +	.downcall	= cld_pipe_downcall,
> +	.destroy_msg	= cld_pipe_destroy_msg,
> +};
> +
> +/* Initialize rpc_pipefs pipe for communication with client tracking daemon */
> +static int
> +nfsd4_init_cld_pipe(void)
> +{
> +	int ret;
> +	struct path path;
> +	struct vfsmount *mnt;
> +
> +	if (cld_pipe)
> +		return 0;
> +
> +	mnt = rpc_get_mount();
> +	if (IS_ERR(mnt))
> +		return PTR_ERR(mnt);
> +
> +	ret = vfs_path_lookup(mnt->mnt_root, mnt, NFSD_PIPE_DIR, 0, &path);
> +	if (ret)
> +		goto err;
> +
> +	cld_pipe = rpc_mkpipe(path.dentry, "cld", NULL,
> +				&cld_upcall_ops, RPC_PIPE_WAIT_FOR_OPEN);
> +	path_put(&path);
> +	if (!IS_ERR(cld_pipe))
> +		return 0;
> +
> +	ret = PTR_ERR(cld_pipe);
> +err:
> +	rpc_put_mount();
> +	printk(KERN_ERR "NFSD: unable to create nfsdcld upcall pipe (%d)\n",
> +			ret);
> +	return ret;
> +}
> +
> +static void
> +nfsd4_remove_cld_pipe(void)
> +{
> +	int ret;
> +
> +	ret = rpc_unlink(cld_pipe);
> +	if (ret)
> +		printk(KERN_ERR "NFSD: error removing cld pipe: %d\n", ret);
> +	cld_pipe = NULL;
> +	rpc_put_mount();
> +}
> +
> +static struct cld_upcall *
> +alloc_cld_upcall(void)
> +{
> +	struct cld_upcall *new, *tmp;
> +
> +	new = kzalloc(sizeof(*new), GFP_KERNEL);
> +	if (!new)
> +		return new;
> +
> +	/* FIXME: hard cap on number in flight? */
> +restart_search:
> +	spin_lock(&cld_lock);
> +	list_for_each_entry(tmp, &cld_list, cu_list) {
> +		if (tmp->cu_msg.cm_xid == cld_xid) {
> +			cld_xid++;
> +			spin_unlock(&cld_lock);
> +			goto restart_search;
> +		}
> +	}
> +	new->cu_task = current;
> +	new->cu_msg.cm_vers = CLD_UPCALL_VERSION;
> +	put_unaligned(cld_xid++, &new->cu_msg.cm_xid);
> +	list_add(&new->cu_list, &cld_list);
> +	spin_unlock(&cld_lock);
> +
> +	dprintk("%s: allocated xid %u\n", __func__, new->cu_msg.cm_xid);
> +
> +	return new;
> +}
> +
> +static void
> +free_cld_upcall(struct cld_upcall *victim)
> +{
> +	spin_lock(&cld_lock);
> +	list_del(&victim->cu_list);
> +	spin_unlock(&cld_lock);
> +	kfree(victim);
> +}
> +
> +/* Ask daemon to create a new record */
> +static void
> +nfsd4_cld_create(struct nfs4_client *clp)
> +{
> +	int ret;
> +	struct cld_upcall *cup;
> +
> +	/* Don't upcall if it's already stored */
> +	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
> +		return;
> +
> +	cup = alloc_cld_upcall();
> +	if (!cup) {
> +		ret = -ENOMEM;
> +		goto out_err;
> +	}
> +
> +	cup->cu_msg.cm_cmd = Cld_Create;
> +	cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
> +	memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
> +			clp->cl_name.len);
> +
> +	ret = cld_pipe_upcall(&cup->cu_msg);
> +	if (!ret) {
> +		ret = cup->cu_msg.cm_status;
> +		set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
> +	}
> +
> +	free_cld_upcall(cup);
> +out_err:
> +	if (ret)
> +		printk(KERN_ERR "NFSD: Unable to create client "
> +				"record on stable storage: %d\n", ret);
> +}
> +
> +/* Ask daemon to remove a record */
> +static void
> +nfsd4_cld_remove(struct nfs4_client *clp)
> +{
> +	int ret;
> +	struct cld_upcall *cup;
> +
> +	/* Don't upcall if it's already removed */
> +	if (!test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
> +		return;
> +
> +	cup = alloc_cld_upcall();
> +	if (!cup) {
> +		ret = -ENOMEM;
> +		goto out_err;
> +	}
> +
> +	cup->cu_msg.cm_cmd = Cld_Remove;
> +	cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
> +	memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
> +			clp->cl_name.len);
> +
> +	ret = cld_pipe_upcall(&cup->cu_msg);
> +	if (!ret) {
> +		ret = cup->cu_msg.cm_status;
> +		clear_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
> +	}
> +
> +	free_cld_upcall(cup);
> +out_err:
> +	if (ret)
> +		printk(KERN_ERR "NFSD: Unable to remove client "
> +				"record from stable storage: %d\n", ret);
> +}
> +/* Check for presence of a record, and update its timestamp */
> +static int
> +nfsd4_cld_check(struct nfs4_client *clp)
> +{
> +	int ret;
> +	struct cld_upcall *cup;
> +
> +	/* Don't upcall if one was already stored during this grace pd */
> +	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
> +		return 0;
> +
> +	cup = alloc_cld_upcall();
> +	if (!cup) {
> +		printk(KERN_ERR "NFSD: Unable to check client record on "
> +				"stable storage: %d\n", -ENOMEM);
> +		return -ENOMEM;
> +	}
> +
> +	cup->cu_msg.cm_cmd = Cld_Check;
> +	cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
> +	memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
> +			clp->cl_name.len);
> +
> +	ret = cld_pipe_upcall(&cup->cu_msg);
> +	if (!ret) {
> +		ret = cup->cu_msg.cm_status;
> +		set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
> +	}

Most of this nfsd4_cld_*() code is *really* similar from one function to
the next--would it make sense to share some more code?

Anyway, this basically all looks reasonable to me....

--b.

> +
> +	free_cld_upcall(cup);
> +	return ret;
> +}
> +
> +static void
> +nfsd4_cld_grace_done(time_t boot_time)
> +{
> +	int ret;
> +	struct cld_upcall *cup;
> +
> +	cup = alloc_cld_upcall();
> +	if (!cup) {
> +		ret = -ENOMEM;
> +		goto out_err;
> +	}
> +
> +	cup->cu_msg.cm_cmd = Cld_GraceDone;
> +	cup->cu_msg.cm_u.cm_gracetime = (int64_t)boot_time;
> +	ret = cld_pipe_upcall(&cup->cu_msg);
> +	if (!ret)
> +		ret = cup->cu_msg.cm_status;
> +
> +	free_cld_upcall(cup);
> +out_err:
> +	if (ret)
> +		printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
> +}
> +
> +static struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
> +	.init		= nfsd4_init_cld_pipe,
> +	.exit		= nfsd4_remove_cld_pipe,
> +	.create		= nfsd4_cld_create,
> +	.remove		= nfsd4_cld_remove,
> +	.check		= nfsd4_cld_check,
> +	.grace_done	= nfsd4_cld_grace_done,
> +};
> +
>  int
>  nfsd4_client_tracking_init(void)
>  {
>  	int status;
> +	struct path path;
>  
> -	client_tracking_ops = &nfsd4_legacy_tracking_ops;
> +	if (!client_tracking_ops) {
> +		client_tracking_ops = &nfsd4_cld_tracking_ops;
> +		status = kern_path(nfs4_recoverydir(), LOOKUP_FOLLOW, &path);
> +		if (!status) {
> +			if (S_ISDIR(path.dentry->d_inode->i_mode))
> +				client_tracking_ops =
> +						&nfsd4_legacy_tracking_ops;
> +			path_put(&path);
> +		}
> +	}
>  
>  	if (!client_tracking_ops->init)
>  		return 0;
> -- 
> 1.7.7.6
> 


* Re: [PATCH v5 5/5] nfsd: add the infrastructure to handle the cld upcall
  2012-02-03 22:57   ` J. Bruce Fields
@ 2012-02-04 11:49     ` Jeff Layton
  2012-02-07 15:00       ` Jeff Layton
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff Layton @ 2012-02-04 11:49 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Fri, 3 Feb 2012 17:57:36 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Feb 01, 2012 at 10:44:12AM -0500, Jeff Layton wrote:
> > ...and add a mechanism for switching between the "legacy" tracker and
> > the new one. The decision is made by looking to see whether the
> > v4recoverydir exists. If it does, then the legacy client tracker is
> > used.
> > 
> > If it's not, then the kernel will create a "cld" pipe in rpc_pipefs.
> > That pipe is used to talk to a daemon for handling the upcall.
> > 
> > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > ---
> >  fs/nfsd/nfs4recover.c |  364 ++++++++++++++++++++++++++++++++++++++++++++++++-
> >  1 files changed, 363 insertions(+), 1 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > index 3fbf1f4..592fe3d 100644
> > --- a/fs/nfsd/nfs4recover.c
> > +++ b/fs/nfsd/nfs4recover.c
> > @@ -1,5 +1,6 @@
> >  /*
> >  *  Copyright (c) 2004 The Regents of the University of Michigan.
> > +*  Copyright (c) 2011 Jeff Layton <jlayton@redhat.com>
> >  *  All rights reserved.
> >  *
> >  *  Andy Adamson <andros@citi.umich.edu>
> > @@ -36,6 +37,10 @@
> >  #include <linux/namei.h>
> >  #include <linux/crypto.h>
> >  #include <linux/sched.h>
> > +#include <linux/fs.h>
> > +#include <linux/sunrpc/rpc_pipe_fs.h>
> > +#include <linux/sunrpc/clnt.h>
> > +#include <linux/nfsd/cld.h>
> >  
> >  #include "nfsd.h"
> >  #include "state.h"
> > @@ -472,12 +477,369 @@ static struct nfsd4_client_tracking_ops nfsd4_legacy_tracking_ops = {
> >  	.grace_done	= nfsd4_recdir_purge_old,
> >  };
> >  
> > +/* Globals */
> > +#define NFSD_PIPE_DIR		"/nfsd"
> > +
> > +static struct dentry *cld_pipe;
> > +
> > +/* list of cld_msg's that are currently in use */
> > +static DEFINE_SPINLOCK(cld_lock);
> > +static LIST_HEAD(cld_list);
> > +static unsigned int cld_xid;
> > +
> > +struct cld_upcall {
> > +	struct list_head	cu_list;
> > +	struct task_struct	*cu_task;
> > +	struct cld_msg	cu_msg;
> > +};
> > +
> > +static int
> > +__cld_pipe_upcall(struct cld_msg *cmsg)
> > +{
> > +	int ret;
> > +	struct rpc_pipe_msg msg;
> > +	struct inode *inode = cld_pipe->d_inode;
> > +
> > +	memset(&msg, 0, sizeof(msg));
> > +	msg.data = cmsg;
> > +	msg.len = sizeof(*cmsg);
> > +
> > +	/*
> > +	 * Set task state before we queue the upcall. That prevents
> > +	 * wake_up_process in the downcall from racing with schedule.
> > +	 */
> > +	set_current_state(TASK_UNINTERRUPTIBLE);
> 
> Is there a risk of nfsd being left unkillable if the daemon dies or
> becomes unresponsive?
> 

If the daemon dies, then the pipe will be closed and this will get
woken back up with msg.errno set to -EAGAIN. At that point, it'll
get requeued and then will eventually time out via the
RPC_PIPE_WAIT_FOR_OPEN workqueue job.

If the daemon is unresponsive but holds the pipe open then this will
sleep indefinitely. I suppose we could use a schedule_timeout() below
with a 30s timeout or so and then dequeue it if the timeout expires. I'll
though that none of the other rpc_pipefs upcalls do that.

Hmm...now that I look though, I think I see a bug in the rpc_pipefs
code. When an upcall is read off the pipe, it moves to the in_upcall
list. If there's never a downcall for it, then I think it'll just
linger there forever until the pipe is removed. Even closing the pipe
won't unwedge it. It seems like we should return an error for those as
well. I'll see about spinning up a patch for that too...
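(For the unresponsive-daemon case, a bounded wait might look roughly like the following. This is an untested sketch against the interfaces in this patch, not something the posted series implements; CLD_UPCALL_TIMEOUT is a made-up constant, and the dequeue step is only noted in a comment:)

```c
/* Sketch only: bound the sleep so nfsd can recover if the daemon
 * holds the pipe open but never answers. CLD_UPCALL_TIMEOUT is a
 * hypothetical constant (say, 30 * HZ). */
set_current_state(TASK_UNINTERRUPTIBLE);
ret = rpc_queue_upcall(inode, &msg);
if (ret < 0) {
	set_current_state(TASK_RUNNING);
	goto out;
}

/* schedule_timeout() returns 0 if the full timeout elapsed */
if (schedule_timeout(CLD_UPCALL_TIMEOUT) == 0) {
	/* would also need to dequeue msg here, so that a late
	 * downcall can't touch a freed cld_upcall */
	ret = -ETIMEDOUT;
}
set_current_state(TASK_RUNNING);

if (msg.errno < 0)
	ret = msg.errno;
```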

> > +	ret = rpc_queue_upcall(inode, &msg);
> > +	if (ret < 0) {
> > +		set_current_state(TASK_RUNNING);
> > +		goto out;
> > +	}
> > +
> > +	schedule();
> > +	set_current_state(TASK_RUNNING);
> > +
> > +	if (msg.errno < 0)
> > +		ret = msg.errno;
> > +out:
> > +	return ret;
> > +}
> > +
> > +static int
> > +cld_pipe_upcall(struct cld_msg *cmsg)
> > +{
> > +	int ret;
> > +
> > +	/*
> > +	 * -EAGAIN occurs when pipe is closed and reopened while there are
> > +	 *  upcalls queued.
> > +	 */
> > +	do {
> > +		ret = __cld_pipe_upcall(cmsg);
> > +	} while (ret == -EAGAIN);
> > +
> > +	return ret;
> > +}
> > +
> > +static ssize_t
> > +cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
> > +{
> > +	struct cld_upcall *tmp, *cup;
> > +	struct cld_msg *cmsg = (struct cld_msg *)src;
> > +	uint32_t xid;
> > +
> > +	if (mlen != sizeof(*cmsg)) {
> > +		dprintk("%s: got %zu bytes, expected %zu\n", __func__, mlen,
> > +			sizeof(*cmsg));
> > +		return -EINVAL;
> > +	}
> > +
> > +	/* copy just the xid so we can try to find that */
> > +	if (copy_from_user(&xid, &cmsg->cm_xid, sizeof(xid)) != 0) {
> > +		dprintk("%s: error when copying xid from userspace", __func__);
> > +		return -EFAULT;
> > +	}
> > +
> > +	/* walk the list and find corresponding xid */
> > +	cup = NULL;
> > +	spin_lock(&cld_lock);
> > +	list_for_each_entry(tmp, &cld_list, cu_list) {
> > +		if (get_unaligned(&tmp->cu_msg.cm_xid) == xid) {
> > +			cup = tmp;
> > +			list_del_init(&cup->cu_list);
> > +			break;
> > +		}
> > +	}
> > +	spin_unlock(&cld_lock);
> > +
> > +	/* couldn't find upcall? */
> > +	if (!cup) {
> > +		dprintk("%s: couldn't find upcall -- xid=%u\n", __func__,
> > +			xid);
> > +		return -EINVAL;
> > +	}
> > +
> > +	if (copy_from_user(&cup->cu_msg, src, mlen) != 0)
> > +		return -EFAULT;
> > +
> > +	wake_up_process(cup->cu_task);
> > +	return mlen;
> > +}
> > +
> > +static void
> > +cld_pipe_destroy_msg(struct rpc_pipe_msg *msg)
> > +{
> > +	struct cld_msg *cmsg = msg->data;
> > +	struct cld_upcall *cup = container_of(cmsg, struct cld_upcall,
> > +						 cu_msg);
> > +
> > +	/* errno >= 0 means we got a downcall */
> > +	if (msg->errno >= 0)
> > +		return;
> > +
> > +	wake_up_process(cup->cu_task);
> > +}
> > +
> > +static const struct rpc_pipe_ops cld_upcall_ops = {
> > +	.upcall		= rpc_pipe_generic_upcall,
> > +	.downcall	= cld_pipe_downcall,
> > +	.destroy_msg	= cld_pipe_destroy_msg,
> > +};
> > +
> > +/* Initialize rpc_pipefs pipe for communication with client tracking daemon */
> > +static int
> > +nfsd4_init_cld_pipe(void)
> > +{
> > +	int ret;
> > +	struct path path;
> > +	struct vfsmount *mnt;
> > +
> > +	if (cld_pipe)
> > +		return 0;
> > +
> > +	mnt = rpc_get_mount();
> > +	if (IS_ERR(mnt))
> > +		return PTR_ERR(mnt);
> > +
> > +	ret = vfs_path_lookup(mnt->mnt_root, mnt, NFSD_PIPE_DIR, 0, &path);
> > +	if (ret)
> > +		goto err;
> > +
> > +	cld_pipe = rpc_mkpipe(path.dentry, "cld", NULL,
> > +				&cld_upcall_ops, RPC_PIPE_WAIT_FOR_OPEN);
> > +	path_put(&path);
> > +	if (!IS_ERR(cld_pipe))
> > +		return 0;
> > +
> > +	ret = PTR_ERR(cld_pipe);
> > +err:
> > +	rpc_put_mount();
> > +	printk(KERN_ERR "NFSD: unable to create nfsdcld upcall pipe (%d)\n",
> > +			ret);
> > +	return ret;
> > +}
> > +
> > +static void
> > +nfsd4_remove_cld_pipe(void)
> > +{
> > +	int ret;
> > +
> > +	ret = rpc_unlink(cld_pipe);
> > +	if (ret)
> > +		printk(KERN_ERR "NFSD: error removing cld pipe: %d\n", ret);
> > +	cld_pipe = NULL;
> > +	rpc_put_mount();
> > +}
> > +
> > +static struct cld_upcall *
> > +alloc_cld_upcall(void)
> > +{
> > +	struct cld_upcall *new, *tmp;
> > +
> > +	new = kzalloc(sizeof(*new), GFP_KERNEL);
> > +	if (!new)
> > +		return new;
> > +
> > +	/* FIXME: hard cap on number in flight? */
> > +restart_search:
> > +	spin_lock(&cld_lock);
> > +	list_for_each_entry(tmp, &cld_list, cu_list) {
> > +		if (tmp->cu_msg.cm_xid == cld_xid) {
> > +			cld_xid++;
> > +			spin_unlock(&cld_lock);
> > +			goto restart_search;
> > +		}
> > +	}
> > +	new->cu_task = current;
> > +	new->cu_msg.cm_vers = CLD_UPCALL_VERSION;
> > +	put_unaligned(cld_xid++, &new->cu_msg.cm_xid);
> > +	list_add(&new->cu_list, &cld_list);
> > +	spin_unlock(&cld_lock);
> > +
> > +	dprintk("%s: allocated xid %u\n", __func__, new->cu_msg.cm_xid);
> > +
> > +	return new;
> > +}
> > +
> > +static void
> > +free_cld_upcall(struct cld_upcall *victim)
> > +{
> > +	spin_lock(&cld_lock);
> > +	list_del(&victim->cu_list);
> > +	spin_unlock(&cld_lock);
> > +	kfree(victim);
> > +}
> > +
> > +/* Ask daemon to create a new record */
> > +static void
> > +nfsd4_cld_create(struct nfs4_client *clp)
> > +{
> > +	int ret;
> > +	struct cld_upcall *cup;
> > +
> > +	/* Don't upcall if it's already stored */
> > +	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
> > +		return;
> > +
> > +	cup = alloc_cld_upcall();
> > +	if (!cup) {
> > +		ret = -ENOMEM;
> > +		goto out_err;
> > +	}
> > +
> > +	cup->cu_msg.cm_cmd = Cld_Create;
> > +	cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
> > +	memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
> > +			clp->cl_name.len);
> > +
> > +	ret = cld_pipe_upcall(&cup->cu_msg);
> > +	if (!ret) {
> > +		ret = cup->cu_msg.cm_status;
> > +		set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
> > +	}
> > +
> > +	free_cld_upcall(cup);
> > +out_err:
> > +	if (ret)
> > +		printk(KERN_ERR "NFSD: Unable to create client "
> > +				"record on stable storage: %d\n", ret);
> > +}
> > +
> > +/* Ask daemon to remove a record */
> > +static void
> > +nfsd4_cld_remove(struct nfs4_client *clp)
> > +{
> > +	int ret;
> > +	struct cld_upcall *cup;
> > +
> > +	/* Don't upcall if it's already removed */
> > +	if (!test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
> > +		return;
> > +
> > +	cup = alloc_cld_upcall();
> > +	if (!cup) {
> > +		ret = -ENOMEM;
> > +		goto out_err;
> > +	}
> > +
> > +	cup->cu_msg.cm_cmd = Cld_Remove;
> > +	cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
> > +	memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
> > +			clp->cl_name.len);
> > +
> > +	ret = cld_pipe_upcall(&cup->cu_msg);
> > +	if (!ret) {
> > +		ret = cup->cu_msg.cm_status;
> > +		clear_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
> > +	}
> > +
> > +	free_cld_upcall(cup);
> > +out_err:
> > +	if (ret)
> > +		printk(KERN_ERR "NFSD: Unable to remove client "
> > +				"record from stable storage: %d\n", ret);
> > +}
> > +/* Check for presence of a record, and update its timestamp */
> > +static int
> > +nfsd4_cld_check(struct nfs4_client *clp)
> > +{
> > +	int ret;
> > +	struct cld_upcall *cup;
> > +
> > +	/* Don't upcall if one was already stored during this grace pd */
> > +	if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
> > +		return 0;
> > +
> > +	cup = alloc_cld_upcall();
> > +	if (!cup) {
> > +		printk(KERN_ERR "NFSD: Unable to check client record on "
> > +				"stable storage: %d\n", -ENOMEM);
> > +		return -ENOMEM;
> > +	}
> > +
> > +	cup->cu_msg.cm_cmd = Cld_Check;
> > +	cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
> > +	memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
> > +			clp->cl_name.len);
> > +
> > +	ret = cld_pipe_upcall(&cup->cu_msg);
> > +	if (!ret) {
> > +		ret = cup->cu_msg.cm_status;
> > +		set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
> > +	}
> 
> Most of this nfsd4_cld_*() code is *really* similar from one function to
> the next--would it make sense to share some more code?
> 
> Anyway, this basically all looks reasonable to me....
> 
> --b.
> 

It is very similar, but I think I'm sharing about all I can here. I'll
see if I can consolidate it some more, but the upcalls are just
different enough to make that difficult...

> > +
> > +	free_cld_upcall(cup);
> > +	return ret;
> > +}
> > +
> > +static void
> > +nfsd4_cld_grace_done(time_t boot_time)
> > +{
> > +	int ret;
> > +	struct cld_upcall *cup;
> > +
> > +	cup = alloc_cld_upcall();
> > +	if (!cup) {
> > +		ret = -ENOMEM;
> > +		goto out_err;
> > +	}
> > +
> > +	cup->cu_msg.cm_cmd = Cld_GraceDone;
> > +	cup->cu_msg.cm_u.cm_gracetime = (int64_t)boot_time;
> > +	ret = cld_pipe_upcall(&cup->cu_msg);
> > +	if (!ret)
> > +		ret = cup->cu_msg.cm_status;
> > +
> > +	free_cld_upcall(cup);
> > +out_err:
> > +	if (ret)
> > +		printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
> > +}
> > +
> > +static struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
> > +	.init		= nfsd4_init_cld_pipe,
> > +	.exit		= nfsd4_remove_cld_pipe,
> > +	.create		= nfsd4_cld_create,
> > +	.remove		= nfsd4_cld_remove,
> > +	.check		= nfsd4_cld_check,
> > +	.grace_done	= nfsd4_cld_grace_done,
> > +};
> > +
> >  int
> >  nfsd4_client_tracking_init(void)
> >  {
> >  	int status;
> > +	struct path path;
> >  
> > -	client_tracking_ops = &nfsd4_legacy_tracking_ops;
> > +	if (!client_tracking_ops) {
> > +		client_tracking_ops = &nfsd4_cld_tracking_ops;
> > +		status = kern_path(nfs4_recoverydir(), LOOKUP_FOLLOW, &path);
> > +		if (!status) {
> > +			if (S_ISDIR(path.dentry->d_inode->i_mode))
> > +				client_tracking_ops =
> > +						&nfsd4_legacy_tracking_ops;
> > +			path_put(&path);
> > +		}
> > +	}
> >  
> >  	if (!client_tracking_ops->init)
> >  		return 0;
> > -- 
> > 1.7.7.6
> > 


-- 
Jeff Layton <jlayton@redhat.com>
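
[Editorial sketch] The tracker-selection logic quoted above in nfsd4_client_tracking_init() (default to the new cld tracker, fall back to the legacy one when the v4 recovery directory exists) is a plain ops-table pick at init time. A minimal userspace sketch of that pattern; the struct, the noop handlers, and the dir_exists flag standing in for the kern_path()/S_ISDIR() check are all hypothetical simplifications, not the kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for nfsd4_client_tracking_ops: one table per tracker. */
struct tracking_ops {
	const char *name;
	int (*init)(void);
	void (*exit)(void);
};

static int noop_init(void) { return 0; }
static void noop_exit(void) { }

static const struct tracking_ops legacy_ops = { "legacy", noop_init, noop_exit };
static const struct tracking_ops cld_ops    = { "cld",    noop_init, noop_exit };

static const struct tracking_ops *client_tracking_ops;

/*
 * Mirrors the shape of nfsd4_client_tracking_init(): default to the new
 * cld tracker, but fall back to the legacy one when the v4 recovery
 * directory exists (dir_exists stands in for kern_path() + S_ISDIR()).
 */
static const struct tracking_ops *pick_tracker(bool dir_exists)
{
	client_tracking_ops = &cld_ops;
	if (dir_exists)
		client_tracking_ops = &legacy_ops;
	return client_tracking_ops;
}
```

The point of the indirection is that the rest of nfsd only ever calls through client_tracking_ops, so the legacy and cld trackers stay interchangeable.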

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 3/5] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
  2012-02-03 19:35   ` J. Bruce Fields
@ 2012-02-04 12:21     ` Jeff Layton
  2012-02-08 21:00     ` Jeff Layton
  1 sibling, 0 replies; 16+ messages in thread
From: Jeff Layton @ 2012-02-04 12:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Fri, 3 Feb 2012 14:35:33 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Feb 01, 2012 at 10:44:10AM -0500, Jeff Layton wrote:
> > We'll need a way to flag the nfs4_client as already being recorded on
> > stable storage so that we don't continually upcall.
> 
> Looks like the code currently uses cl_firststate for that purpose.
> 
> I'm perfectly open to the argument that using a u32 to store a boolean
> is rather silly, and can't see any problem with doing it this way
> instead, but then let's remove cl_firststate while we're at it.
> 
> --b.
> 

Ahh ok, I missed that that's what it was for. Looks like I probably
have a couple of bugs here with that too. There are some places that
check cl_firststate, and since the new code doesn't set it, those
checks would be wrong.

I'll respin this patch to merge that in with this change. Thanks.

> > 
> > The cl_cb_flags field is only using 2 bits right now, so repurpose that
> > to a generic flags field. Rename NFSD4_CLIENT_KILL to
> > NFSD4_CLIENT_CB_KILL to make it evident that it's part of the callback
> > flags. Add a mask that we can use for existing checks that look to see
> > whether any flags are set, so that the new STABLE flag doesn't
> > interfere.
> > 
> > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > ---
> >  fs/nfsd/nfs4callback.c |   12 ++++++------
> >  fs/nfsd/state.h        |    8 +++++---
> >  2 files changed, 11 insertions(+), 9 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index 6f3ebb4..730e2cc 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -756,7 +756,7 @@ void nfsd4_probe_callback(struct nfs4_client *clp)
> >  {
> >  	/* XXX: atomicity?  Also, should we be using cl_cb_flags? */
> >  	clp->cl_cb_state = NFSD4_CB_UNKNOWN;
> > -	set_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_cb_flags);
> > +	set_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
> >  	do_probe_callback(clp);
> >  }
> >  
> > @@ -915,7 +915,7 @@ void nfsd4_destroy_callback_queue(void)
> >  /* must be called under the state lock */
> >  void nfsd4_shutdown_callback(struct nfs4_client *clp)
> >  {
> > -	set_bit(NFSD4_CLIENT_KILL, &clp->cl_cb_flags);
> > +	set_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
> >  	/*
> >  	 * Note this won't actually result in a null callback;
> >  	 * instead, nfsd4_do_callback_rpc() will detect the killed
> > @@ -966,15 +966,15 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
> >  		svc_xprt_put(clp->cl_cb_conn.cb_xprt);
> >  		clp->cl_cb_conn.cb_xprt = NULL;
> >  	}
> > -	if (test_bit(NFSD4_CLIENT_KILL, &clp->cl_cb_flags))
> > +	if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
> >  		return;
> >  	spin_lock(&clp->cl_lock);
> >  	/*
> >  	 * Only serialized callback code is allowed to clear these
> >  	 * flags; main nfsd code can only set them:
> >  	 */
> > -	BUG_ON(!clp->cl_cb_flags);
> > -	clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_cb_flags);
> > +	BUG_ON(!(clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK));
> > +	clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
> >  	memcpy(&conn, &cb->cb_clp->cl_cb_conn, sizeof(struct nfs4_cb_conn));
> >  	c = __nfsd4_find_backchannel(clp);
> >  	if (c) {
> > @@ -1000,7 +1000,7 @@ void nfsd4_do_callback_rpc(struct work_struct *w)
> >  	struct nfs4_client *clp = cb->cb_clp;
> >  	struct rpc_clnt *clnt;
> >  
> > -	if (clp->cl_cb_flags)
> > +	if (clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK)
> >  		nfsd4_process_cb_update(cb);
> >  
> >  	clnt = clp->cl_cb_client;
> > diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> > index 8f539ee..0f9a0ac 100644
> > --- a/fs/nfsd/state.h
> > +++ b/fs/nfsd/state.h
> > @@ -250,9 +250,11 @@ struct nfs4_client {
> >  
> >  	/* for v4.0 and v4.1 callbacks: */
> >  	struct nfs4_cb_conn	cl_cb_conn;
> > -#define NFSD4_CLIENT_CB_UPDATE	1
> > -#define NFSD4_CLIENT_KILL	2
> > -	unsigned long		cl_cb_flags;
> > +#define NFSD4_CLIENT_STABLE		(0)	/* client on stable storage */
> > +#define NFSD4_CLIENT_CB_UPDATE		(1)
> > +#define NFSD4_CLIENT_CB_KILL		(2)
> > +#define NFSD4_CLIENT_CB_FLAG_MASK	(0x6)
> > +	unsigned long		cl_flags;
> >  	struct rpc_clnt		*cl_cb_client;
> >  	u32			cl_cb_ident;
> >  #define NFSD4_CB_UP		0
> > -- 
> > 1.7.7.6
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


-- 
Jeff Layton <jlayton@redhat.com>
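
[Editorial sketch] The flag layout in the state.h hunk above uses bit *numbers* for test_bit()/set_bit() plus a mask value (0x6) that covers only the two callback bits, so the new STABLE bit (bit 0) never trips the callback-path checks. That arithmetic can be verified in userspace; the constants are copied from the patch, but set_flag()/test_flag() are simplified, non-atomic stand-ins for the kernel's bitops, not the real API:

```c
#include <assert.h>

/* Bit numbers, copied from the patch to fs/nfsd/state.h. */
#define NFSD4_CLIENT_STABLE		0	/* client on stable storage */
#define NFSD4_CLIENT_CB_UPDATE		1
#define NFSD4_CLIENT_CB_KILL		2
#define NFSD4_CLIENT_CB_FLAG_MASK	0x6	/* bits 1 and 2 only */

/* Simplified, non-atomic stand-ins for the kernel's set_bit()/test_bit(). */
static void set_flag(unsigned long *flags, int bit)
{
	*flags |= 1UL << bit;
}

static int test_flag(unsigned long flags, int bit)
{
	return !!(flags & (1UL << bit));
}
```

Checks like `if (clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK)` in nfs4callback.c then see only CB_UPDATE/CB_KILL, which is exactly the "doesn't interfere" property the patch description promises.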

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 5/5] nfsd: add the infrastructure to handle the cld upcall
  2012-02-04 11:49     ` Jeff Layton
@ 2012-02-07 15:00       ` Jeff Layton
  2012-02-07 15:19         ` J. Bruce Fields
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff Layton @ 2012-02-07 15:00 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Sat, 4 Feb 2012 06:49:06 -0500
Jeff Layton <jlayton@redhat.com> wrote:

> On Fri, 3 Feb 2012 17:57:36 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Wed, Feb 01, 2012 at 10:44:12AM -0500, Jeff Layton wrote:
> > > ...and add a mechanism for switching between the "legacy" tracker and
> > > the new one. The decision is made by looking to see whether the
> > > v4recoverydir exists. If it does, then the legacy client tracker is
> > > used.
> > > 
> > > If it's not, then the kernel will create a "cld" pipe in rpc_pipefs.
> > > That pipe is used to talk to a daemon for handling the upcall.
> > > 
> > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > > ---
> > >  fs/nfsd/nfs4recover.c |  364 ++++++++++++++++++++++++++++++++++++++++++++++++-
> > >  1 files changed, 363 insertions(+), 1 deletions(-)
> > > 
> > > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > > index 3fbf1f4..592fe3d 100644
> > > --- a/fs/nfsd/nfs4recover.c
> > > +++ b/fs/nfsd/nfs4recover.c
> > > @@ -1,5 +1,6 @@
> > >  /*
> > >  *  Copyright (c) 2004 The Regents of the University of Michigan.
> > > +*  Copyright (c) 2011 Jeff Layton <jlayton@redhat.com>
> > >  *  All rights reserved.
> > >  *
> > >  *  Andy Adamson <andros@citi.umich.edu>
> > > @@ -36,6 +37,10 @@
> > >  #include <linux/namei.h>
> > >  #include <linux/crypto.h>
> > >  #include <linux/sched.h>
> > > +#include <linux/fs.h>
> > > +#include <linux/sunrpc/rpc_pipe_fs.h>
> > > +#include <linux/sunrpc/clnt.h>
> > > +#include <linux/nfsd/cld.h>
> > >  
> > >  #include "nfsd.h"
> > >  #include "state.h"
> > > @@ -472,12 +477,369 @@ static struct nfsd4_client_tracking_ops nfsd4_legacy_tracking_ops = {
> > >  	.grace_done	= nfsd4_recdir_purge_old,
> > >  };
> > >  
> > > +/* Globals */
> > > +#define NFSD_PIPE_DIR		"/nfsd"
> > > +
> > > +static struct dentry *cld_pipe;
> > > +
> > > +/* list of cld_msg's that are currently in use */
> > > +static DEFINE_SPINLOCK(cld_lock);
> > > +static LIST_HEAD(cld_list);
> > > +static unsigned int cld_xid;
> > > +
> > > +struct cld_upcall {
> > > +	struct list_head	cu_list;
> > > +	struct task_struct	*cu_task;
> > > +	struct cld_msg	cu_msg;
> > > +};
> > > +
> > > +static int
> > > +__cld_pipe_upcall(struct cld_msg *cmsg)
> > > +{
> > > +	int ret;
> > > +	struct rpc_pipe_msg msg;
> > > +	struct inode *inode = cld_pipe->d_inode;
> > > +
> > > +	memset(&msg, 0, sizeof(msg));
> > > +	msg.data = cmsg;
> > > +	msg.len = sizeof(*cmsg);
> > > +
> > > +	/*
> > > +	 * Set task state before we queue the upcall. That prevents
> > > +	 * wake_up_process in the downcall from racing with schedule.
> > > +	 */
> > > +	set_current_state(TASK_UNINTERRUPTIBLE);
> > 
> > Is there a risk of nfsd being left unkillable if the daemon dies or
> > becomes unresponsive?
> > 
> 
> If the daemon dies, then the pipe will be closed and this will get
> woken back up with msg.errno set to -EAGAIN. At that point, it'll
> get requeued and then will eventually time out via the
> RPC_PIPE_WAIT_FOR_OPEN workqueue job.
> 
> If the daemon is unresponsive but holds the pipe open then this will
> sleep indefinitely. I suppose we could use a schedule_timeout() below
> with a 30s timeout or so and then dequeue it if it does. I'll note
> though that none of the other rpc_pipefs upcalls do that.
> 
> Hmm...now that I look though, I think I see a bug in the rpc_pipefs
> code. When an upcall is read off the pipe, it moves to the in_upcall
> list. If there's never a downcall for it, then I think it'll just
> linger there forever until the pipe is removed. Even closing the pipe
> won't unwedge it. It seems like we should return an error for those as
> well. I'll see about spinning up a patch for that too...
>

Ok, I had a bit more time to look over this... 

My mistake -- that last paragraph is wrong. I neglected to note that
upcalls that are partially read end up moving to filp->private_data as
well. If the pipe is closed after reading part of a call then the
upcall will end up getting an -EAGAIN back.

I think we can make these calls eventually time out if the daemon goes
unresponsive. It will take some fairly significant reengineering
however. We'll need to embed the rpc_pipe_msg inside another struct,
allocate them from the slab and refcount them, and then (carefully!)
deal with them if the schedule_timeout() times out.

The question is -- is that really worth it? None of the other rpc_pipe
upcalls do that (aside from the gssapi one). My preference would be not
to bother with it until we find that a hung daemon is a problem. That
said, if you think it's a necessary precondition to merging this then
I'll see about making those changes.

Thoughts?

-- 
Jeff Layton <jlayton@redhat.com>
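
[Editorial sketch] The ordering constraint discussed above (mark yourself as sleeping *before* publishing the upcall, so a wakeup that arrives between queueing and schedule() cannot be lost) is the same lost-wakeup problem that the predicate-under-mutex idiom solves in userspace. A rough pthread analogue, purely illustrative; the struct and function names are invented for this sketch and do not correspond to the patch's types:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical analogue of a cld upcall waiting for its downcall reply. */
struct upcall {
	pthread_mutex_t lock;
	pthread_cond_t  done;
	int             replied;	/* plays the role of the task state */
	int             status;
};

/* Downcall side: post the status, then wake the sleeper. */
static void *downcall(void *arg)
{
	struct upcall *cu = arg;

	pthread_mutex_lock(&cu->lock);
	cu->status = 0;
	cu->replied = 1;		/* publish before signalling */
	pthread_cond_signal(&cu->done);
	pthread_mutex_unlock(&cu->lock);
	return NULL;
}

/*
 * Upcall side: re-testing "replied" under the lock before sleeping is
 * what set_current_state()-before-queueing achieves in the kernel code:
 * a wakeup that fires before we actually sleep is never lost.
 */
static int do_upcall(struct upcall *cu)
{
	pthread_t t;

	cu->replied = 0;
	pthread_create(&t, NULL, downcall, cu);

	pthread_mutex_lock(&cu->lock);
	while (!cu->replied)
		pthread_cond_wait(&cu->done, &cu->lock);
	pthread_mutex_unlock(&cu->lock);

	pthread_join(t, NULL);
	return cu->status;
}
```

Even if the downcall thread runs to completion before do_upcall() takes the lock, the while-loop sees replied == 1 and never sleeps, which is the race the quoted comment is guarding against.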

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 5/5] nfsd: add the infrastructure to handle the cld upcall
  2012-02-07 15:00       ` Jeff Layton
@ 2012-02-07 15:19         ` J. Bruce Fields
  0 siblings, 0 replies; 16+ messages in thread
From: J. Bruce Fields @ 2012-02-07 15:19 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-nfs

On Tue, Feb 07, 2012 at 10:00:09AM -0500, Jeff Layton wrote:
> On Sat, 4 Feb 2012 06:49:06 -0500
> Jeff Layton <jlayton@redhat.com> wrote:
> 
> > On Fri, 3 Feb 2012 17:57:36 -0500
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > 
> > > On Wed, Feb 01, 2012 at 10:44:12AM -0500, Jeff Layton wrote:
> > > > ...and add a mechanism for switching between the "legacy" tracker and
> > > > the new one. The decision is made by looking to see whether the
> > > > v4recoverydir exists. If it does, then the legacy client tracker is
> > > > used.
> > > > 
> > > > If it's not, then the kernel will create a "cld" pipe in rpc_pipefs.
> > > > That pipe is used to talk to a daemon for handling the upcall.
> > > > 
> > > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > > > ---
> > > >  fs/nfsd/nfs4recover.c |  364 ++++++++++++++++++++++++++++++++++++++++++++++++-
> > > >  1 files changed, 363 insertions(+), 1 deletions(-)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > > > index 3fbf1f4..592fe3d 100644
> > > > --- a/fs/nfsd/nfs4recover.c
> > > > +++ b/fs/nfsd/nfs4recover.c
> > > > @@ -1,5 +1,6 @@
> > > >  /*
> > > >  *  Copyright (c) 2004 The Regents of the University of Michigan.
> > > > +*  Copyright (c) 2011 Jeff Layton <jlayton@redhat.com>
> > > >  *  All rights reserved.
> > > >  *
> > > >  *  Andy Adamson <andros@citi.umich.edu>
> > > > @@ -36,6 +37,10 @@
> > > >  #include <linux/namei.h>
> > > >  #include <linux/crypto.h>
> > > >  #include <linux/sched.h>
> > > > +#include <linux/fs.h>
> > > > +#include <linux/sunrpc/rpc_pipe_fs.h>
> > > > +#include <linux/sunrpc/clnt.h>
> > > > +#include <linux/nfsd/cld.h>
> > > >  
> > > >  #include "nfsd.h"
> > > >  #include "state.h"
> > > > @@ -472,12 +477,369 @@ static struct nfsd4_client_tracking_ops nfsd4_legacy_tracking_ops = {
> > > >  	.grace_done	= nfsd4_recdir_purge_old,
> > > >  };
> > > >  
> > > > +/* Globals */
> > > > +#define NFSD_PIPE_DIR		"/nfsd"
> > > > +
> > > > +static struct dentry *cld_pipe;
> > > > +
> > > > +/* list of cld_msg's that are currently in use */
> > > > +static DEFINE_SPINLOCK(cld_lock);
> > > > +static LIST_HEAD(cld_list);
> > > > +static unsigned int cld_xid;
> > > > +
> > > > +struct cld_upcall {
> > > > +	struct list_head	cu_list;
> > > > +	struct task_struct	*cu_task;
> > > > +	struct cld_msg	cu_msg;
> > > > +};
> > > > +
> > > > +static int
> > > > +__cld_pipe_upcall(struct cld_msg *cmsg)
> > > > +{
> > > > +	int ret;
> > > > +	struct rpc_pipe_msg msg;
> > > > +	struct inode *inode = cld_pipe->d_inode;
> > > > +
> > > > +	memset(&msg, 0, sizeof(msg));
> > > > +	msg.data = cmsg;
> > > > +	msg.len = sizeof(*cmsg);
> > > > +
> > > > +	/*
> > > > +	 * Set task state before we queue the upcall. That prevents
> > > > +	 * wake_up_process in the downcall from racing with schedule.
> > > > +	 */
> > > > +	set_current_state(TASK_UNINTERRUPTIBLE);
> > > 
> > > Is there a risk of nfsd being left unkillable if the daemon dies or
> > > becomes unresponsive?
> > > 
> > 
> > If the daemon dies, then the pipe will be closed and this will get
> > woken back up with msg.errno set to -EAGAIN. At that point, it'll
> > get requeued and then will eventually time out via the
> > RPC_PIPE_WAIT_FOR_OPEN workqueue job.
> > 
> > If the daemon is unresponsive but holds the pipe open then this will
> > sleep indefinitely. I suppose we could use a schedule_timeout() below
> > with a 30s timeout or so and then dequeue it if it does. I'll note
> > though that none of the other rpc_pipefs upcalls do that.
> > 
> > Hmm...now that I look though, I think I see a bug in the rpc_pipefs
> > code. When an upcall is read off the pipe, it moves to the in_upcall
> > list. If there's never a downcall for it, then I think it'll just
> > linger there forever until the pipe is removed. Even closing the pipe
> > won't unwedge it. It seems like we should return an error for those as
> > well. I'll see about spinning up a patch for that too...
> >
> 
> Ok, I had a bit more time to look over this... 
> 
> My mistake -- that last paragraph is wrong. I neglected to note that
> upcalls that are partially read end up moving to filp->private_data as
> well. If the pipe is closed after reading part of a call then the
> upcall will end up getting an -EAGAIN back.
> 
> I think we can make these calls eventually time out if the daemon goes
> unresponsive. It will take some fairly significant reengineering
> however. We'll need to embed the rpc_pipe_msg inside another struct,
> allocate them from the slab and refcount them, and then (carefully!)
> deal with them if the schedule_timeout() times out.
> 
> The question is -- is that really worth it? None of the other rpc_pipe
> upcalls do that (aside from the gssapi one). My preference would be not
> to bother with it until we find that a hung daemon is a problem. That
> said, if you think it's a necessary precondition to merging this then
> I'll see about making those changes.

OK, leaving it as is sounds like a reasonable tradeoff to me.

--b.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 3/5] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
  2012-02-03 19:35   ` J. Bruce Fields
  2012-02-04 12:21     ` Jeff Layton
@ 2012-02-08 21:00     ` Jeff Layton
  2012-02-10 16:06       ` Jeff Layton
  1 sibling, 1 reply; 16+ messages in thread
From: Jeff Layton @ 2012-02-08 21:00 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Fri, 3 Feb 2012 14:35:33 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Feb 01, 2012 at 10:44:10AM -0500, Jeff Layton wrote:
> > We'll need a way to flag the nfs4_client as already being recorded on
> > stable storage so that we don't continually upcall.
> 
> Looks like the code currently uses cl_firststate for that purpose.
> 
> I'm perfectly open to the argument that using a u32 to store a boolean
> is rather silly, and can't see any problem with doing it this way
> > instead, but then let's remove cl_firststate while we're at it.
> 
> --b.
> 

I started looking at this. The semantics of cl_firststate are a little
different from those of my proposed flag:

The meaning of the flag I was proposing is fairly clear -- I set it
whenever the kernel believes that there is already a record for this
client on stable storage. So, when we first upcall to check if a client
is on stable storage, then we'll also set that bit if it was, as well
as when we first create a client record on stable storage.

The legacy tracker sets cl_firststate whenever the directory is
created, or in nfs4_set_claim_prev. When the clients are restored
after a reboot by reading in the contents of the v4recoverydir, then
they do not have the cl_firststate set initially. It's only after a
NFS4_OPEN_CLAIM_PREVIOUS open request comes in that they get it.

Now, I don't think it really matters much. We'll just have this flag
set a little earlier with the new client tracker, and it may get set
again when it's already set. I don't see any cases right offhand where
the legacy and new client trackers would cause different behavior.

Am I missing any subtlety in the semantics of cl_firststate here?

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 3/5] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
  2012-02-08 21:00     ` Jeff Layton
@ 2012-02-10 16:06       ` Jeff Layton
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Layton @ 2012-02-10 16:06 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Wed, 8 Feb 2012 16:00:23 -0500
Jeff Layton <jlayton@redhat.com> wrote:

> On Fri, 3 Feb 2012 14:35:33 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Wed, Feb 01, 2012 at 10:44:10AM -0500, Jeff Layton wrote:
> > > We'll need a way to flag the nfs4_client as already being recorded on
> > > stable storage so that we don't continually upcall.
> > 
> > Looks like the code currently uses cl_firststate for that purpose.
> > 
> > I'm perfectly open to the argument that using a u32 to store a boolean
> > is rather silly, and can't see any problem with doing it this way
> > instead, but then let's remove cl_firststate while we're at it.
> > 
> > --b.
> > 
> 
> I started looking at this. The semantics on cl_firststate are a little
> different from my proposed flag:
> 
> The meaning of the flag I was proposing is fairly clear -- I set it
> whenever the kernel believes that there is already a record for this
> client on stable storage. So, when we first upcall to check if a client
> is on stable storage, then we'll also set that bit if it was, as well
> as when we first create a client record on stable storage.
> 
> The legacy tracker sets cl_firststate whenever the directory is
> created, or in nfs4_set_claim_prev. When the clients are restored
> after a reboot by reading in the contents of the v4recoverydir, then
> they do not have the cl_firststate set initially. It's only after a
> NFS4_OPEN_CLAIM_PREVIOUS open request comes in that they get it.
> 
> Now, I don't think it really matters much. We'll just have this flag
> set a little earlier with the new client tracker, and it may get set
> again when it's already set. I don't see any cases right offhand where
> the legacy and new client trackers would cause different behavior.
> 
> Am I missing any subtlety in the semantics of cl_firststate here?
> 

Apologies for the self-reply...

It does matter -- the check in nfsd4_reclaim_complete won't work
correctly with the new client ID tracker if we just replace
cl_firststate with NFSD4_CLIENT_STABLE. I think the right approach
is probably to add a new flag here that nfsd4_reclaim_complete can
use in place of its existing cl_firststate check.

I'll give it some more thought...

Thanks,
-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2012-02-10 16:06 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-01 15:44 [PATCH v5 0/5] nfsd: overhaul the client name tracking code Jeff Layton
2012-02-01 15:44 ` [PATCH v5 1/5] nfsd: add nfsd4_client_tracking_ops struct and a way to set it Jeff Layton
2012-02-02 22:45   ` J. Bruce Fields
2012-02-03 19:22     ` Jeff Layton
2012-02-01 15:44 ` [PATCH v5 2/5] sunrpc: create nfsd dir in rpc_pipefs Jeff Layton
2012-02-01 15:44 ` [PATCH v5 3/5] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field Jeff Layton
2012-02-03 19:35   ` J. Bruce Fields
2012-02-04 12:21     ` Jeff Layton
2012-02-08 21:00     ` Jeff Layton
2012-02-10 16:06       ` Jeff Layton
2012-02-01 15:44 ` [PATCH v5 4/5] nfsd: add a header describing upcall to nfsdcld Jeff Layton
2012-02-01 15:44 ` [PATCH v5 5/5] nfsd: add the infrastructure to handle the cld upcall Jeff Layton
2012-02-03 22:57   ` J. Bruce Fields
2012-02-04 11:49     ` Jeff Layton
2012-02-07 15:00       ` Jeff Layton
2012-02-07 15:19         ` J. Bruce Fields
