All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal
@ 2014-07-16 14:31 Jeff Layton
  2014-07-16 14:31 ` [PATCH v2 01/10] nfsd: eliminate nfsd4_init_callback Jeff Layton
                   ` (10 more replies)
  0 siblings, 11 replies; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:31 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

Here's a respin of the delegation fixes that I sent yesterday. The main
change is to move the remove_stid calls into the caller in the second
patch instead of keeping it in nfs4_free_stid, per Christoph's
suggestion.

Unfortunately, that causes a pile of merge conflicts in the later
patches, so I need to resend the whole set.

I also include the patch that converts the num_delegations counter to an
atomic_long_t, as I missed that in the earlier series.

There is one more patch that fixes up handling of the cl_revoked list
that comes later, but it can't easily be moved here as it depends on
some locking changes that come later.

Jeff Layton (6):
  nfsd: eliminate nfsd4_init_callback
  nfsd: Avoid taking state_lock while holding inode lock in
    nfsd_break_one_deleg
  nfsd: nfs4_alloc_init_lease should take a nfs4_file arg
  locks: add file_has_lease to prevent delegation break races
  nfsd: Protect the nfs4_file delegation fields using the fi_lock
  nfsd: Fix delegation revocation

Trond Myklebust (4):
  nfsd: Ensure stateids remain unique until they are freed
  nfsd: Move the delegation reference counter into the struct nfs4_stid
  nfsd: Simplify stateid management
  nfsd: Convert delegation counter to an atomic_long_t type

 fs/locks.c             |  26 +++++++
 fs/nfsd/nfs4callback.c |  23 ++++--
 fs/nfsd/nfs4state.c    | 197 +++++++++++++++++++++++++++++++------------------
 fs/nfsd/state.h        |   7 +-
 include/linux/fs.h     |   6 ++
 5 files changed, 179 insertions(+), 80 deletions(-)

-- 
1.9.3


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v2 01/10] nfsd: eliminate nfsd4_init_callback
  2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
@ 2014-07-16 14:31 ` Jeff Layton
  2014-07-16 14:31 ` [PATCH v2 02/10] nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg Jeff Layton
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:31 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

It's just an obfuscated INIT_WORK call. Just make the work_func_t a
non-static symbol and use a normal INIT_WORK call.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/nfsd/nfs4callback.c | 8 ++------
 fs/nfsd/nfs4state.c    | 4 ++--
 fs/nfsd/state.h        | 2 +-
 3 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 2c73cae9899d..30a71cb46001 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1011,7 +1011,8 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
 		run_nfsd4_cb(cb);
 }
 
-static void nfsd4_do_callback_rpc(struct work_struct *w)
+void
+nfsd4_do_callback_rpc(struct work_struct *w)
 {
 	struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback, cb_work);
 	struct nfs4_client *clp = cb->cb_clp;
@@ -1031,11 +1032,6 @@ static void nfsd4_do_callback_rpc(struct work_struct *w)
 			cb->cb_ops, cb);
 }
 
-void nfsd4_init_callback(struct nfsd4_callback *cb)
-{
-	INIT_WORK(&cb->cb_work, nfsd4_do_callback_rpc);
-}
-
 void nfsd4_cb_recall(struct nfs4_delegation *dp)
 {
 	struct nfsd4_callback *cb = &dp->dl_recall;
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 225f98c7d00d..56ea4f12803e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -592,7 +592,7 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
 	fh_copy_shallow(&dp->dl_fh, &current_fh->fh_handle);
 	dp->dl_time = 0;
 	atomic_set(&dp->dl_count, 1);
-	nfsd4_init_callback(&dp->dl_recall);
+	INIT_WORK(&dp->dl_recall.cb_work, nfsd4_do_callback_rpc);
 	return dp;
 }
 
@@ -1677,7 +1677,7 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 		spin_unlock(&nn->client_lock);
 		return NULL;
 	}
-	nfsd4_init_callback(&clp->cl_cb_null);
+	INIT_WORK(&clp->cl_cb_null.cb_work, nfsd4_do_callback_rpc);
 	clp->cl_time = get_seconds();
 	clear_bit(0, &clp->cl_cb_slot_busy);
 	copy_verf(clp, verf);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 015b972da8ba..20857142773f 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -436,7 +436,7 @@ extern struct nfs4_client_reclaim *nfsd4_find_reclaim_client(const char *recdir,
 extern __be32 nfs4_check_open_reclaim(clientid_t *clid,
 		struct nfsd4_compound_state *cstate, struct nfsd_net *nn);
 extern int set_callback_cred(void);
-extern void nfsd4_init_callback(struct nfsd4_callback *);
+void nfsd4_do_callback_rpc(struct work_struct *w);
 extern void nfsd4_probe_callback(struct nfs4_client *clp);
 extern void nfsd4_probe_callback_sync(struct nfs4_client *clp);
 extern void nfsd4_change_callback(struct nfs4_client *clp, struct nfs4_cb_conn *);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 02/10] nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg
  2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
  2014-07-16 14:31 ` [PATCH v2 01/10] nfsd: eliminate nfsd4_init_callback Jeff Layton
@ 2014-07-16 14:31 ` Jeff Layton
  2014-07-17  1:34   ` J. Bruce Fields
  2014-07-16 14:31 ` [PATCH v2 03/10] nfsd: nfs4_alloc_init_lease should take a nfs4_file arg Jeff Layton
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:31 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

state_lock is a heavily contended global lock. We don't want to grab
that while simultaneously holding the inode->i_lock.

Add a new per-nfs4_file lock that we can use to protect the
per-nfs4_file delegation list. Hold that while walking the list in the
break_deleg callback and queue the workqueue job for each one.

The workqueue job can then take the state_lock and do the list
manipulations without the i_lock being held prior to starting the
rpc call.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/nfsd/nfs4callback.c | 25 ++++++++++++++++++----
 fs/nfsd/nfs4state.c    | 58 +++++++++++++++++++++++++++++++++-----------------
 fs/nfsd/state.h        |  4 +++-
 3 files changed, 62 insertions(+), 25 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 30a71cb46001..a88a93e09d69 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -933,7 +933,7 @@ void nfsd4_shutdown_callback(struct nfs4_client *clp)
 	set_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
 	/*
 	 * Note this won't actually result in a null callback;
-	 * instead, nfsd4_do_callback_rpc() will detect the killed
+	 * instead, nfsd4_run_cb_null() will detect the killed
 	 * client, destroy the rpc client, and stop:
 	 */
 	do_probe_callback(clp);
@@ -1011,10 +1011,9 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
 		run_nfsd4_cb(cb);
 }
 
-void
-nfsd4_do_callback_rpc(struct work_struct *w)
+static void
+nfsd4_run_callback_rpc(struct nfsd4_callback *cb)
 {
-	struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback, cb_work);
 	struct nfs4_client *clp = cb->cb_clp;
 	struct rpc_clnt *clnt;
 
@@ -1032,6 +1031,24 @@ nfsd4_do_callback_rpc(struct work_struct *w)
 			cb->cb_ops, cb);
 }
 
+void
+nfsd4_run_cb_null(struct work_struct *w)
+{
+	struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback,
+							cb_work);
+	nfsd4_run_callback_rpc(cb);
+}
+
+void
+nfsd4_run_cb_recall(struct work_struct *w)
+{
+	struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback,
+							cb_work);
+
+	nfsd4_prepare_cb_recall(cb->cb_op);
+	nfsd4_run_callback_rpc(cb);
+}
+
 void nfsd4_cb_recall(struct nfs4_delegation *dp)
 {
 	struct nfsd4_callback *cb = &dp->dl_recall;
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 56ea4f12803e..bdf8ac3393bd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -254,6 +254,8 @@ static void nfsd4_free_file(struct nfs4_file *f)
 static inline void
 put_nfs4_file(struct nfs4_file *fi)
 {
+	might_lock(&state_lock);
+
 	if (atomic_dec_and_lock(&fi->fi_ref, &state_lock)) {
 		hlist_del(&fi->fi_hash);
 		spin_unlock(&state_lock);
@@ -554,6 +556,8 @@ static void block_delegations(struct knfsd_fh *fh)
 	u32 hash;
 	struct bloom_pair *bd = &blocked_delegations;
 
+	lockdep_assert_held(&state_lock);
+
 	hash = arch_fast_hash(&fh->fh_base, fh->fh_size, 0);
 
 	__set_bit(hash&255, bd->set[bd->new]);
@@ -592,7 +596,7 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
 	fh_copy_shallow(&dp->dl_fh, &current_fh->fh_handle);
 	dp->dl_time = 0;
 	atomic_set(&dp->dl_count, 1);
-	INIT_WORK(&dp->dl_recall.cb_work, nfsd4_do_callback_rpc);
+	INIT_WORK(&dp->dl_recall.cb_work, nfsd4_run_cb_recall);
 	return dp;
 }
 
@@ -640,7 +644,9 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
 	lockdep_assert_held(&state_lock);
 
 	dp->dl_stid.sc_type = NFS4_DELEG_STID;
+	spin_lock(&fp->fi_lock);
 	list_add(&dp->dl_perfile, &fp->fi_delegations);
+	spin_unlock(&fp->fi_lock);
 	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
 }
 
@@ -648,14 +654,18 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
 static void
 unhash_delegation(struct nfs4_delegation *dp)
 {
+	struct nfs4_file *fp = dp->dl_file;
+
 	spin_lock(&state_lock);
 	list_del_init(&dp->dl_perclnt);
-	list_del_init(&dp->dl_perfile);
 	list_del_init(&dp->dl_recall_lru);
+	spin_lock(&fp->fi_lock);
+	list_del_init(&dp->dl_perfile);
+	spin_unlock(&fp->fi_lock);
 	spin_unlock(&state_lock);
-	if (dp->dl_file) {
-		nfs4_put_deleg_lease(dp->dl_file);
-		put_nfs4_file(dp->dl_file);
+	if (fp) {
+		nfs4_put_deleg_lease(fp);
+		put_nfs4_file(fp);
 		dp->dl_file = NULL;
 	}
 }
@@ -1677,7 +1687,7 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 		spin_unlock(&nn->client_lock);
 		return NULL;
 	}
-	INIT_WORK(&clp->cl_cb_null.cb_work, nfsd4_do_callback_rpc);
+	INIT_WORK(&clp->cl_cb_null.cb_work, nfsd4_run_cb_null);
 	clp->cl_time = get_seconds();
 	clear_bit(0, &clp->cl_cb_slot_busy);
 	copy_verf(clp, verf);
@@ -3079,30 +3089,38 @@ nfs4_share_conflict(struct svc_fh *current_fh, unsigned int deny_type)
 	return ret;
 }
 
-static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
+void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp)
 {
 	struct nfs4_client *clp = dp->dl_stid.sc_client;
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 
-	lockdep_assert_held(&state_lock);
-	/* We're assuming the state code never drops its reference
-	 * without first removing the lease.  Since we're in this lease
-	 * callback (and since the lease code is serialized by the kernel
-	 * lock) we know the server hasn't removed the lease yet, we know
-	 * it's safe to take a reference: */
-	atomic_inc(&dp->dl_count);
-
+	/*
+	 * We can't do this in nfsd_break_deleg_cb because it is
+	 * already holding inode->i_lock
+	 */
+	spin_lock(&state_lock);
+	block_delegations(&dp->dl_fh);
 	/*
 	 * If the dl_time != 0, then we know that it has already been
 	 * queued for a lease break. Don't queue it again.
 	 */
 	if (dp->dl_time == 0) {
-		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
 		dp->dl_time = get_seconds();
+		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
 	}
+	spin_unlock(&state_lock);
+}
 
-	block_delegations(&dp->dl_fh);
-
+static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
+{
+	/*
+	 * We're assuming the state code never drops its reference
+	 * without first removing the lease.  Since we're in this lease
+	 * callback (and since the lease code is serialized by the kernel
+	 * lock) we know the server hasn't removed the lease yet, we know
+	 * it's safe to take a reference.
+	 */
+	atomic_inc(&dp->dl_count);
 	nfsd4_cb_recall(dp);
 }
 
@@ -3127,11 +3145,11 @@ static void nfsd_break_deleg_cb(struct file_lock *fl)
 	 */
 	fl->fl_break_time = 0;
 
-	spin_lock(&state_lock);
 	fp->fi_had_conflict = true;
+	spin_lock(&fp->fi_lock);
 	list_for_each_entry(dp, &fp->fi_delegations, dl_perfile)
 		nfsd_break_one_deleg(dp);
-	spin_unlock(&state_lock);
+	spin_unlock(&fp->fi_lock);
 }
 
 static
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 20857142773f..81b7522e3f67 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -436,7 +436,8 @@ extern struct nfs4_client_reclaim *nfsd4_find_reclaim_client(const char *recdir,
 extern __be32 nfs4_check_open_reclaim(clientid_t *clid,
 		struct nfsd4_compound_state *cstate, struct nfsd_net *nn);
 extern int set_callback_cred(void);
-void nfsd4_do_callback_rpc(struct work_struct *w);
+void nfsd4_run_cb_null(struct work_struct *w);
+void nfsd4_run_cb_recall(struct work_struct *w);
 extern void nfsd4_probe_callback(struct nfs4_client *clp);
 extern void nfsd4_probe_callback_sync(struct nfs4_client *clp);
 extern void nfsd4_change_callback(struct nfs4_client *clp, struct nfs4_cb_conn *);
@@ -444,6 +445,7 @@ extern void nfsd4_cb_recall(struct nfs4_delegation *dp);
 extern int nfsd4_create_callback_queue(void);
 extern void nfsd4_destroy_callback_queue(void);
 extern void nfsd4_shutdown_callback(struct nfs4_client *);
+extern void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp);
 extern void nfs4_put_delegation(struct nfs4_delegation *dp);
 extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(const char *name,
 							struct nfsd_net *nn);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 03/10] nfsd: nfs4_alloc_init_lease should take a nfs4_file arg
  2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
  2014-07-16 14:31 ` [PATCH v2 01/10] nfsd: eliminate nfsd4_init_callback Jeff Layton
  2014-07-16 14:31 ` [PATCH v2 02/10] nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg Jeff Layton
@ 2014-07-16 14:31 ` Jeff Layton
  2014-07-16 14:31 ` [PATCH v2 04/10] nfsd: Ensure stateids remain unique until they are freed Jeff Layton
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:31 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

No need to pass the delegation pointer in here as it's only used to get
the nfs4_file pointer.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/nfsd/nfs4state.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index bdf8ac3393bd..1b01a20827ab 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3474,7 +3474,7 @@ static bool nfsd4_cb_channel_good(struct nfs4_client *clp)
 	return clp->cl_minorversion && clp->cl_cb_state == NFSD4_CB_UNKNOWN;
 }
 
-static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int flag)
+static struct file_lock *nfs4_alloc_init_lease(struct nfs4_file *fp, int flag)
 {
 	struct file_lock *fl;
 
@@ -3486,7 +3486,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int f
 	fl->fl_flags = FL_DELEG;
 	fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
 	fl->fl_end = OFFSET_MAX;
-	fl->fl_owner = (fl_owner_t)(dp->dl_file);
+	fl->fl_owner = (fl_owner_t)fp;
 	fl->fl_pid = current->tgid;
 	return fl;
 }
@@ -3497,7 +3497,7 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 	struct file_lock *fl;
 	int status;
 
-	fl = nfs4_alloc_init_lease(dp, NFS4_OPEN_DELEGATE_READ);
+	fl = nfs4_alloc_init_lease(fp, NFS4_OPEN_DELEGATE_READ);
 	if (!fl)
 		return -ENOMEM;
 	fl->fl_file = find_readable_file(fp);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 04/10] nfsd: Ensure stateids remain unique until they are freed
  2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
                   ` (2 preceding siblings ...)
  2014-07-16 14:31 ` [PATCH v2 03/10] nfsd: nfs4_alloc_init_lease should take a nfs4_file arg Jeff Layton
@ 2014-07-16 14:31 ` Jeff Layton
  2014-07-17 18:44   ` J. Bruce Fields
  2014-07-16 14:32 ` [PATCH v2 05/10] locks: add file_has_lease to prevent delegation break races Jeff Layton
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:31 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Add an extra delegation state to allow the stateid to remain in the idr
tree until the last reference has been released. This will be necessary
to ensure uniqueness once the client_mutex is removed.

[jlayton: reset the sc_type under the state_lock in unhash_delegation]

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/nfsd/nfs4state.c | 8 ++++----
 fs/nfsd/state.h     | 1 +
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 1b01a20827ab..fd4deb049ddf 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -616,6 +616,7 @@ void
 nfs4_put_delegation(struct nfs4_delegation *dp)
 {
 	if (atomic_dec_and_test(&dp->dl_count)) {
+		remove_stid(&dp->dl_stid);
 		nfs4_free_stid(deleg_slab, &dp->dl_stid);
 		num_delegations--;
 	}
@@ -657,6 +658,7 @@ unhash_delegation(struct nfs4_delegation *dp)
 	struct nfs4_file *fp = dp->dl_file;
 
 	spin_lock(&state_lock);
+	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
 	list_del_init(&dp->dl_perclnt);
 	list_del_init(&dp->dl_recall_lru);
 	spin_lock(&fp->fi_lock);
@@ -670,19 +672,15 @@ unhash_delegation(struct nfs4_delegation *dp)
 	}
 }
 
-
-
 static void destroy_revoked_delegation(struct nfs4_delegation *dp)
 {
 	list_del_init(&dp->dl_recall_lru);
-	remove_stid(&dp->dl_stid);
 	nfs4_put_delegation(dp);
 }
 
 static void destroy_delegation(struct nfs4_delegation *dp)
 {
 	unhash_delegation(dp);
-	remove_stid(&dp->dl_stid);
 	nfs4_put_delegation(dp);
 }
 
@@ -4036,7 +4034,9 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
 		return nfs_ok;
 	default:
 		printk("unknown stateid type %x\n", s->sc_type);
+		/* Fallthrough */
 	case NFS4_CLOSED_STID:
+	case NFS4_CLOSED_DELEG_STID:
 		return nfserr_bad_stateid;
 	}
 }
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 81b7522e3f67..996d61eeb357 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -80,6 +80,7 @@ struct nfs4_stid {
 #define NFS4_CLOSED_STID 8
 /* For a deleg stateid kept around only to process free_stateid's: */
 #define NFS4_REVOKED_DELEG_STID 16
+#define NFS4_CLOSED_DELEG_STID 32
 	unsigned char sc_type;
 	stateid_t sc_stateid;
 	struct nfs4_client *sc_client;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 05/10] locks: add file_has_lease to prevent delegation break races
  2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
                   ` (3 preceding siblings ...)
  2014-07-16 14:31 ` [PATCH v2 04/10] nfsd: Ensure stateids remain unique until they are freed Jeff Layton
@ 2014-07-16 14:32 ` Jeff Layton
  2014-07-16 14:32 ` [PATCH v2 06/10] nfsd: Protect the nfs4_file delegation fields using the fi_lock Jeff Layton
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:32 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

Once we remove the client_mutex, we'll have a potential race between
setting a lease on a file and breaking the delegation. We may set the
lease, but by the time we go to hash it, it may have already been
broken. Currently, we know that this won't happen as the nfs4_laundromat
takes the client_mutex, but we want to remove that.

As part of that fix, add a function that can tell us whether a
particular file has a lease set on it. In a later nfsd patch, we'll use
that to close the potential race window.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/locks.c         | 26 ++++++++++++++++++++++++++
 include/linux/fs.h |  6 ++++++
 2 files changed, 32 insertions(+)

diff --git a/fs/locks.c b/fs/locks.c
index 717fbc404e6b..005cc86927e3 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1308,6 +1308,32 @@ static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
 }
 
 /**
+ * file_has_lease - does the given file have a lease set on it?
+ * @file: struct file on which we want to check the lease
+ *
+ * Returns true if a lease was is set on the given file description,
+ * false otherwise.
+ */
+bool
+file_has_lease(struct file *file)
+{
+	bool ret = false;
+	struct inode *inode = file_inode(file);
+	struct file_lock *fl;
+
+	spin_lock(&inode->i_lock);
+	for (fl = inode->i_flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
+		if (fl->fl_file == file) {
+			ret = true;
+			break;
+		}
+	}
+	spin_unlock(&inode->i_lock);
+	return ret;
+}
+EXPORT_SYMBOL(file_has_lease);
+
+/**
  *	__break_lease	-	revoke all outstanding leases on file
  *	@inode: the inode of the file to return
  *	@mode: O_RDONLY: break only write leases; O_WRONLY or O_RDWR:
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e11d60cc867b..7ae6f4869669 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -953,6 +953,7 @@ extern int vfs_test_lock(struct file *, struct file_lock *);
 extern int vfs_lock_file(struct file *, unsigned int, struct file_lock *, struct file_lock *);
 extern int vfs_cancel_lock(struct file *filp, struct file_lock *fl);
 extern int flock_lock_file_wait(struct file *filp, struct file_lock *fl);
+extern bool file_has_lease(struct file *file);
 extern int __break_lease(struct inode *inode, unsigned int flags, unsigned int type);
 extern void lease_get_mtime(struct inode *, struct timespec *time);
 extern int generic_setlease(struct file *, long, struct file_lock **);
@@ -1064,6 +1065,11 @@ static inline int flock_lock_file_wait(struct file *filp,
 	return -ENOLCK;
 }
 
+static inline bool file_has_lease(struct file *file)
+{
+	return false;
+}
+
 static inline int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 {
 	return 0;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 06/10] nfsd: Protect the nfs4_file delegation fields using the fi_lock
  2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
                   ` (4 preceding siblings ...)
  2014-07-16 14:32 ` [PATCH v2 05/10] locks: add file_has_lease to prevent delegation break races Jeff Layton
@ 2014-07-16 14:32 ` Jeff Layton
  2014-07-16 18:09   ` Christoph Hellwig
  2014-07-16 14:32 ` [PATCH v2 07/10] nfsd: Move the delegation reference counter into the struct nfs4_stid Jeff Layton
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:32 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

Move more of the delegation fields to be protected by the fi_lock. It's
more granular than the state_lock and in later patches we'll want to
be able to rely on it in addition to the state_lock.

Also, the current code in nfs4_setlease calls vfs_setlease and uses the
client_mutex to ensure that it doesn't disappear before we can hash the
delegation. With the client_mutex gone, we'll have a potential race
condition.

It's possible that the delegation could be recalled after we acquire the
lease but before we ever get around to hashing it. If that happens, then
we'd have a nfs4_file that *thinks* it has a delegation, when it
actually has none.

Attempt to acquire a delegation. If that succeeds, take the state_lock
and recheck to make sure the lease is still there. If it is, then take
the fi_lock and set up the rest of the delegation fields. This prevents
the race because the delegation break workqueue job must also take the
state_lock.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 51 ++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 38 insertions(+), 13 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fd4deb049ddf..91a33f8896dd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -624,6 +624,8 @@ nfs4_put_delegation(struct nfs4_delegation *dp)
 
 static void nfs4_put_deleg_lease(struct nfs4_file *fp)
 {
+	lockdep_assert_held(&state_lock);
+
 	if (!fp->fi_lease)
 		return;
 	if (atomic_dec_and_test(&fp->fi_delegees)) {
@@ -643,11 +645,10 @@ static void
 hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
 {
 	lockdep_assert_held(&state_lock);
+	lockdep_assert_held(&fp->fi_lock);
 
 	dp->dl_stid.sc_type = NFS4_DELEG_STID;
-	spin_lock(&fp->fi_lock);
 	list_add(&dp->dl_perfile, &fp->fi_delegations);
-	spin_unlock(&fp->fi_lock);
 	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
 }
 
@@ -659,17 +660,17 @@ unhash_delegation(struct nfs4_delegation *dp)
 
 	spin_lock(&state_lock);
 	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
+	spin_lock(&fp->fi_lock);
 	list_del_init(&dp->dl_perclnt);
 	list_del_init(&dp->dl_recall_lru);
-	spin_lock(&fp->fi_lock);
 	list_del_init(&dp->dl_perfile);
 	spin_unlock(&fp->fi_lock);
-	spin_unlock(&state_lock);
 	if (fp) {
 		nfs4_put_deleg_lease(fp);
-		put_nfs4_file(fp);
 		dp->dl_file = NULL;
 	}
+	spin_unlock(&state_lock);
+	put_nfs4_file(fp);
 }
 
 static void destroy_revoked_delegation(struct nfs4_delegation *dp)
@@ -3493,7 +3494,7 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 {
 	struct nfs4_file *fp = dp->dl_file;
 	struct file_lock *fl;
-	int status;
+	int status = 0;
 
 	fl = nfs4_alloc_init_lease(fp, NFS4_OPEN_DELEGATE_READ);
 	if (!fl)
@@ -3501,15 +3502,31 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 	fl->fl_file = find_readable_file(fp);
 	status = vfs_setlease(fl->fl_file, fl->fl_type, &fl);
 	if (status)
-		goto out_free;
+		goto out_fput;
+	spin_lock(&state_lock);
+	/* Did the lease get broken before we took the lock? */
+	status = -EAGAIN;
+	if (!file_has_lease(fl->fl_file))
+		goto out_unlock;
+	spin_lock(&fp->fi_lock);
+	/* Race breaker */
+	if (fp->fi_lease) {
+		status = 0;
+		atomic_inc(&fp->fi_delegees);
+		hash_delegation_locked(dp, fp);
+		spin_unlock(&fp->fi_lock);
+		goto out_unlock;
+	}
 	fp->fi_lease = fl;
 	fp->fi_deleg_file = fl->fl_file;
 	atomic_set(&fp->fi_delegees, 1);
-	spin_lock(&state_lock);
 	hash_delegation_locked(dp, fp);
+	spin_unlock(&fp->fi_lock);
 	spin_unlock(&state_lock);
 	return 0;
-out_free:
+out_unlock:
+	spin_unlock(&state_lock);
+out_fput:
 	if (fl->fl_file)
 		fput(fl->fl_file);
 	locks_free_lock(fl);
@@ -3518,19 +3535,27 @@ out_free:
 
 static int nfs4_set_delegation(struct nfs4_delegation *dp, struct nfs4_file *fp)
 {
+	int status = 0;
+
 	if (fp->fi_had_conflict)
 		return -EAGAIN;
 	get_nfs4_file(fp);
+	spin_lock(&state_lock);
+	spin_lock(&fp->fi_lock);
 	dp->dl_file = fp;
-	if (!fp->fi_lease)
+	if (!fp->fi_lease) {
+		spin_unlock(&fp->fi_lock);
+		spin_unlock(&state_lock);
 		return nfs4_setlease(dp);
-	spin_lock(&state_lock);
+	}
 	atomic_inc(&fp->fi_delegees);
 	if (fp->fi_had_conflict) {
-		spin_unlock(&state_lock);
-		return -EAGAIN;
+		status = -EAGAIN;
+		goto out_unlock;
 	}
 	hash_delegation_locked(dp, fp);
+out_unlock:
+	spin_unlock(&fp->fi_lock);
 	spin_unlock(&state_lock);
 	return 0;
 }
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 07/10] nfsd: Move the delegation reference counter into the struct nfs4_stid
  2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
                   ` (5 preceding siblings ...)
  2014-07-16 14:32 ` [PATCH v2 06/10] nfsd: Protect the nfs4_file delegation fields using the fi_lock Jeff Layton
@ 2014-07-16 14:32 ` Jeff Layton
  2014-07-16 14:32 ` [PATCH v2 08/10] nfsd: Simplify stateid management Jeff Layton
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:32 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

We will want to add reference counting to the lock stateid and open
stateids too in later patches.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/nfsd/nfs4state.c | 6 +++---
 fs/nfsd/state.h     | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 91a33f8896dd..c55609a582ad 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -472,6 +472,7 @@ kmem_cache *slab)
 	stid->sc_stateid.si_opaque.so_clid = cl->cl_clientid;
 	/* Will be incremented before return to client: */
 	stid->sc_stateid.si_generation = 0;
+	atomic_set(&stid->sc_count, 1);
 
 	/*
 	 * It shouldn't be a problem to reuse an opaque stateid value.
@@ -595,7 +596,6 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
 	dp->dl_type = NFS4_OPEN_DELEGATE_READ;
 	fh_copy_shallow(&dp->dl_fh, &current_fh->fh_handle);
 	dp->dl_time = 0;
-	atomic_set(&dp->dl_count, 1);
 	INIT_WORK(&dp->dl_recall.cb_work, nfsd4_run_cb_recall);
 	return dp;
 }
@@ -615,7 +615,7 @@ static void nfs4_free_stid(struct kmem_cache *slab, struct nfs4_stid *s)
 void
 nfs4_put_delegation(struct nfs4_delegation *dp)
 {
-	if (atomic_dec_and_test(&dp->dl_count)) {
+	if (atomic_dec_and_test(&dp->dl_stid.sc_count)) {
 		remove_stid(&dp->dl_stid);
 		nfs4_free_stid(deleg_slab, &dp->dl_stid);
 		num_delegations--;
@@ -3119,7 +3119,7 @@ static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
 	 * lock) we know the server hasn't removed the lease yet, we know
 	 * it's safe to take a reference.
 	 */
-	atomic_inc(&dp->dl_count);
+	atomic_inc(&dp->dl_stid.sc_count);
 	nfsd4_cb_recall(dp);
 }
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 996d61eeb357..e68a9ae30fd7 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -73,6 +73,7 @@ struct nfsd4_callback {
 };
 
 struct nfs4_stid {
+	atomic_t sc_count;
 #define NFS4_OPEN_STID 1
 #define NFS4_LOCK_STID 2
 #define NFS4_DELEG_STID 4
@@ -91,7 +92,6 @@ struct nfs4_delegation {
 	struct list_head	dl_perfile;
 	struct list_head	dl_perclnt;
 	struct list_head	dl_recall_lru;  /* delegation recalled */
-	atomic_t		dl_count;       /* ref count */
 	struct nfs4_file	*dl_file;
 	u32			dl_type;
 	time_t			dl_time;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 08/10] nfsd: Simplify stateid management
  2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
                   ` (6 preceding siblings ...)
  2014-07-16 14:32 ` [PATCH v2 07/10] nfsd: Move the delegation reference counter into the struct nfs4_stid Jeff Layton
@ 2014-07-16 14:32 ` Jeff Layton
  2014-07-16 18:10   ` Christoph Hellwig
  2014-07-16 14:32 ` [PATCH v2 09/10] nfsd: Fix delegation revocation Jeff Layton
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:32 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Don't allow stateids to clear the open file pointer until they are
being destroyed. Also, move to kzalloc and get rid of explicit
zeroing of fields.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index c55609a582ad..cb8e49375d1d 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -459,7 +459,7 @@ kmem_cache *slab)
 	struct nfs4_stid *stid;
 	int new_id;
 
-	stid = kmem_cache_alloc(slab, GFP_KERNEL);
+	stid = kmem_cache_zalloc(slab, GFP_KERNEL);
 	if (!stid)
 		return NULL;
 
@@ -467,11 +467,9 @@ kmem_cache *slab)
 	if (new_id < 0)
 		goto out_free;
 	stid->sc_client = cl;
-	stid->sc_type = 0;
 	stid->sc_stateid.si_opaque.so_id = new_id;
 	stid->sc_stateid.si_opaque.so_clid = cl->cl_clientid;
 	/* Will be incremented before return to client: */
-	stid->sc_stateid.si_generation = 0;
 	atomic_set(&stid->sc_count, 1);
 
 	/*
@@ -592,10 +590,8 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
 	INIT_LIST_HEAD(&dp->dl_perfile);
 	INIT_LIST_HEAD(&dp->dl_perclnt);
 	INIT_LIST_HEAD(&dp->dl_recall_lru);
-	dp->dl_file = NULL;
 	dp->dl_type = NFS4_OPEN_DELEGATE_READ;
 	fh_copy_shallow(&dp->dl_fh, &current_fh->fh_handle);
-	dp->dl_time = 0;
 	INIT_WORK(&dp->dl_recall.cb_work, nfsd4_run_cb_recall);
 	return dp;
 }
@@ -616,6 +612,8 @@ void
 nfs4_put_delegation(struct nfs4_delegation *dp)
 {
 	if (atomic_dec_and_test(&dp->dl_stid.sc_count)) {
+		if (dp->dl_file)
+			put_nfs4_file(dp->dl_file);
 		remove_stid(&dp->dl_stid);
 		nfs4_free_stid(deleg_slab, &dp->dl_stid);
 		num_delegations--;
@@ -665,12 +663,9 @@ unhash_delegation(struct nfs4_delegation *dp)
 	list_del_init(&dp->dl_recall_lru);
 	list_del_init(&dp->dl_perfile);
 	spin_unlock(&fp->fi_lock);
-	if (fp) {
+	if (fp)
 		nfs4_put_deleg_lease(fp);
-		dp->dl_file = NULL;
-	}
 	spin_unlock(&state_lock);
-	put_nfs4_file(fp);
 }
 
 static void destroy_revoked_delegation(struct nfs4_delegation *dp)
@@ -878,12 +873,12 @@ static void unhash_generic_stateid(struct nfs4_ol_stateid *stp)
 static void close_generic_stateid(struct nfs4_ol_stateid *stp)
 {
 	release_all_access(stp);
-	put_nfs4_file(stp->st_file);
-	stp->st_file = NULL;
 }
 
 static void free_generic_stateid(struct nfs4_ol_stateid *stp)
 {
+	if (stp->st_file)
+		put_nfs4_file(stp->st_file);
 	remove_stid(&stp->st_stid);
 	nfs4_free_stid(stateid_slab, &stp->st_stid);
 }
@@ -4446,6 +4441,10 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 		if (list_empty(&oo->oo_owner.so_stateids))
 			release_openowner(oo);
 	} else {
+		if (s->st_file) {
+			put_nfs4_file(s->st_file);
+			s->st_file = NULL;
+		}
 		oo->oo_last_closed_stid = s;
 		/*
 		 * In the 4.0 case we need to keep the owners around a
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 09/10] nfsd: Fix delegation revocation
  2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
                   ` (7 preceding siblings ...)
  2014-07-16 14:32 ` [PATCH v2 08/10] nfsd: Simplify stateid management Jeff Layton
@ 2014-07-16 14:32 ` Jeff Layton
  2014-07-16 18:30   ` Christoph Hellwig
  2014-07-16 14:32 ` [PATCH v2 10/10] nfsd: Convert delegation counter to an atomic_long_t type Jeff Layton
  2014-07-16 14:33 ` [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
  10 siblings, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:32 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

Ensure that the delegations cannot be found by the laundromat etc once
we add them to the various 'revoke' lists.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index cb8e49375d1d..0329e363ac73 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -650,13 +650,13 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
 	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
 }
 
-/* Called under the state lock. */
 static void
-unhash_delegation(struct nfs4_delegation *dp)
+unhash_delegation_locked(struct nfs4_delegation *dp)
 {
 	struct nfs4_file *fp = dp->dl_file;
 
-	spin_lock(&state_lock);
+	lockdep_assert_held(&state_lock);
+
 	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
 	spin_lock(&fp->fi_lock);
 	list_del_init(&dp->dl_perclnt);
@@ -665,18 +665,19 @@ unhash_delegation(struct nfs4_delegation *dp)
 	spin_unlock(&fp->fi_lock);
 	if (fp)
 		nfs4_put_deleg_lease(fp);
-	spin_unlock(&state_lock);
 }
 
-static void destroy_revoked_delegation(struct nfs4_delegation *dp)
+static void unhash_and_destroy_delegation(struct nfs4_delegation *dp)
 {
-	list_del_init(&dp->dl_recall_lru);
+	spin_lock(&state_lock);
+	unhash_delegation_locked(dp);
+	spin_unlock(&state_lock);
 	nfs4_put_delegation(dp);
 }
 
-static void destroy_delegation(struct nfs4_delegation *dp)
+static void destroy_revoked_delegation(struct nfs4_delegation *dp)
 {
-	unhash_delegation(dp);
+	list_del_init(&dp->dl_recall_lru);
 	nfs4_put_delegation(dp);
 }
 
@@ -685,11 +686,10 @@ static void revoke_delegation(struct nfs4_delegation *dp)
 	struct nfs4_client *clp = dp->dl_stid.sc_client;
 
 	if (clp->cl_minorversion == 0)
-		destroy_delegation(dp);
+		destroy_revoked_delegation(dp);
 	else {
-		unhash_delegation(dp);
 		dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
-		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
+		list_move(&dp->dl_recall_lru, &clp->cl_revoked);
 	}
 }
 
@@ -1447,15 +1447,16 @@ destroy_client(struct nfs4_client *clp)
 	spin_lock(&state_lock);
 	while (!list_empty(&clp->cl_delegations)) {
 		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
-		list_del_init(&dp->dl_perclnt);
+		unhash_delegation_locked(dp);
 		/* Ensure that deleg break won't try to requeue it */
 		++dp->dl_time;
-		list_move(&dp->dl_recall_lru, &reaplist);
+		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
 	while (!list_empty(&reaplist)) {
 		dp = list_entry(reaplist.next, struct nfs4_delegation, dl_recall_lru);
-		destroy_delegation(dp);
+		list_del_init(&dp->dl_recall_lru);
+		nfs4_put_delegation(dp);
 	}
 	list_splice_init(&clp->cl_revoked, &reaplist);
 	while (!list_empty(&reaplist)) {
@@ -3639,7 +3640,7 @@ nfs4_open_delegation(struct net *net, struct svc_fh *fh,
 	open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
 	return;
 out_free:
-	destroy_delegation(dp);
+	nfs4_put_delegation(dp);
 out_no_deleg:
 	open->op_delegate_type = NFS4_OPEN_DELEGATE_NONE;
 	if (open->op_claim_type == NFS4_OPEN_CLAIM_PREVIOUS &&
@@ -3878,7 +3879,8 @@ nfs4_laundromat(struct nfsd_net *nn)
 			new_timeo = min(new_timeo, t);
 			break;
 		}
-		list_move(&dp->dl_recall_lru, &reaplist);
+		unhash_delegation_locked(dp);
+		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
 	list_for_each_safe(pos, next, &reaplist) {
@@ -4510,7 +4512,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	if (status)
 		goto out;
 
-	destroy_delegation(dp);
+	unhash_and_destroy_delegation(dp);
 out:
 	nfs4_unlock_state();
 
@@ -5353,7 +5355,8 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
 			 * don't monkey with it now that we are.
 			 */
 			++dp->dl_time;
-			list_move(&dp->dl_recall_lru, victims);
+			unhash_delegation_locked(dp);
+			list_add(&dp->dl_recall_lru, victims);
 		}
 		if (++count == max)
 			break;
@@ -5608,12 +5611,14 @@ nfs4_state_shutdown_net(struct net *net)
 	spin_lock(&state_lock);
 	list_for_each_safe(pos, next, &nn->del_recall_lru) {
 		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
-		list_move(&dp->dl_recall_lru, &reaplist);
+		unhash_delegation_locked(dp);
+		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
 	list_for_each_safe(pos, next, &reaplist) {
 		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
-		destroy_delegation(dp);
+		list_del_init(&dp->dl_recall_lru);
+		nfs4_put_delegation(dp);
 	}
 
 	nfsd4_client_tracking_exit(net);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 10/10] nfsd: Convert delegation counter to an atomic_long_t type
  2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
                   ` (8 preceding siblings ...)
  2014-07-16 14:32 ` [PATCH v2 09/10] nfsd: Fix delegation revocation Jeff Layton
@ 2014-07-16 14:32 ` Jeff Layton
  2014-07-16 18:11   ` Christoph Hellwig
  2014-07-16 14:33 ` [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
  10 siblings, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:32 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

We want to convert to an atomic type so that we don't need to lock
across the call to alloc_init_deleg(). Then convert to a long type so
that we match the size of 'max_delegations'.

None of this is a problem today, but it will be once we remove
client_mutex protection.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0329e363ac73..64eac2707714 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -343,7 +343,7 @@ find_any_file(struct nfs4_file *f)
 	return ret;
 }
 
-static int num_delegations;
+static atomic_long_t num_delegations;
 unsigned long max_delegations;
 
 /*
@@ -571,22 +571,23 @@ static struct nfs4_delegation *
 alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct svc_fh *current_fh)
 {
 	struct nfs4_delegation *dp;
+	long n;
 
 	dprintk("NFSD alloc_init_deleg\n");
-	if (num_delegations > max_delegations)
-		return NULL;
+	n = atomic_long_inc_return(&num_delegations);
+	if (n < 0 || n > max_delegations)
+		goto out_dec;
 	if (delegation_blocked(&current_fh->fh_handle))
-		return NULL;
+		goto out_dec;
 	dp = delegstateid(nfs4_alloc_stid(clp, deleg_slab));
 	if (dp == NULL)
-		return dp;
+		goto out_dec;
 	/*
 	 * delegation seqid's are never incremented.  The 4.1 special
 	 * meaning of seqid 0 isn't meaningful, really, but let's avoid
 	 * 0 anyway just for consistency and use 1:
 	 */
 	dp->dl_stid.sc_stateid.si_generation = 1;
-	num_delegations++;
 	INIT_LIST_HEAD(&dp->dl_perfile);
 	INIT_LIST_HEAD(&dp->dl_perclnt);
 	INIT_LIST_HEAD(&dp->dl_recall_lru);
@@ -594,6 +595,9 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
 	fh_copy_shallow(&dp->dl_fh, &current_fh->fh_handle);
 	INIT_WORK(&dp->dl_recall.cb_work, nfsd4_run_cb_recall);
 	return dp;
+out_dec:
+	atomic_long_dec(&num_delegations);
+	return NULL;
 }
 
 static void remove_stid(struct nfs4_stid *s)
@@ -616,7 +620,7 @@ nfs4_put_delegation(struct nfs4_delegation *dp)
 			put_nfs4_file(dp->dl_file);
 		remove_stid(&dp->dl_stid);
 		nfs4_free_stid(deleg_slab, &dp->dl_stid);
-		num_delegations--;
+		atomic_long_dec(&num_delegations);
 	}
 }
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal
  2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
                   ` (9 preceding siblings ...)
  2014-07-16 14:32 ` [PATCH v2 10/10] nfsd: Convert delegation counter to an atomic_long_t type Jeff Layton
@ 2014-07-16 14:33 ` Jeff Layton
  10 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 14:33 UTC (permalink / raw)
  To: bfields; +Cc: hch, linux-nfs

On Wed, 16 Jul 2014 10:31:55 -0400
Jeff Layton <jlayton@primarydata.com> wrote:

> Here's a respin of the delegation fixes that I sent yesterday. The main
> change is to move the remove_stid calls into the caller in the second
> patch instead of keeping it in nfs4_free_stid, per Christoph's
> suggestion.
> 
> Unfortunately, that causes a pile of merge conflicts in the later
> patches, so I need to resend the whole set.
> 
> I also include the patch that converts the num_delegations counter to an
> atomic_long_t, as I missed that in the earlier series.
> 
> There is one more patch that fixes up handling of the cl_revoked list
> that comes later, but it can't easily be moved here as it depends on
> some locking changes that come later.
> 
> Jeff Layton (6):
>   nfsd: eliminate nfsd4_init_callback
>   nfsd: Avoid taking state_lock while holding inode lock in
>     nfsd_break_one_deleg
>   nfsd: nfs4_alloc_init_lease should take a nfs4_file arg
>   locks: add file_has_lease to prevent delegation break races
>   nfsd: Protect the nfs4_file delegation fields using the fi_lock
>   nfsd: Fix delegation revocation
> 
> Trond Myklebust (4):
>   nfsd: Ensure stateids remain unique until they are freed
>   nfsd: Move the delegation reference counter into the struct nfs4_stid
>   nfsd: Simplify stateid management
>   nfsd: Convert delegation counter to an atomic_long_t type
> 
>  fs/locks.c             |  26 +++++++
>  fs/nfsd/nfs4callback.c |  23 ++++--
>  fs/nfsd/nfs4state.c    | 197 +++++++++++++++++++++++++++++++------------------
>  fs/nfsd/state.h        |   7 +-
>  include/linux/fs.h     |   6 ++
>  5 files changed, 179 insertions(+), 80 deletions(-)
> 

Sorry...obviously the subject here should have read "0/10" and not "0/8".

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 06/10] nfsd: Protect the nfs4_file delegation fields using the fi_lock
  2014-07-16 14:32 ` [PATCH v2 06/10] nfsd: Protect the nfs4_file delegation fields using the fi_lock Jeff Layton
@ 2014-07-16 18:09   ` Christoph Hellwig
  2014-07-16 19:04     ` Jeff Layton
  0 siblings, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2014-07-16 18:09 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, hch, linux-nfs

>  	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> +	spin_lock(&fp->fi_lock);
>  	list_del_init(&dp->dl_perclnt);
>  	list_del_init(&dp->dl_recall_lru);
> -	spin_lock(&fp->fi_lock);
>  	list_del_init(&dp->dl_perfile);
>  	spin_unlock(&fp->fi_lock);
> -	spin_unlock(&state_lock);
>  	if (fp) {
>  		nfs4_put_deleg_lease(fp);
> -		put_nfs4_file(fp);
>  		dp->dl_file = NULL;
>  	}
> +	spin_unlock(&state_lock);
> +	put_nfs4_file(fp);
>  }

Unless I've missed something put_nfs4_file does not handle a NULL
pointer argument in the current tree, so this needs a NULL check
for fp.

> +	spin_lock(&state_lock);
> +	/* Did the lease get broken before we took the lock? */
> +	status = -EAGAIN;
> +	if (!file_has_lease(fl->fl_file))
> +		goto out_unlock;

So if the delegation we just tried to add got broken, but someone else
managed to add one this would also return true.  But could that happen
either in real life or in theory?  Shouldn't we instead have an
atomic flag on the delegation that it got broken which we could check
here?


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 08/10] nfsd: Simplify stateid management
  2014-07-16 14:32 ` [PATCH v2 08/10] nfsd: Simplify stateid management Jeff Layton
@ 2014-07-16 18:10   ` Christoph Hellwig
  0 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2014-07-16 18:10 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, hch, linux-nfs

On Wed, Jul 16, 2014 at 10:32:03AM -0400, Jeff Layton wrote:
> From: Trond Myklebust <trond.myklebust@primarydata.com>
> 
> Don't allow stateids to clear the open file pointer until they are
> being destroyed. Also, move to kzalloc and get rid of explicit
> zeroing of fields.

This doesn't explain what's actually simplified as the subject line
claims, nor does it explain why this move is safe.  It generally
looks good to me, but a little more explanation wouldn't hurt.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 10/10] nfsd: Convert delegation counter to an atomic_long_t type
  2014-07-16 14:32 ` [PATCH v2 10/10] nfsd: Convert delegation counter to an atomic_long_t type Jeff Layton
@ 2014-07-16 18:11   ` Christoph Hellwig
  0 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2014-07-16 18:11 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, hch, linux-nfs

On Wed, Jul 16, 2014 at 10:32:05AM -0400, Jeff Layton wrote:
> From: Trond Myklebust <trond.myklebust@primarydata.com>
> 
> We want to convert to an atomic type so that we don't need to lock
> across the call to alloc_init_deleg(). Then convert to a long type so
> that we match the size of 'max_delegations'.
> 
> None of this is a problem today, but it will be once we remove
> client_mutex protection.
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 09/10] nfsd: Fix delegation revocation
  2014-07-16 14:32 ` [PATCH v2 09/10] nfsd: Fix delegation revocation Jeff Layton
@ 2014-07-16 18:30   ` Christoph Hellwig
  2014-07-16 19:16     ` Jeff Layton
  0 siblings, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2014-07-16 18:30 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, hch, linux-nfs

> -/* Called under the state lock. */

You're removing a comment hat only becomes true after your patch :)

But I agree that it's obsolete, the lockdep_assert_held is better than
the comment.

> @@ -685,11 +686,10 @@ static void revoke_delegation(struct nfs4_delegation *dp)
>  	struct nfs4_client *clp = dp->dl_stid.sc_client;
>  
>  	if (clp->cl_minorversion == 0)
> -		destroy_delegation(dp);
> +		destroy_revoked_delegation(dp);

Seems a little confusing as we didn't turn the code into a REVOKED_DELEG
stateid type yet, but given that destroy_revoked_delegation doesn't
care this is probably fine.

>  	else {
> -		unhash_delegation(dp);
>  		dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
> -		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
> +		list_move(&dp->dl_recall_lru, &clp->cl_revoked);

What protects access to cl_revoked, btw?

Otherwise this looks fine to me,

Reviewed-by: Christoph Hellwig <hch@lst.de>

(would have been a little nicer to read if you had kept
destroy_revoked_delegation, otherwise diff obsfucates the fact that it
didn't change at all..

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 06/10] nfsd: Protect the nfs4_file delegation fields using the fi_lock
  2014-07-16 18:09   ` Christoph Hellwig
@ 2014-07-16 19:04     ` Jeff Layton
  2014-07-17 14:55       ` Christoph Hellwig
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 19:04 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Wed, 16 Jul 2014 11:09:43 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> >  	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> > +	spin_lock(&fp->fi_lock);
> >  	list_del_init(&dp->dl_perclnt);
> >  	list_del_init(&dp->dl_recall_lru);
> > -	spin_lock(&fp->fi_lock);
> >  	list_del_init(&dp->dl_perfile);
> >  	spin_unlock(&fp->fi_lock);
> > -	spin_unlock(&state_lock);
> >  	if (fp) {
> >  		nfs4_put_deleg_lease(fp);
> > -		put_nfs4_file(fp);
> >  		dp->dl_file = NULL;
> >  	}
> > +	spin_unlock(&state_lock);
> > +	put_nfs4_file(fp);
> >  }
> 
> Unless I've missed something put_nfs4_file does not handle a NULL
> pointer argument in the current tree, so this needs a NULL check
> for fp.
> 

Well spotted, I'll fix that...

> > +	spin_lock(&state_lock);
> > +	/* Did the lease get broken before we took the lock? */
> > +	status = -EAGAIN;
> > +	if (!file_has_lease(fl->fl_file))
> > +		goto out_unlock;
> 
> So if the delegation we just tried to add got broken, but someone else
> managed to add one this would also return true.  But could that happen
> either in real life or in theory?  Shouldn't we instead have an
> atomic flag on the delegation that it got broken which we could check
> here?
> 

Hmm...maybe. It's a very unlikely race though, and I think the same
sort of race exists today. In fact, it's worse today since we don't do
any checking of the validity of the lease after acquiring it now.

The flag sounds like a good idea, but the code is structured completely
wrong for it currently. The delegation is only hashed after we get a
lease, so the lease break wouldn't find anything to set a flag on.

Quite frankly, I _really_ do not want to have to rework the locking in
the delegation code yet again. I think this scheme is an improvement
over what we have now, even if it's not 100% perfect.

Once we get the scalability set done, I'd like to go back and overhaul
the delegation code. There are a lot of ugly warts here, but fixing
them is really a separate project in its own right.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 09/10] nfsd: Fix delegation revocation
  2014-07-16 18:30   ` Christoph Hellwig
@ 2014-07-16 19:16     ` Jeff Layton
  2014-07-17  9:22       ` Christoph Hellwig
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2014-07-16 19:16 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Wed, 16 Jul 2014 11:30:47 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> > -/* Called under the state lock. */
> 
> You're removing a comment hat only becomes true after your patch :)
> 
> But I agree that it's obsolete, the lockdep_assert_held is better than
> the comment.
> 

Yeah, the other problem is that that comment is ambiguous. Does it mean
the state_lock or the client_mutex (which is also sometimes termed
"state lock" in this code).

> > @@ -685,11 +686,10 @@ static void revoke_delegation(struct nfs4_delegation *dp)
> >  	struct nfs4_client *clp = dp->dl_stid.sc_client;
> >  
> >  	if (clp->cl_minorversion == 0)
> > -		destroy_delegation(dp);
> > +		destroy_revoked_delegation(dp);
> 
> Seems a little confusing as we didn't turn the code into a REVOKED_DELEG
> stateid type yet, but given that destroy_revoked_delegation doesn't
> care this is probably fine.
> 

True, but since we're destroying it anyway we don't need to revoke it.
Still it is less confusing and since I'm respinning for other reasons,
I can change that too...

> >  	else {
> > -		unhash_delegation(dp);
> >  		dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
> > -		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
> > +		list_move(&dp->dl_recall_lru, &clp->cl_revoked);
> 
> What protects access to cl_revoked, btw?
> 

At this point in the series, the client_mutex does. There's a patch
later that protects the cl_revoked list with the cl_lock, but it
depends on some other locking changes and so it's harder to move it to
the front of the series just yet.

> Otherwise this looks fine to me,
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> (would have been a little nicer to read if you had kept
> destroy_revoked_delegation, otherwise diff obsfucates the fact that it
> didn't change at all..

Thanks. Do you mean destroy_delegation? destroy_revoked_delegation
still exists -- destroy_delegation is removed, but it's just a wrapper
around nfs4_put_delegation at this point...

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 02/10] nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg
  2014-07-16 14:31 ` [PATCH v2 02/10] nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg Jeff Layton
@ 2014-07-17  1:34   ` J. Bruce Fields
  2014-07-17 10:57     ` Jeff Layton
  0 siblings, 1 reply; 25+ messages in thread
From: J. Bruce Fields @ 2014-07-17  1:34 UTC (permalink / raw)
  To: Jeff Layton; +Cc: hch, linux-nfs

On Wed, Jul 16, 2014 at 10:31:57AM -0400, Jeff Layton wrote:
> state_lock is a heavily contended global lock. We don't want to grab
> that while simultaneously holding the inode->i_lock.
> 
> Add a new per-nfs4_file lock that we can use to protect the
> per-nfs4_file delegation list. Hold that while walking the list in the
> break_deleg callback and queue the workqueue job for each one.
> 
> The workqueue job can then take the state_lock and do the list
> manipulations without the i_lock being held prior to starting the
> rpc call.

Looks fine.

As a separate issue: does nfs4_state_shutdown net have the same problem
as destroy_client?  It seems like there should be a cp->dl_time++ there
too.

Though I'm not sure what the point of that del_recall_lru loop is at
all.  nfs4_state_destroy_net() destroys each client and its associated
delegations.  Maybe that function should just be

	nfs4_lock_state();
	nfs4_client_tracking_exit(net);
	nfs4_state_destroy_net(net);
	WARN_ON(!list_empty(&nn->del_recall_lru);
	nfs4_unlock_state();

--b.

> 
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/nfsd/nfs4callback.c | 25 ++++++++++++++++++----
>  fs/nfsd/nfs4state.c    | 58 +++++++++++++++++++++++++++++++++-----------------
>  fs/nfsd/state.h        |  4 +++-
>  3 files changed, 62 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 30a71cb46001..a88a93e09d69 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -933,7 +933,7 @@ void nfsd4_shutdown_callback(struct nfs4_client *clp)
>  	set_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
>  	/*
>  	 * Note this won't actually result in a null callback;
> -	 * instead, nfsd4_do_callback_rpc() will detect the killed
> +	 * instead, nfsd4_run_cb_null() will detect the killed
>  	 * client, destroy the rpc client, and stop:
>  	 */
>  	do_probe_callback(clp);
> @@ -1011,10 +1011,9 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
>  		run_nfsd4_cb(cb);
>  }
>  
> -void
> -nfsd4_do_callback_rpc(struct work_struct *w)
> +static void
> +nfsd4_run_callback_rpc(struct nfsd4_callback *cb)
>  {
> -	struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback, cb_work);
>  	struct nfs4_client *clp = cb->cb_clp;
>  	struct rpc_clnt *clnt;
>  
> @@ -1032,6 +1031,24 @@ nfsd4_do_callback_rpc(struct work_struct *w)
>  			cb->cb_ops, cb);
>  }
>  
> +void
> +nfsd4_run_cb_null(struct work_struct *w)
> +{
> +	struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback,
> +							cb_work);
> +	nfsd4_run_callback_rpc(cb);
> +}
> +
> +void
> +nfsd4_run_cb_recall(struct work_struct *w)
> +{
> +	struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback,
> +							cb_work);
> +
> +	nfsd4_prepare_cb_recall(cb->cb_op);
> +	nfsd4_run_callback_rpc(cb);
> +}
> +
>  void nfsd4_cb_recall(struct nfs4_delegation *dp)
>  {
>  	struct nfsd4_callback *cb = &dp->dl_recall;
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 56ea4f12803e..bdf8ac3393bd 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -254,6 +254,8 @@ static void nfsd4_free_file(struct nfs4_file *f)
>  static inline void
>  put_nfs4_file(struct nfs4_file *fi)
>  {
> +	might_lock(&state_lock);
> +
>  	if (atomic_dec_and_lock(&fi->fi_ref, &state_lock)) {
>  		hlist_del(&fi->fi_hash);
>  		spin_unlock(&state_lock);
> @@ -554,6 +556,8 @@ static void block_delegations(struct knfsd_fh *fh)
>  	u32 hash;
>  	struct bloom_pair *bd = &blocked_delegations;
>  
> +	lockdep_assert_held(&state_lock);
> +
>  	hash = arch_fast_hash(&fh->fh_base, fh->fh_size, 0);
>  
>  	__set_bit(hash&255, bd->set[bd->new]);
> @@ -592,7 +596,7 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
>  	fh_copy_shallow(&dp->dl_fh, &current_fh->fh_handle);
>  	dp->dl_time = 0;
>  	atomic_set(&dp->dl_count, 1);
> -	INIT_WORK(&dp->dl_recall.cb_work, nfsd4_do_callback_rpc);
> +	INIT_WORK(&dp->dl_recall.cb_work, nfsd4_run_cb_recall);
>  	return dp;
>  }
>  
> @@ -640,7 +644,9 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
>  	lockdep_assert_held(&state_lock);
>  
>  	dp->dl_stid.sc_type = NFS4_DELEG_STID;
> +	spin_lock(&fp->fi_lock);
>  	list_add(&dp->dl_perfile, &fp->fi_delegations);
> +	spin_unlock(&fp->fi_lock);
>  	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
>  }
>  
> @@ -648,14 +654,18 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
>  static void
>  unhash_delegation(struct nfs4_delegation *dp)
>  {
> +	struct nfs4_file *fp = dp->dl_file;
> +
>  	spin_lock(&state_lock);
>  	list_del_init(&dp->dl_perclnt);
> -	list_del_init(&dp->dl_perfile);
>  	list_del_init(&dp->dl_recall_lru);
> +	spin_lock(&fp->fi_lock);
> +	list_del_init(&dp->dl_perfile);
> +	spin_unlock(&fp->fi_lock);
>  	spin_unlock(&state_lock);
> -	if (dp->dl_file) {
> -		nfs4_put_deleg_lease(dp->dl_file);
> -		put_nfs4_file(dp->dl_file);
> +	if (fp) {
> +		nfs4_put_deleg_lease(fp);
> +		put_nfs4_file(fp);
>  		dp->dl_file = NULL;
>  	}
>  }
> @@ -1677,7 +1687,7 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
>  		spin_unlock(&nn->client_lock);
>  		return NULL;
>  	}
> -	INIT_WORK(&clp->cl_cb_null.cb_work, nfsd4_do_callback_rpc);
> +	INIT_WORK(&clp->cl_cb_null.cb_work, nfsd4_run_cb_null);
>  	clp->cl_time = get_seconds();
>  	clear_bit(0, &clp->cl_cb_slot_busy);
>  	copy_verf(clp, verf);
> @@ -3079,30 +3089,38 @@ nfs4_share_conflict(struct svc_fh *current_fh, unsigned int deny_type)
>  	return ret;
>  }
>  
> -static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
> +void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp)
>  {
>  	struct nfs4_client *clp = dp->dl_stid.sc_client;
>  	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
>  
> -	lockdep_assert_held(&state_lock);
> -	/* We're assuming the state code never drops its reference
> -	 * without first removing the lease.  Since we're in this lease
> -	 * callback (and since the lease code is serialized by the kernel
> -	 * lock) we know the server hasn't removed the lease yet, we know
> -	 * it's safe to take a reference: */
> -	atomic_inc(&dp->dl_count);
> -
> +	/*
> +	 * We can't do this in nfsd_break_deleg_cb because it is
> +	 * already holding inode->i_lock
> +	 */
> +	spin_lock(&state_lock);
> +	block_delegations(&dp->dl_fh);
>  	/*
>  	 * If the dl_time != 0, then we know that it has already been
>  	 * queued for a lease break. Don't queue it again.
>  	 */
>  	if (dp->dl_time == 0) {
> -		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
>  		dp->dl_time = get_seconds();
> +		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
>  	}
> +	spin_unlock(&state_lock);
> +}
>  
> -	block_delegations(&dp->dl_fh);
> -
> +static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
> +{
> +	/*
> +	 * We're assuming the state code never drops its reference
> +	 * without first removing the lease.  Since we're in this lease
> +	 * callback (and since the lease code is serialized by the kernel
> +	 * lock) we know the server hasn't removed the lease yet, we know
> +	 * it's safe to take a reference.
> +	 */
> +	atomic_inc(&dp->dl_count);
>  	nfsd4_cb_recall(dp);
>  }
>  
> @@ -3127,11 +3145,11 @@ static void nfsd_break_deleg_cb(struct file_lock *fl)
>  	 */
>  	fl->fl_break_time = 0;
>  
> -	spin_lock(&state_lock);
>  	fp->fi_had_conflict = true;
> +	spin_lock(&fp->fi_lock);
>  	list_for_each_entry(dp, &fp->fi_delegations, dl_perfile)
>  		nfsd_break_one_deleg(dp);
> -	spin_unlock(&state_lock);
> +	spin_unlock(&fp->fi_lock);
>  }
>  
>  static
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 20857142773f..81b7522e3f67 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -436,7 +436,8 @@ extern struct nfs4_client_reclaim *nfsd4_find_reclaim_client(const char *recdir,
>  extern __be32 nfs4_check_open_reclaim(clientid_t *clid,
>  		struct nfsd4_compound_state *cstate, struct nfsd_net *nn);
>  extern int set_callback_cred(void);
> -void nfsd4_do_callback_rpc(struct work_struct *w);
> +void nfsd4_run_cb_null(struct work_struct *w);
> +void nfsd4_run_cb_recall(struct work_struct *w);
>  extern void nfsd4_probe_callback(struct nfs4_client *clp);
>  extern void nfsd4_probe_callback_sync(struct nfs4_client *clp);
>  extern void nfsd4_change_callback(struct nfs4_client *clp, struct nfs4_cb_conn *);
> @@ -444,6 +445,7 @@ extern void nfsd4_cb_recall(struct nfs4_delegation *dp);
>  extern int nfsd4_create_callback_queue(void);
>  extern void nfsd4_destroy_callback_queue(void);
>  extern void nfsd4_shutdown_callback(struct nfs4_client *);
> +extern void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp);
>  extern void nfs4_put_delegation(struct nfs4_delegation *dp);
>  extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(const char *name,
>  							struct nfsd_net *nn);
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 09/10] nfsd: Fix delegation revocation
  2014-07-16 19:16     ` Jeff Layton
@ 2014-07-17  9:22       ` Christoph Hellwig
  0 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2014-07-17  9:22 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs

On Wed, Jul 16, 2014 at 03:16:40PM -0400, Jeff Layton wrote:
> > 
> > Seems a little confusing as we didn't turn the code into a REVOKED_DELEG
> > stateid type yet, but given that destroy_revoked_delegation doesn't
> > care this is probably fine.
> > 
> 
> True, but since we're destroying it anyway we don't need to revoke it.
> Still it is less confusing and since I'm respinning for other reasons,
> I can change that too...

It might make sense to just rename it to destroy_delegation and use it
for both types.  There's another opencoded instance later on as well.

> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > 
> > (would have been a little nicer to read if you had kept
> > destroy_revoked_delegation, otherwise diff obsfucates the fact that it
> > didn't change at all..
> 
> Thanks. Do you mean destroy_delegation? destroy_revoked_delegation
> still exists -- destroy_delegation is removed, but it's just a wrapper
> around nfs4_put_delegation at this point...

I meant keept in the same place.  You added a function above it, and
removed one below it, so diff makes a mess of it despite the function not
changing at all.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 02/10] nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg
  2014-07-17  1:34   ` J. Bruce Fields
@ 2014-07-17 10:57     ` Jeff Layton
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2014-07-17 10:57 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: hch, linux-nfs

On Wed, 16 Jul 2014 21:34:41 -0400
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Jul 16, 2014 at 10:31:57AM -0400, Jeff Layton wrote:
> > state_lock is a heavily contended global lock. We don't want to grab
> > that while simultaneously holding the inode->i_lock.
> > 
> > Add a new per-nfs4_file lock that we can use to protect the
> > per-nfs4_file delegation list. Hold that while walking the list in the
> > break_deleg callback and queue the workqueue job for each one.
> > 
> > The workqueue job can then take the state_lock and do the list
> > manipulations without the i_lock being held prior to starting the
> > rpc call.
> 
> Looks fine.
> 
> As a separate issue: does nfs4_state_shutdown net have the same problem
> as destroy_client?  It seems like there should be a cp->dl_time++ there
> too.
> 

I had looked at that before and convinced myself that it wasn't needed,
but now I can't recall why. I think you're correct though -- we ought
to bump it there too. It certainly won't hurt to do so and it might
prevent the same race there.

> Though I'm not sure what the point of that del_recall_lru loop is at
> all.  nfs4_state_destroy_net() destroys each client and its associated
> delegations.  Maybe that function should just be
> 
> 	nfs4_lock_state();
> 	nfs4_client_tracking_exit(net);
> 	nfs4_state_destroy_net(net);
> 	WARN_ON(!list_empty(&nn->del_recall_lru);
> 	nfs4_unlock_state();
> 
> --b.
> 

Yeah, it is odd. I end up relying on that for the "recall_delegations"
fault injector in a later patch in the series, so I'd probably prefer
to leave it as-is for now. We should be able to get rid of it
eventually though.

> > 
> > Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> > Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  fs/nfsd/nfs4callback.c | 25 ++++++++++++++++++----
> >  fs/nfsd/nfs4state.c    | 58
> > +++++++++++++++++++++++++++++++++-----------------
> > fs/nfsd/state.h        |  4 +++- 3 files changed, 62 insertions(+),
> > 25 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index 30a71cb46001..a88a93e09d69 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -933,7 +933,7 @@ void nfsd4_shutdown_callback(struct nfs4_client
> > *clp) set_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
> >  	/*
> >  	 * Note this won't actually result in a null callback;
> > -	 * instead, nfsd4_do_callback_rpc() will detect the killed
> > +	 * instead, nfsd4_run_cb_null() will detect the killed
> >  	 * client, destroy the rpc client, and stop:
> >  	 */
> >  	do_probe_callback(clp);
> > @@ -1011,10 +1011,9 @@ static void nfsd4_process_cb_update(struct
> > nfsd4_callback *cb) run_nfsd4_cb(cb);
> >  }
> >  
> > -void
> > -nfsd4_do_callback_rpc(struct work_struct *w)
> > +static void
> > +nfsd4_run_callback_rpc(struct nfsd4_callback *cb)
> >  {
> > -	struct nfsd4_callback *cb = container_of(w, struct
> > nfsd4_callback, cb_work); struct nfs4_client *clp = cb->cb_clp;
> >  	struct rpc_clnt *clnt;
> >  
> > @@ -1032,6 +1031,24 @@ nfsd4_do_callback_rpc(struct work_struct *w)
> >  			cb->cb_ops, cb);
> >  }
> >  
> > +void
> > +nfsd4_run_cb_null(struct work_struct *w)
> > +{
> > +	struct nfsd4_callback *cb = container_of(w, struct
> > nfsd4_callback,
> > +							cb_work);
> > +	nfsd4_run_callback_rpc(cb);
> > +}
> > +
> > +void
> > +nfsd4_run_cb_recall(struct work_struct *w)
> > +{
> > +	struct nfsd4_callback *cb = container_of(w, struct
> > nfsd4_callback,
> > +							cb_work);
> > +
> > +	nfsd4_prepare_cb_recall(cb->cb_op);
> > +	nfsd4_run_callback_rpc(cb);
> > +}
> > +
> >  void nfsd4_cb_recall(struct nfs4_delegation *dp)
> >  {
> >  	struct nfsd4_callback *cb = &dp->dl_recall;
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 56ea4f12803e..bdf8ac3393bd 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -254,6 +254,8 @@ static void nfsd4_free_file(struct nfs4_file *f)
> >  static inline void
> >  put_nfs4_file(struct nfs4_file *fi)
> >  {
> > +	might_lock(&state_lock);
> > +
> >  	if (atomic_dec_and_lock(&fi->fi_ref, &state_lock)) {
> >  		hlist_del(&fi->fi_hash);
> >  		spin_unlock(&state_lock);
> > @@ -554,6 +556,8 @@ static void block_delegations(struct knfsd_fh
> > *fh) u32 hash;
> >  	struct bloom_pair *bd = &blocked_delegations;
> >  
> > +	lockdep_assert_held(&state_lock);
> > +
> >  	hash = arch_fast_hash(&fh->fh_base, fh->fh_size, 0);
> >  
> >  	__set_bit(hash&255, bd->set[bd->new]);
> > @@ -592,7 +596,7 @@ alloc_init_deleg(struct nfs4_client *clp,
> > struct nfs4_ol_stateid *stp, struct sv fh_copy_shallow(&dp->dl_fh,
> > &current_fh->fh_handle); dp->dl_time = 0;
> >  	atomic_set(&dp->dl_count, 1);
> > -	INIT_WORK(&dp->dl_recall.cb_work, nfsd4_do_callback_rpc);
> > +	INIT_WORK(&dp->dl_recall.cb_work, nfsd4_run_cb_recall);
> >  	return dp;
> >  }
> >  
> > @@ -640,7 +644,9 @@ hash_delegation_locked(struct nfs4_delegation
> > *dp, struct nfs4_file *fp) lockdep_assert_held(&state_lock);
> >  
> >  	dp->dl_stid.sc_type = NFS4_DELEG_STID;
> > +	spin_lock(&fp->fi_lock);
> >  	list_add(&dp->dl_perfile, &fp->fi_delegations);
> > +	spin_unlock(&fp->fi_lock);
> >  	list_add(&dp->dl_perclnt,
> > &dp->dl_stid.sc_client->cl_delegations); }
> >  
> > @@ -648,14 +654,18 @@ hash_delegation_locked(struct nfs4_delegation
> > *dp, struct nfs4_file *fp) static void
> >  unhash_delegation(struct nfs4_delegation *dp)
> >  {
> > +	struct nfs4_file *fp = dp->dl_file;
> > +
> >  	spin_lock(&state_lock);
> >  	list_del_init(&dp->dl_perclnt);
> > -	list_del_init(&dp->dl_perfile);
> >  	list_del_init(&dp->dl_recall_lru);
> > +	spin_lock(&fp->fi_lock);
> > +	list_del_init(&dp->dl_perfile);
> > +	spin_unlock(&fp->fi_lock);
> >  	spin_unlock(&state_lock);
> > -	if (dp->dl_file) {
> > -		nfs4_put_deleg_lease(dp->dl_file);
> > -		put_nfs4_file(dp->dl_file);
> > +	if (fp) {
> > +		nfs4_put_deleg_lease(fp);
> > +		put_nfs4_file(fp);
> >  		dp->dl_file = NULL;
> >  	}
> >  }
> > @@ -1677,7 +1687,7 @@ static struct nfs4_client
> > *create_client(struct xdr_netobj name,
> > spin_unlock(&nn->client_lock); return NULL;
> >  	}
> > -	INIT_WORK(&clp->cl_cb_null.cb_work, nfsd4_do_callback_rpc);
> > +	INIT_WORK(&clp->cl_cb_null.cb_work, nfsd4_run_cb_null);
> >  	clp->cl_time = get_seconds();
> >  	clear_bit(0, &clp->cl_cb_slot_busy);
> >  	copy_verf(clp, verf);
> > @@ -3079,30 +3089,38 @@ nfs4_share_conflict(struct svc_fh
> > *current_fh, unsigned int deny_type) return ret;
> >  }
> >  
> > -static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
> > +void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp)
> >  {
> >  	struct nfs4_client *clp = dp->dl_stid.sc_client;
> >  	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
> >  
> > -	lockdep_assert_held(&state_lock);
> > -	/* We're assuming the state code never drops its reference
> > -	 * without first removing the lease.  Since we're in this
> > lease
> > -	 * callback (and since the lease code is serialized by the
> > kernel
> > -	 * lock) we know the server hasn't removed the lease yet,
> > we know
> > -	 * it's safe to take a reference: */
> > -	atomic_inc(&dp->dl_count);
> > -
> > +	/*
> > +	 * We can't do this in nfsd_break_deleg_cb because it is
> > +	 * already holding inode->i_lock
> > +	 */
> > +	spin_lock(&state_lock);
> > +	block_delegations(&dp->dl_fh);
> >  	/*
> >  	 * If the dl_time != 0, then we know that it has already
> > been
> >  	 * queued for a lease break. Don't queue it again.
> >  	 */
> >  	if (dp->dl_time == 0) {
> > -		list_add_tail(&dp->dl_recall_lru,
> > &nn->del_recall_lru); dp->dl_time = get_seconds();
> > +		list_add_tail(&dp->dl_recall_lru,
> > &nn->del_recall_lru); }
> > +	spin_unlock(&state_lock);
> > +}
> >  
> > -	block_delegations(&dp->dl_fh);
> > -
> > +static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
> > +{
> > +	/*
> > +	 * We're assuming the state code never drops its reference
> > +	 * without first removing the lease.  Since we're in this
> > lease
> > +	 * callback (and since the lease code is serialized by the
> > kernel
> > +	 * lock) we know the server hasn't removed the lease yet,
> > we know
> > +	 * it's safe to take a reference.
> > +	 */
> > +	atomic_inc(&dp->dl_count);
> >  	nfsd4_cb_recall(dp);
> >  }
> >  
> > @@ -3127,11 +3145,11 @@ static void nfsd_break_deleg_cb(struct
> > file_lock *fl) */
> >  	fl->fl_break_time = 0;
> >  
> > -	spin_lock(&state_lock);
> >  	fp->fi_had_conflict = true;
> > +	spin_lock(&fp->fi_lock);
> >  	list_for_each_entry(dp, &fp->fi_delegations, dl_perfile)
> >  		nfsd_break_one_deleg(dp);
> > -	spin_unlock(&state_lock);
> > +	spin_unlock(&fp->fi_lock);
> >  }
> >  
> >  static
> > diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> > index 20857142773f..81b7522e3f67 100644
> > --- a/fs/nfsd/state.h
> > +++ b/fs/nfsd/state.h
> > @@ -436,7 +436,8 @@ extern struct nfs4_client_reclaim
> > *nfsd4_find_reclaim_client(const char *recdir, extern __be32
> > nfs4_check_open_reclaim(clientid_t *clid, struct
> > nfsd4_compound_state *cstate, struct nfsd_net *nn); extern int
> > set_callback_cred(void); -void nfsd4_do_callback_rpc(struct
> > work_struct *w); +void nfsd4_run_cb_null(struct work_struct *w);
> > +void nfsd4_run_cb_recall(struct work_struct *w);
> >  extern void nfsd4_probe_callback(struct nfs4_client *clp);
> >  extern void nfsd4_probe_callback_sync(struct nfs4_client *clp);
> >  extern void nfsd4_change_callback(struct nfs4_client *clp, struct
> > nfs4_cb_conn *); @@ -444,6 +445,7 @@ extern void
> > nfsd4_cb_recall(struct nfs4_delegation *dp); extern int
> > nfsd4_create_callback_queue(void); extern void
> > nfsd4_destroy_callback_queue(void); extern void
> > nfsd4_shutdown_callback(struct nfs4_client *); +extern void
> > nfsd4_prepare_cb_recall(struct nfs4_delegation *dp); extern void
> > nfs4_put_delegation(struct nfs4_delegation *dp); extern struct
> > nfs4_client_reclaim *nfs4_client_to_reclaim(const char *name,
> > struct nfsd_net *nn); -- 
> > 1.9.3
> > 


-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 06/10] nfsd: Protect the nfs4_file delegation fields using the fi_lock
  2014-07-16 19:04     ` Jeff Layton
@ 2014-07-17 14:55       ` Christoph Hellwig
  2014-07-17 15:31         ` Jeff Layton
  0 siblings, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2014-07-17 14:55 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs

On Wed, Jul 16, 2014 at 03:04:01PM -0400, Jeff Layton wrote:
> Hmm...maybe. It's a very unlikely race though, and I think the same
> sort of race exists today. In fact, it's worse today since we don't do
> any checking of the validity of the lease after acquiring it now.
> 
> The flag sounds like a good idea, but the code is structured completely
> wrong for it currently. The delegation is only hashed after we get a
> lease, so the lease break wouldn't find anything to set a flag on.
> 
> Quite frankly, I _really_ do not want to have to rework the locking in
> the delegation code yet again. I think this scheme is an improvement
> over what we have now, even if it's not 100% perfect.
> 
> Once we get the scalability set done, I'd like to go back and overhaul
> the delegation code. There are a lot of ugly warts here, but fixing
> them is really a separate project in its own right.

But in the old code we had the client lock over all of this, right?

Anyway, if Bruce and you are fine with this I'm not going to block it,
although it seems a little incomplete to me.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 06/10] nfsd: Protect the nfs4_file delegation fields using the fi_lock
  2014-07-17 14:55       ` Christoph Hellwig
@ 2014-07-17 15:31         ` Jeff Layton
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2014-07-17 15:31 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jeff Layton, bfields, linux-nfs

On Thu, 17 Jul 2014 07:55:34 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Wed, Jul 16, 2014 at 03:04:01PM -0400, Jeff Layton wrote:
> > Hmm...maybe. It's a very unlikely race though, and I think the same
> > sort of race exists today. In fact, it's worse today since we don't do
> > any checking of the validity of the lease after acquiring it now.
> > 
> > The flag sounds like a good idea, but the code is structured completely
> > wrong for it currently. The delegation is only hashed after we get a
> > lease, so the lease break wouldn't find anything to set a flag on.
> > 
> > Quite frankly, I _really_ do not want to have to rework the locking in
> > the delegation code yet again. I think this scheme is an improvement
> > over what we have now, even if it's not 100% perfect.
> > 
> > Once we get the scalability set done, I'd like to go back and overhaul
> > the delegation code. There are a lot of ugly warts here, but fixing
> > them is really a separate project in its own right.
> 
> But in the old code we had the client lock over all of this, right?
> 

Yes, but it's still possible for the lease to get broken after setting
it but before the delegation is hashed in the existing code. You're
correct though that another nfsd wouldn't be able to set a lease on the
same file though.

One possibility to ensure that can't happen is to check whether
fp->fi_had_conflict became true after vfs_setlease returns. That would
also obviate the need for file_has_lease. I'm still trying to figure
out whether there are potential races there with that approach though.

> Anyway, if Bruce and you are fine with this I'm not going to block it,
> although it seems a little incomplete to me.

Thanks, and I agree. I think we're going to need to revisit this area,
but I suspect we need to do some surgery to the generic lease handling
code too.

For one thing, it's a little disturbing to me that vfs_setlease returns
a pointer to the file_lock that's sitting on the i_flock list. I'm not
sure that it's a problem today, but it seems like a caller could end up
trying to use that pointer after time_out_leases has been called, or
after the lease has been otherwise deleted.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 04/10] nfsd: Ensure stateids remain unique until they are freed
  2014-07-16 14:31 ` [PATCH v2 04/10] nfsd: Ensure stateids remain unique until they are freed Jeff Layton
@ 2014-07-17 18:44   ` J. Bruce Fields
  2014-07-17 18:46     ` Jeff Layton
  0 siblings, 1 reply; 25+ messages in thread
From: J. Bruce Fields @ 2014-07-17 18:44 UTC (permalink / raw)
  To: Jeff Layton; +Cc: hch, linux-nfs

On Wed, Jul 16, 2014 at 10:31:59AM -0400, Jeff Layton wrote:
> From: Trond Myklebust <trond.myklebust@primarydata.com>
> 
> Add an extra delegation state to allow the stateid to remain in the idr
> tree until the last reference has been released. This will be necessary
> to ensure uniqueness once the client_mutex is removed.

Cyclic idr allocation should make reuse extremely unlikely, shouldn't
it?

Probably needs to be fixed anyway, applying.

--b.

> 
> [jlayton: reset the sc_type under the state_lock in unhash_delegation]
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/nfsd/nfs4state.c | 8 ++++----
>  fs/nfsd/state.h     | 1 +
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 1b01a20827ab..fd4deb049ddf 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -616,6 +616,7 @@ void
>  nfs4_put_delegation(struct nfs4_delegation *dp)
>  {
>  	if (atomic_dec_and_test(&dp->dl_count)) {
> +		remove_stid(&dp->dl_stid);
>  		nfs4_free_stid(deleg_slab, &dp->dl_stid);
>  		num_delegations--;
>  	}
> @@ -657,6 +658,7 @@ unhash_delegation(struct nfs4_delegation *dp)
>  	struct nfs4_file *fp = dp->dl_file;
>  
>  	spin_lock(&state_lock);
> +	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
>  	list_del_init(&dp->dl_perclnt);
>  	list_del_init(&dp->dl_recall_lru);
>  	spin_lock(&fp->fi_lock);
> @@ -670,19 +672,15 @@ unhash_delegation(struct nfs4_delegation *dp)
>  	}
>  }
>  
> -
> -
>  static void destroy_revoked_delegation(struct nfs4_delegation *dp)
>  {
>  	list_del_init(&dp->dl_recall_lru);
> -	remove_stid(&dp->dl_stid);
>  	nfs4_put_delegation(dp);
>  }
>  
>  static void destroy_delegation(struct nfs4_delegation *dp)
>  {
>  	unhash_delegation(dp);
> -	remove_stid(&dp->dl_stid);
>  	nfs4_put_delegation(dp);
>  }
>  
> @@ -4036,7 +4034,9 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
>  		return nfs_ok;
>  	default:
>  		printk("unknown stateid type %x\n", s->sc_type);
> +		/* Fallthrough */
>  	case NFS4_CLOSED_STID:
> +	case NFS4_CLOSED_DELEG_STID:
>  		return nfserr_bad_stateid;
>  	}
>  }
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 81b7522e3f67..996d61eeb357 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -80,6 +80,7 @@ struct nfs4_stid {
>  #define NFS4_CLOSED_STID 8
>  /* For a deleg stateid kept around only to process free_stateid's: */
>  #define NFS4_REVOKED_DELEG_STID 16
> +#define NFS4_CLOSED_DELEG_STID 32
>  	unsigned char sc_type;
>  	stateid_t sc_stateid;
>  	struct nfs4_client *sc_client;
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 04/10] nfsd: Ensure stateids remain unique until they are freed
  2014-07-17 18:44   ` J. Bruce Fields
@ 2014-07-17 18:46     ` Jeff Layton
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2014-07-17 18:46 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: hch, linux-nfs

On Thu, 17 Jul 2014 14:44:17 -0400
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Jul 16, 2014 at 10:31:59AM -0400, Jeff Layton wrote:
> > From: Trond Myklebust <trond.myklebust@primarydata.com>
> > 
> > Add an extra delegation state to allow the stateid to remain in the idr
> > tree until the last reference has been released. This will be necessary
> > to ensure uniqueness once the client_mutex is removed.
> 
> Cyclic idr allocation should make reuse extremely unlikely, shouldn't
> it?
> 
> Probably needs to be fixed anyway, applying.
> 
> --b.
> 

Yes, for the most part, but reuse is still possible once the counter
wraps if you had a long-lived stateid (for instance). Not a likely
problem, but we might as well fix it while we're in here.
 
> > 
> > [jlayton: reset the sc_type under the state_lock in
> > unhash_delegation]
> > 
> > Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> > Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  fs/nfsd/nfs4state.c | 8 ++++----
> >  fs/nfsd/state.h     | 1 +
> >  2 files changed, 5 insertions(+), 4 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 1b01a20827ab..fd4deb049ddf 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -616,6 +616,7 @@ void
> >  nfs4_put_delegation(struct nfs4_delegation *dp)
> >  {
> >  	if (atomic_dec_and_test(&dp->dl_count)) {
> > +		remove_stid(&dp->dl_stid);
> >  		nfs4_free_stid(deleg_slab, &dp->dl_stid);
> >  		num_delegations--;
> >  	}
> > @@ -657,6 +658,7 @@ unhash_delegation(struct nfs4_delegation *dp)
> >  	struct nfs4_file *fp = dp->dl_file;
> >  
> >  	spin_lock(&state_lock);
> > +	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> >  	list_del_init(&dp->dl_perclnt);
> >  	list_del_init(&dp->dl_recall_lru);
> >  	spin_lock(&fp->fi_lock);
> > @@ -670,19 +672,15 @@ unhash_delegation(struct nfs4_delegation *dp)
> >  	}
> >  }
> >  
> > -
> > -
> >  static void destroy_revoked_delegation(struct nfs4_delegation *dp)
> >  {
> >  	list_del_init(&dp->dl_recall_lru);
> > -	remove_stid(&dp->dl_stid);
> >  	nfs4_put_delegation(dp);
> >  }
> >  
> >  static void destroy_delegation(struct nfs4_delegation *dp)
> >  {
> >  	unhash_delegation(dp);
> > -	remove_stid(&dp->dl_stid);
> >  	nfs4_put_delegation(dp);
> >  }
> >  
> > @@ -4036,7 +4034,9 @@ static __be32 nfsd4_validate_stateid(struct
> > nfs4_client *cl, stateid_t *stateid) return nfs_ok;
> >  	default:
> >  		printk("unknown stateid type %x\n", s->sc_type);
> > +		/* Fallthrough */
> >  	case NFS4_CLOSED_STID:
> > +	case NFS4_CLOSED_DELEG_STID:
> >  		return nfserr_bad_stateid;
> >  	}
> >  }
> > diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> > index 81b7522e3f67..996d61eeb357 100644
> > --- a/fs/nfsd/state.h
> > +++ b/fs/nfsd/state.h
> > @@ -80,6 +80,7 @@ struct nfs4_stid {
> >  #define NFS4_CLOSED_STID 8
> >  /* For a deleg stateid kept around only to process free_stateid's:
> > */ #define NFS4_REVOKED_DELEG_STID 16
> > +#define NFS4_CLOSED_DELEG_STID 32
> >  	unsigned char sc_type;
> >  	stateid_t sc_stateid;
> >  	struct nfs4_client *sc_client;
> > -- 
> > 1.9.3
> > 


-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2014-07-17 18:46 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-16 14:31 [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton
2014-07-16 14:31 ` [PATCH v2 01/10] nfsd: eliminate nfsd4_init_callback Jeff Layton
2014-07-16 14:31 ` [PATCH v2 02/10] nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg Jeff Layton
2014-07-17  1:34   ` J. Bruce Fields
2014-07-17 10:57     ` Jeff Layton
2014-07-16 14:31 ` [PATCH v2 03/10] nfsd: nfs4_alloc_init_lease should take a nfs4_file arg Jeff Layton
2014-07-16 14:31 ` [PATCH v2 04/10] nfsd: Ensure stateids remain unique until they are freed Jeff Layton
2014-07-17 18:44   ` J. Bruce Fields
2014-07-17 18:46     ` Jeff Layton
2014-07-16 14:32 ` [PATCH v2 05/10] locks: add file_has_lease to prevent delegation break races Jeff Layton
2014-07-16 14:32 ` [PATCH v2 06/10] nfsd: Protect the nfs4_file delegation fields using the fi_lock Jeff Layton
2014-07-16 18:09   ` Christoph Hellwig
2014-07-16 19:04     ` Jeff Layton
2014-07-17 14:55       ` Christoph Hellwig
2014-07-17 15:31         ` Jeff Layton
2014-07-16 14:32 ` [PATCH v2 07/10] nfsd: Move the delegation reference counter into the struct nfs4_stid Jeff Layton
2014-07-16 14:32 ` [PATCH v2 08/10] nfsd: Simplify stateid management Jeff Layton
2014-07-16 18:10   ` Christoph Hellwig
2014-07-16 14:32 ` [PATCH v2 09/10] nfsd: Fix delegation revocation Jeff Layton
2014-07-16 18:30   ` Christoph Hellwig
2014-07-16 19:16     ` Jeff Layton
2014-07-17  9:22       ` Christoph Hellwig
2014-07-16 14:32 ` [PATCH v2 10/10] nfsd: Convert delegation counter to an atomic_long_t type Jeff Layton
2014-07-16 18:11   ` Christoph Hellwig
2014-07-16 14:33 ` [PATCH v2 0/8] nfsd: more delegation fixes to prepare for client_mutex removal Jeff Layton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.