* [PATCH v4 000/101] nfsd: eliminate the client_mutex
@ 2014-07-08 18:02 Jeff Layton
  2014-07-08 18:02 ` [PATCH v4 001/100] nfsd: close potential race between delegation break and laundromat Jeff Layton
                   ` (99 more replies)
  0 siblings, 100 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

v4 significant changes:

- rebased again on top of Bruce's for-3.17 branch

- the patch to add lockdep_assert_not_held has been dropped. There
  was already the same functionality with might_lock(), so the code
  has been switched to use that instead.

- fix for potential races between delegation break callbacks and the
  laundromat has been added. We use the dl_time value as a way to mark
  whether a delegation has been queued to the nn->del_recall_lru list
  at least once. The fault injection code has some similar fixes in a
  token effort to make it less racy.

- put_client_renew now no longer needs to take the client_lock unless
  the refcount is going to zero. put_client also doesn't need the
  client_lock.

- fixed a bug in the error handling when nfs4_set_delegation returns
  an error. The code tried to unhash a delegation that had never been
  hashed and could end up dereferencing a bogus sc_file pointer.

- a bug in the handling of the block_delegations call has been fixed.
  It needs to be called under the state_lock. I also added a lockdep
  assertion for that.

v3 changes:
- rebased on top of Bruce's for-3.17 branch

- addressed a number of Christoph's review comments. I've generally kept
  the Reviewed-by's intact when I thought I was changing things along the
  lines that he suggested, but please glance over the results to be
  sure that I did.

- some more reordering of patches. Some more have been moved near the
  front when they don't depend on other changes. I've also tried to group
  them a little more logically so that patches that touch related areas
  are together.

- second pass at overhauling deny handling. This one should close all of
  the potential races with the fi_share_deny field. There are also a
  number of related cleanups to the deny handling.

- st_access_bmap and st_deny_bmap have been shrunk to a byte each, which
  should help reduce the stateid memory footprint.

- scrapped the Documentation/ file and moved most of its content into
  comments above the respective data structures.

v2 changes:
- rebased on top of v3.16-rc2

- fixed up checkpatch warnings (I'm really starting to hate that 80 column
  limit warning)

- fleshed out patch descriptions. Most of them should now say they are a
  necessary step toward client_mutex removal when it's not otherwise
  obvious. Also, when things touch outside of fs/nfsd, I added Cc lines
  for the appropriate maintainers.

- reordered patches to put more of the ones that don't affect locking
  near the front of the queue. This may make it easier to merge this
  piecemeal.

- I think I have addressed all of Christoph's review comments -- let me
  know if I missed any. For now, I left the Documentation/ patch intact,
  but we don't need to merge it at all if it's objectionable. I may end
  up transplanting it into comments but I ran short of time so I'll defer
  it for now.

- fix race that can occur between concurrent FREE_STATEID and CLOSE. As
  part of that fix, the cl_lock thrashing (and ensuing races) that could
  occur when a stateowner was released has also been eliminated.

- overhaul of access/deny mode handling. Christoph was correct to be
  suspicious. It didn't properly handle the case where a stateid with a
  deny mode was released or downgraded. As a bonus, the new code should be
  much more efficient when you have a long list of stateids as we no longer
  need to walk the entire list to check for deny mode conflicts. I also
  did some cleanup of the file access handling.

- ensure that dl_recall_lru list entries are dequeued before calling
  revoke_delegation (potential memory corruptor).

- Included Christoph's fix for the file access leak when nfsd4_truncate
  fails. I took the liberty of adding a commit log message for it and a
  SoB line. Let me know if that's a problem and we can rework it.


This time, I'm just posting what hasn't already been merged into Bruce's
for-3.17 branch. I'll plan to keep the following branch updated with the
latest set:

    http://git.samba.org/?p=jlayton/linux.git;a=shortlog;h=refs/heads/nfsd-devel

Original cover letter text follows:

-----------------------[snip]--------------------------

Here it is: the long-awaited removal of the client_mutex from knfsd. As
many of us are aware, one of the major bottlenecks in NFSv4 serving is
the fact that all compounds are processed while holding a single, global
mutex.

This has an obvious detrimental effect on scalability. I've heard
anecdotal reports of 10x slowdowns with v4 serving vs. v3 on the same
machine, primarily due to it.

This patchset eliminates that mutex and (hopefully!) the bottleneck that
it imposes. The basic idea is to add refcounting to most of the objects
that compounds deal with to ensure that they are pinned while in use.
Spinlocks are used to protect things like the hashtables and trees that
track the objects.
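
As a rough userspace illustration of that idea (not code from this series),
the sketch below pins an object by taking a reference while the lock that
protects the lookup table is still held. The names struct obj, obj_hash,
obj_find_get and obj_put are hypothetical, and a pthread mutex stands in for
a kernel spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object; stands in for a client, stateid, etc. */
struct obj {
	unsigned int id;
	atomic_int refcount;
	struct obj *next;	/* hash chain link */
};

#define NBUCKETS 8
static struct obj *table[NBUCKETS];
/* Plays the role of the spinlock protecting the hashtable. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hash an object into the table; the table owns the initial reference. */
static void obj_hash(struct obj *o, unsigned int id)
{
	o->id = id;
	atomic_init(&o->refcount, 1);
	pthread_mutex_lock(&table_lock);
	o->next = table[id % NBUCKETS];
	table[id % NBUCKETS] = o;
	pthread_mutex_unlock(&table_lock);
}

/* Look up and pin: the reference is taken before the lock is dropped. */
static struct obj *obj_find_get(unsigned int id)
{
	struct obj *o;

	pthread_mutex_lock(&table_lock);
	for (o = table[id % NBUCKETS]; o; o = o->next) {
		if (o->id == id) {
			atomic_fetch_add(&o->refcount, 1);
			break;
		}
	}
	pthread_mutex_unlock(&table_lock);
	return o;	/* NULL if not found */
}

/* Drop a reference; returns 1 if this was the last one. */
static int obj_put(struct obj *o)
{
	return atomic_fetch_sub(&o->refcount, 1) == 1;
}
```

Because the increment happens under table_lock, a concurrent unhash cannot
free the object between the lookup and the caller's first use of it.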

Benny started this set quite some time ago, and Trond took up the torch
early this spring. He then handed it to me to clean up the remaining
bits about a month ago.

Benny Halevy (1):
  nfsd4: use cl_lock to synchronize all stateid idr calls

Jeff Layton (52):
  nfsd: close potential race between delegation break and laundromat
  nfsd: reduce some spinlocking in put_client_renew
  nfsd: Avoid taking state_lock while holding inode lock in
    nfsd_break_one_deleg
  nfsd: refactor nfs4_file_get_access and nfs4_file_put_access
  nfsd: remove nfs4_file_put_fd
  nfsd: shrink st_access_bmap and st_deny_bmap
  nfsd: set stateid access and deny bits in nfs4_get_vfs_file
  nfsd: clean up reset_union_bmap_deny
  nfsd: always hold the fi_lock when bumping fi_access refcounts
  nfsd: make deny mode enforcement more efficient and close races in it
  nfsd: cleanup and rename nfs4_check_open
  locks: add file_has_lease to prevent delegation break races
  nfsd: nfs4_alloc_init_lease should take a nfs4_file arg
  nfsd: Protect the nfs4_file delegation fields using the fi_lock
  nfsd: Fix delegation revocation
  nfsd: Ensure atomicity of stateid destruction and idr tree removal
  nfsd: Cleanup the freeing of stateids
  nfsd: do filp_close in sc_free callback for lock stateids
  nfsd: Add locking to protect the state owner lists
  nfsd: clean up races in lock stateid searching and creation
  nfsd: ensure atomicity in nfsd4_free_stateid and
    nfsd4_validate_stateid
  nfsd: clean up lockowner refcounting when finding them
  nfsd: add an operation for unhashing a stateowner
  nfsd: clean up refcounting for lockowners
  nfsd: make openstateids hold references to their openowners
  nfsd: don't allow CLOSE to proceed until refcount on stateid drops
  nfsd: clean up and reorganize release_lockowner
  nfsd: add locking to stateowner release
  nfsd: optimize destroy_lockowner cl_lock thrashing
  nfsd: close potential race in nfsd4_free_stateid
  nfsd: reduce cl_lock thrashing in release_openowner
  nfsd: don't thrash the cl_lock while freeing an open stateid
  nfsd: Protect session creation and client confirm using client_lock
  nfsd: protect the close_lru list and oo_last_closed_stid with
    client_lock
  nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock
  nfsd: move unhash_client_locked call into mark_client_expired_locked
  nfsd: don't destroy client if mark_client_expired_locked fails
  nfsd: don't destroy clients that are busy
  nfsd: protect clid and verifier generation with client_lock
  nfsd: abstract out the get and set routines into the fault injection
    ops
  nfsd: add a forget_clients "get" routine with proper locking
  nfsd: add a forget_client set_clnt routine
  nfsd: add nfsd_inject_forget_clients
  nfsd: add a list_head arg to nfsd_foreach_client_lock
  nfsd: add more granular locking to forget_locks fault injector
  nfsd: add more granular locking to forget_openowners fault injector
  nfsd: add more granular locking to *_delegations fault injectors
  nfsd: remove old fault injection infrastructure
  nfsd: remove nfs4_lock_state: nfs4_laundromat
  nfsd: remove nfs4_lock_state: nfs4_state_shutdown_net
  nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers
  nfsd: add some comments to the nfsd4 object definitions

Trond Myklebust (47):
  nfsd: Ensure stateids remain unique until they are freed
  nfsd: Move the delegation reference counter into the struct nfs4_stid
  nfsd: Add fine grained protection for the nfs4_file->fi_stateids list
  nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache
  nfsd: Add locking to the nfs4_file->fi_fds[] array
  nfsd: clean up helper __release_lock_stateid
  nfsd: Simplify stateid management
  nfsd: Add reference counting to the lock and open stateids
  nfsd: Add a struct nfs4_file field to struct nfs4_stid
  nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file
  nfsd: Convert delegation counter to an atomic_long_t type
  nfsd: Slight cleanup of find_stateid()
  nfsd: Add reference counting to lock stateids
  nfsd: nfsd4_locku() must reference the lock stateid
  nfsd: Ensure that nfs4_open_delegation() references the delegation
    stateid
  nfsd: nfsd4_process_open2() must reference the delegation stateid
  nfsd: nfsd4_process_open2() must reference the open stateid
  nfsd: Prepare nfsd4_close() for open stateid referencing
  nfsd: nfsd4_open_confirm() must reference the open stateid
  nfsd: Add reference counting to nfs4_preprocess_confirmed_seqid_op
  nfsd: Migrate the stateid reference into nfs4_preprocess_seqid_op
  nfsd: Migrate the stateid reference into nfs4_lookup_stateid()
  nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type()
  nfsd: Add reference counting to state owners
  nfsd: Keep a reference to the open stateid for the NFSv4.0 replay
    cache
  nfsd: Make lock stateid take a reference to the lockowner
  nfsd: Protect adding/removing open state owners using client_lock
  nfsd: Protect adding/removing lock owners using client_lock
  nfsd: Move the open owner hash table into struct nfs4_client
  nfsd: Ensure struct nfs4_client is unhashed before we try to destroy
    it
  nfsd: Ensure that the laundromat unhashes the client before releasing
    locks
  nfsd: Don't require client_lock in free_client
  nfsd: Move create_client() call outside the lock
  nfsd: Protect unconfirmed client creation using client_lock
  nfsd: Protect nfsd4_destroy_clientid using client_lock
  nfsd: Ensure lookup_clientid() takes client_lock
  nfsd: Add lockdep assertions to document the nfs4_client/session
    locking
  nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op()
  nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateid
  nfsd: Remove nfs4_lock_state(): nfsd4_release_lockowner
  nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt()
  nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_close
  nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn()
  nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm
  nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session()
  nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm,
    renew
  nfsd: Remove nfs4_lock_state(): reclaim_complete()

 fs/locks.c             |   26 +
 fs/nfsd/fault_inject.c |  130 +--
 fs/nfsd/netns.h        |   15 +-
 fs/nfsd/nfs4callback.c |   28 +-
 fs/nfsd/nfs4proc.c     |   13 +-
 fs/nfsd/nfs4state.c    | 2547 ++++++++++++++++++++++++++++++++++--------------
 fs/nfsd/nfs4xdr.c      |    2 -
 fs/nfsd/state.h        |  170 +++-
 fs/nfsd/xdr4.h         |    5 +-
 include/linux/fs.h     |    6 +
 10 files changed, 2040 insertions(+), 902 deletions(-)

-- 
1.9.3



* [PATCH v4 001/100] nfsd: close potential race between delegation break and laundromat
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
@ 2014-07-08 18:02 ` Jeff Layton
  2014-07-10 15:59   ` J. Bruce Fields
  2014-07-08 18:02 ` [PATCH v4 002/100] nfsd: reduce some spinlocking in put_client_renew Jeff Layton
                   ` (98 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Bruce says:

    There's also a preexisting expire_client/laundromat vs break race:

    - expire_client/laundromat adds a delegation to its local
      reaplist using the same dl_recall_lru field that a delegation
      uses to track its position on the recall lru and drops the
      state lock.

    - a concurrent break_lease adds the delegation to the lru.

    - expire_client/laundromat then walks its reaplist and sees the
      lru head as just another delegation on the list....

Fix this race by checking the dl_time under the state_lock. If we find
that it's not 0, then we know that it has already been queued to the LRU
list and that we shouldn't queue it again.

In the case of destroy_client, we must also ensure that we don't hit
similar races by ensuring that we don't move any delegations to the
reaplist with a dl_time of 0. Just bump the dl_time by one before we
drop the state_lock. We're destroying the delegations anyway, so a 1s
difference there won't matter.
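
To make the dl_time convention concrete, here is a simplified userspace
sketch (not the patch itself). The struct lru type and the helper names
queue_for_recall and claim_for_reap are made up for illustration, the list
is a toy singly linked queue, and the caller is assumed to hold whatever
lock protects both the list and dl_time.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

struct deleg {
	time_t dl_time;		/* 0 means "never queued for recall" */
	struct deleg *next;	/* simplified singly linked queue */
};

struct lru {
	struct deleg *head, *tail;
	size_t len;
};

/*
 * Queue dp for recall at most once. Returns 1 if it was queued, 0 if
 * dl_time shows someone already queued or claimed it. "now" must be
 * non-zero.
 */
static int queue_for_recall(struct lru *lru, struct deleg *dp, time_t now)
{
	if (dp->dl_time != 0)
		return 0;	/* leave the list linkage alone */
	dp->dl_time = now;
	dp->next = NULL;
	if (lru->tail)
		lru->tail->next = dp;
	else
		lru->head = dp;
	lru->tail = dp;
	lru->len++;
	return 1;
}

/* Teardown path: bump dl_time so a later break will not requeue dp. */
static void claim_for_reap(struct deleg *dp)
{
	++dp->dl_time;
}
```

A delegation that has been claimed for reaping carries a non-zero dl_time,
so a racing break sees it as already handled and does not touch the list.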

The fault injection code also requires a bit of surgery here:

First, in the case of nfsd_forget_client_delegations, we must prevent
the same sort of race vs. the delegation break callback. For that, we
just increment the dl_time to ensure that a delegation callback can't
race in while we're working on it.

We can't do that for nfsd_recall_client_delegations, as we need to have
it actually queue the delegation, and that won't happen if we increment
the dl_time. The state lock is held over that function, so we don't need
to worry about these sorts of races there.

There is one other potential bug in nfsd_recall_client_delegations though.
Entries on the victims list are not dequeued before calling
nfsd_break_one_deleg. That's a potential list corruptor, so ensure that
we do that there.

Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 40 +++++++++++++++++++++++++++++++++-------
 1 file changed, 33 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 763aeeb67ccf..633b34fd6c92 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1287,6 +1287,8 @@ destroy_client(struct nfs4_client *clp)
 	while (!list_empty(&clp->cl_delegations)) {
 		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
 		list_del_init(&dp->dl_perclnt);
+		/* Ensure that deleg break won't try to requeue it */
+		++dp->dl_time;
 		list_move(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
@@ -2933,10 +2935,14 @@ static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
 	 * it's safe to take a reference: */
 	atomic_inc(&dp->dl_count);
 
-	list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
-
-	/* Only place dl_time is set; protected by i_lock: */
-	dp->dl_time = get_seconds();
+	/*
+	 * If the dl_time != 0, then we know that it has already been
+	 * queued for a lease break. Don't queue it again.
+	 */
+	if (dp->dl_time == 0) {
+		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
+		dp->dl_time = get_seconds();
+	}
 
 	block_delegations(&dp->dl_fh);
 
@@ -5081,8 +5087,23 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
 
 	lockdep_assert_held(&state_lock);
 	list_for_each_entry_safe(dp, next, &clp->cl_delegations, dl_perclnt) {
-		if (victims)
+		if (victims) {
+			/*
+			 * It's not safe to mess with delegations that have a
+			 * non-zero dl_time. They might have already been broken
+			 * and could be processed by the laundromat outside of
+			 * the state_lock. Just leave them be.
+			 */
+			if (dp->dl_time != 0)
+				continue;
+
+			/*
+			 * Increment dl_time to ensure that delegation breaks
+			 * don't monkey with it now that we are.
+			 */
+			++dp->dl_time;
 			list_move(&dp->dl_recall_lru, victims);
+		}
 		if (++count == max)
 			break;
 	}
@@ -5107,14 +5128,19 @@ u64 nfsd_forget_client_delegations(struct nfs4_client *clp, u64 max)
 
 u64 nfsd_recall_client_delegations(struct nfs4_client *clp, u64 max)
 {
-	struct nfs4_delegation *dp, *next;
+	struct nfs4_delegation *dp;
 	LIST_HEAD(victims);
 	u64 count;
 
 	spin_lock(&state_lock);
 	count = nfsd_find_all_delegations(clp, max, &victims);
-	list_for_each_entry_safe(dp, next, &victims, dl_recall_lru)
+	while (!list_empty(&victims)) {
+		dp = list_first_entry(&victims, struct nfs4_delegation,
+					dl_recall_lru);
+		list_del_init(&dp->dl_recall_lru);
+		dp->dl_time = 0;
 		nfsd_break_one_deleg(dp);
+	}
 	spin_unlock(&state_lock);
 
 	return count;
-- 
1.9.3



* [PATCH v4 002/100] nfsd: reduce some spinlocking in put_client_renew
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
  2014-07-08 18:02 ` [PATCH v4 001/100] nfsd: close potential race between delegation break and laundromat Jeff Layton
@ 2014-07-08 18:02 ` Jeff Layton
  2014-07-10 11:18   ` Christoph Hellwig
  2014-07-08 18:02 ` [PATCH v4 003/100] nfsd: Ensure stateids remain unique until they are freed Jeff Layton
                   ` (97 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

No need to take the lock unless the count goes to 0.
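
For readers unfamiliar with atomic_dec_and_lock(), the following userspace
sketch approximates its contract with C11 atomics and a pthread mutex: the
lock is only taken when the count might reach zero, and it is returned held
only if it did. The name dec_and_lock is a stand-in, and this is an
illustration rather than the kernel implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Decrement *cnt. Return true with *lock held only if the count hit
 * zero; otherwise return false with the lock not held.
 */
static bool dec_and_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
	int old = atomic_load(cnt);

	/* Fast path: while we are not the last holder, just decrement. */
	while (old > 1) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return false;
	}

	/* Slow path: we may be the last holder, so decide under the lock. */
	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(cnt, 1) == 1)
		return true;
	pthread_mutex_unlock(lock);
	return false;
}
```

With this shape, the common case of dropping a non-final reference never
contends on the lock at all.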

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 633b34fd6c92..8aa57265fb08 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -193,8 +193,10 @@ static void put_client_renew(struct nfs4_client *clp)
 {
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 
-	spin_lock(&nn->client_lock);
-	put_client_renew_locked(clp);
+	if (!atomic_dec_and_lock(&clp->cl_refcount, &nn->client_lock))
+		return;
+	if (!is_client_expired(clp))
+		renew_client_locked(clp);
 	spin_unlock(&nn->client_lock);
 }
 
-- 
1.9.3



* [PATCH v4 003/100] nfsd: Ensure stateids remain unique until they are freed
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
  2014-07-08 18:02 ` [PATCH v4 001/100] nfsd: close potential race between delegation break and laundromat Jeff Layton
  2014-07-08 18:02 ` [PATCH v4 002/100] nfsd: reduce some spinlocking in put_client_renew Jeff Layton
@ 2014-07-08 18:02 ` Jeff Layton
  2014-07-10 11:23   ` Christoph Hellwig
  2014-07-08 18:02 ` [PATCH v4 004/100] nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg Jeff Layton
                   ` (96 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Add an extra delegation state to allow the stateid to remain in the idr
tree until the last reference has been released. This will be necessary
to ensure uniqueness once the client_mutex is removed.
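
A toy, single-threaded model of the intended lifetime is sketched below. It
assumes a small array in place of the idr tree, and the names stid_alloc,
stid_unhash, stid_lookup and stid_put are hypothetical. The point is only
that the ID slot is released by the final put, not at unhash time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_IDS 4

struct stid {
	int id;
	int closed;		/* set at unhash time; lookups then reject it */
	atomic_int refcount;
};

static struct stid *id_table[MAX_IDS];	/* stand-in for the idr tree */

/* Claim the lowest free slot; returns -1 when every ID is still in use. */
static int stid_alloc(struct stid *s)
{
	for (int i = 0; i < MAX_IDS; i++) {
		if (!id_table[i]) {
			id_table[i] = s;
			s->id = i;
			s->closed = 0;
			atomic_init(&s->refcount, 1);
			return i;
		}
	}
	return -1;
}

/* Unhash: mark the entry closed but keep its ID reserved. */
static void stid_unhash(struct stid *s)
{
	s->closed = 1;
}

/* Lookup fails for closed entries even though the slot is occupied. */
static struct stid *stid_lookup(int id)
{
	struct stid *s = (id >= 0 && id < MAX_IDS) ? id_table[id] : NULL;

	return (s && !s->closed) ? s : NULL;
}

/* The final put releases the ID; only then can it be handed out again. */
static void stid_put(struct stid *s)
{
	if (atomic_fetch_sub(&s->refcount, 1) == 1)
		id_table[s->id] = NULL;
}
```

While any reference remains, a new allocation cannot be given the same ID,
which is the uniqueness property the extra "closed" state is meant to keep.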

[jlayton: reset the sc_type under the state_lock in unhash_delegation]

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 9 ++++-----
 fs/nfsd/state.h     | 1 +
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 8aa57265fb08..15d59e063885 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -496,6 +496,7 @@ static void remove_stid(struct nfs4_stid *s)
 
 static void nfs4_free_stid(struct kmem_cache *slab, struct nfs4_stid *s)
 {
+	remove_stid(s);
 	kmem_cache_free(slab, s);
 }
 
@@ -540,6 +541,7 @@ static void
 unhash_delegation(struct nfs4_delegation *dp)
 {
 	spin_lock(&state_lock);
+	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
 	list_del_init(&dp->dl_perclnt);
 	list_del_init(&dp->dl_perfile);
 	list_del_init(&dp->dl_recall_lru);
@@ -551,19 +553,15 @@ unhash_delegation(struct nfs4_delegation *dp)
 	}
 }
 
-
-
 static void destroy_revoked_delegation(struct nfs4_delegation *dp)
 {
 	list_del_init(&dp->dl_recall_lru);
-	remove_stid(&dp->dl_stid);
 	nfs4_put_delegation(dp);
 }
 
 static void destroy_delegation(struct nfs4_delegation *dp)
 {
 	unhash_delegation(dp);
-	remove_stid(&dp->dl_stid);
 	nfs4_put_delegation(dp);
 }
 
@@ -720,7 +718,6 @@ static void close_generic_stateid(struct nfs4_ol_stateid *stp)
 
 static void free_generic_stateid(struct nfs4_ol_stateid *stp)
 {
-	remove_stid(&stp->st_stid);
 	nfs4_free_stid(stateid_slab, &stp->st_stid);
 }
 
@@ -3814,7 +3811,9 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
 		return nfs_ok;
 	default:
 		printk("unknown stateid type %x\n", s->sc_type);
+		/* Fallthrough */
 	case NFS4_CLOSED_STID:
+	case NFS4_CLOSED_DELEG_STID:
 		return nfserr_bad_stateid;
 	}
 }
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 82d8f0aa072e..0776e7c4c10a 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -80,6 +80,7 @@ struct nfs4_stid {
 #define NFS4_CLOSED_STID 8
 /* For a deleg stateid kept around only to process free_stateid's: */
 #define NFS4_REVOKED_DELEG_STID 16
+#define NFS4_CLOSED_DELEG_STID 32
 	unsigned char sc_type;
 	stateid_t sc_stateid;
 	struct nfs4_client *sc_client;
-- 
1.9.3



* [PATCH v4 004/100] nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (2 preceding siblings ...)
  2014-07-08 18:02 ` [PATCH v4 003/100] nfsd: Ensure stateids remain unique until they are freed Jeff Layton
@ 2014-07-08 18:02 ` Jeff Layton
  2014-07-08 18:02 ` [PATCH v4 005/100] nfsd: Move the delegation reference counter into the struct nfs4_stid Jeff Layton
                   ` (95 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

state_lock is a heavily contended global lock. We don't want to grab
that while simultaneously holding the inode->i_lock.

Add a new per-nfs4_file lock that we can use to protect the
per-nfs4_file delegation list. Hold that while walking the list in the
break_deleg callback and queue the workqueue job for each one.

The workqueue job can then take the state_lock and do the list
manipulations without the i_lock being held prior to starting the
rpc call.
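
The split can be pictured with the userspace sketch below, which is a
simplified model and not the patch: the break callback touches only the
per-file lock and hands each delegation to a deferred job, and the job takes
the global lock later with nothing else held. All names here (break_cb,
run_pending, the pending list) are illustrative assumptions, and the toy
work queue assumes a single producer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for the contended global state_lock. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

struct deleg {
	struct deleg *next_perfile;	/* per-file list, under file->lock */
	struct deleg *next_work;	/* link on the pending-work list */
	int on_lru;			/* protected by global_lock */
};

struct file {
	pthread_mutex_t lock;		/* stand-in for the per-file lock */
	struct deleg *delegations;
};

static struct deleg *pending;		/* toy work queue */

/* Break callback: walks the per-file list under the per-file lock only. */
static int break_cb(struct file *fp)
{
	int queued = 0;

	pthread_mutex_lock(&fp->lock);
	for (struct deleg *dp = fp->delegations; dp; dp = dp->next_perfile) {
		dp->next_work = pending;
		pending = dp;
		queued++;
	}
	pthread_mutex_unlock(&fp->lock);
	return queued;
}

/* Deferred job: no other lock is held, so taking global_lock is safe. */
static int run_pending(void)
{
	int done = 0;

	while (pending) {
		struct deleg *dp = pending;

		pending = dp->next_work;
		pthread_mutex_lock(&global_lock);
		dp->on_lru = 1;		/* the list manipulation happens here */
		pthread_mutex_unlock(&global_lock);
		done++;
	}
	return done;
}
```

The callback never nests the global lock inside another lock, so the
ordering problem described above does not arise in this model.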

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/nfsd/nfs4callback.c | 28 +++++++++++++++++++-------
 fs/nfsd/nfs4state.c    | 53 +++++++++++++++++++++++++++++++++-----------------
 fs/nfsd/state.h        |  2 ++
 3 files changed, 58 insertions(+), 25 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 00cb9b7a75f6..cba4ca375f5e 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -43,7 +43,7 @@
 #define NFSDDBG_FACILITY                NFSDDBG_PROC
 
 static void nfsd4_mark_cb_fault(struct nfs4_client *, int reason);
-static void nfsd4_do_callback_rpc(struct work_struct *w);
+static void nfsd4_run_cb_null(struct work_struct *w);
 
 #define NFSPROC4_CB_NULL 0
 #define NFSPROC4_CB_COMPOUND 1
@@ -764,7 +764,7 @@ static void do_probe_callback(struct nfs4_client *clp)
 
 	cb->cb_ops = &nfsd4_cb_probe_ops;
 
-	INIT_WORK(&cb->cb_work, nfsd4_do_callback_rpc);
+	INIT_WORK(&cb->cb_work, nfsd4_run_cb_null);
 
 	run_nfsd4_cb(cb);
 }
@@ -936,7 +936,7 @@ void nfsd4_shutdown_callback(struct nfs4_client *clp)
 	set_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
 	/*
 	 * Note this won't actually result in a null callback;
-	 * instead, nfsd4_do_callback_rpc() will detect the killed
+	 * instead, nfsd4_run_cb_null() will detect the killed
 	 * client, destroy the rpc client, and stop:
 	 */
 	do_probe_callback(clp);
@@ -1014,9 +1014,8 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
 		run_nfsd4_cb(cb);
 }
 
-static void nfsd4_do_callback_rpc(struct work_struct *w)
+static void nfsd4_run_callback_rpc(struct nfsd4_callback *cb)
 {
-	struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback, cb_work);
 	struct nfs4_client *clp = cb->cb_clp;
 	struct rpc_clnt *clnt;
 
@@ -1034,6 +1033,22 @@ static void nfsd4_do_callback_rpc(struct work_struct *w)
 			cb->cb_ops, cb);
 }
 
+static void nfsd4_run_cb_null(struct work_struct *w)
+{
+	struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback,
+							cb_work);
+	nfsd4_run_callback_rpc(cb);
+}
+
+static void nfsd4_run_cb_recall(struct work_struct *w)
+{
+	struct nfsd4_callback *cb = container_of(w, struct nfsd4_callback,
+							cb_work);
+
+	nfsd4_prepare_cb_recall(cb->cb_op);
+	nfsd4_run_callback_rpc(cb);
+}
+
 void nfsd4_cb_recall(struct nfs4_delegation *dp)
 {
 	struct nfsd4_callback *cb = &dp->dl_recall;
@@ -1050,8 +1065,7 @@ void nfsd4_cb_recall(struct nfs4_delegation *dp)
 
 	INIT_LIST_HEAD(&cb->cb_per_client);
 	cb->cb_done = true;
-
-	INIT_WORK(&cb->cb_work, nfsd4_do_callback_rpc);
+	INIT_WORK(&cb->cb_work, nfsd4_run_cb_recall);
 
 	run_nfsd4_cb(&dp->dl_recall);
 }
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 15d59e063885..b77e34aca8bf 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -254,6 +254,8 @@ static void nfsd4_free_file(struct nfs4_file *f)
 static inline void
 put_nfs4_file(struct nfs4_file *fi)
 {
+	might_lock(&state_lock);
+
 	if (atomic_dec_and_lock(&fi->fi_ref, &state_lock)) {
 		hlist_del(&fi->fi_hash);
 		spin_unlock(&state_lock);
@@ -446,6 +448,8 @@ static void block_delegations(struct knfsd_fh *fh)
 	u32 hash;
 	struct bloom_pair *bd = &blocked_delegations;
 
+	lockdep_assert_held(&state_lock);
+
 	hash = arch_fast_hash(&fh->fh_base, fh->fh_size, 0);
 
 	__set_bit(hash&255, bd->set[bd->new]);
@@ -532,7 +536,9 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
 	lockdep_assert_held(&state_lock);
 
 	dp->dl_stid.sc_type = NFS4_DELEG_STID;
+	spin_lock(&fp->fi_lock);
 	list_add(&dp->dl_perfile, &fp->fi_delegations);
+	spin_unlock(&fp->fi_lock);
 	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
 }
 
@@ -540,15 +546,19 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
 static void
 unhash_delegation(struct nfs4_delegation *dp)
 {
+	struct nfs4_file *fp = dp->dl_file;
+
 	spin_lock(&state_lock);
 	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
 	list_del_init(&dp->dl_perclnt);
-	list_del_init(&dp->dl_perfile);
 	list_del_init(&dp->dl_recall_lru);
+	spin_lock(&fp->fi_lock);
+	list_del_init(&dp->dl_perfile);
+	spin_unlock(&fp->fi_lock);
 	spin_unlock(&state_lock);
-	if (dp->dl_file) {
-		nfs4_put_deleg_lease(dp->dl_file);
-		put_nfs4_file(dp->dl_file);
+	if (fp) {
+		nfs4_put_deleg_lease(fp);
+		put_nfs4_file(fp);
 		dp->dl_file = NULL;
 	}
 }
@@ -2671,6 +2681,7 @@ static void nfsd4_init_file(struct nfs4_file *fp, struct inode *ino)
 	lockdep_assert_held(&state_lock);
 
 	atomic_set(&fp->fi_ref, 1);
+	spin_lock_init(&fp->fi_lock);
 	INIT_LIST_HEAD(&fp->fi_stateids);
 	INIT_LIST_HEAD(&fp->fi_delegations);
 	ihold(ino);
@@ -2921,30 +2932,36 @@ out:
 	return ret;
 }
 
-static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
+void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp)
 {
 	struct nfs4_client *clp = dp->dl_stid.sc_client;
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 
-	lockdep_assert_held(&state_lock);
-	/* We're assuming the state code never drops its reference
-	 * without first removing the lease.  Since we're in this lease
-	 * callback (and since the lease code is serialized by the kernel
-	 * lock) we know the server hasn't removed the lease yet, we know
-	 * it's safe to take a reference: */
-	atomic_inc(&dp->dl_count);
-
+	/*
+	 * We can't do this in nfsd_break_deleg_cb because it is
+	 * already holding inode->i_lock
+	 */
+	spin_lock(&state_lock);
+	block_delegations(&dp->dl_fh);
 	/*
 	 * If the dl_time != 0, then we know that it has already been
 	 * queued for a lease break. Don't queue it again.
 	 */
 	if (dp->dl_time == 0) {
-		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
 		dp->dl_time = get_seconds();
+		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
 	}
+	spin_unlock(&state_lock);
+}
 
-	block_delegations(&dp->dl_fh);
-
+static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
+{
+	/* We're assuming the state code never drops its reference
+	 * without first removing the lease.  Since we're in this lease
+	 * callback (and since the lease code is serialized by the kernel
+	 * lock) we know the server hasn't removed the lease yet, we know
+	 * it's safe to take a reference: */
+	atomic_inc(&dp->dl_count);
 	nfsd4_cb_recall(dp);
 }
 
@@ -2969,11 +2986,11 @@ static void nfsd_break_deleg_cb(struct file_lock *fl)
 	 */
 	fl->fl_break_time = 0;
 
-	spin_lock(&state_lock);
 	fp->fi_had_conflict = true;
+	spin_lock(&fp->fi_lock);
 	list_for_each_entry(dp, &fp->fi_delegations, dl_perfile)
 		nfsd_break_one_deleg(dp);
-	spin_unlock(&state_lock);
+	spin_unlock(&fp->fi_lock);
 }
 
 static
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 0776e7c4c10a..1ba175ce3d09 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -378,6 +378,7 @@ static inline struct nfs4_lockowner * lockowner(struct nfs4_stateowner *so)
 /* nfs4_file: a file opened by some number of (open) nfs4_stateowners. */
 struct nfs4_file {
 	atomic_t		fi_ref;
+	spinlock_t		fi_lock;
 	struct hlist_node       fi_hash;    /* hash by "struct inode *" */
 	struct list_head        fi_stateids;
 	struct list_head	fi_delegations;
@@ -468,6 +469,7 @@ extern void nfsd4_cb_recall(struct nfs4_delegation *dp);
 extern int nfsd4_create_callback_queue(void);
 extern void nfsd4_destroy_callback_queue(void);
 extern void nfsd4_shutdown_callback(struct nfs4_client *);
+extern void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp);
 extern void nfs4_put_delegation(struct nfs4_delegation *dp);
 extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(const char *name,
 							struct nfsd_net *nn);
-- 
1.9.3



* [PATCH v4 005/100] nfsd: Move the delegation reference counter into the struct nfs4_stid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (3 preceding siblings ...)
  2014-07-08 18:02 ` [PATCH v4 004/100] nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg Jeff Layton
@ 2014-07-08 18:02 ` Jeff Layton
  2014-07-10 11:28   ` Christoph Hellwig
  2014-07-08 18:02 ` [PATCH v4 006/100] nfsd4: use cl_lock to synchronize all stateid idr calls Jeff Layton
                   ` (94 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

We will want to add reference counting to the lock stateid and open
stateids too in later patches.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 13 +++++++------
 fs/nfsd/state.h     |  2 +-
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b77e34aca8bf..bcfb350661ee 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -364,6 +364,7 @@ kmem_cache *slab)
 	stid->sc_stateid.si_opaque.so_clid = cl->cl_clientid;
 	/* Will be incremented before return to client: */
 	stid->sc_stateid.si_generation = 0;
+	atomic_set(&stid->sc_count, 1);
 
 	/*
 	 * It shouldn't be a problem to reuse an opaque stateid value.
@@ -487,7 +488,6 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
 	dp->dl_type = NFS4_OPEN_DELEGATE_READ;
 	fh_copy_shallow(&dp->dl_fh, &current_fh->fh_handle);
 	dp->dl_time = 0;
-	atomic_set(&dp->dl_count, 1);
 	return dp;
 }
 
@@ -507,7 +507,7 @@ static void nfs4_free_stid(struct kmem_cache *slab, struct nfs4_stid *s)
 void
 nfs4_put_delegation(struct nfs4_delegation *dp)
 {
-	if (atomic_dec_and_test(&dp->dl_count)) {
+	if (atomic_dec_and_test(&dp->dl_stid.sc_count)) {
 		nfs4_free_stid(deleg_slab, &dp->dl_stid);
 		num_delegations--;
 	}
@@ -2958,10 +2958,11 @@ static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
 {
 	/* We're assuming the state code never drops its reference
 	 * without first removing the lease.  Since we're in this lease
-	 * callback (and since the lease code is serialized by the kernel
-	 * lock) we know the server hasn't removed the lease yet, we know
-	 * it's safe to take a reference: */
-	atomic_inc(&dp->dl_count);
+	 * callback (and since the lease code is serialized by the i_lock)
+	 * we know the server hasn't removed the lease yet, we know it's
+	 * safe to take a reference.
+	 */
+	atomic_inc(&dp->dl_stid.sc_count);
 	nfsd4_cb_recall(dp);
 }
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 1ba175ce3d09..39ac3ac07219 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -73,6 +73,7 @@ struct nfsd4_callback {
 };
 
 struct nfs4_stid {
+	atomic_t sc_count;
 #define NFS4_OPEN_STID 1
 #define NFS4_LOCK_STID 2
 #define NFS4_DELEG_STID 4
@@ -91,7 +92,6 @@ struct nfs4_delegation {
 	struct list_head	dl_perfile;
 	struct list_head	dl_perclnt;
 	struct list_head	dl_recall_lru;  /* delegation recalled */
-	atomic_t		dl_count;       /* ref count */
 	struct nfs4_file	*dl_file;
 	u32			dl_type;
 	time_t			dl_time;
-- 
1.9.3


* [PATCH v4 006/100] nfsd4: use cl_lock to synchronize all stateid idr calls
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (4 preceding siblings ...)
  2014-07-08 18:02 ` [PATCH v4 005/100] nfsd: Move the delegation reference counter into the struct nfs4_stid Jeff Layton
@ 2014-07-08 18:02 ` Jeff Layton
  2014-07-10 11:32   ` Christoph Hellwig
  2014-07-08 18:02 ` [PATCH v4 007/100] nfsd: Add fine grained protection for the nfs4_file->fi_stateids list Jeff Layton
                   ` (93 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Benny Halevy <bhalevy@primarydata.com>

Currently, this is serialized by the client_mutex, which is slated for
removal. Add finer-grained locking here.

Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
---
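As a rough userspace analogy (not the idr API), the sketch below shows
the shape of the change: every insert, lookup and removal on the id
table takes the table's own lock for just that operation, and the
allocation is cyclic, so a freed id is not handed out again right away.
The id_table type, its tbl_* helpers and the atomic_flag spinlock are
all assumptions made for the example.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_IDS 8

/* Hypothetical id table guarded by its own lock (like cl_lock). */
struct id_table {
	atomic_flag lock;
	void *slots[MAX_IDS];
	int next;		/* cyclic allocation hint */
};

static void tbl_init(struct id_table *t)
{
	atomic_flag_clear(&t->lock);
	for (int i = 0; i < MAX_IDS; i++)
		t->slots[i] = NULL;
	t->next = 0;
}

static void tbl_lock(struct id_table *t)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&t->lock,
						 memory_order_acquire))
		;
}

static void tbl_unlock(struct id_table *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Returns the new id, or -1 if the table is full. */
static int tbl_alloc(struct id_table *t, void *obj)
{
	int id = -1;

	tbl_lock(t);
	for (int i = 0; i < MAX_IDS; i++) {
		int cand = (t->next + i) % MAX_IDS;

		if (!t->slots[cand]) {
			t->slots[cand] = obj;
			t->next = (cand + 1) % MAX_IDS;
			id = cand;
			break;
		}
	}
	tbl_unlock(t);
	return id;
}

static void *tbl_find(struct id_table *t, int id)
{
	void *obj;

	if (id < 0 || id >= MAX_IDS)
		return NULL;
	tbl_lock(t);
	obj = t->slots[id];
	tbl_unlock(t);
	return obj;
}

static void tbl_remove(struct id_table *t, int id)
{
	if (id < 0 || id >= MAX_IDS)
		return;
	tbl_lock(t);
	t->slots[id] = NULL;
	tbl_unlock(t);
}
```

The patch additionally preloads memory before taking the spinlock so
that the allocation under the lock can use GFP_NOWAIT; the fixed-size
array here sidesteps that concern.
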
 fs/nfsd/nfs4state.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index bcfb350661ee..4b91058169af 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -347,7 +347,6 @@ static void nfs4_file_put_access(struct nfs4_file *fp, int oflag)
 static struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct
 kmem_cache *slab)
 {
-	struct idr *stateids = &cl->cl_stateids;
 	struct nfs4_stid *stid;
 	int new_id;
 
@@ -355,7 +354,11 @@ kmem_cache *slab)
 	if (!stid)
 		return NULL;
 
-	new_id = idr_alloc_cyclic(stateids, stid, 0, 0, GFP_KERNEL);
+	idr_preload(GFP_KERNEL);
+	spin_lock(&cl->cl_lock);
+	new_id = idr_alloc_cyclic(&cl->cl_stateids, stid, 0, 0, GFP_NOWAIT);
+	spin_unlock(&cl->cl_lock);
+	idr_preload_end();
 	if (new_id < 0)
 		goto out_free;
 	stid->sc_client = cl;
@@ -493,9 +496,11 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
 
 static void remove_stid(struct nfs4_stid *s)
 {
-	struct idr *stateids = &s->sc_client->cl_stateids;
+	struct nfs4_client *clp = s->sc_client;
 
-	idr_remove(stateids, s->sc_stateid.si_opaque.so_id);
+	spin_lock(&clp->cl_lock);
+	idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
+	spin_unlock(&clp->cl_lock);
 }
 
 static void nfs4_free_stid(struct kmem_cache *slab, struct nfs4_stid *s)
@@ -1266,7 +1271,9 @@ free_client(struct nfs4_client *clp)
 	rpc_destroy_wait_queue(&clp->cl_cb_waitq);
 	free_svc_cred(&clp->cl_cred);
 	kfree(clp->cl_name.data);
+	spin_lock(&clp->cl_lock);
 	idr_destroy(&clp->cl_stateids);
+	spin_unlock(&clp->cl_lock);
 	kfree(clp);
 }
 
@@ -1491,7 +1498,9 @@ static struct nfs4_stid *find_stateid(struct nfs4_client *cl, stateid_t *t)
 {
 	struct nfs4_stid *ret;
 
+	spin_lock(&cl->cl_lock);
 	ret = idr_find(&cl->cl_stateids, t->si_opaque.so_id);
+	spin_unlock(&cl->cl_lock);
 	if (!ret || !ret->sc_type)
 		return NULL;
 	return ret;
-- 
1.9.3


* [PATCH v4 007/100] nfsd: Add fine grained protection for the nfs4_file->fi_stateids list
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (5 preceding siblings ...)
  2014-07-08 18:02 ` [PATCH v4 006/100] nfsd4: use cl_lock to synchronize all stateid idr calls Jeff Layton
@ 2014-07-08 18:02 ` Jeff Layton
  2014-07-10 11:33   ` Christoph Hellwig
  2014-07-08 18:02 ` [PATCH v4 008/100] nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache Jeff Layton
                   ` (92 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Access to this list is currently serialized by the client_mutex. Add
finer-grained locking around this list in preparation for the removal
of that mutex.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
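A minimal userspace sketch of the pattern described above, assuming a
singly linked list and an atomic_flag spinlock in place of the kernel
list and fi_lock: additions, removals and traversals all happen under
the per-file lock, and the traversal records its result and leaves
through a single unlock. All names here are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

struct stateid {
	unsigned int deny;		/* deny bits held by this stateid */
	struct stateid *next;
};

/* Hypothetical stand-in for nfs4_file with its fi_lock. */
struct file_state {
	atomic_flag lock;
	struct stateid *stateids;
};

static void file_init(struct file_state *fp)
{
	atomic_flag_clear(&fp->lock);
	fp->stateids = NULL;
}

static void file_lock(struct file_state *fp)
{
	while (atomic_flag_test_and_set_explicit(&fp->lock,
						 memory_order_acquire))
		;
}

static void file_unlock(struct file_state *fp)
{
	atomic_flag_clear_explicit(&fp->lock, memory_order_release);
}

static void file_add_stateid(struct file_state *fp, struct stateid *st)
{
	file_lock(fp);
	st->next = fp->stateids;
	fp->stateids = st;
	file_unlock(fp);
}

static void file_del_stateid(struct file_state *fp, struct stateid *st)
{
	struct stateid **pp;

	file_lock(fp);
	for (pp = &fp->stateids; *pp; pp = &(*pp)->next) {
		if (*pp == st) {
			*pp = st->next;
			break;
		}
	}
	file_unlock(fp);
}

/* Returns -1 if any stateid denies the requested access, else 0. */
static int file_share_conflict(struct file_state *fp, unsigned int access)
{
	struct stateid *st;
	int ret = 0;

	file_lock(fp);
	for (st = fp->stateids; st; st = st->next) {
		if (st->deny & access) {
			ret = -1;
			break;	/* fall through to the single unlock */
		}
	}
	file_unlock(fp);
	return ret;
}
```
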
 fs/nfsd/nfs4state.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4b91058169af..c5818e02bc6a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -720,7 +720,11 @@ release_all_access(struct nfs4_ol_stateid *stp)
 
 static void unhash_generic_stateid(struct nfs4_ol_stateid *stp)
 {
+	struct nfs4_file *fp = stp->st_file;
+
+	spin_lock(&fp->fi_lock);
 	list_del(&stp->st_perfile);
+	spin_unlock(&fp->fi_lock);
 	list_del(&stp->st_perstateowner);
 }
 
@@ -2814,7 +2818,6 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
 	stp->st_stid.sc_type = NFS4_OPEN_STID;
 	INIT_LIST_HEAD(&stp->st_locks);
 	list_add(&stp->st_perstateowner, &oo->oo_owner.so_stateids);
-	list_add(&stp->st_perfile, &fp->fi_stateids);
 	stp->st_stateowner = &oo->oo_owner;
 	get_nfs4_file(fp);
 	stp->st_file = fp;
@@ -2823,6 +2826,9 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
 	set_access(open->op_share_access, stp);
 	set_deny(open->op_share_deny, stp);
 	stp->st_openstp = NULL;
+	spin_lock(&fp->fi_lock);
+	list_add(&stp->st_perfile, &fp->fi_stateids);
+	spin_unlock(&fp->fi_lock);
 }
 
 static void
@@ -2930,6 +2936,7 @@ nfs4_share_conflict(struct svc_fh *current_fh, unsigned int deny_type)
 		return nfs_ok;
 	ret = nfserr_locked;
 	/* Search for conflicting share reservations */
+	spin_lock(&fp->fi_lock);
 	list_for_each_entry(stp, &fp->fi_stateids, st_perfile) {
 		if (test_deny(deny_type, stp) ||
 		    test_deny(NFS4_SHARE_DENY_BOTH, stp))
@@ -2937,6 +2944,7 @@ nfs4_share_conflict(struct svc_fh *current_fh, unsigned int deny_type)
 	}
 	ret = nfs_ok;
 out:
+	spin_unlock(&fp->fi_lock);
 	put_nfs4_file(fp);
 	return ret;
 }
@@ -3172,6 +3180,7 @@ nfs4_check_open(struct nfs4_file *fp, struct nfsd4_open *open, struct nfs4_ol_st
 	struct nfs4_ol_stateid *local;
 	struct nfs4_openowner *oo = open->op_openowner;
 
+	spin_lock(&fp->fi_lock);
 	list_for_each_entry(local, &fp->fi_stateids, st_perfile) {
 		/* ignore lock owners */
 		if (local->st_stateowner->so_is_open_owner == 0)
@@ -3180,9 +3189,12 @@ nfs4_check_open(struct nfs4_file *fp, struct nfsd4_open *open, struct nfs4_ol_st
 		if (local->st_stateowner == &oo->oo_owner)
 			*stpp = local;
 		/* check for conflicting share reservations */
-		if (!test_share(local, open))
+		if (!test_share(local, open)) {
+			spin_unlock(&fp->fi_lock);
 			return nfserr_share_denied;
+		}
 	}
+	spin_unlock(&fp->fi_lock);
 	return nfs_ok;
 }
 
@@ -4432,7 +4444,6 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp, struct
 	if (stp == NULL)
 		return NULL;
 	stp->st_stid.sc_type = NFS4_LOCK_STID;
-	list_add(&stp->st_perfile, &fp->fi_stateids);
 	list_add(&stp->st_perstateowner, &lo->lo_owner.so_stateids);
 	stp->st_stateowner = &lo->lo_owner;
 	get_nfs4_file(fp);
@@ -4441,6 +4452,9 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp, struct
 	stp->st_deny_bmap = open_stp->st_deny_bmap;
 	stp->st_openstp = open_stp;
 	list_add(&stp->st_locks, &open_stp->st_locks);
+	spin_lock(&fp->fi_lock);
+	list_add(&stp->st_perfile, &fp->fi_stateids);
+	spin_unlock(&fp->fi_lock);
 	return stp;
 }
 
-- 
1.9.3


* [PATCH v4 008/100] nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (6 preceding siblings ...)
  2014-07-08 18:02 ` [PATCH v4 007/100] nfsd: Add fine grained protection for the nfs4_file->fi_stateids list Jeff Layton
@ 2014-07-08 18:02 ` Jeff Layton
  2014-07-08 18:02 ` [PATCH v4 009/100] nfsd: Add locking to the nfs4_file->fi_fds[] array Jeff Layton
                   ` (91 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

We don't want to rely on the client_mutex for protection in the case of
NFSv4 open owners. Instead, we add a mutex that will only be taken for
NFSv4.0 state mutating operations, and that will be released once the
entire compound is done.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
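To make the intended lifetime concrete, here is a small userspace
sketch of the assign/clear pairing, under the assumption that a simple
atomic_flag can stand in for the replay mutex. The struct and function
names are invented for the example; only the control flow mirrors the
helpers added to xdr4.h by this patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a stateowner with its replay mutex. */
struct owner {
	atomic_flag replay_lock;
};

/* Hypothetical stand-in for the compound state. */
struct compound {
	bool has_session;		/* true for NFSv4.1+ */
	struct owner *replay_owner;
};

static void owner_init(struct owner *so)
{
	atomic_flag_clear(&so->replay_lock);
}

/* Returns true if the lock was free and is now held by the caller. */
static bool owner_trylock(struct owner *so)
{
	return !atomic_flag_test_and_set_explicit(&so->replay_lock,
						  memory_order_acquire);
}

static void owner_unlock(struct owner *so)
{
	atomic_flag_clear_explicit(&so->replay_lock, memory_order_release);
}

/* Take the replay lock only for sessionless (v4.0) compounds. */
static void compound_assign_replay(struct compound *cs, struct owner *so)
{
	if (!cs->has_session) {
		while (!owner_trylock(so))
			;	/* a real mutex would sleep here */
		cs->replay_owner = so;
	}
}

/* Safe to call more than once: only the first call unlocks. */
static void compound_clear_replay(struct compound *cs)
{
	struct owner *so = cs->replay_owner;

	if (so != NULL) {
		cs->replay_owner = NULL;
		owner_unlock(so);
	}
}
```

Clearing the pointer before unlocking is what lets both the seqid bump
path and the end-of-compound path call the clear helper without
unlocking twice.
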
 fs/nfsd/nfs4proc.c  | 13 +++++--------
 fs/nfsd/nfs4state.c | 21 ++++++++-------------
 fs/nfsd/nfs4xdr.c   |  2 --
 fs/nfsd/state.h     |  1 +
 fs/nfsd/xdr4.h      | 21 +++++++++++++++++++++
 5 files changed, 35 insertions(+), 23 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 29a617ebe38c..5004245e9958 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -468,11 +468,11 @@ out:
 		kfree(resfh);
 	}
 	nfsd4_cleanup_open_state(open, status);
-	if (open->op_openowner && !nfsd4_has_session(cstate))
-		cstate->replay_owner = &open->op_openowner->oo_owner;
+	if (open->op_openowner)
+		nfsd4_cstate_assign_replay(cstate,
+				&open->op_openowner->oo_owner);
 	nfsd4_bump_seqid(cstate, status);
-	if (!cstate->replay_owner)
-		nfs4_unlock_state();
+	nfs4_unlock_state();
 	return status;
 }
 
@@ -1393,10 +1393,7 @@ encode_op:
 			args->ops, args->opcnt, resp->opcnt, op->opnum,
 			be32_to_cpu(status));
 
-		if (cstate->replay_owner) {
-			nfs4_unlock_state();
-			cstate->replay_owner = NULL;
-		}
+		nfsd4_cstate_clear_replay(cstate);
 		/* XXX Ugh, we need to get rid of this kind of special case: */
 		if (op->opnum == OP_READ && op->u.read.rd_filp)
 			fput(op->u.read.rd_filp);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index c5818e02bc6a..ff7d74e0e39e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -888,7 +888,7 @@ void nfsd4_bump_seqid(struct nfsd4_compound_state *cstate, __be32 nfserr)
 		return;
 
 	if (!seqid_mutating_err(ntohl(nfserr))) {
-		cstate->replay_owner = NULL;
+		nfsd4_cstate_clear_replay(cstate);
 		return;
 	}
 	if (!so)
@@ -2759,6 +2759,7 @@ static void init_nfs4_replay(struct nfs4_replay *rp)
 	rp->rp_status = nfserr_serverfault;
 	rp->rp_buflen = 0;
 	rp->rp_buf = rp->rp_ibuf;
+	mutex_init(&rp->rp_mutex);
 }
 
 static inline void *alloc_stateowner(struct kmem_cache *slab, struct xdr_netobj *owner, struct nfs4_client *clp)
@@ -4083,8 +4084,7 @@ nfs4_preprocess_seqid_op(struct nfsd4_compound_state *cstate, u32 seqid,
 	if (status)
 		return status;
 	stp = openlockstateid(s);
-	if (!nfsd4_has_session(cstate))
-		cstate->replay_owner = stp->st_stateowner;
+	nfsd4_cstate_assign_replay(cstate, stp->st_stateowner);
 
 	status = nfs4_seqid_op_checks(cstate, stateid, seqid, stp);
 	if (!status)
@@ -4145,8 +4145,7 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	status = nfs_ok;
 out:
 	nfsd4_bump_seqid(cstate, status);
-	if (!cstate->replay_owner)
-		nfs4_unlock_state();
+	nfs4_unlock_state();
 	return status;
 }
 
@@ -4228,8 +4227,7 @@ nfsd4_open_downgrade(struct svc_rqst *rqstp,
 	status = nfs_ok;
 out:
 	nfsd4_bump_seqid(cstate, status);
-	if (!cstate->replay_owner)
-		nfs4_unlock_state();
+	nfs4_unlock_state();
 	return status;
 }
 
@@ -4284,8 +4282,7 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	nfsd4_close_open_stateid(stp);
 out:
-	if (!cstate->replay_owner)
-		nfs4_unlock_state();
+	nfs4_unlock_state();
 	return status;
 }
 
@@ -4679,8 +4676,7 @@ out:
 	if (status && new_state)
 		release_lock_stateid(lock_stp);
 	nfsd4_bump_seqid(cstate, status);
-	if (!cstate->replay_owner)
-		nfs4_unlock_state();
+	nfs4_unlock_state();
 	if (file_lock)
 		locks_free_lock(file_lock);
 	if (conflock)
@@ -4841,8 +4837,7 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 out:
 	nfsd4_bump_seqid(cstate, status);
-	if (!cstate->replay_owner)
-		nfs4_unlock_state();
+	nfs4_unlock_state();
 	if (file_lock)
 		locks_free_lock(file_lock);
 	return status;
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 21ffb9b9b768..b1eabebf6e19 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3926,8 +3926,6 @@ status:
  * 
  * XDR note: do not encode rp->rp_buflen: the buffer contains the
  * previously sent already encoded operation.
- *
- * called with nfs4_lock_state() held
  */
 void
 nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op)
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 39ac3ac07219..e8c059d2826e 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -328,6 +328,7 @@ struct nfs4_replay {
 	unsigned int		rp_buflen;
 	char			*rp_buf;
 	struct knfsd_fh		rp_openfh;
+	struct mutex		rp_mutex;
 	char			rp_ibuf[NFSD4_REPLAY_ISIZE];
 };
 
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 5abf6c942ddf..7442dc7efd31 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -74,6 +74,27 @@ static inline bool nfsd4_has_session(struct nfsd4_compound_state *cs)
 	return cs->slot != NULL;
 }
 
+static inline void
+nfsd4_cstate_assign_replay(struct nfsd4_compound_state *cstate,
+				struct nfs4_stateowner *so)
+{
+	if (!nfsd4_has_session(cstate)) {
+		mutex_lock(&so->so_replay.rp_mutex);
+		cstate->replay_owner = so;
+	}
+}
+
+static inline void
+nfsd4_cstate_clear_replay(struct nfsd4_compound_state *cstate)
+{
+	struct nfs4_stateowner *so = cstate->replay_owner;
+
+	if (so != NULL) {
+		cstate->replay_owner = NULL;
+		mutex_unlock(&so->so_replay.rp_mutex);
+	}
+}
+
 struct nfsd4_change_info {
 	u32		atomic;
 	bool		change_supported;
-- 
1.9.3


* [PATCH v4 009/100] nfsd: Add locking to the nfs4_file->fi_fds[] array
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (7 preceding siblings ...)
  2014-07-08 18:02 ` [PATCH v4 008/100] nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache Jeff Layton
@ 2014-07-08 18:02 ` Jeff Layton
  2014-07-10 10:32   ` Christoph Hellwig
  2014-07-08 18:02 ` [PATCH v4 010/100] nfsd: clean up helper __release_lock_stateid Jeff Layton
                   ` (90 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Preparation for removal of the client_mutex, which currently protects
this array.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
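The trickiest part of this change is filling an empty slot when the
open itself may block. The userspace sketch below shows that shape
under stated assumptions: res_open() and res_close() are hypothetical
stand-ins for nfsd_open() and fput(), and an atomic_flag spinlock
stands in for fi_lock. The lock is dropped around the open, retaken,
and the slot rechecked; a caller that lost the race releases its spare
copy after unlocking.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical cache with a single slot, like one fi_fds[] entry. */
struct res_cache {
	atomic_flag lock;
	int *slot;
};

static int opens, closes;	/* counters so the behaviour can be observed */

static int *res_open(void)
{
	opens++;
	return malloc(sizeof(int));
}

static void res_close(int *r)
{
	closes++;
	free(r);
}

static void cache_init(struct res_cache *c)
{
	atomic_flag_clear(&c->lock);
	c->slot = NULL;
}

static void cache_lock(struct res_cache *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		;
}

static void cache_unlock(struct res_cache *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Returns the cached resource, opening it first if the slot is empty. */
static int *cache_get(struct res_cache *c)
{
	int *spare = NULL;
	int *ret;

	cache_lock(c);
	if (!c->slot) {
		/* Opening may block, so do it without the lock held. */
		cache_unlock(c);
		spare = res_open();
		if (!spare)
			return NULL;
		cache_lock(c);
		/* Recheck: another thread may have filled the slot. */
		if (!c->slot) {
			c->slot = spare;
			spare = NULL;
		}
	}
	ret = c->slot;
	cache_unlock(c);
	/* Release the losing copy only after dropping the lock. */
	if (spare)
		res_close(spare);
	return ret;
}
```
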
 fs/nfsd/nfs4state.c | 107 ++++++++++++++++++++++++++++++++++++++++++++--------
 fs/nfsd/state.h     |  26 -------------
 2 files changed, 91 insertions(+), 42 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index ff7d74e0e39e..0b42bd47a7d6 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -270,6 +270,52 @@ get_nfs4_file(struct nfs4_file *fi)
 	atomic_inc(&fi->fi_ref);
 }
 
+static struct file *__nfs4_get_fd(struct nfs4_file *f, int oflag)
+{
+	if (f->fi_fds[oflag])
+		return get_file(f->fi_fds[oflag]);
+	return NULL;
+}
+
+static struct file *find_writeable_file(struct nfs4_file *f)
+{
+	struct file *ret;
+
+	spin_lock(&f->fi_lock);
+	ret = __nfs4_get_fd(f, O_WRONLY);
+	if (!ret)
+		ret = __nfs4_get_fd(f, O_RDWR);
+	spin_unlock(&f->fi_lock);
+	return ret;
+}
+
+static struct file *find_readable_file(struct nfs4_file *f)
+{
+	struct file *ret;
+
+	spin_lock(&f->fi_lock);
+	ret = __nfs4_get_fd(f, O_RDONLY);
+	if (!ret)
+		ret = __nfs4_get_fd(f, O_RDWR);
+	spin_unlock(&f->fi_lock);
+	return ret;
+}
+
+static struct file *find_any_file(struct nfs4_file *f)
+{
+	struct file *ret;
+
+	spin_lock(&f->fi_lock);
+	ret = __nfs4_get_fd(f, O_RDWR);
+	if (!ret) {
+		ret = __nfs4_get_fd(f, O_WRONLY);
+		if (!ret)
+			ret = __nfs4_get_fd(f, O_RDONLY);
+	}
+	spin_unlock(&f->fi_lock);
+	return ret;
+}
+
 static int num_delegations;
 unsigned long max_delegations;
 
@@ -318,20 +364,31 @@ static void nfs4_file_get_access(struct nfs4_file *fp, int oflag)
 		__nfs4_file_get_access(fp, oflag);
 }
 
-static void nfs4_file_put_fd(struct nfs4_file *fp, int oflag)
+static struct file *nfs4_file_put_fd(struct nfs4_file *fp, int oflag)
 {
-	if (fp->fi_fds[oflag]) {
-		fput(fp->fi_fds[oflag]);
-		fp->fi_fds[oflag] = NULL;
-	}
+	struct file *filp;
+
+	filp = fp->fi_fds[oflag];
+	fp->fi_fds[oflag] = NULL;
+	return filp;
 }
 
 static void __nfs4_file_put_access(struct nfs4_file *fp, int oflag)
 {
-	if (atomic_dec_and_test(&fp->fi_access[oflag])) {
-		nfs4_file_put_fd(fp, oflag);
+	might_lock(&fp->fi_lock);
+
+	if (atomic_dec_and_lock(&fp->fi_access[oflag], &fp->fi_lock)) {
+		struct file *f1 = NULL;
+		struct file *f2 = NULL;
+
+		f1 = nfs4_file_put_fd(fp, oflag);
 		if (atomic_read(&fp->fi_access[1 - oflag]) == 0)
-			nfs4_file_put_fd(fp, O_RDWR);
+			f2 = nfs4_file_put_fd(fp, O_RDWR);
+		spin_unlock(&fp->fi_lock);
+		if (f1)
+			fput(f1);
+		if (f2)
+			fput(f2);
 	}
 }
 
@@ -748,8 +805,10 @@ static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
 	unhash_generic_stateid(stp);
 	unhash_stid(&stp->st_stid);
 	file = find_any_file(stp->st_file);
-	if (file)
+	if (file) {
 		locks_remove_posix(file, (fl_owner_t)lockowner(stp->st_stateowner));
+		fput(file);
+	}
 	close_generic_stateid(stp);
 	free_generic_stateid(stp);
 }
@@ -3228,17 +3287,27 @@ nfsd4_truncate(struct svc_rqst *rqstp, struct svc_fh *fh,
 static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp,
 		struct svc_fh *cur_fh, struct nfsd4_open *open)
 {
+	struct file *filp = NULL;
 	__be32 status;
 	int oflag = nfs4_access_to_omode(open->op_share_access);
 	int access = nfs4_access_to_access(open->op_share_access);
 
+	spin_lock(&fp->fi_lock);
 	if (!fp->fi_fds[oflag]) {
-		status = nfsd_open(rqstp, cur_fh, S_IFREG, access,
-			&fp->fi_fds[oflag]);
+		spin_unlock(&fp->fi_lock);
+		status = nfsd_open(rqstp, cur_fh, S_IFREG, access, &filp);
 		if (status)
 			goto out;
+		spin_lock(&fp->fi_lock);
+		if (!fp->fi_fds[oflag]) {
+			fp->fi_fds[oflag] = filp;
+			filp = NULL;
+		}
 	}
 	nfs4_file_get_access(fp, oflag);
+	spin_unlock(&fp->fi_lock);
+	if (filp)
+		fput(filp);
 
 	status = nfsd4_truncate(rqstp, cur_fh, open);
 	if (status)
@@ -3323,13 +3392,15 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 	if (status)
 		goto out_free;
 	fp->fi_lease = fl;
-	fp->fi_deleg_file = get_file(fl->fl_file);
+	fp->fi_deleg_file = fl->fl_file;
 	atomic_set(&fp->fi_delegees, 1);
 	spin_lock(&state_lock);
 	hash_delegation_locked(dp, fp);
 	spin_unlock(&state_lock);
 	return 0;
 out_free:
+	if (fl->fl_file)
+		fput(fl->fl_file);
 	locks_free_lock(fl);
 	return status;
 }
@@ -3929,6 +4000,7 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
 				status = nfserr_serverfault;
 				goto out;
 			}
+			get_file(file);
 		}
 		break;
 	case NFS4_OPEN_STID:
@@ -3956,7 +4028,7 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
 	}
 	status = nfs_ok;
 	if (file)
-		*filpp = get_file(file);
+		*filpp = file;
 out:
 	nfs4_unlock_state();
 	return status;
@@ -4673,6 +4745,8 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		break;
 	}
 out:
+	if (filp)
+		fput(filp);
 	if (status && new_state)
 		release_lock_stateid(lock_stp);
 	nfsd4_bump_seqid(cstate, status);
@@ -4812,7 +4886,7 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	if (!file_lock) {
 		dprintk("NFSD: %s: unable to allocate lock!\n", __func__);
 		status = nfserr_jukebox;
-		goto out;
+		goto fput;
 	}
 	locks_init_lock(file_lock);
 	file_lock->fl_type = F_UNLCK;
@@ -4834,7 +4908,8 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	}
 	update_stateid(&stp->st_stid.sc_stateid);
 	memcpy(&locku->lu_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
-
+fput:
+	fput(filp);
 out:
 	nfsd4_bump_seqid(cstate, status);
 	nfs4_unlock_state();
@@ -4844,7 +4919,7 @@ out:
 
 out_nfserr:
 	status = nfserrno(err);
-	goto out;
+	goto fput;
 }
 
 /*
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index e8c059d2826e..2ce1d8f583e1 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -400,32 +400,6 @@ struct nfs4_file {
 	bool			fi_had_conflict;
 };
 
-/* XXX: for first cut may fall back on returning file that doesn't work
- * at all? */
-static inline struct file *find_writeable_file(struct nfs4_file *f)
-{
-	if (f->fi_fds[O_WRONLY])
-		return f->fi_fds[O_WRONLY];
-	return f->fi_fds[O_RDWR];
-}
-
-static inline struct file *find_readable_file(struct nfs4_file *f)
-{
-	if (f->fi_fds[O_RDONLY])
-		return f->fi_fds[O_RDONLY];
-	return f->fi_fds[O_RDWR];
-}
-
-static inline struct file *find_any_file(struct nfs4_file *f)
-{
-	if (f->fi_fds[O_RDWR])
-		return f->fi_fds[O_RDWR];
-	else if (f->fi_fds[O_WRONLY])
-		return f->fi_fds[O_WRONLY];
-	else
-		return f->fi_fds[O_RDONLY];
-}
-
 /* "ol" stands for "Open or Lock".  Better suggestions welcome. */
 struct nfs4_ol_stateid {
 	struct nfs4_stid    st_stid; /* must be first field */
-- 
1.9.3


* [PATCH v4 010/100] nfsd: clean up helper __release_lock_stateid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (8 preceding siblings ...)
  2014-07-08 18:02 ` [PATCH v4 009/100] nfsd: Add locking to the nfs4_file->fi_fds[] array Jeff Layton
@ 2014-07-08 18:02 ` Jeff Layton
  2014-07-08 18:02 ` [PATCH v4 011/100] nfsd: refactor nfs4_file_get_access and nfs4_file_put_access Jeff Layton
                   ` (89 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Use filp_close instead of open coding. filp_close does a bit more than
just release the locks and put the filp. It also calls ->flush and
dnotify_flush, both of which should be done here anyway.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/nfsd/nfs4state.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0b42bd47a7d6..775267ed186a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -805,10 +805,8 @@ static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
 	unhash_generic_stateid(stp);
 	unhash_stid(&stp->st_stid);
 	file = find_any_file(stp->st_file);
-	if (file) {
-		locks_remove_posix(file, (fl_owner_t)lockowner(stp->st_stateowner));
-		fput(file);
-	}
+	if (file)
+		filp_close(file, (fl_owner_t)lockowner(stp->st_stateowner));
 	close_generic_stateid(stp);
 	free_generic_stateid(stp);
 }
-- 
1.9.3


* [PATCH v4 011/100] nfsd: refactor nfs4_file_get_access and nfs4_file_put_access
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (9 preceding siblings ...)
  2014-07-08 18:02 ` [PATCH v4 010/100] nfsd: clean up helper __release_lock_stateid Jeff Layton
@ 2014-07-08 18:02 ` Jeff Layton
  2014-07-10  7:59   ` Christoph Hellwig
  2014-07-08 18:03 ` [PATCH v4 012/100] nfsd: remove nfs4_file_put_fd Jeff Layton
                   ` (88 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:02 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Have them take NFS4_SHARE_ACCESS_* flags instead of an open mode. This
spares the callers from having to convert it themselves.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
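A small standalone sketch of the interface change described above:
the helper accepts the share access value directly, masks it, and does
the open-mode conversion internally. The SHARE_ACCESS_* constants are
local copies of the NFSv4 share access values (READ = 1, WRITE = 2,
BOTH = 3), and struct access_counts with its readers/writers fields is
a hypothetical simplification of the fi_access[] counters.

```c
#include <fcntl.h>

/* Local copies of the NFSv4 share access values. */
#define SHARE_ACCESS_READ	1u
#define SHARE_ACCESS_WRITE	2u
#define SHARE_ACCESS_BOTH	3u

/* Hypothetical simplification of the per-file access counters. */
struct access_counts {
	int readers;
	int writers;
};

static int access_to_omode(unsigned int access)
{
	switch (access & SHARE_ACCESS_BOTH) {
	case SHARE_ACCESS_WRITE:
		return O_WRONLY;
	case SHARE_ACCESS_BOTH:
		return O_RDWR;
	default:
		return O_RDONLY;
	}
}

/* Callers pass the share access bits; no conversion on their side. */
static void counts_get_access(struct access_counts *c, unsigned int access)
{
	/* Relies on BOTH == READ | WRITE, as the patch notes. */
	access &= SHARE_ACCESS_BOTH;
	if (access == 0)
		return;

	switch (access_to_omode(access)) {
	case O_RDWR:
		c->readers++;
		c->writers++;
		break;
	case O_WRONLY:
		c->writers++;
		break;
	default:
		c->readers++;
		break;
	}
}
```
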
 fs/nfsd/nfs4state.c | 59 ++++++++++++++++++++++++++++++++---------------------
 1 file changed, 36 insertions(+), 23 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 775267ed186a..2c49cd909115 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -341,6 +341,20 @@ static unsigned int ownerstr_hashval(u32 clientid, struct xdr_netobj *ownername)
 #define FILE_HASH_BITS                   8
 #define FILE_HASH_SIZE                  (1 << FILE_HASH_BITS)
 
+static int nfs4_access_to_omode(u32 access)
+{
+	switch (access & NFS4_SHARE_ACCESS_BOTH) {
+	case NFS4_SHARE_ACCESS_READ:
+		return O_RDONLY;
+	case NFS4_SHARE_ACCESS_WRITE:
+		return O_WRONLY;
+	case NFS4_SHARE_ACCESS_BOTH:
+		return O_RDWR;
+	}
+	WARN_ON_ONCE(1);
+	return O_RDONLY;
+}
+
 static unsigned int file_hashval(struct inode *ino)
 {
 	/* XXX: why are we hashing on inode pointer, anyway? */
@@ -355,8 +369,15 @@ static void __nfs4_file_get_access(struct nfs4_file *fp, int oflag)
 	atomic_inc(&fp->fi_access[oflag]);
 }
 
-static void nfs4_file_get_access(struct nfs4_file *fp, int oflag)
+static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
 {
+	int oflag = nfs4_access_to_omode(access);
+
+	/* Note: relies on NFS4_SHARE_ACCESS_BOTH == READ|WRITE */
+	access &= NFS4_SHARE_ACCESS_BOTH;
+	if (access == 0)
+		return;
+
 	if (oflag == O_RDWR) {
 		__nfs4_file_get_access(fp, O_RDONLY);
 		__nfs4_file_get_access(fp, O_WRONLY);
@@ -392,8 +413,16 @@ static void __nfs4_file_put_access(struct nfs4_file *fp, int oflag)
 	}
 }
 
-static void nfs4_file_put_access(struct nfs4_file *fp, int oflag)
+static void nfs4_file_put_access(struct nfs4_file *fp, u32 access)
 {
+	int oflag;
+
+	/* Note: relies on NFS4_SHARE_ACCESS_BOTH == READ|WRITE */
+	access &= NFS4_SHARE_ACCESS_BOTH;
+	if (!access)
+		return;
+
+	oflag = nfs4_access_to_omode(access);
 	if (oflag == O_RDWR) {
 		__nfs4_file_put_access(fp, O_RDONLY);
 		__nfs4_file_put_access(fp, O_WRONLY);
@@ -747,20 +776,6 @@ test_deny(u32 access, struct nfs4_ol_stateid *stp)
 	return test_bit(access, &stp->st_deny_bmap);
 }
 
-static int nfs4_access_to_omode(u32 access)
-{
-	switch (access & NFS4_SHARE_ACCESS_BOTH) {
-	case NFS4_SHARE_ACCESS_READ:
-		return O_RDONLY;
-	case NFS4_SHARE_ACCESS_WRITE:
-		return O_WRONLY;
-	case NFS4_SHARE_ACCESS_BOTH:
-		return O_RDWR;
-	}
-	WARN_ON_ONCE(1);
-	return O_RDONLY;
-}
-
 /* release all access and file references for a given stateid */
 static void
 release_all_access(struct nfs4_ol_stateid *stp)
@@ -769,8 +784,7 @@ release_all_access(struct nfs4_ol_stateid *stp)
 
 	for (i = 1; i < 4; i++) {
 		if (test_access(i, stp))
-			nfs4_file_put_access(stp->st_file,
-					     nfs4_access_to_omode(i));
+			nfs4_file_put_access(stp->st_file, i);
 		clear_access(i, stp);
 	}
 }
@@ -3302,7 +3316,7 @@ static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp,
 			filp = NULL;
 		}
 	}
-	nfs4_file_get_access(fp, oflag);
+	nfs4_file_get_access(fp, open->op_share_access);
 	spin_unlock(&fp->fi_lock);
 	if (filp)
 		fput(filp);
@@ -3314,7 +3328,7 @@ static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp,
 	return nfs_ok;
 
 out_put_access:
-	nfs4_file_put_access(fp, oflag);
+	nfs4_file_put_access(fp, open->op_share_access);
 out:
 	return status;
 }
@@ -4223,7 +4237,7 @@ static inline void nfs4_stateid_downgrade_bit(struct nfs4_ol_stateid *stp, u32 a
 {
 	if (!test_access(access, stp))
 		return;
-	nfs4_file_put_access(stp->st_file, nfs4_access_to_omode(access));
+	nfs4_file_put_access(stp->st_file, access);
 	clear_access(access, stp);
 }
 
@@ -4548,11 +4562,10 @@ check_lock_length(u64 offset, u64 length)
 static void get_lock_access(struct nfs4_ol_stateid *lock_stp, u32 access)
 {
 	struct nfs4_file *fp = lock_stp->st_file;
-	int oflag = nfs4_access_to_omode(access);
 
 	if (test_access(access, lock_stp))
 		return;
-	nfs4_file_get_access(fp, oflag);
+	nfs4_file_get_access(fp, access);
 	set_access(access, lock_stp);
 }
 
-- 
1.9.3


* [PATCH v4 012/100] nfsd: remove nfs4_file_put_fd
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (10 preceding siblings ...)
  2014-07-08 18:02 ` [PATCH v4 011/100] nfsd: refactor nfs4_file_get_access and nfs4_file_put_access Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-10  8:03   ` Christoph Hellwig
  2014-07-08 18:03 ` [PATCH v4 013/100] nfsd: shrink st_access_bmap and st_deny_bmap Jeff Layton
                   ` (87 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

...and replace it with a simple swap call.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
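For readers unfamiliar with the idiom, the following userspace sketch
shows why a swap against a NULL local is equivalent to "read the slot,
then clear it". The swap_vals() macro is a local imitation of the
kernel's swap() and assumes the GNU __typeof__ extension (gcc, clang);
struct fd_table and fd_table_take() are hypothetical.

```c
#include <stddef.h>

/* Local imitation of the kernel's swap(); needs GNU __typeof__. */
#define swap_vals(a, b) \
	do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Hypothetical table of three slots, like fi_fds[]. */
struct fd_table {
	int *fds[3];
};

/* Detach and return whatever is in slot idx, leaving NULL behind. */
static int *fd_table_take(struct fd_table *t, int idx)
{
	int *taken = NULL;

	/* taken starts as NULL, so the slot ends up NULL. */
	swap_vals(taken, t->fds[idx]);
	return taken;
}
```
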
 fs/nfsd/nfs4state.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 2c49cd909115..478f8f6d797e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -385,15 +385,6 @@ static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
 		__nfs4_file_get_access(fp, oflag);
 }
 
-static struct file *nfs4_file_put_fd(struct nfs4_file *fp, int oflag)
-{
-	struct file *filp;
-
-	filp = fp->fi_fds[oflag];
-	fp->fi_fds[oflag] = NULL;
-	return filp;
-}
-
 static void __nfs4_file_put_access(struct nfs4_file *fp, int oflag)
 {
 	might_lock(&fp->fi_lock);
@@ -402,9 +393,9 @@ static void __nfs4_file_put_access(struct nfs4_file *fp, int oflag)
 		struct file *f1 = NULL;
 		struct file *f2 = NULL;
 
-		f1 = nfs4_file_put_fd(fp, oflag);
+		swap(f1, fp->fi_fds[oflag]);
 		if (atomic_read(&fp->fi_access[1 - oflag]) == 0)
-			f2 = nfs4_file_put_fd(fp, O_RDWR);
+			swap(f2, fp->fi_fds[O_RDWR]);
 		spin_unlock(&fp->fi_lock);
 		if (f1)
 			fput(f1);
-- 
1.9.3


* [PATCH v4 013/100] nfsd: shrink st_access_bmap and st_deny_bmap
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (11 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 012/100] nfsd: remove nfs4_file_put_fd Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-10  8:04   ` Christoph Hellwig
  2014-07-10 10:50   ` Christoph Hellwig
  2014-07-08 18:03 ` [PATCH v4 014/100] nfsd: set stateid access and deny bits in nfs4_get_vfs_file Jeff Layton
                   ` (86 subsequent siblings)
  99 siblings, 2 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

We never use anything above bit #3, so an unsigned long for each is
wasteful. Shrink them to a char each, and add some WARN_ON_ONCE calls if
we try to set or clear bits that would go outside those sizes.

Note too that because atomic bitops work on unsigned longs, we have to
abandon their use here. That shouldn't be a problem though since we
don't really care about the atomicity in this code anyway. Using them
was just a convenient way to flip bits.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
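For illustration only, a userspace sketch of open-coded bit helpers on
an unsigned char, in the spirit of the change described above. The
names bmap_set(), bmap_clear(), bmap_test() and SHARE_BOTH are
hypothetical, and assert() stands in for WARN_ON_ONCE().

```c
#include <assert.h>
#include <stdbool.h>

#define SHARE_BOTH 3u	/* assumed to mirror NFS4_SHARE_ACCESS_BOTH */

/* Set one bit; bits above #3 would not be meaningful here. */
static void bmap_set(unsigned char *bmap, unsigned int bit)
{
	assert(bit <= SHARE_BOTH);
	*bmap |= (unsigned char)(1u << bit);
}

/* Clear one bit with a plain, non-atomic read-modify-write. */
static void bmap_clear(unsigned char *bmap, unsigned int bit)
{
	assert(bit <= SHARE_BOTH);
	*bmap &= (unsigned char)~(1u << bit);
}

/* Test one bit; out-of-range bits are reported as not set. */
static bool bmap_test(unsigned char bmap, unsigned int bit)
{
	return bit <= SHARE_BOTH && (bmap & (1u << bit)) != 0;
}
```

None of these are atomic, which matches the reasoning above that
atomicity is not needed for these fields.
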
 fs/nfsd/nfs4state.c | 38 +++++++++++++++++++++++++++-----------
 fs/nfsd/state.h     |  4 ++--
 2 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 478f8f6d797e..c6fac2de7c95 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -729,42 +729,58 @@ test_share(struct nfs4_ol_stateid *stp, struct nfsd4_open *open) {
 static inline void
 set_access(u32 access, struct nfs4_ol_stateid *stp)
 {
-	__set_bit(access, &stp->st_access_bmap);
+	unsigned char mask = 1 << access;
+
+	WARN_ON_ONCE(access > NFS4_SHARE_ACCESS_BOTH);
+	stp->st_access_bmap |= mask;
 }
 
 /* clear share access for a given stateid */
 static inline void
 clear_access(u32 access, struct nfs4_ol_stateid *stp)
 {
-	__clear_bit(access, &stp->st_access_bmap);
+	unsigned char mask = 1 << access;
+
+	WARN_ON_ONCE(access > NFS4_SHARE_ACCESS_BOTH);
+	stp->st_access_bmap &= ~mask;
 }
 
 /* test whether a given stateid has access */
 static inline bool
 test_access(u32 access, struct nfs4_ol_stateid *stp)
 {
-	return test_bit(access, &stp->st_access_bmap);
+	unsigned char mask = 1 << access;
+
+	return (bool)(stp->st_access_bmap & mask);
 }
 
 /* set share deny for a given stateid */
 static inline void
-set_deny(u32 access, struct nfs4_ol_stateid *stp)
+set_deny(u32 deny, struct nfs4_ol_stateid *stp)
 {
-	__set_bit(access, &stp->st_deny_bmap);
+	unsigned char mask = 1 << deny;
+
+	WARN_ON_ONCE(deny > NFS4_SHARE_DENY_BOTH);
+	stp->st_deny_bmap |= mask;
 }
 
 /* clear share deny for a given stateid */
 static inline void
-clear_deny(u32 access, struct nfs4_ol_stateid *stp)
+clear_deny(u32 deny, struct nfs4_ol_stateid *stp)
 {
-	__clear_bit(access, &stp->st_deny_bmap);
+	unsigned char mask = 1 << deny;
+
+	WARN_ON_ONCE(deny > NFS4_SHARE_DENY_BOTH);
+	stp->st_deny_bmap &= ~mask;
 }
 
 /* test whether a given stateid is denying specific access */
 static inline bool
-test_deny(u32 access, struct nfs4_ol_stateid *stp)
+test_deny(u32 deny, struct nfs4_ol_stateid *stp)
 {
-	return test_bit(access, &stp->st_deny_bmap);
+	unsigned char mask = 1 << deny;
+
+	return (bool)(stp->st_deny_bmap & mask);
 }
 
 /* release all access and file references for a given stateid */
@@ -4284,12 +4300,12 @@ nfsd4_open_downgrade(struct svc_rqst *rqstp,
 		goto out; 
 	status = nfserr_inval;
 	if (!test_access(od->od_share_access, stp)) {
-		dprintk("NFSD: access not a subset current bitmap: 0x%lx, input access=%08x\n",
+		dprintk("NFSD: access not a subset of current bitmap: 0x%hhx, input access=%08x\n",
 			stp->st_access_bmap, od->od_share_access);
 		goto out;
 	}
 	if (!test_deny(od->od_share_deny, stp)) {
-		dprintk("NFSD:deny not a subset current bitmap: 0x%lx, input deny=%08x\n",
+		dprintk("NFSD: deny not a subset of current bitmap: 0x%hhx, input deny=%08x\n",
 			stp->st_deny_bmap, od->od_share_deny);
 		goto out;
 	}
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 2ce1d8f583e1..fa6d28329c8a 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -408,8 +408,8 @@ struct nfs4_ol_stateid {
 	struct list_head              st_locks;
 	struct nfs4_stateowner      * st_stateowner;
 	struct nfs4_file            * st_file;
-	unsigned long                 st_access_bmap;
-	unsigned long                 st_deny_bmap;
+	unsigned char                 st_access_bmap;
+	unsigned char                 st_deny_bmap;
 	struct nfs4_ol_stateid         * st_openstp;
 };
 
-- 
1.9.3


* [PATCH v4 014/100] nfsd: set stateid access and deny bits in nfs4_get_vfs_file
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (12 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 013/100] nfsd: shrink st_access_bmap and st_deny_bmap Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-10  8:34   ` Christoph Hellwig
  2014-07-08 18:03 ` [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny Jeff Layton
                   ` (85 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Cleanup -- ensure that the stateid bits are set at the same time that
the file access refcounts are incremented. Keeping them coherent like
this makes it easier to ensure that we account for all of the
references.

Since the initialization of the st_*_bmap fields is done when the stateid
is hashed, we go ahead and hash the stateid before getting access to the
file and unhash it if that function returns an error. This will be
necessary anyway in a follow-on patch that will overhaul deny mode
handling.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index c6fac2de7c95..8cfd38a4dcc0 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3304,7 +3304,8 @@ nfsd4_truncate(struct svc_rqst *rqstp, struct svc_fh *fh,
 }
 
 static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp,
-		struct svc_fh *cur_fh, struct nfsd4_open *open)
+		struct svc_fh *cur_fh, struct nfs4_ol_stateid *stp,
+		struct nfsd4_open *open)
 {
 	struct file *filp = NULL;
 	__be32 status;
@@ -3332,6 +3333,9 @@ static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp,
 	if (status)
 		goto out_put_access;
 
+	/* Set access and deny bits in stateid */
+	set_access(open->op_share_access, stp);
+	set_deny(open->op_share_deny, stp);
 	return nfs_ok;
 
 out_put_access:
@@ -3343,20 +3347,15 @@ out:
 static __be32
 nfs4_upgrade_open(struct svc_rqst *rqstp, struct nfs4_file *fp, struct svc_fh *cur_fh, struct nfs4_ol_stateid *stp, struct nfsd4_open *open)
 {
-	u32 op_share_access = open->op_share_access;
 	__be32 status;
 
-	if (!test_access(op_share_access, stp))
-		status = nfs4_get_vfs_file(rqstp, fp, cur_fh, open);
+	if (!test_access(open->op_share_access, stp))
+		status = nfs4_get_vfs_file(rqstp, fp, cur_fh, stp, open);
 	else
 		status = nfsd4_truncate(rqstp, cur_fh, open);
 
 	if (status)
 		return status;
-
-	/* remember the open */
-	set_access(op_share_access, stp);
-	set_deny(open->op_share_deny, stp);
 	return nfs_ok;
 }
 
@@ -3604,12 +3603,14 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
 		if (status)
 			goto out;
 	} else {
-		status = nfs4_get_vfs_file(rqstp, fp, current_fh, open);
-		if (status)
-			goto out;
 		stp = open->op_stp;
 		open->op_stp = NULL;
 		init_open_stateid(stp, fp, open);
+		status = nfs4_get_vfs_file(rqstp, fp, current_fh, stp, open);
+		if (status) {
+			release_open_stateid(stp);
+			goto out;
+		}
 	}
 	update_stateid(&stp->st_stid.sc_stateid);
 	memcpy(&open->op_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
-- 
1.9.3


* [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (13 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 014/100] nfsd: set stateid access and deny bits in nfs4_get_vfs_file Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-10 10:31   ` Christoph Hellwig
  2014-07-08 18:03 ` [PATCH v4 016/100] nfsd: always hold the fi_lock when bumping fi_access refcounts Jeff Layton
                   ` (84 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Fix the "deny" argument type, and start the loop at 1. The 0 iteration
is always a noop.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
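For illustration only, the loop condition can be isolated to show why
the 0 iteration never does anything. would_clear() is a hypothetical
helper that just evaluates the same test the loop uses.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Mirrors the loop test "(i & deny) != i": would iteration i clear
 * its deny bit, given the new deny mode?
 */
static bool would_clear(unsigned int i, unsigned int deny)
{
	assert(i < 4);
	return (i & deny) != i;
}
```

For i == 0 the expression reduces to 0 != 0, which is false for every
deny value, so starting the loop at 1 changes nothing.
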
 fs/nfsd/nfs4state.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 8cfd38a4dcc0..67d1cb75a667 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4268,10 +4268,11 @@ static inline void nfs4_stateid_downgrade(struct nfs4_ol_stateid *stp, u32 to_ac
 }
 
 static void
-reset_union_bmap_deny(unsigned long deny, struct nfs4_ol_stateid *stp)
+reset_union_bmap_deny(u32 deny, struct nfs4_ol_stateid *stp)
 {
 	int i;
-	for (i = 0; i < 4; i++) {
+
+	for (i = 1; i < 4; i++) {
 		if ((i & deny) != i)
 			clear_deny(i, stp);
 	}
-- 
1.9.3


* [PATCH v4 016/100] nfsd: always hold the fi_lock when bumping fi_access refcounts
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (14 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-10  8:51   ` Christoph Hellwig
  2014-07-08 18:03 ` [PATCH v4 017/100] nfsd: make deny mode enforcement more efficient and close races in it Jeff Layton
                   ` (83 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Once we remove the client_mutex, there's an unlikely but possible race
that could occur. It will be possible for nfs4_file_put_access to race
with nfs4_file_get_access. The refcount will go to zero (briefly) and
then be bumped back to one. If that happens we set ourselves up for a
use-after-free and the potential for a lock to race onto the i_flock
list as a filp is being torn down.

Ensure that we can safely bump the refcount on the file by holding the
fi_lock whenever that's done. The only place it currently isn't is in
get_lock_access.

In order to ensure atomicity with finding the file, add some
find_*_file_locked calls that can be called while already holding the
fi_lock and then call get_lock_access to get new access references on
the nfs4_file.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
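For illustration only, a userspace sketch of the "_locked" helper
pattern: one variant expects the caller to hold the lock, a wrapper
takes the lock itself, and a third caller does the lookup and the
refcount bump in a single critical section. Everything here is
hypothetical (struct demo_file, the toy spinlock built on C11
atomic_flag, and the "held" flag that stands in for
lockdep_assert_held()).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_file {
	atomic_flag lock;	/* toy spinlock standing in for fi_lock */
	bool held;		/* debugging aid in place of lockdep */
	int *fds[3];		/* read-only, write-only, read-write slots */
	int read_refs;		/* stand-in for an fi_access counter */
};

static void demo_init(struct demo_file *f)
{
	atomic_flag_clear(&f->lock);
	f->held = false;
	f->fds[0] = f->fds[1] = f->fds[2] = NULL;
	f->read_refs = 0;
}

static void demo_lock(struct demo_file *f)
{
	while (atomic_flag_test_and_set(&f->lock))
		;	/* spin until the flag is released */
	f->held = true;
}

static void demo_unlock(struct demo_file *f)
{
	f->held = false;
	atomic_flag_clear(&f->lock);
}

/* Caller must already hold the lock. */
static int *find_readable_locked(struct demo_file *f)
{
	assert(f->held);
	return f->fds[0] ? f->fds[0] : f->fds[2];
}

/* Convenience wrapper for callers that do not hold the lock. */
static int *find_readable(struct demo_file *f)
{
	int *ret;

	demo_lock(f);
	ret = find_readable_locked(f);
	demo_unlock(f);
	return ret;
}

/* Look up the file and take a reference without dropping the lock. */
static int *find_readable_and_get(struct demo_file *f)
{
	int *ret;

	demo_lock(f);
	ret = find_readable_locked(f);
	if (ret)
		f->read_refs++;
	demo_unlock(f);
	return ret;
}
```

Keeping the lookup and the refcount bump under one lock hold is what
prevents a concurrent put from seeing the count drop to zero in
between.
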
 fs/nfsd/nfs4state.c | 47 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 41 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 67d1cb75a667..bd24337a8763 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -277,27 +277,52 @@ static struct file *__nfs4_get_fd(struct nfs4_file *f, int oflag)
 	return NULL;
 }
 
-static struct file *find_writeable_file(struct nfs4_file *f)
+static struct file *
+find_writeable_file_locked(struct nfs4_file *f)
 {
 	struct file *ret;
 
-	spin_lock(&f->fi_lock);
+	lockdep_assert_held(&f->fi_lock);
+
 	ret = __nfs4_get_fd(f, O_WRONLY);
 	if (!ret)
 		ret = __nfs4_get_fd(f, O_RDWR);
-	spin_unlock(&f->fi_lock);
 	return ret;
 }
 
-static struct file *find_readable_file(struct nfs4_file *f)
+static struct file *
+find_writeable_file(struct nfs4_file *f)
 {
 	struct file *ret;
 
 	spin_lock(&f->fi_lock);
+	ret = find_writeable_file_locked(f);
+	spin_unlock(&f->fi_lock);
+
+	return ret;
+}
+
+static struct file *find_readable_file_locked(struct nfs4_file *f)
+{
+	struct file *ret;
+
+	lockdep_assert_held(&f->fi_lock);
+
 	ret = __nfs4_get_fd(f, O_RDONLY);
 	if (!ret)
 		ret = __nfs4_get_fd(f, O_RDWR);
+	return ret;
+}
+
+static struct file *
+find_readable_file(struct nfs4_file *f)
+{
+	struct file *ret;
+
+	spin_lock(&f->fi_lock);
+	ret = find_readable_file_locked(f);
 	spin_unlock(&f->fi_lock);
+
 	return ret;
 }
 
@@ -373,6 +398,8 @@ static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
 {
 	int oflag = nfs4_access_to_omode(access);
 
+	lockdep_assert_held(&fp->fi_lock);
+
 	/* Note: relies on NFS4_SHARE_ACCESS_BOTH == READ|WRITE */
 	access &= NFS4_SHARE_ACCESS_BOTH;
 	if (access == 0)
@@ -4572,6 +4599,8 @@ static void get_lock_access(struct nfs4_ol_stateid *lock_stp, u32 access)
 {
 	struct nfs4_file *fp = lock_stp->st_file;
 
+	lockdep_assert_held(&fp->fi_lock);
+
 	if (test_access(access, lock_stp))
 		return;
 	nfs4_file_get_access(fp, access);
@@ -4623,6 +4652,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct nfs4_openowner *open_sop = NULL;
 	struct nfs4_lockowner *lock_sop = NULL;
 	struct nfs4_ol_stateid *lock_stp;
+	struct nfs4_file *fp;
 	struct file *filp = NULL;
 	struct file_lock *file_lock = NULL;
 	struct file_lock *conflock = NULL;
@@ -4703,20 +4733,25 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		goto out;
 	}
 
+	fp = lock_stp->st_file;
 	locks_init_lock(file_lock);
 	switch (lock->lk_type) {
 		case NFS4_READ_LT:
 		case NFS4_READW_LT:
-			filp = find_readable_file(lock_stp->st_file);
+			spin_lock(&fp->fi_lock);
+			filp = find_readable_file_locked(fp);
 			if (filp)
 				get_lock_access(lock_stp, NFS4_SHARE_ACCESS_READ);
+			spin_unlock(&fp->fi_lock);
 			file_lock->fl_type = F_RDLCK;
 			break;
 		case NFS4_WRITE_LT:
 		case NFS4_WRITEW_LT:
-			filp = find_writeable_file(lock_stp->st_file);
+			spin_lock(&fp->fi_lock);
+			filp = find_writeable_file_locked(fp);
 			if (filp)
 				get_lock_access(lock_stp, NFS4_SHARE_ACCESS_WRITE);
+			spin_unlock(&fp->fi_lock);
 			file_lock->fl_type = F_WRLCK;
 			break;
 		default:
-- 
1.9.3


* [PATCH v4 017/100] nfsd: make deny mode enforcement more efficient and close races in it
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (15 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 016/100] nfsd: always hold the fi_lock when bumping fi_access refcounts Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-10 10:49   ` Christoph Hellwig
  2014-07-08 18:03 ` [PATCH v4 018/100] nfsd: cleanup and rename nfs4_check_open Jeff Layton
                   ` (82 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

The current enforcement of deny modes is both inefficient and scattered
across several places, which makes it hard to guarantee atomicity. The
inefficiency is a problem now, and the lack of atomicity will mean races
once the client_mutex is removed.

First, we address the inefficiency. We have to track deny modes on a
per-stateid basis to ensure that open downgrades are sane, but when the
server goes to enforce them it has to walk the entire list of stateids
and check against each one.

Instead of doing that, maintain a per-nfs4_file deny mode. When a file
is opened, we simply set any deny bits in that mode that were specified
in the OPEN call. We can then use that unified deny mode to do a simple
check to see whether there are any conflicts without needing to walk the
entire stateid list.

The only time we'll need to walk the entire list of stateids is when a
stateid that has a deny mode on it is being released, or one is having
its deny mode downgraded. In that case, we must walk the entire list and
recalculate the fi_share_deny field. Since deny modes are pretty rare
today, this should be very rare under normal workloads.

To address the potential for races once the client_mutex is removed,
protect fi_share_deny with the fi_lock. In nfs4_get_vfs_file, check to
make sure that any deny mode we want to apply won't conflict with
existing access. If that's ok, then have nfs4_file_get_access check that
new access to the file won't conflict with existing deny modes.

If that also passes, then get file access references, set the correct
access and deny bits in the stateid, and update the fi_share_deny field.
If opening the file or truncating it fails, then unwind the whole mess
and return the appropriate error.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
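For illustration only, a simplified userspace model of the per-file
deny mode described above. It keeps a union of deny bits on the file,
checks new opens against it in constant time, and only walks the
per-open deny modes when recalculating. All names (struct demo_file,
demo_open(), demo_recalc()) and the integer return codes are
hypothetical; locking is omitted, and the bit values are assumed to
follow the NFSv4 convention of READ == 1 and WRITE == 2.

```c
#include <assert.h>
#include <stddef.h>

#define SHARE_READ  1u
#define SHARE_WRITE 2u
#define SHARE_BOTH  3u

struct demo_file {
	unsigned int share_deny;	/* union of every open's deny mode */
	unsigned int readers;		/* opens holding read access */
	unsigned int writers;		/* opens holding write access */
};

/* Returns 0 on success, -1 on a share conflict, -2 on a bad access mask. */
static int demo_open(struct demo_file *f, unsigned int access,
		     unsigned int deny)
{
	access &= SHARE_BOTH;
	deny &= SHARE_BOTH;
	if (access == 0)
		return -2;

	/* Would the requested deny mode conflict with existing access? */
	if ((deny & SHARE_READ) && f->readers)
		return -1;
	if ((deny & SHARE_WRITE) && f->writers)
		return -1;

	/* Would the requested access conflict with an existing deny mode? */
	if (access & f->share_deny)
		return -1;

	if (access & SHARE_READ)
		f->readers++;
	if (access & SHARE_WRITE)
		f->writers++;
	f->share_deny |= deny;
	return 0;
}

/* Rebuild the union from the deny modes of the opens that remain. */
static void demo_recalc(struct demo_file *f, const unsigned int *denies,
			size_t n)
{
	unsigned int d = 0;
	size_t i;

	assert(denies != NULL || n == 0);
	for (i = 0; i < n; i++)
		d |= denies[i] & SHARE_BOTH;
	f->share_deny = d;
}
```

Setting a deny bit is a cheap OR into the union, but clearing one
requires the full recalculation, since another open may still be
contributing the same bit.
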
 fs/nfsd/nfs4state.c | 208 +++++++++++++++++++++++++++++++++++-----------------
 fs/nfsd/state.h     |   1 +
 2 files changed, 140 insertions(+), 69 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index bd24337a8763..6afeb7d39f9d 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -388,28 +388,53 @@ static unsigned int file_hashval(struct inode *ino)
 
 static struct hlist_head file_hashtbl[FILE_HASH_SIZE];
 
-static void __nfs4_file_get_access(struct nfs4_file *fp, int oflag)
+static void
+__nfs4_file_get_access(struct nfs4_file *fp, u32 access)
 {
-	WARN_ON_ONCE(!(fp->fi_fds[oflag] || fp->fi_fds[O_RDWR]));
-	atomic_inc(&fp->fi_access[oflag]);
+	int oflag = nfs4_access_to_omode(access);
+
+	if (oflag == O_RDWR) {
+		atomic_inc(&fp->fi_access[O_RDONLY]);
+		atomic_inc(&fp->fi_access[O_WRONLY]);
+	} else
+		atomic_inc(&fp->fi_access[oflag]);
 }
 
-static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
+static __be32
+nfs4_file_get_access(struct nfs4_file *fp, u32 access)
 {
-	int oflag = nfs4_access_to_omode(access);
-
 	lockdep_assert_held(&fp->fi_lock);
 
 	/* Note: relies on NFS4_SHARE_ACCESS_BOTH == READ|WRITE */
 	access &= NFS4_SHARE_ACCESS_BOTH;
+
+	/* Does this access mask make sense? */
 	if (access == 0)
-		return;
+		return nfserr_inval;
 
-	if (oflag == O_RDWR) {
-		__nfs4_file_get_access(fp, O_RDONLY);
-		__nfs4_file_get_access(fp, O_WRONLY);
-	} else
-		__nfs4_file_get_access(fp, oflag);
+	/* Does it conflict with a deny mode already set? */
+	if ((access & fp->fi_share_deny) != 0)
+		return nfserr_share_denied;
+
+	__nfs4_file_get_access(fp, access);
+	return nfs_ok;
+}
+
+static __be32 nfs4_file_check_deny(struct nfs4_file *fp, u32 deny)
+{
+	/* Common case is that there is no deny mode. */
+	deny &= NFS4_SHARE_DENY_BOTH;
+	if (deny) {
+		/* Note: relies on NFS4_SHARE_DENY_BOTH == READ|WRITE */
+		if ((deny & NFS4_SHARE_DENY_READ) &&
+		    atomic_read(&fp->fi_access[O_RDONLY]))
+			return nfserr_share_denied;
+
+		if ((deny & NFS4_SHARE_DENY_WRITE) &&
+		    atomic_read(&fp->fi_access[O_WRONLY]))
+			return nfserr_share_denied;
+	}
+	return nfs_ok;
 }
 
 static void __nfs4_file_put_access(struct nfs4_file *fp, int oflag)
@@ -741,17 +766,6 @@ bmap_to_share_mode(unsigned long bmap) {
 	return access;
 }
 
-static bool
-test_share(struct nfs4_ol_stateid *stp, struct nfsd4_open *open) {
-	unsigned int access, deny;
-
-	access = bmap_to_share_mode(stp->st_access_bmap);
-	deny = bmap_to_share_mode(stp->st_deny_bmap);
-	if ((access & open->op_share_deny) || (deny & open->op_share_access))
-		return false;
-	return true;
-}
-
 /* set share access for a given stateid */
 static inline void
 set_access(u32 access, struct nfs4_ol_stateid *stp)
@@ -810,11 +824,49 @@ test_deny(u32 deny, struct nfs4_ol_stateid *stp)
 	return (bool)(stp->st_deny_bmap & mask);
 }
 
+/*
+ * A stateid that had a deny mode associated with it is being released
+ * or downgraded. Recalculate the deny mode on the file.
+ */
+static void
+recalculate_deny_mode(struct nfs4_file *fp)
+{
+	struct nfs4_ol_stateid *stp;
+
+	spin_lock(&fp->fi_lock);
+	fp->fi_share_deny = 0;
+	list_for_each_entry(stp, &fp->fi_stateids, st_perfile)
+		fp->fi_share_deny |= bmap_to_share_mode(stp->st_deny_bmap);
+	spin_unlock(&fp->fi_lock);
+}
+
+static void
+reset_union_bmap_deny(u32 deny, struct nfs4_ol_stateid *stp)
+{
+	int i;
+	bool change = false;
+
+	for (i = 1; i < 4; i++) {
+		if ((i & deny) != i) {
+			change = true;
+			clear_deny(i, stp);
+		}
+	}
+
+	/* Recalculate per-file deny mode if there was a change */
+	if (change)
+		recalculate_deny_mode(stp->st_file);
+}
+
 /* release all access and file references for a given stateid */
 static void
 release_all_access(struct nfs4_ol_stateid *stp)
 {
 	int i;
+	struct nfs4_file *fp = stp->st_file;
+
+	if (fp && stp->st_deny_bmap != 0)
+		recalculate_deny_mode(fp);
 
 	for (i = 1; i < 4; i++) {
 		if (test_access(i, stp))
@@ -2806,6 +2858,7 @@ static void nfsd4_init_file(struct nfs4_file *fp, struct inode *ino)
 	fp->fi_inode = ino;
 	fp->fi_had_conflict = false;
 	fp->fi_lease = NULL;
+	fp->fi_share_deny = 0;
 	memset(fp->fi_fds, 0, sizeof(fp->fi_fds));
 	memset(fp->fi_access, 0, sizeof(fp->fi_access));
 	hlist_add_head(&fp->fi_hash, &file_hashtbl[hashval]);
@@ -3034,22 +3087,15 @@ nfs4_share_conflict(struct svc_fh *current_fh, unsigned int deny_type)
 {
 	struct inode *ino = current_fh->fh_dentry->d_inode;
 	struct nfs4_file *fp;
-	struct nfs4_ol_stateid *stp;
-	__be32 ret;
+	__be32 ret = nfs_ok;
 
 	fp = find_file(ino);
 	if (!fp)
-		return nfs_ok;
-	ret = nfserr_locked;
-	/* Search for conflicting share reservations */
+		return ret;
+	/* Check for conflicting share reservations */
 	spin_lock(&fp->fi_lock);
-	list_for_each_entry(stp, &fp->fi_stateids, st_perfile) {
-		if (test_deny(deny_type, stp) ||
-		    test_deny(NFS4_SHARE_DENY_BOTH, stp))
-			goto out;
-	}
-	ret = nfs_ok;
-out:
+	if (fp->fi_share_deny & deny_type)
+		ret = nfserr_locked;
 	spin_unlock(&fp->fi_lock);
 	put_nfs4_file(fp);
 	return ret;
@@ -3292,12 +3338,9 @@ nfs4_check_open(struct nfs4_file *fp, struct nfsd4_open *open, struct nfs4_ol_st
 		if (local->st_stateowner->so_is_open_owner == 0)
 			continue;
 		/* remember if we have seen this open owner */
-		if (local->st_stateowner == &oo->oo_owner)
+		if (local->st_stateowner == &oo->oo_owner) {
 			*stpp = local;
-		/* check for conflicting share reservations */
-		if (!test_share(local, open)) {
-			spin_unlock(&fp->fi_lock);
-			return nfserr_share_denied;
+			break;
 		}
 	}
 	spin_unlock(&fp->fi_lock);
@@ -3338,20 +3381,47 @@ static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp,
 	__be32 status;
 	int oflag = nfs4_access_to_omode(open->op_share_access);
 	int access = nfs4_access_to_access(open->op_share_access);
+	unsigned char old_access_bmap, old_deny_bmap;
 
 	spin_lock(&fp->fi_lock);
+
+	/*
+	 * Are we trying to set a deny mode that would conflict with
+	 * current access?
+	 */
+	status = nfs4_file_check_deny(fp, open->op_share_deny);
+	if (status != nfs_ok) {
+		spin_unlock(&fp->fi_lock);
+		goto out;
+	}
+
+	/* set access to the file */
+	status = nfs4_file_get_access(fp, open->op_share_access);
+	if (status != nfs_ok) {
+		spin_unlock(&fp->fi_lock);
+		goto out;
+	}
+
+	/* Set access bits in stateid */
+	old_access_bmap = stp->st_access_bmap;
+	set_access(open->op_share_access, stp);
+
+	/* Set new deny mask */
+	old_deny_bmap = stp->st_deny_bmap;
+	set_deny(open->op_share_deny, stp);
+	fp->fi_share_deny |= (open->op_share_deny & NFS4_SHARE_DENY_BOTH);
+
 	if (!fp->fi_fds[oflag]) {
 		spin_unlock(&fp->fi_lock);
 		status = nfsd_open(rqstp, cur_fh, S_IFREG, access, &filp);
 		if (status)
-			goto out;
+			goto out_put_access;
 		spin_lock(&fp->fi_lock);
 		if (!fp->fi_fds[oflag]) {
 			fp->fi_fds[oflag] = filp;
 			filp = NULL;
 		}
 	}
-	nfs4_file_get_access(fp, open->op_share_access);
 	spin_unlock(&fp->fi_lock);
 	if (filp)
 		fput(filp);
@@ -3359,33 +3429,43 @@ static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp,
 	status = nfsd4_truncate(rqstp, cur_fh, open);
 	if (status)
 		goto out_put_access;
-
-	/* Set access and deny bits in stateid */
-	set_access(open->op_share_access, stp);
-	set_deny(open->op_share_deny, stp);
-	return nfs_ok;
-
-out_put_access:
-	nfs4_file_put_access(fp, open->op_share_access);
 out:
 	return status;
+out_put_access:
+	stp->st_access_bmap = old_access_bmap;
+	nfs4_file_put_access(fp, open->op_share_access);
+	reset_union_bmap_deny(bmap_to_share_mode(old_deny_bmap), stp);
+	goto out;
 }
 
 static __be32
 nfs4_upgrade_open(struct svc_rqst *rqstp, struct nfs4_file *fp, struct svc_fh *cur_fh, struct nfs4_ol_stateid *stp, struct nfsd4_open *open)
 {
 	__be32 status;
+	unsigned char old_deny_bmap;
 
 	if (!test_access(open->op_share_access, stp))
-		status = nfs4_get_vfs_file(rqstp, fp, cur_fh, stp, open);
-	else
-		status = nfsd4_truncate(rqstp, cur_fh, open);
+		return nfs4_get_vfs_file(rqstp, fp, cur_fh, stp, open);
 
-	if (status)
+	/* test and set deny mode */
+	spin_lock(&fp->fi_lock);
+	status = nfs4_file_check_deny(fp, open->op_share_deny);
+	if (status == nfs_ok) {
+		old_deny_bmap = stp->st_deny_bmap;
+		set_deny(open->op_share_deny, stp);
+		fp->fi_share_deny |=
+				(open->op_share_deny & NFS4_SHARE_DENY_BOTH);
+	}
+	spin_unlock(&fp->fi_lock);
+
+	if (status != nfs_ok)
 		return status;
-	return nfs_ok;
-}
 
+	status = nfsd4_truncate(rqstp, cur_fh, open);
+	if (status != nfs_ok)
+		reset_union_bmap_deny(old_deny_bmap, stp);
+	return status;
+}
 
 static void
 nfs4_set_claim_prev(struct nfsd4_open *open, bool has_session)
@@ -3607,7 +3687,8 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
 	 */
 	fp = find_or_add_file(ino, open->op_file);
 	if (fp != open->op_file) {
-		if ((status = nfs4_check_open(fp, open, &stp)))
+		status = nfs4_check_open(fp, open, &stp);
+		if (status)
 			goto out;
 		status = nfs4_check_deleg(cl, open, &dp);
 		if (status)
@@ -4294,17 +4375,6 @@ static inline void nfs4_stateid_downgrade(struct nfs4_ol_stateid *stp, u32 to_ac
 	}
 }
 
-static void
-reset_union_bmap_deny(u32 deny, struct nfs4_ol_stateid *stp)
-{
-	int i;
-
-	for (i = 1; i < 4; i++) {
-		if ((i & deny) != i)
-			clear_deny(i, stp);
-	}
-}
-
 __be32
 nfsd4_open_downgrade(struct svc_rqst *rqstp,
 		     struct nfsd4_compound_state *cstate,
@@ -4603,7 +4673,7 @@ static void get_lock_access(struct nfs4_ol_stateid *lock_stp, u32 access)
 
 	if (test_access(access, lock_stp))
 		return;
-	nfs4_file_get_access(fp, access);
+	__nfs4_file_get_access(fp, access);
 	set_access(access, lock_stp);
 }
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index fa6d28329c8a..27adbebfd168 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -393,6 +393,7 @@ struct nfs4_file {
 	 *   + 1 to both of the above if NFS4_SHARE_ACCESS_BOTH is set.
 	 */
 	atomic_t		fi_access[2];
+	u32			fi_share_deny;
 	struct file		*fi_deleg_file;
 	struct file_lock	*fi_lease;
 	atomic_t		fi_delegees;
-- 
1.9.3


* [PATCH v4 018/100] nfsd: cleanup and rename nfs4_check_open
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (16 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 017/100] nfsd: make deny mode enforcement more efficient and close races in it Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-10 10:51   ` Christoph Hellwig
  2014-07-08 18:03 ` [PATCH v4 019/100] locks: add file_has_lease to prevent delegation break races Jeff Layton
                   ` (81 subsequent siblings)
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Rename it to better describe what it does, and have it just return the
stateid instead of a __be32 (which is now always nfs_ok). Also, do the
search for an existing stateid after the delegation check, to reduce
cleanup if the delegation check returns an error.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 6afeb7d39f9d..16282f1bf2ea 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3326,10 +3326,10 @@ out:
 	return nfs_ok;
 }
 
-static __be32
-nfs4_check_open(struct nfs4_file *fp, struct nfsd4_open *open, struct nfs4_ol_stateid **stpp)
+static struct nfs4_ol_stateid *
+nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
 {
-	struct nfs4_ol_stateid *local;
+	struct nfs4_ol_stateid *local, *ret = NULL;
 	struct nfs4_openowner *oo = open->op_openowner;
 
 	spin_lock(&fp->fi_lock);
@@ -3337,14 +3337,13 @@ nfs4_check_open(struct nfs4_file *fp, struct nfsd4_open *open, struct nfs4_ol_st
 		/* ignore lock owners */
 		if (local->st_stateowner->so_is_open_owner == 0)
 			continue;
-		/* remember if we have seen this open owner */
 		if (local->st_stateowner == &oo->oo_owner) {
-			*stpp = local;
+			ret = local;
 			break;
 		}
 	}
 	spin_unlock(&fp->fi_lock);
-	return nfs_ok;
+	return ret;
 }
 
 static inline int nfs4_access_to_access(u32 nfs4_access)
@@ -3687,12 +3686,10 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
 	 */
 	fp = find_or_add_file(ino, open->op_file);
 	if (fp != open->op_file) {
-		status = nfs4_check_open(fp, open, &stp);
-		if (status)
-			goto out;
 		status = nfs4_check_deleg(cl, open, &dp);
 		if (status)
 			goto out;
+		stp = nfsd4_find_existing_open(fp, open);
 	} else {
 		open->op_file = NULL;
 		status = nfserr_bad_stateid;
-- 
1.9.3


* [PATCH v4 019/100] locks: add file_has_lease to prevent delegation break races
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (17 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 018/100] nfsd: cleanup and rename nfs4_check_open Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 020/100] nfsd: nfs4_alloc_init_lease should take a nfs4_file arg Jeff Layton
                   ` (80 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Once we remove the client_mutex, we'll have a potential race between
setting a lease on a file and breaking the delegation. We may set the
lease, but by the time we go to hash it, it may have already been
broken. Currently, we know that this won't happen as the nfs4_laundromat
takes the client_mutex, but we want to remove that.

As part of that fix, add a function that can tell us whether a
particular file has a lease set on it. In a later nfsd patch, we'll use
that to close the potential race window.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
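For illustration only, a userspace sketch of the list walk that
file_has_lease() performs. The loop in the patch stops at the first
entry that is not a lease, which suggests leases are kept at the head
of the per-inode lock list; the sketch makes the same assumption.
struct demo_lock and demo_has_lease() are hypothetical names, and the
i_lock locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file_lock. */
struct demo_lock {
	struct demo_lock *next;
	bool is_lease;
	const void *owner_file;	/* stand-in for fl_file */
};

/*
 * Walk the leading run of lease entries and report whether one of
 * them belongs to the given file.
 */
static bool demo_has_lease(const struct demo_lock *head, const void *file)
{
	const struct demo_lock *fl;

	assert(file != NULL);
	for (fl = head; fl && fl->is_lease; fl = fl->next) {
		if (fl->owner_file == file)
			return true;
	}
	return false;
}
```
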
 fs/locks.c         | 26 ++++++++++++++++++++++++++
 include/linux/fs.h |  6 ++++++
 2 files changed, 32 insertions(+)

diff --git a/fs/locks.c b/fs/locks.c
index 717fbc404e6b..005cc86927e3 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1308,6 +1308,32 @@ static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
 }
 
 /**
+ * file_has_lease - does the given file have a lease set on it?
+ * @file: struct file on which we want to check the lease
+ *
+ * Returns true if a lease is set on the given file description,
+ * false otherwise.
+ */
+bool
+file_has_lease(struct file *file)
+{
+	bool ret = false;
+	struct inode *inode = file_inode(file);
+	struct file_lock *fl;
+
+	spin_lock(&inode->i_lock);
+	for (fl = inode->i_flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
+		if (fl->fl_file == file) {
+			ret = true;
+			break;
+		}
+	}
+	spin_unlock(&inode->i_lock);
+	return ret;
+}
+EXPORT_SYMBOL(file_has_lease);
+
+/**
  *	__break_lease	-	revoke all outstanding leases on file
  *	@inode: the inode of the file to return
  *	@mode: O_RDONLY: break only write leases; O_WRONLY or O_RDWR:
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e11d60cc867b..7ae6f4869669 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -953,6 +953,7 @@ extern int vfs_test_lock(struct file *, struct file_lock *);
 extern int vfs_lock_file(struct file *, unsigned int, struct file_lock *, struct file_lock *);
 extern int vfs_cancel_lock(struct file *filp, struct file_lock *fl);
 extern int flock_lock_file_wait(struct file *filp, struct file_lock *fl);
+extern bool file_has_lease(struct file *file);
 extern int __break_lease(struct inode *inode, unsigned int flags, unsigned int type);
 extern void lease_get_mtime(struct inode *, struct timespec *time);
 extern int generic_setlease(struct file *, long, struct file_lock **);
@@ -1064,6 +1065,11 @@ static inline int flock_lock_file_wait(struct file *filp,
 	return -ENOLCK;
 }
 
+static inline bool file_has_lease(struct file *file)
+{
+	return false;
+}
+
 static inline int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 {
 	return 0;
-- 
1.9.3


* [PATCH v4 020/100] nfsd: nfs4_alloc_init_lease should take a nfs4_file arg
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (18 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 019/100] locks: add file_has_lease to prevent delegation break races Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 021/100] nfsd: Protect the nfs4_file delegation fields using the fi_lock Jeff Layton
                   ` (79 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

No need to pass the delegation pointer in here as it's only used to get
the nfs4_file pointer.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 16282f1bf2ea..9dd3571b0ec6 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3485,7 +3485,7 @@ static bool nfsd4_cb_channel_good(struct nfs4_client *clp)
 	return clp->cl_minorversion && clp->cl_cb_state == NFSD4_CB_UNKNOWN;
 }
 
-static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int flag)
+static struct file_lock *nfs4_alloc_init_lease(struct nfs4_file *fp, int flag)
 {
 	struct file_lock *fl;
 
@@ -3497,7 +3497,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int f
 	fl->fl_flags = FL_DELEG;
 	fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
 	fl->fl_end = OFFSET_MAX;
-	fl->fl_owner = (fl_owner_t)(dp->dl_file);
+	fl->fl_owner = (fl_owner_t)fp;
 	fl->fl_pid = current->tgid;
 	return fl;
 }
@@ -3508,7 +3508,7 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 	struct file_lock *fl;
 	int status;
 
-	fl = nfs4_alloc_init_lease(dp, NFS4_OPEN_DELEGATE_READ);
+	fl = nfs4_alloc_init_lease(fp, NFS4_OPEN_DELEGATE_READ);
 	if (!fl)
 		return -ENOMEM;
 	fl->fl_file = find_readable_file(fp);
-- 
1.9.3


* [PATCH v4 021/100] nfsd: Protect the nfs4_file delegation fields using the fi_lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (19 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 020/100] nfsd: nfs4_alloc_init_lease should take a nfs4_file arg Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 022/100] nfsd: Simplify stateid management Jeff Layton
                   ` (78 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Move more of the delegation fields to be protected by the fi_lock. It's
more granular than the state_lock and in later patches we'll want to
be able to rely on it in addition to the state_lock.

Also, the current code in nfs4_setlease calls vfs_setlease and relies on
the client_mutex to ensure that the lease doesn't disappear before we can
hash the delegation. With the client_mutex gone, we'll have a potential
race condition.

It's possible that the delegation could be recalled after we acquire the
lease but before we ever get around to hashing it. If that happens, then
we'd have a nfs4_file that *thinks* it has a delegation, when it
actually has none.

Attempt to acquire a delegation. If that succeeds, take the state_lock
and recheck to make sure the lease is still there. If it is, then take
the fi_lock and set up the rest of the delegation fields. This prevents
the race because the delegation break workqueue job must also take the
state_lock.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
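The "acquire, take the lock, recheck, then publish" sequence described in
the commit message can be modelled in userspace roughly as follows. This is
only a sketch under stated assumptions: model_file, set_and_hash, break_lease
and the window callback are hypothetical names, the atomic_flag spinlock is a
stand-in for the kernel's state_lock, and the fi_lock nesting is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace spinlock standing in for the kernel's state_lock. */
static atomic_flag state_lock = ATOMIC_FLAG_INIT;

static void lock_state(void)
{
	while (atomic_flag_test_and_set_explicit(&state_lock,
						 memory_order_acquire))
		;	/* spin */
}

static void unlock_state(void)
{
	atomic_flag_clear_explicit(&state_lock, memory_order_release);
}

/* Hypothetical model: one flag for the VFS lease, one for nfsd's view. */
struct model_file {
	atomic_bool lease_present;	/* cleared by a lease break */
	bool deleg_hashed;		/* protected by state_lock */
};

/* Models a lease break landing at an arbitrary time. */
void break_lease(struct model_file *f)
{
	atomic_store(&f->lease_present, false);
}

/*
 * Set the lease, then recheck it under the lock before publishing the
 * delegation. @window is a test hook that runs in the race window.
 * Returns 0 on success or -EAGAIN if the lease vanished in the meantime.
 */
int set_and_hash(struct model_file *f, void (*window)(struct model_file *))
{
	int status = -EAGAIN;

	assert(f != NULL);
	atomic_store(&f->lease_present, true);	/* vfs_setlease() analogue */
	if (window)
		window(f);
	lock_state();
	if (atomic_load(&f->lease_present)) {	/* file_has_lease() analogue */
		f->deleg_hashed = true;
		status = 0;
	}
	unlock_state();
	return status;
}
```

Calling set_and_hash(&f, break_lease) exercises the losing side of the race:
the recheck fails and the delegation is never marked as hashed.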
 fs/nfsd/nfs4state.c | 51 ++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 38 insertions(+), 13 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9dd3571b0ec6..a1ddb1e6805c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -649,6 +649,8 @@ nfs4_put_delegation(struct nfs4_delegation *dp)
 
 static void nfs4_put_deleg_lease(struct nfs4_file *fp)
 {
+	lockdep_assert_held(&state_lock);
+
 	if (!fp->fi_lease)
 		return;
 	if (atomic_dec_and_test(&fp->fi_delegees)) {
@@ -668,11 +670,10 @@ static void
 hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
 {
 	lockdep_assert_held(&state_lock);
+	lockdep_assert_held(&fp->fi_lock);
 
 	dp->dl_stid.sc_type = NFS4_DELEG_STID;
-	spin_lock(&fp->fi_lock);
 	list_add(&dp->dl_perfile, &fp->fi_delegations);
-	spin_unlock(&fp->fi_lock);
 	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
 }
 
@@ -684,17 +685,17 @@ unhash_delegation(struct nfs4_delegation *dp)
 
 	spin_lock(&state_lock);
 	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
+	spin_lock(&fp->fi_lock);
 	list_del_init(&dp->dl_perclnt);
 	list_del_init(&dp->dl_recall_lru);
-	spin_lock(&fp->fi_lock);
 	list_del_init(&dp->dl_perfile);
 	spin_unlock(&fp->fi_lock);
-	spin_unlock(&state_lock);
 	if (fp) {
 		nfs4_put_deleg_lease(fp);
-		put_nfs4_file(fp);
 		dp->dl_file = NULL;
 	}
+	spin_unlock(&state_lock);
+	put_nfs4_file(fp);
 }
 
 static void destroy_revoked_delegation(struct nfs4_delegation *dp)
@@ -3506,7 +3507,7 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 {
 	struct nfs4_file *fp = dp->dl_file;
 	struct file_lock *fl;
-	int status;
+	int status = 0;
 
 	fl = nfs4_alloc_init_lease(fp, NFS4_OPEN_DELEGATE_READ);
 	if (!fl)
@@ -3514,15 +3515,31 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 	fl->fl_file = find_readable_file(fp);
 	status = vfs_setlease(fl->fl_file, fl->fl_type, &fl);
 	if (status)
-		goto out_free;
+		goto out_fput;
+	spin_lock(&state_lock);
+	/* Did the lease get broken before we took the lock? */
+	status = -EAGAIN;
+	if (!file_has_lease(fl->fl_file))
+		goto out_unlock;
+	spin_lock(&fp->fi_lock);
+	/* Race breaker */
+	if (fp->fi_lease) {
+		status = 0;
+		atomic_inc(&fp->fi_delegees);
+		hash_delegation_locked(dp, fp);
+		spin_unlock(&fp->fi_lock);
+		goto out_unlock;
+	}
 	fp->fi_lease = fl;
 	fp->fi_deleg_file = fl->fl_file;
 	atomic_set(&fp->fi_delegees, 1);
-	spin_lock(&state_lock);
 	hash_delegation_locked(dp, fp);
+	spin_unlock(&fp->fi_lock);
 	spin_unlock(&state_lock);
 	return 0;
-out_free:
+out_unlock:
+	spin_unlock(&state_lock);
+out_fput:
 	if (fl->fl_file)
 		fput(fl->fl_file);
 	locks_free_lock(fl);
@@ -3531,19 +3548,27 @@ out_free:
 
 static int nfs4_set_delegation(struct nfs4_delegation *dp, struct nfs4_file *fp)
 {
+	int status = 0;
+
 	if (fp->fi_had_conflict)
 		return -EAGAIN;
 	get_nfs4_file(fp);
+	spin_lock(&state_lock);
+	spin_lock(&fp->fi_lock);
 	dp->dl_file = fp;
-	if (!fp->fi_lease)
+	if (!fp->fi_lease) {
+		spin_unlock(&fp->fi_lock);
+		spin_unlock(&state_lock);
 		return nfs4_setlease(dp);
-	spin_lock(&state_lock);
+	}
 	atomic_inc(&fp->fi_delegees);
 	if (fp->fi_had_conflict) {
-		spin_unlock(&state_lock);
-		return -EAGAIN;
+		status = -EAGAIN;
+		goto out_unlock;
 	}
 	hash_delegation_locked(dp, fp);
+out_unlock:
+	spin_unlock(&fp->fi_lock);
 	spin_unlock(&state_lock);
-	return 0;
+	return status;
 }
-- 
1.9.3


* [PATCH v4 022/100] nfsd: Simplify stateid management
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (20 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 021/100] nfsd: Protect the nfs4_file delegation fields using the fi_lock Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 023/100] nfsd: Fix delegation revocation Jeff Layton
                   ` (77 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Don't allow stateids to clear the open file pointer until they are
being destroyed. Also, move to kzalloc and get rid of explicit
zeroing of fields.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a1ddb1e6805c..7948d8cd75e0 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -479,7 +479,7 @@ kmem_cache *slab)
 	struct nfs4_stid *stid;
 	int new_id;
 
-	stid = kmem_cache_alloc(slab, GFP_KERNEL);
+	stid = kmem_cache_zalloc(slab, GFP_KERNEL);
 	if (!stid)
 		return NULL;
 
@@ -491,11 +491,9 @@ kmem_cache *slab)
 	if (new_id < 0)
 		goto out_free;
 	stid->sc_client = cl;
-	stid->sc_type = 0;
 	stid->sc_stateid.si_opaque.so_id = new_id;
 	stid->sc_stateid.si_opaque.so_clid = cl->cl_clientid;
 	/* Will be incremented before return to client: */
-	stid->sc_stateid.si_generation = 0;
 	atomic_set(&stid->sc_count, 1);
 
 	/*
@@ -616,10 +614,8 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
 	INIT_LIST_HEAD(&dp->dl_perfile);
 	INIT_LIST_HEAD(&dp->dl_perclnt);
 	INIT_LIST_HEAD(&dp->dl_recall_lru);
-	dp->dl_file = NULL;
 	dp->dl_type = NFS4_OPEN_DELEGATE_READ;
 	fh_copy_shallow(&dp->dl_fh, &current_fh->fh_handle);
-	dp->dl_time = 0;
 	return dp;
 }
 
@@ -642,6 +638,8 @@ void
 nfs4_put_delegation(struct nfs4_delegation *dp)
 {
 	if (atomic_dec_and_test(&dp->dl_stid.sc_count)) {
+		if (dp->dl_file)
+			put_nfs4_file(dp->dl_file);
 		nfs4_free_stid(deleg_slab, &dp->dl_stid);
 		num_delegations--;
 	}
@@ -690,12 +688,9 @@ unhash_delegation(struct nfs4_delegation *dp)
 	list_del_init(&dp->dl_recall_lru);
 	list_del_init(&dp->dl_perfile);
 	spin_unlock(&fp->fi_lock);
-	if (fp) {
+	if (fp)
 		nfs4_put_deleg_lease(fp);
-		dp->dl_file = NULL;
-	}
 	spin_unlock(&state_lock);
-	put_nfs4_file(fp);
 }
 
 static void destroy_revoked_delegation(struct nfs4_delegation *dp)
@@ -889,12 +884,12 @@ static void unhash_generic_stateid(struct nfs4_ol_stateid *stp)
 static void close_generic_stateid(struct nfs4_ol_stateid *stp)
 {
 	release_all_access(stp);
-	put_nfs4_file(stp->st_file);
-	stp->st_file = NULL;
 }
 
 static void free_generic_stateid(struct nfs4_ol_stateid *stp)
 {
+	if (stp->st_file)
+		put_nfs4_file(stp->st_file);
 	nfs4_free_stid(stateid_slab, &stp->st_stid);
 }
 
@@ -4456,6 +4451,10 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 		if (list_empty(&oo->oo_owner.so_stateids))
 			release_openowner(oo);
 	} else {
+		if (s->st_file) {
+			put_nfs4_file(s->st_file);
+			s->st_file = NULL;
+		}
 		oo->oo_last_closed_stid = s;
 		/*
 		 * In the 4.0 case we need to keep the owners around a
-- 
1.9.3


* [PATCH v4 023/100] nfsd: Fix delegation revocation
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (21 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 022/100] nfsd: Simplify stateid management Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 024/100] nfsd: Add reference counting to the lock and open stateids Jeff Layton
                   ` (76 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Ensure that the delegations cannot be found by the laundromat etc once
we add them to the various 'revoke' lists.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
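The pattern this patch moves to (unhash while the lock is held, collect the
victims on a private list, destroy them after dropping the lock) might look
like this in a reduced userspace form. All names here (deleg, hashed_list,
new_deleg, reap_expired) are hypothetical, a singly linked list replaces the
kernel's list_head, and the atomic_flag spinlock stands in for state_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal userspace spinlock standing in for the kernel's state_lock. */
static atomic_flag state_lock = ATOMIC_FLAG_INIT;

static void lock_state(void)
{
	while (atomic_flag_test_and_set_explicit(&state_lock,
						 memory_order_acquire))
		;	/* spin */
}

static void unlock_state(void)
{
	atomic_flag_clear_explicit(&state_lock, memory_order_release);
}

struct deleg {
	struct deleg *next;
	bool expired;
	bool hashed;		/* findable by lookups while true */
};

static struct deleg *hashed_list;	/* protected by state_lock */

/* Allocate a delegation and hash it; returns NULL on allocation failure. */
struct deleg *new_deleg(bool expired)
{
	struct deleg *dp = calloc(1, sizeof(*dp));

	if (!dp)
		return NULL;
	dp->expired = expired;
	lock_state();
	dp->hashed = true;
	dp->next = hashed_list;
	hashed_list = dp;
	unlock_state();
	return dp;
}

/* Unhash expired entries under the lock, free them outside it. */
int reap_expired(void)
{
	struct deleg *reaplist = NULL, **pp, *dp;
	int reaped = 0;

	lock_state();
	pp = &hashed_list;
	while ((dp = *pp) != NULL) {
		if (!dp->expired) {
			pp = &dp->next;
			continue;
		}
		*pp = dp->next;		/* no longer findable from here on */
		dp->hashed = false;
		dp->next = reaplist;	/* private list, needs no lock */
		reaplist = dp;
	}
	unlock_state();

	while ((dp = reaplist) != NULL) {
		reaplist = dp->next;
		assert(!dp->hashed);
		free(dp);
		reaped++;
	}
	return reaped;
}
```

Because an entry leaves the shared list before the lock is dropped, nothing
else can pick it up while it waits on the private list to be freed.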
 fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7948d8cd75e0..4669328003f3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -675,13 +675,13 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
 	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
 }
 
-/* Called under the state lock. */
 static void
-unhash_delegation(struct nfs4_delegation *dp)
+unhash_delegation_locked(struct nfs4_delegation *dp)
 {
 	struct nfs4_file *fp = dp->dl_file;
 
-	spin_lock(&state_lock);
+	lockdep_assert_held(&state_lock);
+
 	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
 	spin_lock(&fp->fi_lock);
 	list_del_init(&dp->dl_perclnt);
@@ -690,18 +690,19 @@ unhash_delegation(struct nfs4_delegation *dp)
 	spin_unlock(&fp->fi_lock);
 	if (fp)
 		nfs4_put_deleg_lease(fp);
-	spin_unlock(&state_lock);
 }
 
-static void destroy_revoked_delegation(struct nfs4_delegation *dp)
+static void unhash_and_destroy_delegation(struct nfs4_delegation *dp)
 {
-	list_del_init(&dp->dl_recall_lru);
+	spin_lock(&state_lock);
+	unhash_delegation_locked(dp);
+	spin_unlock(&state_lock);
 	nfs4_put_delegation(dp);
 }
 
-static void destroy_delegation(struct nfs4_delegation *dp)
+static void destroy_revoked_delegation(struct nfs4_delegation *dp)
 {
-	unhash_delegation(dp);
+	list_del_init(&dp->dl_recall_lru);
 	nfs4_put_delegation(dp);
 }
 
@@ -710,11 +711,10 @@ static void revoke_delegation(struct nfs4_delegation *dp)
 	struct nfs4_client *clp = dp->dl_stid.sc_client;
 
 	if (clp->cl_minorversion == 0)
-		destroy_delegation(dp);
+		destroy_revoked_delegation(dp);
 	else {
-		unhash_delegation(dp);
 		dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
-		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
+		list_move(&dp->dl_recall_lru, &clp->cl_revoked);
 	}
 }
 
@@ -1459,15 +1459,16 @@ destroy_client(struct nfs4_client *clp)
 	spin_lock(&state_lock);
 	while (!list_empty(&clp->cl_delegations)) {
 		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
-		list_del_init(&dp->dl_perclnt);
+		unhash_delegation_locked(dp);
 		/* Ensure that deleg break won't try to requeue it */
 		++dp->dl_time;
-		list_move(&dp->dl_recall_lru, &reaplist);
+		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
 	while (!list_empty(&reaplist)) {
 		dp = list_entry(reaplist.next, struct nfs4_delegation, dl_recall_lru);
-		destroy_delegation(dp);
+		list_del_init(&dp->dl_recall_lru);
+		nfs4_put_delegation(dp);
 	}
 	list_splice_init(&clp->cl_revoked, &reaplist);
 	while (!list_empty(&reaplist)) {
@@ -3652,7 +3653,7 @@ nfs4_open_delegation(struct net *net, struct svc_fh *fh,
 	open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
 	return;
 out_free:
-	destroy_delegation(dp);
+	nfs4_put_delegation(dp);
 out_no_deleg:
 	open->op_delegate_type = NFS4_OPEN_DELEGATE_NONE;
 	if (open->op_claim_type == NFS4_OPEN_CLAIM_PREVIOUS &&
@@ -3891,7 +3892,8 @@ nfs4_laundromat(struct nfsd_net *nn)
 			new_timeo = min(new_timeo, t);
 			break;
 		}
-		list_move(&dp->dl_recall_lru, &reaplist);
+		unhash_delegation_locked(dp);
+		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
 	list_for_each_safe(pos, next, &reaplist) {
@@ -4519,7 +4521,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	if (status)
 		goto out;
 
-	destroy_delegation(dp);
+	unhash_and_destroy_delegation(dp);
 out:
 	nfs4_unlock_state();
 
@@ -5360,7 +5362,8 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
 			 * don't monkey with it now that we are.
 			 */
 			++dp->dl_time;
-			list_move(&dp->dl_recall_lru, victims);
+			unhash_delegation_locked(dp);
+			list_add(&dp->dl_recall_lru, victims);
 		}
 		if (++count == max)
 			break;
@@ -5615,12 +5618,14 @@ nfs4_state_shutdown_net(struct net *net)
 	spin_lock(&state_lock);
 	list_for_each_safe(pos, next, &nn->del_recall_lru) {
 		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
-		list_move(&dp->dl_recall_lru, &reaplist);
+		unhash_delegation_locked(dp);
+		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
 	list_for_each_safe(pos, next, &reaplist) {
 		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
-		destroy_delegation(dp);
+		list_del_init(&dp->dl_recall_lru);
+		nfs4_put_delegation(dp);
 	}
 
 	nfsd4_client_tracking_exit(net);
-- 
1.9.3


* [PATCH v4 024/100] nfsd: Add reference counting to the lock and open stateids
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (22 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 023/100] nfsd: Fix delegation revocation Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 025/100] nfsd: Add a struct nfs4_file field to struct nfs4_stid Jeff Layton
                   ` (75 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

When we remove the client_mutex, we'll need to be able to ensure that
these objects aren't destroyed while we're not holding locks.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
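For readers less familiar with the get/put idiom, a minimal userspace
analogue using C11 atomics is sketched here. The names stateid_alloc,
stateid_get and stateid_put, and the freed_flag test hook, are hypothetical;
atomic_fetch_sub() returning 1 plays the role of atomic_dec_and_test()
returning true.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct stateid {
	atomic_int sc_count;	/* number of outstanding references */
	bool *freed_flag;	/* test hook: set when the object is freed */
};

/* Allocate a stateid holding one reference for the caller. */
struct stateid *stateid_alloc(bool *freed_flag)
{
	struct stateid *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	atomic_init(&s->sc_count, 1);
	s->freed_flag = freed_flag;
	return s;
}

/* Take an extra reference so the object survives after locks are dropped. */
void stateid_get(struct stateid *s)
{
	atomic_fetch_add(&s->sc_count, 1);
}

/* Drop a reference; returns true if it was the last and @s was freed. */
bool stateid_put(struct stateid *s)
{
	int old = atomic_fetch_sub(&s->sc_count, 1);

	assert(old > 0);	/* catch refcount underflow */
	if (old != 1)
		return false;
	if (s->freed_flag)
		*s->freed_flag = true;
	free(s);
	return true;
}
```

Only the caller that drops the final reference frees the object, so a holder
of a reference can keep using it without any lock held.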
 fs/nfsd/nfs4state.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4669328003f3..251c735eedc0 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -886,8 +886,10 @@ static void close_generic_stateid(struct nfs4_ol_stateid *stp)
 	release_all_access(stp);
 }
 
-static void free_generic_stateid(struct nfs4_ol_stateid *stp)
+static void put_generic_stateid(struct nfs4_ol_stateid *stp)
 {
+	if (!atomic_dec_and_test(&stp->st_stid.sc_count))
+		return;
 	if (stp->st_file)
 		put_nfs4_file(stp->st_file);
 	nfs4_free_stid(stateid_slab, &stp->st_stid);
@@ -904,7 +906,7 @@ static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
 	if (file)
 		filp_close(file, (fl_owner_t)lockowner(stp->st_stateowner));
 	close_generic_stateid(stp);
-	free_generic_stateid(stp);
+	put_generic_stateid(stp);
 }
 
 static void unhash_lockowner(struct nfs4_lockowner *lo)
@@ -967,7 +969,7 @@ static void unhash_open_stateid(struct nfs4_ol_stateid *stp)
 static void release_open_stateid(struct nfs4_ol_stateid *stp)
 {
 	unhash_open_stateid(stp);
-	free_generic_stateid(stp);
+	put_generic_stateid(stp);
 }
 
 static void unhash_openowner(struct nfs4_openowner *oo)
@@ -988,7 +990,7 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
 	struct nfs4_ol_stateid *s = oo->oo_last_closed_stid;
 
 	if (s) {
-		free_generic_stateid(s);
+		put_generic_stateid(s);
 		oo->oo_last_closed_stid = NULL;
 	}
 }
@@ -3798,7 +3800,7 @@ void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status)
 	if (open->op_file)
 		nfsd4_free_file(open->op_file);
 	if (open->op_stp)
-		free_generic_stateid(open->op_stp);
+		put_generic_stateid(open->op_stp);
 }
 
 __be32
@@ -4449,9 +4451,9 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 	unhash_open_stateid(s);
 
 	if (clp->cl_minorversion) {
-		free_generic_stateid(s);
 		if (list_empty(&oo->oo_owner.so_stateids))
 			release_openowner(oo);
+		put_generic_stateid(s);
 	} else {
 		if (s->st_file) {
 			put_nfs4_file(s->st_file);
-- 
1.9.3


* [PATCH v4 025/100] nfsd: Add a struct nfs4_file field to struct nfs4_stid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (23 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 024/100] nfsd: Add reference counting to the lock and open stateids Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 026/100] nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file Jeff Layton
                   ` (74 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

All stateids are associated with a nfs4_file. Let's consolidate.
Start by replacing delegation->dl_file with dl_stid.sc_file.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
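The consolidation can be pictured with a small userspace model: the file
pointer lives in the common header embedded in every stateid type, so one
release helper serves all of them. The types and helpers below (nfile, stid,
stid_set_file, stid_release) are hypothetical simplifications, and the plain
int refcount assumes single-threaded use.

```c
#include <assert.h>
#include <stddef.h>

struct nfile {
	int refs;		/* plain counter; sketch is single-threaded */
};

/* Common header embedded in every stateid type. */
struct stid {
	struct nfile *sc_file;
};

struct deleg {
	struct stid dl_stid;
	int dl_type;
};

struct ol_stateid {
	struct stid st_stid;
	int st_access;
};

/* Attach @f to the common header, taking a reference on it. */
void stid_set_file(struct stid *s, struct nfile *f)
{
	f->refs++;
	s->sc_file = f;
}

/* Single release path shared by delegations and open/lock stateids. */
void stid_release(struct stid *s)
{
	if (!s->sc_file)
		return;
	assert(s->sc_file->refs > 0);
	s->sc_file->refs--;
	s->sc_file = NULL;
}
```

With the pointer in one place, the per-type free paths no longer need their
own copies of the "drop the file reference" logic.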
 fs/nfsd/nfs4state.c | 16 ++++++++--------
 fs/nfsd/state.h     |  2 +-
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 251c735eedc0..7ed76666d520 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -631,6 +631,8 @@ static void remove_stid(struct nfs4_stid *s)
 static void nfs4_free_stid(struct kmem_cache *slab, struct nfs4_stid *s)
 {
 	remove_stid(s);
+	if (s->sc_file)
+		put_nfs4_file(s->sc_file);
 	kmem_cache_free(slab, s);
 }
 
@@ -638,8 +640,6 @@ void
 nfs4_put_delegation(struct nfs4_delegation *dp)
 {
 	if (atomic_dec_and_test(&dp->dl_stid.sc_count)) {
-		if (dp->dl_file)
-			put_nfs4_file(dp->dl_file);
 		nfs4_free_stid(deleg_slab, &dp->dl_stid);
 		num_delegations--;
 	}
@@ -678,7 +678,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
 static void
 unhash_delegation_locked(struct nfs4_delegation *dp)
 {
-	struct nfs4_file *fp = dp->dl_file;
+	struct nfs4_file *fp = dp->dl_stid.sc_file;
 
 	lockdep_assert_held(&state_lock);
 
@@ -3102,8 +3102,8 @@ nfs4_share_conflict(struct svc_fh *current_fh, unsigned int deny_type)
 
 void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp)
 {
-	struct nfs4_client *clp = dp->dl_stid.sc_client;
-	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+	struct nfsd_net *nn = net_generic(dp->dl_stid.sc_client->net,
+					  nfsd_net_id);
 
 	/*
 	 * We can't do this in nfsd_break_deleg_cb because it is
@@ -3503,7 +3503,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_file *fp, int flag)
 
 static int nfs4_setlease(struct nfs4_delegation *dp)
 {
-	struct nfs4_file *fp = dp->dl_file;
+	struct nfs4_file *fp = dp->dl_stid.sc_file;
 	struct file_lock *fl;
 	int status = 0;
 
@@ -3553,7 +3553,7 @@ static int nfs4_set_delegation(struct nfs4_delegation *dp, struct nfs4_file *fp)
 	get_nfs4_file(fp);
 	spin_lock(&state_lock);
 	spin_lock(&fp->fi_lock);
-	dp->dl_file = fp;
+	dp->dl_stid.sc_file = fp;
 	if (!fp->fi_lease) {
 		spin_unlock(&fp->fi_lock);
 		spin_unlock(&state_lock);
@@ -4143,7 +4143,7 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
 		if (status)
 			goto out;
 		if (filpp) {
-			file = dp->dl_file->fi_deleg_file;
+			file = dp->dl_stid.sc_file->fi_deleg_file;
 			if (!file) {
 				WARN_ON_ONCE(1);
 				status = nfserr_serverfault;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 27adbebfd168..fa266d2612b7 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -85,6 +85,7 @@ struct nfs4_stid {
 	unsigned char sc_type;
 	stateid_t sc_stateid;
 	struct nfs4_client *sc_client;
+	struct nfs4_file *sc_file;
 };
 
 struct nfs4_delegation {
@@ -92,7 +93,6 @@ struct nfs4_delegation {
 	struct list_head	dl_perfile;
 	struct list_head	dl_perclnt;
 	struct list_head	dl_recall_lru;  /* delegation recalled */
-	struct nfs4_file	*dl_file;
 	u32			dl_type;
 	time_t			dl_time;
 /* For recall: */
-- 
1.9.3


* [PATCH v4 026/100] nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (24 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 025/100] nfsd: Add a struct nfs4_file field to struct nfs4_stid Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 027/100] nfsd: Ensure atomicity of stateid destruction and idr tree removal Jeff Layton
                   ` (73 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 48 ++++++++++++++++++++++++------------------------
 fs/nfsd/state.h     |  1 -
 2 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7ed76666d520..352cd151eff3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -851,7 +851,7 @@ reset_union_bmap_deny(u32 deny, struct nfs4_ol_stateid *stp)
 
 	/* Recalculate per-file deny mode if there was a change */
 	if (change)
-		recalculate_deny_mode(stp->st_file);
+		recalculate_deny_mode(stp->st_stid.sc_file);
 }
 
 /* release all access and file references for a given stateid */
@@ -859,21 +859,21 @@ static void
 release_all_access(struct nfs4_ol_stateid *stp)
 {
 	int i;
-	struct nfs4_file *fp = stp->st_file;
+	struct nfs4_file *fp = stp->st_stid.sc_file;
 
 	if (fp && stp->st_deny_bmap != 0)
 		recalculate_deny_mode(fp);
 
 	for (i = 1; i < 4; i++) {
 		if (test_access(i, stp))
-			nfs4_file_put_access(stp->st_file, i);
+			nfs4_file_put_access(stp->st_stid.sc_file, i);
 		clear_access(i, stp);
 	}
 }
 
 static void unhash_generic_stateid(struct nfs4_ol_stateid *stp)
 {
-	struct nfs4_file *fp = stp->st_file;
+	struct nfs4_file *fp = stp->st_stid.sc_file;
 
 	spin_lock(&fp->fi_lock);
 	list_del(&stp->st_perfile);
@@ -890,8 +890,6 @@ static void put_generic_stateid(struct nfs4_ol_stateid *stp)
 {
 	if (!atomic_dec_and_test(&stp->st_stid.sc_count))
 		return;
-	if (stp->st_file)
-		put_nfs4_file(stp->st_file);
 	nfs4_free_stid(stateid_slab, &stp->st_stid);
 }
 
@@ -902,7 +900,7 @@ static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
 	list_del(&stp->st_locks);
 	unhash_generic_stateid(stp);
 	unhash_stid(&stp->st_stid);
-	file = find_any_file(stp->st_file);
+	file = find_any_file(stp->st_stid.sc_file);
 	if (file)
 		filp_close(file, (fl_owner_t)lockowner(stp->st_stateowner));
 	close_generic_stateid(stp);
@@ -2978,7 +2976,7 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
 	list_add(&stp->st_perstateowner, &oo->oo_owner.so_stateids);
 	stp->st_stateowner = &oo->oo_owner;
 	get_nfs4_file(fp);
-	stp->st_file = fp;
+	stp->st_stid.sc_file = fp;
 	stp->st_access_bmap = 0;
 	stp->st_deny_bmap = 0;
 	set_access(open->op_share_access, stp);
@@ -3644,7 +3642,7 @@ nfs4_open_delegation(struct net *net, struct svc_fh *fh,
 	dp = alloc_init_deleg(oo->oo_owner.so_client, stp, fh);
 	if (dp == NULL)
 		goto out_no_deleg;
-	status = nfs4_set_delegation(dp, stp->st_file);
+	status = nfs4_set_delegation(dp, stp->st_stid.sc_file);
 	if (status)
 		goto out_free;
 
@@ -3935,7 +3933,7 @@ laundromat_main(struct work_struct *laundry)
 
 static inline __be32 nfs4_check_fh(struct svc_fh *fhp, struct nfs4_ol_stateid *stp)
 {
-	if (fhp->fh_dentry->d_inode != stp->st_file->fi_inode)
+	if (fhp->fh_dentry->d_inode != stp->st_stid.sc_file->fi_inode)
 		return nfserr_bad_stateid;
 	return nfs_ok;
 }
@@ -4165,10 +4163,12 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
 		if (status)
 			goto out;
 		if (filpp) {
+			struct nfs4_file *fp = stp->st_stid.sc_file;
+
 			if (flags & RD_STATE)
-				file = find_readable_file(stp->st_file);
+				file = find_readable_file(fp);
 			else
-				file = find_writeable_file(stp->st_file);
+				file = find_writeable_file(fp);
 		}
 		break;
 	default:
@@ -4188,7 +4188,7 @@ nfsd4_free_lock_stateid(struct nfs4_ol_stateid *stp)
 {
 	struct nfs4_lockowner *lo = lockowner(stp->st_stateowner);
 
-	if (check_for_locks(stp->st_file, lo))
+	if (check_for_locks(stp->st_stid.sc_file, lo))
 		return nfserr_locks_held;
 	release_lockowner_if_empty(lo);
 	return nfs_ok;
@@ -4374,7 +4374,7 @@ static inline void nfs4_stateid_downgrade_bit(struct nfs4_ol_stateid *stp, u32 a
 {
 	if (!test_access(access, stp))
 		return;
-	nfs4_file_put_access(stp->st_file, access);
+	nfs4_file_put_access(stp->st_stid.sc_file, access);
 	clear_access(access, stp);
 }
 
@@ -4455,9 +4455,9 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 			release_openowner(oo);
 		put_generic_stateid(s);
 	} else {
-		if (s->st_file) {
-			put_nfs4_file(s->st_file);
-			s->st_file = NULL;
+		if (s->st_stid.sc_file) {
+			put_nfs4_file(s->st_stid.sc_file);
+			s->st_stid.sc_file = NULL;
 		}
 		oo->oo_last_closed_stid = s;
 		/*
@@ -4659,7 +4659,7 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp, struct
 	list_add(&stp->st_perstateowner, &lo->lo_owner.so_stateids);
 	stp->st_stateowner = &lo->lo_owner;
 	get_nfs4_file(fp);
-	stp->st_file = fp;
+	stp->st_stid.sc_file = fp;
 	stp->st_access_bmap = 0;
 	stp->st_deny_bmap = open_stp->st_deny_bmap;
 	stp->st_openstp = open_stp;
@@ -4676,7 +4676,7 @@ find_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp)
 	struct nfs4_ol_stateid *lst;
 
 	list_for_each_entry(lst, &lo->lo_owner.so_stateids, st_perstateowner) {
-		if (lst->st_file == fp)
+		if (lst->st_stid.sc_file == fp)
 			return lst;
 	}
 	return NULL;
@@ -4692,7 +4692,7 @@ check_lock_length(u64 offset, u64 length)
 
 static void get_lock_access(struct nfs4_ol_stateid *lock_stp, u32 access)
 {
-	struct nfs4_file *fp = lock_stp->st_file;
+	struct nfs4_file *fp = lock_stp->st_stid.sc_file;
 
 	lockdep_assert_held(&fp->fi_lock);
 
@@ -4704,7 +4704,7 @@ static void get_lock_access(struct nfs4_ol_stateid *lock_stp, u32 access)
 
 static __be32 lookup_or_create_lock_state(struct nfsd4_compound_state *cstate, struct nfs4_ol_stateid *ost, struct nfsd4_lock *lock, struct nfs4_ol_stateid **lst, bool *new)
 {
-	struct nfs4_file *fi = ost->st_file;
+	struct nfs4_file *fi = ost->st_stid.sc_file;
 	struct nfs4_openowner *oo = openowner(ost->st_stateowner);
 	struct nfs4_client *cl = oo->oo_owner.so_client;
 	struct nfs4_lockowner *lo;
@@ -4828,7 +4828,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		goto out;
 	}
 
-	fp = lock_stp->st_file;
+	fp = lock_stp->st_stid.sc_file;
 	locks_init_lock(file_lock);
 	switch (lock->lk_type) {
 		case NFS4_READ_LT:
@@ -5027,7 +5027,7 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 					&stp, nn);
 	if (status)
 		goto out;
-	filp = find_any_file(stp->st_file);
+	filp = find_any_file(stp->st_stid.sc_file);
 	if (!filp) {
 		status = nfserr_lock_range;
 		goto out;
@@ -5140,7 +5140,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 	lo = lockowner(sop);
 	/* see if there are still any locks associated with it */
 	list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
-		if (check_for_locks(stp->st_file, lo))
+		if (check_for_locks(stp->st_stid.sc_file, lo))
 			goto out;
 	}
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index fa266d2612b7..95d04fae365a 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -408,7 +408,6 @@ struct nfs4_ol_stateid {
 	struct list_head              st_perstateowner;
 	struct list_head              st_locks;
 	struct nfs4_stateowner      * st_stateowner;
-	struct nfs4_file            * st_file;
 	unsigned char                 st_access_bmap;
 	unsigned char                 st_deny_bmap;
 	struct nfs4_ol_stateid         * st_openstp;
-- 
1.9.3


* [PATCH v4 027/100] nfsd: Ensure atomicity of stateid destruction and idr tree removal
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (25 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 026/100] nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 028/100] nfsd: Cleanup the freeing of stateids Jeff Layton
                   ` (72 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Preparation for removal of the client_mutex. Ensure that dropping the
last reference to a stateid and removing it from the idr tree are done
atomically, while holding the clp->cl_lock.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)
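
Note (not part of the patch): the put path below relies on the kernel's
atomic_dec_and_lock(), which takes the lock only when the count is about
to reach zero. A minimal userspace sketch of that idea, assuming C11
atomics and a pthread mutex in place of the spinlock, might look like
the following. The names struct client, struct stid, dec_and_lock() and
put_stid() are hypothetical stand-ins, not the nfsd types.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for nfs4_client and nfs4_stid. */
struct client {
	pthread_mutex_t lock;	/* plays the role of cl_lock */
	int nr_ids;		/* plays the role of the cl_stateids idr */
};

struct stid {
	atomic_int count;	/* plays the role of sc_count */
	struct client *clp;
	bool hashed;		/* still visible to lookups? */
};

/*
 * Drop one reference. Returns true with the lock held only if the count
 * reached zero; otherwise returns false with the lock not held.
 */
static bool dec_and_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
	int old = atomic_load(cnt);

	/* Fast path: not the last reference, so skip the lock entirely. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return false;
	}

	/* Slow path: possibly the last reference, decide under the lock. */
	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(cnt, 1) == 1)
		return true;
	pthread_mutex_unlock(lock);
	return false;
}

/* Returns true if the caller dropped the last reference and may free s. */
static bool put_stid(struct stid *s)
{
	struct client *clp = s->clp;

	if (!dec_and_lock(&s->count, &clp->lock))
		return false;
	/* Removal from the lookup structure happens under the same lock. */
	s->hashed = false;
	clp->nr_ids--;
	pthread_mutex_unlock(&clp->lock);
	return true;
}
```

Because the zero transition and the removal share one critical section,
a lookup that holds the lock should never find an object whose count has
already dropped to zero.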

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 352cd151eff3..f5219f7da276 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -619,30 +619,39 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
 	return dp;
 }
 
-static void remove_stid(struct nfs4_stid *s)
+static void remove_stid_locked(struct nfs4_client *clp, struct nfs4_stid *s)
 {
-	struct nfs4_client *clp = s->sc_client;
+	lockdep_assert_held(&clp->cl_lock);
 
-	spin_lock(&clp->cl_lock);
 	idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
-	spin_unlock(&clp->cl_lock);
 }
 
 static void nfs4_free_stid(struct kmem_cache *slab, struct nfs4_stid *s)
 {
-	remove_stid(s);
 	if (s->sc_file)
 		put_nfs4_file(s->sc_file);
 	kmem_cache_free(slab, s);
 }
 
+static bool nfs4_put_stid(struct kmem_cache *slab, struct nfs4_stid *s)
+{
+	struct nfs4_client *clp = s->sc_client;
+
+	might_lock(&clp->cl_lock);
+
+	if (!atomic_dec_and_lock(&s->sc_count, &clp->cl_lock))
+		return false;
+	remove_stid_locked(clp, s);
+	spin_unlock(&clp->cl_lock);
+	nfs4_free_stid(slab, s);
+	return true;
+}
+
 void
 nfs4_put_delegation(struct nfs4_delegation *dp)
 {
-	if (atomic_dec_and_test(&dp->dl_stid.sc_count)) {
-		nfs4_free_stid(deleg_slab, &dp->dl_stid);
+	if (nfs4_put_stid(deleg_slab, &dp->dl_stid))
 		num_delegations--;
-	}
 }
 
 static void nfs4_put_deleg_lease(struct nfs4_file *fp)
@@ -888,9 +897,7 @@ static void close_generic_stateid(struct nfs4_ol_stateid *stp)
 
 static void put_generic_stateid(struct nfs4_ol_stateid *stp)
 {
-	if (!atomic_dec_and_test(&stp->st_stid.sc_count))
-		return;
-	nfs4_free_stid(stateid_slab, &stp->st_stid);
+	nfs4_put_stid(stateid_slab, &stp->st_stid);
 }
 
 static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH v4 028/100] nfsd: Cleanup the freeing of stateids
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (26 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 027/100] nfsd: Ensure atomicity of stateid destruction and idr tree removal Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 029/100] nfsd: do filp_close in sc_free callback for lock stateids Jeff Layton
                   ` (71 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Add a ->free() callback to the struct nfs4_stid, so that we can
release a reference to the stid without caring about the contents.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 62 +++++++++++++++++++++++++++++++++++------------------
 fs/nfsd/state.h     |  2 ++
 2 files changed, 43 insertions(+), 21 deletions(-)
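
Note (not part of the patch): the ->free() callback is the usual
"destructor pointer in the base object" pattern. A minimal sketch,
assuming a plain int refcount and malloc-backed objects, is below. All
names (struct stid, struct deleg, free_fn, num_delegs) are hypothetical
stand-ins, and the non-atomic count is a single-threaded simplification.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct nfs4_stid. */
struct stid {
	int count;			/* analog of sc_count */
	void (*free_fn)(struct stid *);	/* analog of sc_free */
};

/* A derived type embeds the base object as its first member. */
struct deleg {
	struct stid base;
	int type;
};

static long num_delegs;

static void free_generic(struct stid *s)
{
	free(s);
}

static void free_deleg(struct stid *s)
{
	/* base is the first member, so s also points at the struct deleg */
	free(s);
	num_delegs--;
}

static struct stid *alloc_generic(void)
{
	struct stid *s = calloc(1, sizeof(*s));

	if (s) {
		s->count = 1;
		s->free_fn = free_generic;
	}
	return s;
}

static struct deleg *alloc_deleg(void)
{
	struct deleg *d = calloc(1, sizeof(*d));

	if (d) {
		d->base.count = 1;
		d->base.free_fn = free_deleg;
		num_delegs++;
	}
	return d;
}

/* The put path no longer needs to know which kind of object it holds. */
static void put_stid(struct stid *s)
{
	if (--s->count == 0)
		s->free_fn(s);
}
```

Type-specific bookkeeping such as the delegation counter then lives in
the callback rather than in every caller of the put function.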

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index f5219f7da276..947cd157ee82 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -71,6 +71,7 @@ static u64 current_sessionid = 1;
 
 /* forward declarations */
 static int check_for_locks(struct nfs4_file *filp, struct nfs4_lockowner *lowner);
+static void nfs4_free_generic_stateid(struct nfs4_stid *stid);
 
 /* Locking: */
 
@@ -473,8 +474,15 @@ static void nfs4_file_put_access(struct nfs4_file *fp, u32 access)
 		__nfs4_file_put_access(fp, oflag);
 }
 
-static struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct
-kmem_cache *slab)
+static void nfs4_free_stid(struct kmem_cache *slab, struct nfs4_stid *s)
+{
+	if (s->sc_file)
+		put_nfs4_file(s->sc_file);
+	kmem_cache_free(slab, s);
+}
+
+static struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl,
+					 struct kmem_cache *slab)
 {
 	struct nfs4_stid *stid;
 	int new_id;
@@ -513,7 +521,22 @@ out_free:
 
 static struct nfs4_ol_stateid * nfs4_alloc_stateid(struct nfs4_client *clp)
 {
-	return openlockstateid(nfs4_alloc_stid(clp, stateid_slab));
+	struct nfs4_stid *stid;
+	struct nfs4_ol_stateid *stp;
+
+	stid = nfs4_alloc_stid(clp, stateid_slab);
+	if (!stid)
+		return NULL;
+
+	stp = openlockstateid(stid);
+	stp->st_stid.sc_free = nfs4_free_generic_stateid;
+	return stp;
+}
+
+static void nfs4_free_deleg(struct nfs4_stid *stid)
+{
+	nfs4_free_stid(deleg_slab, stid);
+	num_delegations--;
 }
 
 /*
@@ -604,6 +627,8 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
 	dp = delegstateid(nfs4_alloc_stid(clp, deleg_slab));
 	if (dp == NULL)
 		return dp;
+
+	dp->dl_stid.sc_free = nfs4_free_deleg;
 	/*
 	 * delegation seqid's are never incremented.  The 4.1 special
 	 * meaning of seqid 0 isn't meaningful, really, but let's avoid
@@ -626,32 +651,23 @@ static void remove_stid_locked(struct nfs4_client *clp, struct nfs4_stid *s)
 	idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
 }
 
-static void nfs4_free_stid(struct kmem_cache *slab, struct nfs4_stid *s)
-{
-	if (s->sc_file)
-		put_nfs4_file(s->sc_file);
-	kmem_cache_free(slab, s);
-}
-
-static bool nfs4_put_stid(struct kmem_cache *slab, struct nfs4_stid *s)
+static void nfs4_put_stid(struct nfs4_stid *s)
 {
 	struct nfs4_client *clp = s->sc_client;
 
 	might_lock(&clp->cl_lock);
 
 	if (!atomic_dec_and_lock(&s->sc_count, &clp->cl_lock))
-		return false;
+		return;
 	remove_stid_locked(clp, s);
 	spin_unlock(&clp->cl_lock);
-	nfs4_free_stid(slab, s);
-	return true;
+	s->sc_free(s);
 }
 
 void
 nfs4_put_delegation(struct nfs4_delegation *dp)
 {
-	if (nfs4_put_stid(deleg_slab, &dp->dl_stid))
-		num_delegations--;
+	nfs4_put_stid(&dp->dl_stid);
 }
 
 static void nfs4_put_deleg_lease(struct nfs4_file *fp)
@@ -890,14 +906,17 @@ static void unhash_generic_stateid(struct nfs4_ol_stateid *stp)
 	list_del(&stp->st_perstateowner);
 }
 
-static void close_generic_stateid(struct nfs4_ol_stateid *stp)
+static void nfs4_free_generic_stateid(struct nfs4_stid *stid)
 {
+	struct nfs4_ol_stateid *stp = openlockstateid(stid);
+
 	release_all_access(stp);
+	nfs4_free_stid(stateid_slab, stid);
 }
 
 static void put_generic_stateid(struct nfs4_ol_stateid *stp)
 {
-	nfs4_put_stid(stateid_slab, &stp->st_stid);
+	nfs4_put_stid(&stp->st_stid);
 }
 
 static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
@@ -910,7 +929,6 @@ static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
 	file = find_any_file(stp->st_stid.sc_file);
 	if (file)
 		filp_close(file, (fl_owner_t)lockowner(stp->st_stateowner));
-	close_generic_stateid(stp);
 	put_generic_stateid(stp);
 }
 
@@ -968,7 +986,6 @@ static void unhash_open_stateid(struct nfs4_ol_stateid *stp)
 {
 	unhash_generic_stateid(stp);
 	release_open_stateid_locks(stp);
-	close_generic_stateid(stp);
 }
 
 static void release_open_stateid(struct nfs4_ol_stateid *stp)
@@ -4469,8 +4486,11 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 		oo->oo_last_closed_stid = s;
 		/*
 		 * In the 4.0 case we need to keep the owners around a
-		 * little while to handle CLOSE replay.
+		 * little while to handle CLOSE replay. We still do need
+		 * to release any file access that is held by them
+		 * before returning however.
 		 */
+		release_all_access(s);
 		if (list_empty(&oo->oo_owner.so_stateids))
 			move_to_close_lru(oo, clp->net);
 	}
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 95d04fae365a..7188dcd45ef7 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -86,6 +86,8 @@ struct nfs4_stid {
 	stateid_t sc_stateid;
 	struct nfs4_client *sc_client;
 	struct nfs4_file *sc_file;
+
+	void (*sc_free)(struct nfs4_stid *);
 };
 
 struct nfs4_delegation {
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH v4 029/100] nfsd: do filp_close in sc_free callback for lock stateids
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (27 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 028/100] nfsd: Cleanup the freeing of stateids Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 030/100] nfsd: Add locking to protect the state owner lists Jeff Layton
                   ` (70 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Releasing locks when we unhash the stateid, rather than only when the
stateid is actually released, is problematic and will complicate some
later changes. Move the filp_close call into a new sc_free callback for
lock stateids.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 947cd157ee82..e3ca5bcd1c3f 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -914,6 +914,18 @@ static void nfs4_free_generic_stateid(struct nfs4_stid *stid)
 	nfs4_free_stid(stateid_slab, stid);
 }
 
+static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
+{
+	struct nfs4_ol_stateid *stp = openlockstateid(stid);
+	struct nfs4_lockowner *lo = lockowner(stp->st_stateowner);
+	struct file *file;
+
+	file = find_any_file(stp->st_stid.sc_file);
+	if (file)
+		filp_close(file, (fl_owner_t)lo);
+	nfs4_free_generic_stateid(stid);
+}
+
 static void put_generic_stateid(struct nfs4_ol_stateid *stp)
 {
 	nfs4_put_stid(&stp->st_stid);
@@ -921,14 +933,9 @@ static void put_generic_stateid(struct nfs4_ol_stateid *stp)
 
 static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
 {
-	struct file *file;
-
 	list_del(&stp->st_locks);
 	unhash_generic_stateid(stp);
 	unhash_stid(&stp->st_stid);
-	file = find_any_file(stp->st_stid.sc_file);
-	if (file)
-		filp_close(file, (fl_owner_t)lockowner(stp->st_stateowner));
 	put_generic_stateid(stp);
 }
 
@@ -4687,6 +4694,7 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp, struct
 	stp->st_stateowner = &lo->lo_owner;
 	get_nfs4_file(fp);
 	stp->st_stid.sc_file = fp;
+	stp->st_stid.sc_free = nfs4_free_lock_stateid;
 	stp->st_access_bmap = 0;
 	stp->st_deny_bmap = open_stp->st_deny_bmap;
 	stp->st_openstp = open_stp;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH v4 030/100] nfsd: Add locking to protect the state owner lists
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (28 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 029/100] nfsd: do filp_close in sc_free callback for lock stateids Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 031/100] nfsd: clean up races in lock stateid searching and creation Jeff Layton
                   ` (69 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Use the clp->cl_lock to protect the state owner lists. For now, there's
a lot of cl_lock thrashing, but later patches will eliminate that and
close the potential races that can occur when releasing the cl_lock
while walking the lists. Until then, the client_mutex prevents those
races.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e3ca5bcd1c3f..32538c79f9cb 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -900,6 +900,8 @@ static void unhash_generic_stateid(struct nfs4_ol_stateid *stp)
 {
 	struct nfs4_file *fp = stp->st_stid.sc_file;
 
+	lockdep_assert_held(&stp->st_stateowner->so_client->cl_lock);
+
 	spin_lock(&fp->fi_lock);
 	list_del(&stp->st_perfile);
 	spin_unlock(&fp->fi_lock);
@@ -933,9 +935,13 @@ static void put_generic_stateid(struct nfs4_ol_stateid *stp)
 
 static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
 {
+	struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
+
+	spin_lock(&oo->oo_owner.so_client->cl_lock);
 	list_del(&stp->st_locks);
 	unhash_generic_stateid(stp);
 	unhash_stid(&stp->st_stid);
+	spin_unlock(&oo->oo_owner.so_client->cl_lock);
 	put_generic_stateid(stp);
 }
 
@@ -979,20 +985,26 @@ static void release_lock_stateid(struct nfs4_ol_stateid *stp)
 }
 
 static void release_open_stateid_locks(struct nfs4_ol_stateid *open_stp)
+	__releases(&open_stp->st_stateowner->so_client->cl_lock)
+	__acquires(&open_stp->st_stateowner->so_client->cl_lock)
 {
 	struct nfs4_ol_stateid *stp;
 
 	while (!list_empty(&open_stp->st_locks)) {
 		stp = list_entry(open_stp->st_locks.next,
 				struct nfs4_ol_stateid, st_locks);
+		spin_unlock(&open_stp->st_stateowner->so_client->cl_lock);
 		release_lock_stateid(stp);
+		spin_lock(&open_stp->st_stateowner->so_client->cl_lock);
 	}
 }
 
 static void unhash_open_stateid(struct nfs4_ol_stateid *stp)
 {
+	spin_lock(&stp->st_stateowner->so_client->cl_lock);
 	unhash_generic_stateid(stp);
 	release_open_stateid_locks(stp);
+	spin_unlock(&stp->st_stateowner->so_client->cl_lock);
 }
 
 static void release_open_stateid(struct nfs4_ol_stateid *stp)
@@ -3004,7 +3016,6 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
 
 	stp->st_stid.sc_type = NFS4_OPEN_STID;
 	INIT_LIST_HEAD(&stp->st_locks);
-	list_add(&stp->st_perstateowner, &oo->oo_owner.so_stateids);
 	stp->st_stateowner = &oo->oo_owner;
 	get_nfs4_file(fp);
 	stp->st_stid.sc_file = fp;
@@ -3013,9 +3024,12 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
 	set_access(open->op_share_access, stp);
 	set_deny(open->op_share_deny, stp);
 	stp->st_openstp = NULL;
+	spin_lock(&oo->oo_owner.so_client->cl_lock);
+	list_add(&stp->st_perstateowner, &oo->oo_owner.so_stateids);
 	spin_lock(&fp->fi_lock);
 	list_add(&stp->st_perfile, &fp->fi_stateids);
 	spin_unlock(&fp->fi_lock);
+	spin_unlock(&oo->oo_owner.so_client->cl_lock);
 }
 
 static void
@@ -4683,6 +4697,7 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, str
 static struct nfs4_ol_stateid *
 alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp, struct nfs4_ol_stateid *open_stp)
 {
+	struct nfs4_openowner *oo = openowner(open_stp->st_stateowner);
 	struct nfs4_ol_stateid *stp;
 	struct nfs4_client *clp = lo->lo_owner.so_client;
 
@@ -4690,7 +4705,6 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp, struct
 	if (stp == NULL)
 		return NULL;
 	stp->st_stid.sc_type = NFS4_LOCK_STID;
-	list_add(&stp->st_perstateowner, &lo->lo_owner.so_stateids);
 	stp->st_stateowner = &lo->lo_owner;
 	get_nfs4_file(fp);
 	stp->st_stid.sc_file = fp;
@@ -4698,10 +4712,13 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp, struct
 	stp->st_access_bmap = 0;
 	stp->st_deny_bmap = open_stp->st_deny_bmap;
 	stp->st_openstp = open_stp;
+	spin_lock(&oo->oo_owner.so_client->cl_lock);
 	list_add(&stp->st_locks, &open_stp->st_locks);
+	list_add(&stp->st_perstateowner, &lo->lo_owner.so_stateids);
 	spin_lock(&fp->fi_lock);
 	list_add(&stp->st_perfile, &fp->fi_stateids);
 	spin_unlock(&fp->fi_lock);
+	spin_unlock(&oo->oo_owner.so_client->cl_lock);
 	return stp;
 }
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH v4 031/100] nfsd: clean up races in lock stateid searching and creation
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (29 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 030/100] nfsd: Add locking to protect the state owner lists Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 032/100] nfsd: Convert delegation counter to an atomic_long_t type Jeff Layton
                   ` (68 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Preparation for removal of the client_mutex.

Currently, no lock aside from the client_mutex is held when calling
find_lock_stateid. Ensure that the cl_lock is held by adding a lockdep
assertion.

Once we remove the client_mutex, it'll be possible for another thread to
race in and insert a lock state for the same file after we search but
before we insert a new one. Ensure that doesn't happen by redoing the
search after allocating a new stid that we plan to insert. If one is
found, just put the one we just allocated.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 58 +++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 41 insertions(+), 17 deletions(-)
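
Note (not part of the patch): the search, unlock, allocate, relock,
search-again sequence can be sketched in userspace as follows, assuming
a pthread mutex and a singly linked list in place of cl_lock and the
so_stateids list. struct entry, struct table and find_or_create() are
hypothetical names used only for illustration.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct entry {
	int key;
	struct entry *next;
};

struct table {
	pthread_mutex_t lock;
	struct entry *head;
};

/* Caller must hold t->lock. */
static struct entry *find_locked(struct table *t, int key)
{
	struct entry *e;

	for (e = t->head; e; e = e->next)
		if (e->key == key)
			return e;
	return NULL;
}

static struct entry *find_or_create(struct table *t, int key, bool *created)
{
	struct entry *e, *n = NULL;

	pthread_mutex_lock(&t->lock);
	e = find_locked(t, key);
	if (e == NULL) {
		/* Allocate without the lock held, as the allocation may block. */
		pthread_mutex_unlock(&t->lock);
		n = calloc(1, sizeof(*n));
		if (n == NULL)
			return NULL;

		/* Someone may have inserted the key meanwhile: search again. */
		pthread_mutex_lock(&t->lock);
		e = find_locked(t, key);
		if (e == NULL) {
			n->key = key;
			n->next = t->head;
			t->head = n;
			e = n;
			n = NULL;
			*created = true;
		}
	}
	pthread_mutex_unlock(&t->lock);
	free(n);	/* lost the race (or nothing allocated): discard ours */
	return e;
}
```

The second search is what keeps two racing callers from both inserting
an entry for the same key.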

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 32538c79f9cb..2b8ca8354b95 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4694,16 +4694,14 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, str
 	return lo;
 }
 
-static struct nfs4_ol_stateid *
-alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp, struct nfs4_ol_stateid *open_stp)
+static void
+init_lock_stateid(struct nfs4_ol_stateid *stp, struct nfs4_lockowner *lo,
+		  struct nfs4_file *fp, struct nfs4_ol_stateid *open_stp)
 {
-	struct nfs4_openowner *oo = openowner(open_stp->st_stateowner);
-	struct nfs4_ol_stateid *stp;
 	struct nfs4_client *clp = lo->lo_owner.so_client;
 
-	stp = nfs4_alloc_stateid(clp);
-	if (stp == NULL)
-		return NULL;
+	lockdep_assert_held(&clp->cl_lock);
+
 	stp->st_stid.sc_type = NFS4_LOCK_STID;
 	stp->st_stateowner = &lo->lo_owner;
 	get_nfs4_file(fp);
@@ -4712,20 +4710,20 @@ alloc_init_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp, struct
 	stp->st_access_bmap = 0;
 	stp->st_deny_bmap = open_stp->st_deny_bmap;
 	stp->st_openstp = open_stp;
-	spin_lock(&oo->oo_owner.so_client->cl_lock);
 	list_add(&stp->st_locks, &open_stp->st_locks);
 	list_add(&stp->st_perstateowner, &lo->lo_owner.so_stateids);
 	spin_lock(&fp->fi_lock);
 	list_add(&stp->st_perfile, &fp->fi_stateids);
 	spin_unlock(&fp->fi_lock);
-	spin_unlock(&oo->oo_owner.so_client->cl_lock);
-	return stp;
 }
 
 static struct nfs4_ol_stateid *
 find_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp)
 {
 	struct nfs4_ol_stateid *lst;
+	struct nfs4_client *clp = lo->lo_owner.so_client;
+
+	lockdep_assert_held(&clp->cl_lock);
 
 	list_for_each_entry(lst, &lo->lo_owner.so_stateids, st_perstateowner) {
 		if (lst->st_stid.sc_file == fp)
@@ -4734,6 +4732,36 @@ find_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp)
 	return NULL;
 }
 
+static struct nfs4_ol_stateid *
+find_or_create_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fi,
+			    struct nfs4_ol_stateid *ost, bool *new)
+{
+	struct nfs4_ol_stateid *lst, *nst = NULL;
+	struct nfs4_openowner *oo = openowner(ost->st_stateowner);
+	struct nfs4_client *clp = oo->oo_owner.so_client;
+
+	spin_lock(&clp->cl_lock);
+	lst = find_lock_stateid(lo, fi);
+	if (lst == NULL) {
+		spin_unlock(&clp->cl_lock);
+		nst = nfs4_alloc_stateid(clp);
+		if (nst == NULL)
+			return NULL;
+
+		spin_lock(&clp->cl_lock);
+		lst = find_lock_stateid(lo, fi);
+		if (likely(!lst)) {
+			init_lock_stateid(nst, lo, fi, ost);
+			lst = nst;
+			nst = NULL;
+			*new = true;
+		}
+	}
+	spin_unlock(&clp->cl_lock);
+	if (nst)
+		put_generic_stateid(nst);
+	return lst;
+}
 
 static int
 check_lock_length(u64 offset, u64 length)
@@ -4777,14 +4805,10 @@ static __be32 lookup_or_create_lock_state(struct nfsd4_compound_state *cstate, s
 			return nfserr_bad_seqid;
 	}
 
-	*lst = find_lock_stateid(lo, fi);
+	*lst = find_or_create_lock_stateid(lo, fi, ost, new);
 	if (*lst == NULL) {
-		*lst = alloc_init_lock_stateid(lo, fi, ost);
-		if (*lst == NULL) {
-			release_lockowner_if_empty(lo);
-			return nfserr_jukebox;
-		}
-		*new = true;
+		release_lockowner_if_empty(lo);
+		return nfserr_jukebox;
 	}
 	return nfs_ok;
 }
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH v4 032/100] nfsd: Convert delegation counter to an atomic_long_t type
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (30 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 031/100] nfsd: clean up races in lock stateid searching and creation Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 033/100] nfsd: Slight cleanup of find_stateid() Jeff Layton
                   ` (67 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

We want to convert to an atomic type so that we don't need to lock
across the call to alloc_init_deleg(). Then convert to a long type so
that we match the size of 'max_delegations'.

None of this is a problem today, but it will be once we remove
client_mutex protection.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)
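
Note (not part of the patch): the counter is bumped first and checked
afterwards, with every failure path undoing the increment. A minimal
userspace sketch of that reserve/undo shape, assuming C11 atomics, is
below. nr_in_use, max_in_use, reserve_slot() and release_slot() are
hypothetical names, and the limit of 2 is arbitrary.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_long nr_in_use;		/* analog of num_delegations */
static unsigned long max_in_use = 2;	/* analog of max_delegations */

/* Claim a slot optimistically; back out if that exceeded the limit. */
static bool reserve_slot(void)
{
	long n = atomic_fetch_add(&nr_in_use, 1) + 1;

	if (n < 0 || (unsigned long)n > max_in_use) {
		atomic_fetch_sub(&nr_in_use, 1);
		return false;
	}
	return true;
}

static void release_slot(void)
{
	atomic_fetch_sub(&nr_in_use, 1);
}
```

Since the increment and the check are not one atomic step, concurrent
callers may briefly push the counter past the limit, but each of them
sees its own post-increment value and backs out accordingly.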

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 2b8ca8354b95..2c59395c63df 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -342,7 +342,7 @@ static struct file *find_any_file(struct nfs4_file *f)
 	return ret;
 }
 
-static int num_delegations;
+static atomic_long_t num_delegations;
 unsigned long max_delegations;
 
 /*
@@ -536,7 +536,7 @@ static struct nfs4_ol_stateid * nfs4_alloc_stateid(struct nfs4_client *clp)
 static void nfs4_free_deleg(struct nfs4_stid *stid)
 {
 	nfs4_free_stid(deleg_slab, stid);
-	num_delegations--;
+	atomic_long_dec(&num_delegations);
 }
 
 /*
@@ -618,15 +618,17 @@ static struct nfs4_delegation *
 alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct svc_fh *current_fh)
 {
 	struct nfs4_delegation *dp;
+	long n;
 
 	dprintk("NFSD alloc_init_deleg\n");
-	if (num_delegations > max_delegations)
-		return NULL;
+	n = atomic_long_inc_return(&num_delegations);
+	if (n < 0 || n > max_delegations)
+		goto out_dec;
 	if (delegation_blocked(&current_fh->fh_handle))
-		return NULL;
+		goto out_dec;
 	dp = delegstateid(nfs4_alloc_stid(clp, deleg_slab));
 	if (dp == NULL)
-		return dp;
+		goto out_dec;
 
 	dp->dl_stid.sc_free = nfs4_free_deleg;
 	/*
@@ -635,13 +637,15 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_ol_stateid *stp, struct sv
 	 * 0 anyway just for consistency and use 1:
 	 */
 	dp->dl_stid.sc_stateid.si_generation = 1;
-	num_delegations++;
 	INIT_LIST_HEAD(&dp->dl_perfile);
 	INIT_LIST_HEAD(&dp->dl_perclnt);
 	INIT_LIST_HEAD(&dp->dl_recall_lru);
 	dp->dl_type = NFS4_OPEN_DELEGATE_READ;
 	fh_copy_shallow(&dp->dl_fh, &current_fh->fh_handle);
 	return dp;
+out_dec:
+	atomic_long_dec(&num_delegations);
+	return NULL;
 }
 
 static void remove_stid_locked(struct nfs4_client *clp, struct nfs4_stid *s)
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH v4 033/100] nfsd: Slight cleanup of find_stateid()
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (31 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 032/100] nfsd: Convert delegation counter to an atomic_long_t type Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 034/100] nfsd: ensure atomicity in nfsd4_free_stateid and nfsd4_validate_stateid Jeff Layton
                   ` (66 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

In preparation for reference counting, split out a find_stateid_locked()
helper that expects the caller to hold the cl_lock.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)
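
Note (not part of the patch): splitting a lookup into a "_locked" helper
plus thin wrappers lets a caller keep the lock across both the lookup
and a follow-up check. A minimal userspace sketch, assuming a pthread
mutex and an array in place of cl_lock and the idr, is below. struct
obj, struct client, find_locked() and find_by_type() here are
hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

struct obj {
	int id;
	int type;	/* 0 means "no longer valid" */
};

struct client {
	pthread_mutex_t lock;
	struct obj *objs;
	size_t nr;
};

/* Caller must hold cl->lock. */
static struct obj *find_locked(struct client *cl, int id)
{
	size_t i;

	for (i = 0; i < cl->nr; i++)
		if (cl->objs[i].id == id)
			return cl->objs[i].type ? &cl->objs[i] : NULL;
	return NULL;
}

/* Lookup and type check happen inside a single critical section. */
static struct obj *find_by_type(struct client *cl, int id, int typemask)
{
	struct obj *o;

	pthread_mutex_lock(&cl->lock);
	o = find_locked(cl, id);
	if (o != NULL && !(typemask & o->type))
		o = NULL;
	pthread_mutex_unlock(&cl->lock);
	return o;
}
```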

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 2c59395c63df..522f0efe47a8 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1699,28 +1699,37 @@ static void gen_confirm(struct nfs4_client *clp)
 	memcpy(clp->cl_confirm.data, verf, sizeof(clp->cl_confirm.data));
 }
 
-static struct nfs4_stid *find_stateid(struct nfs4_client *cl, stateid_t *t)
+static struct nfs4_stid *
+find_stateid_locked(struct nfs4_client *cl, stateid_t *t)
 {
 	struct nfs4_stid *ret;
 
-	spin_lock(&cl->cl_lock);
 	ret = idr_find(&cl->cl_stateids, t->si_opaque.so_id);
-	spin_unlock(&cl->cl_lock);
 	if (!ret || !ret->sc_type)
 		return NULL;
 	return ret;
 }
 
+static struct nfs4_stid *find_stateid(struct nfs4_client *cl, stateid_t *t)
+{
+	struct nfs4_stid *ret;
+
+	spin_lock(&cl->cl_lock);
+	ret = find_stateid_locked(cl, t);
+	spin_unlock(&cl->cl_lock);
+	return ret;
+}
+
 static struct nfs4_stid *find_stateid_by_type(struct nfs4_client *cl, stateid_t *t, char typemask)
 {
 	struct nfs4_stid *s;
 
-	s = find_stateid(cl, t);
-	if (!s)
-		return NULL;
-	if (typemask & s->sc_type)
-		return s;
-	return NULL;
+	spin_lock(&cl->cl_lock);
+	s = find_stateid_locked(cl, t);
+	if (s != NULL && !(typemask & s->sc_type))
+		s = NULL;
+	spin_unlock(&cl->cl_lock);
+	return s;
 }
 
 static struct nfs4_client *create_client(struct xdr_netobj name,
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH v4 034/100] nfsd: ensure atomicity in nfsd4_free_stateid and nfsd4_validate_stateid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (32 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 033/100] nfsd: Slight cleanup of find_stateid() Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 035/100] nfsd: Add reference counting to lock stateids Jeff Layton
                   ` (65 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Hold the cl_lock over the bulk of these functions. In addition to
ensuring that the stateids aren't freed prematurely, this will also help
prevent a potential race that could be introduced later. Once we remove
the client_mutex, it'll be possible for FREE_STATEID and CLOSE to race
and for both to try to put the "persistent" reference to the stateid.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 67 ++++++++++++++++++++++++++++-------------------------
 1 file changed, 35 insertions(+), 32 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 522f0efe47a8..6c2735231ca1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1710,16 +1710,6 @@ find_stateid_locked(struct nfs4_client *cl, stateid_t *t)
 	return ret;
 }
 
-static struct nfs4_stid *find_stateid(struct nfs4_client *cl, stateid_t *t)
-{
-	struct nfs4_stid *ret;
-
-	spin_lock(&cl->cl_lock);
-	ret = find_stateid_locked(cl, t);
-	spin_unlock(&cl->cl_lock);
-	return ret;
-}
-
 static struct nfs4_stid *find_stateid_by_type(struct nfs4_client *cl, stateid_t *t, char typemask)
 {
 	struct nfs4_stid *s;
@@ -4093,10 +4083,10 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
 {
 	struct nfs4_stid *s;
 	struct nfs4_ol_stateid *ols;
-	__be32 status;
+	__be32 status = nfserr_bad_stateid;
 
 	if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
-		return nfserr_bad_stateid;
+		return status;
 	/* Client debugging aid. */
 	if (!same_clid(&stateid->si_opaque.so_clid, &cl->cl_clientid)) {
 		char addr_str[INET6_ADDRSTRLEN];
@@ -4104,34 +4094,42 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
 				 sizeof(addr_str));
 		pr_warn_ratelimited("NFSD: client %s testing state ID "
 					"with incorrect client ID\n", addr_str);
-		return nfserr_bad_stateid;
+		return status;
 	}
-	s = find_stateid(cl, stateid);
+	spin_lock(&cl->cl_lock);
+	s = find_stateid_locked(cl, stateid);
 	if (!s)
-		return nfserr_bad_stateid;
+		goto out_unlock;
 	status = check_stateid_generation(stateid, &s->sc_stateid, 1);
 	if (status)
-		return status;
+		goto out_unlock;
 	switch (s->sc_type) {
 	case NFS4_DELEG_STID:
-		return nfs_ok;
+		status = nfs_ok;
+		break;
 	case NFS4_REVOKED_DELEG_STID:
-		return nfserr_deleg_revoked;
+		status = nfserr_deleg_revoked;
+		break;
 	case NFS4_OPEN_STID:
 	case NFS4_LOCK_STID:
 		ols = openlockstateid(s);
 		if (ols->st_stateowner->so_is_open_owner
 	    			&& !(openowner(ols->st_stateowner)->oo_flags
 						& NFS4_OO_CONFIRMED))
-			return nfserr_bad_stateid;
-		return nfs_ok;
+			status = nfserr_bad_stateid;
+		else
+			status = nfs_ok;
+		break;
 	default:
 		printk("unknown stateid type %x\n", s->sc_type);
 		/* Fallthrough */
 	case NFS4_CLOSED_STID:
 	case NFS4_CLOSED_DELEG_STID:
-		return nfserr_bad_stateid;
+		status = nfserr_bad_stateid;
 	}
+out_unlock:
+	spin_unlock(&cl->cl_lock);
+	return status;
 }
 
 static __be32
@@ -4282,31 +4280,36 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	__be32 ret = nfserr_bad_stateid;
 
 	nfs4_lock_state();
-	s = find_stateid(cl, stateid);
+	spin_lock(&cl->cl_lock);
+	s = find_stateid_locked(cl, stateid);
 	if (!s)
-		goto out;
+		goto out_unlock;
 	switch (s->sc_type) {
 	case NFS4_DELEG_STID:
 		ret = nfserr_locks_held;
-		goto out;
+		break;
 	case NFS4_OPEN_STID:
 	case NFS4_LOCK_STID:
 		ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
 		if (ret)
-			goto out;
-		if (s->sc_type == NFS4_LOCK_STID)
-			ret = nfsd4_free_lock_stateid(openlockstateid(s));
-		else
+			break;
+		if (s->sc_type != NFS4_LOCK_STID) {
 			ret = nfserr_locks_held;
-		break;
+			break;
+		}
+		spin_unlock(&cl->cl_lock);
+		ret = nfsd4_free_lock_stateid(openlockstateid(s));
+		goto out;
 	case NFS4_REVOKED_DELEG_STID:
+		spin_unlock(&cl->cl_lock);
 		dp = delegstateid(s);
 		destroy_revoked_delegation(dp);
 		ret = nfs_ok;
-		break;
-	default:
-		ret = nfserr_bad_stateid;
+		goto out;
+	/* Default falls through and returns nfserr_bad_stateid */
 	}
+out_unlock:
+	spin_unlock(&cl->cl_lock);
 out:
 	nfs4_unlock_state();
 	return ret;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH v4 035/100] nfsd: Add reference counting to lock stateids
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (33 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 034/100] nfsd: ensure atomicity in nfsd4_free_stateid and nfsd4_validate_stateid Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 036/100] nfsd: nfsd4_locku() must reference the lock stateid Jeff Layton
                   ` (64 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Ensure that nfsd4_lock() references the lock stateid while it is
manipulating it. Not currently necessary, but will be once the
client_mutex is removed.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
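Illustration only, not part of the patch: a minimal userspace sketch of
the pattern described above, assuming C11 atomics in place of the kernel's
atomic_t. The names stid_find(), stid_put() and do_lock() are hypothetical
stand-ins, not nfsd functions. The lookup hands back a referenced object,
and the single exit label drops that reference on every path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a stateid; only the refcount is modelled. */
struct stid {
	atomic_int sc_count;
	int sc_file;		/* stand-in for the sc_file pointer */
};

/* Drop one reference; returns 1 if this was the last one. */
static int stid_put(struct stid *s)
{
	int old = atomic_fetch_sub(&s->sc_count, 1);

	assert(old > 0);
	return old == 1;
}

/* Lookup helper: returns a referenced stateid, or NULL if none matches. */
static struct stid *stid_find(struct stid *tbl, size_t n, int file)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].sc_file == file) {
			atomic_fetch_add(&tbl[i].sc_count, 1);
			return &tbl[i];
		}
	}
	return NULL;
}

/* Shaped like nfsd4_lock(): one exit label, one put for the lookup ref. */
static int do_lock(struct stid *tbl, size_t n, int file, int fail)
{
	struct stid *stp = NULL;
	int status = -1;

	stp = stid_find(tbl, n, file);
	if (!stp)
		goto out;
	if (fail)
		goto out;	/* error after lookup: reference still dropped */
	status = 0;
out:
	if (stp)
		stid_put(stp);
	return status;
}
```

Initialising the pointer to NULL is what makes the unconditional check at
the exit label safe when the lookup was never reached or failed.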
 fs/nfsd/nfs4state.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 6c2735231ca1..619dbec92b50 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4718,6 +4718,7 @@ init_lock_stateid(struct nfs4_ol_stateid *stp, struct nfs4_lockowner *lo,
 
 	lockdep_assert_held(&clp->cl_lock);
 
+	atomic_inc(&stp->st_stid.sc_count);
 	stp->st_stid.sc_type = NFS4_LOCK_STID;
 	stp->st_stateowner = &lo->lo_owner;
 	get_nfs4_file(fp);
@@ -4742,8 +4743,10 @@ find_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp)
 	lockdep_assert_held(&clp->cl_lock);
 
 	list_for_each_entry(lst, &lo->lo_owner.so_stateids, st_perstateowner) {
-		if (lst->st_stid.sc_file == fp)
+		if (lst->st_stid.sc_file == fp) {
+			atomic_inc(&lst->st_stid.sc_count);
 			return lst;
+		}
 	}
 	return NULL;
 }
@@ -4838,7 +4841,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 {
 	struct nfs4_openowner *open_sop = NULL;
 	struct nfs4_lockowner *lock_sop = NULL;
-	struct nfs4_ol_stateid *lock_stp;
+	struct nfs4_ol_stateid *lock_stp = NULL;
 	struct nfs4_file *fp;
 	struct file *filp = NULL;
 	struct file_lock *file_lock = NULL;
@@ -4892,11 +4895,15 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 			goto out;
 		status = lookup_or_create_lock_state(cstate, open_stp, lock,
 							&lock_stp, &new_state);
-	} else
+	} else {
 		status = nfs4_preprocess_seqid_op(cstate,
 				       lock->lk_old_lock_seqid,
 				       &lock->lk_old_lock_stateid,
 				       NFS4_LOCK_STID, &lock_stp, nn);
+		/* FIXME: move into nfs4_preprocess_seqid_op */
+		if (!status)
+			atomic_inc(&lock_stp->st_stid.sc_count);
+	}
 	if (status)
 		goto out;
 	lock_sop = lockowner(lock_stp->st_stateowner);
@@ -4989,6 +4996,8 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 out:
 	if (filp)
 		fput(filp);
+	if (lock_stp)
+		put_generic_stateid(lock_stp);
 	if (status && new_state)
 		release_lock_stateid(lock_stp);
 	nfsd4_bump_seqid(cstate, status);
-- 
1.9.3



* [PATCH v4 036/100] nfsd: nfsd4_locku() must reference the lock stateid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (34 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 035/100] nfsd: Add reference counting to lock stateids Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 037/100] nfsd: Ensure that nfs4_open_delegation() references the delegation stateid Jeff Layton
                   ` (63 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Ensure that nfsd4_locku() keeps a reference to the lock stateid
until it is done working with it. Necessary step toward client_mutex
removal.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
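Illustration only, not part of the patch: a small userspace sketch of the
stacked cleanup labels this change relies on, assuming C11 atomics. The
function do_unlock() and the files_open counter are hypothetical. Resources
are released in the reverse order of acquisition, so a failure after taking
the stateid reference jumps to the label that drops exactly that reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct stid {
	atomic_int sc_count;
	int has_file;		/* hypothetical: is a file attached? */
};

static int files_open;		/* hypothetical count of held file references */

static int do_unlock(struct stid *stp, int alloc_fails)
{
	int status;

	if (!stp) {
		status = -1;
		goto out;		/* nothing acquired yet */
	}
	atomic_fetch_add(&stp->sc_count, 1);	/* acquired first */
	if (!stp->has_file) {
		status = -2;
		goto put_stateid;	/* only the stateid ref is held */
	}
	files_open++;				/* acquired second */
	if (alloc_fails) {
		status = -3;
		goto fput;		/* both are held */
	}
	status = 0;
fput:
	files_open--;
put_stateid:
	atomic_fetch_sub(&stp->sc_count, 1);
out:
	return status;
}
```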
 fs/nfsd/nfs4state.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 619dbec92b50..a3cff312d9c2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5128,10 +5128,12 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 					&stp, nn);
 	if (status)
 		goto out;
+	/* FIXME: move into nfs4_preprocess_seqid_op */
+	atomic_inc(&stp->st_stid.sc_count);
 	filp = find_any_file(stp->st_stid.sc_file);
 	if (!filp) {
 		status = nfserr_lock_range;
-		goto out;
+		goto put_stateid;
 	}
 	file_lock = locks_alloc_lock();
 	if (!file_lock) {
@@ -5161,6 +5163,8 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	memcpy(&locku->lu_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
 fput:
 	fput(filp);
+put_stateid:
+	put_generic_stateid(stp);
 out:
 	nfsd4_bump_seqid(cstate, status);
 	nfs4_unlock_state();
-- 
1.9.3



* [PATCH v4 037/100] nfsd: Ensure that nfs4_open_delegation() references the delegation stateid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (35 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 036/100] nfsd: nfsd4_locku() must reference the lock stateid Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 038/100] nfsd: nfsd4_process_open2() must reference " Jeff Layton
                   ` (62 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Ensure that nfs4_open_delegation() keeps a reference to the delegation
stateid until it is done working with it. Necessary step toward
client_mutex removal.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a3cff312d9c2..053186bd7e6c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -699,6 +699,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
 	lockdep_assert_held(&state_lock);
 	lockdep_assert_held(&fp->fi_lock);
 
+	atomic_inc(&dp->dl_stid.sc_count);
 	dp->dl_stid.sc_type = NFS4_DELEG_STID;
 	list_add(&dp->dl_perfile, &fp->fi_delegations);
 	list_add(&dp->dl_perclnt, &dp->dl_stid.sc_client->cl_delegations);
@@ -3699,6 +3700,7 @@ nfs4_open_delegation(struct net *net, struct svc_fh *fh,
 	dprintk("NFSD: delegation stateid=" STATEID_FMT "\n",
 		STATEID_VAL(&dp->dl_stid.sc_stateid));
 	open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
+	nfs4_put_delegation(dp);
 	return;
 out_free:
 	nfs4_put_delegation(dp);
-- 
1.9.3



* [PATCH v4 038/100] nfsd: nfsd4_process_open2() must reference the delegation stateid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (36 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 037/100] nfsd: Ensure that nfs4_open_delegation() references the delegation stateid Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 039/100] nfsd: nfsd4_process_open2() must reference the open stateid Jeff Layton
                   ` (61 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Ensure that nfsd4_process_open2() keeps a reference to the delegation
stateid until it is done working with it. Necessary step toward
client_mutex removal.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
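Illustration only, not part of the patch: a userspace sketch of the
out-parameter handling used in nfs4_check_deleg(), assuming C11 atomics.
The helpers deleg_get(), deleg_put() and check_deleg() are hypothetical.
The reference is held in a local variable and only published through the
out-parameter on success, so the caller never sees a pointer whose
reference has already been dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct deleg {
	atomic_int sc_count;
	int mode;		/* hypothetical: access bits the delegation grants */
};

static struct deleg *deleg_get(struct deleg *d)
{
	atomic_fetch_add(&d->sc_count, 1);
	return d;
}

static void deleg_put(struct deleg *d)
{
	int old = atomic_fetch_sub(&d->sc_count, 1);

	assert(old > 0);
}

/*
 * Returns 0 and stores a referenced delegation in *dp on success.
 * On failure *dp is left untouched and no reference is leaked.
 */
static int check_deleg(struct deleg *candidate, int wanted, struct deleg **dp)
{
	struct deleg *d;

	if (!candidate)
		return -1;
	d = deleg_get(candidate);
	if ((d->mode & wanted) != wanted) {
		deleg_put(d);	/* mode check failed: drop our reference */
		return -1;
	}
	*dp = d;		/* ownership of the reference passes to the caller */
	return 0;
}
```

The caller is then responsible for the final put, which in the patch is
the "if (dp)" check added at the end of nfsd4_process_open2().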
 fs/nfsd/nfs4state.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 053186bd7e6c..f825c055e62b 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3340,6 +3340,8 @@ static struct nfs4_delegation *find_deleg_stateid(struct nfs4_client *cl, statei
 	ret = find_stateid_by_type(cl, s, NFS4_DELEG_STID);
 	if (!ret)
 		return NULL;
+	/* FIXME: move into find_stateid_by_type */
+	atomic_inc(&ret->sc_count);
 	return delegstateid(ret);
 }
 
@@ -3355,14 +3357,18 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
 {
 	int flags;
 	__be32 status = nfserr_bad_stateid;
+	struct nfs4_delegation *deleg;
 
-	*dp = find_deleg_stateid(cl, &open->op_delegate_stateid);
-	if (*dp == NULL)
+	deleg = find_deleg_stateid(cl, &open->op_delegate_stateid);
+	if (deleg == NULL)
 		goto out;
 	flags = share_access_to_flags(open->op_share_access);
-	status = nfs4_check_delegmode(*dp, flags);
-	if (status)
-		*dp = NULL;
+	status = nfs4_check_delegmode(deleg, flags);
+	if (status) {
+		nfs4_put_delegation(deleg);
+		goto out;
+	}
+	*dp = deleg;
 out:
 	if (!nfsd4_is_deleg_cur(open))
 		return nfs_ok;
@@ -3826,6 +3832,8 @@ out:
 	if (!(open->op_openowner->oo_flags & NFS4_OO_CONFIRMED) &&
 	    !nfsd4_has_session(&resp->cstate))
 		open->op_rflags |= NFS4_OPEN_RESULT_CONFIRM;
+	if (dp)
+		nfs4_put_delegation(dp);
 
 	return status;
 }
-- 
1.9.3



* [PATCH v4 039/100] nfsd: nfsd4_process_open2() must reference the open stateid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (37 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 038/100] nfsd: nfsd4_process_open2() must reference " Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 040/100] nfsd: Prepare nfsd4_close() for open stateid referencing Jeff Layton
                   ` (60 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Ensure that nfsd4_process_open2() keeps a reference to the open
stateid until it is done working with it. Necessary step toward
client_mutex removal.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index f825c055e62b..6bd5453f2f76 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3018,6 +3018,7 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
 static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp, struct nfsd4_open *open) {
 	struct nfs4_openowner *oo = open->op_openowner;
 
+	atomic_inc(&stp->st_stid.sc_count);
 	stp->st_stid.sc_type = NFS4_OPEN_STID;
 	INIT_LIST_HEAD(&stp->st_locks);
 	stp->st_stateowner = &oo->oo_owner;
@@ -3391,6 +3392,7 @@ nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
 			continue;
 		if (local->st_stateowner == &oo->oo_owner) {
 			ret = local;
+			atomic_inc(&ret->st_stid.sc_count);
 			break;
 		}
 	}
@@ -3834,6 +3836,8 @@ out:
 		open->op_rflags |= NFS4_OPEN_RESULT_CONFIRM;
 	if (dp)
 		nfs4_put_delegation(dp);
+	if (stp)
+		put_generic_stateid(stp);
 
 	return status;
 }
-- 
1.9.3



* [PATCH v4 040/100] nfsd: Prepare nfsd4_close() for open stateid referencing
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (38 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 039/100] nfsd: nfsd4_process_open2() must reference the open stateid Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 041/100] nfsd: nfsd4_open_confirm() must reference the open stateid Jeff Layton
                   ` (59 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Prepare nfsd4_close for a future where nfs4_preprocess_seqid_op()
hands it a fully referenced open stateid. Necessary step toward
client_mutex removal.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 6bd5453f2f76..72e5be805e3e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4570,10 +4570,15 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	nfsd4_bump_seqid(cstate, status);
 	if (status)
 		goto out; 
+	/* FIXME: move into nfs4_preprocess_seqid_op */
+	atomic_inc(&stp->st_stid.sc_count);
 	update_stateid(&stp->st_stid.sc_stateid);
 	memcpy(&close->cl_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
 
 	nfsd4_close_open_stateid(stp);
+
+	/* put reference from nfs4_preprocess_seqid_op */
+	put_generic_stateid(stp);
 out:
 	nfs4_unlock_state();
 	return status;
-- 
1.9.3



* [PATCH v4 041/100] nfsd: nfsd4_open_confirm() must reference the open stateid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (39 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 040/100] nfsd: Prepare nfsd4_close() for open stateid referencing Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 042/100] nfsd: Add reference counting to nfs4_preprocess_confirmed_seqid_op Jeff Layton
                   ` (58 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Ensure that nfsd4_open_confirm() keeps a reference to the open
stateid until it is done working with it.

Necessary step toward client_mutex removal.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 72e5be805e3e..ef1fd4ddc0f7 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4427,10 +4427,12 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 					NFS4_OPEN_STID, &stp, nn);
 	if (status)
 		goto out;
+	/* FIXME: move into nfs4_preprocess_seqid_op */
+	atomic_inc(&stp->st_stid.sc_count);
 	oo = openowner(stp->st_stateowner);
 	status = nfserr_bad_stateid;
 	if (oo->oo_flags & NFS4_OO_CONFIRMED)
-		goto out;
+		goto put_stateid;
 	oo->oo_flags |= NFS4_OO_CONFIRMED;
 	update_stateid(&stp->st_stid.sc_stateid);
 	memcpy(&oc->oc_resp_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
@@ -4439,6 +4441,8 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	nfsd4_client_record_create(oo->oo_owner.so_client);
 	status = nfs_ok;
+put_stateid:
+	put_generic_stateid(stp);
 out:
 	nfsd4_bump_seqid(cstate, status);
 	nfs4_unlock_state();
-- 
1.9.3



* [PATCH v4 042/100] nfsd: Add reference counting to nfs4_preprocess_confirmed_seqid_op
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (40 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 041/100] nfsd: nfsd4_open_confirm() must reference the open stateid Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 043/100] nfsd: Migrate the stateid reference into nfs4_preprocess_seqid_op Jeff Layton
                   ` (57 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Ensure that all the callers put the open stateid after use.
Necessary step toward client_mutex removal.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index ef1fd4ddc0f7..fa6a060b0b14 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4398,6 +4398,8 @@ static __be32 nfs4_preprocess_confirmed_seqid_op(struct nfsd4_compound_state *cs
 						NFS4_OPEN_STID, stpp, nn);
 	if (status)
 		return status;
+	/* FIXME: move into nfs4_preprocess_seqid_op */
+	atomic_inc(&(*stpp)->st_stid.sc_count);
 	oo = openowner((*stpp)->st_stateowner);
 	if (!(oo->oo_flags & NFS4_OO_CONFIRMED))
 		return nfserr_bad_stateid;
@@ -4501,12 +4503,12 @@ nfsd4_open_downgrade(struct svc_rqst *rqstp,
 	if (!test_access(od->od_share_access, stp)) {
 		dprintk("NFSD: access not a subset of current bitmap: 0x%hhx, input access=%08x\n",
 			stp->st_access_bmap, od->od_share_access);
-		goto out;
+		goto put_stateid;
 	}
 	if (!test_deny(od->od_share_deny, stp)) {
 		dprintk("NFSD: deny not a subset of current bitmap: 0x%hhx, input deny=%08x\n",
 			stp->st_deny_bmap, od->od_share_deny);
-		goto out;
+		goto put_stateid;
 	}
 	nfs4_stateid_downgrade(stp, od->od_share_access);
 
@@ -4515,6 +4517,8 @@ nfsd4_open_downgrade(struct svc_rqst *rqstp,
 	update_stateid(&stp->st_stid.sc_stateid);
 	memcpy(&od->od_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
 	status = nfs_ok;
+put_stateid:
+	put_generic_stateid(stp);
 out:
 	nfsd4_bump_seqid(cstate, status);
 	nfs4_unlock_state();
@@ -4865,6 +4869,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct nfs4_openowner *open_sop = NULL;
 	struct nfs4_lockowner *lock_sop = NULL;
 	struct nfs4_ol_stateid *lock_stp = NULL;
+	struct nfs4_ol_stateid *open_stp = NULL;
 	struct nfs4_file *fp;
 	struct file *filp = NULL;
 	struct file_lock *file_lock = NULL;
@@ -4892,8 +4897,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	nfs4_lock_state();
 
 	if (lock->lk_is_new) {
-		struct nfs4_ol_stateid *open_stp = NULL;
-
 		if (nfsd4_has_session(cstate))
 			/* See rfc 5661 18.10.3: given clientid is ignored: */
 			memcpy(&lock->v.new.clientid,
@@ -5021,6 +5024,8 @@ out:
 		fput(filp);
 	if (lock_stp)
 		put_generic_stateid(lock_stp);
+	if (open_stp)
+		put_generic_stateid(open_stp);
 	if (status && new_state)
 		release_lock_stateid(lock_stp);
 	nfsd4_bump_seqid(cstate, status);
-- 
1.9.3



* [PATCH v4 043/100] nfsd: Migrate the stateid reference into nfs4_preprocess_seqid_op
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (41 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 042/100] nfsd: Add reference counting to nfs4_preprocess_confirmed_seqid_op Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 044/100] nfsd: Migrate the stateid reference into nfs4_lookup_stateid() Jeff Layton
                   ` (56 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Allow nfs4_preprocess_seqid_op to take the stateid reference, instead
of having all the callers do so.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fa6a060b0b14..437b52b81215 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4383,8 +4383,11 @@ nfs4_preprocess_seqid_op(struct nfsd4_compound_state *cstate, u32 seqid,
 	nfsd4_cstate_assign_replay(cstate, stp->st_stateowner);
 
 	status = nfs4_seqid_op_checks(cstate, stateid, seqid, stp);
-	if (!status)
+	if (!status) {
+		/* FIXME: move into find_stateid_by_type */
+		atomic_inc(&stp->st_stid.sc_count);
 		*stpp = stp;
+	}
 	return status;
 }
 
@@ -4393,16 +4396,18 @@ static __be32 nfs4_preprocess_confirmed_seqid_op(struct nfsd4_compound_state *cs
 {
 	__be32 status;
 	struct nfs4_openowner *oo;
+	struct nfs4_ol_stateid *stp;
 
 	status = nfs4_preprocess_seqid_op(cstate, seqid, stateid,
-						NFS4_OPEN_STID, stpp, nn);
+						NFS4_OPEN_STID, &stp, nn);
 	if (status)
 		return status;
-	/* FIXME: move into nfs4_preprocess_seqid_op */
-	atomic_inc(&(*stpp)->st_stid.sc_count);
-	oo = openowner((*stpp)->st_stateowner);
-	if (!(oo->oo_flags & NFS4_OO_CONFIRMED))
+	oo = openowner(stp->st_stateowner);
+	if (!(oo->oo_flags & NFS4_OO_CONFIRMED)) {
+		put_generic_stateid(stp);
 		return nfserr_bad_stateid;
+	}
+	*stpp = stp;
 	return nfs_ok;
 }
 
@@ -4429,8 +4434,6 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 					NFS4_OPEN_STID, &stp, nn);
 	if (status)
 		goto out;
-	/* FIXME: move into nfs4_preprocess_seqid_op */
-	atomic_inc(&stp->st_stid.sc_count);
 	oo = openowner(stp->st_stateowner);
 	status = nfserr_bad_stateid;
 	if (oo->oo_flags & NFS4_OO_CONFIRMED)
@@ -4578,8 +4581,6 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	nfsd4_bump_seqid(cstate, status);
 	if (status)
 		goto out; 
-	/* FIXME: move into nfs4_preprocess_seqid_op */
-	atomic_inc(&stp->st_stid.sc_count);
 	update_stateid(&stp->st_stid.sc_stateid);
 	memcpy(&close->cl_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
 
@@ -4926,9 +4927,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 				       lock->lk_old_lock_seqid,
 				       &lock->lk_old_lock_stateid,
 				       NFS4_LOCK_STID, &lock_stp, nn);
-		/* FIXME: move into nfs4_preprocess_seqid_op */
-		if (!status)
-			atomic_inc(&lock_stp->st_stid.sc_count);
 	}
 	if (status)
 		goto out;
@@ -5156,8 +5154,6 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 					&stp, nn);
 	if (status)
 		goto out;
-	/* FIXME: move into nfs4_preprocess_seqid_op */
-	atomic_inc(&stp->st_stid.sc_count);
 	filp = find_any_file(stp->st_stid.sc_file);
 	if (!filp) {
 		status = nfserr_lock_range;
-- 
1.9.3



* [PATCH v4 044/100] nfsd: Migrate the stateid reference into nfs4_lookup_stateid()
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (42 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 043/100] nfsd: Migrate the stateid reference into nfs4_preprocess_seqid_op Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 045/100] nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type() Jeff Layton
                   ` (55 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Allow nfsd4_lookup_stateid() to take the stateid reference, instead
of having all the callers do so.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 437b52b81215..6c8633870e9f 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4166,6 +4166,8 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 	*s = find_stateid_by_type(cstate->clp, stateid, typemask);
 	if (!*s)
 		return nfserr_bad_stateid;
+	/* FIXME: move into find_stateid_by_type */
+	atomic_inc(&(*s)->sc_count);
 	return nfs_ok;
 }
 
@@ -4200,7 +4202,7 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
 				NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
 				&s, nn);
 	if (status)
-		goto out;
+		goto unlock_state;
 	status = check_stateid_generation(stateid, &s->sc_stateid, nfsd4_has_session(cstate));
 	if (status)
 		goto out;
@@ -4249,6 +4251,8 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
 	if (file)
 		*filpp = file;
 out:
+	nfs4_put_stid(s);
+unlock_state:
 	nfs4_unlock_state();
 	return status;
 }
@@ -4383,11 +4387,10 @@ nfs4_preprocess_seqid_op(struct nfsd4_compound_state *cstate, u32 seqid,
 	nfsd4_cstate_assign_replay(cstate, stp->st_stateowner);
 
 	status = nfs4_seqid_op_checks(cstate, stateid, seqid, stp);
-	if (!status) {
-		/* FIXME: move into find_stateid_by_type */
-		atomic_inc(&stp->st_stid.sc_count);
+	if (!status)
 		*stpp = stp;
-	}
+	else
+		put_generic_stateid(stp);
 	return status;
 }
 
@@ -4613,9 +4616,11 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	dp = delegstateid(s);
 	status = check_stateid_generation(stateid, &dp->dl_stid.sc_stateid, nfsd4_has_session(cstate));
 	if (status)
-		goto out;
+		goto put_stateid;
 
 	unhash_and_destroy_delegation(dp);
+put_stateid:
+	nfs4_put_delegation(dp);
 out:
 	nfs4_unlock_state();
 
-- 
1.9.3



* [PATCH v4 045/100] nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type()
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (43 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 044/100] nfsd: Migrate the stateid reference into nfs4_lookup_stateid() Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 046/100] nfsd: Add reference counting to state owners Jeff Layton
                   ` (54 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Allow find_stateid_by_type() to take the stateid reference while it is
still holding the cl->cl_lock. Necessary step toward client_mutex
removal.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
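Illustration only, not part of the patch: a userspace sketch of why the
reference is taken before the lock is dropped, assuming C11 atomics with
an atomic_flag as a toy spinlock. The names lock_client(), unlock_client()
and find_by_type() are hypothetical. Bumping the count inside the critical
section means no other holder of the lock can remove and free the object
between the lookup and the increment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct stid {
	atomic_int sc_count;
	unsigned int sc_type;	/* one bit per stateid type */
	int id;
};

struct client {
	atomic_flag cl_lock;	/* toy spinlock standing in for cl->cl_lock */
	struct stid *stids;
	size_t nr;
};

static void lock_client(struct client *cl)
{
	while (atomic_flag_test_and_set_explicit(&cl->cl_lock,
						 memory_order_acquire))
		;		/* spin until the flag is released */
}

static void unlock_client(struct client *cl)
{
	atomic_flag_clear_explicit(&cl->cl_lock, memory_order_release);
}

/* Returns a referenced stateid of an acceptable type, or NULL. */
static struct stid *find_by_type(struct client *cl, int id,
				 unsigned int typemask)
{
	struct stid *s = NULL;

	lock_client(cl);
	for (size_t i = 0; i < cl->nr; i++) {
		if (cl->stids[i].id == id) {
			s = &cl->stids[i];
			break;
		}
	}
	if (s != NULL) {
		if (typemask & s->sc_type)
			atomic_fetch_add(&s->sc_count, 1); /* still locked */
		else
			s = NULL;	/* wrong type: no reference taken */
	}
	unlock_client(cl);
	return s;
}
```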
 fs/nfsd/nfs4state.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 6c8633870e9f..9ac1cd37e233 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1717,8 +1717,12 @@ static struct nfs4_stid *find_stateid_by_type(struct nfs4_client *cl, stateid_t
 
 	spin_lock(&cl->cl_lock);
 	s = find_stateid_locked(cl, t);
-	if (s != NULL && !(typemask & s->sc_type))
-		s = NULL;
+	if (s != NULL) {
+		if (typemask & s->sc_type)
+			atomic_inc(&s->sc_count);
+		else
+			s = NULL;
+	}
 	spin_unlock(&cl->cl_lock);
 	return s;
 }
@@ -3341,8 +3345,6 @@ static struct nfs4_delegation *find_deleg_stateid(struct nfs4_client *cl, statei
 	ret = find_stateid_by_type(cl, s, NFS4_DELEG_STID);
 	if (!ret)
 		return NULL;
-	/* FIXME: move into find_stateid_by_type */
-	atomic_inc(&ret->sc_count);
 	return delegstateid(ret);
 }
 
@@ -4166,8 +4168,6 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 	*s = find_stateid_by_type(cstate->clp, stateid, typemask);
 	if (!*s)
 		return nfserr_bad_stateid;
-	/* FIXME: move into find_stateid_by_type */
-	atomic_inc(&(*s)->sc_count);
 	return nfs_ok;
 }
 
-- 
1.9.3



* [PATCH v4 046/100] nfsd: Add reference counting to state owners
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (44 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 045/100] nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type() Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 047/100] nfsd: Keep a reference to the open stateid for the NFSv4.0 replay cache Jeff Layton
                   ` (53 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

The way stateowners are managed today is somewhat awkward. They need to
be explicitly destroyed, even though the stateids reference them. This
will be particularly problematic when we remove the client_mutex.

We may create a new stateowner and attempt to open a file or set a lock,
and have that fail. In the meantime, another RPC may come in that uses
that same stateowner and succeed. We can't have the first task tearing
down the stateowner in that situation.

To fix this, we need to change how stateowners are tracked altogether.
Refcount them and only destroy them once all stateids that reference
them have been destroyed. This patch starts by adding the refcounting
necessary to do that.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
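Illustration only, not part of the patch: a userspace sketch of the
refcount-plus-destructor scheme described above, assuming C11 atomics and
malloc in place of atomic_t and the slab caches. get_stateowner() and the
freed_openowners counter are hypothetical additions for the demonstration;
the patch itself only adds the initial count and the put side. The base
structure carries the count and a free callback, and the callback recovers
the containing type before releasing it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct stateowner {
	atomic_int so_count;
	void (*so_free)(struct stateowner *);
};

struct openowner {
	struct stateowner oo_owner;	/* embedded base object */
	int oo_flags;
};

static int freed_openowners;	/* demonstration-only counter */

static void free_openowner(struct stateowner *so)
{
	/* Recover the containing structure from the embedded base. */
	struct openowner *oo = (struct openowner *)
		((char *)so - offsetof(struct openowner, oo_owner));

	freed_openowners++;
	free(oo);
}

static struct openowner *alloc_openowner(void)
{
	struct openowner *oo = calloc(1, sizeof(*oo));

	if (!oo)
		return NULL;
	atomic_init(&oo->oo_owner.so_count, 1);	/* creator's reference */
	oo->oo_owner.so_free = free_openowner;
	return oo;
}

/* Hypothetical: take an extra reference, e.g. on behalf of a stateid. */
static void get_stateowner(struct stateowner *so)
{
	atomic_fetch_add(&so->so_count, 1);
}

/* Drop a reference; the last put invokes the type-specific destructor. */
static void put_stateowner(struct stateowner *so)
{
	int old = atomic_fetch_sub(&so->so_count, 1);

	assert(old > 0);
	if (old != 1)
		return;
	so->so_free(so);
}
```

With this shape, a task whose operation failed can simply drop its own
reference without tearing the owner down under another user.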
 fs/nfsd/nfs4state.c | 43 +++++++++++++++++++++++++++++--------------
 fs/nfsd/state.h     |  3 +++
 2 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9ac1cd37e233..3c998ff64fbb 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -72,6 +72,7 @@ static u64 current_sessionid = 1;
 /* forward declarations */
 static int check_for_locks(struct nfs4_file *filp, struct nfs4_lockowner *lowner);
 static void nfs4_free_generic_stateid(struct nfs4_stid *stid);
+static void nfs4_put_stateowner(struct nfs4_stateowner *sop);
 
 /* Locking: */
 
@@ -962,16 +963,10 @@ static void unhash_lockowner(struct nfs4_lockowner *lo)
 	}
 }
 
-static void nfs4_free_lockowner(struct nfs4_lockowner *lo)
-{
-	kfree(lo->lo_owner.so_owner.data);
-	kmem_cache_free(lockowner_slab, lo);
-}
-
 static void release_lockowner(struct nfs4_lockowner *lo)
 {
 	unhash_lockowner(lo);
-	nfs4_free_lockowner(lo);
+	nfs4_put_stateowner(&lo->lo_owner);
 }
 
 static void release_lockowner_if_empty(struct nfs4_lockowner *lo)
@@ -1041,18 +1036,12 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
 	}
 }
 
-static void nfs4_free_openowner(struct nfs4_openowner *oo)
-{
-	kfree(oo->oo_owner.so_owner.data);
-	kmem_cache_free(openowner_slab, oo);
-}
-
 static void release_openowner(struct nfs4_openowner *oo)
 {
 	unhash_openowner(oo);
 	list_del(&oo->oo_close_lru);
 	release_last_closed_stateid(oo);
-	nfs4_free_openowner(oo);
+	nfs4_put_stateowner(&oo->oo_owner);
 }
 
 static inline int
@@ -2986,9 +2975,17 @@ static inline void *alloc_stateowner(struct kmem_cache *slab, struct xdr_netobj
 	INIT_LIST_HEAD(&sop->so_stateids);
 	sop->so_client = clp;
 	init_nfs4_replay(&sop->so_replay);
+	atomic_set(&sop->so_count, 1);
 	return sop;
 }
 
+static void nfs4_put_stateowner(struct nfs4_stateowner *sop)
+{
+	if (!atomic_dec_and_test(&sop->so_count))
+		return;
+	sop->so_free(sop);
+}
+
 static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, unsigned int strhashval)
 {
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
@@ -2997,6 +2994,14 @@ static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, u
 	list_add(&oo->oo_perclient, &clp->cl_openowners);
 }
 
+static void nfs4_free_openowner(struct nfs4_stateowner *so)
+{
+	struct nfs4_openowner *oo = openowner(so);
+
+	kfree(oo->oo_owner.so_owner.data);
+	kmem_cache_free(openowner_slab, oo);
+}
+
 static struct nfs4_openowner *
 alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
 			   struct nfsd4_compound_state *cstate)
@@ -3007,6 +3012,7 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
 	oo = alloc_stateowner(openowner_slab, &open->op_owner, clp);
 	if (!oo)
 		return NULL;
+	oo->oo_owner.so_free = nfs4_free_openowner;
 	oo->oo_owner.so_is_open_owner = 1;
 	oo->oo_owner.so_seqid = open->op_seqid;
 	oo->oo_flags = NFS4_OO_NEW;
@@ -4719,6 +4725,14 @@ find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
 	return NULL;
 }
 
+static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
+{
+	struct nfs4_lockowner *lo = lockowner(sop);
+
+	kfree(lo->lo_owner.so_owner.data);
+	kmem_cache_free(lockowner_slab, lo);
+}
+
 /*
  * Alloc a lock owner structure.
  * Called in nfsd4_lock - therefore, OPEN and OPEN_CONFIRM (if needed) has 
@@ -4739,6 +4753,7 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, str
 	/* It is the openowner seqid that will be incremented in encode in the
 	 * case of new lockowners; so increment the lock seqid manually: */
 	lo->lo_owner.so_seqid = lock->lk_new_lock_seqid + 1;
+	lo->lo_owner.so_free = nfs4_free_lockowner;
 	list_add(&lo->lo_owner.so_strhash, &nn->ownerstr_hashtbl[strhashval]);
 	return lo;
 }
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 7188dcd45ef7..eba7283a2613 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -340,10 +340,13 @@ struct nfs4_stateowner {
 	struct nfs4_client *    so_client;
 	/* after increment in ENCODE_SEQID_OP_TAIL, represents the next
 	 * sequence id expected from the client: */
+	atomic_t		so_count;
 	u32                     so_seqid;
 	struct xdr_netobj       so_owner;     /* open owner name */
 	struct nfs4_replay	so_replay;
 	bool			so_is_open_owner;
+
+	void (*so_free)(struct nfs4_stateowner *);
 };
 
 struct nfs4_openowner {
-- 
1.9.3



* [PATCH v4 047/100] nfsd: Keep a reference to the open stateid for the NFSv4.0 replay cache
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (45 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 046/100] nfsd: Add reference counting to state owners Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 048/100] nfsd: clean up lockowner refcounting when finding them Jeff Layton
                   ` (52 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Ensure that nfsd4_cstate_assign_replay takes a reference to the
stateowner when it is used for NFSv4.0 open and lock replay caching,
and that nfsd4_cstate_clear_replay puts that reference again.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4proc.c  |  5 +----
 fs/nfsd/nfs4state.c | 26 +++++++++++++++++++++++++-
 fs/nfsd/xdr4.h      | 26 ++++----------------------
 3 files changed, 30 insertions(+), 27 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 5004245e9958..c53757ec6580 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -467,10 +467,7 @@ out:
 		fh_put(resfh);
 		kfree(resfh);
 	}
-	nfsd4_cleanup_open_state(open, status);
-	if (open->op_openowner)
-		nfsd4_cstate_assign_replay(cstate,
-				&open->op_openowner->oo_owner);
+	nfsd4_cleanup_open_state(cstate, open, status);
 	nfsd4_bump_seqid(cstate, status);
 	nfs4_unlock_state();
 	return status;
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 3c998ff64fbb..654362f688dc 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2957,6 +2957,27 @@ static void init_nfs4_replay(struct nfs4_replay *rp)
 	mutex_init(&rp->rp_mutex);
 }
 
+static void nfsd4_cstate_assign_replay(struct nfsd4_compound_state *cstate,
+		struct nfs4_stateowner *so)
+{
+	if (!nfsd4_has_session(cstate)) {
+		mutex_lock(&so->so_replay.rp_mutex);
+		cstate->replay_owner = so;
+		atomic_inc(&so->so_count);
+	}
+}
+
+void nfsd4_cstate_clear_replay(struct nfsd4_compound_state *cstate)
+{
+	struct nfs4_stateowner *so = cstate->replay_owner;
+
+	if (so != NULL) {
+		cstate->replay_owner = NULL;
+		mutex_unlock(&so->so_replay.rp_mutex);
+		nfs4_put_stateowner(so);
+	}
+}
+
 static inline void *alloc_stateowner(struct kmem_cache *slab, struct xdr_netobj *owner, struct nfs4_client *clp)
 {
 	struct nfs4_stateowner *sop;
@@ -3850,7 +3871,8 @@ out:
 	return status;
 }
 
-void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status)
+void nfsd4_cleanup_open_state(struct nfsd4_compound_state *cstate,
+			      struct nfsd4_open *open, __be32 status)
 {
 	if (open->op_openowner) {
 		struct nfs4_openowner *oo = open->op_openowner;
@@ -3864,6 +3886,8 @@ void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status)
 			} else
 				oo->oo_flags &= ~NFS4_OO_NEW;
 		}
+		if (open->op_openowner)
+			nfsd4_cstate_assign_replay(cstate, &oo->oo_owner);
 	}
 	if (open->op_file)
 		nfsd4_free_file(open->op_file);
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 7442dc7efd31..465e7799742a 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -74,27 +74,6 @@ static inline bool nfsd4_has_session(struct nfsd4_compound_state *cs)
 	return cs->slot != NULL;
 }
 
-static inline void
-nfsd4_cstate_assign_replay(struct nfsd4_compound_state *cstate,
-				struct nfs4_stateowner *so)
-{
-	if (!nfsd4_has_session(cstate)) {
-		mutex_lock(&so->so_replay.rp_mutex);
-		cstate->replay_owner = so;
-	}
-}
-
-static inline void
-nfsd4_cstate_clear_replay(struct nfsd4_compound_state *cstate)
-{
-	struct nfs4_stateowner *so = cstate->replay_owner;
-
-	if (so != NULL) {
-		cstate->replay_owner = NULL;
-		mutex_unlock(&so->so_replay.rp_mutex);
-	}
-}
-
 struct nfsd4_change_info {
 	u32		atomic;
 	bool		change_supported;
@@ -620,7 +599,9 @@ extern __be32 nfsd4_process_open1(struct nfsd4_compound_state *,
 		struct nfsd4_open *open, struct nfsd_net *nn);
 extern __be32 nfsd4_process_open2(struct svc_rqst *rqstp,
 		struct svc_fh *current_fh, struct nfsd4_open *open);
-extern void nfsd4_cleanup_open_state(struct nfsd4_open *open, __be32 status);
+extern void nfsd4_cstate_clear_replay(struct nfsd4_compound_state *cstate);
+extern void nfsd4_cleanup_open_state(struct nfsd4_compound_state *cstate,
+		struct nfsd4_open *open, __be32 status);
 extern __be32 nfsd4_open_confirm(struct svc_rqst *rqstp,
 		struct nfsd4_compound_state *, struct nfsd4_open_confirm *oc);
 extern __be32 nfsd4_close(struct svc_rqst *rqstp,
@@ -651,6 +632,7 @@ extern __be32 nfsd4_test_stateid(struct svc_rqst *rqstp,
 extern __be32 nfsd4_free_stateid(struct svc_rqst *rqstp,
 		struct nfsd4_compound_state *, struct nfsd4_free_stateid *free_stateid);
 extern void nfsd4_bump_seqid(struct nfsd4_compound_state *, __be32 nfserr);
+
 #endif
 
 /*
-- 
1.9.3



* [PATCH v4 048/100] nfsd: clean up lockowner refcounting when finding them
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (46 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 047/100] nfsd: Keep a reference to the open stateid for the NFSv4.0 replay cache Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 049/100] nfsd: add an operation for unhashing a stateowner Jeff Layton
                   ` (51 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Ensure that when finding or creating a lockowner, we get a
reference to it. For now, we also take an extra reference when a
lockowner is created that can be put when release_lockowner is called,
but we'll remove that in a later patch once we change how references are
held.

Since we no longer destroy lockowners in the event of an error in
nfsd4_lock, we must change how the seqid gets bumped in the lk_is_new
case. Instead of doing so on creation, do it manually in nfsd4_lock.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
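For illustration only, not part of the patch: the lookup convention
described above (the find routine returns a referenced object and the
caller puts it when done) can be sketched in userspace C. The struct and
helper names below are hypothetical stand-ins, not the nfsd types, and a
plain int replaces the atomic so_count because the sketch is
single-threaded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct nfs4_stateowner; not the kernel type. */
struct owner {
	int refcount;		/* plays the role of so_count */
	const char *name;	/* plays the role of so_owner */
	struct owner *next;
};

static void owner_get(struct owner *o) { o->refcount++; }
static void owner_put(struct owner *o) { o->refcount--; } /* freeing omitted */

/* Like find_lockowner_str after this patch: a hit returns a reference. */
static struct owner *find_owner(struct owner *head, const char *name)
{
	for (struct owner *o = head; o != NULL; o = o->next) {
		if (strcmp(o->name, name) == 0) {
			owner_get(o);
			return o;
		}
	}
	return NULL;
}

/* Returns the refcount left after one lookup/put cycle. */
static int lookup_cycle_demo(void)
{
	struct owner a = { .refcount = 1, .name = "lock-a", .next = NULL };
	struct owner *found = find_owner(&a, "lock-a");

	assert(found == &a && a.refcount == 2);
	owner_put(found);	/* caller is done, as in the "out:" paths */
	return a.refcount;
}
```

The point of the convention is that every exit path of the caller has
exactly one put to balance the lookup, which is what the new "out:"
labels in lookup_or_create_lock_state and nfsd4_lockt provide.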
 fs/nfsd/nfs4state.c | 51 ++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 38 insertions(+), 13 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 654362f688dc..dec8a2551806 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4744,6 +4744,7 @@ find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
 			continue;
 		if (!same_owner_str(so, owner, clid))
 			continue;
+		atomic_inc(&so->so_count);
 		return lockowner(so);
 	}
 	return NULL;
@@ -4774,9 +4775,7 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, str
 		return NULL;
 	INIT_LIST_HEAD(&lo->lo_owner.so_stateids);
 	lo->lo_owner.so_is_open_owner = 0;
-	/* It is the openowner seqid that will be incremented in encode in the
-	 * case of new lockowners; so increment the lock seqid manually: */
-	lo->lo_owner.so_seqid = lock->lk_new_lock_seqid + 1;
+	lo->lo_owner.so_seqid = lock->lk_new_lock_seqid;
 	lo->lo_owner.so_free = nfs4_free_lockowner;
 	list_add(&lo->lo_owner.so_strhash, &nn->ownerstr_hashtbl[strhashval]);
 	return lo;
@@ -4873,8 +4872,13 @@ static void get_lock_access(struct nfs4_ol_stateid *lock_stp, u32 access)
 	set_access(access, lock_stp);
 }
 
-static __be32 lookup_or_create_lock_state(struct nfsd4_compound_state *cstate, struct nfs4_ol_stateid *ost, struct nfsd4_lock *lock, struct nfs4_ol_stateid **lst, bool *new)
+static __be32 lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
+					  struct nfs4_ol_stateid *ost,
+					  struct nfsd4_lock *lock,
+					  struct nfs4_ol_stateid **lst,
+					  bool *new)
 {
+	__be32 status;
 	struct nfs4_file *fi = ost->st_stid.sc_file;
 	struct nfs4_openowner *oo = openowner(ost->st_stateowner);
 	struct nfs4_client *cl = oo->oo_owner.so_client;
@@ -4889,19 +4893,26 @@ static __be32 lookup_or_create_lock_state(struct nfsd4_compound_state *cstate, s
 		lo = alloc_init_lock_stateowner(strhashval, cl, ost, lock);
 		if (lo == NULL)
 			return nfserr_jukebox;
+		/* FIXME: extra reference for new lockowners for the client */
+		atomic_inc(&lo->lo_owner.so_count);
 	} else {
 		/* with an existing lockowner, seqids must be the same */
+		status = nfserr_bad_seqid;
 		if (!cstate->minorversion &&
 		    lock->lk_new_lock_seqid != lo->lo_owner.so_seqid)
-			return nfserr_bad_seqid;
+			goto out;
 	}
 
 	*lst = find_or_create_lock_stateid(lo, fi, ost, new);
 	if (*lst == NULL) {
 		release_lockowner_if_empty(lo);
-		return nfserr_jukebox;
+		status = nfserr_jukebox;
+		goto out;
 	}
-	return nfs_ok;
+	status = nfs_ok;
+out:
+	nfs4_put_stateowner(&lo->lo_owner);
+	return status;
 }
 
 /*
@@ -4920,9 +4931,9 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct file_lock *file_lock = NULL;
 	struct file_lock *conflock = NULL;
 	__be32 status = 0;
-	bool new_state = false;
 	int lkflg;
 	int err;
+	bool new = false;
 	struct net *net = SVC_NET(rqstp);
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
@@ -4965,7 +4976,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 						&lock->v.new.clientid))
 			goto out;
 		status = lookup_or_create_lock_state(cstate, open_stp, lock,
-							&lock_stp, &new_state);
+							&lock_stp, &new);
 	} else {
 		status = nfs4_preprocess_seqid_op(cstate,
 				       lock->lk_old_lock_seqid,
@@ -5064,12 +5075,24 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 out:
 	if (filp)
 		fput(filp);
-	if (lock_stp)
+	if (lock_stp) {
+		/* Bump seqid manually if the 4.0 replay owner is openowner */
+		if (cstate->replay_owner &&
+		    cstate->replay_owner != &lock_sop->lo_owner &&
+		    seqid_mutating_err(ntohl(status)))
+			lock_sop->lo_owner.so_seqid++;
+
+		/*
+		 * If this is a new, never-before-used stateid, and we are
+		 * returning an error, then just go ahead and release it.
+		 */
+		if (status && new)
+			release_lock_stateid(lock_stp);
+
 		put_generic_stateid(lock_stp);
+	}
 	if (open_stp)
 		put_generic_stateid(open_stp);
-	if (status && new_state)
-		release_lock_stateid(lock_stp);
 	nfsd4_bump_seqid(cstate, status);
 	nfs4_unlock_state();
 	if (file_lock)
@@ -5104,7 +5127,7 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	    struct nfsd4_lockt *lockt)
 {
 	struct file_lock *file_lock = NULL;
-	struct nfs4_lockowner *lo;
+	struct nfs4_lockowner *lo = NULL;
 	__be32 status;
 	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 
@@ -5167,6 +5190,8 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		nfs4_set_lock_denied(file_lock, &lockt->lt_denied);
 	}
 out:
+	if (lo)
+		nfs4_put_stateowner(&lo->lo_owner);
 	nfs4_unlock_state();
 	if (file_lock)
 		locks_free_lock(file_lock);
-- 
1.9.3



* [PATCH v4 049/100] nfsd: add an operation for unhashing a stateowner
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (47 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 048/100] nfsd: clean up lockowner refcounting when finding them Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 050/100] nfsd: Make lock stateid take a reference to the lockowner Jeff Layton
                   ` (50 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Allow stateowners to be unhashed and destroyed when the last reference
is put. The unhashing must be idempotent. In a future patch, we'll add
some locking around it, but for now it's only protected by the
client_mutex.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
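For illustration only, not part of the patch: the reason the unhash
helpers switch from list_del to list_del_init is that list_del leaves
poisoned pointers behind, so a second deletion of the same entry is not
safe, while list_del_init leaves the entry pointing at itself. A minimal
userspace sketch of that property follows; the list type and helper
names are hypothetical and only mimic the kernel's list API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the kernel's struct list_head. */
struct list_node { struct list_node *prev, *next; };

static void list_init(struct list_node *n) { n->prev = n->next = n; }
static bool list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_after(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink and re-initialise, so that a repeated call is a no-op. */
static void list_del_reinit(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static bool unhash_twice_demo(void)
{
	struct list_node bucket, entry;

	list_init(&bucket);
	list_init(&entry);
	list_add_after(&entry, &bucket);

	list_del_reinit(&entry);	/* explicit unhash, e.g. on release */
	list_del_reinit(&entry);	/* again, from the final put */

	return list_is_empty(&bucket) && list_is_empty(&entry);
}
```

With this property, release_lockowner() can unhash explicitly and the
so_unhash call made from nfs4_put_stateowner() on the last put does no
harm.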
 fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++++++++++++----------
 fs/nfsd/state.h     |  1 +
 2 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index dec8a2551806..a008dd08ee77 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -953,9 +953,13 @@ static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
 
 static void unhash_lockowner(struct nfs4_lockowner *lo)
 {
+	list_del_init(&lo->lo_owner.so_strhash);
+}
+
+static void release_lockowner_stateids(struct nfs4_lockowner *lo)
+{
 	struct nfs4_ol_stateid *stp;
 
-	list_del(&lo->lo_owner.so_strhash);
 	while (!list_empty(&lo->lo_owner.so_stateids)) {
 		stp = list_first_entry(&lo->lo_owner.so_stateids,
 				struct nfs4_ol_stateid, st_perstateowner);
@@ -966,6 +970,7 @@ static void unhash_lockowner(struct nfs4_lockowner *lo)
 static void release_lockowner(struct nfs4_lockowner *lo)
 {
 	unhash_lockowner(lo);
+	release_lockowner_stateids(lo);
 	nfs4_put_stateowner(&lo->lo_owner);
 }
 
@@ -1015,15 +1020,8 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
 
 static void unhash_openowner(struct nfs4_openowner *oo)
 {
-	struct nfs4_ol_stateid *stp;
-
-	list_del(&oo->oo_owner.so_strhash);
-	list_del(&oo->oo_perclient);
-	while (!list_empty(&oo->oo_owner.so_stateids)) {
-		stp = list_first_entry(&oo->oo_owner.so_stateids,
-				struct nfs4_ol_stateid, st_perstateowner);
-		release_open_stateid(stp);
-	}
+	list_del_init(&oo->oo_owner.so_strhash);
+	list_del_init(&oo->oo_perclient);
 }
 
 static void release_last_closed_stateid(struct nfs4_openowner *oo)
@@ -1036,9 +1034,21 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
 	}
 }
 
+static void release_openowner_stateids(struct nfs4_openowner *oo)
+{
+	struct nfs4_ol_stateid *stp;
+
+	while (!list_empty(&oo->oo_owner.so_stateids)) {
+		stp = list_first_entry(&oo->oo_owner.so_stateids,
+				struct nfs4_ol_stateid, st_perstateowner);
+		release_open_stateid(stp);
+	}
+}
+
 static void release_openowner(struct nfs4_openowner *oo)
 {
 	unhash_openowner(oo);
+	release_openowner_stateids(oo);
 	list_del(&oo->oo_close_lru);
 	release_last_closed_stateid(oo);
 	nfs4_put_stateowner(&oo->oo_owner);
@@ -3004,6 +3014,7 @@ static void nfs4_put_stateowner(struct nfs4_stateowner *sop)
 {
 	if (!atomic_dec_and_test(&sop->so_count))
 		return;
+	sop->so_unhash(sop);
 	sop->so_free(sop);
 }
 
@@ -3015,6 +3026,13 @@ static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, u
 	list_add(&oo->oo_perclient, &clp->cl_openowners);
 }
 
+static void nfs4_unhash_openowner(struct nfs4_stateowner *so)
+{
+	struct nfs4_openowner *oo = openowner(so);
+
+	unhash_openowner(oo);
+}
+
 static void nfs4_free_openowner(struct nfs4_stateowner *so)
 {
 	struct nfs4_openowner *oo = openowner(so);
@@ -3034,6 +3052,7 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
 	if (!oo)
 		return NULL;
 	oo->oo_owner.so_free = nfs4_free_openowner;
+	oo->oo_owner.so_unhash = nfs4_unhash_openowner;
 	oo->oo_owner.so_is_open_owner = 1;
 	oo->oo_owner.so_seqid = open->op_seqid;
 	oo->oo_flags = NFS4_OO_NEW;
@@ -4750,6 +4769,11 @@ find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
 	return NULL;
 }
 
+static void nfs4_unhash_lockowner(struct nfs4_stateowner *sop)
+{
+	unhash_lockowner(lockowner(sop));
+}
+
 static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
 {
 	struct nfs4_lockowner *lo = lockowner(sop);
@@ -4777,6 +4801,7 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, str
 	lo->lo_owner.so_is_open_owner = 0;
 	lo->lo_owner.so_seqid = lock->lk_new_lock_seqid;
 	lo->lo_owner.so_free = nfs4_free_lockowner;
+	lo->lo_owner.so_unhash = nfs4_unhash_lockowner;
 	list_add(&lo->lo_owner.so_strhash, &nn->ownerstr_hashtbl[strhashval]);
 	return lo;
 }
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index eba7283a2613..f6639fb5a56f 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -347,6 +347,7 @@ struct nfs4_stateowner {
 	bool			so_is_open_owner;
 
 	void (*so_free)(struct nfs4_stateowner *);
+	void (*so_unhash)(struct nfs4_stateowner *);
 };
 
 struct nfs4_openowner {
-- 
1.9.3



* [PATCH v4 050/100] nfsd: Make lock stateid take a reference to the lockowner
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (48 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 049/100] nfsd: add an operation for unhashing a stateowner Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 051/100] nfsd: clean up refcounting for lockowners Jeff Layton
                   ` (49 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

A necessary step toward client_mutex removal.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a008dd08ee77..078ca6a6a132 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -919,6 +919,8 @@ static void nfs4_free_generic_stateid(struct nfs4_stid *stid)
 	struct nfs4_ol_stateid *stp = openlockstateid(stid);
 
 	release_all_access(stp);
+	if (stp->st_stateowner && stid->sc_type == NFS4_LOCK_STID)
+		nfs4_put_stateowner(stp->st_stateowner);
 	nfs4_free_stid(stateid_slab, stid);
 }
 
@@ -4817,6 +4819,7 @@ init_lock_stateid(struct nfs4_ol_stateid *stp, struct nfs4_lockowner *lo,
 	atomic_inc(&stp->st_stid.sc_count);
 	stp->st_stid.sc_type = NFS4_LOCK_STID;
 	stp->st_stateowner = &lo->lo_owner;
+	atomic_inc(&lo->lo_owner.so_count);
 	get_nfs4_file(fp);
 	stp->st_stid.sc_file = fp;
 	stp->st_stid.sc_free = nfs4_free_lock_stateid;
-- 
1.9.3



* [PATCH v4 051/100] nfsd: clean up refcounting for lockowners
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (49 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 050/100] nfsd: Make lock stateid take a reference to the lockowner Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 052/100] nfsd: make openstateids hold references to their openowners Jeff Layton
                   ` (48 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Ensure that lockowner references are only held by lockstateids and
operations that are in-progress. With this, we can get rid of
release_lockowner_if_empty, which will be racy once we remove
client_mutex protection.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 29 +++++++----------------------
 1 file changed, 7 insertions(+), 22 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 078ca6a6a132..591f127939a8 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -941,7 +941,7 @@ static void put_generic_stateid(struct nfs4_ol_stateid *stp)
 	nfs4_put_stid(&stp->st_stid);
 }
 
-static void __release_lock_stateid(struct nfs4_ol_stateid *stp)
+static void release_lock_stateid(struct nfs4_ol_stateid *stp)
 {
 	struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
 
@@ -965,7 +965,7 @@ static void release_lockowner_stateids(struct nfs4_lockowner *lo)
 	while (!list_empty(&lo->lo_owner.so_stateids)) {
 		stp = list_first_entry(&lo->lo_owner.so_stateids,
 				struct nfs4_ol_stateid, st_perstateowner);
-		__release_lock_stateid(stp);
+		release_lock_stateid(stp);
 	}
 }
 
@@ -976,21 +976,6 @@ static void release_lockowner(struct nfs4_lockowner *lo)
 	nfs4_put_stateowner(&lo->lo_owner);
 }
 
-static void release_lockowner_if_empty(struct nfs4_lockowner *lo)
-{
-	if (list_empty(&lo->lo_owner.so_stateids))
-		release_lockowner(lo);
-}
-
-static void release_lock_stateid(struct nfs4_ol_stateid *stp)
-{
-	struct nfs4_lockowner *lo;
-
-	lo = lockowner(stp->st_stateowner);
-	__release_lock_stateid(stp);
-	release_lockowner_if_empty(lo);
-}
-
 static void release_open_stateid_locks(struct nfs4_ol_stateid *open_stp)
 	__releases(&open_stp->st_stateowner->so_client->cl_lock)
 	__acquires(&open_stp->st_stateowner->so_client->cl_lock)
@@ -4315,7 +4300,7 @@ nfsd4_free_lock_stateid(struct nfs4_ol_stateid *stp)
 
 	if (check_for_locks(stp->st_stid.sc_file, lo))
 		return nfserr_locks_held;
-	release_lockowner_if_empty(lo);
+	release_lock_stateid(stp);
 	return nfs_ok;
 }
 
@@ -4921,8 +4906,6 @@ static __be32 lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
 		lo = alloc_init_lock_stateowner(strhashval, cl, ost, lock);
 		if (lo == NULL)
 			return nfserr_jukebox;
-		/* FIXME: extra reference for new lockowners for the client */
-		atomic_inc(&lo->lo_owner.so_count);
 	} else {
 		/* with an existing lockowner, seqids must be the same */
 		status = nfserr_bad_seqid;
@@ -4933,7 +4916,6 @@ static __be32 lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
 
 	*lst = find_or_create_lock_stateid(lo, fi, ost, new);
 	if (*lst == NULL) {
-		release_lockowner_if_empty(lo);
 		status = nfserr_jukebox;
 		goto out;
 	}
@@ -5353,6 +5335,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 			continue;
 		if (same_owner_str(tmp, owner, clid)) {
 			sop = tmp;
+			atomic_inc(&sop->so_count);
 			break;
 		}
 	}
@@ -5366,8 +5349,10 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 	lo = lockowner(sop);
 	/* see if there are still any locks associated with it */
 	list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
-		if (check_for_locks(stp->st_stid.sc_file, lo))
+		if (check_for_locks(stp->st_stid.sc_file, lo)) {
+			nfs4_put_stateowner(sop);
 			goto out;
+		}
 	}
 
 	status = nfs_ok;
-- 
1.9.3



* [PATCH v4 052/100] nfsd: make openstateids hold references to their openowners
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (50 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 051/100] nfsd: clean up refcounting for lockowners Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 053/100] nfsd: don't allow CLOSE to proceed until refcount on stateid drops Jeff Layton
                   ` (47 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Change it so that only openstateids hold persistent references to
openowners. References can still be held by compounds in progress.

With this, we can get rid of NFS4_OO_NEW. It's possible that we
will create a new openowner in the process of doing the open, but
something later fails. In the meantime, another task could find
that openowner and start using it on a successful open. If that
occurs we don't necessarily want to tear it down, just put the
reference that the failing compound holds.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
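For illustration only, not part of the patch: the scenario in the
description (a compound creates an openowner and then fails, while a
second compound has already attached a stateid to it) can be sketched
with a plain refcount and a release callback. All names are
hypothetical; C11 atomics stand in for atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a stateowner with an so_free-style hook. */
struct owner {
	atomic_int count;
	void (*release)(struct owner *);
	bool released;
};

static void owner_release(struct owner *o) { o->released = true; }

static void owner_get(struct owner *o) { atomic_fetch_add(&o->count, 1); }

static void owner_put(struct owner *o)
{
	/* atomic_fetch_sub returns the old value: 1 means we were last. */
	if (atomic_fetch_sub(&o->count, 1) == 1)
		o->release(o);
}

static bool failed_open_demo(void)
{
	struct owner o = { .release = owner_release, .released = false };
	bool alive_after_failure;

	atomic_init(&o.count, 1);	/* held by compound A, the creator */
	owner_get(&o);			/* open stateid made by compound B */
	owner_put(&o);			/* compound A fails, puts its ref */
	alive_after_failure = !o.released;
	owner_put(&o);			/* B's stateid is freed later */

	return alive_after_failure && o.released;
}
```

Because the failing compound only drops its own reference, no flag such
as NFS4_OO_NEW is needed to decide who tears the owner down; whoever
puts the last reference does.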
 fs/nfsd/nfs4state.c | 71 +++++++++++++++++++++++------------------------------
 fs/nfsd/state.h     |  1 -
 2 files changed, 31 insertions(+), 41 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 591f127939a8..2a03efc42803 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -919,7 +919,7 @@ static void nfs4_free_generic_stateid(struct nfs4_stid *stid)
 	struct nfs4_ol_stateid *stp = openlockstateid(stid);
 
 	release_all_access(stp);
-	if (stp->st_stateowner && stid->sc_type == NFS4_LOCK_STID)
+	if (stp->st_stateowner)
 		nfs4_put_stateowner(stp->st_stateowner);
 	nfs4_free_stid(stateid_slab, stid);
 }
@@ -1016,8 +1016,9 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
 	struct nfs4_ol_stateid *s = oo->oo_last_closed_stid;
 
 	if (s) {
-		put_generic_stateid(s);
+		list_del_init(&oo->oo_close_lru);
 		oo->oo_last_closed_stid = NULL;
+		put_generic_stateid(s);
 	}
 }
 
@@ -1036,7 +1037,6 @@ static void release_openowner(struct nfs4_openowner *oo)
 {
 	unhash_openowner(oo);
 	release_openowner_stateids(oo);
-	list_del(&oo->oo_close_lru);
 	release_last_closed_stateid(oo);
 	nfs4_put_stateowner(&oo->oo_owner);
 }
@@ -1511,6 +1511,7 @@ destroy_client(struct nfs4_client *clp)
 	}
 	while (!list_empty(&clp->cl_openowners)) {
 		oo = list_entry(clp->cl_openowners.next, struct nfs4_openowner, oo_perclient);
+		atomic_inc(&oo->oo_owner.so_count);
 		release_openowner(oo);
 	}
 	nfsd4_shutdown_callback(clp);
@@ -3042,7 +3043,7 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
 	oo->oo_owner.so_unhash = nfs4_unhash_openowner;
 	oo->oo_owner.so_is_open_owner = 1;
 	oo->oo_owner.so_seqid = open->op_seqid;
-	oo->oo_flags = NFS4_OO_NEW;
+	oo->oo_flags = 0;
 	if (nfsd4_has_session(cstate))
 		oo->oo_flags |= NFS4_OO_CONFIRMED;
 	oo->oo_time = 0;
@@ -3059,6 +3060,7 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
 	stp->st_stid.sc_type = NFS4_OPEN_STID;
 	INIT_LIST_HEAD(&stp->st_locks);
 	stp->st_stateowner = &oo->oo_owner;
+	atomic_inc(&stp->st_stateowner->so_count);
 	get_nfs4_file(fp);
 	stp->st_stid.sc_file = fp;
 	stp->st_access_bmap = 0;
@@ -3074,13 +3076,27 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
 	spin_unlock(&oo->oo_owner.so_client->cl_lock);
 }
 
+/*
+ * In the 4.0 case we need to keep the owners around a little while to handle
+ * CLOSE replay. We still do need to release any file access that is held by
+ * them before returning however.
+ */
 static void
-move_to_close_lru(struct nfs4_openowner *oo, struct net *net)
+move_to_close_lru(struct nfs4_ol_stateid *s, struct net *net)
 {
-	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+	struct nfs4_openowner *oo = openowner(s->st_stateowner);
+	struct nfsd_net *nn = net_generic(s->st_stid.sc_client->net,
+						nfsd_net_id);
 
 	dprintk("NFSD: move_to_close_lru nfs4_openowner %p\n", oo);
 
+	release_all_access(s);
+	if (s->st_stid.sc_file) {
+		put_nfs4_file(s->st_stid.sc_file);
+		s->st_stid.sc_file = NULL;
+	}
+	release_last_closed_stateid(oo);
+	oo->oo_last_closed_stid = s;
 	list_move_tail(&oo->oo_close_lru, &nn->close_lru);
 	oo->oo_time = get_seconds();
 }
@@ -3111,6 +3127,7 @@ find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
 			if ((bool)clp->cl_minorversion != sessions)
 				return NULL;
 			renew_client(oo->oo_owner.so_client);
+			atomic_inc(&oo->oo_owner.so_count);
 			return oo;
 		}
 	}
@@ -3881,19 +3898,10 @@ void nfsd4_cleanup_open_state(struct nfsd4_compound_state *cstate,
 			      struct nfsd4_open *open, __be32 status)
 {
 	if (open->op_openowner) {
-		struct nfs4_openowner *oo = open->op_openowner;
-
-		if (!list_empty(&oo->oo_owner.so_stateids))
-			list_del_init(&oo->oo_close_lru);
-		if (oo->oo_flags & NFS4_OO_NEW) {
-			if (status) {
-				release_openowner(oo);
-				open->op_openowner = NULL;
-			} else
-				oo->oo_flags &= ~NFS4_OO_NEW;
-		}
-		if (open->op_openowner)
-			nfsd4_cstate_assign_replay(cstate, &oo->oo_owner);
+		struct nfs4_stateowner *so = &open->op_openowner->oo_owner;
+
+		nfsd4_cstate_assign_replay(cstate, so);
+		nfs4_put_stateowner(so);
 	}
 	if (open->op_file)
 		nfsd4_free_file(open->op_file);
@@ -4007,7 +4015,7 @@ nfs4_laundromat(struct nfsd_net *nn)
 			new_timeo = min(new_timeo, t);
 			break;
 		}
-		release_openowner(oo);
+		release_last_closed_stateid(oo);
 	}
 	new_timeo = max_t(time_t, new_timeo, NFSD_LAUNDROMAT_MINTIMEOUT);
 	nfs4_unlock_state();
@@ -4570,31 +4578,14 @@ out:
 static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 {
 	struct nfs4_client *clp = s->st_stid.sc_client;
-	struct nfs4_openowner *oo = openowner(s->st_stateowner);
 
 	s->st_stid.sc_type = NFS4_CLOSED_STID;
 	unhash_open_stateid(s);
 
-	if (clp->cl_minorversion) {
-		if (list_empty(&oo->oo_owner.so_stateids))
-			release_openowner(oo);
+	if (clp->cl_minorversion)
 		put_generic_stateid(s);
-	} else {
-		if (s->st_stid.sc_file) {
-			put_nfs4_file(s->st_stid.sc_file);
-			s->st_stid.sc_file = NULL;
-		}
-		oo->oo_last_closed_stid = s;
-		/*
-		 * In the 4.0 case we need to keep the owners around a
-		 * little while to handle CLOSE replay. We still do need
-		 * to release any file access that is held by them
-		 * before returning however.
-		 */
-		release_all_access(s);
-		if (list_empty(&oo->oo_owner.so_stateids))
-			move_to_close_lru(oo, clp->net);
-	}
+	else
+		move_to_close_lru(s, clp->net);
 }
 
 /*
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index f6639fb5a56f..9e9e45278b40 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -364,7 +364,6 @@ struct nfs4_openowner {
 	struct nfs4_ol_stateid *oo_last_closed_stid;
 	time_t			oo_time; /* time of placement on so_close_lru */
 #define NFS4_OO_CONFIRMED   1
-#define NFS4_OO_NEW         4
 	unsigned char		oo_flags;
 };
 
-- 
1.9.3



* [PATCH v4 053/100] nfsd: don't allow CLOSE to proceed until refcount on stateid drops
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (51 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 052/100] nfsd: make openstateids hold references to their openowners Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 054/100] nfsd: Protect adding/removing open state owners using client_lock Jeff Layton
                   ` (46 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Once we remove client_mutex protection, it'll be possible to have an
in-flight operation using an openstateid when a CLOSE call comes in.
If that happens, we can't just put the sc_file reference and clear its
pointer without risking an oops.

Fix this by ensuring that v4.0 CLOSE operations wait for the refcount
to drop before proceeding to do so.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
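For illustration only, not part of the patch: the wait_event/wake_up_all
pairing used here can be approximated in userspace with a mutex and a
condition variable. The struct and function names are hypothetical. One
simplification to note: this sketch broadcasts on every put, whereas the
patch only wakes waiters when the put does not drop the count to zero.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for an open stateid; not the kernel nfs4_stid. */
struct stid {
	pthread_mutex_t lock;
	pthread_cond_t  dropped;	/* plays the role of close_wq */
	int             count;		/* plays the role of sc_count */
};

static void stid_put(struct stid *s)
{
	pthread_mutex_lock(&s->lock);
	s->count--;
	pthread_cond_broadcast(&s->dropped);	/* like wake_up_all() */
	pthread_mutex_unlock(&s->lock);
}

/* Block until exactly `floor` references remain, like wait_event(). */
static void stid_wait_for(struct stid *s, int floor)
{
	pthread_mutex_lock(&s->lock);
	while (s->count != floor)
		pthread_cond_wait(&s->dropped, &s->lock);
	pthread_mutex_unlock(&s->lock);
}

static void *in_flight_op(void *arg)
{
	stid_put(arg);	/* the operation finishes and drops its reference */
	return NULL;
}

/* Returns the refcount seen once the in-flight user has gone away. */
static int close_wait_demo(void)
{
	struct stid s = {
		PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 3
	};
	pthread_t t;

	if (pthread_create(&t, NULL, in_flight_op, &s) != 0)
		return -1;
	stid_wait_for(&s, 2);	/* CLOSE waits before touching sc_file */
	pthread_join(t, NULL);
	return s.count;
}
```

The floor of 2 mirrors the comment in move_to_close_lru: one reference
held by the CLOSE itself and one persistent reference.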
 fs/nfsd/nfs4state.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 2a03efc42803..296d75d93de4 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -86,6 +86,12 @@ static DEFINE_MUTEX(client_mutex);
  */
 static DEFINE_SPINLOCK(state_lock);
 
+/*
+ * A waitqueue for all in-progress 4.0 CLOSE operations that are waiting for
+ * the refcount on the open stateid to drop.
+ */
+static DECLARE_WAIT_QUEUE_HEAD(close_wq);
+
 static struct kmem_cache *openowner_slab;
 static struct kmem_cache *lockowner_slab;
 static struct kmem_cache *file_slab;
@@ -662,8 +668,10 @@ static void nfs4_put_stid(struct nfs4_stid *s)
 
 	might_lock(&clp->cl_lock);
 
-	if (!atomic_dec_and_lock(&s->sc_count, &clp->cl_lock))
+	if (!atomic_dec_and_lock(&s->sc_count, &clp->cl_lock)) {
+		wake_up_all(&close_wq);
 		return;
+	}
 	remove_stid_locked(clp, s);
 	spin_unlock(&clp->cl_lock);
 	s->sc_free(s);
@@ -3090,11 +3098,23 @@ move_to_close_lru(struct nfs4_ol_stateid *s, struct net *net)
 
 	dprintk("NFSD: move_to_close_lru nfs4_openowner %p\n", oo);
 
+	/*
+	 * We know that we hold one reference via nfsd4_close, and another
+	 * "persistent" reference for the client. If the refcount is higher
+	 * than 2, then there are still calls in progress that are using this
+	 * stateid. We can't put the sc_file reference until they are finished.
+	 * Wait for the refcount to drop to 2. Since it has been unhashed,
+	 * there should be no danger of the refcount going back up again at
+	 * this point.
+	 */
+	wait_event(close_wq, atomic_read(&s->st_stid.sc_count) == 2);
+
 	release_all_access(s);
 	if (s->st_stid.sc_file) {
 		put_nfs4_file(s->st_stid.sc_file);
 		s->st_stid.sc_file = NULL;
 	}
+
 	release_last_closed_stateid(oo);
 	oo->oo_last_closed_stid = s;
 	list_move_tail(&oo->oo_close_lru, &nn->close_lru);
-- 
1.9.3


* [PATCH v4 054/100] nfsd: Protect adding/removing open state owners using client_lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (52 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 053/100] nfsd: don't allow CLOSE to proceed until refcount on stateid drops Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 055/100] nfsd: Protect adding/removing lock " Jeff Layton
                   ` (45 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Once we remove client_mutex protection, we'll need to ensure that
stateowner lookup and creation are atomic between concurrent compounds.
Ensure that alloc_init_open_stateowner checks the hashtable under the
client_lock before adding a new element.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 63 +++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 54 insertions(+), 9 deletions(-)
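
The "look up under the lock, insert only if absent" pattern from the
description can be sketched in userspace as below. This is an illustration
under stated assumptions, not the nfsd code: struct owner, table_lock,
find_owner_locked and find_or_create_owner are hypothetical names, a
pthread mutex stands in for the spinlock, and a single list stands in for
the hash table.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical owner record; not the nfsd structure. */
struct owner {
	struct owner *next;
	char name[32];
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct owner *table_head;

/* Caller must hold table_lock. */
static struct owner *find_owner_locked(const char *name)
{
	struct owner *o;

	for (o = table_head; o; o = o->next)
		if (strcmp(o->name, name) == 0)
			return o;
	return NULL;
}

/* Return the entry for `name`, creating it only if none exists yet. */
static struct owner *find_or_create_owner(const char *name)
{
	struct owner *new, *ret;

	if (strlen(name) >= sizeof(new->name))
		return NULL;

	/* Allocate before taking the lock so the critical section is short. */
	new = calloc(1, sizeof(*new));
	if (!new)
		return NULL;
	strcpy(new->name, name);

	pthread_mutex_lock(&table_lock);
	ret = find_owner_locked(name);
	if (!ret) {
		/* Nobody raced with us: publish the new entry. */
		new->next = table_head;
		table_head = new;
		ret = new;
		new = NULL;
	}
	pthread_mutex_unlock(&table_lock);

	/* If another caller won the race, discard our unused copy. */
	free(new);
	return ret;
}
```

Note that the sketch hands back whichever entry ended up in the table, so
a caller that loses the race never sees the copy that was freed.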

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 296d75d93de4..743a48b6e84a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -72,6 +72,9 @@ static u64 current_sessionid = 1;
 /* forward declarations */
 static int check_for_locks(struct nfs4_file *filp, struct nfs4_lockowner *lowner);
 static void nfs4_free_generic_stateid(struct nfs4_stid *stid);
+static struct nfs4_openowner *find_openstateowner_str_locked(
+		unsigned int hashval, struct nfsd4_open *open,
+		bool sessions, struct nfsd_net *nn);
 static void nfs4_put_stateowner(struct nfs4_stateowner *sop);
 
 /* Locking: */
@@ -1013,8 +1016,13 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
 	put_generic_stateid(stp);
 }
 
-static void unhash_openowner(struct nfs4_openowner *oo)
+static void unhash_openowner_locked(struct nfs4_openowner *oo)
 {
+	struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
+						nfsd_net_id);
+
+	lockdep_assert_held(&nn->client_lock);
+
 	list_del_init(&oo->oo_owner.so_strhash);
 	list_del_init(&oo->oo_perclient);
 }
@@ -1033,18 +1041,29 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
 static void release_openowner_stateids(struct nfs4_openowner *oo)
 {
 	struct nfs4_ol_stateid *stp;
+	struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
+						nfsd_net_id);
+
+	lockdep_assert_held(&nn->client_lock);
 
 	while (!list_empty(&oo->oo_owner.so_stateids)) {
 		stp = list_first_entry(&oo->oo_owner.so_stateids,
 				struct nfs4_ol_stateid, st_perstateowner);
+		spin_unlock(&nn->client_lock);
 		release_open_stateid(stp);
+		spin_lock(&nn->client_lock);
 	}
 }
 
 static void release_openowner(struct nfs4_openowner *oo)
 {
-	unhash_openowner(oo);
+	struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
+						nfsd_net_id);
+
+	spin_lock(&nn->client_lock);
+	unhash_openowner_locked(oo);
 	release_openowner_stateids(oo);
+	spin_unlock(&nn->client_lock);
 	release_last_closed_stateid(oo);
 	nfs4_put_stateowner(&oo->oo_owner);
 }
@@ -3025,8 +3044,11 @@ static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, u
 static void nfs4_unhash_openowner(struct nfs4_stateowner *so)
 {
 	struct nfs4_openowner *oo = openowner(so);
+	struct nfsd_net *nn = net_generic(so->so_client->net, nfsd_net_id);
 
-	unhash_openowner(oo);
+	spin_lock(&nn->client_lock);
+	unhash_openowner_locked(oo);
+	spin_unlock(&nn->client_lock);
 }
 
 static void nfs4_free_openowner(struct nfs4_stateowner *so)
@@ -3042,7 +3064,8 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
 			   struct nfsd4_compound_state *cstate)
 {
 	struct nfs4_client *clp = cstate->clp;
-	struct nfs4_openowner *oo;
+	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+	struct nfs4_openowner *oo, *ret;
 
 	oo = alloc_stateowner(openowner_slab, &open->op_owner, clp);
 	if (!oo)
@@ -3057,7 +3080,15 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
 	oo->oo_time = 0;
 	oo->oo_last_closed_stid = NULL;
 	INIT_LIST_HEAD(&oo->oo_close_lru);
-	hash_openowner(oo, clp, strhashval);
+	spin_lock(&nn->client_lock);
+	ret = find_openstateowner_str_locked(strhashval,
+			open, clp->cl_minorversion, nn);
+	if (ret == NULL) {
+		hash_openowner(oo, clp, strhashval);
+		ret = oo;
+	} else
+		nfs4_free_openowner(&oo->oo_owner);
+	spin_unlock(&nn->client_lock);
 	return oo;
 }
 
@@ -3131,13 +3162,15 @@ same_owner_str(struct nfs4_stateowner *sop, struct xdr_netobj *owner,
 }
 
 static struct nfs4_openowner *
-find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
+find_openstateowner_str_locked(unsigned int hashval, struct nfsd4_open *open,
 			bool sessions, struct nfsd_net *nn)
 {
 	struct nfs4_stateowner *so;
 	struct nfs4_openowner *oo;
 	struct nfs4_client *clp;
 
+	lockdep_assert_held(&nn->client_lock);
+
 	list_for_each_entry(so, &nn->ownerstr_hashtbl[hashval], so_strhash) {
 		if (!so->so_is_open_owner)
 			continue;
@@ -3145,15 +3178,27 @@ find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
 			oo = openowner(so);
 			clp = oo->oo_owner.so_client;
 			if ((bool)clp->cl_minorversion != sessions)
-				return NULL;
-			renew_client(oo->oo_owner.so_client);
-			atomic_inc(&oo->oo_owner.so_count);
+				break;
+			renew_client_locked(clp);
+			atomic_inc(&so->so_count);
 			return oo;
 		}
 	}
 	return NULL;
 }
 
+static struct nfs4_openowner *
+find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
+			bool sessions, struct nfsd_net *nn)
+{
+	struct nfs4_openowner *oo;
+
+	spin_lock(&nn->client_lock);
+	oo = find_openstateowner_str_locked(hashval, open, sessions, nn);
+	spin_unlock(&nn->client_lock);
+	return oo;
+}
+
 /* search file_hashtbl[] for file */
 static struct nfs4_file *
 find_file_locked(struct inode *ino)
-- 
1.9.3


* [PATCH v4 055/100] nfsd: Protect adding/removing lock owners using client_lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (53 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 054/100] nfsd: Protect adding/removing open state owners using client_lock Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 056/100] nfsd: Move the open owner hash table into struct nfs4_client Jeff Layton
                   ` (44 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Once we remove client_mutex protection, we'll need to ensure that
stateowner lookup and creation are atomic between concurrent compounds.
Ensure that alloc_init_lock_stateowner checks the hashtable under the
client_lock before adding a new element.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 61 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 743a48b6e84a..18ab6550d65d 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -964,26 +964,42 @@ static void release_lock_stateid(struct nfs4_ol_stateid *stp)
 	put_generic_stateid(stp);
 }
 
-static void unhash_lockowner(struct nfs4_lockowner *lo)
+static void unhash_lockowner_locked(struct nfs4_lockowner *lo)
 {
+	struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
+						nfsd_net_id);
+
+	lockdep_assert_held(&nn->client_lock);
+
 	list_del_init(&lo->lo_owner.so_strhash);
 }
 
 static void release_lockowner_stateids(struct nfs4_lockowner *lo)
 {
+	struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
+						nfsd_net_id);
 	struct nfs4_ol_stateid *stp;
 
+	lockdep_assert_held(&nn->client_lock);
+
 	while (!list_empty(&lo->lo_owner.so_stateids)) {
 		stp = list_first_entry(&lo->lo_owner.so_stateids,
 				struct nfs4_ol_stateid, st_perstateowner);
+		spin_unlock(&nn->client_lock);
 		release_lock_stateid(stp);
+		spin_lock(&nn->client_lock);
 	}
 }
 
 static void release_lockowner(struct nfs4_lockowner *lo)
 {
-	unhash_lockowner(lo);
+	struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
+						nfsd_net_id);
+
+	spin_lock(&nn->client_lock);
+	unhash_lockowner_locked(lo);
 	release_lockowner_stateids(lo);
+	spin_unlock(&nn->client_lock);
 	nfs4_put_stateowner(&lo->lo_owner);
 }
 
@@ -4795,7 +4811,7 @@ nevermind:
 }
 
 static struct nfs4_lockowner *
-find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
+find_lockowner_str_locked(clientid_t *clid, struct xdr_netobj *owner,
 		struct nfsd_net *nn)
 {
 	unsigned int strhashval = ownerstr_hashval(clid->cl_id, owner);
@@ -4812,9 +4828,25 @@ find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
 	return NULL;
 }
 
+static struct nfs4_lockowner *
+find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
+		struct nfsd_net *nn)
+{
+	struct nfs4_lockowner *lo;
+
+	spin_lock(&nn->client_lock);
+	lo = find_lockowner_str_locked(clid, owner, nn);
+	spin_unlock(&nn->client_lock);
+	return lo;
+}
+
 static void nfs4_unhash_lockowner(struct nfs4_stateowner *sop)
 {
-	unhash_lockowner(lockowner(sop));
+	struct nfsd_net *nn = net_generic(sop->so_client->net, nfsd_net_id);
+
+	spin_lock(&nn->client_lock);
+	unhash_lockowner_locked(lockowner(sop));
+	spin_unlock(&nn->client_lock);
 }
 
 static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
@@ -4833,9 +4865,12 @@ static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
  * strhashval = ownerstr_hashval
  */
 static struct nfs4_lockowner *
-alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, struct nfs4_ol_stateid *open_stp, struct nfsd4_lock *lock) {
-	struct nfs4_lockowner *lo;
+alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp,
+			   struct nfs4_ol_stateid *open_stp,
+			   struct nfsd4_lock *lock)
+{
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+	struct nfs4_lockowner *lo, *ret;
 
 	lo = alloc_stateowner(lockowner_slab, &lock->lk_new_owner, clp);
 	if (!lo)
@@ -4845,7 +4880,16 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp, str
 	lo->lo_owner.so_seqid = lock->lk_new_lock_seqid;
 	lo->lo_owner.so_free = nfs4_free_lockowner;
 	lo->lo_owner.so_unhash = nfs4_unhash_lockowner;
-	list_add(&lo->lo_owner.so_strhash, &nn->ownerstr_hashtbl[strhashval]);
+	spin_lock(&nn->client_lock);
+	ret = find_lockowner_str_locked(&clp->cl_clientid,
+			&lock->lk_new_owner, nn);
+	if (ret == NULL) {
+		list_add(&lo->lo_owner.so_strhash,
+			 &nn->ownerstr_hashtbl[strhashval]);
+		ret = lo;
+	} else
+		nfs4_free_lockowner(&lo->lo_owner);
+	spin_unlock(&nn->client_lock);
 	return lo;
 }
 
@@ -5373,6 +5417,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 	unsigned int hashval = ownerstr_hashval(clid->cl_id, owner);
 	__be32 status;
 	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
+	struct nfs4_client *clp;
 
 	dprintk("nfsd4_release_lockowner clientid: (%08x/%08x):\n",
 		clid->cl_boot, clid->cl_id);
@@ -5386,6 +5431,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 	status = nfserr_locks_held;
 
 	/* Find the matching lock stateowner */
+	spin_lock(&nn->client_lock);
 	list_for_each_entry(tmp, &nn->ownerstr_hashtbl[hashval], so_strhash) {
 		if (tmp->so_is_open_owner)
 			continue;
@@ -5395,6 +5441,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 			break;
 		}
 	}
+	spin_unlock(&nn->client_lock);
 
 	/* No matching owner found, maybe a replay? Just declare victory... */
 	if (!sop) {
@@ -5404,16 +5451,22 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 
 	lo = lockowner(sop);
 	/* see if there are still any locks associated with it */
+	clp = cstate->clp;
+	spin_lock(&clp->cl_lock);
 	list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
 		if (check_for_locks(stp->st_stid.sc_file, lo)) {
-			nfs4_put_stateowner(sop);
+			spin_unlock(&clp->cl_lock);
 			goto out;
 		}
 	}
+	spin_unlock(&clp->cl_lock);
 
 	status = nfs_ok;
+	sop = NULL;
 	release_lockowner(lo);
 out:
+	if (sop)
+		nfs4_put_stateowner(sop);
 	nfs4_unlock_state();
 	return status;
 }
-- 
1.9.3


* [PATCH v4 056/100] nfsd: Move the open owner hash table into struct nfs4_client
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (54 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 055/100] nfsd: Protect adding/removing lock " Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 057/100] nfsd: clean up and reorganize release_lockowner Jeff Layton
                   ` (43 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Preparation for removing the client_mutex.

Convert the open owner hash table into a per-client table and protect it
using the nfs4_client->cl_lock spin lock.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/netns.h     |   1 -
 fs/nfsd/nfs4state.c | 188 ++++++++++++++++++++++++----------------------------
 fs/nfsd/state.h     |   1 +
 3 files changed, 87 insertions(+), 103 deletions(-)
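
To make the layout change concrete, here is a userspace sketch of a table
that lives in each client and is guarded by that client's own lock. It is
an illustration only: struct client, struct bucket, owner_hash,
client_alloc and client_free are hypothetical names, the hash function is
a placeholder rather than opaque_hashval, a pthread mutex stands in for
cl_lock, and the value 8 for OWNER_HASH_BITS is an assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define OWNER_HASH_BITS 8	/* assumed value, for illustration */
#define OWNER_HASH_SIZE (1 << OWNER_HASH_BITS)
#define OWNER_HASH_MASK (OWNER_HASH_SIZE - 1)

/* Minimal circular list head, in the spirit of struct list_head. */
struct bucket {
	struct bucket *next, *prev;
};

/* Hypothetical client: the table and its lock travel together. */
struct client {
	pthread_mutex_t lock;
	struct bucket *owner_tbl;
};

/*
 * Hash only the owner name. With one table per client, the client id
 * no longer needs to be mixed into the hash value.
 */
static unsigned int owner_hash(const unsigned char *data, size_t len)
{
	unsigned int h = 0;
	size_t i;

	for (i = 0; i < len; i++)
		h = h * 37 + data[i];
	return h & OWNER_HASH_MASK;
}

static struct client *client_alloc(void)
{
	struct client *clp = calloc(1, sizeof(*clp));
	int i;

	if (!clp)
		return NULL;
	clp->owner_tbl = malloc(sizeof(*clp->owner_tbl) * OWNER_HASH_SIZE);
	if (!clp->owner_tbl)
		goto err_no_tbl;
	/* Every bucket starts out as an empty list pointing at itself. */
	for (i = 0; i < OWNER_HASH_SIZE; i++)
		clp->owner_tbl[i].next = clp->owner_tbl[i].prev =
			&clp->owner_tbl[i];
	pthread_mutex_init(&clp->lock, NULL);
	return clp;
err_no_tbl:
	free(clp);
	return NULL;
}

static void client_free(struct client *clp)
{
	pthread_mutex_destroy(&clp->lock);
	free(clp->owner_tbl);
	free(clp);
}
```

One consequence of this layout is that lookups for different clients no
longer contend on a single lock, at the cost of one bucket array
allocation per client.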

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index a71d14413d39..e1f479c162b5 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -63,7 +63,6 @@ struct nfsd_net {
 	struct rb_root conf_name_tree;
 	struct list_head *unconf_id_hashtbl;
 	struct rb_root unconf_name_tree;
-	struct list_head *ownerstr_hashtbl;
 	struct list_head *sessionid_hashtbl;
 	/*
 	 * client_lru holds client queue ordered by nfs4_client.cl_time
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 18ab6550d65d..93d526bca290 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -74,7 +74,7 @@ static int check_for_locks(struct nfs4_file *filp, struct nfs4_lockowner *lowner
 static void nfs4_free_generic_stateid(struct nfs4_stid *stid);
 static struct nfs4_openowner *find_openstateowner_str_locked(
 		unsigned int hashval, struct nfsd4_open *open,
-		bool sessions, struct nfsd_net *nn);
+		struct nfs4_client *clp);
 static void nfs4_put_stateowner(struct nfs4_stateowner *sop);
 
 /* Locking: */
@@ -364,12 +364,11 @@ unsigned long max_delegations;
 #define OWNER_HASH_SIZE             (1 << OWNER_HASH_BITS)
 #define OWNER_HASH_MASK             (OWNER_HASH_SIZE - 1)
 
-static unsigned int ownerstr_hashval(u32 clientid, struct xdr_netobj *ownername)
+static unsigned int ownerstr_hashval(struct xdr_netobj *ownername)
 {
 	unsigned int ret;
 
 	ret = opaque_hashval(ownername->data, ownername->len);
-	ret += clientid;
 	return ret & OWNER_HASH_MASK;
 }
 
@@ -966,40 +965,37 @@ static void release_lock_stateid(struct nfs4_ol_stateid *stp)
 
 static void unhash_lockowner_locked(struct nfs4_lockowner *lo)
 {
-	struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
-						nfsd_net_id);
+	struct nfs4_client *clp = lo->lo_owner.so_client;
 
-	lockdep_assert_held(&nn->client_lock);
+	lockdep_assert_held(&clp->cl_lock);
 
 	list_del_init(&lo->lo_owner.so_strhash);
 }
 
 static void release_lockowner_stateids(struct nfs4_lockowner *lo)
 {
-	struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
-						nfsd_net_id);
+	struct nfs4_client *clp = lo->lo_owner.so_client;
 	struct nfs4_ol_stateid *stp;
 
-	lockdep_assert_held(&nn->client_lock);
+	lockdep_assert_held(&clp->cl_lock);
 
 	while (!list_empty(&lo->lo_owner.so_stateids)) {
 		stp = list_first_entry(&lo->lo_owner.so_stateids,
 				struct nfs4_ol_stateid, st_perstateowner);
-		spin_unlock(&nn->client_lock);
+		spin_unlock(&clp->cl_lock);
 		release_lock_stateid(stp);
-		spin_lock(&nn->client_lock);
+		spin_lock(&clp->cl_lock);
 	}
 }
 
 static void release_lockowner(struct nfs4_lockowner *lo)
 {
-	struct nfsd_net *nn = net_generic(lo->lo_owner.so_client->net,
-						nfsd_net_id);
+	struct nfs4_client *clp = lo->lo_owner.so_client;
 
-	spin_lock(&nn->client_lock);
+	spin_lock(&clp->cl_lock);
 	unhash_lockowner_locked(lo);
 	release_lockowner_stateids(lo);
-	spin_unlock(&nn->client_lock);
+	spin_unlock(&clp->cl_lock);
 	nfs4_put_stateowner(&lo->lo_owner);
 }
 
@@ -1034,10 +1030,9 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
 
 static void unhash_openowner_locked(struct nfs4_openowner *oo)
 {
-	struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
-						nfsd_net_id);
+	struct nfs4_client *clp = oo->oo_owner.so_client;
 
-	lockdep_assert_held(&nn->client_lock);
+	lockdep_assert_held(&clp->cl_lock);
 
 	list_del_init(&oo->oo_owner.so_strhash);
 	list_del_init(&oo->oo_perclient);
@@ -1057,29 +1052,27 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
 static void release_openowner_stateids(struct nfs4_openowner *oo)
 {
 	struct nfs4_ol_stateid *stp;
-	struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
-						nfsd_net_id);
+	struct nfs4_client *clp = oo->oo_owner.so_client;
 
-	lockdep_assert_held(&nn->client_lock);
+	lockdep_assert_held(&clp->cl_lock);
 
 	while (!list_empty(&oo->oo_owner.so_stateids)) {
 		stp = list_first_entry(&oo->oo_owner.so_stateids,
 				struct nfs4_ol_stateid, st_perstateowner);
-		spin_unlock(&nn->client_lock);
+		spin_unlock(&clp->cl_lock);
 		release_open_stateid(stp);
-		spin_lock(&nn->client_lock);
+		spin_lock(&clp->cl_lock);
 	}
 }
 
 static void release_openowner(struct nfs4_openowner *oo)
 {
-	struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
-						nfsd_net_id);
+	struct nfs4_client *clp = oo->oo_owner.so_client;
 
-	spin_lock(&nn->client_lock);
+	spin_lock(&clp->cl_lock);
 	unhash_openowner_locked(oo);
 	release_openowner_stateids(oo);
-	spin_unlock(&nn->client_lock);
+	spin_unlock(&clp->cl_lock);
 	release_last_closed_stateid(oo);
 	nfs4_put_stateowner(&oo->oo_owner);
 }
@@ -1463,15 +1456,20 @@ STALE_CLIENTID(clientid_t *clid, struct nfsd_net *nn)
 static struct nfs4_client *alloc_client(struct xdr_netobj name)
 {
 	struct nfs4_client *clp;
+	int i;
 
 	clp = kzalloc(sizeof(struct nfs4_client), GFP_KERNEL);
 	if (clp == NULL)
 		return NULL;
 	clp->cl_name.data = kmemdup(name.data, name.len, GFP_KERNEL);
-	if (clp->cl_name.data == NULL) {
-		kfree(clp);
-		return NULL;
-	}
+	if (clp->cl_name.data == NULL)
+		goto err_no_name;
+	clp->cl_ownerstr_hashtbl = kmalloc(sizeof(struct list_head) *
+			OWNER_HASH_SIZE, GFP_KERNEL);
+	if (!clp->cl_ownerstr_hashtbl)
+		goto err_no_hashtbl;
+	for (i = 0; i < OWNER_HASH_SIZE; i++)
+		INIT_LIST_HEAD(&clp->cl_ownerstr_hashtbl[i]);
 	clp->cl_name.len = name.len;
 	INIT_LIST_HEAD(&clp->cl_sessions);
 	idr_init(&clp->cl_stateids);
@@ -1486,6 +1484,11 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name)
 	spin_lock_init(&clp->cl_lock);
 	rpc_init_wait_queue(&clp->cl_cb_waitq, "Backchannel slot table");
 	return clp;
+err_no_hashtbl:
+	kfree(clp->cl_name.data);
+err_no_name:
+	kfree(clp);
+	return NULL;
 }
 
 static void
@@ -1504,6 +1507,7 @@ free_client(struct nfs4_client *clp)
 	}
 	rpc_destroy_wait_queue(&clp->cl_cb_waitq);
 	free_svc_cred(&clp->cl_cred);
+	kfree(clp->cl_ownerstr_hashtbl);
 	kfree(clp->cl_name.data);
 	spin_lock(&clp->cl_lock);
 	idr_destroy(&clp->cl_stateids);
@@ -3051,20 +3055,20 @@ static void nfs4_put_stateowner(struct nfs4_stateowner *sop)
 
 static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, unsigned int strhashval)
 {
-	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+	lockdep_assert_held(&clp->cl_lock);
 
-	list_add(&oo->oo_owner.so_strhash, &nn->ownerstr_hashtbl[strhashval]);
+	list_add(&oo->oo_owner.so_strhash,
+		 &clp->cl_ownerstr_hashtbl[strhashval]);
 	list_add(&oo->oo_perclient, &clp->cl_openowners);
 }
 
 static void nfs4_unhash_openowner(struct nfs4_stateowner *so)
 {
-	struct nfs4_openowner *oo = openowner(so);
-	struct nfsd_net *nn = net_generic(so->so_client->net, nfsd_net_id);
+	struct nfs4_client *clp = so->so_client;
 
-	spin_lock(&nn->client_lock);
-	unhash_openowner_locked(oo);
-	spin_unlock(&nn->client_lock);
+	spin_lock(&clp->cl_lock);
+	unhash_openowner_locked(openowner(so));
+	spin_unlock(&clp->cl_lock);
 }
 
 static void nfs4_free_openowner(struct nfs4_stateowner *so)
@@ -3080,7 +3084,6 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
 			   struct nfsd4_compound_state *cstate)
 {
 	struct nfs4_client *clp = cstate->clp;
-	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 	struct nfs4_openowner *oo, *ret;
 
 	oo = alloc_stateowner(openowner_slab, &open->op_owner, clp);
@@ -3096,15 +3099,14 @@ alloc_init_open_stateowner(unsigned int strhashval, struct nfsd4_open *open,
 	oo->oo_time = 0;
 	oo->oo_last_closed_stid = NULL;
 	INIT_LIST_HEAD(&oo->oo_close_lru);
-	spin_lock(&nn->client_lock);
-	ret = find_openstateowner_str_locked(strhashval,
-			open, clp->cl_minorversion, nn);
+	spin_lock(&clp->cl_lock);
+	ret = find_openstateowner_str_locked(strhashval, open, clp);
 	if (ret == NULL) {
 		hash_openowner(oo, clp, strhashval);
 		ret = oo;
 	} else
 		nfs4_free_openowner(&oo->oo_owner);
-	spin_unlock(&nn->client_lock);
+	spin_unlock(&clp->cl_lock);
 	return oo;
 }
 
@@ -3169,35 +3171,27 @@ move_to_close_lru(struct nfs4_ol_stateid *s, struct net *net)
 }
 
 static int
-same_owner_str(struct nfs4_stateowner *sop, struct xdr_netobj *owner,
-							clientid_t *clid)
+same_owner_str(struct nfs4_stateowner *sop, struct xdr_netobj *owner)
 {
 	return (sop->so_owner.len == owner->len) &&
-		0 == memcmp(sop->so_owner.data, owner->data, owner->len) &&
-		(sop->so_client->cl_clientid.cl_id == clid->cl_id);
+		0 == memcmp(sop->so_owner.data, owner->data, owner->len);
 }
 
 static struct nfs4_openowner *
 find_openstateowner_str_locked(unsigned int hashval, struct nfsd4_open *open,
-			bool sessions, struct nfsd_net *nn)
+			struct nfs4_client *clp)
 {
 	struct nfs4_stateowner *so;
-	struct nfs4_openowner *oo;
-	struct nfs4_client *clp;
 
-	lockdep_assert_held(&nn->client_lock);
+	lockdep_assert_held(&clp->cl_lock);
 
-	list_for_each_entry(so, &nn->ownerstr_hashtbl[hashval], so_strhash) {
+	list_for_each_entry(so, &clp->cl_ownerstr_hashtbl[hashval],
+			    so_strhash) {
 		if (!so->so_is_open_owner)
 			continue;
-		if (same_owner_str(so, &open->op_owner, &open->op_clientid)) {
-			oo = openowner(so);
-			clp = oo->oo_owner.so_client;
-			if ((bool)clp->cl_minorversion != sessions)
-				break;
-			renew_client_locked(clp);
+		if (same_owner_str(so, &open->op_owner)) {
 			atomic_inc(&so->so_count);
-			return oo;
+			return openowner(so);
 		}
 	}
 	return NULL;
@@ -3205,13 +3199,13 @@ find_openstateowner_str_locked(unsigned int hashval, struct nfsd4_open *open,
 
 static struct nfs4_openowner *
 find_openstateowner_str(unsigned int hashval, struct nfsd4_open *open,
-			bool sessions, struct nfsd_net *nn)
+			struct nfs4_client *clp)
 {
 	struct nfs4_openowner *oo;
 
-	spin_lock(&nn->client_lock);
-	oo = find_openstateowner_str_locked(hashval, open, sessions, nn);
-	spin_unlock(&nn->client_lock);
+	spin_lock(&clp->cl_lock);
+	oo = find_openstateowner_str_locked(hashval, open, clp);
+	spin_unlock(&clp->cl_lock);
 	return oo;
 }
 
@@ -3427,8 +3421,8 @@ nfsd4_process_open1(struct nfsd4_compound_state *cstate,
 		return status;
 	clp = cstate->clp;
 
-	strhashval = ownerstr_hashval(clientid->cl_id, &open->op_owner);
-	oo = find_openstateowner_str(strhashval, open, cstate->minorversion, nn);
+	strhashval = ownerstr_hashval(&open->op_owner);
+	oo = find_openstateowner_str(strhashval, open, clp);
 	open->op_openowner = oo;
 	if (!oo) {
 		goto new_owner;
@@ -4812,15 +4806,16 @@ nevermind:
 
 static struct nfs4_lockowner *
 find_lockowner_str_locked(clientid_t *clid, struct xdr_netobj *owner,
-		struct nfsd_net *nn)
+		struct nfs4_client *clp)
 {
-	unsigned int strhashval = ownerstr_hashval(clid->cl_id, owner);
+	unsigned int strhashval = ownerstr_hashval(owner);
 	struct nfs4_stateowner *so;
 
-	list_for_each_entry(so, &nn->ownerstr_hashtbl[strhashval], so_strhash) {
+	list_for_each_entry(so, &clp->cl_ownerstr_hashtbl[strhashval],
+			    so_strhash) {
 		if (so->so_is_open_owner)
 			continue;
-		if (!same_owner_str(so, owner, clid))
+		if (!same_owner_str(so, owner))
 			continue;
 		atomic_inc(&so->so_count);
 		return lockowner(so);
@@ -4830,23 +4825,23 @@ find_lockowner_str_locked(clientid_t *clid, struct xdr_netobj *owner,
 
 static struct nfs4_lockowner *
 find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
-		struct nfsd_net *nn)
+		struct nfs4_client *clp)
 {
 	struct nfs4_lockowner *lo;
 
-	spin_lock(&nn->client_lock);
-	lo = find_lockowner_str_locked(clid, owner, nn);
-	spin_unlock(&nn->client_lock);
+	spin_lock(&clp->cl_lock);
+	lo = find_lockowner_str_locked(clid, owner, clp);
+	spin_unlock(&clp->cl_lock);
 	return lo;
 }
 
 static void nfs4_unhash_lockowner(struct nfs4_stateowner *sop)
 {
-	struct nfsd_net *nn = net_generic(sop->so_client->net, nfsd_net_id);
+	struct nfs4_client *clp = sop->so_client;
 
-	spin_lock(&nn->client_lock);
+	spin_lock(&clp->cl_lock);
 	unhash_lockowner_locked(lockowner(sop));
-	spin_unlock(&nn->client_lock);
+	spin_unlock(&clp->cl_lock);
 }
 
 static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
@@ -4869,7 +4864,6 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp,
 			   struct nfs4_ol_stateid *open_stp,
 			   struct nfsd4_lock *lock)
 {
-	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 	struct nfs4_lockowner *lo, *ret;
 
 	lo = alloc_stateowner(lockowner_slab, &lock->lk_new_owner, clp);
@@ -4880,16 +4874,16 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp,
 	lo->lo_owner.so_seqid = lock->lk_new_lock_seqid;
 	lo->lo_owner.so_free = nfs4_free_lockowner;
 	lo->lo_owner.so_unhash = nfs4_unhash_lockowner;
-	spin_lock(&nn->client_lock);
+	spin_lock(&clp->cl_lock);
 	ret = find_lockowner_str_locked(&clp->cl_clientid,
-			&lock->lk_new_owner, nn);
+			&lock->lk_new_owner, clp);
 	if (ret == NULL) {
 		list_add(&lo->lo_owner.so_strhash,
-			 &nn->ownerstr_hashtbl[strhashval]);
+			 &clp->cl_ownerstr_hashtbl[strhashval]);
 		ret = lo;
 	} else
 		nfs4_free_lockowner(&lo->lo_owner);
-	spin_unlock(&nn->client_lock);
+	spin_unlock(&clp->cl_lock);
 	return lo;
 }
 
@@ -4997,12 +4991,10 @@ static __be32 lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
 	struct nfs4_client *cl = oo->oo_owner.so_client;
 	struct nfs4_lockowner *lo;
 	unsigned int strhashval;
-	struct nfsd_net *nn = net_generic(cl->net, nfsd_net_id);
 
-	lo = find_lockowner_str(&cl->cl_clientid, &lock->v.new.owner, nn);
+	lo = find_lockowner_str(&cl->cl_clientid, &lock->v.new.owner, cl);
 	if (!lo) {
-		strhashval = ownerstr_hashval(cl->cl_clientid.cl_id,
-				&lock->v.new.owner);
+		strhashval = ownerstr_hashval(&lock->v.new.owner);
 		lo = alloc_init_lock_stateowner(strhashval, cl, ost, lock);
 		if (lo == NULL)
 			return nfserr_jukebox;
@@ -5280,7 +5272,8 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		goto out;
 	}
 
-	lo = find_lockowner_str(&lockt->lt_clientid, &lockt->lt_owner, nn);
+	lo = find_lockowner_str(&lockt->lt_clientid, &lockt->lt_owner,
+				cstate->clp);
 	if (lo)
 		file_lock->fl_owner = (fl_owner_t)lo;
 	file_lock->fl_pid = current->tgid;
@@ -5414,7 +5407,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 	struct nfs4_lockowner *lo;
 	struct nfs4_ol_stateid *stp;
 	struct xdr_netobj *owner = &rlockowner->rl_owner;
-	unsigned int hashval = ownerstr_hashval(clid->cl_id, owner);
+	unsigned int hashval = ownerstr_hashval(owner);
 	__be32 status;
 	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 	struct nfs4_client *clp;
@@ -5430,29 +5423,29 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 
 	status = nfserr_locks_held;
 
+	clp = cstate->clp;
 	/* Find the matching lock stateowner */
-	spin_lock(&nn->client_lock);
-	list_for_each_entry(tmp, &nn->ownerstr_hashtbl[hashval], so_strhash) {
+	spin_lock(&clp->cl_lock);
+	list_for_each_entry(tmp, &clp->cl_ownerstr_hashtbl[hashval],
+			    so_strhash) {
 		if (tmp->so_is_open_owner)
 			continue;
-		if (same_owner_str(tmp, owner, clid)) {
+		if (same_owner_str(tmp, owner)) {
 			sop = tmp;
 			atomic_inc(&sop->so_count);
 			break;
 		}
 	}
-	spin_unlock(&nn->client_lock);
 
 	/* No matching owner found, maybe a replay? Just declare victory... */
 	if (!sop) {
+		spin_unlock(&clp->cl_lock);
 		status = nfs_ok;
 		goto out;
 	}
 
 	lo = lockowner(sop);
 	/* see if there are still any locks associated with it */
-	clp = cstate->clp;
-	spin_lock(&clp->cl_lock);
 	list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
 		if (check_for_locks(stp->st_stid.sc_file, lo)) {
 			spin_unlock(&clp->cl_lock);
@@ -5810,10 +5803,6 @@ static int nfs4_state_create_net(struct net *net)
 			CLIENT_HASH_SIZE, GFP_KERNEL);
 	if (!nn->unconf_id_hashtbl)
 		goto err_unconf_id;
-	nn->ownerstr_hashtbl = kmalloc(sizeof(struct list_head) *
-			OWNER_HASH_SIZE, GFP_KERNEL);
-	if (!nn->ownerstr_hashtbl)
-		goto err_ownerstr;
 	nn->sessionid_hashtbl = kmalloc(sizeof(struct list_head) *
 			SESSION_HASH_SIZE, GFP_KERNEL);
 	if (!nn->sessionid_hashtbl)
@@ -5823,8 +5812,6 @@ static int nfs4_state_create_net(struct net *net)
 		INIT_LIST_HEAD(&nn->conf_id_hashtbl[i]);
 		INIT_LIST_HEAD(&nn->unconf_id_hashtbl[i]);
 	}
-	for (i = 0; i < OWNER_HASH_SIZE; i++)
-		INIT_LIST_HEAD(&nn->ownerstr_hashtbl[i]);
 	for (i = 0; i < SESSION_HASH_SIZE; i++)
 		INIT_LIST_HEAD(&nn->sessionid_hashtbl[i]);
 	nn->conf_name_tree = RB_ROOT;
@@ -5840,8 +5827,6 @@ static int nfs4_state_create_net(struct net *net)
 	return 0;
 
 err_sessionid:
-	kfree(nn->ownerstr_hashtbl);
-err_ownerstr:
 	kfree(nn->unconf_id_hashtbl);
 err_unconf_id:
 	kfree(nn->conf_id_hashtbl);
@@ -5871,7 +5856,6 @@ nfs4_state_destroy_net(struct net *net)
 	}
 
 	kfree(nn->sessionid_hashtbl);
-	kfree(nn->ownerstr_hashtbl);
 	kfree(nn->unconf_id_hashtbl);
 	kfree(nn->conf_id_hashtbl);
 	put_net(net);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 9e9e45278b40..7e395f665b0f 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -237,6 +237,7 @@ struct nfsd4_sessionid {
 struct nfs4_client {
 	struct list_head	cl_idhash; 	/* hash by cl_clientid.id */
 	struct rb_node		cl_namenode;	/* link into by-name trees */
+	struct list_head	*cl_ownerstr_hashtbl;
 	struct list_head	cl_openowners;
 	struct idr		cl_stateids;	/* stateid lookup */
 	struct list_head	cl_delegations;
-- 
1.9.3


* [PATCH v4 057/100] nfsd: clean up and reorganize release_lockowner
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (55 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 056/100] nfsd: Move the open owner hash table into struct nfs4_client Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 058/100] nfsd: add locking to stateowner release Jeff Layton
                   ` (42 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Do more within the main loop, and simplify the function a bit. Also,
there's no need to take a stateowner reference unless we're going to call
release_lockowner.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 49 ++++++++++++++++++-------------------------------
 1 file changed, 18 insertions(+), 31 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 93d526bca290..fb2f3ec0708f 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5403,8 +5403,8 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 			struct nfsd4_release_lockowner *rlockowner)
 {
 	clientid_t *clid = &rlockowner->rl_clientid;
-	struct nfs4_stateowner *sop = NULL, *tmp;
-	struct nfs4_lockowner *lo;
+	struct nfs4_stateowner *sop;
+	struct nfs4_lockowner *lo = NULL;
 	struct nfs4_ol_stateid *stp;
 	struct xdr_netobj *owner = &rlockowner->rl_owner;
 	unsigned int hashval = ownerstr_hashval(owner);
@@ -5421,45 +5421,32 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 	if (status)
 		goto out;
 
-	status = nfserr_locks_held;
-
 	clp = cstate->clp;
 	/* Find the matching lock stateowner */
 	spin_lock(&clp->cl_lock);
-	list_for_each_entry(tmp, &clp->cl_ownerstr_hashtbl[hashval],
+	list_for_each_entry(sop, &clp->cl_ownerstr_hashtbl[hashval],
 			    so_strhash) {
-		if (tmp->so_is_open_owner)
-			continue;
-		if (same_owner_str(tmp, owner)) {
-			sop = tmp;
-			atomic_inc(&sop->so_count);
-			break;
-		}
-	}
 
-	/* No matching owner found, maybe a replay? Just declare victory... */
-	if (!sop) {
-		spin_unlock(&clp->cl_lock);
-		status = nfs_ok;
-		goto out;
-	}
+		if (sop->so_is_open_owner || !same_owner_str(sop, owner))
+			continue;
 
-	lo = lockowner(sop);
-	/* see if there are still any locks associated with it */
-	list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
-		if (check_for_locks(stp->st_stid.sc_file, lo)) {
-			spin_unlock(&clp->cl_lock);
-			goto out;
+		/* see if there are still any locks associated with it */
+		lo = lockowner(sop);
+		list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
+			if (check_for_locks(stp->st_stid.sc_file, lo)) {
+				status = nfserr_locks_held;
+				spin_unlock(&clp->cl_lock);
+				goto out;
+			}
 		}
+
+		atomic_inc(&sop->so_count);
+		break;
 	}
 	spin_unlock(&clp->cl_lock);
-
-	status = nfs_ok;
-	sop = NULL;
-	release_lockowner(lo);
+	if (lo)
+		release_lockowner(lo);
 out:
-	if (sop)
-		nfs4_put_stateowner(sop);
 	nfs4_unlock_state();
 	return status;
 }
-- 
1.9.3



* [PATCH v4 058/100] nfsd: add locking to stateowner release
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (56 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 057/100] nfsd: clean up and reorganize release_lockowner Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 059/100] nfsd: optimize destroy_lockowner cl_lock thrashing Jeff Layton
                   ` (41 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Once we remove the client_mutex, we'll need to properly protect
the stateowner reference counts using the cl_lock.
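
As an illustration only (not part of the patch), the dec-and-lock idea can
be sketched in userspace C. The names struct owner, dec_and_lock() and
put_owner() are hypothetical, and a pthread mutex plus C11 atomics stand in
for the cl_lock and the kernel's atomic_dec_and_lock(). The point is that
the final put and the unhash happen under the same lock, so a concurrent
lookup holding that lock never sees a hashed object whose count is zero.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object protected by a lock. */
struct owner {
	atomic_int count;
	bool hashed;
};

/*
 * Userspace approximation of the kernel's atomic_dec_and_lock():
 * decrement without the lock unless the count might reach zero; in that
 * case take the lock first, then decrement. Returns true with the lock
 * held if the count hit zero, false (lock not held) otherwise.
 */
static bool dec_and_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
	int old = atomic_load(cnt);

	assert(old > 0);	/* putting a nonexistent reference is a bug */

	/* Fast path: drop a reference that is clearly not the last one. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return false;
	}

	/* Slow path: the final put must happen under the lock. */
	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(cnt, 1) == 1)
		return true;
	pthread_mutex_unlock(lock);
	return false;
}

/* Returns true if this call dropped the last reference and unhashed. */
static bool put_owner(struct owner *o, pthread_mutex_t *lock)
{
	if (!dec_and_lock(&o->count, lock))
		return false;
	o->hashed = false;		/* unhash while holding the lock */
	pthread_mutex_unlock(lock);
	return true;			/* caller may now free the object */
}
```

With a count of two, the first put_owner() call returns false and leaves
the object hashed; the second returns true with the object unhashed and the
lock released.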

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fb2f3ec0708f..346bdf320a35 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3047,9 +3047,14 @@ static inline void *alloc_stateowner(struct kmem_cache *slab, struct xdr_netobj
 
 static void nfs4_put_stateowner(struct nfs4_stateowner *sop)
 {
-	if (!atomic_dec_and_test(&sop->so_count))
+	struct nfs4_client *clp = sop->so_client;
+
+	might_lock(&clp->cl_lock);
+
+	if (!atomic_dec_and_lock(&sop->so_count, &clp->cl_lock))
 		return;
 	sop->so_unhash(sop);
+	spin_unlock(&clp->cl_lock);
 	sop->so_free(sop);
 }
 
@@ -3064,11 +3069,7 @@ static void hash_openowner(struct nfs4_openowner *oo, struct nfs4_client *clp, u
 
 static void nfs4_unhash_openowner(struct nfs4_stateowner *so)
 {
-	struct nfs4_client *clp = so->so_client;
-
-	spin_lock(&clp->cl_lock);
 	unhash_openowner_locked(openowner(so));
-	spin_unlock(&clp->cl_lock);
 }
 
 static void nfs4_free_openowner(struct nfs4_stateowner *so)
@@ -4837,11 +4838,7 @@ find_lockowner_str(clientid_t *clid, struct xdr_netobj *owner,
 
 static void nfs4_unhash_lockowner(struct nfs4_stateowner *sop)
 {
-	struct nfs4_client *clp = sop->so_client;
-
-	spin_lock(&clp->cl_lock);
 	unhash_lockowner_locked(lockowner(sop));
-	spin_unlock(&clp->cl_lock);
 }
 
 static void nfs4_free_lockowner(struct nfs4_stateowner *sop)
-- 
1.9.3



* [PATCH v4 059/100] nfsd: optimize destroy_lockowner cl_lock thrashing
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (57 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 058/100] nfsd: add locking to stateowner release Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 060/100] nfsd: close potential race in nfsd4_free_stateid Jeff Layton
                   ` (40 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Reduce the cl_lock thrashing in destroy_lockowner. Unhash all of the
lockstateids on the lockowner's list. Put the reference under the lock
and see if it was the last one. If so, then add it to a private list
to be destroyed after we drop the lock.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 50 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 33 insertions(+), 17 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 346bdf320a35..ab248ee470dd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -951,14 +951,23 @@ static void put_generic_stateid(struct nfs4_ol_stateid *stp)
 	nfs4_put_stid(&stp->st_stid);
 }
 
-static void release_lock_stateid(struct nfs4_ol_stateid *stp)
+static void unhash_lock_stateid(struct nfs4_ol_stateid *stp)
 {
 	struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
 
-	spin_lock(&oo->oo_owner.so_client->cl_lock);
-	list_del(&stp->st_locks);
+	lockdep_assert_held(&oo->oo_owner.so_client->cl_lock);
+
+	list_del_init(&stp->st_locks);
 	unhash_generic_stateid(stp);
 	unhash_stid(&stp->st_stid);
+}
+
+static void release_lock_stateid(struct nfs4_ol_stateid *stp)
+{
+	struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
+
+	spin_lock(&oo->oo_owner.so_client->cl_lock);
+	unhash_lock_stateid(stp);
 	spin_unlock(&oo->oo_owner.so_client->cl_lock);
 	put_generic_stateid(stp);
 }
@@ -972,30 +981,37 @@ static void unhash_lockowner_locked(struct nfs4_lockowner *lo)
 	list_del_init(&lo->lo_owner.so_strhash);
 }
 
-static void release_lockowner_stateids(struct nfs4_lockowner *lo)
+static void release_lockowner(struct nfs4_lockowner *lo)
 {
 	struct nfs4_client *clp = lo->lo_owner.so_client;
 	struct nfs4_ol_stateid *stp;
+	struct list_head reaplist;
 
-	lockdep_assert_held(&clp->cl_lock);
+	INIT_LIST_HEAD(&reaplist);
 
+	spin_lock(&clp->cl_lock);
+	unhash_lockowner_locked(lo);
 	while (!list_empty(&lo->lo_owner.so_stateids)) {
 		stp = list_first_entry(&lo->lo_owner.so_stateids,
 				struct nfs4_ol_stateid, st_perstateowner);
-		spin_unlock(&clp->cl_lock);
-		release_lock_stateid(stp);
-		spin_lock(&clp->cl_lock);
+		unhash_lock_stateid(stp);
+		/*
+		 * We now know that no new references can be added to the
+		 * stateid. If ours is the last one, finish the unhashing
+		 * and put it on the list to be reaped.
+		 */
+		if (atomic_dec_and_test(&stp->st_stid.sc_count)) {
+			remove_stid_locked(clp, &stp->st_stid);
+			list_add(&stp->st_locks, &reaplist);
+		}
 	}
-}
-
-static void release_lockowner(struct nfs4_lockowner *lo)
-{
-	struct nfs4_client *clp = lo->lo_owner.so_client;
-
-	spin_lock(&clp->cl_lock);
-	unhash_lockowner_locked(lo);
-	release_lockowner_stateids(lo);
 	spin_unlock(&clp->cl_lock);
+	while (!list_empty(&reaplist)) {
+		stp = list_first_entry(&reaplist, struct nfs4_ol_stateid,
+					st_locks);
+		list_del(&stp->st_locks);
+		stp->st_stid.sc_free(&stp->st_stid);
+	}
 	nfs4_put_stateowner(&lo->lo_owner);
 }
 
-- 
1.9.3



* [PATCH v4 060/100] nfsd: close potential race in nfsd4_free_stateid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (58 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 059/100] nfsd: optimize destroy_lockowner cl_lock thrashing Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 061/100] nfsd: reduce cl_lock thrashing in release_openowner Jeff Layton
                   ` (39 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Once we remove the client_mutex, it'll be possible for the sc_type of a
lock stateid to change after it's found and checked, but before we can
go to destroy it. If that happens, we can end up putting the persistent
reference to the stateid more than once, and unhash it more than once.

Fix this by unhashing the lock stateid prior to dropping the cl_lock but
after finding it.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index ab248ee470dd..cf90c0078503 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4393,17 +4393,6 @@ unlock_state:
 	return status;
 }
 
-static __be32
-nfsd4_free_lock_stateid(struct nfs4_ol_stateid *stp)
-{
-	struct nfs4_lockowner *lo = lockowner(stp->st_stateowner);
-
-	if (check_for_locks(stp->st_stid.sc_file, lo))
-		return nfserr_locks_held;
-	release_lock_stateid(stp);
-	return nfs_ok;
-}
-
 /*
  * Test if the stateid is valid
  */
@@ -4430,6 +4419,7 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	stateid_t *stateid = &free_stateid->fr_stateid;
 	struct nfs4_stid *s;
 	struct nfs4_delegation *dp;
+	struct nfs4_ol_stateid *stp;
 	struct nfs4_client *cl = cstate->session->se_client;
 	__be32 ret = nfserr_bad_stateid;
 
@@ -4447,12 +4437,16 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
 		if (ret)
 			break;
-		if (s->sc_type != NFS4_LOCK_STID) {
-			ret = nfserr_locks_held;
+		stp = openlockstateid(s);
+		ret = nfserr_locks_held;
+		if (s->sc_type == NFS4_OPEN_STID ||
+		    check_for_locks(s->sc_file,
+				    lockowner(stp->st_stateowner)))
 			break;
-		}
+		unhash_lock_stateid(stp);
 		spin_unlock(&cl->cl_lock);
-		ret = nfsd4_free_lock_stateid(openlockstateid(s));
+		put_generic_stateid(stp);
+		ret = nfs_ok;
 		goto out;
 	case NFS4_REVOKED_DELEG_STID:
 		spin_unlock(&cl->cl_lock);
-- 
1.9.3



* [PATCH v4 061/100] nfsd: reduce cl_lock thrashing in release_openowner
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (59 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 060/100] nfsd: close potential race in nfsd4_free_stateid Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 062/100] nfsd: don't thrash the cl_lock while freeing an open stateid Jeff Layton
                   ` (38 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Releasing an openowner is a bit inefficient as it can potentially thrash
the cl_lock if you have a lot of stateids attached to it. Once we remove
the client_mutex, it'll also potentially be dangerous to do this.

Add some functions to make it easier to defer the part of putting a
generic stateid reference that needs to be done outside the cl_lock while
doing the parts that must be done while holding it under a single lock.

First we unhash each open stateid. Then we call
put_generic_stateid_locked which will put the reference to an
nfs4_ol_stateid. If it turns out to be the last reference, it'll go
ahead and remove the stid from the IDR tree and put it onto the reaplist
using the st_locks list_head.

Then, after dropping the lock we'll call free_ol_stateid_reaplist to
walk the list of stateids that are fully unhashed and ready to be freed,
and free each of them. This function can sleep, so it must be done
outside any spinlocks.
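
As an illustration only (not part of the patch), the reaplist pattern
described above can be sketched in userspace C. All names here (struct
item, struct owner, collect_for_reaping(), free_reaplist(), add_item()) are
hypothetical, a pthread mutex stands in for the cl_lock, and a plain
singly linked list replaces the kernel's list_head.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a stateid: refcount plus list linkage. */
struct item {
	int refcount;		/* protected by the lock in this sketch */
	struct item *next;
};

struct owner {
	pthread_mutex_t lock;
	struct item *items;	/* "hashed" items, protected by lock */
};

/*
 * Unhash every item and drop the owner's reference under one lock hold.
 * Items whose count reaches zero move to a private reap list.
 */
static struct item *collect_for_reaping(struct owner *o)
{
	struct item *reaplist = NULL;

	pthread_mutex_lock(&o->lock);
	while (o->items) {
		struct item *it = o->items;

		o->items = it->next;		/* unhash */
		it->next = NULL;
		assert(it->refcount > 0);
		if (--it->refcount == 0) {	/* was that the last ref? */
			it->next = reaplist;
			reaplist = it;
		}
	}
	pthread_mutex_unlock(&o->lock);
	return reaplist;
}

/* Free the collected items with no lock held; returns how many. */
static int free_reaplist(struct item *reaplist)
{
	int n = 0;

	while (reaplist) {
		struct item *it = reaplist;

		reaplist = it->next;
		free(it);
		n++;
	}
	return n;
}

/* Helper for experimenting: hash a new heap-allocated item. */
static struct item *add_item(struct owner *o, int refcount)
{
	struct item *it = malloc(sizeof(*it));

	if (!it)
		return NULL;
	it->refcount = refcount;
	pthread_mutex_lock(&o->lock);
	it->next = o->items;
	o->items = it;
	pthread_mutex_unlock(&o->lock);
	return it;
}
```

The lock is taken once regardless of how many items hang off the owner, and
the potentially slow freeing step runs only after the lock is dropped. An
item that still has another holder is unhashed but not reaped; its last
holder frees it later.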

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 95 +++++++++++++++++++++++++++++++++++------------------
 1 file changed, 63 insertions(+), 32 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index cf90c0078503..da8e2ad8dd8e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -951,6 +951,30 @@ static void put_generic_stateid(struct nfs4_ol_stateid *stp)
 	nfs4_put_stid(&stp->st_stid);
 }
 
+/*
+ * Put the persistent reference to an already unhashed generic stateid, while
+ * holding the cl_lock. If it's the last reference, then put it onto the
+ * reaplist for later destruction.
+ */
+static void put_generic_stateid_locked(struct nfs4_ol_stateid *stp,
+				       struct list_head *reaplist)
+{
+	struct nfs4_stid *s = &stp->st_stid;
+	struct nfs4_client *clp = s->sc_client;
+
+	lockdep_assert_held(&clp->cl_lock);
+
+	WARN_ON_ONCE(!list_empty(&stp->st_locks));
+
+	if (!atomic_dec_and_test(&s->sc_count)) {
+		wake_up_all(&close_wq);
+		return;
+	}
+
+	remove_stid_locked(clp, s);
+	list_add(&stp->st_locks, reaplist);
+}
+
 static void unhash_lock_stateid(struct nfs4_ol_stateid *stp)
 {
 	struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
@@ -981,6 +1005,25 @@ static void unhash_lockowner_locked(struct nfs4_lockowner *lo)
 	list_del_init(&lo->lo_owner.so_strhash);
 }
 
+/*
+ * Free a list of generic stateids that were collected earlier after being
+ * fully unhashed.
+ */
+static void
+free_ol_stateid_reaplist(struct list_head *reaplist)
+{
+	struct nfs4_ol_stateid *stp;
+
+	might_sleep();
+
+	while (!list_empty(reaplist)) {
+		stp = list_first_entry(reaplist, struct nfs4_ol_stateid,
+				       st_locks);
+		list_del(&stp->st_locks);
+		stp->st_stid.sc_free(&stp->st_stid);
+	}
+}
+
 static void release_lockowner(struct nfs4_lockowner *lo)
 {
 	struct nfs4_client *clp = lo->lo_owner.so_client;
@@ -995,23 +1038,10 @@ static void release_lockowner(struct nfs4_lockowner *lo)
 		stp = list_first_entry(&lo->lo_owner.so_stateids,
 				struct nfs4_ol_stateid, st_perstateowner);
 		unhash_lock_stateid(stp);
-		/*
-		 * We now know that no new references can be added to the
-		 * stateid. If ours is the last one, finish the unhashing
-		 * and put it on the list to be reaped.
-		 */
-		if (atomic_dec_and_test(&stp->st_stid.sc_count)) {
-			remove_stid_locked(clp, &stp->st_stid);
-			list_add(&stp->st_locks, &reaplist);
-		}
+		put_generic_stateid_locked(stp, &reaplist);
 	}
 	spin_unlock(&clp->cl_lock);
-	while (!list_empty(&reaplist)) {
-		stp = list_first_entry(&reaplist, struct nfs4_ol_stateid,
-					st_locks);
-		list_del(&stp->st_locks);
-		stp->st_stid.sc_free(&stp->st_stid);
-	}
+	free_ol_stateid_reaplist(&reaplist);
 	nfs4_put_stateowner(&lo->lo_owner);
 }
 
@@ -1032,16 +1062,21 @@ static void release_open_stateid_locks(struct nfs4_ol_stateid *open_stp)
 
 static void unhash_open_stateid(struct nfs4_ol_stateid *stp)
 {
-	spin_lock(&stp->st_stateowner->so_client->cl_lock);
+	lockdep_assert_held(&stp->st_stid.sc_client->cl_lock);
+
 	unhash_generic_stateid(stp);
 	release_open_stateid_locks(stp);
-	spin_unlock(&stp->st_stateowner->so_client->cl_lock);
 }
 
 static void release_open_stateid(struct nfs4_ol_stateid *stp)
 {
+	LIST_HEAD(reaplist);
+
+	spin_lock(&stp->st_stid.sc_client->cl_lock);
 	unhash_open_stateid(stp);
-	put_generic_stateid(stp);
+	put_generic_stateid_locked(stp, &reaplist);
+	spin_unlock(&stp->st_stid.sc_client->cl_lock);
+	free_ol_stateid_reaplist(&reaplist);
 }
 
 static void unhash_openowner_locked(struct nfs4_openowner *oo)
@@ -1065,30 +1100,24 @@ static void release_last_closed_stateid(struct nfs4_openowner *oo)
 	}
 }
 
-static void release_openowner_stateids(struct nfs4_openowner *oo)
+static void release_openowner(struct nfs4_openowner *oo)
 {
 	struct nfs4_ol_stateid *stp;
 	struct nfs4_client *clp = oo->oo_owner.so_client;
+	struct list_head reaplist;
 
-	lockdep_assert_held(&clp->cl_lock);
+	INIT_LIST_HEAD(&reaplist);
 
+	spin_lock(&clp->cl_lock);
+	unhash_openowner_locked(oo);
 	while (!list_empty(&oo->oo_owner.so_stateids)) {
 		stp = list_first_entry(&oo->oo_owner.so_stateids,
 				struct nfs4_ol_stateid, st_perstateowner);
-		spin_unlock(&clp->cl_lock);
-		release_open_stateid(stp);
-		spin_lock(&clp->cl_lock);
+		unhash_open_stateid(stp);
+		put_generic_stateid_locked(stp, &reaplist);
 	}
-}
-
-static void release_openowner(struct nfs4_openowner *oo)
-{
-	struct nfs4_client *clp = oo->oo_owner.so_client;
-
-	spin_lock(&clp->cl_lock);
-	unhash_openowner_locked(oo);
-	release_openowner_stateids(oo);
 	spin_unlock(&clp->cl_lock);
+	free_ol_stateid_reaplist(&reaplist);
 	release_last_closed_stateid(oo);
 	nfs4_put_stateowner(&oo->oo_owner);
 }
@@ -4666,7 +4695,9 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 	struct nfs4_client *clp = s->st_stid.sc_client;
 
 	s->st_stid.sc_type = NFS4_CLOSED_STID;
+	spin_lock(&clp->cl_lock);
 	unhash_open_stateid(s);
+	spin_unlock(&clp->cl_lock);
 
 	if (clp->cl_minorversion)
 		put_generic_stateid(s);
-- 
1.9.3



* [PATCH v4 062/100] nfsd: don't thrash the cl_lock while freeing an open stateid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (60 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 061/100] nfsd: reduce cl_lock thrashing in release_openowner Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 063/100] nfsd: Ensure struct nfs4_client is unhashed before we try to destroy it Jeff Layton
                   ` (37 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

When we remove the client_mutex, we'll have a potential race between
FREE_STATEID and CLOSE.

The root of the problem is that we are walking the st_locks list,
dropping the spinlock and then trying to release the persistent
reference to the lockstateid. In between, a FREE_STATEID call can come
along and take the lock, find the stateid and then try to put the
reference. That leads to a double put.

Fix this by not releasing the cl_lock in order to release each lock
stateid. Use put_generic_stateid_locked to unhash them and gather them
onto a list, and free_ol_stateid_reaplist to free any that end up on the
list.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 34 +++++++++++++++++++---------------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index da8e2ad8dd8e..58448637dd44 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1045,27 +1045,26 @@ static void release_lockowner(struct nfs4_lockowner *lo)
 	nfs4_put_stateowner(&lo->lo_owner);
 }
 
-static void release_open_stateid_locks(struct nfs4_ol_stateid *open_stp)
-	__releases(&open_stp->st_stateowner->so_client->cl_lock)
-	__acquires(&open_stp->st_stateowner->so_client->cl_lock)
+static void release_open_stateid_locks(struct nfs4_ol_stateid *open_stp,
+				       struct list_head *reaplist)
 {
 	struct nfs4_ol_stateid *stp;
 
 	while (!list_empty(&open_stp->st_locks)) {
 		stp = list_entry(open_stp->st_locks.next,
 				struct nfs4_ol_stateid, st_locks);
-		spin_unlock(&open_stp->st_stateowner->so_client->cl_lock);
-		release_lock_stateid(stp);
-		spin_lock(&open_stp->st_stateowner->so_client->cl_lock);
+		unhash_lock_stateid(stp);
+		put_generic_stateid_locked(stp, reaplist);
 	}
 }
 
-static void unhash_open_stateid(struct nfs4_ol_stateid *stp)
+static void unhash_open_stateid(struct nfs4_ol_stateid *stp,
+				struct list_head *reaplist)
 {
 	lockdep_assert_held(&stp->st_stid.sc_client->cl_lock);
 
 	unhash_generic_stateid(stp);
-	release_open_stateid_locks(stp);
+	release_open_stateid_locks(stp, reaplist);
 }
 
 static void release_open_stateid(struct nfs4_ol_stateid *stp)
@@ -1073,7 +1072,7 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
 	LIST_HEAD(reaplist);
 
 	spin_lock(&stp->st_stid.sc_client->cl_lock);
-	unhash_open_stateid(stp);
+	unhash_open_stateid(stp, &reaplist);
 	put_generic_stateid_locked(stp, &reaplist);
 	spin_unlock(&stp->st_stid.sc_client->cl_lock);
 	free_ol_stateid_reaplist(&reaplist);
@@ -1113,7 +1112,7 @@ static void release_openowner(struct nfs4_openowner *oo)
 	while (!list_empty(&oo->oo_owner.so_stateids)) {
 		stp = list_first_entry(&oo->oo_owner.so_stateids,
 				struct nfs4_ol_stateid, st_perstateowner);
-		unhash_open_stateid(stp);
+		unhash_open_stateid(stp, &reaplist);
 		put_generic_stateid_locked(stp, &reaplist);
 	}
 	spin_unlock(&clp->cl_lock);
@@ -4693,16 +4692,21 @@ out:
 static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 {
 	struct nfs4_client *clp = s->st_stid.sc_client;
+	LIST_HEAD(reaplist);
 
 	s->st_stid.sc_type = NFS4_CLOSED_STID;
 	spin_lock(&clp->cl_lock);
-	unhash_open_stateid(s);
-	spin_unlock(&clp->cl_lock);
+	unhash_open_stateid(s, &reaplist);
 
-	if (clp->cl_minorversion)
-		put_generic_stateid(s);
-	else
+	if (clp->cl_minorversion) {
+		put_generic_stateid_locked(s, &reaplist);
+		spin_unlock(&clp->cl_lock);
+		free_ol_stateid_reaplist(&reaplist);
+	} else {
+		spin_unlock(&clp->cl_lock);
+		free_ol_stateid_reaplist(&reaplist);
 		move_to_close_lru(s, clp->net);
+	}
 }
 
 /*
-- 
1.9.3



* [PATCH v4 063/100] nfsd: Ensure struct nfs4_client is unhashed before we try to destroy it
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (61 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 062/100] nfsd: don't thrash the cl_lock while freeing an open stateid Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 064/100] nfsd: Ensure that the laundromat unhashes the client before releasing locks Jeff Layton
                   ` (36 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

When we remove the client_mutex protection, we will need to ensure
that the client can't be found by other threads while we're destroying it.
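
As an illustration only (not part of the patch), the sketch below shows why
an unhash helper built on list_del_init() plus a list_empty() check can be
called more than once. The list helpers are a minimal userspace copy of the
kernel's circular list, and struct client is a hypothetical, trimmed-down
stand-in with an extra counter added purely for demonstration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace copy of the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink and re-initialise, so a later list_empty(entry) is true. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

/* Hypothetical, trimmed-down client with only the fields used here. */
struct client {
	long time;			/* 0 means "expired" */
	struct list_head idhash;
	int unhash_count;		/* counts real unhash operations */
};

/* Safe to call more than once: the second call finds nothing to do. */
static void unhash_client(struct client *clp)
{
	assert(clp != NULL);
	clp->time = 0;
	if (!list_empty(&clp->idhash)) {
		list_del_init(&clp->idhash);
		clp->unhash_count++;
	}
}
```

A plain list_del() would leave the entry's pointers stale, so a second
removal would corrupt the list. Re-initialising the entry is what lets
several teardown paths each call the unhash helper without coordinating.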

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 58448637dd44..5770ac55dcea 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1560,12 +1560,23 @@ free_client(struct nfs4_client *clp)
 }
 
 /* must be called under the client_lock */
-static inline void
+static void
 unhash_client_locked(struct nfs4_client *clp)
 {
+	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 	struct nfsd4_session *ses;
 
-	list_del(&clp->cl_lru);
+	/* Mark the client as expired! */
+	clp->cl_time = 0;
+	/* Make it invisible */
+	if (!list_empty(&clp->cl_idhash)) {
+		list_del_init(&clp->cl_idhash);
+		if (test_bit(NFSD4_CLIENT_CONFIRMED, &clp->cl_flags))
+			rb_erase(&clp->cl_namenode, &nn->conf_name_tree);
+		else
+			rb_erase(&clp->cl_namenode, &nn->unconf_name_tree);
+	}
+	list_del_init(&clp->cl_lru);
 	spin_lock(&clp->cl_lock);
 	list_for_each_entry(ses, &clp->cl_sessions, se_perclnt)
 		list_del_init(&ses->se_hash);
@@ -1573,7 +1584,17 @@ unhash_client_locked(struct nfs4_client *clp)
 }
 
 static void
-destroy_client(struct nfs4_client *clp)
+unhash_client(struct nfs4_client *clp)
+{
+	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+
+	spin_lock(&nn->client_lock);
+	unhash_client_locked(clp);
+	spin_unlock(&nn->client_lock);
+}
+
+static void
+__destroy_client(struct nfs4_client *clp)
 {
 	struct nfs4_openowner *oo;
 	struct nfs4_delegation *dp;
@@ -1608,22 +1629,24 @@ destroy_client(struct nfs4_client *clp)
 	nfsd4_shutdown_callback(clp);
 	if (clp->cl_cb_conn.cb_xprt)
 		svc_xprt_put(clp->cl_cb_conn.cb_xprt);
-	list_del(&clp->cl_idhash);
-	if (test_bit(NFSD4_CLIENT_CONFIRMED, &clp->cl_flags))
-		rb_erase(&clp->cl_namenode, &nn->conf_name_tree);
-	else
-		rb_erase(&clp->cl_namenode, &nn->unconf_name_tree);
 	spin_lock(&nn->client_lock);
-	unhash_client_locked(clp);
 	WARN_ON_ONCE(atomic_read(&clp->cl_refcount));
 	free_client(clp);
 	spin_unlock(&nn->client_lock);
 }
 
+static void
+destroy_client(struct nfs4_client *clp)
+{
+	unhash_client(clp);
+	__destroy_client(clp);
+}
+
 static void expire_client(struct nfs4_client *clp)
 {
+	unhash_client(clp);
 	nfsd4_client_record_remove(clp);
-	destroy_client(clp);
+	__destroy_client(clp);
 }
 
 static void copy_verf(struct nfs4_client *target, nfs4_verifier *source)
-- 
1.9.3



* [PATCH v4 064/100] nfsd: Ensure that the laundromat unhashes the client before releasing locks
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (62 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 063/100] nfsd: Ensure struct nfs4_client is unhashed before we try to destroy it Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 065/100] nfsd: Don't require client_lock in free_client Jeff Layton
                   ` (35 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

If we leave the client on the confirmed/unconfirmed tables, and leave
the sessions visible on the sessionid_hashtbl, then someone might
find them before we've had a chance to destroy them.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 5770ac55dcea..d290dbebafdf 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4124,13 +4124,15 @@ nfs4_laundromat(struct nfsd_net *nn)
 				clp->cl_clientid.cl_id);
 			continue;
 		}
-		list_move(&clp->cl_lru, &reaplist);
+		unhash_client_locked(clp);
+		list_add(&clp->cl_lru, &reaplist);
 	}
 	spin_unlock(&nn->client_lock);
 	list_for_each_safe(pos, next, &reaplist) {
 		clp = list_entry(pos, struct nfs4_client, cl_lru);
 		dprintk("NFSD: purging unused client (clientid %08x)\n",
 			clp->cl_clientid.cl_id);
+		list_del_init(&clp->cl_lru);
 		expire_client(clp);
 	}
 	spin_lock(&state_lock);
-- 
1.9.3



* [PATCH v4 065/100] nfsd: Don't require client_lock in free_client
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (63 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 064/100] nfsd: Ensure that the laundromat unhashes the client before releasing locks Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 066/100] nfsd: Move create_client() call outside the lock Jeff Layton
                   ` (34 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

The struct nfs4_client is supposed to be invisible and unreferenced
before it gets here.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d290dbebafdf..cb319903d93b 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1386,9 +1386,6 @@ static void __free_session(struct nfsd4_session *ses)
 
 static void free_session(struct nfsd4_session *ses)
 {
-	struct nfsd_net *nn = net_generic(ses->se_client->net, nfsd_net_id);
-
-	lockdep_assert_held(&nn->client_lock);
 	nfsd4_del_conns(ses);
 	nfsd4_put_drc_mem(&ses->se_fchannel);
 	__free_session(ses);
@@ -1538,9 +1535,6 @@ err_no_name:
 static void
 free_client(struct nfs4_client *clp)
 {
-	struct nfsd_net __maybe_unused *nn = net_generic(clp->net, nfsd_net_id);
-
-	lockdep_assert_held(&nn->client_lock);
 	while (!list_empty(&clp->cl_sessions)) {
 		struct nfsd4_session *ses;
 		ses = list_entry(clp->cl_sessions.next, struct nfsd4_session,
@@ -1599,7 +1593,6 @@ __destroy_client(struct nfs4_client *clp)
 	struct nfs4_openowner *oo;
 	struct nfs4_delegation *dp;
 	struct list_head reaplist;
-	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 
 	INIT_LIST_HEAD(&reaplist);
 	spin_lock(&state_lock);
@@ -1629,10 +1622,7 @@ __destroy_client(struct nfs4_client *clp)
 	nfsd4_shutdown_callback(clp);
 	if (clp->cl_cb_conn.cb_xprt)
 		svc_xprt_put(clp->cl_cb_conn.cb_xprt);
-	spin_lock(&nn->client_lock);
-	WARN_ON_ONCE(atomic_read(&clp->cl_refcount));
 	free_client(clp);
-	spin_unlock(&nn->client_lock);
 }
 
 static void
@@ -1835,7 +1825,6 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 	struct sockaddr *sa = svc_addr(rqstp);
 	int ret;
 	struct net *net = SVC_NET(rqstp);
-	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
 	clp = alloc_client(name);
 	if (clp == NULL)
@@ -1843,9 +1832,7 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 
 	ret = copy_cred(&clp->cl_cred, &rqstp->rq_cred);
 	if (ret) {
-		spin_lock(&nn->client_lock);
 		free_client(clp);
-		spin_unlock(&nn->client_lock);
 		return NULL;
 	}
 	clp->cl_time = get_seconds();
-- 
1.9.3



* [PATCH v4 066/100] nfsd: Move create_client() call outside the lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (64 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 065/100] nfsd: Don't require client_lock in free_client Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 067/100] nfsd: Protect unconfirmed client creation using client_lock Jeff Layton
                   ` (33 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Call create_client() before taking the lock, both for efficiency and
because we want to use spin locks instead of relying on the client_mutex.
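
As an illustration only (not part of the patch), the allocate-first pattern
can be sketched in userspace C. The names struct client, struct table and
find_or_add() are hypothetical, and a pthread mutex stands in for the lock
that will eventually replace the client_mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct client {
	char name[32];
	struct client *next;
};

/* Hypothetical table of clients, protected by a lock. */
struct table {
	pthread_mutex_t lock;
	struct client *head;
};

/*
 * Allocate before taking the lock, look up under the lock, and free the
 * unused allocation after unlocking. Returns the client that is in the
 * table for this name, or NULL on allocation failure.
 */
static struct client *find_or_add(struct table *t, const char *name)
{
	struct client *spare, *found;

	assert(t != NULL && name != NULL);

	spare = calloc(1, sizeof(*spare));	/* may block: do it unlocked */
	if (!spare)
		return NULL;
	strncpy(spare->name, name, sizeof(spare->name) - 1);

	pthread_mutex_lock(&t->lock);
	for (found = t->head; found; found = found->next)
		if (strcmp(found->name, spare->name) == 0)
			break;
	if (!found) {
		/* No existing entry: the spare becomes the table entry. */
		spare->next = t->head;
		t->head = spare;
		found = spare;
		spare = NULL;
	}
	pthread_mutex_unlock(&t->lock);

	free(spare);	/* no-op when the spare was consumed */
	return found;
}
```

The cost is an allocation that is sometimes thrown away when an existing
entry is found. In exchange, nothing that can block runs inside the
critical section.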

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 35 +++++++++++++++++++----------------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index cb319903d93b..9e18276877d7 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2153,6 +2153,10 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,
 		return nfserr_encr_alg_unsupp;
 	}
 
+	new = create_client(exid->clname, rqstp, &verf);
+	if (new == NULL)
+		return nfserr_jukebox;
+
 	/* Cases below refer to rfc 5661 section 18.35.4: */
 	nfs4_lock_state();
 	conf = find_confirmed_client_by_name(&exid->clname, nn);
@@ -2179,7 +2183,6 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,
 			}
 			/* case 6 */
 			exid->flags |= EXCHGID4_FLAG_CONFIRMED_R;
-			new = conf;
 			goto out_copy;
 		}
 		if (!creds_match) { /* case 3 */
@@ -2192,7 +2195,6 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,
 		}
 		if (verfs_match) { /* case 2 */
 			conf->cl_exchange_flags |= EXCHGID4_FLAG_CONFIRMED_R;
-			new = conf;
 			goto out_copy;
 		}
 		/* case 5, client reboot */
@@ -2210,29 +2212,28 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,
 
 	/* case 1 (normal case) */
 out_new:
-	new = create_client(exid->clname, rqstp, &verf);
-	if (new == NULL) {
-		status = nfserr_jukebox;
-		goto out;
-	}
 	new->cl_minorversion = cstate->minorversion;
 	new->cl_mach_cred = (exid->spa_how == SP4_MACH_CRED);
 
 	gen_clid(new, nn);
 	add_to_unconfirmed(new);
+	conf = new;
+	new = NULL;
 out_copy:
-	exid->clientid.cl_boot = new->cl_clientid.cl_boot;
-	exid->clientid.cl_id = new->cl_clientid.cl_id;
+	exid->clientid.cl_boot = conf->cl_clientid.cl_boot;
+	exid->clientid.cl_id = conf->cl_clientid.cl_id;
 
-	exid->seqid = new->cl_cs_slot.sl_seqid + 1;
-	nfsd4_set_ex_flags(new, exid);
+	exid->seqid = conf->cl_cs_slot.sl_seqid + 1;
+	nfsd4_set_ex_flags(conf, exid);
 
 	dprintk("nfsd4_exchange_id seqid %d flags %x\n",
-		new->cl_cs_slot.sl_seqid, new->cl_exchange_flags);
+		conf->cl_cs_slot.sl_seqid, conf->cl_exchange_flags);
 	status = nfs_ok;
 
 out:
 	nfs4_unlock_state();
+	if (new)
+		free_client(new);
 	return status;
 }
 
@@ -2875,6 +2876,9 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	__be32 			status;
 	struct nfsd_net		*nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 
+	new = create_client(clname, rqstp, &clverifier);
+	if (new == NULL)
+		return nfserr_jukebox;
 	/* Cases below refer to rfc 3530 section 14.2.33: */
 	nfs4_lock_state();
 	conf = find_confirmed_client_by_name(&clname, nn);
@@ -2895,10 +2899,6 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	unconf = find_unconfirmed_client_by_name(&clname, nn);
 	if (unconf)
 		expire_client(unconf);
-	status = nfserr_jukebox;
-	new = create_client(clname, rqstp, &clverifier);
-	if (new == NULL)
-		goto out;
 	if (conf && same_verf(&conf->cl_verifier, &clverifier))
 		/* case 1: probable callback update */
 		copy_clid(new, conf);
@@ -2910,9 +2910,12 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	setclid->se_clientid.cl_boot = new->cl_clientid.cl_boot;
 	setclid->se_clientid.cl_id = new->cl_clientid.cl_id;
 	memcpy(setclid->se_confirm.data, new->cl_confirm.data, sizeof(setclid->se_confirm.data));
+	new = NULL;
 	status = nfs_ok;
 out:
 	nfs4_unlock_state();
+	if (new)
+		free_client(new);
 	return status;
 }
 
-- 
1.9.3



* [PATCH v4 067/100] nfsd: Protect unconfirmed client creation using client_lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (65 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 066/100] nfsd: Move create_client() call outside the lock Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 068/100] nfsd: Protect session creation and client confirm " Jeff Layton
                   ` (32 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Protect unconfirmed client creation with the client_lock instead of
relying on the client_mutex.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
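
Not part of the patch: a rough userspace sketch of the "unhash under the
lock, destroy after unlocking" split that the diff below applies with
unhash_client_locked() and expire_client(). It assumes a pthread mutex in
place of nn->client_lock and a singly linked list in place of the hash
tables and rb-trees; replace_client(), new_client() and the destroyed
counter are hypothetical helpers made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for nfs4_client and the per-net client tables. */
struct client {
	int id;
	struct client *next;
};

static pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;
static struct client *clients;	/* protected by client_lock */
static int destroyed;		/* counts destroy_client() calls */

static struct client *new_client(int id)
{
	struct client *clp = calloc(1, sizeof(*clp));

	assert(clp != NULL);
	clp->id = id;
	return clp;
}

/* Caller holds client_lock. */
static struct client *find_client_locked(int id)
{
	struct client *clp;

	for (clp = clients; clp != NULL; clp = clp->next)
		if (clp->id == id)
			return clp;
	return NULL;
}

/* Caller holds client_lock. Cheap: only unlinks, never frees. */
static void unhash_client_locked(struct client *clp)
{
	struct client **pp;

	for (pp = &clients; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == clp) {
			*pp = clp->next;
			clp->next = NULL;
			return;
		}
	}
}

/* Potentially slow teardown; must be called without client_lock. */
static void destroy_client(struct client *clp)
{
	free(clp);
	destroyed++;
}

/* Install 'new', replacing any client that has the same id. */
static void replace_client(struct client *new)
{
	struct client *old;

	pthread_mutex_lock(&client_lock);
	old = find_client_locked(new->id);
	if (old != NULL)
		unhash_client_locked(old);	/* invisible to lookups from now on */
	new->next = clients;
	clients = new;
	pthread_mutex_unlock(&client_lock);

	/* The expensive part runs only after the lock is dropped. */
	if (old != NULL)
		destroy_client(old);
}
```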
 fs/nfsd/nfs4state.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9e18276877d7..f27e714bd6b6 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1895,7 +1895,7 @@ add_to_unconfirmed(struct nfs4_client *clp)
 	add_clp_to_name_tree(clp, &nn->unconf_name_tree);
 	idhashval = clientid_hashval(clp->cl_clientid.cl_id);
 	list_add(&clp->cl_idhash, &nn->unconf_id_hashtbl[idhashval]);
-	renew_client(clp);
+	renew_client_locked(clp);
 }
 
 static void
@@ -1909,7 +1909,7 @@ move_to_confirmed(struct nfs4_client *clp)
 	rb_erase(&clp->cl_namenode, &nn->unconf_name_tree);
 	add_clp_to_name_tree(clp, &nn->conf_name_tree);
 	set_bit(NFSD4_CLIENT_CONFIRMED, &clp->cl_flags);
-	renew_client(clp);
+	renew_client_locked(clp);
 }
 
 static struct nfs4_client *
@@ -1922,7 +1922,7 @@ find_client_in_id_table(struct list_head *tbl, clientid_t *clid, bool sessions)
 		if (same_clid(&clp->cl_clientid, clid)) {
 			if ((bool)clp->cl_minorversion != sessions)
 				return NULL;
-			renew_client(clp);
+			renew_client_locked(clp);
 			return clp;
 		}
 	}
@@ -2124,7 +2124,8 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,
 		  struct nfsd4_compound_state *cstate,
 		  struct nfsd4_exchange_id *exid)
 {
-	struct nfs4_client *unconf, *conf, *new;
+	struct nfs4_client *conf, *new;
+	struct nfs4_client *unconf = NULL;
 	__be32 status;
 	char			addr_str[INET6_ADDRSTRLEN];
 	nfs4_verifier		verf = exid->verifier;
@@ -2159,6 +2160,7 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,
 
 	/* Cases below refer to rfc 5661 section 18.35.4: */
 	nfs4_lock_state();
+	spin_lock(&nn->client_lock);
 	conf = find_confirmed_client_by_name(&exid->clname, nn);
 	if (conf) {
 		bool creds_match = same_creds(&conf->cl_cred, &rqstp->rq_cred);
@@ -2190,7 +2192,6 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,
 				status = nfserr_clid_inuse;
 				goto out;
 			}
-			expire_client(conf);
 			goto out_new;
 		}
 		if (verfs_match) { /* case 2 */
@@ -2198,6 +2199,7 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,
 			goto out_copy;
 		}
 		/* case 5, client reboot */
+		conf = NULL;
 		goto out_new;
 	}
 
@@ -2208,17 +2210,18 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,
 
 	unconf  = find_unconfirmed_client_by_name(&exid->clname, nn);
 	if (unconf) /* case 4, possible retry or client restart */
-		expire_client(unconf);
+		unhash_client_locked(unconf);
 
 	/* case 1 (normal case) */
 out_new:
+	if (conf)
+		unhash_client_locked(conf);
 	new->cl_minorversion = cstate->minorversion;
 	new->cl_mach_cred = (exid->spa_how == SP4_MACH_CRED);
 
 	gen_clid(new, nn);
 	add_to_unconfirmed(new);
-	conf = new;
-	new = NULL;
+	swap(new, conf);
 out_copy:
 	exid->clientid.cl_boot = conf->cl_clientid.cl_boot;
 	exid->clientid.cl_id = conf->cl_clientid.cl_id;
@@ -2231,9 +2234,12 @@ out_copy:
 	status = nfs_ok;
 
 out:
+	spin_unlock(&nn->client_lock);
 	nfs4_unlock_state();
 	if (new)
-		free_client(new);
+		expire_client(new);
+	if (unconf)
+		expire_client(unconf);
 	return status;
 }
 
@@ -2872,7 +2878,8 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 {
 	struct xdr_netobj 	clname = setclid->se_name;
 	nfs4_verifier		clverifier = setclid->se_verf;
-	struct nfs4_client	*conf, *unconf, *new;
+	struct nfs4_client	*conf, *new;
+	struct nfs4_client	*unconf = NULL;
 	__be32 			status;
 	struct nfsd_net		*nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 
@@ -2881,6 +2888,7 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		return nfserr_jukebox;
 	/* Cases below refer to rfc 3530 section 14.2.33: */
 	nfs4_lock_state();
+	spin_lock(&nn->client_lock);
 	conf = find_confirmed_client_by_name(&clname, nn);
 	if (conf) {
 		/* case 0: */
@@ -2898,7 +2906,7 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	}
 	unconf = find_unconfirmed_client_by_name(&clname, nn);
 	if (unconf)
-		expire_client(unconf);
+		unhash_client_locked(unconf);
 	if (conf && same_verf(&conf->cl_verifier, &clverifier))
 		/* case 1: probable callback update */
 		copy_clid(new, conf);
@@ -2913,9 +2921,12 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	new = NULL;
 	status = nfs_ok;
 out:
+	spin_unlock(&nn->client_lock);
 	nfs4_unlock_state();
 	if (new)
 		free_client(new);
+	if (unconf)
+		expire_client(unconf);
 	return status;
 }
 
-- 
1.9.3



* [PATCH v4 068/100] nfsd: Protect session creation and client confirm using client_lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (66 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 067/100] nfsd: Protect unconfirmed client creation using client_lock Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 069/100] nfsd: Protect nfsd4_destroy_clientid " Jeff Layton
                   ` (31 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

In particular, we want to ensure that move_to_confirmed() is protected
by the nn->client_lock spin lock, so that we can use that lock when
looking up the clientid etc., instead of relying on the client_mutex.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
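
Not part of the patch: a small userspace sketch of the sequence used
around nfsd4_probe_callback() in the diff below, where the client is
pinned with a reference, the lock is dropped for the call, and the
reference is put once the lock has been retaken. It assumes a pthread
mutex in place of nn->client_lock and plain ints in place of the atomic
refcount. confirm_client(), probe_callback() and failed_expiries are
hypothetical, and unlike the patch this sketch checks the return value of
get_client_locked().

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct nfs4_client. */
struct client {
	int refcount;		/* protected by client_lock in this sketch */
	int expired;
	int failed_expiries;	/* expiry attempts refused because of a pin */
};

static pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller holds client_lock. Refuses to pin an expired client. */
static int get_client_locked(struct client *clp)
{
	if (clp->expired)
		return -1;
	clp->refcount++;
	return 0;
}

/* Caller holds client_lock. */
static void put_client_locked(struct client *clp)
{
	clp->refcount--;
}

/* Caller holds client_lock. Fails while anyone holds a reference. */
static int mark_client_expired_locked(struct client *clp)
{
	if (clp->refcount != 0)
		return -1;
	clp->expired = 1;
	return 0;
}

/*
 * Stand-in for an operation that must not run under client_lock. To
 * show why the pin matters, it simulates a competing expiry attempt;
 * taking the (non-recursive) lock here would deadlock if the caller
 * still held it.
 */
static void probe_callback(struct client *clp)
{
	pthread_mutex_lock(&client_lock);
	if (mark_client_expired_locked(clp) != 0)
		clp->failed_expiries++;
	pthread_mutex_unlock(&client_lock);
}

static int confirm_client(struct client *clp)
{
	pthread_mutex_lock(&client_lock);
	if (get_client_locked(clp) != 0) {
		pthread_mutex_unlock(&client_lock);
		return -1;
	}
	pthread_mutex_unlock(&client_lock);

	probe_callback(clp);		/* runs unlocked, client is pinned */

	pthread_mutex_lock(&client_lock);
	put_client_locked(clp);
	pthread_mutex_unlock(&client_lock);
	return 0;
}
```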
 fs/nfsd/nfs4state.c | 65 ++++++++++++++++++++++++++++++++---------------------
 1 file changed, 39 insertions(+), 26 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index f27e714bd6b6..314594a05952 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -141,17 +141,6 @@ static __be32 mark_client_expired_locked(struct nfs4_client *clp)
 	return nfs_ok;
 }
 
-static __be32 mark_client_expired(struct nfs4_client *clp)
-{
-	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
-	__be32 ret;
-
-	spin_lock(&nn->client_lock);
-	ret = mark_client_expired_locked(clp);
-	spin_unlock(&nn->client_lock);
-	return ret;
-}
-
 static __be32 get_client_locked(struct nfs4_client *clp)
 {
 	if (is_client_expired(clp))
@@ -1407,12 +1396,10 @@ static void init_session(struct svc_rqst *rqstp, struct nfsd4_session *new, stru
 	new->se_cb_sec = cses->cb_sec;
 	atomic_set(&new->se_ref, 0);
 	idx = hash_sessionid(&new->se_sessionid);
-	spin_lock(&nn->client_lock);
 	list_add(&new->se_hash, &nn->sessionid_hashtbl[idx]);
 	spin_lock(&clp->cl_lock);
 	list_add(&new->se_perclnt, &clp->cl_sessions);
 	spin_unlock(&clp->cl_lock);
-	spin_unlock(&nn->client_lock);
 
 	if (cses->flags & SESSION4_BACK_CHAN) {
 		struct sockaddr *sa = svc_addr(rqstp);
@@ -2383,6 +2370,7 @@ nfsd4_create_session(struct svc_rqst *rqstp,
 {
 	struct sockaddr *sa = svc_addr(rqstp);
 	struct nfs4_client *conf, *unconf;
+	struct nfs4_client *old = NULL;
 	struct nfsd4_session *new;
 	struct nfsd4_conn *conn;
 	struct nfsd4_clid_slot *cs_slot = NULL;
@@ -2409,6 +2397,7 @@ nfsd4_create_session(struct svc_rqst *rqstp,
 		goto out_free_session;
 
 	nfs4_lock_state();
+	spin_lock(&nn->client_lock);
 	unconf = find_unconfirmed_client(&cr_ses->clientid, true, nn);
 	conf = find_confirmed_client(&cr_ses->clientid, true, nn);
 	WARN_ON_ONCE(conf && unconf);
@@ -2427,7 +2416,6 @@ nfsd4_create_session(struct svc_rqst *rqstp,
 			goto out_free_conn;
 		}
 	} else if (unconf) {
-		struct nfs4_client *old;
 		if (!same_creds(&unconf->cl_cred, &rqstp->rq_cred) ||
 		    !rpc_cmp_addr(sa, (struct sockaddr *) &unconf->cl_addr)) {
 			status = nfserr_clid_inuse;
@@ -2445,10 +2433,10 @@ nfsd4_create_session(struct svc_rqst *rqstp,
 		}
 		old = find_confirmed_client_by_name(&unconf->cl_name, nn);
 		if (old) {
-			status = mark_client_expired(old);
+			status = mark_client_expired_locked(old);
 			if (status)
 				goto out_free_conn;
-			expire_client(old);
+			unhash_client_locked(old);
 		}
 		move_to_confirmed(unconf);
 		conf = unconf;
@@ -2464,20 +2452,29 @@ nfsd4_create_session(struct svc_rqst *rqstp,
 	cr_ses->flags &= ~SESSION4_RDMA;
 
 	init_session(rqstp, new, conf, cr_ses);
-	nfsd4_init_conn(rqstp, conn, new);
+	nfsd4_get_session_locked(new);
 
 	memcpy(cr_ses->sessionid.data, new->se_sessionid.data,
 	       NFS4_MAX_SESSIONID_LEN);
 	cs_slot->sl_seqid++;
 	cr_ses->seqid = cs_slot->sl_seqid;
 
-	/* cache solo and embedded create sessions under the state lock */
+	/* cache solo and embedded create sessions under the client_lock */
 	nfsd4_cache_create_session(cr_ses, cs_slot, status);
+	spin_unlock(&nn->client_lock);
+	/* init connection and backchannel */
+	nfsd4_init_conn(rqstp, conn, new);
+	nfsd4_put_session(new);
 	nfs4_unlock_state();
+	if (old)
+		expire_client(old);
 	return status;
 out_free_conn:
+	spin_unlock(&nn->client_lock);
 	nfs4_unlock_state();
 	free_conn(conn);
+	if (old)
+		expire_client(old);
 out_free_session:
 	__free_session(new);
 out_release_drc_mem:
@@ -2937,6 +2934,7 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
 			 struct nfsd4_setclientid_confirm *setclientid_confirm)
 {
 	struct nfs4_client *conf, *unconf;
+	struct nfs4_client *old = NULL;
 	nfs4_verifier confirm = setclientid_confirm->sc_confirm; 
 	clientid_t * clid = &setclientid_confirm->sc_clientid;
 	__be32 status;
@@ -2946,6 +2944,7 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
 		return nfserr_stale_clientid;
 	nfs4_lock_state();
 
+	spin_lock(&nn->client_lock);
 	conf = find_confirmed_client(clid, false, nn);
 	unconf = find_unconfirmed_client(clid, false, nn);
 	/*
@@ -2969,21 +2968,29 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
 	}
 	status = nfs_ok;
 	if (conf) { /* case 1: callback update */
+		old = unconf;
+		unhash_client_locked(old);
 		nfsd4_change_callback(conf, &unconf->cl_cb_conn);
-		nfsd4_probe_callback(conf);
-		expire_client(unconf);
 	} else { /* case 3: normal case; new or rebooted client */
-		conf = find_confirmed_client_by_name(&unconf->cl_name, nn);
-		if (conf) {
-			status = mark_client_expired(conf);
+		old = find_confirmed_client_by_name(&unconf->cl_name, nn);
+		if (old) {
+			status = mark_client_expired_locked(old);
 			if (status)
 				goto out;
-			expire_client(conf);
+			unhash_client_locked(old);
 		}
 		move_to_confirmed(unconf);
-		nfsd4_probe_callback(unconf);
+		conf = unconf;
 	}
+	get_client_locked(conf);
+	spin_unlock(&nn->client_lock);
+	nfsd4_probe_callback(conf);
+	spin_lock(&nn->client_lock);
+	put_client_renew_locked(conf);
 out:
+	spin_unlock(&nn->client_lock);
+	if (old)
+		expire_client(old);
 	nfs4_unlock_state();
 	return status;
 }
@@ -5624,7 +5631,13 @@ nfs4_check_open_reclaim(clientid_t *clid,
 
 u64 nfsd_forget_client(struct nfs4_client *clp, u64 max)
 {
-	if (mark_client_expired(clp))
+	__be32 ret;
+	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+
+	spin_lock(&nn->client_lock);
+	ret = mark_client_expired_locked(clp);
+	spin_unlock(&nn->client_lock);
+	if (ret != nfs_ok)
 		return 0;
 	expire_client(clp);
 	return 1;
-- 
1.9.3



* [PATCH v4 069/100] nfsd: Protect nfsd4_destroy_clientid using client_lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (67 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 068/100] nfsd: Protect session creation and client confirm " Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 070/100] nfsd: Ensure lookup_clientid() takes client_lock Jeff Layton
                   ` (30 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Protect nfsd4_destroy_clientid() with the client_lock instead of relying
on the client_mutex.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 314594a05952..8c3f63f9dd4e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2798,22 +2798,23 @@ nfsd4_sequence_done(struct nfsd4_compoundres *resp)
 __be32
 nfsd4_destroy_clientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, struct nfsd4_destroy_clientid *dc)
 {
-	struct nfs4_client *conf, *unconf, *clp;
+	struct nfs4_client *conf, *unconf;
+	struct nfs4_client *clp = NULL;
 	__be32 status = 0;
 	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 
 	nfs4_lock_state();
+	spin_lock(&nn->client_lock);
 	unconf = find_unconfirmed_client(&dc->clientid, true, nn);
 	conf = find_confirmed_client(&dc->clientid, true, nn);
 	WARN_ON_ONCE(conf && unconf);
 
 	if (conf) {
-		clp = conf;
-
 		if (client_has_state(conf)) {
 			status = nfserr_clientid_busy;
 			goto out;
 		}
+		clp = conf;
 	} else if (unconf)
 		clp = unconf;
 	else {
@@ -2821,12 +2822,16 @@ nfsd4_destroy_clientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *csta
 		goto out;
 	}
 	if (!mach_creds_match(clp, rqstp)) {
+		clp = NULL;
 		status = nfserr_wrong_cred;
 		goto out;
 	}
-	expire_client(clp);
+	unhash_client_locked(clp);
 out:
+	spin_unlock(&nn->client_lock);
 	nfs4_unlock_state();
+	if (clp)
+		expire_client(clp);
 	return status;
 }
 
-- 
1.9.3



* [PATCH v4 070/100] nfsd: Ensure lookup_clientid() takes client_lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (68 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 069/100] nfsd: Protect nfsd4_destroy_clientid " Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:03 ` [PATCH v4 071/100] nfsd: Add lockdep assertions to document the nfs4_client/session locking Jeff Layton
                   ` (29 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Ensure that the client lookup is done safely under the client_lock, so
we're not relying on the client_mutex.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 8c3f63f9dd4e..eea6a844e2f2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3467,13 +3467,17 @@ static __be32 lookup_clientid(clientid_t *clid,
 	 * will be false.
 	 */
 	WARN_ON_ONCE(cstate->session);
+	spin_lock(&nn->client_lock);
 	found = find_confirmed_client(clid, false, nn);
-	if (!found)
+	if (!found) {
+		spin_unlock(&nn->client_lock);
 		return nfserr_expired;
+	}
+	atomic_inc(&found->cl_refcount);
+	spin_unlock(&nn->client_lock);
 
 	/* Cache the nfs4_client in cstate! */
 	cstate->clp = found;
-	atomic_inc(&found->cl_refcount);
 	return nfs_ok;
 }
 
-- 
1.9.3



* [PATCH v4 071/100] nfsd: Add lockdep assertions to document the nfs4_client/session locking
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (69 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 070/100] nfsd: Ensure lookup_clientid() takes client_lock Jeff Layton
@ 2014-07-08 18:03 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 072/100] nfsd: protect the close_lru list and oo_last_closed_stid with client_lock Jeff Layton
                   ` (28 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:03 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
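
Not part of the patch: lockdep_assert_held() only checks anything on
kernels built with lockdep enabled, so as a rough userspace analogue of
what these assertions document, here is a sketch of a mutex wrapper that
records its owner so that "_locked" helpers can assert the caller really
holds it. struct tracked_lock and the tl_*() helpers are hypothetical,
and tl_is_held() is only meaningful when the answer is "yes, held by the
calling thread", which is the case an assertion cares about.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical mutex wrapper that remembers which thread holds it. */
struct tracked_lock {
	pthread_mutex_t mutex;
	pthread_t owner;	/* valid only while 'held' is true */
	bool held;
};

static void tl_lock(struct tracked_lock *l)
{
	pthread_mutex_lock(&l->mutex);
	l->owner = pthread_self();
	l->held = true;
}

static void tl_unlock(struct tracked_lock *l)
{
	l->held = false;
	pthread_mutex_unlock(&l->mutex);
}

/* True if the calling thread currently holds the lock. */
static bool tl_is_held(struct tracked_lock *l)
{
	return l->held && pthread_equal(l->owner, pthread_self());
}

static struct tracked_lock client_lock = {
	.mutex = PTHREAD_MUTEX_INITIALIZER,
};
static int nr_clients;		/* protected by client_lock */

/* The "_locked" suffix is a promise; the assertion enforces it. */
static void add_client_locked(void)
{
	assert(tl_is_held(&client_lock));
	nr_clients++;
}

static void add_client(void)
{
	tl_lock(&client_lock);
	add_client_locked();
	tl_unlock(&client_lock);
}
```

Calling add_client_locked() without first taking the lock trips the
assertion, which is the kind of mistake the annotations in the diff below
are meant to catch.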
 fs/nfsd/nfs4state.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index eea6a844e2f2..da2410f392b4 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -143,6 +143,10 @@ static __be32 mark_client_expired_locked(struct nfs4_client *clp)
 
 static __be32 get_client_locked(struct nfs4_client *clp)
 {
+	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+
+	lockdep_assert_held(&nn->client_lock);
+
 	if (is_client_expired(clp))
 		return nfserr_expired;
 	atomic_inc(&clp->cl_refcount);
@@ -183,6 +187,10 @@ renew_client(struct nfs4_client *clp)
 
 static void put_client_renew_locked(struct nfs4_client *clp)
 {
+	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+
+	lockdep_assert_held(&nn->client_lock);
+
 	if (!atomic_dec_and_test(&clp->cl_refcount))
 		return;
 	if (!is_client_expired(clp))
@@ -216,6 +224,9 @@ static __be32 nfsd4_get_session_locked(struct nfsd4_session *ses)
 static void nfsd4_put_session_locked(struct nfsd4_session *ses)
 {
 	struct nfs4_client *clp = ses->se_client;
+	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+
+	lockdep_assert_held(&nn->client_lock);
 
 	if (atomic_dec_and_test(&ses->se_ref) && is_session_dead(ses))
 		free_session(ses);
@@ -1423,6 +1434,8 @@ __find_in_sessionid_hashtbl(struct nfs4_sessionid *sessionid, struct net *net)
 	int idx;
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
+	lockdep_assert_held(&nn->client_lock);
+
 	dump_sessionid(__func__, sessionid);
 	idx = hash_sessionid(sessionid);
 	/* Search in the appropriate list */
@@ -1459,6 +1472,11 @@ out:
 static void
 unhash_session(struct nfsd4_session *ses)
 {
+	struct nfs4_client *clp = ses->se_client;
+	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+
+	lockdep_assert_held(&nn->client_lock);
+
 	list_del(&ses->se_hash);
 	spin_lock(&ses->se_client->cl_lock);
 	list_del(&ses->se_perclnt);
@@ -1547,6 +1565,8 @@ unhash_client_locked(struct nfs4_client *clp)
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 	struct nfsd4_session *ses;
 
+	lockdep_assert_held(&nn->client_lock);
+
 	/* Mark the client as expired! */
 	clp->cl_time = 0;
 	/* Make it invisible */
@@ -1878,6 +1898,8 @@ add_to_unconfirmed(struct nfs4_client *clp)
 	unsigned int idhashval;
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 
+	lockdep_assert_held(&nn->client_lock);
+
 	clear_bit(NFSD4_CLIENT_CONFIRMED, &clp->cl_flags);
 	add_clp_to_name_tree(clp, &nn->unconf_name_tree);
 	idhashval = clientid_hashval(clp->cl_clientid.cl_id);
@@ -1891,6 +1913,8 @@ move_to_confirmed(struct nfs4_client *clp)
 	unsigned int idhashval = clientid_hashval(clp->cl_clientid.cl_id);
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 
+	lockdep_assert_held(&nn->client_lock);
+
 	dprintk("NFSD: move_to_confirm nfs4_client %p\n", clp);
 	list_move(&clp->cl_idhash, &nn->conf_id_hashtbl[idhashval]);
 	rb_erase(&clp->cl_namenode, &nn->unconf_name_tree);
@@ -1921,6 +1945,7 @@ find_confirmed_client(clientid_t *clid, bool sessions, struct nfsd_net *nn)
 {
 	struct list_head *tbl = nn->conf_id_hashtbl;
 
+	lockdep_assert_held(&nn->client_lock);
 	return find_client_in_id_table(tbl, clid, sessions);
 }
 
@@ -1929,6 +1954,7 @@ find_unconfirmed_client(clientid_t *clid, bool sessions, struct nfsd_net *nn)
 {
 	struct list_head *tbl = nn->unconf_id_hashtbl;
 
+	lockdep_assert_held(&nn->client_lock);
 	return find_client_in_id_table(tbl, clid, sessions);
 }
 
@@ -1940,12 +1966,14 @@ static bool clp_used_exchangeid(struct nfs4_client *clp)
 static struct nfs4_client *
 find_confirmed_client_by_name(struct xdr_netobj *name, struct nfsd_net *nn)
 {
+	lockdep_assert_held(&nn->client_lock);
 	return find_clp_in_name_tree(name, &nn->conf_name_tree);
 }
 
 static struct nfs4_client *
 find_unconfirmed_client_by_name(struct xdr_netobj *name, struct nfsd_net *nn)
 {
+	lockdep_assert_held(&nn->client_lock);
 	return find_clp_in_name_tree(name, &nn->unconf_name_tree);
 }
 
@@ -4899,6 +4927,8 @@ find_lockowner_str_locked(clientid_t *clid, struct xdr_netobj *owner,
 	unsigned int strhashval = ownerstr_hashval(owner);
 	struct nfs4_stateowner *so;
 
+	lockdep_assert_held(&clp->cl_lock);
+
 	list_for_each_entry(so, &clp->cl_ownerstr_hashtbl[strhashval],
 			    so_strhash) {
 		if (so->so_is_open_owner)
-- 
1.9.3



* [PATCH v4 072/100] nfsd: protect the close_lru list and oo_last_closed_stid with client_lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (70 preceding siblings ...)
  2014-07-08 18:03 ` [PATCH v4 071/100] nfsd: Add lockdep assertions to document the nfs4_client/session locking Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 073/100] nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock Jeff Layton
                   ` (27 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Currently, both are protected by the client_mutex. Change that so that the
close_lru list and the related fields in the openowner are protected by
the client_lock instead.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
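
Not part of the patch: a userspace sketch of the loop shape that the
laundromat hunk below ends up with. The list is walked from the head
under the lock, each expired entry is detached while locked, and the lock
is dropped around the release call and retaken before the list is looked
at again. It assumes a pthread mutex in place of nn->client_lock and a
time-ordered singly linked list in place of close_lru; struct entry,
release_entry() and reap_older_than() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical LRU entry; the list is kept oldest-first. */
struct entry {
	long time;
	struct entry *next;
	int released;
};

static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;
static struct entry *lru_head;	/* protected by lru_lock */

/* Stand-in for a release that must not run under lru_lock. */
static void release_entry(struct entry *e)
{
	e->released = 1;
}

/* Release every entry whose time is <= cutoff; returns how many. */
static int reap_older_than(long cutoff)
{
	int reaped = 0;

	pthread_mutex_lock(&lru_lock);
	while (lru_head != NULL) {
		struct entry *e = lru_head;

		if (e->time > cutoff)
			break;		/* ordered list: the rest are newer */

		/* Detach while locked so nobody else can find it. */
		lru_head = e->next;
		e->next = NULL;

		pthread_mutex_unlock(&lru_lock);
		release_entry(e);
		pthread_mutex_lock(&lru_lock);
		reaped++;
	}
	pthread_mutex_unlock(&lru_lock);
	return reaped;
}
```

Re-reading the head on every iteration, rather than carrying a "next"
pointer across the unlocked region, is what keeps the loop safe if the
list changes while the lock is dropped.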
 fs/nfsd/nfs4state.c | 37 ++++++++++++++++++++++++++++++-------
 1 file changed, 30 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index da2410f392b4..3ef8d9a577ae 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1090,13 +1090,19 @@ static void unhash_openowner_locked(struct nfs4_openowner *oo)
 
 static void release_last_closed_stateid(struct nfs4_openowner *oo)
 {
-	struct nfs4_ol_stateid *s = oo->oo_last_closed_stid;
+	struct nfsd_net *nn = net_generic(oo->oo_owner.so_client->net,
+					  nfsd_net_id);
+	struct nfs4_ol_stateid *s;
 
+	spin_lock(&nn->client_lock);
+	s = oo->oo_last_closed_stid;
 	if (s) {
 		list_del_init(&oo->oo_close_lru);
 		oo->oo_last_closed_stid = NULL;
-		put_generic_stateid(s);
 	}
+	spin_unlock(&nn->client_lock);
+	if (s)
+		put_generic_stateid(s);
 }
 
 static void release_openowner(struct nfs4_openowner *oo)
@@ -3250,6 +3256,7 @@ static void init_open_stateid(struct nfs4_ol_stateid *stp, struct nfs4_file *fp,
 static void
 move_to_close_lru(struct nfs4_ol_stateid *s, struct net *net)
 {
+	struct nfs4_ol_stateid *last;
 	struct nfs4_openowner *oo = openowner(s->st_stateowner);
 	struct nfsd_net *nn = net_generic(s->st_stid.sc_client->net,
 						nfsd_net_id);
@@ -3273,10 +3280,14 @@ move_to_close_lru(struct nfs4_ol_stateid *s, struct net *net)
 		s->st_stid.sc_file = NULL;
 	}
 
-	release_last_closed_stateid(oo);
+	spin_lock(&nn->client_lock);
+	last = oo->oo_last_closed_stid;
 	oo->oo_last_closed_stid = s;
 	list_move_tail(&oo->oo_close_lru, &nn->close_lru);
 	oo->oo_time = get_seconds();
+	spin_unlock(&nn->client_lock);
+	if (last)
+		put_generic_stateid(last);
 }
 
 static int
@@ -4147,6 +4158,7 @@ nfs4_laundromat(struct nfsd_net *nn)
 	struct nfs4_client *clp;
 	struct nfs4_openowner *oo;
 	struct nfs4_delegation *dp;
+	struct nfs4_ol_stateid *stp;
 	struct list_head *pos, *next, reaplist;
 	time_t cutoff = get_seconds() - nn->nfsd4_lease;
 	time_t t, new_timeo = nn->nfsd4_lease;
@@ -4198,15 +4210,26 @@ nfs4_laundromat(struct nfsd_net *nn)
 		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
 		revoke_delegation(dp);
 	}
-	list_for_each_safe(pos, next, &nn->close_lru) {
-		oo = container_of(pos, struct nfs4_openowner, oo_close_lru);
-		if (time_after((unsigned long)oo->oo_time, (unsigned long)cutoff)) {
+
+	spin_lock(&nn->client_lock);
+	while (!list_empty(&nn->close_lru)) {
+		oo = list_first_entry(&nn->close_lru, struct nfs4_openowner,
+					oo_close_lru);
+		if (time_after((unsigned long)oo->oo_time,
+			       (unsigned long)cutoff)) {
 			t = oo->oo_time - cutoff;
 			new_timeo = min(new_timeo, t);
 			break;
 		}
-		release_last_closed_stateid(oo);
+		list_del_init(&oo->oo_close_lru);
+		stp = oo->oo_last_closed_stid;
+		oo->oo_last_closed_stid = NULL;
+		spin_unlock(&nn->client_lock);
+		put_generic_stateid(stp);
+		spin_lock(&nn->client_lock);
 	}
+	spin_unlock(&nn->client_lock);
+
 	new_timeo = max_t(time_t, new_timeo, NFSD_LAUNDROMAT_MINTIMEOUT);
 	nfs4_unlock_state();
 	return new_timeo;
-- 
1.9.3



* [PATCH v4 073/100] nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (71 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 072/100] nfsd: protect the close_lru list and oo_last_closed_stid with client_lock Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 074/100] nfsd: move unhash_client_locked call into mark_client_expired_locked Jeff Layton
                   ` (26 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Currently, both destroy_revoked_delegation and revoke_delegation
manipulate the cl_revoked list without any locking aside from the
client_mutex. Ensure that the clp->cl_lock is held when manipulating it,
except for the list walking in destroy_client. At that point, the client
should no longer be in use, and so it should be safe to walk the list
without any locking. That also means that we don't need to do the
list_splice_init there either.

Also, the fact that destroy_revoked_delegation and revoke_delegation
delete dl_recall_lru without any locking makes it difficult to know
whether they're doing so safely in all cases. Move the list_del_init
calls into the callers, and add WARN_ONs in the event that these calls
are passed a delegation that has a non-empty list_head.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
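
Not part of the patch: a userspace sketch of the convention described in
the second paragraph above, where the caller unlinks the entry under the
appropriate lock and the destroy helper only warns if it is handed
something that is still linked. It assumes a pthread mutex in place of
clp->cl_lock, a minimal hand-rolled circular list in place of the
kernel's list_head helpers, and a counter in place of WARN_ON(); every
name in it is hypothetical. Note that the teardown loop takes its entry
from the same list head that it tests for emptiness.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

static int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_head(struct list_head *n, struct list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

/* Unlink and reinitialise, so list_is_empty(n) is true afterwards. */
static void list_del_reinit(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-in for a revoked delegation. */
struct deleg {
	struct list_head lru;
	int destroyed;
};

static pthread_mutex_t cl_lock = PTHREAD_MUTEX_INITIALIZER;
static struct list_head revoked = { &revoked, &revoked };
static int warnings;		/* stand-in for WARN_ON() firing */

static void add_revoked(struct deleg *dp)
{
	pthread_mutex_lock(&cl_lock);
	list_add_head(&dp->lru, &revoked);
	pthread_mutex_unlock(&cl_lock);
}

/* Does no list manipulation itself; complains if still linked. */
static void destroy_revoked(struct deleg *dp)
{
	if (!list_is_empty(&dp->lru))
		warnings++;
	dp->destroyed = 1;
}

/* Normal path: the caller unlinks under the lock, then destroys. */
static void free_revoked(struct deleg *dp)
{
	pthread_mutex_lock(&cl_lock);
	list_del_reinit(&dp->lru);
	pthread_mutex_unlock(&cl_lock);
	destroy_revoked(dp);
}

/* Teardown path: no other users remain, so no lock is taken. */
static int drain_revoked(void)
{
	int n = 0;

	while (!list_is_empty(&revoked)) {
		struct deleg *dp = (struct deleg *)
			((char *)revoked.next - offsetof(struct deleg, lru));

		list_del_reinit(&dp->lru);
		destroy_revoked(dp);
		n++;
	}
	return n;
}
```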
 fs/nfsd/nfs4state.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 3ef8d9a577ae..06b6e3387ec7 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -743,7 +743,7 @@ static void unhash_and_destroy_delegation(struct nfs4_delegation *dp)
 
 static void destroy_revoked_delegation(struct nfs4_delegation *dp)
 {
-	list_del_init(&dp->dl_recall_lru);
+	WARN_ON(!list_empty(&dp->dl_recall_lru));
 	nfs4_put_delegation(dp);
 }
 
@@ -751,11 +751,15 @@ static void revoke_delegation(struct nfs4_delegation *dp)
 {
 	struct nfs4_client *clp = dp->dl_stid.sc_client;
 
+	WARN_ON(!list_empty(&dp->dl_recall_lru));
+
 	if (clp->cl_minorversion == 0)
 		destroy_revoked_delegation(dp);
 	else {
 		dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
-		list_move(&dp->dl_recall_lru, &clp->cl_revoked);
+		spin_lock(&clp->cl_lock);
+		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
+		spin_unlock(&clp->cl_lock);
 	}
 }
 
@@ -1622,9 +1626,9 @@ __destroy_client(struct nfs4_client *clp)
 		list_del_init(&dp->dl_recall_lru);
 		nfs4_put_delegation(dp);
 	}
-	list_splice_init(&clp->cl_revoked, &reaplist);
-	while (!list_empty(&reaplist)) {
+	while (!list_empty(&clp->cl_revoked)) {
-		dp = list_entry(reaplist.next, struct nfs4_delegation, dl_recall_lru);
+		dp = list_entry(clp->cl_revoked.next, struct nfs4_delegation, dl_recall_lru);
+		list_del_init(&dp->dl_recall_lru);
 		destroy_revoked_delegation(dp);
 	}
 	while (!list_empty(&clp->cl_openowners)) {
@@ -4206,8 +4210,10 @@ nfs4_laundromat(struct nfsd_net *nn)
 		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
-	list_for_each_safe(pos, next, &reaplist) {
-		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
+	while (!list_empty(&reaplist)) {
+		dp = list_first_entry(&reaplist, struct nfs4_delegation,
+					dl_recall_lru);
+		list_del_init(&dp->dl_recall_lru);
 		revoke_delegation(dp);
 	}
 
@@ -4570,8 +4576,9 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		ret = nfs_ok;
 		goto out;
 	case NFS4_REVOKED_DELEG_STID:
-		spin_unlock(&cl->cl_lock);
 		dp = delegstateid(s);
+		list_del_init(&dp->dl_recall_lru);
+		spin_unlock(&cl->cl_lock);
 		destroy_revoked_delegation(dp);
 		ret = nfs_ok;
 		goto out;
-- 
1.9.3



* [PATCH v4 074/100] nfsd: move unhash_client_locked call into mark_client_expired_locked
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (72 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 073/100] nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 075/100] nfsd: don't destroy client if mark_client_expired_locked fails Jeff Layton
                   ` (25 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

All callers of mark_client_expired_locked except the fault injection
code call unhash_client_locked immediately afterward, and in the fault
injection case it does no harm to unhash the client as well.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 06b6e3387ec7..d47359b8a84e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -70,6 +70,7 @@ static u64 current_sessionid = 1;
 #define CURRENT_STATEID(stateid) (!memcmp((stateid), &currentstateid, sizeof(stateid_t)))
 
 /* forward declarations */
+static void unhash_client_locked(struct nfs4_client *clp);
 static int check_for_locks(struct nfs4_file *filp, struct nfs4_lockowner *lowner);
 static void nfs4_free_generic_stateid(struct nfs4_stid *stid);
 static struct nfs4_openowner *find_openstateowner_str_locked(
@@ -137,7 +138,7 @@ static __be32 mark_client_expired_locked(struct nfs4_client *clp)
 {
 	if (atomic_read(&clp->cl_refcount))
 		return nfserr_jukebox;
-	clp->cl_time = 0;
+	unhash_client_locked(clp);
 	return nfs_ok;
 }
 
@@ -2474,7 +2475,6 @@ nfsd4_create_session(struct svc_rqst *rqstp,
 			status = mark_client_expired_locked(old);
 			if (status)
 				goto out_free_conn;
-			unhash_client_locked(old);
 		}
 		move_to_confirmed(unconf);
 		conf = unconf;
@@ -3020,7 +3020,6 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
 			status = mark_client_expired_locked(old);
 			if (status)
 				goto out;
-			unhash_client_locked(old);
 		}
 		move_to_confirmed(unconf);
 		conf = unconf;
@@ -4185,7 +4184,6 @@ nfs4_laundromat(struct nfsd_net *nn)
 				clp->cl_clientid.cl_id);
 			continue;
 		}
-		unhash_client_locked(clp);
 		list_add(&clp->cl_lru, &reaplist);
 	}
 	spin_unlock(&nn->client_lock);
-- 
1.9.3



* [PATCH v4 075/100] nfsd: don't destroy client if mark_client_expired_locked fails
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (73 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 074/100] nfsd: move unhash_client_locked call into mark_client_expired_locked Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 076/100] nfsd: don't destroy clients that are busy Jeff Layton
                   ` (24 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

If mark_client_expired_locked fails, the client is still in use and
must not be destroyed. Currently the client_mutex prevents that
situation from arising, but once it is removed we can no longer rely
on that.
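
As a rough userspace illustration of the pattern (not the nfsd code
itself): the sketch below assumes a simplified client structure, and
`struct client`, `mark_expired()`, `cleanup()` and `replace_client()`
are hypothetical stand-ins. It shows why the pointer is cleared on
failure: the shared cleanup path destroys whatever "old" still points
at, so a busy client has to be hidden from it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nfs4_client. */
struct client {
	int refcount;	/* references held by in-progress calls */
	int hashed;	/* still reachable through the lookup tables */
	int destroyed;	/* set once the cleanup path has torn it down */
};

/* Fails (think nfserr_jukebox) while the client is in use. */
static int mark_expired(struct client *clp)
{
	if (clp->refcount)
		return -1;
	clp->hashed = 0;
	return 0;
}

/* Shared cleanup path: destroys whatever "old" still points at. */
static void cleanup(struct client *old)
{
	if (old) {
		/* only an unhashed client may be torn down */
		assert(!old->hashed);
		old->destroyed = 1;
	}
}

static int replace_client(struct client *old)
{
	int status = 0;

	if (old) {
		status = mark_expired(old);
		if (status)
			old = NULL;	/* busy: keep cleanup() away from it */
	}
	cleanup(old);
	return status;
}
```

With a busy client, replace_client() reports the failure and leaves the
client hashed and intact; with an idle one, it is unhashed and then
destroyed.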

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d47359b8a84e..d11906845398 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2473,8 +2473,10 @@ nfsd4_create_session(struct svc_rqst *rqstp,
 		old = find_confirmed_client_by_name(&unconf->cl_name, nn);
 		if (old) {
 			status = mark_client_expired_locked(old);
-			if (status)
+			if (status) {
+				old = NULL;
 				goto out_free_conn;
+			}
 		}
 		move_to_confirmed(unconf);
 		conf = unconf;
@@ -3018,8 +3020,10 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
 		old = find_confirmed_client_by_name(&unconf->cl_name, nn);
 		if (old) {
 			status = mark_client_expired_locked(old);
-			if (status)
+			if (status) {
+				old = NULL;
 				goto out;
+			}
 		}
 		move_to_confirmed(unconf);
 		conf = unconf;
-- 
1.9.3



* [PATCH v4 076/100] nfsd: don't destroy clients that are busy
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (74 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 075/100] nfsd: don't destroy client if mark_client_expired_locked fails Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 077/100] nfsd: protect clid and verifier generation with client_lock Jeff Layton
                   ` (23 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

It's possible that we'll have an in-progress call on some of the clients
while a rogue EXCHANGE_ID or DESTROY_CLIENTID call comes in. Be sure to
try to mark the client expired first, so that the refcount is
respected.

This will only be a problem once the client_mutex is removed.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d11906845398..4aa8f51f7dda 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2240,8 +2240,11 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,
 
 	/* case 1 (normal case) */
 out_new:
-	if (conf)
-		unhash_client_locked(conf);
+	if (conf) {
+		status = mark_client_expired_locked(conf);
+		if (status)
+			goto out;
+	}
 	new->cl_minorversion = cstate->minorversion;
 	new->cl_mach_cred = (exid->spa_how == SP4_MACH_CRED);
 
@@ -2854,6 +2857,9 @@ nfsd4_destroy_clientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *csta
 			status = nfserr_clientid_busy;
 			goto out;
 		}
+		status = mark_client_expired_locked(conf);
+		if (status)
+			goto out;
 		clp = conf;
 	} else if (unconf)
 		clp = unconf;
-- 
1.9.3



* [PATCH v4 077/100] nfsd: protect clid and verifier generation with client_lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (75 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 076/100] nfsd: don't destroy clients that are busy Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 078/100] nfsd: abstract out the get and set routines into the fault injection ops Jeff Layton
                   ` (22 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

The clid counter is a global counter currently. Move it to be a per-net
property so that it can be properly protected by the nn->client_lock
instead of relying on the client_mutex.

The verifier generator is also potentially racy if there are two
simultaneous callers. Generate the verifier when we generate the clid
value, so it's also created under the client_lock. With this, there's
no need to keep two counters as they'd always be in sync anyway, so
just use the clientid_counter for both.

As Trond points out, what would be best is to eventually move this
code to use IDR instead of the hash tables. That would also help ensure
uniqueness, but that's probably best done as a separate project.
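
For readers who want the idea in isolation, here is a minimal userspace
sketch, assuming a toy spinlock built on C11 atomic_flag in place of
nn->client_lock. The names `struct ns_state`, `struct client_ids`,
`gen_ids()` and `gen_ids_locked()` are hypothetical; only the shape
(one per-namespace counter feeding both the clientid and the verifier
inside a single critical section) follows the description above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-namespace state; "lock" plays the role of client_lock. */
struct ns_state {
	atomic_flag lock;
	uint32_t boot_time;
	uint32_t clientid_counter;
};

struct client_ids {
	uint32_t cl_boot;
	uint32_t cl_id;
	uint32_t verf[2];
};

static void ns_lock(struct ns_state *ns)
{
	while (atomic_flag_test_and_set_explicit(&ns->lock,
						 memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static void ns_unlock(struct ns_state *ns)
{
	atomic_flag_clear_explicit(&ns->lock, memory_order_release);
}

/* Caller must hold the lock; one counter feeds both values. */
static void gen_ids_locked(struct ns_state *ns, struct client_ids *ids,
			   uint32_t now)
{
	assert(ns != NULL && ids != NULL);
	ids->cl_boot = ns->boot_time;
	ids->cl_id = ns->clientid_counter++;
	ids->verf[0] = now;
	ids->verf[1] = ns->clientid_counter;
}

static void gen_ids(struct ns_state *ns, struct client_ids *ids, uint32_t now)
{
	ns_lock(ns);
	gen_ids_locked(ns, ids, now);
	ns_unlock(ns);
}
```

Because the increment and both reads happen under the same lock, two
simultaneous callers cannot end up with the same id or the same
verifier word.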

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/netns.h     |  6 +++---
 fs/nfsd/nfs4state.c | 21 +++++++++------------
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index e1f479c162b5..3831ef6e5c75 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -92,9 +92,7 @@ struct nfsd_net {
 	bool nfsd_net_up;
 	bool lockd_up;
 
-	/*
-	 * Time of server startup
-	 */
+	/* Time of server startup */
 	struct timeval nfssvc_boot;
 
 	/*
@@ -103,6 +101,8 @@ struct nfsd_net {
 	 */
 	unsigned int max_connections;
 
+	u32 clientid_counter;
+
 	struct svc_serv *nfsd_serv;
 };
 
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4aa8f51f7dda..fc2abf9428af 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1787,28 +1787,26 @@ static bool mach_creds_match(struct nfs4_client *cl, struct svc_rqst *rqstp)
 	return 0 == strcmp(cl->cl_cred.cr_principal, cr->cr_principal);
 }
 
-static void gen_clid(struct nfs4_client *clp, struct nfsd_net *nn)
-{
-	static u32 current_clientid = 1;
-
-	clp->cl_clientid.cl_boot = nn->boot_time;
-	clp->cl_clientid.cl_id = current_clientid++; 
-}
-
-static void gen_confirm(struct nfs4_client *clp)
+static void gen_confirm(struct nfs4_client *clp, struct nfsd_net *nn)
 {
 	__be32 verf[2];
-	static u32 i;
 
 	/*
 	 * This is opaque to client, so no need to byte-swap. Use
 	 * __force to keep sparse happy
 	 */
 	verf[0] = (__force __be32)get_seconds();
-	verf[1] = (__force __be32)i++;
+	verf[1] = (__force __be32)nn->clientid_counter;
 	memcpy(clp->cl_confirm.data, verf, sizeof(clp->cl_confirm.data));
 }
 
+static void gen_clid(struct nfs4_client *clp, struct nfsd_net *nn)
+{
+	clp->cl_clientid.cl_boot = nn->boot_time;
+	clp->cl_clientid.cl_id = nn->clientid_counter++;
+	gen_confirm(clp, nn);
+}
+
 static struct nfs4_stid *
 find_stateid_locked(struct nfs4_client *cl, stateid_t *t)
 {
@@ -1857,7 +1855,6 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 	clear_bit(0, &clp->cl_cb_slot_busy);
 	copy_verf(clp, verf);
 	rpc_copy_addr((struct sockaddr *) &clp->cl_addr, sa);
-	gen_confirm(clp);
 	clp->cl_cb_session = NULL;
 	clp->net = net;
 	return clp;
-- 
1.9.3



* [PATCH v4 078/100] nfsd: abstract out the get and set routines into the fault injection ops
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (76 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 077/100] nfsd: protect clid and verifier generation with client_lock Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 079/100] nfsd: add a forget_clients "get" routine with proper locking Jeff Layton
                   ` (21 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Now that we've added more granular locking in other places, it's time
to address the fault injection code. This code is currently quite
reliant on the client_mutex for protection. Start to change this by
adding a new set of fault injection op vectors.

For now, every op points at the legacy routines. Later patches add new
routines that can cope with more granular locking.

Also, move some of the printk routines into the callers to make the
results of the operations more uniform.
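
The dispatch idea can be sketched in plain userspace C as follows. This
is only an illustration under stated assumptions: `struct inject_op`
here is a cut-down, hypothetical version of nfsd_fault_inject_op, the
`state` field and the `generic_*` handlers are invented for the demo,
and no locking is modelled. The point is that the read and write paths
only look up the op and call through its vectors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical op vector: each file carries its own get/set handlers. */
struct inject_op {
	const char *file;
	uint64_t (*get)(struct inject_op *);
	uint64_t (*set_val)(struct inject_op *, uint64_t);
	uint64_t state;	/* toy per-op object count, demo only */
};

static uint64_t generic_get(struct inject_op *op)
{
	return op->state;
}

/* Forget up to "max" objects (0 means all); return how many were dropped. */
static uint64_t generic_set(struct inject_op *op, uint64_t max)
{
	uint64_t n = (max == 0 || max > op->state) ? op->state : max;

	op->state -= n;
	return n;
}

static struct inject_op inject_ops[] = {
	{
		.file    = "forget_clients",
		.get     = generic_get,
		.set_val = generic_set,
		.state   = 3,
	},
	{
		.file    = "forget_locks",
		.get     = generic_get,
		.set_val = generic_set,
		.state   = 5,
	},
};

#define NUM_INJECT_OPS (sizeof(inject_ops) / sizeof(inject_ops[0]))

static struct inject_op *find_op(const char *file)
{
	for (size_t i = 0; i < NUM_INJECT_OPS; i++)
		if (strcmp(inject_ops[i].file, file) == 0)
			return &inject_ops[i];
	return NULL;
}

/* The read/write paths only dispatch through the op vectors. */
static uint64_t inject_read(const char *file)
{
	struct inject_op *op = find_op(file);

	assert(op != NULL);
	return op->get(op);
}

static uint64_t inject_write(const char *file, uint64_t val)
{
	struct inject_op *op = find_op(file);

	assert(op != NULL);
	return op->set_val(op, val);
}
```

Swapping one entry's handlers for versions with finer-grained locking
then needs no change to the read and write paths.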

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/fault_inject.c | 129 ++++++++++++++++++++++++++++++-------------------
 1 file changed, 78 insertions(+), 51 deletions(-)

diff --git a/fs/nfsd/fault_inject.c b/fs/nfsd/fault_inject.c
index f1333fc35b33..b1159900d934 100644
--- a/fs/nfsd/fault_inject.c
+++ b/fs/nfsd/fault_inject.c
@@ -17,79 +17,50 @@
 
 struct nfsd_fault_inject_op {
 	char *file;
+	u64 (*get)(struct nfsd_fault_inject_op *);
+	u64 (*set_val)(struct nfsd_fault_inject_op *, u64);
+	u64 (*set_clnt)(struct nfsd_fault_inject_op *,
+			struct sockaddr_storage *, size_t);
 	u64 (*forget)(struct nfs4_client *, u64);
 	u64 (*print)(struct nfs4_client *, u64);
 };
 
-static struct nfsd_fault_inject_op inject_ops[] = {
-	{
-		.file   = "forget_clients",
-		.forget = nfsd_forget_client,
-		.print  = nfsd_print_client,
-	},
-	{
-		.file   = "forget_locks",
-		.forget = nfsd_forget_client_locks,
-		.print  = nfsd_print_client_locks,
-	},
-	{
-		.file   = "forget_openowners",
-		.forget = nfsd_forget_client_openowners,
-		.print  = nfsd_print_client_openowners,
-	},
-	{
-		.file   = "forget_delegations",
-		.forget = nfsd_forget_client_delegations,
-		.print  = nfsd_print_client_delegations,
-	},
-	{
-		.file   = "recall_delegations",
-		.forget = nfsd_recall_client_delegations,
-		.print  = nfsd_print_client_delegations,
-	},
-};
-
-static long int NUM_INJECT_OPS = sizeof(inject_ops) / sizeof(struct nfsd_fault_inject_op);
 static struct dentry *debug_dir;
 
-static void nfsd_inject_set(struct nfsd_fault_inject_op *op, u64 val)
+static u64 nfsd_inject_set(struct nfsd_fault_inject_op *op, u64 val)
 {
-	u64 count = 0;
-
-	if (val == 0)
-		printk(KERN_INFO "NFSD Fault Injection: %s (all)", op->file);
-	else
-		printk(KERN_INFO "NFSD Fault Injection: %s (n = %llu)", op->file, val);
+	u64 count;
 
 	nfs4_lock_state();
 	count = nfsd_for_n_state(val, op->forget);
 	nfs4_unlock_state();
-	printk(KERN_INFO "NFSD: %s: found %llu", op->file, count);
+	return count;
 }
 
-static void nfsd_inject_set_client(struct nfsd_fault_inject_op *op,
+static u64 nfsd_inject_set_client(struct nfsd_fault_inject_op *op,
 				   struct sockaddr_storage *addr,
 				   size_t addr_size)
 {
-	char buf[INET6_ADDRSTRLEN];
 	struct nfs4_client *clp;
-	u64 count;
+	u64 count = 0;
 
 	nfs4_lock_state();
 	clp = nfsd_find_client(addr, addr_size);
-	if (clp) {
+	if (clp)
 		count = op->forget(clp, 0);
-		rpc_ntop((struct sockaddr *)&clp->cl_addr, buf, sizeof(buf));
-		printk(KERN_INFO "NFSD [%s]: Client %s had %llu state object(s)\n", op->file, buf, count);
-	}
 	nfs4_unlock_state();
+	return count;
 }
 
-static void nfsd_inject_get(struct nfsd_fault_inject_op *op, u64 *val)
+static u64 nfsd_inject_get(struct nfsd_fault_inject_op *op)
 {
+	u64 count;
+
 	nfs4_lock_state();
-	*val = nfsd_for_n_state(0, op->print);
+	count = nfsd_for_n_state(0, op->print);
 	nfs4_unlock_state();
+
+	return count;
 }
 
 static ssize_t fault_inject_read(struct file *file, char __user *buf,
@@ -99,9 +70,10 @@ static ssize_t fault_inject_read(struct file *file, char __user *buf,
 	char read_buf[25];
 	size_t size;
 	loff_t pos = *ppos;
+	struct nfsd_fault_inject_op *op = file_inode(file)->i_private;
 
 	if (!pos)
-		nfsd_inject_get(file_inode(file)->i_private, &val);
+		val = op->get(op);
 	size = scnprintf(read_buf, sizeof(read_buf), "%llu\n", val);
 
 	return simple_read_from_buffer(buf, len, ppos, read_buf, size);
@@ -114,6 +86,7 @@ static ssize_t fault_inject_write(struct file *file, const char __user *buf,
 	size_t size = min(sizeof(write_buf) - 1, len);
 	struct net *net = current->nsproxy->net_ns;
 	struct sockaddr_storage sa;
+	struct nfsd_fault_inject_op *op = file_inode(file)->i_private;
 	u64 val;
 	char *nl;
 
@@ -129,11 +102,20 @@ static ssize_t fault_inject_write(struct file *file, const char __user *buf,
 	}
 
 	size = rpc_pton(net, write_buf, size, (struct sockaddr *)&sa, sizeof(sa));
-	if (size > 0)
-		nfsd_inject_set_client(file_inode(file)->i_private, &sa, size);
-	else {
+	if (size > 0) {
+		val = op->set_clnt(op, &sa, size);
+		if (val)
+			pr_info("NFSD [%s]: Client %s had %llu state object(s)\n",
+				op->file, write_buf, val);
+	} else {
 		val = simple_strtoll(write_buf, NULL, 0);
-		nfsd_inject_set(file_inode(file)->i_private, val);
+		if (val == 0)
+			pr_info("NFSD Fault Injection: %s (all)", op->file);
+		else
+			pr_info("NFSD Fault Injection: %s (n = %llu)",
+				op->file, val);
+		val = op->set_val(op, val);
+		pr_info("NFSD: %s: found %llu", op->file, val);
 	}
 	return len; /* on success, claim we got the whole input */
 }
@@ -149,6 +131,51 @@ void nfsd_fault_inject_cleanup(void)
 	debugfs_remove_recursive(debug_dir);
 }
 
+static struct nfsd_fault_inject_op inject_ops[] = {
+	{
+		.file     = "forget_clients",
+		.get	  = nfsd_inject_get,
+		.set_val  = nfsd_inject_set,
+		.set_clnt = nfsd_inject_set_client,
+		.forget   = nfsd_forget_client,
+		.print    = nfsd_print_client,
+	},
+	{
+		.file     = "forget_locks",
+		.get	  = nfsd_inject_get,
+		.set_val  = nfsd_inject_set,
+		.set_clnt = nfsd_inject_set_client,
+		.forget   = nfsd_forget_client_locks,
+		.print    = nfsd_print_client_locks,
+	},
+	{
+		.file     = "forget_openowners",
+		.get	  = nfsd_inject_get,
+		.set_val  = nfsd_inject_set,
+		.set_clnt = nfsd_inject_set_client,
+		.forget   = nfsd_forget_client_openowners,
+		.print    = nfsd_print_client_openowners,
+	},
+	{
+		.file     = "forget_delegations",
+		.get	  = nfsd_inject_get,
+		.set_val  = nfsd_inject_set,
+		.set_clnt = nfsd_inject_set_client,
+		.forget   = nfsd_forget_client_delegations,
+		.print    = nfsd_print_client_delegations,
+	},
+	{
+		.file     = "recall_delegations",
+		.get	  = nfsd_inject_get,
+		.set_val  = nfsd_inject_set,
+		.set_clnt = nfsd_inject_set_client,
+		.forget   = nfsd_recall_client_delegations,
+		.print    = nfsd_print_client_delegations,
+	},
+};
+
+#define NUM_INJECT_OPS (sizeof(inject_ops)/sizeof(struct nfsd_fault_inject_op))
+
 int nfsd_fault_inject_init(void)
 {
 	unsigned int i;
-- 
1.9.3



* [PATCH v4 079/100] nfsd: add a forget_clients "get" routine with proper locking
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (77 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 078/100] nfsd: abstract out the get and set routines into the fault injection ops Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 080/100] nfsd: add a forget_client set_clnt routine Jeff Layton
                   ` (20 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Add a new "get" routine for forget_clients that relies on the
client_lock instead of the client_mutex.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/fault_inject.c |  3 +--
 fs/nfsd/nfs4state.c    | 30 ++++++++++++++++++++++--------
 fs/nfsd/state.h        |  4 +++-
 3 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/fault_inject.c b/fs/nfsd/fault_inject.c
index b1159900d934..a0387fd47e14 100644
--- a/fs/nfsd/fault_inject.c
+++ b/fs/nfsd/fault_inject.c
@@ -134,11 +134,10 @@ void nfsd_fault_inject_cleanup(void)
 static struct nfsd_fault_inject_op inject_ops[] = {
 	{
 		.file     = "forget_clients",
-		.get	  = nfsd_inject_get,
+		.get	  = nfsd_inject_print_clients,
 		.set_val  = nfsd_inject_set,
 		.set_clnt = nfsd_inject_set_client,
 		.forget   = nfsd_forget_client,
-		.print    = nfsd_print_client,
 	},
 	{
 		.file     = "forget_locks",
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fc2abf9428af..e929e6f51788 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5702,6 +5702,28 @@ nfs4_check_open_reclaim(clientid_t *clid,
 }
 
 #ifdef CONFIG_NFSD_FAULT_INJECTION
+u64
+nfsd_inject_print_clients(struct nfsd_fault_inject_op *op)
+{
+	struct nfs4_client *clp;
+	u64 count = 0;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+					  nfsd_net_id);
+	char buf[INET6_ADDRSTRLEN];
+
+	if (!nfsd_netns_ready(nn))
+		return 0;
+
+	spin_lock(&nn->client_lock);
+	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
+		rpc_ntop((struct sockaddr *)&clp->cl_addr, buf, sizeof(buf));
+		pr_info("NFS Client: %s\n", buf);
+		++count;
+	}
+	spin_unlock(&nn->client_lock);
+
+	return count;
+}
 
 u64 nfsd_forget_client(struct nfs4_client *clp, u64 max)
 {
@@ -5717,14 +5739,6 @@ u64 nfsd_forget_client(struct nfs4_client *clp, u64 max)
 	return 1;
 }
 
-u64 nfsd_print_client(struct nfs4_client *clp, u64 num)
-{
-	char buf[INET6_ADDRSTRLEN];
-	rpc_ntop((struct sockaddr *)&clp->cl_addr, buf, sizeof(buf));
-	printk(KERN_INFO "NFS Client: %s\n", buf);
-	return 1;
-}
-
 static void nfsd_print_count(struct nfs4_client *clp, unsigned int count,
 			     const char *type)
 {
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 7e395f665b0f..e0bf42051c1f 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -466,18 +466,20 @@ extern void nfsd4_record_grace_done(struct nfsd_net *nn, time_t boot_time);
 
 /* nfs fault injection functions */
 #ifdef CONFIG_NFSD_FAULT_INJECTION
+struct nfsd_fault_inject_op;
+
 int nfsd_fault_inject_init(void);
 void nfsd_fault_inject_cleanup(void);
 u64 nfsd_for_n_state(u64, u64 (*)(struct nfs4_client *, u64));
 struct nfs4_client *nfsd_find_client(struct sockaddr_storage *, size_t);
 
+u64 nfsd_inject_print_clients(struct nfsd_fault_inject_op *op);
 u64 nfsd_forget_client(struct nfs4_client *, u64);
 u64 nfsd_forget_client_locks(struct nfs4_client*, u64);
 u64 nfsd_forget_client_openowners(struct nfs4_client *, u64);
 u64 nfsd_forget_client_delegations(struct nfs4_client *, u64);
 u64 nfsd_recall_client_delegations(struct nfs4_client *, u64);
 
-u64 nfsd_print_client(struct nfs4_client *, u64);
 u64 nfsd_print_client_locks(struct nfs4_client *, u64);
 u64 nfsd_print_client_openowners(struct nfs4_client *, u64);
 u64 nfsd_print_client_delegations(struct nfs4_client *, u64);
-- 
1.9.3



* [PATCH v4 080/100] nfsd: add a forget_client set_clnt routine
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (78 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 079/100] nfsd: add a forget_clients "get" routine with proper locking Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 081/100] nfsd: add nfsd_inject_forget_clients Jeff Layton
                   ` (19 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

...that relies on the client_lock instead of client_mutex.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/fault_inject.c |  2 +-
 fs/nfsd/nfs4state.c    | 28 ++++++++++++++++++++++++++++
 fs/nfsd/state.h        |  3 +++
 3 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/fault_inject.c b/fs/nfsd/fault_inject.c
index a0387fd47e14..5f3ead0c72fb 100644
--- a/fs/nfsd/fault_inject.c
+++ b/fs/nfsd/fault_inject.c
@@ -136,7 +136,7 @@ static struct nfsd_fault_inject_op inject_ops[] = {
 		.file     = "forget_clients",
 		.get	  = nfsd_inject_print_clients,
 		.set_val  = nfsd_inject_set,
-		.set_clnt = nfsd_inject_set_client,
+		.set_clnt = nfsd_inject_forget_client,
 		.forget   = nfsd_forget_client,
 	},
 	{
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e929e6f51788..62ac78b28efc 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5739,6 +5739,34 @@ u64 nfsd_forget_client(struct nfs4_client *clp, u64 max)
 	return 1;
 }
 
+u64
+nfsd_inject_forget_client(struct nfsd_fault_inject_op *op,
+			  struct sockaddr_storage *addr, size_t addr_size)
+{
+	u64 count = 0;
+	struct nfs4_client *clp;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+					  nfsd_net_id);
+
+	if (!nfsd_netns_ready(nn))
+		return count;
+
+	spin_lock(&nn->client_lock);
+	clp = nfsd_find_client(addr, addr_size);
+	if (clp) {
+		if (mark_client_expired_locked(clp) == nfs_ok)
+			++count;
+		else
+			clp = NULL;
+	}
+	spin_unlock(&nn->client_lock);
+
+	if (clp)
+		expire_client(clp);
+
+	return count;
+}
+
 static void nfsd_print_count(struct nfs4_client *clp, unsigned int count,
 			     const char *type)
 {
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index e0bf42051c1f..ead0fe9027fb 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -475,6 +475,9 @@ struct nfs4_client *nfsd_find_client(struct sockaddr_storage *, size_t);
 
 u64 nfsd_inject_print_clients(struct nfsd_fault_inject_op *op);
 u64 nfsd_forget_client(struct nfs4_client *, u64);
+u64 nfsd_inject_forget_client(struct nfsd_fault_inject_op *,
+			      struct sockaddr_storage *, size_t);
+
 u64 nfsd_forget_client_locks(struct nfs4_client*, u64);
 u64 nfsd_forget_client_openowners(struct nfs4_client *, u64);
 u64 nfsd_forget_client_delegations(struct nfs4_client *, u64);
-- 
1.9.3



* [PATCH v4 081/100] nfsd: add nfsd_inject_forget_clients
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (79 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 080/100] nfsd: add a forget_client set_clnt routine Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 082/100] nfsd: add a list_head arg to nfsd_foreach_client_lock Jeff Layton
                   ` (18 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

...which uses the client_lock for protection instead of client_mutex.
Also remove nfsd_forget_client as there are no more callers.
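
The two-phase "collect under the lock, destroy after dropping it" shape
can be sketched in userspace like this. It is an assumption-laden
illustration rather than the kernel code: the list helpers are a
hand-rolled miniature of list.h, `struct client` and `forget_clients()`
are hypothetical, the lock itself is only indicated in comments, and
this sketch counts every reaped client even when max is 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Miniature circular doubly-linked list, in the style of list.h. */
struct node { struct node *prev, *next; };

static void list_init(struct node *h) { h->prev = h->next = h; }
static int list_is_empty(const struct node *h) { return h->next == h; }

static void list_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static void list_add_head(struct node *n, struct node *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

/* Hypothetical stand-in for struct nfs4_client. */
struct client {
	struct node lru;
	int refcount;
	int expired;
};

#define client_of(n) \
	((struct client *)((char *)(n) - offsetof(struct client, lru)))

/* Reap up to "max" idle clients (0 means all); return how many. */
static uint64_t forget_clients(struct node *client_lru, uint64_t max)
{
	struct node reaplist, *pos, *next;
	uint64_t count = 0;

	list_init(&reaplist);

	/* Phase 1: in the kernel this loop would run under client_lock. */
	for (pos = client_lru->next; pos != client_lru; pos = next) {
		struct client *clp = client_of(pos);

		next = pos->next;	/* saved before pos is unlinked */
		if (clp->refcount)
			continue;	/* busy clients are left alone */
		list_del_init(&clp->lru);
		list_add_head(&clp->lru, &reaplist);
		if (++count >= max && max != 0)
			break;
	}

	/* Phase 2: lock dropped, so the expensive teardown may block. */
	while (!list_is_empty(&reaplist)) {
		struct client *clp = client_of(reaplist.next);

		assert(clp->refcount == 0);
		list_del_init(&clp->lru);
		clp->expired = 1;
	}
	return count;
}
```

Keeping the teardown outside the critical section is what allows a
spinlock to replace the mutex here.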

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/fault_inject.c |  3 +--
 fs/nfsd/nfs4state.c    | 42 ++++++++++++++++++++++++++++--------------
 fs/nfsd/state.h        |  2 +-
 3 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/fs/nfsd/fault_inject.c b/fs/nfsd/fault_inject.c
index 5f3ead0c72fb..76ecdff37ea2 100644
--- a/fs/nfsd/fault_inject.c
+++ b/fs/nfsd/fault_inject.c
@@ -135,9 +135,8 @@ static struct nfsd_fault_inject_op inject_ops[] = {
 	{
 		.file     = "forget_clients",
 		.get	  = nfsd_inject_print_clients,
-		.set_val  = nfsd_inject_set,
+		.set_val  = nfsd_inject_forget_clients,
 		.set_clnt = nfsd_inject_forget_client,
-		.forget   = nfsd_forget_client,
 	},
 	{
 		.file     = "forget_locks",
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 62ac78b28efc..92c12dbcd7df 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5725,20 +5725,6 @@ nfsd_inject_print_clients(struct nfsd_fault_inject_op *op)
 	return count;
 }
 
-u64 nfsd_forget_client(struct nfs4_client *clp, u64 max)
-{
-	__be32 ret;
-	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
-
-	spin_lock(&nn->client_lock);
-	ret = mark_client_expired_locked(clp);
-	spin_unlock(&nn->client_lock);
-	if (ret != nfs_ok)
-		return 0;
-	expire_client(clp);
-	return 1;
-}
-
 u64
 nfsd_inject_forget_client(struct nfsd_fault_inject_op *op,
 			  struct sockaddr_storage *addr, size_t addr_size)
@@ -5767,6 +5753,34 @@ nfsd_inject_forget_client(struct nfsd_fault_inject_op *op,
 	return count;
 }
 
+u64
+nfsd_inject_forget_clients(struct nfsd_fault_inject_op *op, u64 max)
+{
+	u64 count = 0;
+	struct nfs4_client *clp, *next;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
+	LIST_HEAD(reaplist);
+
+	if (!nfsd_netns_ready(nn))
+		return count;
+
+	spin_lock(&nn->client_lock);
+	list_for_each_entry_safe(clp, next, &nn->client_lru, cl_lru) {
+		if (mark_client_expired_locked(clp) == nfs_ok) {
+			list_add(&clp->cl_lru, &reaplist);
+			if (++count >= max && max != 0)
+				break;
+		}
+	}
+	spin_unlock(&nn->client_lock);
+
+	list_for_each_entry_safe(clp, next, &reaplist, cl_lru)
+		expire_client(clp);
+
+	return count;
+}
+
 static void nfsd_print_count(struct nfs4_client *clp, unsigned int count,
 			     const char *type)
 {
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index ead0fe9027fb..25b4df7bc57a 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -474,9 +474,9 @@ u64 nfsd_for_n_state(u64, u64 (*)(struct nfs4_client *, u64));
 struct nfs4_client *nfsd_find_client(struct sockaddr_storage *, size_t);
 
 u64 nfsd_inject_print_clients(struct nfsd_fault_inject_op *op);
-u64 nfsd_forget_client(struct nfs4_client *, u64);
 u64 nfsd_inject_forget_client(struct nfsd_fault_inject_op *,
 			      struct sockaddr_storage *, size_t);
+u64 nfsd_inject_forget_clients(struct nfsd_fault_inject_op *, u64);
 
 u64 nfsd_forget_client_locks(struct nfs4_client*, u64);
 u64 nfsd_forget_client_openowners(struct nfs4_client *, u64);
-- 
1.9.3



* [PATCH v4 082/100] nfsd: add a list_head arg to nfsd_foreach_client_lock
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (80 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 081/100] nfsd: add nfsd_inject_forget_clients Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 083/100] nfsd: add more granular locking to forget_locks fault injector Jeff Layton
                   ` (17 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

In a later patch, we'll want to collect the locks onto a list for later
destruction. If both "func" and "collect" are non-NULL, the lock stateid
is added to the list after "func" has been called on it.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 92c12dbcd7df..dbd1f422b3ae 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5790,6 +5790,7 @@ static void nfsd_print_count(struct nfs4_client *clp, unsigned int count,
 }
 
 static u64 nfsd_foreach_client_lock(struct nfs4_client *clp, u64 max,
+				    struct list_head *collect,
 				    void (*func)(struct nfs4_ol_stateid *))
 {
 	struct nfs4_openowner *oop;
@@ -5802,8 +5803,12 @@ static u64 nfsd_foreach_client_lock(struct nfs4_client *clp, u64 max,
 				&oop->oo_owner.so_stateids, st_perstateowner) {
 			list_for_each_entry_safe(lst, lst_next,
 					&stp->st_locks, st_locks) {
-				if (func)
+				if (func) {
 					func(lst);
+					if (collect)
+						list_add(&lst->st_locks,
+							 collect);
+				}
 				if (++count == max)
 					return count;
 			}
@@ -5815,12 +5820,12 @@ static u64 nfsd_foreach_client_lock(struct nfs4_client *clp, u64 max,
 
 u64 nfsd_forget_client_locks(struct nfs4_client *clp, u64 max)
 {
-	return nfsd_foreach_client_lock(clp, max, release_lock_stateid);
+	return nfsd_foreach_client_lock(clp, max, NULL, release_lock_stateid);
 }
 
 u64 nfsd_print_client_locks(struct nfs4_client *clp, u64 max)
 {
-	u64 count = nfsd_foreach_client_lock(clp, max, NULL);
+	u64 count = nfsd_foreach_client_lock(clp, max, NULL, NULL);
 	nfsd_print_count(clp, count, "locked files");
 	return count;
 }
-- 
1.9.3



* [PATCH v4 083/100] nfsd: add more granular locking to forget_locks fault injector
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (81 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 082/100] nfsd: add a list_head arg to nfsd_foreach_client_lock Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 084/100] nfsd: add more granular locking to forget_openowners " Jeff Layton
                   ` (16 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

...instead of relying on the client_mutex.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/fault_inject.c |   8 ++-
 fs/nfsd/nfs4state.c    | 132 +++++++++++++++++++++++++++++++++++++++++++++----
 fs/nfsd/state.h        |   7 ++-
 3 files changed, 131 insertions(+), 16 deletions(-)

diff --git a/fs/nfsd/fault_inject.c b/fs/nfsd/fault_inject.c
index 76ecdff37ea2..a444d821d2a5 100644
--- a/fs/nfsd/fault_inject.c
+++ b/fs/nfsd/fault_inject.c
@@ -140,11 +140,9 @@ static struct nfsd_fault_inject_op inject_ops[] = {
 	},
 	{
 		.file     = "forget_locks",
-		.get	  = nfsd_inject_get,
-		.set_val  = nfsd_inject_set,
-		.set_clnt = nfsd_inject_set_client,
-		.forget   = nfsd_forget_client_locks,
-		.print    = nfsd_print_client_locks,
+		.get	  = nfsd_inject_print_locks,
+		.set_val  = nfsd_inject_forget_locks,
+		.set_clnt = nfsd_inject_forget_client_locks,
 	},
 	{
 		.file     = "forget_openowners",
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index dbd1f422b3ae..5f781c843523 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5702,6 +5702,12 @@ nfs4_check_open_reclaim(clientid_t *clid,
 }
 
 #ifdef CONFIG_NFSD_FAULT_INJECTION
+static inline void
+put_client(struct nfs4_client *clp)
+{
+	atomic_dec(&clp->cl_refcount);
+}
+
 u64
 nfsd_inject_print_clients(struct nfsd_fault_inject_op *op)
 {
@@ -5789,6 +5795,22 @@ static void nfsd_print_count(struct nfs4_client *clp, unsigned int count,
 	printk(KERN_INFO "NFS Client: %s has %u %s\n", buf, count, type);
 }
 
+static void
+nfsd_inject_add_lock_to_list(struct nfs4_ol_stateid *lst,
+			     struct list_head *collect)
+{
+	struct nfs4_client *clp = lst->st_stid.sc_client;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+					  nfsd_net_id);
+
+	if (!collect)
+		return;
+
+	lockdep_assert_held(&nn->client_lock);
+	atomic_inc(&clp->cl_refcount);
+	list_add(&lst->st_locks, collect);
+}
+
 static u64 nfsd_foreach_client_lock(struct nfs4_client *clp, u64 max,
 				    struct list_head *collect,
 				    void (*func)(struct nfs4_ol_stateid *))
@@ -5798,6 +5820,7 @@ static u64 nfsd_foreach_client_lock(struct nfs4_client *clp, u64 max,
 	struct nfs4_ol_stateid *lst, *lst_next;
 	u64 count = 0;
 
+	spin_lock(&clp->cl_lock);
 	list_for_each_entry(oop, &clp->cl_openowners, oo_perclient) {
 		list_for_each_entry_safe(stp, st_next,
 				&oop->oo_owner.so_stateids, st_perstateowner) {
@@ -5805,31 +5828,122 @@ static u64 nfsd_foreach_client_lock(struct nfs4_client *clp, u64 max,
 					&stp->st_locks, st_locks) {
 				if (func) {
 					func(lst);
-					if (collect)
-						list_add(&lst->st_locks,
-							 collect);
+					nfsd_inject_add_lock_to_list(lst,
+								collect);
 				}
-				if (++count == max)
-					return count;
+				++count;
+				/*
+				 * Despite the fact that these functions deal
+				 * with 64-bit integers for "count", we must
+				 * ensure that it doesn't blow up the
+				 * clp->cl_refcount. Throw a warning if we
+				 * start to approach INT_MAX here.
+				 */
+				WARN_ON_ONCE(count == (INT_MAX / 2));
+				if (count == max)
+					goto out;
 			}
 		}
 	}
+out:
+	spin_unlock(&clp->cl_lock);
 
 	return count;
 }
 
-u64 nfsd_forget_client_locks(struct nfs4_client *clp, u64 max)
+static u64
+nfsd_collect_client_locks(struct nfs4_client *clp, struct list_head *collect,
+			  u64 max)
 {
-	return nfsd_foreach_client_lock(clp, max, NULL, release_lock_stateid);
+	return nfsd_foreach_client_lock(clp, max, collect, unhash_lock_stateid);
 }
 
-u64 nfsd_print_client_locks(struct nfs4_client *clp, u64 max)
+static u64
+nfsd_print_client_locks(struct nfs4_client *clp)
 {
-	u64 count = nfsd_foreach_client_lock(clp, max, NULL, NULL);
+	u64 count = nfsd_foreach_client_lock(clp, 0, NULL, NULL);
 	nfsd_print_count(clp, count, "locked files");
 	return count;
 }
 
+u64
+nfsd_inject_print_locks(struct nfsd_fault_inject_op *op)
+{
+	struct nfs4_client *clp;
+	u64 count = 0;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
+
+	if (!nfsd_netns_ready(nn))
+		return 0;
+
+	spin_lock(&nn->client_lock);
+	list_for_each_entry(clp, &nn->client_lru, cl_lru)
+		count += nfsd_print_client_locks(clp);
+	spin_unlock(&nn->client_lock);
+
+	return count;
+}
+
+static void
+nfsd_reap_locks(struct list_head *reaplist)
+{
+	struct nfs4_client *clp;
+	struct nfs4_ol_stateid *stp, *next;
+
+	list_for_each_entry_safe(stp, next, reaplist, st_locks) {
+		list_del_init(&stp->st_locks);
+		clp = stp->st_stid.sc_client;
+		put_generic_stateid(stp);
+		put_client(clp);
+	}
+}
+
+u64
+nfsd_inject_forget_client_locks(struct nfsd_fault_inject_op *op,
+				struct sockaddr_storage *addr, size_t addr_size)
+{
+	unsigned int count = 0;
+	struct nfs4_client *clp;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
+	LIST_HEAD(reaplist);
+
+	if (!nfsd_netns_ready(nn))
+		return count;
+
+	spin_lock(&nn->client_lock);
+	clp = nfsd_find_client(addr, addr_size);
+	if (clp)
+		count = nfsd_collect_client_locks(clp, &reaplist, 0);
+	spin_unlock(&nn->client_lock);
+	nfsd_reap_locks(&reaplist);
+	return count;
+}
+
+u64
+nfsd_inject_forget_locks(struct nfsd_fault_inject_op *op, u64 max)
+{
+	u64 count = 0;
+	struct nfs4_client *clp;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
+	LIST_HEAD(reaplist);
+
+	if (!nfsd_netns_ready(nn))
+		return count;
+
+	spin_lock(&nn->client_lock);
+	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
+		count += nfsd_collect_client_locks(clp, &reaplist, max - count);
+		if (max != 0 && count >= max)
+			break;
+	}
+	spin_unlock(&nn->client_lock);
+	nfsd_reap_locks(&reaplist);
+	return count;
+}
+
 static u64 nfsd_foreach_client_open(struct nfs4_client *clp, u64 max, void (*func)(struct nfs4_openowner *))
 {
 	struct nfs4_openowner *oop, *next;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 25b4df7bc57a..5015ad3c9282 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -478,12 +478,15 @@ u64 nfsd_inject_forget_client(struct nfsd_fault_inject_op *,
 			      struct sockaddr_storage *, size_t);
 u64 nfsd_inject_forget_clients(struct nfsd_fault_inject_op *, u64);
 
-u64 nfsd_forget_client_locks(struct nfs4_client*, u64);
+u64 nfsd_inject_print_locks(struct nfsd_fault_inject_op *);
+u64 nfsd_inject_forget_client_locks(struct nfsd_fault_inject_op *,
+				    struct sockaddr_storage *, size_t);
+u64 nfsd_inject_forget_locks(struct nfsd_fault_inject_op *, u64);
+
 u64 nfsd_forget_client_openowners(struct nfs4_client *, u64);
 u64 nfsd_forget_client_delegations(struct nfs4_client *, u64);
 u64 nfsd_recall_client_delegations(struct nfs4_client *, u64);
 
-u64 nfsd_print_client_locks(struct nfs4_client *, u64);
 u64 nfsd_print_client_openowners(struct nfs4_client *, u64);
 u64 nfsd_print_client_delegations(struct nfs4_client *, u64);
 #else /* CONFIG_NFSD_FAULT_INJECTION */
-- 
1.9.3



* [PATCH v4 084/100] nfsd: add more granular locking to forget_openowners fault injector
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (82 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 083/100] nfsd: add more granular locking to forget_locks fault injector Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 085/100] nfsd: add more granular locking to *_delegations fault injectors Jeff Layton
                   ` (15 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Take the nn->client_lock and clp->cl_lock spinlocks in the
forget_openowners fault injector instead of relying on the client_mutex.

Also, fix up the printk output that is generated when the file is read.
It currently says that it's reporting the number of open files, but
it's actually reporting the number of openowners.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
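The injectors in this series treat max == 0 as "no limit" and hand each
client the remaining budget as max - count. With max == 0 that
subtraction wraps to a very large u64, so the per-client "count == max"
test never fires in practice and the walk stays unlimited. A minimal
sketch of that arithmetic, with hypothetical helper names and plain
integers standing in for the per-client lists:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-client walker: visits up to max of the client's
 * nitems entries. max == 0 means "no limit", because count is
 * incremented before it is compared against max.
 */
static uint64_t collect_from_client(uint64_t nitems, uint64_t max)
{
	uint64_t count = 0;

	while (count < nitems) {
		++count;
		if (count == max)
			break;
	}
	return count;
}

/* Hypothetical outer loop: spread one global budget across all clients. */
static uint64_t collect_from_all(const uint64_t *clients, size_t nclients,
				 uint64_t max)
{
	uint64_t count = 0;
	size_t i;

	for (i = 0; i < nclients; i++) {
		/* With max == 0 this wraps to a huge value: still unlimited. */
		count += collect_from_client(clients[i], max - count);
		if (max != 0 && count >= max)
			break;
	}
	return count;
}
```

For three clients holding 3, 4 and 2 entries, a budget of 5 takes all 3
from the first client and 2 from the second, while a budget of 0 visits
all 9.
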
 fs/nfsd/fault_inject.c |   8 ++--
 fs/nfsd/nfs4state.c    | 122 +++++++++++++++++++++++++++++++++++++++++++++----
 fs/nfsd/state.h        |   7 ++-
 3 files changed, 122 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/fault_inject.c b/fs/nfsd/fault_inject.c
index a444d821d2a5..d4472cd19807 100644
--- a/fs/nfsd/fault_inject.c
+++ b/fs/nfsd/fault_inject.c
@@ -146,11 +146,9 @@ static struct nfsd_fault_inject_op inject_ops[] = {
 	},
 	{
 		.file     = "forget_openowners",
-		.get	  = nfsd_inject_get,
-		.set_val  = nfsd_inject_set,
-		.set_clnt = nfsd_inject_set_client,
-		.forget   = nfsd_forget_client_openowners,
-		.print    = nfsd_print_client_openowners,
+		.get	  = nfsd_inject_print_openowners,
+		.set_val  = nfsd_inject_forget_openowners,
+		.set_clnt = nfsd_inject_forget_client_openowners,
 	},
 	{
 		.file     = "forget_delegations",
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 5f781c843523..017201d02c46 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5944,30 +5944,136 @@ nfsd_inject_forget_locks(struct nfsd_fault_inject_op *op, u64 max)
 	return count;
 }
 
-static u64 nfsd_foreach_client_open(struct nfs4_client *clp, u64 max, void (*func)(struct nfs4_openowner *))
+static u64
+nfsd_foreach_client_openowner(struct nfs4_client *clp, u64 max,
+			      struct list_head *collect,
+			      void (*func)(struct nfs4_openowner *))
 {
 	struct nfs4_openowner *oop, *next;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
 	u64 count = 0;
 
+	lockdep_assert_held(&nn->client_lock);
+
+	spin_lock(&clp->cl_lock);
 	list_for_each_entry_safe(oop, next, &clp->cl_openowners, oo_perclient) {
-		if (func)
+		if (func) {
 			func(oop);
-		if (++count == max)
+			if (collect) {
+				atomic_inc(&clp->cl_refcount);
+				list_add(&oop->oo_perclient, collect);
+			}
+		}
+		++count;
+		/*
+		 * Despite the fact that these functions deal with
+		 * 64-bit integers for "count", we must ensure that
+		 * it doesn't blow up the clp->cl_refcount. Throw a
+		 * warning if we start to approach INT_MAX here.
+		 */
+		WARN_ON_ONCE(count == (INT_MAX / 2));
+		if (count == max)
 			break;
 	}
+	spin_unlock(&clp->cl_lock);
+
+	return count;
+}
 
+static u64
+nfsd_print_client_openowners(struct nfs4_client *clp)
+{
+	u64 count = nfsd_foreach_client_openowner(clp, 0, NULL, NULL);
+
+	nfsd_print_count(clp, count, "openowners");
 	return count;
 }
 
-u64 nfsd_forget_client_openowners(struct nfs4_client *clp, u64 max)
+static u64
+nfsd_collect_client_openowners(struct nfs4_client *clp,
+			       struct list_head *collect, u64 max)
 {
-	return nfsd_foreach_client_open(clp, max, release_openowner);
+	return nfsd_foreach_client_openowner(clp, max, collect,
+						unhash_openowner_locked);
 }
 
-u64 nfsd_print_client_openowners(struct nfs4_client *clp, u64 max)
+u64
+nfsd_inject_print_openowners(struct nfsd_fault_inject_op *op)
 {
-	u64 count = nfsd_foreach_client_open(clp, max, NULL);
-	nfsd_print_count(clp, count, "open files");
+	struct nfs4_client *clp;
+	u64 count = 0;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
+
+	if (!nfsd_netns_ready(nn))
+		return 0;
+
+	spin_lock(&nn->client_lock);
+	list_for_each_entry(clp, &nn->client_lru, cl_lru)
+		count += nfsd_print_client_openowners(clp);
+	spin_unlock(&nn->client_lock);
+
+	return count;
+}
+
+static void
+nfsd_reap_openowners(struct list_head *reaplist)
+{
+	struct nfs4_client *clp;
+	struct nfs4_openowner *oop, *next;
+
+	list_for_each_entry_safe(oop, next, reaplist, oo_perclient) {
+		list_del_init(&oop->oo_perclient);
+		clp = oop->oo_owner.so_client;
+		release_openowner(oop);
+		put_client(clp);
+	}
+}
+
+u64
+nfsd_inject_forget_client_openowners(struct nfsd_fault_inject_op *op,
+				struct sockaddr_storage *addr, size_t addr_size)
+{
+	unsigned int count = 0;
+	struct nfs4_client *clp;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
+	LIST_HEAD(reaplist);
+
+	if (!nfsd_netns_ready(nn))
+		return count;
+
+	spin_lock(&nn->client_lock);
+	clp = nfsd_find_client(addr, addr_size);
+	if (clp)
+		count = nfsd_collect_client_openowners(clp, &reaplist, 0);
+	spin_unlock(&nn->client_lock);
+	nfsd_reap_openowners(&reaplist);
+	return count;
+}
+
+u64
+nfsd_inject_forget_openowners(struct nfsd_fault_inject_op *op, u64 max)
+{
+	u64 count = 0;
+	struct nfs4_client *clp;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
+	LIST_HEAD(reaplist);
+
+	if (!nfsd_netns_ready(nn))
+		return count;
+
+	spin_lock(&nn->client_lock);
+	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
+		count += nfsd_collect_client_openowners(clp, &reaplist,
+							max - count);
+		if (max != 0 && count >= max)
+			break;
+	}
+	spin_unlock(&nn->client_lock);
+	nfsd_reap_openowners(&reaplist);
 	return count;
 }
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 5015ad3c9282..36651017697b 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -483,11 +483,14 @@ u64 nfsd_inject_forget_client_locks(struct nfsd_fault_inject_op *,
 				    struct sockaddr_storage *, size_t);
 u64 nfsd_inject_forget_locks(struct nfsd_fault_inject_op *, u64);
 
-u64 nfsd_forget_client_openowners(struct nfs4_client *, u64);
+u64 nfsd_inject_print_openowners(struct nfsd_fault_inject_op *);
+u64 nfsd_inject_forget_client_openowners(struct nfsd_fault_inject_op *,
+					 struct sockaddr_storage *, size_t);
+u64 nfsd_inject_forget_openowners(struct nfsd_fault_inject_op *, u64);
+
 u64 nfsd_forget_client_delegations(struct nfs4_client *, u64);
 u64 nfsd_recall_client_delegations(struct nfs4_client *, u64);
 
-u64 nfsd_print_client_openowners(struct nfs4_client *, u64);
 u64 nfsd_print_client_delegations(struct nfs4_client *, u64);
 #else /* CONFIG_NFSD_FAULT_INJECTION */
 static inline int nfsd_fault_inject_init(void) { return 0; }
-- 
1.9.3



* [PATCH v4 085/100] nfsd: add more granular locking to *_delegations fault injectors
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (83 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 084/100] nfsd: add more granular locking to forget_openowners " Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 086/100] nfsd: remove old fault injection infrastructure Jeff Layton
                   ` (14 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

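Take the nn->client_lock and state_lock spinlocks in the
forget_delegations and recall_delegations fault injectors instead of
relying on the client_mutex.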

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
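The walker below counts in a u64 but takes one clp->cl_refcount
reference per collected delegation, and cl_refcount is an atomic_t, which
wraps an int. That mismatch is why the hunk warns once when count reaches
INT_MAX / 2. A small sketch of the headroom arithmetic, using
hypothetical helper names and a plain int in place of atomic_t:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if taking "extra" more references would push an int-sized
 * refcount past INT_MAX. A negative refcount is treated as already
 * broken.
 */
static bool pins_would_overflow(int refcount, uint64_t extra)
{
	if (refcount < 0)
		return true;
	return extra > (uint64_t)(INT_MAX - refcount);
}

/*
 * Early-warning threshold: halfway to INT_MAX. The patch tests for
 * equality under WARN_ON_ONCE; ">=" is used here so the helper stays
 * true past the threshold.
 */
static bool pins_near_limit(uint64_t count)
{
	return count >= (uint64_t)(INT_MAX / 2);
}
```

Warning at the halfway mark rather than at INT_MAX leaves room for
references held by other code paths on the same client.
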
 fs/nfsd/fault_inject.c |  16 ++---
 fs/nfsd/nfs4state.c    | 177 +++++++++++++++++++++++++++++++++++++++++--------
 fs/nfsd/state.h        |  11 +--
 3 files changed, 164 insertions(+), 40 deletions(-)

diff --git a/fs/nfsd/fault_inject.c b/fs/nfsd/fault_inject.c
index d4472cd19807..2479dba71c3c 100644
--- a/fs/nfsd/fault_inject.c
+++ b/fs/nfsd/fault_inject.c
@@ -152,19 +152,15 @@ static struct nfsd_fault_inject_op inject_ops[] = {
 	},
 	{
 		.file     = "forget_delegations",
-		.get	  = nfsd_inject_get,
-		.set_val  = nfsd_inject_set,
-		.set_clnt = nfsd_inject_set_client,
-		.forget   = nfsd_forget_client_delegations,
-		.print    = nfsd_print_client_delegations,
+		.get	  = nfsd_inject_print_delegations,
+		.set_val  = nfsd_inject_forget_delegations,
+		.set_clnt = nfsd_inject_forget_client_delegations,
 	},
 	{
 		.file     = "recall_delegations",
-		.get	  = nfsd_inject_get,
-		.set_val  = nfsd_inject_set,
-		.set_clnt = nfsd_inject_set_client,
-		.forget   = nfsd_recall_client_delegations,
-		.print    = nfsd_print_client_delegations,
+		.get	  = nfsd_inject_print_delegations,
+		.set_val  = nfsd_inject_recall_delegations,
+		.set_clnt = nfsd_inject_recall_client_delegations,
 	},
 };
 
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 017201d02c46..f1bd137c3851 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6081,9 +6081,13 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
 				     struct list_head *victims)
 {
 	struct nfs4_delegation *dp, *next;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
 	u64 count = 0;
 
-	lockdep_assert_held(&state_lock);
+	lockdep_assert_held(&nn->client_lock);
+
+	spin_lock(&state_lock);
 	list_for_each_entry_safe(dp, next, &clp->cl_delegations, dl_perclnt) {
 		if (victims) {
 			/*
@@ -6100,60 +6104,180 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
 			 * don't monkey with it now that we are.
 			 */
 			++dp->dl_time;
+			atomic_inc(&clp->cl_refcount);
 			unhash_delegation_locked(dp);
 			list_add(&dp->dl_recall_lru, victims);
 		}
-		if (++count == max)
+		++count;
+		/*
+		 * Despite the fact that these functions deal with
+		 * 64-bit integers for "count", we must ensure that
+		 * it doesn't blow up the clp->cl_refcount. Throw a
+		 * warning if we start to approach INT_MAX here.
+		 */
+		WARN_ON_ONCE(count == (INT_MAX / 2));
+		if (count == max)
 			break;
 	}
+	spin_unlock(&state_lock);
+	return count;
+}
+
+static u64
+nfsd_print_client_delegations(struct nfs4_client *clp)
+{
+	u64 count = nfsd_find_all_delegations(clp, 0, NULL);
+
+	nfsd_print_count(clp, count, "delegations");
 	return count;
 }
 
-u64 nfsd_forget_client_delegations(struct nfs4_client *clp, u64 max)
+u64
+nfsd_inject_print_delegations(struct nfsd_fault_inject_op *op)
 {
-	struct nfs4_delegation *dp, *next;
-	LIST_HEAD(victims);
-	u64 count;
+	struct nfs4_client *clp;
+	u64 count = 0;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
 
-	spin_lock(&state_lock);
-	count = nfsd_find_all_delegations(clp, max, &victims);
-	spin_unlock(&state_lock);
+	if (!nfsd_netns_ready(nn))
+		return 0;
 
-	list_for_each_entry_safe(dp, next, &victims, dl_recall_lru)
+	spin_lock(&nn->client_lock);
+	list_for_each_entry(clp, &nn->client_lru, cl_lru)
+		count += nfsd_print_client_delegations(clp);
+	spin_unlock(&nn->client_lock);
+
+	return count;
+}
+
+static void
+nfsd_forget_delegations(struct list_head *reaplist)
+{
+	struct nfs4_client *clp;
+	struct nfs4_delegation *dp, *next;
+
+	list_for_each_entry_safe(dp, next, reaplist, dl_recall_lru) {
+		list_del_init(&dp->dl_recall_lru);
+		clp = dp->dl_stid.sc_client;
 		revoke_delegation(dp);
+		put_client(clp);
+	}
+}
 
+u64
+nfsd_inject_forget_client_delegations(struct nfsd_fault_inject_op *op,
+				struct sockaddr_storage *addr, size_t addr_size)
+{
+	u64 count = 0;
+	struct nfs4_client *clp;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
+	LIST_HEAD(reaplist);
+
+	if (!nfsd_netns_ready(nn))
+		return count;
+
+	spin_lock(&nn->client_lock);
+	clp = nfsd_find_client(addr, addr_size);
+	if (clp)
+		count = nfsd_find_all_delegations(clp, 0, &reaplist);
+	spin_unlock(&nn->client_lock);
+
+	nfsd_forget_delegations(&reaplist);
 	return count;
 }
 
-u64 nfsd_recall_client_delegations(struct nfs4_client *clp, u64 max)
+u64
+nfsd_inject_forget_delegations(struct nfsd_fault_inject_op *op, u64 max)
 {
-	struct nfs4_delegation *dp;
-	LIST_HEAD(victims);
-	u64 count;
+	u64 count = 0;
+	struct nfs4_client *clp;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
+	LIST_HEAD(reaplist);
 
-	spin_lock(&state_lock);
-	count = nfsd_find_all_delegations(clp, max, &victims);
-	while (!list_empty(&victims)) {
-		dp = list_first_entry(&victims, struct nfs4_delegation,
-					dl_recall_lru);
+	if (!nfsd_netns_ready(nn))
+		return count;
+
+	spin_lock(&nn->client_lock);
+	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
+		count += nfsd_find_all_delegations(clp, max - count, &reaplist);
+		if (max != 0 && count >= max)
+			break;
+	}
+	spin_unlock(&nn->client_lock);
+	nfsd_forget_delegations(&reaplist);
+	return count;
+}
+
+static void
+nfsd_recall_delegations(struct list_head *reaplist)
+{
+	struct nfs4_client *clp;
+	struct nfs4_delegation *dp, *next;
+
+	list_for_each_entry_safe(dp, next, reaplist, dl_recall_lru) {
 		list_del_init(&dp->dl_recall_lru);
+		clp = dp->dl_stid.sc_client;
+		/*
+		 * We skipped all entries that had a zero dl_time before,
+		 * so we can now reset the dl_time back to 0. If a delegation
+		 * break comes in now, then it won't make any difference since
+		 * we're recalling it either way.
+		 */
+		spin_lock(&state_lock);
 		dp->dl_time = 0;
+		spin_unlock(&state_lock);
 		nfsd_break_one_deleg(dp);
+		put_client(clp);
 	}
-	spin_unlock(&state_lock);
+}
+
+u64
+nfsd_inject_recall_client_delegations(struct nfsd_fault_inject_op *op,
+				      struct sockaddr_storage *addr,
+				      size_t addr_size)
+{
+	u64 count = 0;
+	struct nfs4_client *clp;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
+	LIST_HEAD(reaplist);
 
+	if (!nfsd_netns_ready(nn))
+		return count;
+
+	spin_lock(&nn->client_lock);
+	clp = nfsd_find_client(addr, addr_size);
+	if (clp)
+		count = nfsd_find_all_delegations(clp, 0, &reaplist);
+	spin_unlock(&nn->client_lock);
+
+	nfsd_recall_delegations(&reaplist);
 	return count;
 }
 
-u64 nfsd_print_client_delegations(struct nfs4_client *clp, u64 max)
+u64
+nfsd_inject_recall_delegations(struct nfsd_fault_inject_op *op, u64 max)
 {
 	u64 count = 0;
+	struct nfs4_client *clp, *next;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
+	LIST_HEAD(reaplist);
 
-	spin_lock(&state_lock);
-	count = nfsd_find_all_delegations(clp, max, NULL);
-	spin_unlock(&state_lock);
+	if (!nfsd_netns_ready(nn))
+		return count;
 
-	nfsd_print_count(clp, count, "delegations");
+	spin_lock(&nn->client_lock);
+	list_for_each_entry_safe(clp, next, &nn->client_lru, cl_lru) {
+		count += nfsd_find_all_delegations(clp, max - count, &reaplist);
+		if (max != 0 && count >= max)
+			break;
+	}
+	spin_unlock(&nn->client_lock);
+	nfsd_recall_delegations(&reaplist);
 	return count;
 }
 
@@ -6161,7 +6285,8 @@ u64 nfsd_for_n_state(u64 max, u64 (*func)(struct nfs4_client *, u64))
 {
 	struct nfs4_client *clp, *next;
 	u64 count = 0;
-	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns, nfsd_net_id);
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+						nfsd_net_id);
 
 	if (!nfsd_netns_ready(nn))
 		return 0;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 36651017697b..9bdf6807d063 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -488,10 +488,13 @@ u64 nfsd_inject_forget_client_openowners(struct nfsd_fault_inject_op *,
 					 struct sockaddr_storage *, size_t);
 u64 nfsd_inject_forget_openowners(struct nfsd_fault_inject_op *, u64);
 
-u64 nfsd_forget_client_delegations(struct nfs4_client *, u64);
-u64 nfsd_recall_client_delegations(struct nfs4_client *, u64);
-
-u64 nfsd_print_client_delegations(struct nfs4_client *, u64);
+u64 nfsd_inject_print_delegations(struct nfsd_fault_inject_op *);
+u64 nfsd_inject_forget_client_delegations(struct nfsd_fault_inject_op *,
+					  struct sockaddr_storage *, size_t);
+u64 nfsd_inject_forget_delegations(struct nfsd_fault_inject_op *, u64);
+u64 nfsd_inject_recall_client_delegations(struct nfsd_fault_inject_op *,
+					  struct sockaddr_storage *, size_t);
+u64 nfsd_inject_recall_delegations(struct nfsd_fault_inject_op *, u64);
 #else /* CONFIG_NFSD_FAULT_INJECTION */
 static inline int nfsd_fault_inject_init(void) { return 0; }
 static inline void nfsd_fault_inject_cleanup(void) {}
-- 
1.9.3



* [PATCH v4 086/100] nfsd: remove old fault injection infrastructure
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (84 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 085/100] nfsd: add more granular locking to *_delegations fault injectors Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 087/100] nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op() Jeff Layton
                   ` (13 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Remove the old nfsd_for_n_state function and move nfsd_find_client
higher up in the file to get rid of the forward declaration. Remove
the struct nfsd_fault_inject_op arguments from the operations as
they are no longer needed by any of them.

Finally, remove the old "standard" get and set routines, which
also eliminates the client_mutex from this code.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
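With the generic get/set wrappers gone, each table entry points straight
at handlers that no longer take the op pointer. A minimal userspace
sketch of that dispatch shape follows. The struct, the toy handlers, the
toy_locks counter and find_op() are hypothetical illustrations, not the
nfsd definitions, and the per-client set_clnt hook is left out for
brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down analogue of struct nfsd_fault_inject_op. */
struct inject_op {
	const char *file;
	uint64_t (*get)(void);
	uint64_t (*set_val)(uint64_t);
};

/* Hypothetical state: number of lock objects currently held. */
static uint64_t toy_locks = 7;

/* "Read" handler: report how many objects exist. */
static uint64_t toy_print_locks(void)
{
	return toy_locks;
}

/* "Write" handler: forget up to max objects, 0 meaning all of them. */
static uint64_t toy_forget_locks(uint64_t max)
{
	uint64_t n = (max == 0 || max > toy_locks) ? toy_locks : max;

	toy_locks -= n;
	return n;
}

static const struct inject_op toy_ops[] = {
	{
		.file    = "forget_locks",
		.get     = toy_print_locks,
		.set_val = toy_forget_locks,
	},
};

/* Look an op up by its file name; returns NULL if there is no match. */
static const struct inject_op *find_op(const char *file)
{
	size_t i;

	for (i = 0; i < sizeof(toy_ops) / sizeof(toy_ops[0]); i++) {
		if (strcmp(toy_ops[i].file, file) == 0)
			return &toy_ops[i];
	}
	return NULL;
}
```

A caller then invokes op->get() or op->set_val(val) directly, which
mirrors the call-site changes in fault_inject_read() and
fault_inject_write() in the diff below.
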
 fs/nfsd/fault_inject.c | 51 ++++-------------------------
 fs/nfsd/nfs4state.c    | 87 +++++++++++++++++++-------------------------------
 fs/nfsd/state.h        | 45 +++++++++++---------------
 3 files changed, 57 insertions(+), 126 deletions(-)

diff --git a/fs/nfsd/fault_inject.c b/fs/nfsd/fault_inject.c
index 2479dba71c3c..c16bf5af6831 100644
--- a/fs/nfsd/fault_inject.c
+++ b/fs/nfsd/fault_inject.c
@@ -17,52 +17,13 @@
 
 struct nfsd_fault_inject_op {
 	char *file;
-	u64 (*get)(struct nfsd_fault_inject_op *);
-	u64 (*set_val)(struct nfsd_fault_inject_op *, u64);
-	u64 (*set_clnt)(struct nfsd_fault_inject_op *,
-			struct sockaddr_storage *, size_t);
-	u64 (*forget)(struct nfs4_client *, u64);
-	u64 (*print)(struct nfs4_client *, u64);
+	u64 (*get)(void);
+	u64 (*set_val)(u64);
+	u64 (*set_clnt)(struct sockaddr_storage *, size_t);
 };
 
 static struct dentry *debug_dir;
 
-static u64 nfsd_inject_set(struct nfsd_fault_inject_op *op, u64 val)
-{
-	u64 count;
-
-	nfs4_lock_state();
-	count = nfsd_for_n_state(val, op->forget);
-	nfs4_unlock_state();
-	return count;
-}
-
-static u64 nfsd_inject_set_client(struct nfsd_fault_inject_op *op,
-				   struct sockaddr_storage *addr,
-				   size_t addr_size)
-{
-	struct nfs4_client *clp;
-	u64 count = 0;
-
-	nfs4_lock_state();
-	clp = nfsd_find_client(addr, addr_size);
-	if (clp)
-		count = op->forget(clp, 0);
-	nfs4_unlock_state();
-	return count;
-}
-
-static u64 nfsd_inject_get(struct nfsd_fault_inject_op *op)
-{
-	u64 count;
-
-	nfs4_lock_state();
-	count = nfsd_for_n_state(0, op->print);
-	nfs4_unlock_state();
-
-	return count;
-}
-
 static ssize_t fault_inject_read(struct file *file, char __user *buf,
 				 size_t len, loff_t *ppos)
 {
@@ -73,7 +34,7 @@ static ssize_t fault_inject_read(struct file *file, char __user *buf,
 	struct nfsd_fault_inject_op *op = file_inode(file)->i_private;
 
 	if (!pos)
-		val = op->get(op);
+		val = op->get();
 	size = scnprintf(read_buf, sizeof(read_buf), "%llu\n", val);
 
 	return simple_read_from_buffer(buf, len, ppos, read_buf, size);
@@ -103,7 +64,7 @@ static ssize_t fault_inject_write(struct file *file, const char __user *buf,
 
 	size = rpc_pton(net, write_buf, size, (struct sockaddr *)&sa, sizeof(sa));
 	if (size > 0) {
-		val = op->set_clnt(op, &sa, size);
+		val = op->set_clnt(&sa, size);
 		if (val)
 			pr_info("NFSD [%s]: Client %s had %llu state object(s)\n",
 				op->file, write_buf, val);
@@ -114,7 +75,7 @@ static ssize_t fault_inject_write(struct file *file, const char __user *buf,
 		else
 			pr_info("NFSD Fault Injection: %s (n = %llu)",
 				op->file, val);
-		val = op->set_val(op, val);
+		val = op->set_val(val);
 		pr_info("NFSD: %s: found %llu", op->file, val);
 	}
 	return len; /* on success, claim we got the whole input */
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index f1bd137c3851..6342fb55ced4 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5708,8 +5708,25 @@ put_client(struct nfs4_client *clp)
 	atomic_dec(&clp->cl_refcount);
 }
 
+static struct nfs4_client *
+nfsd_find_client(struct sockaddr_storage *addr, size_t addr_size)
+{
+	struct nfs4_client *clp;
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+					  nfsd_net_id);
+
+	if (!nfsd_netns_ready(nn))
+		return NULL;
+
+	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
+		if (memcmp(&clp->cl_addr, addr, addr_size) == 0)
+			return clp;
+	}
+	return NULL;
+}
+
 u64
-nfsd_inject_print_clients(struct nfsd_fault_inject_op *op)
+nfsd_inject_print_clients(void)
 {
 	struct nfs4_client *clp;
 	u64 count = 0;
@@ -5732,8 +5749,7 @@ nfsd_inject_print_clients(struct nfsd_fault_inject_op *op)
 }
 
 u64
-nfsd_inject_forget_client(struct nfsd_fault_inject_op *op,
-			  struct sockaddr_storage *addr, size_t addr_size)
+nfsd_inject_forget_client(struct sockaddr_storage *addr, size_t addr_size)
 {
 	u64 count = 0;
 	struct nfs4_client *clp;
@@ -5760,7 +5776,7 @@ nfsd_inject_forget_client(struct nfsd_fault_inject_op *op,
 }
 
 u64
-nfsd_inject_forget_clients(struct nfsd_fault_inject_op *op, u64 max)
+nfsd_inject_forget_clients(u64 max)
 {
 	u64 count = 0;
 	struct nfs4_client *clp, *next;
@@ -5867,7 +5883,7 @@ nfsd_print_client_locks(struct nfs4_client *clp)
 }
 
 u64
-nfsd_inject_print_locks(struct nfsd_fault_inject_op *op)
+nfsd_inject_print_locks(void)
 {
 	struct nfs4_client *clp;
 	u64 count = 0;
@@ -5900,8 +5916,7 @@ nfsd_reap_locks(struct list_head *reaplist)
 }
 
 u64
-nfsd_inject_forget_client_locks(struct nfsd_fault_inject_op *op,
-				struct sockaddr_storage *addr, size_t addr_size)
+nfsd_inject_forget_client_locks(struct sockaddr_storage *addr, size_t addr_size)
 {
 	unsigned int count = 0;
 	struct nfs4_client *clp;
@@ -5922,7 +5937,7 @@ nfsd_inject_forget_client_locks(struct nfsd_fault_inject_op *op,
 }
 
 u64
-nfsd_inject_forget_locks(struct nfsd_fault_inject_op *op, u64 max)
+nfsd_inject_forget_locks(u64 max)
 {
 	u64 count = 0;
 	struct nfs4_client *clp;
@@ -5999,7 +6014,7 @@ nfsd_collect_client_openowners(struct nfs4_client *clp,
 }
 
 u64
-nfsd_inject_print_openowners(struct nfsd_fault_inject_op *op)
+nfsd_inject_print_openowners(void)
 {
 	struct nfs4_client *clp;
 	u64 count = 0;
@@ -6032,8 +6047,8 @@ nfsd_reap_openowners(struct list_head *reaplist)
 }
 
 u64
-nfsd_inject_forget_client_openowners(struct nfsd_fault_inject_op *op,
-				struct sockaddr_storage *addr, size_t addr_size)
+nfsd_inject_forget_client_openowners(struct sockaddr_storage *addr,
+				     size_t addr_size)
 {
 	unsigned int count = 0;
 	struct nfs4_client *clp;
@@ -6054,7 +6069,7 @@ nfsd_inject_forget_client_openowners(struct nfsd_fault_inject_op *op,
 }
 
 u64
-nfsd_inject_forget_openowners(struct nfsd_fault_inject_op *op, u64 max)
+nfsd_inject_forget_openowners(u64 max)
 {
 	u64 count = 0;
 	struct nfs4_client *clp;
@@ -6133,7 +6148,7 @@ nfsd_print_client_delegations(struct nfs4_client *clp)
 }
 
 u64
-nfsd_inject_print_delegations(struct nfsd_fault_inject_op *op)
+nfsd_inject_print_delegations(void)
 {
 	struct nfs4_client *clp;
 	u64 count = 0;
@@ -6166,8 +6181,8 @@ nfsd_forget_delegations(struct list_head *reaplist)
 }
 
 u64
-nfsd_inject_forget_client_delegations(struct nfsd_fault_inject_op *op,
-				struct sockaddr_storage *addr, size_t addr_size)
+nfsd_inject_forget_client_delegations(struct sockaddr_storage *addr,
+				      size_t addr_size)
 {
 	u64 count = 0;
 	struct nfs4_client *clp;
@@ -6189,7 +6204,7 @@ nfsd_inject_forget_client_delegations(struct nfsd_fault_inject_op *op,
 }
 
 u64
-nfsd_inject_forget_delegations(struct nfsd_fault_inject_op *op, u64 max)
+nfsd_inject_forget_delegations(u64 max)
 {
 	u64 count = 0;
 	struct nfs4_client *clp;
@@ -6235,8 +6250,7 @@ nfsd_recall_delegations(struct list_head *reaplist)
 }
 
 u64
-nfsd_inject_recall_client_delegations(struct nfsd_fault_inject_op *op,
-				      struct sockaddr_storage *addr,
+nfsd_inject_recall_client_delegations(struct sockaddr_storage *addr,
 				      size_t addr_size)
 {
 	u64 count = 0;
@@ -6259,7 +6273,7 @@ nfsd_inject_recall_client_delegations(struct nfsd_fault_inject_op *op,
 }
 
 u64
-nfsd_inject_recall_delegations(struct nfsd_fault_inject_op *op, u64 max)
+nfsd_inject_recall_delegations(u64 max)
 {
 	u64 count = 0;
 	struct nfs4_client *clp, *next;
@@ -6280,41 +6294,6 @@ nfsd_inject_recall_delegations(struct nfsd_fault_inject_op *op, u64 max)
 	nfsd_recall_delegations(&reaplist);
 	return count;
 }
-
-u64 nfsd_for_n_state(u64 max, u64 (*func)(struct nfs4_client *, u64))
-{
-	struct nfs4_client *clp, *next;
-	u64 count = 0;
-	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
-						nfsd_net_id);
-
-	if (!nfsd_netns_ready(nn))
-		return 0;
-
-	list_for_each_entry_safe(clp, next, &nn->client_lru, cl_lru) {
-		count += func(clp, max - count);
-		if ((max != 0) && (count >= max))
-			break;
-	}
-
-	return count;
-}
-
-struct nfs4_client *nfsd_find_client(struct sockaddr_storage *addr, size_t addr_size)
-{
-	struct nfs4_client *clp;
-	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns, nfsd_net_id);
-
-	if (!nfsd_netns_ready(nn))
-		return NULL;
-
-	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
-		if (memcmp(&clp->cl_addr, addr, addr_size) == 0)
-			return clp;
-	}
-	return NULL;
-}
-
 #endif /* CONFIG_NFSD_FAULT_INJECTION */
 
 /*
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 9bdf6807d063..c39ff5e1509f 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -466,35 +466,26 @@ extern void nfsd4_record_grace_done(struct nfsd_net *nn, time_t boot_time);
 
 /* nfs fault injection functions */
 #ifdef CONFIG_NFSD_FAULT_INJECTION
-struct nfsd_fault_inject_op;
-
 int nfsd_fault_inject_init(void);
 void nfsd_fault_inject_cleanup(void);
-u64 nfsd_for_n_state(u64, u64 (*)(struct nfs4_client *, u64));
-struct nfs4_client *nfsd_find_client(struct sockaddr_storage *, size_t);
-
-u64 nfsd_inject_print_clients(struct nfsd_fault_inject_op *op);
-u64 nfsd_inject_forget_client(struct nfsd_fault_inject_op *,
-			      struct sockaddr_storage *, size_t);
-u64 nfsd_inject_forget_clients(struct nfsd_fault_inject_op *, u64);
-
-u64 nfsd_inject_print_locks(struct nfsd_fault_inject_op *);
-u64 nfsd_inject_forget_client_locks(struct nfsd_fault_inject_op *,
-				    struct sockaddr_storage *, size_t);
-u64 nfsd_inject_forget_locks(struct nfsd_fault_inject_op *, u64);
-
-u64 nfsd_inject_print_openowners(struct nfsd_fault_inject_op *);
-u64 nfsd_inject_forget_client_openowners(struct nfsd_fault_inject_op *,
-					 struct sockaddr_storage *, size_t);
-u64 nfsd_inject_forget_openowners(struct nfsd_fault_inject_op *, u64);
-
-u64 nfsd_inject_print_delegations(struct nfsd_fault_inject_op *);
-u64 nfsd_inject_forget_client_delegations(struct nfsd_fault_inject_op *,
-					  struct sockaddr_storage *, size_t);
-u64 nfsd_inject_forget_delegations(struct nfsd_fault_inject_op *, u64);
-u64 nfsd_inject_recall_client_delegations(struct nfsd_fault_inject_op *,
-					  struct sockaddr_storage *, size_t);
-u64 nfsd_inject_recall_delegations(struct nfsd_fault_inject_op *, u64);
+
+u64 nfsd_inject_print_clients(void);
+u64 nfsd_inject_forget_client(struct sockaddr_storage *, size_t);
+u64 nfsd_inject_forget_clients(u64);
+
+u64 nfsd_inject_print_locks(void);
+u64 nfsd_inject_forget_client_locks(struct sockaddr_storage *, size_t);
+u64 nfsd_inject_forget_locks(u64);
+
+u64 nfsd_inject_print_openowners(void);
+u64 nfsd_inject_forget_client_openowners(struct sockaddr_storage *, size_t);
+u64 nfsd_inject_forget_openowners(u64);
+
+u64 nfsd_inject_print_delegations(void);
+u64 nfsd_inject_forget_client_delegations(struct sockaddr_storage *, size_t);
+u64 nfsd_inject_forget_delegations(u64);
+u64 nfsd_inject_recall_client_delegations(struct sockaddr_storage *, size_t);
+u64 nfsd_inject_recall_delegations(u64);
 #else /* CONFIG_NFSD_FAULT_INJECTION */
 static inline int nfsd_fault_inject_init(void) { return 0; }
 static inline void nfsd_fault_inject_cleanup(void) {}
-- 
1.9.3


* [PATCH v4 087/100] nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op()
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (85 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 086/100] nfsd: remove old fault injection infrastructure Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 088/100] nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateid Jeff Layton
                   ` (12 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
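For reference, the error handling after this change has the following shape:
a failed lookup has acquired nothing, so it can return directly, while later
failures still pass through "out:" to drop the stid reference. This is a
minimal userspace sketch only; demo_stid, demo_lookup(), demo_put() and the
error values are hypothetical stand-ins, not the nfsd functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted stateid; not struct nfs4_stid. */
struct demo_stid {
	int refcount;
	int generation;
};

/* Takes a reference on success; takes nothing on failure. */
static int demo_lookup(struct demo_stid *table, int present,
		       struct demo_stid **out)
{
	if (!present)
		return -1;
	table->refcount++;
	*out = table;
	return 0;
}

/* Drops the reference taken by demo_lookup(). */
static void demo_put(struct demo_stid *s)
{
	s->refcount--;
}

static int demo_preprocess(struct demo_stid *table, int present, int want_gen)
{
	struct demo_stid *s;
	int status;

	status = demo_lookup(table, present, &s);
	if (status)
		return status;	/* nothing acquired yet, so nothing to undo */

	status = (s->generation == want_gen) ? 0 : -2;
	if (status)
		goto out;	/* reference held: must go through cleanup */

	/* ... real work on the stateid would happen here ... */
out:
	demo_put(s);
	return status;
}
```

In the sketch, every path that took a reference leaves the refcount where it
started, and the lookup-failure path never touches it.
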
 fs/nfsd/nfs4state.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 6342fb55ced4..5410b7ee2273 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4464,13 +4464,11 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
 	if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
 		return check_special_stateids(net, current_fh, stateid, flags);
 
-	nfs4_lock_state();
-
 	status = nfsd4_lookup_stateid(cstate, stateid,
 				NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
 				&s, nn);
 	if (status)
-		goto unlock_state;
+		return status;
 	status = check_stateid_generation(stateid, &s->sc_stateid, nfsd4_has_session(cstate));
 	if (status)
 		goto out;
@@ -4520,8 +4518,6 @@ nfs4_preprocess_stateid_op(struct net *net, struct nfsd4_compound_state *cstate,
 		*filpp = file;
 out:
 	nfs4_put_stid(s);
-unlock_state:
-	nfs4_unlock_state();
 	return status;
 }
 
-- 
1.9.3


* [PATCH v4 088/100] nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateid
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (86 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 087/100] nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op() Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 089/100] nfsd: Remove nfs4_lock_state(): nfsd4_release_lockowner Jeff Layton
                   ` (11 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 5410b7ee2273..2acae21e67d1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4531,11 +4531,9 @@ nfsd4_test_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct nfsd4_test_stateid_id *stateid;
 	struct nfs4_client *cl = cstate->session->se_client;
 
-	nfs4_lock_state();
 	list_for_each_entry(stateid, &test_stateid->ts_stateid_list, ts_id_list)
 		stateid->ts_id_status =
 			nfsd4_validate_stateid(cl, &stateid->ts_id_stateid);
-	nfs4_unlock_state();
 
 	return nfs_ok;
 }
@@ -4551,7 +4549,6 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct nfs4_client *cl = cstate->session->se_client;
 	__be32 ret = nfserr_bad_stateid;
 
-	nfs4_lock_state();
 	spin_lock(&cl->cl_lock);
 	s = find_stateid_locked(cl, stateid);
 	if (!s)
@@ -4588,7 +4585,6 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 out_unlock:
 	spin_unlock(&cl->cl_lock);
 out:
-	nfs4_unlock_state();
 	return ret;
 }
 
-- 
1.9.3


* [PATCH v4 089/100] nfsd: Remove nfs4_lock_state(): nfsd4_release_lockowner
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (87 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 088/100] nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateid Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 090/100] nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt() Jeff Layton
                   ` (10 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 2acae21e67d1..4445925c4c13 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5556,11 +5556,9 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 	dprintk("nfsd4_release_lockowner clientid: (%08x/%08x):\n",
 		clid->cl_boot, clid->cl_id);
 
-	nfs4_lock_state();
-
 	status = lookup_clientid(clid, cstate, nn);
 	if (status)
-		goto out;
+		return status;
 
 	clp = cstate->clp;
 	/* Find the matching lock stateowner */
@@ -5577,7 +5575,7 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 			if (check_for_locks(stp->st_stid.sc_file, lo)) {
 				status = nfserr_locks_held;
 				spin_unlock(&clp->cl_lock);
-				goto out;
+				return status;
 			}
 		}
 
@@ -5587,8 +5585,6 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
 	spin_unlock(&clp->cl_lock);
 	if (lo)
 		release_lockowner(lo);
-out:
-	nfs4_unlock_state();
 	return status;
 }
 
-- 
1.9.3


* [PATCH v4 090/100] nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt()
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (88 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 089/100] nfsd: Remove nfs4_lock_state(): nfsd4_release_lockowner Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 091/100] nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_close Jeff Layton
                   ` (9 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4445925c4c13..7f184503b42e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5193,8 +5193,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		return status;
 	}
 
-	nfs4_lock_state();
-
 	if (lock->lk_is_new) {
 		if (nfsd4_has_session(cstate))
 			/* See rfc 5661 18.10.3: given clientid is ignored: */
@@ -5337,7 +5335,6 @@ out:
 	if (open_stp)
 		put_generic_stateid(open_stp);
 	nfsd4_bump_seqid(cstate, status);
-	nfs4_unlock_state();
 	if (file_lock)
 		locks_free_lock(file_lock);
 	if (conflock)
@@ -5380,8 +5377,6 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	if (check_lock_length(lockt->lt_offset, lockt->lt_length))
 		 return nfserr_inval;
 
-	nfs4_lock_state();
-
 	if (!nfsd4_has_session(cstate)) {
 		status = lookup_clientid(&lockt->lt_clientid, cstate, nn);
 		if (status)
@@ -5436,7 +5431,6 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 out:
 	if (lo)
 		nfs4_put_stateowner(&lo->lo_owner);
-	nfs4_unlock_state();
 	if (file_lock)
 		locks_free_lock(file_lock);
 	return status;
@@ -5460,8 +5454,6 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	if (check_lock_length(locku->lu_offset, locku->lu_length))
 		 return nfserr_inval;
 
-	nfs4_lock_state();
-									        
 	status = nfs4_preprocess_seqid_op(cstate, locku->lu_seqid,
 					&locku->lu_stateid, NFS4_LOCK_STID,
 					&stp, nn);
@@ -5504,7 +5496,6 @@ put_stateid:
 	put_generic_stateid(stp);
 out:
 	nfsd4_bump_seqid(cstate, status);
-	nfs4_unlock_state();
 	if (file_lock)
 		locks_free_lock(file_lock);
 	return status;
-- 
1.9.3


* [PATCH v4 091/100] nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_close
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (89 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 090/100] nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt() Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 092/100] nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn() Jeff Layton
                   ` (8 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7f184503b42e..46d3c13f7d94 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4708,7 +4708,6 @@ put_stateid:
 	put_generic_stateid(stp);
 out:
 	nfsd4_bump_seqid(cstate, status);
-	nfs4_unlock_state();
 	return status;
 }
 
@@ -4755,7 +4754,6 @@ nfsd4_open_downgrade(struct svc_rqst *rqstp,
 		dprintk("NFSD: %s: od_deleg_want=0x%x ignored\n", __func__,
 			od->od_deleg_want);
 
-	nfs4_lock_state();
 	status = nfs4_preprocess_confirmed_seqid_op(cstate, od->od_seqid,
 					&od->od_stateid, &stp, nn);
 	if (status)
@@ -4821,7 +4819,6 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	dprintk("NFSD: nfsd4_close on file %pd\n", 
 			cstate->current_fh.fh_dentry);
 
-	nfs4_lock_state();
 	status = nfs4_preprocess_seqid_op(cstate, close->cl_seqid,
 					&close->cl_stateid,
 					NFS4_OPEN_STID|NFS4_CLOSED_STID,
@@ -4837,7 +4834,6 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	/* put reference from nfs4_preprocess_seqid_op */
 	put_generic_stateid(stp);
 out:
-	nfs4_unlock_state();
 	return status;
 }
 
-- 
1.9.3


* [PATCH v4 092/100] nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn()
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (90 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 091/100] nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_close Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 093/100] nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm Jeff Layton
                   ` (7 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 46d3c13f7d94..f597209a1e57 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4850,7 +4850,6 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
 		return status;
 
-	nfs4_lock_state();
 	status = nfsd4_lookup_stateid(cstate, stateid, NFS4_DELEG_STID, &s, nn);
 	if (status)
 		goto out;
@@ -4863,8 +4862,6 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 put_stateid:
 	nfs4_put_delegation(dp);
 out:
-	nfs4_unlock_state();
-
 	return status;
 }
 
-- 
1.9.3


* [PATCH v4 093/100] nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (91 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 092/100] nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn() Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 094/100] nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session() Jeff Layton
                   ` (6 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4proc.c  | 3 ---
 fs/nfsd/nfs4state.c | 6 ------
 2 files changed, 9 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index c53757ec6580..918d42ab5976 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -385,8 +385,6 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	if (nfsd4_has_session(cstate))
 		copy_clientid(&open->op_clientid, cstate->session);
 
-	nfs4_lock_state();
-
 	/* check seqid for replay. set nfs4_owner */
 	resp = rqstp->rq_resp;
 	status = nfsd4_process_open1(&resp->cstate, open, nn);
@@ -469,7 +467,6 @@ out:
 	}
 	nfsd4_cleanup_open_state(cstate, open, status);
 	nfsd4_bump_seqid(cstate, status);
-	nfs4_unlock_state();
 	return status;
 }
 
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index f597209a1e57..a97c3f4d392c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4005,9 +4005,6 @@ static void nfsd4_deleg_xgrade_none_ext(struct nfsd4_open *open,
 	 */
 }
 
-/*
- * called with nfs4_lock_state() held.
- */
 __be32
 nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_open *open)
 {
@@ -4685,8 +4682,6 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	if (status)
 		return status;
 
-	nfs4_lock_state();
-
 	status = nfs4_preprocess_seqid_op(cstate,
 					oc->oc_seqid, &oc->oc_req_stateid,
 					NFS4_OPEN_STID, &stp, nn);
@@ -4780,7 +4775,6 @@ put_stateid:
 	put_generic_stateid(stp);
 out:
 	nfsd4_bump_seqid(cstate, status);
-	nfs4_unlock_state();
 	return status;
 }
 
-- 
1.9.3


* [PATCH v4 094/100] nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session()
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (92 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 093/100] nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 095/100] nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm, renew Jeff Layton
                   ` (5 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Also destroy_clientid and bind_conn_to_session.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a97c3f4d392c..0c721cb01d42 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2182,7 +2182,6 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,
 		return nfserr_jukebox;
 
 	/* Cases below refer to rfc 5661 section 18.35.4: */
-	nfs4_lock_state();
 	spin_lock(&nn->client_lock);
 	conf = find_confirmed_client_by_name(&exid->clname, nn);
 	if (conf) {
@@ -2261,7 +2260,6 @@ out_copy:
 
 out:
 	spin_unlock(&nn->client_lock);
-	nfs4_unlock_state();
 	if (new)
 		expire_client(new);
 	if (unconf)
@@ -2435,7 +2433,6 @@ nfsd4_create_session(struct svc_rqst *rqstp,
 	if (!conn)
 		goto out_free_session;
 
-	nfs4_lock_state();
 	spin_lock(&nn->client_lock);
 	unconf = find_unconfirmed_client(&cr_ses->clientid, true, nn);
 	conf = find_confirmed_client(&cr_ses->clientid, true, nn);
@@ -2505,13 +2502,11 @@ nfsd4_create_session(struct svc_rqst *rqstp,
 	/* init connection and backchannel */
 	nfsd4_init_conn(rqstp, conn, new);
 	nfsd4_put_session(new);
-	nfs4_unlock_state();
 	if (old)
 		expire_client(old);
 	return status;
 out_free_conn:
 	spin_unlock(&nn->client_lock);
-	nfs4_unlock_state();
 	free_conn(conn);
 	if (old)
 		expire_client(old);
@@ -2567,7 +2562,6 @@ __be32 nfsd4_bind_conn_to_session(struct svc_rqst *rqstp,
 
 	if (!nfsd4_last_compound_op(rqstp))
 		return nfserr_not_only_op;
-	nfs4_lock_state();
 	spin_lock(&nn->client_lock);
 	session = find_in_sessionid_hashtbl(&bcts->sessionid, net, &status);
 	spin_unlock(&nn->client_lock);
@@ -2588,7 +2582,6 @@ __be32 nfsd4_bind_conn_to_session(struct svc_rqst *rqstp,
 out:
 	nfsd4_put_session(session);
 out_no_session:
-	nfs4_unlock_state();
 	return status;
 }
 
@@ -2610,7 +2603,6 @@ nfsd4_destroy_session(struct svc_rqst *r,
 	struct net *net = SVC_NET(r);
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
-	nfs4_lock_state();
 	status = nfserr_not_only_op;
 	if (nfsd4_compound_in_session(cstate->session, &sessionid->sessionid)) {
 		if (!nfsd4_last_compound_op(r))
@@ -2640,7 +2632,6 @@ out_put_session:
 out_client_lock:
 	spin_unlock(&nn->client_lock);
 out:
-	nfs4_unlock_state();
 	return status;
 }
 
@@ -2843,7 +2834,6 @@ nfsd4_destroy_clientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *csta
 	__be32 status = 0;
 	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 
-	nfs4_lock_state();
 	spin_lock(&nn->client_lock);
 	unconf = find_unconfirmed_client(&dc->clientid, true, nn);
 	conf = find_confirmed_client(&dc->clientid, true, nn);
@@ -2872,7 +2862,6 @@ nfsd4_destroy_clientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *csta
 	unhash_client_locked(clp);
 out:
 	spin_unlock(&nn->client_lock);
-	nfs4_unlock_state();
 	if (clp)
 		expire_client(clp);
 	return status;
-- 
1.9.3


* [PATCH v4 095/100] nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm, renew
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (93 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 094/100] nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session() Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 096/100] nfsd: Remove nfs4_lock_state(): reclaim_complete() Jeff Layton
                   ` (4 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/nfs4state.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0c721cb01d42..d91e5d40e411 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2921,7 +2921,6 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	if (new == NULL)
 		return nfserr_jukebox;
 	/* Cases below refer to rfc 3530 section 14.2.33: */
-	nfs4_lock_state();
 	spin_lock(&nn->client_lock);
 	conf = find_confirmed_client_by_name(&clname, nn);
 	if (conf) {
@@ -2956,7 +2955,6 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	status = nfs_ok;
 out:
 	spin_unlock(&nn->client_lock);
-	nfs4_unlock_state();
 	if (new)
 		free_client(new);
 	if (unconf)
@@ -2979,7 +2977,6 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
 
 	if (STALE_CLIENTID(clid, nn))
 		return nfserr_stale_clientid;
-	nfs4_lock_state();
 
 	spin_lock(&nn->client_lock);
 	conf = find_confirmed_client(clid, false, nn);
@@ -3029,7 +3026,6 @@ out:
 	spin_unlock(&nn->client_lock);
 	if (old)
 		expire_client(old);
-	nfs4_unlock_state();
 	return status;
 }
 
@@ -4112,7 +4108,6 @@ nfsd4_renew(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	__be32 status;
 	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 
-	nfs4_lock_state();
 	dprintk("process_renew(%08x/%08x): starting\n", 
 			clid->cl_boot, clid->cl_id);
 	status = lookup_clientid(clid, cstate, nn);
@@ -4125,7 +4120,6 @@ nfsd4_renew(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		goto out;
 	status = nfs_ok;
 out:
-	nfs4_unlock_state();
 	return status;
 }
 
-- 
1.9.3


* [PATCH v4 096/100] nfsd: Remove nfs4_lock_state(): reclaim_complete()
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (94 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 095/100] nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm, renew Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 097/100] nfsd: remove nfs4_lock_state: nfs4_laundromat Jeff Layton
                   ` (3 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Trond Myklebust <trond.myklebust@primarydata.com>

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
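The "already completed" check in this function is a single atomic
test-and-set on cl_flags, which is presumably why it stays correct without
the mutex: two racing RECLAIM_COMPLETE calls cannot both observe the bit as
clear. The sketch below is a userspace analogue using C11 atomics, not the
kernel's test_and_set_bit() implementation; the demo_* names, the flag value
and the return codes are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_RECLAIM_COMPLETE (1UL << 0)	/* hypothetical flag bit */

/* Hypothetical stand-in for the client's flags word. */
struct demo_client {
	atomic_ulong flags;
};

/* Atomically sets the flag and reports whether it was already set. */
static bool demo_test_and_set(struct demo_client *clp, unsigned long bit)
{
	unsigned long old = atomic_fetch_or(&clp->flags, bit);

	return (old & bit) != 0;
}

/* Returns 0 for the first caller, -1 for any later caller. */
static int demo_reclaim_complete(struct demo_client *clp)
{
	if (demo_test_and_set(clp, DEMO_RECLAIM_COMPLETE))
		return -1;	/* someone else already completed */
	return 0;
}
```

Only the check itself is covered by this argument; whatever follows it in the
real function needs its own protection.
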
 fs/nfsd/nfs4state.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d91e5d40e411..63db4213a5ce 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2882,7 +2882,6 @@ nfsd4_reclaim_complete(struct svc_rqst *rqstp, struct nfsd4_compound_state *csta
 		 return nfs_ok;
 	}
 
-	nfs4_lock_state();
 	status = nfserr_complete_already;
 	if (test_and_set_bit(NFSD4_CLIENT_RECLAIM_COMPLETE,
 			     &cstate->session->se_client->cl_flags))
@@ -2902,7 +2901,6 @@ nfsd4_reclaim_complete(struct svc_rqst *rqstp, struct nfsd4_compound_state *csta
 	status = nfs_ok;
 	nfsd4_client_record_create(cstate->session->se_client);
 out:
-	nfs4_unlock_state();
 	return status;
 }
 
-- 
1.9.3


* [PATCH v4 097/100] nfsd: remove nfs4_lock_state: nfs4_laundromat
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (95 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 096/100] nfsd: Remove nfs4_lock_state(): reclaim_complete() Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 098/100] nfsd: remove nfs4_lock_state: nfs4_state_shutdown_net Jeff Layton
                   ` (2 subsequent siblings)
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 63db4213a5ce..34496bb3badc 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4151,8 +4151,6 @@ nfs4_laundromat(struct nfsd_net *nn)
 	time_t cutoff = get_seconds() - nn->nfsd4_lease;
 	time_t t, new_timeo = nn->nfsd4_lease;
 
-	nfs4_lock_state();
-
 	dprintk("NFSD: laundromat service - starting\n");
 	nfsd4_end_grace(nn);
 	INIT_LIST_HEAD(&reaplist);
@@ -4220,7 +4218,6 @@ nfs4_laundromat(struct nfsd_net *nn)
 	spin_unlock(&nn->client_lock);
 
 	new_timeo = max_t(time_t, new_timeo, NFSD_LAUNDROMAT_MINTIMEOUT);
-	nfs4_unlock_state();
 	return new_timeo;
 }
 
-- 
1.9.3


* [PATCH v4 098/100] nfsd: remove nfs4_lock_state: nfs4_state_shutdown_net
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (96 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 097/100] nfsd: remove nfs4_lock_state: nfs4_laundromat Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 099/100] nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 100/100] nfsd: add some comments to the nfsd4 object definitions Jeff Layton
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 34496bb3badc..4a5bcf256a1d 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6390,7 +6390,6 @@ nfs4_state_shutdown_net(struct net *net)
 	cancel_delayed_work_sync(&nn->laundromat_work);
 	locks_end_grace(&nn->nfsd4_manager);
 
-	nfs4_lock_state();
 	INIT_LIST_HEAD(&reaplist);
 	spin_lock(&state_lock);
 	list_for_each_safe(pos, next, &nn->del_recall_lru) {
@@ -6407,7 +6406,6 @@ nfs4_state_shutdown_net(struct net *net)
 
 	nfsd4_client_tracking_exit(net);
 	nfs4_state_destroy_net(net);
-	nfs4_unlock_state();
 }
 
 void
-- 
1.9.3


* [PATCH v4 099/100] nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (97 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 098/100] nfsd: remove nfs4_lock_state: nfs4_state_shutdown_net Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-08 18:04 ` [PATCH v4 100/100] nfsd: add some comments to the nfsd4 object definitions Jeff Layton
  99 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4a5bcf256a1d..3396d246a437 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -80,9 +80,6 @@ static void nfs4_put_stateowner(struct nfs4_stateowner *sop);
 
 /* Locking: */
 
-/* Currently used for almost all code touching nfsv4 state: */
-static DEFINE_MUTEX(client_mutex);
-
 /*
  * Currently used for the del_recall_lru and file hash table.  In an
  * effort to decrease the scope of the client_mutex, this spinlock may
@@ -102,12 +99,6 @@ static struct kmem_cache *file_slab;
 static struct kmem_cache *stateid_slab;
 static struct kmem_cache *deleg_slab;
 
-void
-nfs4_lock_state(void)
-{
-	mutex_lock(&client_mutex);
-}
-
 static void free_session(struct nfsd4_session *);
 
 static bool is_session_dead(struct nfsd4_session *ses)
@@ -123,12 +114,6 @@ static __be32 mark_session_dead_locked(struct nfsd4_session *ses, int ref_held_b
 	return nfs_ok;
 }
 
-void
-nfs4_unlock_state(void)
-{
-	mutex_unlock(&client_mutex);
-}
-
 static bool is_client_expired(struct nfs4_client *clp)
 {
 	return clp->cl_time == 0;
-- 
1.9.3


* [PATCH v4 100/100] nfsd: add some comments to the nfsd4 object definitions
  2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
                   ` (98 preceding siblings ...)
  2014-07-08 18:04 ` [PATCH v4 099/100] nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers Jeff Layton
@ 2014-07-08 18:04 ` Jeff Layton
  2014-07-10  7:41   ` Christoph Hellwig
  99 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-08 18:04 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Add some comments that describe what each of these objects is, and how
they relate to one another.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
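The new comments describe nfs4_stid as a common object embedded in the more
specific stateid objects, with nfs4_delegation keeping it as its first field.
A minimal userspace sketch of that embedding pattern follows. The demo_*
types are trimmed-down hypothetical stand-ins, and demo_container_of() is a
userspace equivalent of the kernel's container_of() macro.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down stand-ins for nfs4_stid / nfs4_delegation. */
struct demo_stid {
	int sc_type;
};

struct demo_delegation {
	struct demo_stid dl_stid;	/* must be first field */
	int dl_type;
};

/* Recover the specific object from a pointer to its generic part. */
static struct demo_delegation *demo_delegstateid(struct demo_stid *s)
{
	return demo_container_of(s, struct demo_delegation, dl_stid);
}
```

Because the generic part is the first member, a pointer to it has the same
address as the enclosing object, so generic lookup code can hand back a
demo_stid and callers can convert it once the type is known.
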
 fs/nfsd/netns.h |  8 +++++
 fs/nfsd/state.h | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 92 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 3831ef6e5c75..46680da55cd7 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -34,6 +34,14 @@
 struct cld_net;
 struct nfsd4_client_tracking_ops;
 
+/*
+ * Represents a nfsd "container". With respect to nfsv4 state tracking, the
+ * fields of interest are the *_id_hashtbls and the *_name_tree. These track
+ * the nfs4_client objects by either short or long form clientid.
+ *
+ * Each nfsd_net runs a nfs4_laundromat workqueue job every lease period to
+ * clean up expired clients and delegations within the container.
+ */
 struct nfsd_net {
 	struct cld_net *cld_net;
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index c39ff5e1509f..806225fcda11 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -72,6 +72,11 @@ struct nfsd4_callback {
 	bool cb_done;
 };
 
+/*
+ * A core object that represents a "common" stateid. These are generally
+ * embedded within the different (more specific) stateid objects and contain
+ * fields that are of general use to any stateid.
+ */
 struct nfs4_stid {
 	atomic_t sc_count;
 #define NFS4_OPEN_STID 1
@@ -90,6 +95,18 @@ struct nfs4_stid {
 	void (*sc_free)(struct nfs4_stid *);
 };
 
+/*
+ * Represents a delegation stateid. The nfs4_client holds references to these
+ * and they are put when it is being destroyed or when the delegation is
+ * returned by the client.
+ *
+ * If the server attempts to recall a delegation and the client doesn't do so
+ * before a timeout, the server may also revoke the delegation. In that case,
+ * the object will either be destroyed (v4.0) or moved to a per-client list of
+ * revoked delegations (v4.1+).
+ *
+ * This object is a superset of the nfs4_stid.
+ */
 struct nfs4_delegation {
 	struct nfs4_stid	dl_stid; /* must be first field */
 	struct list_head	dl_perfile;
@@ -197,6 +214,11 @@ struct nfsd4_conn {
 	unsigned char cn_flags;
 };
 
+/*
+ * Representation of a v4.1+ session. These are refcounted in a similar fashion
+ * to the nfs4_client. References are only taken when the server is actively
+ * working on the object (primarily during the processing of compounds).
+ */
 struct nfsd4_session {
 	atomic_t		se_ref;
 	struct list_head	se_hash;	/* hash by sessionid */
@@ -226,13 +248,30 @@ struct nfsd4_sessionid {
 
 /*
  * struct nfs4_client - one per client.  Clientids live here.
- * 	o Each nfs4_client is hashed by clientid.
  *
- * 	o Each nfs4_clients is also hashed by name 
- * 	  (the opaque quantity initially sent by the client to identify itself).
+ * The initial object created by an NFS client using SETCLIENTID (for NFSv4.0)
+ * or EXCHANGE_ID (for NFSv4.1+). These objects are refcounted and timestamped.
+ * Each nfsd_net object contains a set of these and they are tracked via
+ * short and long form clientid. They are hashed and searched for under the
+ * per-nfsd_net client_lock spinlock.
+ *
+ * References to it are only held during the processing of compounds, and in
+ * certain other operations. In their "resting state" they have a refcount of
+ * 0. If they are not renewed within a lease period, they become eligible for
+ * destruction by the laundromat.
+ *
+ * These objects can also be destroyed prematurely by the fault injection code,
+ * or if the client sends certain forms of SETCLIENTID or EXCHANGE_ID updates.
+ * Care is taken *not* to do this however when the objects have an elevated
+ * refcount.
+ *
+ * o Each nfs4_client is hashed by clientid
+ *
+ * o Each nfs4_client is also hashed by name (the opaque quantity initially
+ *   sent by the client to identify itself).
  * 	  
- *	o cl_perclient list is used to ensure no dangling stateowner references
- *	  when we expire the nfs4_client
+ * o cl_perclient list is used to ensure no dangling stateowner references
+ *   when we expire the nfs4_client
  */
 struct nfs4_client {
 	struct list_head	cl_idhash; 	/* hash by cl_clientid.id */
@@ -335,6 +374,12 @@ struct nfs4_replay {
 	char			rp_ibuf[NFSD4_REPLAY_ISIZE];
 };
 
+/*
+ * A core object that represents either an open or lock owner. The open and
+ * lock owner objects have one of these embedded within them. Refcounts and
+ * other fields common to both owner types are contained within these
+ * structures.
+ */
 struct nfs4_stateowner {
 	struct list_head        so_strhash;   /* hash by op_name */
 	struct list_head        so_stateids;
@@ -351,6 +396,12 @@ struct nfs4_stateowner {
 	void (*so_unhash)(struct nfs4_stateowner *);
 };
 
+/*
+ * When a file is opened, the client provides an open state owner opaque string
+ * that indicates the "owner" of that open. These objects are refcounted.
+ * References to it are held by each open state associated with it. This object
+ * is a superset of the nfs4_stateowner struct.
+ */
 struct nfs4_openowner {
 	struct nfs4_stateowner	oo_owner; /* must be first field */
 	struct list_head        oo_perclient;
@@ -368,6 +419,12 @@ struct nfs4_openowner {
 	unsigned char		oo_flags;
 };
 
+/*
+ * Represents a generic "lockowner". Similar to an openowner. References to it
+ * are held by the lock stateids that are created on its behalf. This object is
+ * a superset of the nfs4_stateowner struct (or would be if it needed any extra
+ * fields).
+ */
 struct nfs4_lockowner {
 	struct nfs4_stateowner	lo_owner; /* must be first element */
 };
@@ -382,7 +439,14 @@ static inline struct nfs4_lockowner * lockowner(struct nfs4_stateowner *so)
 	return container_of(so, struct nfs4_lockowner, lo_owner);
 }
 
-/* nfs4_file: a file opened by some number of (open) nfs4_stateowners. */
+/*
+ * nfs4_file: a file opened by some number of (open) nfs4_stateowners.
+ *
+ * These objects are global. nfsd only keeps one instance of a nfs4_file per
+ * inode (though it may keep multiple file descriptors open per inode). These
+ * are tracked in the file_hashtbl which is protected by the state_lock
+ * spinlock.
+ */
 struct nfs4_file {
 	atomic_t		fi_ref;
 	spinlock_t		fi_lock;
@@ -407,7 +471,20 @@ struct nfs4_file {
 	bool			fi_had_conflict;
 };
 
-/* "ol" stands for "Open or Lock".  Better suggestions welcome. */
+/*
+ * A generic struct representing either an open or lock stateid. The nfs4_client
+ * holds a reference to each of these objects, and they in turn hold a
+ * reference to their respective stateowners. The client's reference is
+ * released in response to a close or unlock (depending on whether it's an open
+ * or lock stateid) or when the client is being destroyed.
+ *
+ * In the case of v4.0 open stateids, these objects are preserved for a little
+ * while after close in order to handle CLOSE replays. Those are eventually
+ * reclaimed via a LRU scheme by the laundromat.
+ *
+ * This object is a superset of the nfs4_stid. "ol" stands for "Open or Lock".
+ * Better suggestions welcome.
+ */
 struct nfs4_ol_stateid {
 	struct nfs4_stid    st_stid; /* must be first field */
 	struct list_head              st_perfile;
-- 
1.9.3



* Re: [PATCH v4 100/100] nfsd: add some comments to the nfsd4 object definitions
  2014-07-08 18:04 ` [PATCH v4 100/100] nfsd: add some comments to the nfsd4 object definitions Jeff Layton
@ 2014-07-10  7:41   ` Christoph Hellwig
  0 siblings, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10  7:41 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v4 011/100] nfsd: refactor nfs4_file_get_access and nfs4_file_put_access
  2014-07-08 18:02 ` [PATCH v4 011/100] nfsd: refactor nfs4_file_get_access and nfs4_file_put_access Jeff Layton
@ 2014-07-10  7:59   ` Christoph Hellwig
  2014-07-10 11:32     ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10  7:59 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

> +static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
>  {
> +	int oflag = nfs4_access_to_omode(access);
> +
> +	/* Note: relies on NFS4_SHARE_ACCESS_BOTH == READ|WRITE */
> +	access &= NFS4_SHARE_ACCESS_BOTH;
> +	if (access == 0)
> +		return;
> +
>  	if (oflag == O_RDWR) {

This fragment looks odd to me in several ways.

For one, NFS4_SHARE_ACCESS_BOTH isn't == READ|WRITE, although reading it
again I suspect this is supposed to mean
NFS4_SHARE_ACCESS_READ|NFS4_SHARE_ACCESS_WRITE.  Second, why do the &=
on access if it's not used except for the test, or for that matter
why don't we do the check on the oflag?

I can see two sensible ways to do this:

a)

static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
{
	int oflag = nfs4_access_to_omode(access);

	if (oflag == O_RDWR) {
		__nfs4_file_get_access(fp, O_RDONLY);
		__nfs4_file_get_access(fp, O_WRONLY);
	} else if (oflag == O_RDONLY || oflag == O_WRONLY) {
		__nfs4_file_get_access(fp, oflag);
	}
}

Or even better, just avoid the nfs4_access_to_omode call altogether:

static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
{
	WARN_ON_ONCE(access & ~NFS4_SHARE_ACCESS_BOTH);

	if (access & NFS4_SHARE_ACCESS_WRITE)
		__nfs4_file_get_access(fp, O_WRONLY);
	if (access & NFS4_SHARE_ACCESS_READ)
		__nfs4_file_get_access(fp, O_RDONLY);
}

Same for the put side.
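
For illustration only, here is a user-space sketch of the bitmask variant
together with its put-side mirror. It is not the nfsd code: struct
access_model, model_get_access() and model_put_access() are made-up names,
the counters are plain ints rather than atomic_t, and no locking is shown.
The NFS4_SHARE_ACCESS_* values are the protocol constants.

```c
#include <assert.h>

/* Share access values as defined by the NFSv4 protocol. */
#define NFS4_SHARE_ACCESS_READ   0x1u
#define NFS4_SHARE_ACCESS_WRITE  0x2u
#define NFS4_SHARE_ACCESS_BOTH   0x3u

/* Hypothetical indices standing in for O_RDONLY / O_WRONLY. */
enum { MODEL_RDONLY = 0, MODEL_WRONLY = 1 };

/* Hypothetical stand-in for the fi_access[] part of struct nfs4_file. */
struct access_model {
	int access[2];
};

/* Take one reference per access bit that is set. */
static void model_get_access(struct access_model *fp, unsigned int access)
{
	assert((access & ~NFS4_SHARE_ACCESS_BOTH) == 0);

	if (access & NFS4_SHARE_ACCESS_WRITE)
		fp->access[MODEL_WRONLY]++;
	if (access & NFS4_SHARE_ACCESS_READ)
		fp->access[MODEL_RDONLY]++;
}

/* Mirror image: drop one reference per access bit that is set. */
static void model_put_access(struct access_model *fp, unsigned int access)
{
	assert((access & ~NFS4_SHARE_ACCESS_BOTH) == 0);

	if (access & NFS4_SHARE_ACCESS_WRITE)
		fp->access[MODEL_WRONLY]--;
	if (access & NFS4_SHARE_ACCESS_READ)
		fp->access[MODEL_RDONLY]--;
}
```

With this shape, ACCESS_BOTH needs no special case on either side, since it
is simply both bits set.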


Btw, what is the story about the third fd in fi_fds? Seems like
nfs4_get_vfs_file can put a file pointer in there, but
nfs4_file_get_access never grabs a reference to it.


* Re: [PATCH v4 012/100] nfsd: remove nfs4_file_put_fd
  2014-07-08 18:03 ` [PATCH v4 012/100] nfsd: remove nfs4_file_put_fd Jeff Layton
@ 2014-07-10  8:03   ` Christoph Hellwig
  2014-07-10 11:49     ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10  8:03 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

On Tue, Jul 08, 2014 at 02:03:00PM -0400, Jeff Layton wrote:
> ...and replace it with a simple swap call.

While trying to understand this code I'm failing to grasp the point
of the fi_access counters.  Why can't we simply grab a reference to
the file every time we increment fi_access, and simply drop it every time
we decrement it?

Note that the comment explaining fi_access also seems wrong, as it
suggests contributing 0-4 to the counts, but at best we increment each
one by 1 in a single operation.

Also as you touch this area big time:  I think fi_fds should be renamed
to fi_fils or similar as we point to file structures and not fds.


* Re: [PATCH v4 013/100] nfsd: shrink st_access_bmap and st_deny_bmap
  2014-07-08 18:03 ` [PATCH v4 013/100] nfsd: shrink st_access_bmap and st_deny_bmap Jeff Layton
@ 2014-07-10  8:04   ` Christoph Hellwig
  2014-07-10 10:50   ` Christoph Hellwig
  1 sibling, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10  8:04 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

On Tue, Jul 08, 2014 at 02:03:01PM -0400, Jeff Layton wrote:
> We never use anything above bit #3, so an unsigned long for each is
> wasteful. Shrink them to a char each, and add some WARN_ON_ONCE calls if
> we try to set or clear bits that would go outside those sizes.
> 
> Note too that because atomic bitops work on unsigned longs, we have to
> abandon their use here. That shouldn't be a problem though since we
> don't really care about the atomicity in this code anyway. Using them
> was just a convenient way to flip bits.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v4 014/100] nfsd: set stateid access and deny bits in nfs4_get_vfs_file
  2014-07-08 18:03 ` [PATCH v4 014/100] nfsd: set stateid access and deny bits in nfs4_get_vfs_file Jeff Layton
@ 2014-07-10  8:34   ` Christoph Hellwig
  2014-07-10 15:05     ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10  8:34 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

> @@ -3343,20 +3347,15 @@ out:
>  static __be32
>  nfs4_upgrade_open(struct svc_rqst *rqstp, struct nfs4_file *fp, struct svc_fh *cur_fh, struct nfs4_ol_stateid *stp, struct nfsd4_open *open)
>  {
> -	u32 op_share_access = open->op_share_access;
>  	__be32 status;
>  
> -	if (!test_access(op_share_access, stp))
> -		status = nfs4_get_vfs_file(rqstp, fp, cur_fh, open);
> +	if (!test_access(open->op_share_access, stp))
> +		status = nfs4_get_vfs_file(rqstp, fp, cur_fh, stp, open);
>  	else
>  		status = nfsd4_truncate(rqstp, cur_fh, open);
>  
>  	if (status)
>  		return status;
> -
> -	/* remember the open */
> -	set_access(op_share_access, stp);
> -	set_deny(open->op_share_deny, stp);
>  	return nfs_ok;

This function is trivial enough now to be merged into the only caller,
especially as that caller actually has another call to nfs4_get_vfs_file
right next to it in another branch.


>  	} else {
> -		status = nfs4_get_vfs_file(rqstp, fp, current_fh, open);
> -		if (status)
> -			goto out;
>  		stp = open->op_stp;
>  		open->op_stp = NULL;
>  		init_open_stateid(stp, fp, open);
> +		status = nfs4_get_vfs_file(rqstp, fp, current_fh, stp, open);
> +		if (status) {
> +			release_open_stateid(stp);
> +			goto out;
> +		}

I can't find a place where we set the access bits before here.



* Re: [PATCH v4 016/100] nfsd: always hold the fi_lock when bumping fi_access refcounts
  2014-07-08 18:03 ` [PATCH v4 016/100] nfsd: always hold the fi_lock when bumping fi_access refcounts Jeff Layton
@ 2014-07-10  8:51   ` Christoph Hellwig
  2014-07-10 12:20     ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10  8:51 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

Any reason to keep the access counts atomic with this?



* Re: [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny
  2014-07-08 18:03 ` [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny Jeff Layton
@ 2014-07-10 10:31   ` Christoph Hellwig
  2014-07-10 12:19     ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 10:31 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

>  static void
> -reset_union_bmap_deny(unsigned long deny, struct nfs4_ol_stateid *stp)
> +reset_union_bmap_deny(u32 deny, struct nfs4_ol_stateid *stp)
>  {
>  	int i;
> -	for (i = 0; i < 4; i++) {
> +
> +	for (i = 1; i < 4; i++) {
>  		if ((i & deny) != i)
>  			clear_deny(i, stp);
>  	}


Couldn't this simply be written as:

	stp->st_deny_bmap &= ~deny;

and inlined into the caller?


* Re: [PATCH v4 009/100] nfsd: Add locking to the nfs4_file->fi_fds[] array
  2014-07-08 18:02 ` [PATCH v4 009/100] nfsd: Add locking to the nfs4_file->fi_fds[] array Jeff Layton
@ 2014-07-10 10:32   ` Christoph Hellwig
  0 siblings, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 10:32 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

On Tue, Jul 08, 2014 at 02:02:57PM -0400, Jeff Layton wrote:
> From: Trond Myklebust <trond.myklebust@primarydata.com>
> 
> Preparation for removal of the client_mutex, which currently protects
> this array.

Looks good, although it might be a bit cleaner if you fold the addition
of the _locked version in a later patch into this one.

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v4 017/100] nfsd: make deny mode enforcement more efficient and close races in it
  2014-07-08 18:03 ` [PATCH v4 017/100] nfsd: make deny mode enforcement more efficient and close races in it Jeff Layton
@ 2014-07-10 10:49   ` Christoph Hellwig
  2014-07-10 12:36     ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 10:49 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

> +__nfs4_file_get_access(struct nfs4_file *fp, u32 access)
>  {
> -	WARN_ON_ONCE(!(fp->fi_fds[oflag] || fp->fi_fds[O_RDWR]));
> -	atomic_inc(&fp->fi_access[oflag]);
> +	int oflag = nfs4_access_to_omode(access);
> +
> +	if (oflag == O_RDWR) {
> +		atomic_inc(&fp->fi_access[O_RDONLY]);
> +		atomic_inc(&fp->fi_access[O_WRONLY]);
> +	} else
> +		atomic_inc(&fp->fi_access[oflag]);
>  }

Might be worth simply killing the old, simple
__nfs4_file_get_access/__nfs4_file_put_access helpers that just wrap
the atomic ops in your earlier patch instead of reshuffling everything
here again.

> +nfs4_file_get_access(struct nfs4_file *fp, u32 access)
>  {
>  	lockdep_assert_held(&fp->fi_lock);
>  
>  	/* Note: relies on NFS4_SHARE_ACCESS_BOTH == READ|WRITE */
>  	access &= NFS4_SHARE_ACCESS_BOTH;
> +
> +	/* Does this access mask make sense? */
>  	if (access == 0)
> +		return nfserr_inval;
>  
> +	/* Does it conflict with a deny mode already set? */
> +	if ((access & fp->fi_share_deny) != 0)
> +		return nfserr_share_denied;
> +
> +	__nfs4_file_get_access(fp, access);
> +	return nfs_ok;

Why bother clearing out the inval bits from access?  Just do a:

	if (access & ~NFS4_SHARE_ACCESS_BOTH)
		return nfserr_inval;

Also, that check doesn't really belong in here; it might be better to add
it to the initial nfs4_file_get_access refactor.

> +static __be32 nfs4_file_check_deny(struct nfs4_file *fp, u32 deny)
> +{
> +	/* Common case is that there is no deny mode. */
> +	deny &= NFS4_SHARE_DENY_BOTH;
> +	if (deny) {

No need to reject invalid ones here unlike the access case?  Any reason
to handle them differently?

Otherwise again no need to clear them out, that just makes the code
harder to read.

> +		/* Note: relies on NFS4_SHARE_DENY_BOTH == READ|WRITE */

READ should be NFS4_SHARE_DENY_READ and same for write I guess?  I don't
really think you need these comments anyway, as this is part of the
protocol.

>  
> +/*
> + * A stateid that had a deny mode associated with it is being released
> + * or downgraded. Recalculate the deny mode on the file.
> + */
> +static void
> +recalculate_deny_mode(struct nfs4_file *fp)
> +{
> +	struct nfs4_ol_stateid *stp;
> +
> +	spin_lock(&fp->fi_lock);
> +	fp->fi_share_deny = 0;
> +	list_for_each_entry(stp, &fp->fi_stateids, st_perfile)
> +		fp->fi_share_deny |= bmap_to_share_mode(stp->st_deny_bmap);

Seems like bmap_to_share_mode is superfluous with your change of
st_deny_bmap to a char, this could become:

		fp->fi_share_deny |= stp->st_deny_bmap;

>  /* release all access and file references for a given stateid */
>  static void
>  release_all_access(struct nfs4_ol_stateid *stp)
>  {
>  	int i;
> +	struct nfs4_file *fp = stp->st_file;
> +
> +	if (fp && stp->st_deny_bmap != 0)
> +		recalculate_deny_mode(fp);

Can fp be zero here?

> +out_put_access:
> +	stp->st_access_bmap = old_access_bmap;
> +	nfs4_file_put_access(fp, open->op_share_access);
> +	reset_union_bmap_deny(bmap_to_share_mode(old_deny_bmap), stp);

Instead of setting stp->st_access_bmap to the old bmap and then passing
it to reset_union_bmap_deny just pass the bitmap there directly?  Or
just kill reset_union_bmap_deny and inline it given how simple it should have
become with my previous comments addressed.


>  static __be32
>  nfs4_upgrade_open(struct svc_rqst *rqstp, struct nfs4_file *fp, struct svc_fh *cur_fh, struct nfs4_ol_stateid *stp, struct nfsd4_open *open)
>  {
>  	__be32 status;
> +	unsigned char old_deny_bmap;
>  
>  	if (!test_access(open->op_share_access, stp))
> +		return nfs4_get_vfs_file(rqstp, fp, cur_fh, stp, open);
>  
> +	/* test and set deny mode */
> +	spin_lock(&fp->fi_lock);
> +	status = nfs4_file_check_deny(fp, open->op_share_deny);
> +	if (status == nfs_ok) {
> +		old_deny_bmap = stp->st_deny_bmap;
> +		set_deny(open->op_share_deny, stp);

Maybe set_deny should return the old bitmap given that quite a few
callers care?  Maybe the set_deny should even move into
nfs4_file_check_deny, which could return the old one, making it a
check-and-set operation.

> @@ -4603,7 +4673,7 @@ static void get_lock_access(struct nfs4_ol_stateid *lock_stp, u32 access)
>  
>  	if (test_access(access, lock_stp))
>  		return;
> -	nfs4_file_get_access(fp, access);
> +	__nfs4_file_get_access(fp, access);
>  	set_access(access, lock_stp);

Why do we not bother with checking the deny mode here?



* Re: [PATCH v4 013/100] nfsd: shrink st_access_bmap and st_deny_bmap
  2014-07-08 18:03 ` [PATCH v4 013/100] nfsd: shrink st_access_bmap and st_deny_bmap Jeff Layton
  2014-07-10  8:04   ` Christoph Hellwig
@ 2014-07-10 10:50   ` Christoph Hellwig
  2014-07-10 12:36     ` Jeff Layton
  1 sibling, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 10:50 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

>  static inline void
>  set_access(u32 access, struct nfs4_ol_stateid *stp)

I suspect it might be a good idea to kill all these wrappers, as doing
higher level bitops seems to be beneficial to a lot of the callers.



* Re: [PATCH v4 018/100] nfsd: cleanup and rename nfs4_check_open
  2014-07-08 18:03 ` [PATCH v4 018/100] nfsd: cleanup and rename nfs4_check_open Jeff Layton
@ 2014-07-10 10:51   ` Christoph Hellwig
  0 siblings, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 10:51 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

On Tue, Jul 08, 2014 at 02:03:06PM -0400, Jeff Layton wrote:
> Rename it to better describe what it does, and have it just return the
> stateid instead of a __be32 (which is now always nfs_ok). Also, do the
> search for an existing stateid after the delegation check, to reduce
> cleanup if the delegation check returns error.
> 
> Signed-off-by: Jeff Layton <jlayton@primarydata.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v4 002/100] nfsd: reduce some spinlocking in put_client_renew
  2014-07-08 18:02 ` [PATCH v4 002/100] nfsd: reduce some spinlocking in put_client_renew Jeff Layton
@ 2014-07-10 11:18   ` Christoph Hellwig
  0 siblings, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 11:18 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

On Tue, Jul 08, 2014 at 02:02:50PM -0400, Jeff Layton wrote:
> No need to take the lock unless the count goes to 0.
> 
> Signed-off-by: Jeff Layton <jlayton@primarydata.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v4 003/100] nfsd: Ensure stateids remain unique until they are freed
  2014-07-08 18:02 ` [PATCH v4 003/100] nfsd: Ensure stateids remain unique until they are freed Jeff Layton
@ 2014-07-10 11:23   ` Christoph Hellwig
  2014-07-10 12:43     ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 11:23 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

On Tue, Jul 08, 2014 at 02:02:51PM -0400, Jeff Layton wrote:
> From: Trond Myklebust <trond.myklebust@primarydata.com>
> 
> Add an extra delegation state to allow the stateid to remain in the idr
> tree until the last reference has been released. This will be necessary
> to ensure uniqueness once the client_mutex is removed.
> 
> [jlayton: reset the sc_type under the state_lock in unhash_delegation]

I'd be tempted to instead have a closed flag, there's plenty of space in
the hole after sc_type anyway.

The rationale for that is that a stateid really shouldn't change the
type during its lifetime, and callers that specify the type to look
up shouldn't bother with looking up different types due to this either.

NFS4_REVOKED_DELEG_STID would also be replaced by a revoked flag, which
makes much more sense to start with as well.
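
To make the idea concrete, a rough user-space sketch of what a flag-based
layout might look like is below. Everything in it is hypothetical: the
struct stid_model type, the closed/revoked field names, the MODEL_*_STID
values and model_find_stid() are invented for illustration and are not
taken from the patch. Whether a lookup should skip closed entries is also
just an assumption made here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type bits; a stateid keeps its type for its whole life. */
#define MODEL_OPEN_STID   0x1
#define MODEL_LOCK_STID   0x2
#define MODEL_DELEG_STID  0x4

/* Hypothetical stand-in for struct nfs4_stid with flags instead of states. */
struct stid_model {
	unsigned char type;	/* never changes after allocation */
	bool closed;		/* hypothetical: unhashed but not yet freed */
	bool revoked;		/* hypothetical: delegation was revoked */
};

/*
 * Look up by type mask only. Closed entries stay in the table (so their
 * ids remain unique) but are not returned to callers.
 */
static struct stid_model *
model_find_stid(struct stid_model *tbl, size_t n, unsigned char typemask)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if ((tbl[i].type & typemask) && !tbl[i].closed)
			return &tbl[i];
	}
	return NULL;
}
```

The point of the sketch is only that callers keep passing the same type
mask, and the "closed" and "revoked" conditions become orthogonal to it.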

Btw, do you also plan to keep open stateids as NFS4_CLOSED_STID for
4.1+?  In that case the comment there would need an update.  What about
lock stateids?



* Re: [PATCH v4 005/100] nfsd: Move the delegation reference counter into the struct nfs4_stid
  2014-07-08 18:02 ` [PATCH v4 005/100] nfsd: Move the delegation reference counter into the struct nfs4_stid Jeff Layton
@ 2014-07-10 11:28   ` Christoph Hellwig
  0 siblings, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 11:28 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

On Tue, Jul 08, 2014 at 02:02:53PM -0400, Jeff Layton wrote:
> From: Trond Myklebust <trond.myklebust@primarydata.com>
> 
> We will want to add reference counting to the lock stateid and open
> stateids too in later patches.
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v4 011/100] nfsd: refactor nfs4_file_get_access and nfs4_file_put_access
  2014-07-10  7:59   ` Christoph Hellwig
@ 2014-07-10 11:32     ` Jeff Layton
  2014-07-10 11:35       ` Christoph Hellwig
  0 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 11:32 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Thu, 10 Jul 2014 00:59:20 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> > +static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
> >  {
> > +	int oflag = nfs4_access_to_omode(access);
> > +
> > +	/* Note: relies on NFS4_SHARE_ACCESS_BOTH == READ|WRITE */
> > +	access &= NFS4_SHARE_ACCESS_BOTH;
> > +	if (access == 0)
> > +		return;
> > +
> >  	if (oflag == O_RDWR) {
> 
> This fragment looks odd to me in several ways.
> 
> > For one, NFS4_SHARE_ACCESS_BOTH isn't == READ|WRITE, although reading it
> > again I suspect this is supposed to mean
> > NFS4_SHARE_ACCESS_READ|NFS4_SHARE_ACCESS_WRITE.

Yeah, that's correct. I probably shouldn't abbreviate those, sorry...

> Second, why do the &=
> on access if it's not used except for the test, or for that matter
> why don't we do the check on the oflag?
> 

That &= made more sense in Trond's original patch, but it probably can
be removed now.

> I can see two sensible ways to do this:
> 
> a)
> 
> static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
> {
> 	int oflag = nfs4_access_to_omode(access);
> 
> 	if (oflag == O_RDWR) {
> 		__nfs4_file_get_access(fp, O_RDONLY);
> 		__nfs4_file_get_access(fp, O_WRONLY);
> 	} else if (oflag == O_RDONLY || oflag == O_RDONLY)
> 		__nfs4_file_get_access(fp, oflag);
> 	}
> }
> 
> > Or even better, just avoid the nfs4_access_to_omode call altogether:
> 
> static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
> {
> 	WARN_ON_ONCE(access & ~NFS4_SHARE_ACCESS_BOTH);
> 
> 	if (access & NFS4_SHARE_ACCESS_WRITE)
> 		__nfs4_file_get_access(fp, O_WRONLY);
> 	if (access & NFS4_SHARE_ACCESS_READ)
> 		__nfs4_file_get_access(fp, O_RDONLY);
> }
> 
> Same for the put side.
> 


Yeah, that second one looks fine. I'll change the patch to do that
here, but I'm not sure if I'll need to morph that a bit in the later
patches.

> 
> Btw, what is the story about the third fd in fi_fds? Seems like
> nfs4_get_vfs_file can put a file pointer in there, but
> nfs4_file_get_access never grabs a reference to it.

This code is just plain odd altogether, but it does make a perverse
sort of sense. Here's the (patched) __nfs4_file_put_access:

        if (atomic_dec_and_lock(&fp->fi_access[oflag], &fp->fi_lock)) {
                struct file *f1 = NULL;
                struct file *f2 = NULL;

                swap(f1, fp->fi_fds[oflag]);
                if (atomic_read(&fp->fi_access[1 - oflag]) == 0)
                        swap(f2, fp->fi_fds[O_RDWR]);
                spin_unlock(&fp->fi_lock);

So, we decrement the fi_access counter on the O_*ONLY flag that we want
to put. If it goes to zero, then we check the other O_*ONLY counter and
if it's also zero, we fput the O_RDWR file.

So, you're correct that we never take an fi_access reference for O_RDWR,
which is why the array doesn't have a slot for it:

        atomic_t                fi_access[2];

...it's tracked by the union of the O_RDONLY and O_WRONLY counters.
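
A simplified user-space model of that release rule, for illustration only.
It is a sketch under assumptions, not the kernel code: struct fds_model and
model_put_access() are made-up names, strings stand in for struct file
pointers, a "released" counter stands in for fput(), and the locking done
via atomic_dec_and_lock() is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical indices standing in for O_RDONLY, O_WRONLY and O_RDWR. */
enum { MODEL_RDONLY = 0, MODEL_WRONLY = 1, MODEL_RDWR = 2 };

struct fds_model {
	int access[2];		/* no counter for the RDWR slot */
	const char *fds[3];	/* stand-ins for struct file pointers */
	int released;		/* number of slots released so far */
};

/* Drop one access reference for oflag (RDONLY or WRONLY only). */
static void model_put_access(struct fds_model *fp, int oflag)
{
	assert(oflag == MODEL_RDONLY || oflag == MODEL_WRONLY);
	assert(fp->access[oflag] > 0);

	if (--fp->access[oflag] != 0)
		return;

	/* Last user of this mode: its own slot can go. */
	if (fp->fds[oflag]) {
		fp->fds[oflag] = NULL;
		fp->released++;
	}

	/* The RDWR slot goes only once the other counter is zero as well. */
	if (fp->access[1 - oflag] == 0 && fp->fds[MODEL_RDWR]) {
		fp->fds[MODEL_RDWR] = NULL;
		fp->released++;
	}
}
```

Starting from one read and one write reference with only the RDWR slot
populated, the first put leaves the RDWR slot alone and the second one
releases it.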

-- 
Jeff Layton <jlayton@primarydata.com>


* Re: [PATCH v4 006/100] nfsd4: use cl_lock to synchronize all stateid idr calls
  2014-07-08 18:02 ` [PATCH v4 006/100] nfsd4: use cl_lock to synchronize all stateid idr calls Jeff Layton
@ 2014-07-10 11:32   ` Christoph Hellwig
  2014-07-10 12:45     ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 11:32 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

>  static void nfs4_free_stid(struct kmem_cache *slab, struct nfs4_stid *s)
> @@ -1266,7 +1271,9 @@ free_client(struct nfs4_client *clp)
>  	rpc_destroy_wait_queue(&clp->cl_cb_waitq);
>  	free_svc_cred(&clp->cl_cred);
>  	kfree(clp->cl_name.data);
> +	spin_lock(&clp->cl_lock);
>  	idr_destroy(&clp->cl_stateids);
> +	spin_unlock(&clp->cl_lock);
>  	kfree(clp);

Taking cl_lock in free_client looks wrong to me; the client had better be
removed from any data structures that allow looking it up at this
point.

Except for that looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v4 007/100] nfsd: Add fine grained protection for the nfs4_file->fi_stateids list
  2014-07-08 18:02 ` [PATCH v4 007/100] nfsd: Add fine grained protection for the nfs4_file->fi_stateids list Jeff Layton
@ 2014-07-10 11:33   ` Christoph Hellwig
  0 siblings, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 11:33 UTC (permalink / raw)
  To: Jeff Layton; +Cc: bfields, linux-nfs

On Tue, Jul 08, 2014 at 02:02:55PM -0400, Jeff Layton wrote:
> From: Trond Myklebust <trond.myklebust@primarydata.com>
> 
> Access to this list is currently serialized by the client_mutex. Add
> finer grained locking around this list in preparation for its removal.
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v4 011/100] nfsd: refactor nfs4_file_get_access and nfs4_file_put_access
  2014-07-10 11:32     ` Jeff Layton
@ 2014-07-10 11:35       ` Christoph Hellwig
  2014-07-10 12:49         ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 11:35 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs

On Thu, Jul 10, 2014 at 07:32:14AM -0400, Jeff Layton wrote:
> So, you're correct that we never take an fi_access reference for O_RDWR,
> which is why the array doesn't have a slot for it:
> 
>         atomic_t                fi_access[2];
> 
> ...it's tracked by the union of the O_RDONLY and O_WRONLY counters.

Which will be racy as long as we try to use an atomic_t and not a proper
lock over access to all of fi_access.

I'm still trying to understand why we even need fi_access and can't just
use the file references directly, though.



* Re: [PATCH v4 012/100] nfsd: remove nfs4_file_put_fd
  2014-07-10  8:03   ` Christoph Hellwig
@ 2014-07-10 11:49     ` Jeff Layton
  2014-07-10 12:05       ` Christoph Hellwig
  0 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 11:49 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Thu, 10 Jul 2014 01:03:36 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Jul 08, 2014 at 02:03:00PM -0400, Jeff Layton wrote:
> > ...and replace it with a simple swap call.
> 
> While trying to understand this code I'm failing to grasp the point
> of the fi_access counters.  Why can't we simply grab a reference to
> the file every time we increment fi_access, and simply drop it every time
> we decrement it?
> 

I'm not sure I understand what you're suggesting here. Can you
elaborate?

AFAIU, the fi_access counters tell us when we're able to zero out the
fi_fds slot. fput returns void so we don't have a way to know whether
we're putting the last reference to a file or not. Without that,
find_*_file become quite problematic -- how would we know whether the
pointers they return are still good?

> Note that the comment explaining fi_access also seems wrong, as it
> suggests contributing 0-4 to the counts, but at best we increment each
> one by 1 in a single operation.
> 

Right, we only increment by one for each operation, but open stateids
can be upgraded and downgraded. If we do:

OPEN for NFS4_SHARE_ACCESS_READ
OPEN for NFS4_SHARE_ACCESS_WRITE
OPEN for NFS4_SHARE_ACCESS_BOTH

...then we will have incremented the fi_access counts by a total of 4.

The big pain in the ass with this code is explained in the comment over
bmap_to_share_mode. We have to keep track of what share bits have been
previously used in order to comply with the RFC, so you can easily end
up with "extra" references.
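
To make the arithmetic concrete, here is a minimal userspace sketch of
the counting. It is not the kernel code: it only assumes what is said in
this thread, namely that fi_access has a read slot and a write slot and
that an access of BOTH bumps each of them. The names file_model,
model_get_access and model_total_access are made up for illustration.

```c
#include <assert.h>

/* Share access values as defined by the NFSv4 protocol. */
#define NFS4_SHARE_ACCESS_READ  1
#define NFS4_SHARE_ACCESS_WRITE 2
#define NFS4_SHARE_ACCESS_BOTH  3

/* Hypothetical stand-in for the two fi_access counters in struct nfs4_file. */
struct file_model {
	int fi_access[2];	/* [0] = readers, [1] = writers */
};

/* Take access references: BOTH contributes one to each counter. */
static void model_get_access(struct file_model *fp, unsigned int access)
{
	assert(access >= NFS4_SHARE_ACCESS_READ &&
	       access <= NFS4_SHARE_ACCESS_BOTH);
	if (access & NFS4_SHARE_ACCESS_READ)
		fp->fi_access[0]++;
	if (access & NFS4_SHARE_ACCESS_WRITE)
		fp->fi_access[1]++;
}

/* Total number of references held across both counters. */
static int model_total_access(const struct file_model *fp)
{
	return fp->fi_access[0] + fp->fi_access[1];
}
```

Feeding the three OPENs above through model_get_access leaves each
counter at 2, which is where the total of 4 comes from.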

> Also as you touch this area big time:  I think fi_fds should be renamed
> to fi_fils or similar as we point to file structures and not fds.

No objection to the renaming, I'll see if I can work that in on the
next respin.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 012/100] nfsd: remove nfs4_file_put_fd
  2014-07-10 11:49     ` Jeff Layton
@ 2014-07-10 12:05       ` Christoph Hellwig
  2014-07-10 13:15         ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 12:05 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs

On Thu, Jul 10, 2014 at 07:49:52AM -0400, Jeff Layton wrote:
> 
> AFAIU, the fi_access counters tell us when we're able to zero out the
> fi_fds slot. fput returns void so we don't have a way to know whether
> we're putting the last reference to a file or not. Without that,
> find_*_file become quite problematic -- how would we know whether the
> pointers they return are still good?

Oh right, that's definitely a good reason to keep the counters.

> > Note that the comment explaining fi_access also seems wrong, as it
> > suggests contributing 0-4 to the counts, but at most we increment each
> > one by 1 in a single operation.
> > 
> 
> Right, we only increment by one for each operation, but open stateids
> can be upgraded and downgraded. If we do:
> 
> OPEN for NFS4_SHARE_ACCESS_READ
> OPEN for NFS4_SHARE_ACCESS_WRITE
> OPEN for NFS4_SHARE_ACCESS_BOTH
> 
> ...then we will have incremented the fi_access counts by a total of 4.
> 
> The big pain in the ass with this code is explained in the comment over
> bmap_to_share_mode. We have to keep track of what share bits have been
> previously used in order to comply with the RFC, so you can easily end
> up with "extra" references.

I've read both the comment, and the section in the RFC, but I still
don't understand how they explain this convoluted code.

Assume we'd have three members in fi_access instead, indexed by
(oflag & (O_RDONLY|O_WRONLY|O_RDWR)):

static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
{
	int oflag = nfs4_access_to_omode(access);

	WARN_ON_ONCE(fp->fi_fds[oflag] == NULL);
	atomic_inc(&fp->fi_access[oflag]);
}

static void nfs4_file_put_access(struct nfs4_file *fp, u32 access)
{
	int oflag = nfs4_access_to_omode(access);

	if (atomic_dec_and_test(&fp->fi_access[oflag]))
		nfs4_file_put_fd(fp, oflag);
}

nfsd4_open_downgrade would become a little more complicated because it
would have to check if an FD for the downgraded access is around before
dropping the reference to the r/w one it currently holds. And as far as
I can tell O_RDWR to O_RDONLY or O_WRONLY is the only possible case for
it, feel free to correct me.

Or we could simply not bother with downgrading the underlying file here
at all, just like it's done for the case where only a O_RDWR file is
open in the current code, which probably is a fairly common case.

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny
  2014-07-10 10:31   ` Christoph Hellwig
@ 2014-07-10 12:19     ` Jeff Layton
  2014-07-10 12:43       ` Christoph Hellwig
  2014-07-10 13:16       ` Christoph Hellwig
  0 siblings, 2 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 12:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Thu, 10 Jul 2014 03:31:00 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> >  static void
> > -reset_union_bmap_deny(unsigned long deny, struct nfs4_ol_stateid *stp)
> > +reset_union_bmap_deny(u32 deny, struct nfs4_ol_stateid *stp)
> >  {
> >  	int i;
> > -	for (i = 0; i < 4; i++) {
> > +
> > +	for (i = 1; i < 4; i++) {
> >  		if ((i & deny) != i)
> >  			clear_deny(i, stp);
> >  	}
> 
> 
> Couldn't this simply be written as:
> 
> 	stp->st_deny_bmap &= ~deny;
> 
> and inlined into the caller?

No, the st_deny_bmap does not hold the NFS4_SHARE_DENY_* bits
themselves. It sets a bit for each NFS4_SHARE_DENY_* mode that the
stateid has used. The difference is subtle, but significant.

The rationale for that is in the comments above bmap_to_share_mode.
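
As a rough illustration of that encoding, here is a userspace sketch.
It assumes that bit N of st_deny_bmap records that deny mode N has been
used, which is how I read the helpers; stateid_model and the model_*
names are invented for this example and are not the kernel's.

```c
#include <assert.h>

#define NFS4_SHARE_DENY_READ  1
#define NFS4_SHARE_DENY_WRITE 2
#define NFS4_SHARE_DENY_BOTH  3

/* Hypothetical stand-in for the deny bitmap in struct nfs4_ol_stateid. */
struct stateid_model {
	unsigned char st_deny_bmap;	/* bit N set: deny mode N was used */
};

/* Record that deny mode 'deny' (1..3) has been used by this stateid. */
static void model_set_deny(struct stateid_model *stp, unsigned int deny)
{
	assert(deny >= NFS4_SHARE_DENY_READ && deny <= NFS4_SHARE_DENY_BOTH);
	stp->st_deny_bmap |= 1u << deny;
}

/* Forget every recorded mode that is not a subset of 'deny'. */
static void model_reset_union_bmap_deny(struct stateid_model *stp,
					unsigned int deny)
{
	unsigned int i;

	for (i = 1; i < 4; i++) {
		if ((i & deny) != i)
			stp->st_deny_bmap &= ~(1u << i);
	}
}
```

A stateid that has used DENY_READ and DENY_BOTH ends up with bits 1 and
3 set (0x0a). Resetting to DENY_READ has to clear bit 3, whereas masking
the bitmap with ~deny would only clear bit 0, which is never set.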

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 016/100] nfsd: always hold the fi_lock when bumping fi_access refcounts
  2014-07-10  8:51   ` Christoph Hellwig
@ 2014-07-10 12:20     ` Jeff Layton
  2014-07-10 12:43       ` Christoph Hellwig
  0 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 12:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Thu, 10 Jul 2014 01:51:07 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> Any reason to keep the access counts atomic with this?
> 

Yes. We don't have to hold the lock when putting refcounts if we do.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 017/100] nfsd: make deny mode enforcement more efficient and close races in it
  2014-07-10 10:49   ` Christoph Hellwig
@ 2014-07-10 12:36     ` Jeff Layton
  2014-07-10 12:45       ` Christoph Hellwig
  0 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 12:36 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Thu, 10 Jul 2014 03:49:27 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> > +__nfs4_file_get_access(struct nfs4_file *fp, u32 access)
> >  {
> > -	WARN_ON_ONCE(!(fp->fi_fds[oflag] || fp->fi_fds[O_RDWR]));
> > -	atomic_inc(&fp->fi_access[oflag]);
> > +	int oflag = nfs4_access_to_omode(access);
> > +
> > +	if (oflag == O_RDWR) {
> > +		atomic_inc(&fp->fi_access[O_RDONLY]);
> > +		atomic_inc(&fp->fi_access[O_WRONLY]);
> > +	} else
> > +		atomic_inc(&fp->fi_access[oflag]);
> >  }
> 
> Might be worth simply killing the old, simple
> __nfs4_file_get_access/__nfs4_file_put_access helpers that just wrap
> the atomic ops in your earlier patch instead of reshuffling everything
> here again.
> 

Maybe.

> > +nfs4_file_get_access(struct nfs4_file *fp, u32 access)
> >  {
> >  	lockdep_assert_held(&fp->fi_lock);
> >  
> >  	/* Note: relies on NFS4_SHARE_ACCESS_BOTH == READ|WRITE */
> >  	access &= NFS4_SHARE_ACCESS_BOTH;
> > +
> > +	/* Does this access mask make sense? */
> >  	if (access == 0)
> > +		return nfserr_inval;
> >  
> > +	/* Does it conflict with a deny mode already set? */
> > +	if ((access & fp->fi_share_deny) != 0)
> > +		return nfserr_share_denied;
> > +
> > +	__nfs4_file_get_access(fp, access);
> > +	return nfs_ok;
> 
> Why bother clearing out the inval bits from access?  Just do a:
> 
> 	if (access & ~NFS4_SHARE_ACCESS_BOTH)
> 		return nfserr_inval;
> 
> also that check doesn't really belong in here, might be better to add it
> to the initial nfs4_file_get_access refactor.
> 

Ok, good point.

> > +static __be32 nfs4_file_check_deny(struct nfs4_file *fp, u32 deny)
> > +{
> > +	/* Common case is that there is no deny mode. */
> > +	deny &= NFS4_SHARE_DENY_BOTH;
> > +	if (deny) {
> 
> No need to reject invalid ones here unlike the access case?  Any reason
> to handle them differently?
> 
> Otherwise again no need to clear them out, that just makes the code
> harder to read.
> 

Yeah, makes sense. I'll fix it to throw back an error if there are
extra bits set.

> > +		/* Note: relies on NFS4_SHARE_DENY_BOTH == READ|WRITE */
> 
> READ should be NFS4_SHARE_DENY_READ and same for write I guess?  I don't
> really think you need these comments anyway, as this is part of the
> protocol.
> 

Ok.

> >  
> > +/*
> > + * A stateid that had a deny mode associated with it is being released
> > + * or downgraded. Recalculate the deny mode on the file.
> > + */
> > +static void
> > +recalculate_deny_mode(struct nfs4_file *fp)
> > +{
> > +	struct nfs4_ol_stateid *stp;
> > +
> > +	spin_lock(&fp->fi_lock);
> > +	fp->fi_share_deny = 0;
> > +	list_for_each_entry(stp, &fp->fi_stateids, st_perfile)
> > +		fp->fi_share_deny |= bmap_to_share_mode(stp->st_deny_bmap);
> 
> Seems like bmap_to_share_mode is superfluous with your change of
> st_deny_bmap to a char, this could become:
> 
> 		fp->fi_share_deny |= stp->st_deny_bmap;
> 

No. Again, take a look at the comments above bmap_to_share_mode.

fi_share_deny will hold NFS4_SHARE_DENY_* bits that are the union of the
deny modes of all the stateids attached to the file, but the st_deny_bmap
has to track the individual NFS4_SHARE_DENY_* modes that have been used.
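
A small userspace sketch of that conversion, under the assumption that
each stateid's bitmap sets bit N for deny mode N. An array of bitmaps
stands in for the fi_stateids list, and the model_* names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a "modes used" bitmap into the OR of those deny modes. */
static unsigned int model_bmap_to_share_mode(unsigned char bmap)
{
	unsigned int i, mode = 0;

	for (i = 1; i < 4; i++) {
		if (bmap & (1u << i))
			mode |= i;
	}
	return mode;
}

/* Union of the deny modes of every stateid bitmap in the array. */
static unsigned int model_recalculate_deny(const unsigned char *bmaps,
					   size_t n)
{
	unsigned int deny = 0;
	size_t i;

	assert(bmaps != NULL || n == 0);
	for (i = 0; i < n; i++)
		deny |= model_bmap_to_share_mode(bmaps[i]);
	return deny;
}
```

With one stateid that used DENY_READ (bitmap 0x02) and another that used
DENY_WRITE (bitmap 0x04), the converted union is 3 (DENY_BOTH), while
OR-ing the raw bitmaps together would give 6, which is not a valid deny
mode.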

> >  /* release all access and file references for a given stateid */
> >  static void
> >  release_all_access(struct nfs4_ol_stateid *stp)
> >  {
> >  	int i;
> > +	struct nfs4_file *fp = stp->st_file;
> > +
> > +	if (fp && stp->st_deny_bmap != 0)
> > +		recalculate_deny_mode(fp);
> 
> Can fp be zero here?
> 

At this point in the series, no. After later patches go in, yes.

> > +out_put_access:
> > +	stp->st_access_bmap = old_access_bmap;
> > +	nfs4_file_put_access(fp, open->op_share_access);
> > +	reset_union_bmap_deny(bmap_to_share_mode(old_deny_bmap), stp);
> 
> Instead of setting stp->st_access_bmap to the old bmap and then passing
> it to reset_union_bmap_deny just pass the bitmap there directly?  Or
> just kill reset_union_bmap_deny and inline it given how simple it should have
> become with my previous comments addressed.
> 
> 

We can't get rid of reset_union_bmap_deny for the reasons I explained
above and we'd still need to do an extra conversion in
nfsd4_open_downgrade if we do that. I'd rather leave any extra work to
the error case (which is why I did it this way).

> >  static __be32
> >  nfs4_upgrade_open(struct svc_rqst *rqstp, struct nfs4_file *fp, struct svc_fh *cur_fh, struct nfs4_ol_stateid *stp, struct nfsd4_open *open)
> >  {
> >  	__be32 status;
> > +	unsigned char old_deny_bmap;
> >  
> >  	if (!test_access(open->op_share_access, stp))
> > +		return nfs4_get_vfs_file(rqstp, fp, cur_fh, stp, open);
> >  
> > +	/* test and set deny mode */
> > +	spin_lock(&fp->fi_lock);
> > +	status = nfs4_file_check_deny(fp, open->op_share_deny);
> > +	if (status == nfs_ok) {
> > +		old_deny_bmap = stp->st_deny_bmap;
> > +		set_deny(open->op_share_deny, stp);
> 
> Maybe set_deny should return the old bitmap given that quite a few
> callers care?  Maybe the set_deny should even move into
> nfs4_file_check_deny which could return the old one, making it an check
> and set operation.
> 

Yeah, I considered that, I'll see if that will help things, but it
might be best to leave that extra cleanup to a different patchset. This
set is all about removing the client_mutex, not so much about cleaning
up deny mode handling (even if fixing that atomicity is a necessary step).

> > @@ -4603,7 +4673,7 @@ static void get_lock_access(struct nfs4_ol_stateid *lock_stp, u32 access)
> >  
> >  	if (test_access(access, lock_stp))
> >  		return;
> > -	nfs4_file_get_access(fp, access);
> > +	__nfs4_file_get_access(fp, access);
> >  	set_access(access, lock_stp);
> 
> Why do we not bother with checking the deny mode here?
> 

Lock stateids inherit the access and deny modes from the open stateids.
The open stateid has already checked both the access and deny modes, so
there's no need to do it again.

Doing so also becomes problematic with the change to use fi_share_deny.
If an open sets an fi_share_deny bit, then we'd have to account for
that here since the lock stateid associated with that open should not
conflict with that bit. Best to just avoid trying to check deny bits in
the LOCK codepath.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 013/100] nfsd: shrink st_access_bmap and st_deny_bmap
  2014-07-10 10:50   ` Christoph Hellwig
@ 2014-07-10 12:36     ` Jeff Layton
  0 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 12:36 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Thu, 10 Jul 2014 03:50:19 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> >  static inline void
> >  set_access(u32 access, struct nfs4_ol_stateid *stp)
> 
> I suspect it might be a good idea to kill all these wrappers, as doing
> higher level bitops seems to be beneficial to a lot of the callers.
> 

Maybe. I'd prefer to leave that sort of cleanup until after this set if
it can be helped.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 003/100] nfsd: Ensure stateids remain unique until they are freed
  2014-07-10 11:23   ` Christoph Hellwig
@ 2014-07-10 12:43     ` Jeff Layton
  2014-07-14 16:45       ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 12:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Thu, 10 Jul 2014 04:23:42 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Jul 08, 2014 at 02:02:51PM -0400, Jeff Layton wrote:
> > From: Trond Myklebust <trond.myklebust@primarydata.com>
> > 
> > Add an extra delegation state to allow the stateid to remain in the idr
> > tree until the last reference has been released. This will be necessary
> > to ensure uniqueness once the client_mutex is removed.
> > 
> > [jlayton: reset the sc_type under the state_lock in unhash_delegation]
> 
> I'd be tempted to instead have a closed flag, there's plenty of space in
> the hole after sc_type anyway.
> 
> The rationale for that is that a stateid really shouldn't change the
> type during its lifetime, and callers that specify the type to look
> up shouldn't bother with looking up different types due to this either.
> 
> NFS4_REVOKED_DELEG_STID would also be replaced by a revoked flag, which
> makes much more sense to start with as well.
> 

That sort of change will ripple throughout the set. Your point about
not changing the sc_type is valid though. I'll see whether it's doable.

> Btw, do you also plan to keep open stateids as NFS4_CLOSED_STID for
> 4.1+?  In that case the comment there would need an update.  What about
> lock stateids?
> 

No. For v4.1+ we have no need to keep stateids around that are closed.
They are released as soon as the close occurs. The only reason to keep
them around is for v4.0 replays.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny
  2014-07-10 12:19     ` Jeff Layton
@ 2014-07-10 12:43       ` Christoph Hellwig
  2014-07-10 13:16       ` Christoph Hellwig
  1 sibling, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 12:43 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs

On Thu, Jul 10, 2014 at 08:19:12AM -0400, Jeff Layton wrote:
> 
> No, the st_deny_bmap does not hold the NFS4_SHARE_DENY_* bits
> themselves. It sets a bit for each NFS4_SHARE_DENY_* mode that the
> stateid has used. The difference is subtle, but significant.
> 
> The rationale for that is in the comments above bmap_to_share_mode.

Oh, the dangers of the bitshift index access helper confusing the
reader.

But I think this is only really necessary because of the odd fi_access
semantics, and if we sort that out as I suggested earlier we can
get rid of this confusion as the access/deny checks don't really care
about the additional both bit, and once the O_RDWR file has its own
proper refcounting there shouldn't be any need to treat it specially.


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 016/100] nfsd: always hold the fi_lock when bumping fi_access refcounts
  2014-07-10 12:20     ` Jeff Layton
@ 2014-07-10 12:43       ` Christoph Hellwig
  0 siblings, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 12:43 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs

On Thu, Jul 10, 2014 at 08:20:43AM -0400, Jeff Layton wrote:
> Yes. We don't have to hold the lock when putting refcounts if we do.

So we can optimize the case where a reference is dropped but the file
stays around.  That sounds reasonable.


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 006/100] nfsd4: use cl_lock to synchronize all stateid idr calls
  2014-07-10 11:32   ` Christoph Hellwig
@ 2014-07-10 12:45     ` Jeff Layton
  0 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 12:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Thu, 10 Jul 2014 04:32:49 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> >  static void nfs4_free_stid(struct kmem_cache *slab, struct nfs4_stid *s)
> > @@ -1266,7 +1271,9 @@ free_client(struct nfs4_client *clp)
> >  	rpc_destroy_wait_queue(&clp->cl_cb_waitq);
> >  	free_svc_cred(&clp->cl_cred);
> >  	kfree(clp->cl_name.data);
> > +	spin_lock(&clp->cl_lock);
> >  	idr_destroy(&clp->cl_stateids);
> > +	spin_unlock(&clp->cl_lock);
> >  	kfree(clp);
> 
> Taking cl_lock in free_client looks wrong to me, the client had better be
> removed from any data structures that allow looking it up at this
> point.
> 

Yeah, ok -- good point. I'll remove that.

> Except for that looks good,
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>


-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 017/100] nfsd: make deny mode enforcement more efficient and close races in it
  2014-07-10 12:36     ` Jeff Layton
@ 2014-07-10 12:45       ` Christoph Hellwig
  0 siblings, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 12:45 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs

On Thu, Jul 10, 2014 at 08:36:28AM -0400, Jeff Layton wrote:
> Yeah, I considered that, I'll see if that will help things, but it
> might be best to leave that extra cleanup to a different patchset. This
> set is all about removing the client_mutex, not so much about cleaning
> up deny mode handling (even if fixing that atomicity is a necessary step).

With the extensive changes to that area I would much prefer to get the
various bits right before doing the locking instead of going back to
this later.


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 011/100] nfsd: refactor nfs4_file_get_access and nfs4_file_put_access
  2014-07-10 11:35       ` Christoph Hellwig
@ 2014-07-10 12:49         ` Jeff Layton
  2014-07-10 13:01           ` Christoph Hellwig
  0 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 12:49 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jeff Layton, bfields, linux-nfs

On Thu, 10 Jul 2014 04:35:23 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Thu, Jul 10, 2014 at 07:32:14AM -0400, Jeff Layton wrote:
> > So, you're correct that we never take an fi_access reference for O_RDWR,
> > which is why the array doesn't have a slot for it:
> > 
> >         atomic_t                fi_access[2];
> > 
> > ...it's tracked by the union of the O_RDONLY and O_WRONLY counters.
> 
> Which will be racy as long as we try to use an atomic_t and not a proper
> lock over access to all of fi_access.
> 

gets are always done with the lock held. puts do an atomic_dec_and_lock
or are done with the lock held. I don't see how that's racy...
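
For anyone not familiar with the pattern, here is a userspace sketch of
what a dec-and-lock put looks like, using C11 atomics and a pthread
mutex in place of atomic_t and the fi_lock spinlock. It is only a model
of the idea: put_model, model_dec_and_lock and model_put_access are
hypothetical names, and the single counter and pointer stand in for one
fi_access/fi_fds pair.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one access counter and its file slot. */
struct put_model {
	atomic_int	fi_access;
	pthread_mutex_t	fi_lock;
	void		*fi_fd;
};

/*
 * Decrement *cnt. Returns true with *lock held if the count reached
 * zero, false (lock not held) otherwise.
 */
static bool model_dec_and_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
	int old = atomic_load(cnt);

	assert(old > 0);	/* putting with no reference held is a bug */

	/* Fast path: drop a reference locklessly unless it is the last. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return false;
	}

	/* Slow path: the count may hit zero, so decide under the lock. */
	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(cnt, 1) == 1)
		return true;
	pthread_mutex_unlock(lock);
	return false;
}

/* Drop one reference; clear the slot only when the last one goes away. */
static void model_put_access(struct put_model *fp)
{
	if (model_dec_and_lock(&fp->fi_access, &fp->fi_lock)) {
		fp->fi_fd = NULL;
		pthread_mutex_unlock(&fp->fi_lock);
	}
}
```

The transition to zero always happens with the lock held, so a getter
that holds the lock never sees the count drop to zero underneath it.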

> I'm still trying to understand why we even need fi_access and can't just
> use the file references directly, though.
> 


-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 011/100] nfsd: refactor nfs4_file_get_access and nfs4_file_put_access
  2014-07-10 12:49         ` Jeff Layton
@ 2014-07-10 13:01           ` Christoph Hellwig
  0 siblings, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 13:01 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs

On Thu, Jul 10, 2014 at 08:49:22AM -0400, Jeff Layton wrote:
> gets are always done with the lock held. puts do an atomic_dec_and_lock
> or are done with the lock held. I don't see how that's racy...

Yeah, as long as the atomic_read for the other fd is done under the lock
it should be fine in the end.


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 012/100] nfsd: remove nfs4_file_put_fd
  2014-07-10 12:05       ` Christoph Hellwig
@ 2014-07-10 13:15         ` Jeff Layton
  0 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 13:15 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jeff Layton, bfields, linux-nfs

On Thu, 10 Jul 2014 05:05:22 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Thu, Jul 10, 2014 at 07:49:52AM -0400, Jeff Layton wrote:
> > 
> > AFAIU, the fi_access counters tell us when we're able to zero out the
> > fi_fds slot. fput returns void so we don't have a way to know whether
> > we're putting the last reference to a file or not. Without that,
> > find_*_file become quite problematic -- how would we know whether the
> > pointers they return are still good?
> 
> Oh right, that's definitely a good reason to keep the counters.
> 
> > > Note that the comment explaining fi_access also seems wrong, as it
> > > suggests contributing 0-4 to the counts, but at most we increment each
> > > one by 1 in a single operation.
> > > 
> > 
> > Right, we only increment by one for each operation, but open stateids
> > can be upgraded and downgraded. If we do:
> > 
> > OPEN for NFS4_SHARE_ACCESS_READ
> > OPEN for NFS4_SHARE_ACCESS_WRITE
> > OPEN for NFS4_SHARE_ACCESS_BOTH
> > 
> > ...then we will have incremented the fi_access counts by a total of 4.
> > 
> > The big pain in the ass with this code is explained in the comment over
> > bmap_to_share_mode. We have to keep track of what share bits have been
> > previously used in order to comply with the RFC, so you can easily end
> > up with "extra" references.
> 
> I've read both the comment, and the section in the RFC, but I still
> don't understand how they explain this convoluted code.
> 
> Assume we'd have three members in fi_access instead, indexed by
> (oflag & (O_RDONLY|O_WRONLY|O_RDWR)):
> 
> static void nfs4_file_get_access(struct nfs4_file *fp, u32 access)
> {
> 	int oflag = nfs4_access_to_omode(access);
> 
> 	WARN_ON_ONCE(fp->fi_fds[oflag] == NULL);
> 	atomic_inc(&fp->fi_access[oflag]);
> }
> 
> static void nfs4_file_put_access(struct nfs4_file *fp, u32 access)
> {
> 	int oflag = nfs4_access_to_omode(access);
> 
> 	if (atomic_dec_and_test(&fp->fi_access[oflag]))
> 		nfs4_file_put_fd(fp, oflag);
> }
> 
> nfsd4_open_downgrade would become a little more complicated because it
> would have to check if an FD for the downgraded access is around before
> dropping the reference to the r/w one it currently holds. And as far as
> I can tell O_RDWR to O_RDONLY or O_WRONLY is the only possible case for
> it, feel free to correct me.
> 
> Or we could simply not bother with downgrading the underlying file here
> at all, just like it's done for the case where only a O_RDWR file is
> open in the current code, which probably is a fairly common case.

That sounds like it would work, but will it really improve this code?
I'm not convinced.

Having to check whether you actually have the fd before dropping its
refcount sounds "fiddly". You'd need to do the check and put under the
fi_lock in order to not open up new races.

So, we'd either lose the benefit of using atomics for fi_access, or
have to have a whole separate nfs4_file_put_access_locked codepath that
can be done while already holding the lock.

That doesn't sound to me like it's really an improvement on this code.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny
  2014-07-10 12:19     ` Jeff Layton
  2014-07-10 12:43       ` Christoph Hellwig
@ 2014-07-10 13:16       ` Christoph Hellwig
  2014-07-10 13:21         ` Jeff Layton
  1 sibling, 1 reply; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-10 13:16 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs

Ok, feel free to drop my comments re the access/deny bitmap.  I don't
really think this is worth it to avoid the small false positive to
allow downgrading if a different open owner had a r/o or w/o open, but
it's probably indeed way too much churn for this series to do anything
about it.


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny
  2014-07-10 13:16       ` Christoph Hellwig
@ 2014-07-10 13:21         ` Jeff Layton
  2014-07-10 16:20           ` J. Bruce Fields
  0 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 13:21 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jeff Layton, bfields, linux-nfs

On Thu, 10 Jul 2014 06:16:50 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> Ok, feel free to drop my comments re the access/deny bitmap.  I don't
> really think this is worth it to avoid the small false positive to
> allow downgrading if a different open owner had a r/o or w/o open, but
> it's probably indeed way too much churn for this series to do anything
> about it.
> 

Ok, thanks.

I agree that having to track this is a little ridiculous. No real client
cares about that, but there are some pynfs tests that will fail
if we remove it altogether. I broke it a couple of years ago and Bruce
dinged me on it, so I'm inclined not to change it here.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 014/100] nfsd: set stateid access and deny bits in nfs4_get_vfs_file
  2014-07-10  8:34   ` Christoph Hellwig
@ 2014-07-10 15:05     ` Jeff Layton
  0 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 15:05 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Thu, 10 Jul 2014 01:34:06 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> > @@ -3343,20 +3347,15 @@ out:
> >  static __be32
> >  nfs4_upgrade_open(struct svc_rqst *rqstp, struct nfs4_file *fp, struct svc_fh *cur_fh, struct nfs4_ol_stateid *stp, struct nfsd4_open *open)
> >  {
> > -	u32 op_share_access = open->op_share_access;
> >  	__be32 status;
> >  
> > -	if (!test_access(op_share_access, stp))
> > -		status = nfs4_get_vfs_file(rqstp, fp, cur_fh, open);
> > +	if (!test_access(open->op_share_access, stp))
> > +		status = nfs4_get_vfs_file(rqstp, fp, cur_fh, stp, open);
> >  	else
> >  		status = nfsd4_truncate(rqstp, cur_fh, open);
> >  
> >  	if (status)
> >  		return status;
> > -
> > -	/* remember the open */
> > -	set_access(op_share_access, stp);
> > -	set_deny(open->op_share_deny, stp);
> >  	return nfs_ok;
> 
> This function is trivial enough now to be merged into the only caller,
> especially as that open actually has another call to nfs4_get_vfs_file
> right next to it in another branch.
> 

Yes, but in two patches (nfsd: make deny mode enforcement more
efficient and close races in it) we add some complexity back in here
and at that point I think we'll want this in a separate function. So I
think we shouldn't fold that into the caller since it'll just increase
the churn.

> 
> >  	} else {
> > -		status = nfs4_get_vfs_file(rqstp, fp, current_fh, open);
> > -		if (status)
> > -			goto out;
> >  		stp = open->op_stp;
> >  		open->op_stp = NULL;
> >  		init_open_stateid(stp, fp, open);
> > +		status = nfs4_get_vfs_file(rqstp, fp, current_fh, stp, open);
> > +		if (status) {
> > +			release_open_stateid(stp);
> > +			goto out;
> > +		}
> 
> I can't find a place where we set the access bits before here.
> 

Correct, we don't. That's not a problem though, AFAICT.

The idea here is to go ahead and hash the stateid and then try to get
the access to the file. If that fails, we unhash it and free the
stateid.

In a later patch we'll need to do that anyway to ensure that
fi_share_deny is properly handled if it needs to be recalculated while
we're trying to get a new filp for the file.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 001/100] nfsd: close potential race between delegation break and laundromat
  2014-07-08 18:02 ` [PATCH v4 001/100] nfsd: close potential race between delegation break and laundromat Jeff Layton
@ 2014-07-10 15:59   ` J. Bruce Fields
  2014-07-10 16:16     ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: J. Bruce Fields @ 2014-07-10 15:59 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-nfs

On Tue, Jul 08, 2014 at 02:02:49PM -0400, Jeff Layton wrote:
> Bruce says:
> 
>     There's also a preexisting expire_client/laundromat vs break race:
> 
>     - expire_client/laundromat adds a delegation to its local
>       reaplist using the same dl_recall_lru field that a delegation
>       uses to track its position on the recall lru and drops the
>       state lock.
> 
>     - a concurrent break_lease adds the delegation to the lru.
> 
>     - expire_client/laundromat then walks its reaplist and sees the
>       lru head as just another delegation on the list....
> 
> Fix this race by checking the dl_time under the state_lock. If we find
> that it's not 0, then we know that it has already been queued to the LRU
> list and that we shouldn't queue it again.
> 
> In the case of destroy_client, we must also ensure that we don't hit
> similar races by ensuring that we don't move any delegations to the
> reaplist with a dl_time of 0. Just bump the dl_time by one before we
> drop the state_lock. We're destroying the delegations anyway, so a 1s
> difference there won't matter.
> 
> The fault injection code also requires a bit of surgery here:
> 
> First, in the case of nfsd_forget_client_delegations, we must prevent
> the same sort of race vs. the delegation break callback. For that, we
> just increment the dl_time to ensure that a delegation callback can't
> race in while we're working on it.
> 
> We can't do that for nfsd_recall_client_delegations, as we need to have
> it actually queue the delegation, and that won't happen if we increment
> the dl_time. The state lock is held over that function, so we don't need
> to worry about these sorts of races there.
> 
> There is one other potential bug in nfsd_recall_client_delegations though.
> Entries on the victims list are not dequeued before calling
> nfsd_break_one_deleg. That's a potential list corruptor, so ensure that
> we do that there.
> 
> Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> ---
>  fs/nfsd/nfs4state.c | 40 +++++++++++++++++++++++++++++++++-------
>  1 file changed, 33 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 763aeeb67ccf..633b34fd6c92 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1287,6 +1287,8 @@ destroy_client(struct nfs4_client *clp)
>  	while (!list_empty(&clp->cl_delegations)) {
>  		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
>  		list_del_init(&dp->dl_perclnt);
> +		/* Ensure that deleg break won't try to requeue it */
> +		++dp->dl_time;
>  		list_move(&dp->dl_recall_lru, &reaplist);
>  	}
>  	spin_unlock(&state_lock);
> @@ -2933,10 +2935,14 @@ static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
>  	 * it's safe to take a reference: */
>  	atomic_inc(&dp->dl_count);
>  
> -	list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
> -
> -	/* Only place dl_time is set; protected by i_lock: */
> -	dp->dl_time = get_seconds();
> +	/*
> +	 * If the dl_time != 0, then we know that it has already been
> +	 * queued for a lease break. Don't queue it again.
> +	 */
> +	if (dp->dl_time == 0) {

Any reason not to just make that

	if (dp->dl_time)
		return;

?

I don't know if it matters right now, but it might also be useful to
know that no more work will get queued once dl_time is set.

--b.

> +		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
> +		dp->dl_time = get_seconds();
> +	}
>  
>  	block_delegations(&dp->dl_fh);
>  
> @@ -5081,8 +5087,23 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
>  
>  	lockdep_assert_held(&state_lock);
>  	list_for_each_entry_safe(dp, next, &clp->cl_delegations, dl_perclnt) {
> -		if (victims)
> +		if (victims) {
> +			/*
> +			 * It's not safe to mess with delegations that have a
> +			 * non-zero dl_time. They might have already been broken
> +			 * and could be processed by the laundromat outside of
> +			 * the state_lock. Just leave them be.
> +			 */
> +			if (dp->dl_time != 0)
> +				continue;
> +
> +			/*
> +			 * Increment dl_time to ensure that delegation breaks
> +			 * don't monkey with it now that we are.
> +			 */
> +			++dp->dl_time;
>  			list_move(&dp->dl_recall_lru, victims);
> +		}
>  		if (++count == max)
>  			break;
>  	}
> @@ -5107,14 +5128,19 @@ u64 nfsd_forget_client_delegations(struct nfs4_client *clp, u64 max)
>  
>  u64 nfsd_recall_client_delegations(struct nfs4_client *clp, u64 max)
>  {
> -	struct nfs4_delegation *dp, *next;
> +	struct nfs4_delegation *dp;
>  	LIST_HEAD(victims);
>  	u64 count;
>  
>  	spin_lock(&state_lock);
>  	count = nfsd_find_all_delegations(clp, max, &victims);
> -	list_for_each_entry_safe(dp, next, &victims, dl_recall_lru)
> +	while (!list_empty(&victims)) {
> +		dp = list_first_entry(&victims, struct nfs4_delegation,
> +					dl_recall_lru);
> +		list_del_init(&dp->dl_recall_lru);
> +		dp->dl_time = 0;
>  		nfsd_break_one_deleg(dp);
> +	}
>  	spin_unlock(&state_lock);
>  
>  	return count;
> -- 
> 1.9.3
> 
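
The list-corruption point in the changelog (victims entries were being
queued onto another list while still linked on the victims list) can be
illustrated outside the kernel. What follows is only a userspace sketch:
struct list_head is re-implemented here, and the helper names
list_init, list_is_empty, list_add_tail_entry, list_del_init_entry and
drain_list are hypothetical, not the kernel's list.h API. It shows the
"unlink first, then requeue" loop shape that the patch switches to.

```c
#include <stddef.h>

/* Minimal userspace imitation of the kernel's circular list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Make h an empty list (or a detached node) that points at itself. */
static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Link n in front of head h, i.e. at the tail of the list. */
static void list_add_tail_entry(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Unlink n from whatever list it is on and leave it self-pointing. */
static void list_del_init_entry(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/*
 * Drain src onto dst one entry at a time. Each entry is unlinked from
 * src before it is linked onto dst, so it is never on two lists at once.
 * Returns the number of entries moved.
 */
static int drain_list(struct list_head *src, struct list_head *dst)
{
	int moved = 0;

	while (!list_is_empty(src)) {
		struct list_head *n = src->next;

		list_del_init_entry(n);
		list_add_tail_entry(n, dst);
		moved++;
	}
	return moved;
}
```

Skipping the list_del_init_entry() step would leave src's neighbours
pointing at a node whose own pointers now belong to dst, which is the
kind of corruption the changelog is worried about.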

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 001/100] nfsd: close potential race between delegation break and laundromat
  2014-07-10 15:59   ` J. Bruce Fields
@ 2014-07-10 16:16     ` Jeff Layton
  2014-07-10 16:34       ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 16:16 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Thu, 10 Jul 2014 11:59:05 -0400
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Tue, Jul 08, 2014 at 02:02:49PM -0400, Jeff Layton wrote:
> > Bruce says:
> > 
> >     There's also a preexisting expire_client/laundromat vs break race:
> > 
> >     - expire_client/laundromat adds a delegation to its local
> >       reaplist using the same dl_recall_lru field that a delegation
> >       uses to track its position on the recall lru and drops the
> >       state lock.
> > 
> >     - a concurrent break_lease adds the delegation to the lru.
> > 
> >     - expire_client/laundromat then walks its reaplist and sees the
> >       lru head as just another delegation on the list....
> > 
> > Fix this race by checking the dl_time under the state_lock. If we find
> > that it's not 0, then we know that it has already been queued to the LRU
> > list and that we shouldn't queue it again.
> > 
> > In the case of destroy_client, we must also ensure that we don't hit
> > similar races by ensuring that we don't move any delegations to the
> > reaplist with a dl_time of 0. Just bump the dl_time by one before we
> > drop the state_lock. We're destroying the delegations anyway, so a 1s
> > difference there won't matter.
> > 
> > The fault injection code also requires a bit of surgery here:
> > 
> > First, in the case of nfsd_forget_client_delegations, we must prevent
> > the same sort of race vs. the delegation break callback. For that, we
> > just increment the dl_time to ensure that a delegation callback can't
> > race in while we're working on it.
> > 
> > We can't do that for nfsd_recall_client_delegations, as we need to have
> > it actually queue the delegation, and that won't happen if we increment
> > the dl_time. The state lock is held over that function, so we don't need
> > to worry about these sorts of races there.
> > 
> > There is one other potential bug in nfsd_recall_client_delegations though.
> > Entries on the victims list are not dequeued before calling
> > nfsd_break_one_deleg. That's a potential list corruptor, so ensure that
> > we do that there.
> > 
> > Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
> > Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> > ---
> >  fs/nfsd/nfs4state.c | 40 +++++++++++++++++++++++++++++++++-------
> >  1 file changed, 33 insertions(+), 7 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 763aeeb67ccf..633b34fd6c92 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -1287,6 +1287,8 @@ destroy_client(struct nfs4_client *clp)
> >  	while (!list_empty(&clp->cl_delegations)) {
> >  		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
> >  		list_del_init(&dp->dl_perclnt);
> > +		/* Ensure that deleg break won't try to requeue it */
> > +		++dp->dl_time;
> >  		list_move(&dp->dl_recall_lru, &reaplist);
> >  	}
> >  	spin_unlock(&state_lock);
> > @@ -2933,10 +2935,14 @@ static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
> >  	 * it's safe to take a reference: */
> >  	atomic_inc(&dp->dl_count);
> >  
> > -	list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
> > -
> > -	/* Only place dl_time is set; protected by i_lock: */
> > -	dp->dl_time = get_seconds();
> > +	/*
> > +	 * If the dl_time != 0, then we know that it has already been
> > +	 * queued for a lease break. Don't queue it again.
> > +	 */
> > +	if (dp->dl_time == 0) {
> 
> Any reason not to just make that
> 
> 	if (dp->dl_time)
> 		return;
> 
> ?
> 
> I don't know if it matters right now, but it might also be useful to
> know that no more work will get queued once dl_time is set.
> 
> --b.
> 

No, that makes a lot of sense. I'll respin and do that instead...


> > +		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
> > +		dp->dl_time = get_seconds();
> > +	}
> >  
> >  	block_delegations(&dp->dl_fh);
> >  
> > @@ -5081,8 +5087,23 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
> >  
> >  	lockdep_assert_held(&state_lock);
> >  	list_for_each_entry_safe(dp, next, &clp->cl_delegations, dl_perclnt) {
> > -		if (victims)
> > +		if (victims) {
> > +			/*
> > +			 * It's not safe to mess with delegations that have a
> > +			 * non-zero dl_time. They might have already been broken
> > +			 * and could be processed by the laundromat outside of
> > +			 * the state_lock. Just leave them be.
> > +			 */
> > +			if (dp->dl_time != 0)
> > +				continue;
> > +
> > +			/*
> > +			 * Increment dl_time to ensure that delegation breaks
> > +			 * don't monkey with it now that we are.
> > +			 */
> > +			++dp->dl_time;
> >  			list_move(&dp->dl_recall_lru, victims);
> > +		}
> >  		if (++count == max)
> >  			break;
> >  	}
> > @@ -5107,14 +5128,19 @@ u64 nfsd_forget_client_delegations(struct nfs4_client *clp, u64 max)
> >  
> >  u64 nfsd_recall_client_delegations(struct nfs4_client *clp, u64 max)
> >  {
> > -	struct nfs4_delegation *dp, *next;
> > +	struct nfs4_delegation *dp;
> >  	LIST_HEAD(victims);
> >  	u64 count;
> >  
> >  	spin_lock(&state_lock);
> >  	count = nfsd_find_all_delegations(clp, max, &victims);
> > -	list_for_each_entry_safe(dp, next, &victims, dl_recall_lru)
> > +	while (!list_empty(&victims)) {
> > +		dp = list_first_entry(&victims, struct nfs4_delegation,
> > +					dl_recall_lru);
> > +		list_del_init(&dp->dl_recall_lru);
> > +		dp->dl_time = 0;
> >  		nfsd_break_one_deleg(dp);
> > +	}
> >  	spin_unlock(&state_lock);
> >  
> >  	return count;
> > -- 
> > 1.9.3
> > 


-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny
  2014-07-10 13:21         ` Jeff Layton
@ 2014-07-10 16:20           ` J. Bruce Fields
  2014-07-10 16:37             ` Jeff Layton
  0 siblings, 1 reply; 144+ messages in thread
From: J. Bruce Fields @ 2014-07-10 16:20 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, linux-nfs

On Thu, Jul 10, 2014 at 09:21:01AM -0400, Jeff Layton wrote:
> On Thu, 10 Jul 2014 06:16:50 -0700
> Christoph Hellwig <hch@infradead.org> wrote:
> 
> > Ok, feel free to drop my comments re the access/deny bitmap.  I don't
> > really think this is worth it to avoid the small false positive to
> > allow downgrading if a different open owner had a r/o or w/o open, but
> > it's probably indeed way too much churn for this series to do anything
> > about it.
> > 
> 
> Ok, thanks.
> 
> I agree that having to track this is a little ridiculous. No real client
> really cares about that, but there are some pynfs tests that will fail
> if we remove it altogether. I broke it a couple of years ago and Bruce
> dinged me on it, so I'm inclined not to change it here.

Looking back at the spec again, that server behavior is a SHOULD, but
I'm not sure why.

I suppose it's just an attempt to keep clients to the stricter behavior
in case some other server implementation requires it.  It seems like a
low priority, so if it makes your life easier, we can ditch it.

--b.

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 001/100] nfsd: close potential race between delegation break and laundromat
  2014-07-10 16:16     ` Jeff Layton
@ 2014-07-10 16:34       ` Jeff Layton
  2014-07-10 17:41         ` J. Bruce Fields
  0 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 16:34 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Thu, 10 Jul 2014 12:16:25 -0400
Jeff Layton <jlayton@primarydata.com> wrote:

> On Thu, 10 Jul 2014 11:59:05 -0400
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Tue, Jul 08, 2014 at 02:02:49PM -0400, Jeff Layton wrote:
> > > Bruce says:
> > > 
> > >     There's also a preexisting expire_client/laundromat vs break race:
> > > 
> > >     - expire_client/laundromat adds a delegation to its local
> > >       reaplist using the same dl_recall_lru field that a delegation
> > >       uses to track its position on the recall lru and drops the
> > >       state lock.
> > > 
> > >     - a concurrent break_lease adds the delegation to the lru.
> > > 
> > >     - expire_client/laundromat then walks its reaplist and sees the
> > >       lru head as just another delegation on the list....
> > > 
> > > Fix this race by checking the dl_time under the state_lock. If we find
> > > that it's not 0, then we know that it has already been queued to the LRU
> > > list and that we shouldn't queue it again.
> > > 
> > > In the case of destroy_client, we must also ensure that we don't hit
> > > similar races by ensuring that we don't move any delegations to the
> > > reaplist with a dl_time of 0. Just bump the dl_time by one before we
> > > drop the state_lock. We're destroying the delegations anyway, so a 1s
> > > difference there won't matter.
> > > 
> > > The fault injection code also requires a bit of surgery here:
> > > 
> > > First, in the case of nfsd_forget_client_delegations, we must prevent
> > > the same sort of race vs. the delegation break callback. For that, we
> > > just increment the dl_time to ensure that a delegation callback can't
> > > race in while we're working on it.
> > > 
> > > We can't do that for nfsd_recall_client_delegations, as we need to have
> > > it actually queue the delegation, and that won't happen if we increment
> > > the dl_time. The state lock is held over that function, so we don't need
> > > to worry about these sorts of races there.
> > > 
> > > There is one other potential bug in nfsd_recall_client_delegations though.
> > > Entries on the victims list are not dequeued before calling
> > > nfsd_break_one_deleg. That's a potential list corruptor, so ensure that
> > > we do that there.
> > > 
> > > Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
> > > Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> > > ---
> > >  fs/nfsd/nfs4state.c | 40 +++++++++++++++++++++++++++++++++-------
> > >  1 file changed, 33 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 763aeeb67ccf..633b34fd6c92 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -1287,6 +1287,8 @@ destroy_client(struct nfs4_client *clp)
> > >  	while (!list_empty(&clp->cl_delegations)) {
> > >  		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
> > >  		list_del_init(&dp->dl_perclnt);
> > > +		/* Ensure that deleg break won't try to requeue it */
> > > +		++dp->dl_time;
> > >  		list_move(&dp->dl_recall_lru, &reaplist);
> > >  	}
> > >  	spin_unlock(&state_lock);
> > > @@ -2933,10 +2935,14 @@ static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
> > >  	 * it's safe to take a reference: */
> > >  	atomic_inc(&dp->dl_count);
> > >  
> > > -	list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
> > > -
> > > -	/* Only place dl_time is set; protected by i_lock: */
> > > -	dp->dl_time = get_seconds();
> > > +	/*
> > > +	 * If the dl_time != 0, then we know that it has already been
> > > +	 * queued for a lease break. Don't queue it again.
> > > +	 */
> > > +	if (dp->dl_time == 0) {
> > 
> > Any reason not to just make that
> > 
> > 	if (dp->dl_time)
> > 		return;
> > 
> > ?
> > 
> > I don't know if it matters right now, but it might also be useful to
> > know that no more work will get queued once dl_time is set.
> > 
> > --b.
> > 
> 
> No, that makes a lot of sense. I'll respin and do that instead...
> 
> 

Actually, now that I look, there are a few reasons not to do that...

First, we want to ensure that block_delegations gets called even if the
dl_time is non-zero. So fine, we can move that up above the dl_time
check...

Second, it'll complicate patch #4 in this series. We have to take a
reference to the delegation in the delegation break. The dl_time though
must be checked while holding the state_lock. So, if the dl_time ends
up being non-zero at that point, we'll have to put the client reference
and take care not to do the actual recall job. It's doable, but it'll
make the code more complex. I'm not sure that's worth the extra
complexity...
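
To make the first point concrete, here is a rough userspace model of the
shape the patch has as posted. It is only a sketch: struct deleg, the
on_recall_lru flag, block_delegations_stub() and the explicit "now"
argument are hypothetical stand-ins, not the real nfsd structures or
helpers. The dl_time check guards only the queueing step, so the
reference is still taken and block_delegations still runs on every
break.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical userspace stand-in for struct nfs4_delegation. */
struct deleg {
	time_t dl_time;      /* 0 means "never queued for recall" */
	int dl_count;        /* reference count */
	bool on_recall_lru;  /* stands in for del_recall_lru membership */
	int blocked;         /* counts block_delegations_stub() calls */
};

/* Stand-in for block_delegations(); it has to run on every break. */
static void block_delegations_stub(struct deleg *dp)
{
	dp->blocked++;
}

/*
 * Model of the break path as posted: take the reference, queue the
 * delegation only if it has never been queued (dl_time == 0), and
 * always fall through to the block_delegations step.
 */
static void break_one_deleg(struct deleg *dp, time_t now)
{
	dp->dl_count++;

	if (dp->dl_time == 0) {
		dp->on_recall_lru = true;
		dp->dl_time = now;
	}

	block_delegations_stub(dp);
}
```

With an early return on a non-zero dl_time placed where the check is
now, the second break in this model would skip block_delegations_stub()
entirely, which is the behaviour described above as unwanted.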

> > > +		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
> > > +		dp->dl_time = get_seconds();
> > > +	}
> > >  
> > >  	block_delegations(&dp->dl_fh);
> > >  
> > > @@ -5081,8 +5087,23 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
> > >  
> > >  	lockdep_assert_held(&state_lock);
> > >  	list_for_each_entry_safe(dp, next, &clp->cl_delegations, dl_perclnt) {
> > > -		if (victims)
> > > +		if (victims) {
> > > +			/*
> > > +			 * It's not safe to mess with delegations that have a
> > > +			 * non-zero dl_time. They might have already been broken
> > > +			 * and could be processed by the laundromat outside of
> > > +			 * the state_lock. Just leave them be.
> > > +			 */
> > > +			if (dp->dl_time != 0)
> > > +				continue;
> > > +
> > > +			/*
> > > +			 * Increment dl_time to ensure that delegation breaks
> > > +			 * don't monkey with it now that we are.
> > > +			 */
> > > +			++dp->dl_time;
> > >  			list_move(&dp->dl_recall_lru, victims);
> > > +		}
> > >  		if (++count == max)
> > >  			break;
> > >  	}
> > > @@ -5107,14 +5128,19 @@ u64 nfsd_forget_client_delegations(struct nfs4_client *clp, u64 max)
> > >  
> > >  u64 nfsd_recall_client_delegations(struct nfs4_client *clp, u64 max)
> > >  {
> > > -	struct nfs4_delegation *dp, *next;
> > > +	struct nfs4_delegation *dp;
> > >  	LIST_HEAD(victims);
> > >  	u64 count;
> > >  
> > >  	spin_lock(&state_lock);
> > >  	count = nfsd_find_all_delegations(clp, max, &victims);
> > > -	list_for_each_entry_safe(dp, next, &victims, dl_recall_lru)
> > > +	while (!list_empty(&victims)) {
> > > +		dp = list_first_entry(&victims, struct nfs4_delegation,
> > > +					dl_recall_lru);
> > > +		list_del_init(&dp->dl_recall_lru);
> > > +		dp->dl_time = 0;
> > >  		nfsd_break_one_deleg(dp);
> > > +	}
> > >  	spin_unlock(&state_lock);
> > >  
> > >  	return count;
> > > -- 
> > > 1.9.3
> > > 
> 
> 


-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny
  2014-07-10 16:20           ` J. Bruce Fields
@ 2014-07-10 16:37             ` Jeff Layton
  0 siblings, 0 replies; 144+ messages in thread
From: Jeff Layton @ 2014-07-10 16:37 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Jeff Layton, Christoph Hellwig, linux-nfs

On Thu, 10 Jul 2014 12:20:19 -0400
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Thu, Jul 10, 2014 at 09:21:01AM -0400, Jeff Layton wrote:
> > On Thu, 10 Jul 2014 06:16:50 -0700
> > Christoph Hellwig <hch@infradead.org> wrote:
> > 
> > > Ok, feel free to drop my comments re the access/deny bitmap.  I don't
> > > really think this is worth it to avoid the small false positive to
> > > allow downgrading if a different open owner had a r/o or w/o open, but
> > > it's probably indeed way too much churn for this series to do anything
> > > about it.
> > > 
> > 
> > Ok, thanks.
> > 
> > I agree that having to track this is a little ridiculous. No real client
> > really cares about that, but there are some pynfs tests that will fail
> > if we remove it altogether. I broke it a couple of years ago and Bruce
> > dinged me on it, so I'm inclined not to change it here.
> 
> Looking back at the spec again, that server behavior is a SHOULD, but
> I'm not sure why.
> 
> I suppose it's just an attempt to keep clients to the stricter behavior
> in case some other server implementation requires it.  It seems like a
> low priority, so if it makes your life easier, we can ditch it.
> 

I'd rather not introduce those sorts of behavioral changes in this
series, if only to reduce the churn. I have no objection to that sort
of overhaul after this is complete though.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 001/100] nfsd: close potential race between delegation break and laundromat
  2014-07-10 16:34       ` Jeff Layton
@ 2014-07-10 17:41         ` J. Bruce Fields
  0 siblings, 0 replies; 144+ messages in thread
From: J. Bruce Fields @ 2014-07-10 17:41 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-nfs

On Thu, Jul 10, 2014 at 12:34:02PM -0400, Jeff Layton wrote:
> On Thu, 10 Jul 2014 12:16:25 -0400
> Jeff Layton <jlayton@primarydata.com> wrote:
> 
> > On Thu, 10 Jul 2014 11:59:05 -0400
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > 
> > > On Tue, Jul 08, 2014 at 02:02:49PM -0400, Jeff Layton wrote:
> > > > Bruce says:
> > > > 
> > > >     There's also a preexisting expire_client/laundromat vs break race:
> > > > 
> > > >     - expire_client/laundromat adds a delegation to its local
> > > >       reaplist using the same dl_recall_lru field that a delegation
> > > >       uses to track its position on the recall lru and drops the
> > > >       state lock.
> > > > 
> > > >     - a concurrent break_lease adds the delegation to the lru.
> > > > 
> > > >     - expire_client/laundromat then walks its reaplist and sees the
> > > >       lru head as just another delegation on the list....
> > > > 
> > > > Fix this race by checking the dl_time under the state_lock. If we find
> > > > that it's not 0, then we know that it has already been queued to the LRU
> > > > list and that we shouldn't queue it again.
> > > > 
> > > > In the case of destroy_client, we must also ensure that we don't hit
> > > > similar races by ensuring that we don't move any delegations to the
> > > > reaplist with a dl_time of 0. Just bump the dl_time by one before we
> > > > drop the state_lock. We're destroying the delegations anyway, so a 1s
> > > > difference there won't matter.
> > > > 
> > > > The fault injection code also requires a bit of surgery here:
> > > > 
> > > > First, in the case of nfsd_forget_client_delegations, we must prevent
> > > > the same sort of race vs. the delegation break callback. For that, we
> > > > just increment the dl_time to ensure that a delegation callback can't
> > > > race in while we're working on it.
> > > > 
> > > > We can't do that for nfsd_recall_client_delegations, as we need to have
> > > > it actually queue the delegation, and that won't happen if we increment
> > > > the dl_time. The state lock is held over that function, so we don't need
> > > > to worry about these sorts of races there.
> > > > 
> > > > There is one other potential bug in nfsd_recall_client_delegations though.
> > > > Entries on the victims list are not dequeued before calling
> > > > nfsd_break_one_deleg. That's a potential list corruptor, so ensure that
> > > > we do that there.
> > > > 
> > > > Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
> > > > Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> > > > ---
> > > >  fs/nfsd/nfs4state.c | 40 +++++++++++++++++++++++++++++++++-------
> > > >  1 file changed, 33 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > index 763aeeb67ccf..633b34fd6c92 100644
> > > > --- a/fs/nfsd/nfs4state.c
> > > > +++ b/fs/nfsd/nfs4state.c
> > > > @@ -1287,6 +1287,8 @@ destroy_client(struct nfs4_client *clp)
> > > >  	while (!list_empty(&clp->cl_delegations)) {
> > > >  		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
> > > >  		list_del_init(&dp->dl_perclnt);
> > > > +		/* Ensure that deleg break won't try to requeue it */
> > > > +		++dp->dl_time;
> > > >  		list_move(&dp->dl_recall_lru, &reaplist);
> > > >  	}
> > > >  	spin_unlock(&state_lock);
> > > > @@ -2933,10 +2935,14 @@ static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
> > > >  	 * it's safe to take a reference: */
> > > >  	atomic_inc(&dp->dl_count);
> > > >  
> > > > -	list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
> > > > -
> > > > -	/* Only place dl_time is set; protected by i_lock: */
> > > > -	dp->dl_time = get_seconds();
> > > > +	/*
> > > > +	 * If the dl_time != 0, then we know that it has already been
> > > > +	 * queued for a lease break. Don't queue it again.
> > > > +	 */
> > > > +	if (dp->dl_time == 0) {
> > > 
> > > Any reason not to just make that
> > > 
> > > 	if (dp->dl_time)
> > > 		return;
> > > 
> > > ?
> > > 
> > > I don't know if it matters right now, but it might also be useful to
> > > know that no more work will get queued once dl_time is set.
> > > 
> > > --b.
> > > 
> > 
> > No, that makes a lot of sense. I'll respin and do that instead...
> > 
> > 
> 
> Actually, now that I look, there are a few reasons not to do that...
> 
> First, we want to ensure that block_delegations gets called even if the
> dl_time is non-zero. So fine, we can move that up above the dl_time
> check...
> 
> Second, it'll complicate patch #4 in this series. We have to take a
> reference to the delegation in the delegation break. The dl_time though
> must be checked while holding the state_lock. So, if the dl_time ends
> up being non-zero at that point, we'll have to put the client reference
> and take care not to do the actual recall job. It's doable, but it'll
> make the code more complex. I'm not sure that's worth the extra
> complexity...

OK, fine, we can leave it be for now.  Merging this and the following
patch as is.

--b.

> 
> > > > +		list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru);
> > > > +		dp->dl_time = get_seconds();
> > > > +	}
> > > >  
> > > >  	block_delegations(&dp->dl_fh);
> > > >  
> > > > @@ -5081,8 +5087,23 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
> > > >  
> > > >  	lockdep_assert_held(&state_lock);
> > > >  	list_for_each_entry_safe(dp, next, &clp->cl_delegations, dl_perclnt) {
> > > > -		if (victims)
> > > > +		if (victims) {
> > > > +			/*
> > > > +			 * It's not safe to mess with delegations that have a
> > > > +			 * non-zero dl_time. They might have already been broken
> > > > +			 * and could be processed by the laundromat outside of
> > > > +			 * the state_lock. Just leave them be.
> > > > +			 */
> > > > +			if (dp->dl_time != 0)
> > > > +				continue;
> > > > +
> > > > +			/*
> > > > +			 * Increment dl_time to ensure that delegation breaks
> > > > +			 * don't monkey with it now that we are.
> > > > +			 */
> > > > +			++dp->dl_time;
> > > >  			list_move(&dp->dl_recall_lru, victims);
> > > > +		}
> > > >  		if (++count == max)
> > > >  			break;
> > > >  	}
> > > > @@ -5107,14 +5128,19 @@ u64 nfsd_forget_client_delegations(struct nfs4_client *clp, u64 max)
> > > >  
> > > >  u64 nfsd_recall_client_delegations(struct nfs4_client *clp, u64 max)
> > > >  {
> > > > -	struct nfs4_delegation *dp, *next;
> > > > +	struct nfs4_delegation *dp;
> > > >  	LIST_HEAD(victims);
> > > >  	u64 count;
> > > >  
> > > >  	spin_lock(&state_lock);
> > > >  	count = nfsd_find_all_delegations(clp, max, &victims);
> > > > -	list_for_each_entry_safe(dp, next, &victims, dl_recall_lru)
> > > > +	while (!list_empty(&victims)) {
> > > > +		dp = list_first_entry(&victims, struct nfs4_delegation,
> > > > +					dl_recall_lru);
> > > > +		list_del_init(&dp->dl_recall_lru);
> > > > +		dp->dl_time = 0;
> > > >  		nfsd_break_one_deleg(dp);
> > > > +	}
> > > >  	spin_unlock(&state_lock);
> > > >  
> > > >  	return count;
> > > > -- 
> > > > 1.9.3
> > > > 
> > 
> > 
> 
> 
> -- 
> Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 003/100] nfsd: Ensure stateids remain unique until they are freed
  2014-07-10 12:43     ` Jeff Layton
@ 2014-07-14 16:45       ` Jeff Layton
  2014-07-15  7:50         ` Christoph Hellwig
  0 siblings, 1 reply; 144+ messages in thread
From: Jeff Layton @ 2014-07-14 16:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bfields, linux-nfs

On Thu, 10 Jul 2014 08:43:00 -0400
Jeff Layton <jlayton@primarydata.com> wrote:

> On Thu, 10 Jul 2014 04:23:42 -0700
> Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Tue, Jul 08, 2014 at 02:02:51PM -0400, Jeff Layton wrote:
> > > From: Trond Myklebust <trond.myklebust@primarydata.com>
> > > 
> > > Add an extra delegation state to allow the stateid to remain in the idr
> > > tree until the last reference has been released. This will be necessary
> > > to ensure uniqueness once the client_mutex is removed.
> > > 
> > > [jlayton: reset the sc_type under the state_lock in unhash_delegation]
> > 
> > I'd be tempted to instead have a closed flag, there's plenty of space in
> > the hole after sc_type anyway.
> > 

I've been looking at this and I don't really see much in the way of
benefit from changing to a set of flags for that sort of thing. I don't
think it will materially improve the code.

Currently, we only rarely search for these "secondary" stateid types
(only in nfsd4_close), and if we add in a set of flags then we have to
worry about checking them in places where we're searching for "base"
stateid types today.

> > The rationale for that is that a stateid really shouldn't change the
> > > type during its lifetime, and callers that specify the type to look
> > up shouldn't bother with looking up different types due to this either.
> > 

This rationale seems somewhat backward. If we do this, callers that are
currently looking up a specific type of stateid will now have to also
check these flags in order to see if it's in the state that it expects,
which is a more common case than the one where we search for a
"secondary" sc_type.

> > NFS4_REVOKED_DELEG_STID would also be replaced by a revoked flag, which
> > > makes much more sense to start with as well.
> >

There's also unhash_stid() in the current code which sets the sc_type
of a lock stateid to 0 in order to make it unfindable. I think we're
better served by simply using the sc_type for this.
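
As a rough illustration of that argument, consider a lookup that filters
on a type mask. This is a sketch only: the STID_* bit values, struct
stid and stid_if_type() are hypothetical and are not the nfsd
definitions. Because "closed" is its own type bit, a caller asking for
the base open type never sees a closed stateid, and no extra flag check
is needed at that call site.

```c
#include <stddef.h>

/* Hypothetical type bits for this sketch; the values are illustrative. */
enum {
	STID_OPEN   = 1 << 0,
	STID_LOCK   = 1 << 1,
	STID_DELEG  = 1 << 2,
	STID_CLOSED = 1 << 3,
};

/* Hypothetical stand-in for the stateid structure. */
struct stid {
	unsigned char sc_type;   /* exactly one STID_* bit, or 0 if unhashed */
};

/* Return s only if its current type is one the caller asked for. */
static struct stid *stid_if_type(struct stid *s, unsigned char typemask)
{
	if (s == NULL || !(s->sc_type & typemask))
		return NULL;
	return s;
}
```

A caller that does want the "secondary" state simply widens the mask,
for example STID_OPEN | STID_CLOSED, and a stateid whose type has been
reset to 0 matches no mask at all.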

> 
> That sort of change will ripple throughout the set. Your point about
> not changing the sc_type is valid though. I'll see whether it's
> doable.
> 
> > Btw, do you also plan to keep open stateids as NFS4_CLOSED_STID for
> > 4.1+?  In that case the comment there would need an update.  What
> > about lock stateids?
> > 
> 
> No. For v4.1+ we have no need to keep stateids around that are closed.
> They are released as soon as the close occurs. The only reason to keep
> them around is for v4.0 replays.
> 



-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH v4 003/100] nfsd: Ensure stateids remain unique until they are freed
  2014-07-14 16:45       ` Jeff Layton
@ 2014-07-15  7:50         ` Christoph Hellwig
  0 siblings, 0 replies; 144+ messages in thread
From: Christoph Hellwig @ 2014-07-15  7:50 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs

On Mon, Jul 14, 2014 at 12:45:51PM -0400, Jeff Layton wrote:
> I've been looking at this and I don't really see much in the way of
> benefit from changing to a set of flags for that sort of thing. I don't
> think it will materially improve the code.
> 
> Currently, we only rarely search for these "secondary" stateid types
> (only in nfsd4_close), and if we add in a set of flags then we have to
> worry about checking them in places where we're searching for "base"
> stateid types today.

Ok.


^ permalink raw reply	[flat|nested] 144+ messages in thread

end of thread, other threads:[~2014-07-15  7:50 UTC | newest]

Thread overview: 144+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-08 18:02 [PATCH v4 000/101] nfsd: eliminate the client_mutex Jeff Layton
2014-07-08 18:02 ` [PATCH v4 001/100] nfsd: close potential race between delegation break and laundromat Jeff Layton
2014-07-10 15:59   ` J. Bruce Fields
2014-07-10 16:16     ` Jeff Layton
2014-07-10 16:34       ` Jeff Layton
2014-07-10 17:41         ` J. Bruce Fields
2014-07-08 18:02 ` [PATCH v4 002/100] nfsd: reduce some spinlocking in put_client_renew Jeff Layton
2014-07-10 11:18   ` Christoph Hellwig
2014-07-08 18:02 ` [PATCH v4 003/100] nfsd: Ensure stateids remain unique until they are freed Jeff Layton
2014-07-10 11:23   ` Christoph Hellwig
2014-07-10 12:43     ` Jeff Layton
2014-07-14 16:45       ` Jeff Layton
2014-07-15  7:50         ` Christoph Hellwig
2014-07-08 18:02 ` [PATCH v4 004/100] nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg Jeff Layton
2014-07-08 18:02 ` [PATCH v4 005/100] nfsd: Move the delegation reference counter into the struct nfs4_stid Jeff Layton
2014-07-10 11:28   ` Christoph Hellwig
2014-07-08 18:02 ` [PATCH v4 006/100] nfsd4: use cl_lock to synchronize all stateid idr calls Jeff Layton
2014-07-10 11:32   ` Christoph Hellwig
2014-07-10 12:45     ` Jeff Layton
2014-07-08 18:02 ` [PATCH v4 007/100] nfsd: Add fine grained protection for the nfs4_file->fi_stateids list Jeff Layton
2014-07-10 11:33   ` Christoph Hellwig
2014-07-08 18:02 ` [PATCH v4 008/100] nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache Jeff Layton
2014-07-08 18:02 ` [PATCH v4 009/100] nfsd: Add locking to the nfs4_file->fi_fds[] array Jeff Layton
2014-07-10 10:32   ` Christoph Hellwig
2014-07-08 18:02 ` [PATCH v4 010/100] nfsd: clean up helper __release_lock_stateid Jeff Layton
2014-07-08 18:02 ` [PATCH v4 011/100] nfsd: refactor nfs4_file_get_access and nfs4_file_put_access Jeff Layton
2014-07-10  7:59   ` Christoph Hellwig
2014-07-10 11:32     ` Jeff Layton
2014-07-10 11:35       ` Christoph Hellwig
2014-07-10 12:49         ` Jeff Layton
2014-07-10 13:01           ` Christoph Hellwig
2014-07-08 18:03 ` [PATCH v4 012/100] nfsd: remove nfs4_file_put_fd Jeff Layton
2014-07-10  8:03   ` Christoph Hellwig
2014-07-10 11:49     ` Jeff Layton
2014-07-10 12:05       ` Christoph Hellwig
2014-07-10 13:15         ` Jeff Layton
2014-07-08 18:03 ` [PATCH v4 013/100] nfsd: shrink st_access_bmap and st_deny_bmap Jeff Layton
2014-07-10  8:04   ` Christoph Hellwig
2014-07-10 10:50   ` Christoph Hellwig
2014-07-10 12:36     ` Jeff Layton
2014-07-08 18:03 ` [PATCH v4 014/100] nfsd: set stateid access and deny bits in nfs4_get_vfs_file Jeff Layton
2014-07-10  8:34   ` Christoph Hellwig
2014-07-10 15:05     ` Jeff Layton
2014-07-08 18:03 ` [PATCH v4 015/100] nfsd: clean up reset_union_bmap_deny Jeff Layton
2014-07-10 10:31   ` Christoph Hellwig
2014-07-10 12:19     ` Jeff Layton
2014-07-10 12:43       ` Christoph Hellwig
2014-07-10 13:16       ` Christoph Hellwig
2014-07-10 13:21         ` Jeff Layton
2014-07-10 16:20           ` J. Bruce Fields
2014-07-10 16:37             ` Jeff Layton
2014-07-08 18:03 ` [PATCH v4 016/100] nfsd: always hold the fi_lock when bumping fi_access refcounts Jeff Layton
2014-07-10  8:51   ` Christoph Hellwig
2014-07-10 12:20     ` Jeff Layton
2014-07-10 12:43       ` Christoph Hellwig
2014-07-08 18:03 ` [PATCH v4 017/100] nfsd: make deny mode enforcement more efficient and close races in it Jeff Layton
2014-07-10 10:49   ` Christoph Hellwig
2014-07-10 12:36     ` Jeff Layton
2014-07-10 12:45       ` Christoph Hellwig
2014-07-08 18:03 ` [PATCH v4 018/100] nfsd: cleanup and rename nfs4_check_open Jeff Layton
2014-07-10 10:51   ` Christoph Hellwig
2014-07-08 18:03 ` [PATCH v4 019/100] locks: add file_has_lease to prevent delegation break races Jeff Layton
2014-07-08 18:03 ` [PATCH v4 020/100] nfsd: nfs4_alloc_init_lease should take a nfs4_file arg Jeff Layton
2014-07-08 18:03 ` [PATCH v4 021/100] nfsd: Protect the nfs4_file delegation fields using the fi_lock Jeff Layton
2014-07-08 18:03 ` [PATCH v4 022/100] nfsd: Simplify stateid management Jeff Layton
2014-07-08 18:03 ` [PATCH v4 023/100] nfsd: Fix delegation revocation Jeff Layton
2014-07-08 18:03 ` [PATCH v4 024/100] nfsd: Add reference counting to the lock and open stateids Jeff Layton
2014-07-08 18:03 ` [PATCH v4 025/100] nfsd: Add a struct nfs4_file field to struct nfs4_stid Jeff Layton
2014-07-08 18:03 ` [PATCH v4 026/100] nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file Jeff Layton
2014-07-08 18:03 ` [PATCH v4 027/100] nfsd: Ensure atomicity of stateid destruction and idr tree removal Jeff Layton
2014-07-08 18:03 ` [PATCH v4 028/100] nfsd: Cleanup the freeing of stateids Jeff Layton
2014-07-08 18:03 ` [PATCH v4 029/100] nfsd: do filp_close in sc_free callback for lock stateids Jeff Layton
2014-07-08 18:03 ` [PATCH v4 030/100] nfsd: Add locking to protect the state owner lists Jeff Layton
2014-07-08 18:03 ` [PATCH v4 031/100] nfsd: clean up races in lock stateid searching and creation Jeff Layton
2014-07-08 18:03 ` [PATCH v4 032/100] nfsd: Convert delegation counter to an atomic_long_t type Jeff Layton
2014-07-08 18:03 ` [PATCH v4 033/100] nfsd: Slight cleanup of find_stateid() Jeff Layton
2014-07-08 18:03 ` [PATCH v4 034/100] nfsd: ensure atomicity in nfsd4_free_stateid and nfsd4_validate_stateid Jeff Layton
2014-07-08 18:03 ` [PATCH v4 035/100] nfsd: Add reference counting to lock stateids Jeff Layton
2014-07-08 18:03 ` [PATCH v4 036/100] nfsd: nfsd4_locku() must reference the lock stateid Jeff Layton
2014-07-08 18:03 ` [PATCH v4 037/100] nfsd: Ensure that nfs4_open_delegation() references the delegation stateid Jeff Layton
2014-07-08 18:03 ` [PATCH v4 038/100] nfsd: nfsd4_process_open2() must reference " Jeff Layton
2014-07-08 18:03 ` [PATCH v4 039/100] nfsd: nfsd4_process_open2() must reference the open stateid Jeff Layton
2014-07-08 18:03 ` [PATCH v4 040/100] nfsd: Prepare nfsd4_close() for open stateid referencing Jeff Layton
2014-07-08 18:03 ` [PATCH v4 041/100] nfsd: nfsd4_open_confirm() must reference the open stateid Jeff Layton
2014-07-08 18:03 ` [PATCH v4 042/100] nfsd: Add reference counting to nfs4_preprocess_confirmed_seqid_op Jeff Layton
2014-07-08 18:03 ` [PATCH v4 043/100] nfsd: Migrate the stateid reference into nfs4_preprocess_seqid_op Jeff Layton
2014-07-08 18:03 ` [PATCH v4 044/100] nfsd: Migrate the stateid reference into nfs4_lookup_stateid() Jeff Layton
2014-07-08 18:03 ` [PATCH v4 045/100] nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type() Jeff Layton
2014-07-08 18:03 ` [PATCH v4 046/100] nfsd: Add reference counting to state owners Jeff Layton
2014-07-08 18:03 ` [PATCH v4 047/100] nfsd: Keep a reference to the open stateid for the NFSv4.0 replay cache Jeff Layton
2014-07-08 18:03 ` [PATCH v4 048/100] nfsd: clean up lockowner refcounting when finding them Jeff Layton
2014-07-08 18:03 ` [PATCH v4 049/100] nfsd: add an operation for unhashing a stateowner Jeff Layton
2014-07-08 18:03 ` [PATCH v4 050/100] nfsd: Make lock stateid take a reference to the lockowner Jeff Layton
2014-07-08 18:03 ` [PATCH v4 051/100] nfsd: clean up refcounting for lockowners Jeff Layton
2014-07-08 18:03 ` [PATCH v4 052/100] nfsd: make openstateids hold references to their openowners Jeff Layton
2014-07-08 18:03 ` [PATCH v4 053/100] nfsd: don't allow CLOSE to proceed until refcount on stateid drops Jeff Layton
2014-07-08 18:03 ` [PATCH v4 054/100] nfsd: Protect adding/removing open state owners using client_lock Jeff Layton
2014-07-08 18:03 ` [PATCH v4 055/100] nfsd: Protect adding/removing lock " Jeff Layton
2014-07-08 18:03 ` [PATCH v4 056/100] nfsd: Move the open owner hash table into struct nfs4_client Jeff Layton
2014-07-08 18:03 ` [PATCH v4 057/100] nfsd: clean up and reorganize release_lockowner Jeff Layton
2014-07-08 18:03 ` [PATCH v4 058/100] nfsd: add locking to stateowner release Jeff Layton
2014-07-08 18:03 ` [PATCH v4 059/100] nfsd: optimize destroy_lockowner cl_lock thrashing Jeff Layton
2014-07-08 18:03 ` [PATCH v4 060/100] nfsd: close potential race in nfsd4_free_stateid Jeff Layton
2014-07-08 18:03 ` [PATCH v4 061/100] nfsd: reduce cl_lock thrashing in release_openowner Jeff Layton
2014-07-08 18:03 ` [PATCH v4 062/100] nfsd: don't thrash the cl_lock while freeing an open stateid Jeff Layton
2014-07-08 18:03 ` [PATCH v4 063/100] nfsd: Ensure struct nfs4_client is unhashed before we try to destroy it Jeff Layton
2014-07-08 18:03 ` [PATCH v4 064/100] nfsd: Ensure that the laundromat unhashes the client before releasing locks Jeff Layton
2014-07-08 18:03 ` [PATCH v4 065/100] nfsd: Don't require client_lock in free_client Jeff Layton
2014-07-08 18:03 ` [PATCH v4 066/100] nfsd: Move create_client() call outside the lock Jeff Layton
2014-07-08 18:03 ` [PATCH v4 067/100] nfsd: Protect unconfirmed client creation using client_lock Jeff Layton
2014-07-08 18:03 ` [PATCH v4 068/100] nfsd: Protect session creation and client confirm " Jeff Layton
2014-07-08 18:03 ` [PATCH v4 069/100] nfsd: Protect nfsd4_destroy_clientid " Jeff Layton
2014-07-08 18:03 ` [PATCH v4 070/100] nfsd: Ensure lookup_clientid() takes client_lock Jeff Layton
2014-07-08 18:03 ` [PATCH v4 071/100] nfsd: Add lockdep assertions to document the nfs4_client/session locking Jeff Layton
2014-07-08 18:04 ` [PATCH v4 072/100] nfsd: protect the close_lru list and oo_last_closed_stid with client_lock Jeff Layton
2014-07-08 18:04 ` [PATCH v4 073/100] nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock Jeff Layton
2014-07-08 18:04 ` [PATCH v4 074/100] nfsd: move unhash_client_locked call into mark_client_expired_locked Jeff Layton
2014-07-08 18:04 ` [PATCH v4 075/100] nfsd: don't destroy client if mark_client_expired_locked fails Jeff Layton
2014-07-08 18:04 ` [PATCH v4 076/100] nfsd: don't destroy clients that are busy Jeff Layton
2014-07-08 18:04 ` [PATCH v4 077/100] nfsd: protect clid and verifier generation with client_lock Jeff Layton
2014-07-08 18:04 ` [PATCH v4 078/100] nfsd: abstract out the get and set routines into the fault injection ops Jeff Layton
2014-07-08 18:04 ` [PATCH v4 079/100] nfsd: add a forget_clients "get" routine with proper locking Jeff Layton
2014-07-08 18:04 ` [PATCH v4 080/100] nfsd: add a forget_client set_clnt routine Jeff Layton
2014-07-08 18:04 ` [PATCH v4 081/100] nfsd: add nfsd_inject_forget_clients Jeff Layton
2014-07-08 18:04 ` [PATCH v4 082/100] nfsd: add a list_head arg to nfsd_foreach_client_lock Jeff Layton
2014-07-08 18:04 ` [PATCH v4 083/100] nfsd: add more granular locking to forget_locks fault injector Jeff Layton
2014-07-08 18:04 ` [PATCH v4 084/100] nfsd: add more granular locking to forget_openowners " Jeff Layton
2014-07-08 18:04 ` [PATCH v4 085/100] nfsd: add more granular locking to *_delegations fault injectors Jeff Layton
2014-07-08 18:04 ` [PATCH v4 086/100] nfsd: remove old fault injection infrastructure Jeff Layton
2014-07-08 18:04 ` [PATCH v4 087/100] nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op() Jeff Layton
2014-07-08 18:04 ` [PATCH v4 088/100] nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateid Jeff Layton
2014-07-08 18:04 ` [PATCH v4 089/100] nfsd: Remove nfs4_lock_state(): nfsd4_release_lockowner Jeff Layton
2014-07-08 18:04 ` [PATCH v4 090/100] nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt() Jeff Layton
2014-07-08 18:04 ` [PATCH v4 091/100] nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_close Jeff Layton
2014-07-08 18:04 ` [PATCH v4 092/100] nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn() Jeff Layton
2014-07-08 18:04 ` [PATCH v4 093/100] nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm Jeff Layton
2014-07-08 18:04 ` [PATCH v4 094/100] nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session() Jeff Layton
2014-07-08 18:04 ` [PATCH v4 095/100] nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm, renew Jeff Layton
2014-07-08 18:04 ` [PATCH v4 096/100] nfsd: Remove nfs4_lock_state(): reclaim_complete() Jeff Layton
2014-07-08 18:04 ` [PATCH v4 097/100] nfsd: remove nfs4_lock_state: nfs4_laundromat Jeff Layton
2014-07-08 18:04 ` [PATCH v4 098/100] nfsd: remove nfs4_lock_state: nfs4_state_shutdown_net Jeff Layton
2014-07-08 18:04 ` [PATCH v4 099/100] nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers Jeff Layton
2014-07-08 18:04 ` [PATCH v4 100/100] nfsd: add some comments to the nfsd4 object definitions Jeff Layton
2014-07-10  7:41   ` Christoph Hellwig
