linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] nfsd: filecache fixes
@ 2022-09-30 19:15 Jeff Layton
  2022-09-30 19:15 ` [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs Jeff Layton
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Jeff Layton @ 2022-09-30 19:15 UTC (permalink / raw)
  To: chuck.lever; +Cc: linux-nfs

I've been trying to hunt down the bug reported here:

    https://bugzilla.linux-nfs.org/show_bug.cgi?id=394

I've not been able to reliably reproduce that, but patches #2 and #3 may
help. The issue I think may be in the management of the sentinel
reference. Responsibility for putting that reference belongs to the
task that clears the HASHED flag, so we need to take care not to put
that ref until we've successfully cleared it.

Hopefully, this will make it a bit more consistent (and close some other
potential races as well).

Jeff Layton (3):
  nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting
    refs
  nfsd: fix potential race in nfsd_file_close
  nfsd: fix nfsd_file_unhash_and_dispose

 fs/nfsd/filecache.c | 53 ++++++++++++++++++---------------------------
 1 file changed, 21 insertions(+), 32 deletions(-)

-- 
2.37.3


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs
  2022-09-30 19:15 [PATCH 0/3] nfsd: filecache fixes Jeff Layton
@ 2022-09-30 19:15 ` Jeff Layton
  2022-09-30 19:20   ` Chuck Lever III
  2022-10-01  4:44   ` NeilBrown
  2022-09-30 19:15 ` [PATCH 2/3] nfsd: fix potential race in nfsd_file_close Jeff Layton
  2022-09-30 19:15 ` [PATCH 3/3] nfsd: fix nfsd_file_unhash_and_dispose Jeff Layton
  2 siblings, 2 replies; 16+ messages in thread
From: Jeff Layton @ 2022-09-30 19:15 UTC (permalink / raw)
  To: chuck.lever; +Cc: linux-nfs

nfsd_file is RCU-freed, so it's possible that one could be found that's
in the process of being freed and the memory recycled. Ensure we hold
the rcu_read_lock while attempting to get a reference on the object.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/filecache.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index d5c57360b418..6237715bd23e 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 
 retry:
 	/* Avoid allocation if the item is already in cache */
+	rcu_read_lock();
 	nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
 				    nfsd_file_rhash_params);
 	if (nf)
 		nf = nfsd_file_get(nf);
+	rcu_read_unlock();
 	if (nf)
 		goto wait_for_construction;
 
@@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		goto out_status;
 	}
 
+	rcu_read_lock();
 	nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
 					      &key, &new->nf_rhash,
 					      nfsd_file_rhash_params);
 	if (!nf) {
+		rcu_read_unlock();
 		nf = new;
 		goto open_file;
 	}
-	if (IS_ERR(nf))
+	if (IS_ERR(nf)) {
+		rcu_read_unlock();
 		goto insert_err;
+	}
 	nf = nfsd_file_get(nf);
+	rcu_read_unlock();
 	if (nf == NULL) {
 		nf = new;
 		goto open_file;
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/3] nfsd: fix potential race in nfsd_file_close
  2022-09-30 19:15 [PATCH 0/3] nfsd: filecache fixes Jeff Layton
  2022-09-30 19:15 ` [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs Jeff Layton
@ 2022-09-30 19:15 ` Jeff Layton
  2022-09-30 20:58   ` Jeff Layton
  2022-10-01  5:03   ` NeilBrown
  2022-09-30 19:15 ` [PATCH 3/3] nfsd: fix nfsd_file_unhash_and_dispose Jeff Layton
  2 siblings, 2 replies; 16+ messages in thread
From: Jeff Layton @ 2022-09-30 19:15 UTC (permalink / raw)
  To: chuck.lever; +Cc: linux-nfs

Once we call nfsd_file_put, there is no guarantee that "nf" can still be
safely accessed. That may have been the last reference.

Change the code to instead check for whether nf_ref is 2 and then unhash
it and put the reference if we're successful.

We might occasionally race with another lookup and end up unhashing it
when it probably shouldn't have been, but that should hopefully be rare
and will just result in the competing lookup having to create a new
nfsd_file.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/filecache.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 6237715bd23e..58f4d9267f4a 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -461,12 +461,14 @@ nfsd_file_put(struct nfsd_file *nf)
  */
 void nfsd_file_close(struct nfsd_file *nf)
 {
-	nfsd_file_put(nf);
-	if (refcount_dec_if_one(&nf->nf_ref)) {
-		nfsd_file_unhash(nf);
-		nfsd_file_lru_remove(nf);
-		nfsd_file_free(nf);
+	/* One for the reference being put, and one for the hash */
+	if (refcount_read(&nf->nf_ref) == 2) {
+		if (nfsd_file_unhash(nf))
+			nfsd_file_put_noref(nf);
 	}
+	/* put the ref for the stateid */
+	nfsd_file_put(nf);
+
 }
 
 struct nfsd_file *
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] nfsd: fix nfsd_file_unhash_and_dispose
  2022-09-30 19:15 [PATCH 0/3] nfsd: filecache fixes Jeff Layton
  2022-09-30 19:15 ` [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs Jeff Layton
  2022-09-30 19:15 ` [PATCH 2/3] nfsd: fix potential race in nfsd_file_close Jeff Layton
@ 2022-09-30 19:15 ` Jeff Layton
  2022-09-30 19:29   ` Chuck Lever III
  2 siblings, 1 reply; 16+ messages in thread
From: Jeff Layton @ 2022-09-30 19:15 UTC (permalink / raw)
  To: chuck.lever; +Cc: linux-nfs

This function is called two reasons:

We're either shutting down and purging the filecache, or we've gotten a
notification about a file delete, so we want to go ahead and unhash it
so that it'll get cleaned up when we close.

We're either walking the hashtable or doing a lookup in it and we
don't take a reference in either case. What we want to do in both cases
is to try and unhash the object and put it on the dispose list if that
was successful. If it's no longer hashed, then we don't want to touch
it, with the assumption being that something else is already cleaning
up the sentinel reference.

Instead of trying to selectively decrement the refcount in this
function, just unhash it, and if that was successful, move it to the
dispose list. Then, the disposal routine will just clean that up as
usual.

Also, just make this a void function, drop the WARN_ON_ONCE, and the
comments about deadlocking since the nature of the purported deadlock
is no longer clear.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/filecache.c | 32 ++++++--------------------------
 1 file changed, 6 insertions(+), 26 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 58f4d9267f4a..16bd71a3894e 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -408,19 +408,14 @@ nfsd_file_unhash(struct nfsd_file *nf)
 /*
  * Return true if the file was unhashed.
  */
-static bool
+static void
 nfsd_file_unhash_and_dispose(struct nfsd_file *nf, struct list_head *dispose)
 {
 	trace_nfsd_file_unhash_and_dispose(nf);
-	if (!nfsd_file_unhash(nf))
-		return false;
-	/* keep final reference for nfsd_file_lru_dispose */
-	if (refcount_dec_not_one(&nf->nf_ref))
-		return true;
-
-	nfsd_file_lru_remove(nf);
-	list_add(&nf->nf_lru, dispose);
-	return true;
+	if (nfsd_file_unhash(nf)) {
+		nfsd_file_lru_remove(nf);
+		list_add(&nf->nf_lru, dispose);
+	}
 }
 
 static void
@@ -564,8 +559,6 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
  * @lock: LRU list lock (unused)
  * @arg: dispose list
  *
- * Note this can deadlock with nfsd_file_cache_purge.
- *
  * Return values:
  *   %LRU_REMOVED: @item was removed from the LRU
  *   %LRU_ROTATE: @item is to be moved to the LRU tail
@@ -750,8 +743,6 @@ nfsd_file_close_inode(struct inode *inode)
  *
  * Walk the LRU list and close any entries that have not been used since
  * the last scan.
- *
- * Note this can deadlock with nfsd_file_cache_purge.
  */
 static void
 nfsd_file_delayed_close(struct work_struct *work)
@@ -893,16 +884,12 @@ nfsd_file_cache_init(void)
 	goto out;
 }
 
-/*
- * Note this can deadlock with nfsd_file_lru_cb.
- */
 static void
 __nfsd_file_cache_purge(struct net *net)
 {
 	struct rhashtable_iter iter;
 	struct nfsd_file *nf;
 	LIST_HEAD(dispose);
-	bool del;
 
 	rhashtable_walk_enter(&nfsd_file_rhash_tbl, &iter);
 	do {
@@ -912,14 +899,7 @@ __nfsd_file_cache_purge(struct net *net)
 		while (!IS_ERR_OR_NULL(nf)) {
 			if (net && nf->nf_net != net)
 				continue;
-			del = nfsd_file_unhash_and_dispose(nf, &dispose);
-
-			/*
-			 * Deadlock detected! Something marked this entry as
-			 * unhased, but hasn't removed it from the hash list.
-			 */
-			WARN_ON_ONCE(!del);
-
+			nfsd_file_unhash_and_dispose(nf, &dispose);
 			nf = rhashtable_walk_next(&iter);
 		}
 
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs
  2022-09-30 19:15 ` [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs Jeff Layton
@ 2022-09-30 19:20   ` Chuck Lever III
  2022-09-30 19:33     ` Jeff Layton
  2022-10-01  4:44   ` NeilBrown
  1 sibling, 1 reply; 16+ messages in thread
From: Chuck Lever III @ 2022-09-30 19:20 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Linux NFS Mailing List



> On Sep 30, 2022, at 3:15 PM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> nfsd_file is RCU-freed, so it's possible that one could be found that's
> in the process of being freed and the memory recycled. Ensure we hold
> the rcu_read_lock while attempting to get a reference on the object.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>

IIUC, the rcu_read_lock() is held when nfsd_file_obj_cmpfn() is
invoked. So, couldn't we just call nfsd_file_get() on @nf in
there if it returns a match?


> ---
> fs/nfsd/filecache.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index d5c57360b418..6237715bd23e 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> 
> retry:
> 	/* Avoid allocation if the item is already in cache */
> +	rcu_read_lock();
> 	nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
> 				    nfsd_file_rhash_params);
> 	if (nf)
> 		nf = nfsd_file_get(nf);
> +	rcu_read_unlock();
> 	if (nf)
> 		goto wait_for_construction;
> 
> @@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> 		goto out_status;
> 	}
> 
> +	rcu_read_lock();
> 	nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
> 					      &key, &new->nf_rhash,
> 					      nfsd_file_rhash_params);
> 	if (!nf) {
> +		rcu_read_unlock();
> 		nf = new;
> 		goto open_file;
> 	}
> -	if (IS_ERR(nf))
> +	if (IS_ERR(nf)) {
> +		rcu_read_unlock();
> 		goto insert_err;
> +	}
> 	nf = nfsd_file_get(nf);
> +	rcu_read_unlock();
> 	if (nf == NULL) {
> 		nf = new;
> 		goto open_file;
> -- 
> 2.37.3
> 

--
Chuck Lever




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] nfsd: fix nfsd_file_unhash_and_dispose
  2022-09-30 19:15 ` [PATCH 3/3] nfsd: fix nfsd_file_unhash_and_dispose Jeff Layton
@ 2022-09-30 19:29   ` Chuck Lever III
  2022-09-30 19:42     ` Jeff Layton
  0 siblings, 1 reply; 16+ messages in thread
From: Chuck Lever III @ 2022-09-30 19:29 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Linux NFS Mailing List



> On Sep 30, 2022, at 3:15 PM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> This function is called two reasons:
> 
> We're either shutting down and purging the filecache, or we've gotten a
> notification about a file delete, so we want to go ahead and unhash it
> so that it'll get cleaned up when we close.
> 
> We're either walking the hashtable or doing a lookup in it and we
> don't take a reference in either case. What we want to do in both cases
> is to try and unhash the object and put it on the dispose list if that
> was successful. If it's no longer hashed, then we don't want to touch
> it, with the assumption being that something else is already cleaning
> up the sentinel reference.
> 
> Instead of trying to selectively decrement the refcount in this
> function, just unhash it, and if that was successful, move it to the
> dispose list. Then, the disposal routine will just clean that up as
> usual.
> 
> Also, just make this a void function, drop the WARN_ON_ONCE, and the
> comments about deadlocking since the nature of the purported deadlock
> is no longer clear.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> fs/nfsd/filecache.c | 32 ++++++--------------------------
> 1 file changed, 6 insertions(+), 26 deletions(-)
> 
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index 58f4d9267f4a..16bd71a3894e 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -408,19 +408,14 @@ nfsd_file_unhash(struct nfsd_file *nf)
> /*
>  * Return true if the file was unhashed.
>  */

If you're changing the function to return void, the above
comment is now stale.

> -static bool
> +static void
> nfsd_file_unhash_and_dispose(struct nfsd_file *nf, struct list_head *dispose)
> {
> 	trace_nfsd_file_unhash_and_dispose(nf);
> -	if (!nfsd_file_unhash(nf))
> -		return false;
> -	/* keep final reference for nfsd_file_lru_dispose */

This comment has been stale since nfsd_file_lru_dispose() was
renamed or removed. The only trouble I have is there isn't a
comment left that explains why we're not decrementing the hash
table reference here. ("don't have to" is enough to say about
it, but there should be something).


> -	if (refcount_dec_not_one(&nf->nf_ref))
> -		return true;
> -
> -	nfsd_file_lru_remove(nf);
> -	list_add(&nf->nf_lru, dispose);
> -	return true;
> +	if (nfsd_file_unhash(nf)) {
> +		nfsd_file_lru_remove(nf);
> +		list_add(&nf->nf_lru, dispose);
> +	}
> }
> 
> static void
> @@ -564,8 +559,6 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
>  * @lock: LRU list lock (unused)
>  * @arg: dispose list
>  *
> - * Note this can deadlock with nfsd_file_cache_purge.
> - *
>  * Return values:
>  *   %LRU_REMOVED: @item was removed from the LRU
>  *   %LRU_ROTATE: @item is to be moved to the LRU tail
> @@ -750,8 +743,6 @@ nfsd_file_close_inode(struct inode *inode)
>  *
>  * Walk the LRU list and close any entries that have not been used since
>  * the last scan.
> - *
> - * Note this can deadlock with nfsd_file_cache_purge.
>  */
> static void
> nfsd_file_delayed_close(struct work_struct *work)
> @@ -893,16 +884,12 @@ nfsd_file_cache_init(void)
> 	goto out;
> }
> 
> -/*
> - * Note this can deadlock with nfsd_file_lru_cb.
> - */
> static void
> __nfsd_file_cache_purge(struct net *net)
> {
> 	struct rhashtable_iter iter;
> 	struct nfsd_file *nf;
> 	LIST_HEAD(dispose);
> -	bool del;
> 
> 	rhashtable_walk_enter(&nfsd_file_rhash_tbl, &iter);
> 	do {
> @@ -912,14 +899,7 @@ __nfsd_file_cache_purge(struct net *net)
> 		while (!IS_ERR_OR_NULL(nf)) {
> 			if (net && nf->nf_net != net)
> 				continue;
> -			del = nfsd_file_unhash_and_dispose(nf, &dispose);
> -
> -			/*
> -			 * Deadlock detected! Something marked this entry as
> -			 * unhased, but hasn't removed it from the hash list.
> -			 */
> -			WARN_ON_ONCE(!del);
> -
> +			nfsd_file_unhash_and_dispose(nf, &dispose);
> 			nf = rhashtable_walk_next(&iter);
> 		}
> 
> -- 
> 2.37.3
> 

--
Chuck Lever




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs
  2022-09-30 19:20   ` Chuck Lever III
@ 2022-09-30 19:33     ` Jeff Layton
  2022-09-30 20:06       ` Chuck Lever III
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff Layton @ 2022-09-30 19:33 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: Linux NFS Mailing List

On Fri, 2022-09-30 at 19:20 +0000, Chuck Lever III wrote:
> 
> > On Sep 30, 2022, at 3:15 PM, Jeff Layton <jlayton@kernel.org> wrote:
> > 
> > nfsd_file is RCU-freed, so it's possible that one could be found that's
> > in the process of being freed and the memory recycled. Ensure we hold
> > the rcu_read_lock while attempting to get a reference on the object.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> 
> IIUC, the rcu_read_lock() is held when nfsd_file_obj_cmpfn() is
> invoked. So, couldn't we just call nfsd_file_get() on @nf in
> there if it returns a match?
> 
> 

Adding side effects to the comparison function seems kind of gross. Do
you know for sure that it will only be called once when you do a search?

Also, there are other places where we search the hashtable but don't
take references. We could just make those places put the reference, but
that gets messy.

I think this is cleaner.

> > ---
> > fs/nfsd/filecache.c | 9 ++++++++-
> > 1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index d5c57360b418..6237715bd23e 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > 
> > retry:
> > 	/* Avoid allocation if the item is already in cache */
> > +	rcu_read_lock();
> > 	nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
> > 				    nfsd_file_rhash_params);
> > 	if (nf)
> > 		nf = nfsd_file_get(nf);
> > +	rcu_read_unlock();
> > 	if (nf)
> > 		goto wait_for_construction;
> > 
> > @@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > 		goto out_status;
> > 	}
> > 
> > +	rcu_read_lock();
> > 	nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
> > 					      &key, &new->nf_rhash,
> > 					      nfsd_file_rhash_params);
> > 	if (!nf) {
> > +		rcu_read_unlock();
> > 		nf = new;
> > 		goto open_file;
> > 	}
> > -	if (IS_ERR(nf))
> > +	if (IS_ERR(nf)) {
> > +		rcu_read_unlock();
> > 		goto insert_err;
> > +	}
> > 	nf = nfsd_file_get(nf);
> > +	rcu_read_unlock();
> > 	if (nf == NULL) {
> > 		nf = new;
> > 		goto open_file;
> > -- 
> > 2.37.3
> > 
> 
> --
> Chuck Lever
> 
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] nfsd: fix nfsd_file_unhash_and_dispose
  2022-09-30 19:29   ` Chuck Lever III
@ 2022-09-30 19:42     ` Jeff Layton
  2022-09-30 20:23       ` Chuck Lever III
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff Layton @ 2022-09-30 19:42 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: Linux NFS Mailing List

On Fri, 2022-09-30 at 19:29 +0000, Chuck Lever III wrote:
> 
> > On Sep 30, 2022, at 3:15 PM, Jeff Layton <jlayton@kernel.org> wrote:
> > 
> > This function is called two reasons:
> > 
> > We're either shutting down and purging the filecache, or we've gotten a
> > notification about a file delete, so we want to go ahead and unhash it
> > so that it'll get cleaned up when we close.
> > 
> > We're either walking the hashtable or doing a lookup in it and we
> > don't take a reference in either case. What we want to do in both cases
> > is to try and unhash the object and put it on the dispose list if that
> > was successful. If it's no longer hashed, then we don't want to touch
> > it, with the assumption being that something else is already cleaning
> > up the sentinel reference.
> > 
> > Instead of trying to selectively decrement the refcount in this
> > function, just unhash it, and if that was successful, move it to the
> > dispose list. Then, the disposal routine will just clean that up as
> > usual.
> > 
> > Also, just make this a void function, drop the WARN_ON_ONCE, and the
> > comments about deadlocking since the nature of the purported deadlock
> > is no longer clear.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> > fs/nfsd/filecache.c | 32 ++++++--------------------------
> > 1 file changed, 6 insertions(+), 26 deletions(-)
> > 
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index 58f4d9267f4a..16bd71a3894e 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -408,19 +408,14 @@ nfsd_file_unhash(struct nfsd_file *nf)
> > /*
> >  * Return true if the file was unhashed.
> >  */
> 
> If you're changing the function to return void, the above
> comment is now stale.
> 
> > -static bool
> > +static void
> > nfsd_file_unhash_and_dispose(struct nfsd_file *nf, struct list_head *dispose)
> > {
> > 	trace_nfsd_file_unhash_and_dispose(nf);
> > -	if (!nfsd_file_unhash(nf))
> > -		return false;
> > -	/* keep final reference for nfsd_file_lru_dispose */
> 
> This comment has been stale since nfsd_file_lru_dispose() was
> renamed or removed. The only trouble I have is there isn't a
> comment left that explains why we're not decrementing the hash
> table reference here. ("don't have to" is enough to say about
> it, but there should be something).
> 
> 

How about this?

+       if (nfsd_file_unhash(nf)) {
+               /*
+                * Unhashing was successful. Transfer it to the dispose list
+                * so that we can put the sentinel reference for it later.
+                */
+               nfsd_file_lru_remove(nf);
+               list_add(&nf->nf_lru, dispose);
+       }


In this case, we're basically transferring the sentinel reference to the
"dispose" list. Later, we'll call nfsd_file_dispose_list and drop it.

Now that we don't have such onerous spinlocking in this code, we might
be able to just put each reference as we go instead of deferring it to a
list and putting them all at the end. That's probably best done later in
a separate patch however.


> > -	if (refcount_dec_not_one(&nf->nf_ref))
> > -		return true;
> > -
> > -	nfsd_file_lru_remove(nf);
> > -	list_add(&nf->nf_lru, dispose);
> > -	return true;
> > +	if (nfsd_file_unhash(nf)) {
> > +		nfsd_file_lru_remove(nf);
> > +		list_add(&nf->nf_lru, dispose);
> > +	}
> > }
> > 
> > static void
> > @@ -564,8 +559,6 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
> >  * @lock: LRU list lock (unused)
> >  * @arg: dispose list
> >  *
> > - * Note this can deadlock with nfsd_file_cache_purge.
> > - *
> >  * Return values:
> >  *   %LRU_REMOVED: @item was removed from the LRU
> >  *   %LRU_ROTATE: @item is to be moved to the LRU tail
> > @@ -750,8 +743,6 @@ nfsd_file_close_inode(struct inode *inode)
> >  *
> >  * Walk the LRU list and close any entries that have not been used since
> >  * the last scan.
> > - *
> > - * Note this can deadlock with nfsd_file_cache_purge.
> >  */
> > static void
> > nfsd_file_delayed_close(struct work_struct *work)
> > @@ -893,16 +884,12 @@ nfsd_file_cache_init(void)
> > 	goto out;
> > }
> > 
> > -/*
> > - * Note this can deadlock with nfsd_file_lru_cb.
> > - */
> > static void
> > __nfsd_file_cache_purge(struct net *net)
> > {
> > 	struct rhashtable_iter iter;
> > 	struct nfsd_file *nf;
> > 	LIST_HEAD(dispose);
> > -	bool del;
> > 
> > 	rhashtable_walk_enter(&nfsd_file_rhash_tbl, &iter);
> > 	do {
> > @@ -912,14 +899,7 @@ __nfsd_file_cache_purge(struct net *net)
> > 		while (!IS_ERR_OR_NULL(nf)) {
> > 			if (net && nf->nf_net != net)
> > 				continue;
> > -			del = nfsd_file_unhash_and_dispose(nf, &dispose);
> > -
> > -			/*
> > -			 * Deadlock detected! Something marked this entry as
> > -			 * unhased, but hasn't removed it from the hash list.
> > -			 */
> > -			WARN_ON_ONCE(!del);
> > -
> > +			nfsd_file_unhash_and_dispose(nf, &dispose);
> > 			nf = rhashtable_walk_next(&iter);
> > 		}
> > 
> > -- 
> > 2.37.3
> > 
> 
> --
> Chuck Lever
> 
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs
  2022-09-30 19:33     ` Jeff Layton
@ 2022-09-30 20:06       ` Chuck Lever III
  0 siblings, 0 replies; 16+ messages in thread
From: Chuck Lever III @ 2022-09-30 20:06 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Linux NFS Mailing List



> On Sep 30, 2022, at 3:33 PM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> On Fri, 2022-09-30 at 19:20 +0000, Chuck Lever III wrote:
>> 
>>> On Sep 30, 2022, at 3:15 PM, Jeff Layton <jlayton@kernel.org> wrote:
>>> 
>>> nfsd_file is RCU-freed, so it's possible that one could be found that's
>>> in the process of being freed and the memory recycled. Ensure we hold
>>> the rcu_read_lock while attempting to get a reference on the object.
>>> 
>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>> 
>> IIUC, the rcu_read_lock() is held when nfsd_file_obj_cmpfn() is
>> invoked. So, couldn't we just call nfsd_file_get() on @nf in
>> there if it returns a match?
>> 
>> 
> 
> Adding side effects to the comparison function seems kind of gross.
> Do you know for sure that it will only be called once when you do a search?

Yes, when a match occurs, the chain walk stops.


> Also, there are other places where we search the hashtable but don't
> take references. We could just make those places put the reference, but
> that gets messy.

nfsd_file_do_acquire() is the only place that calls the comparator
with NFSD_FILE_KEY_FULL.

 165                 if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags) == 0)
 166                         return 1;
+                    nfsd_file_get(nf);
 167                 break;


> I think this is cleaner.

Unfortunately, the comparator's second parameter is a const
void *, so we can't modify the matching nf. Ho-hum.


>>> ---
>>> fs/nfsd/filecache.c | 9 ++++++++-
>>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>> index d5c57360b418..6237715bd23e 100644
>>> --- a/fs/nfsd/filecache.c
>>> +++ b/fs/nfsd/filecache.c
>>> @@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>> 
>>> retry:
>>> 	/* Avoid allocation if the item is already in cache */
>>> +	rcu_read_lock();
>>> 	nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
>>> 				    nfsd_file_rhash_params);

This probably should be changed to rhashtable_lookup() so that
we're not double-wrapping the call with rcu_read_lock/unlock.


>>> 	if (nf)
>>> 		nf = nfsd_file_get(nf);
>>> +	rcu_read_unlock();
>>> 	if (nf)
>>> 		goto wait_for_construction;
>>> 
>>> @@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>> 		goto out_status;
>>> 	}
>>> 
>>> +	rcu_read_lock();
>>> 	nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
>>> 					      &key, &new->nf_rhash,
>>> 					      nfsd_file_rhash_params);
>>> 	if (!nf) {
>>> +		rcu_read_unlock();
>>> 		nf = new;
>>> 		goto open_file;
>>> 	}
>>> -	if (IS_ERR(nf))
>>> +	if (IS_ERR(nf)) {
>>> +		rcu_read_unlock();
>>> 		goto insert_err;
>>> +	}
>>> 	nf = nfsd_file_get(nf);
>>> +	rcu_read_unlock();

What might be nicer is if this used rhashtable_lookup_insert_key()
instead, and if it returns EEXIST, it simply frees @new and retries
the first lookup.


>>> 	if (nf == NULL) {
>>> 		nf = new;
>>> 		goto open_file;
>>> -- 
>>> 2.37.3
>>> 
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 
> 
> -- 
> Jeff Layton <jlayton@kernel.org>

--
Chuck Lever




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] nfsd: fix nfsd_file_unhash_and_dispose
  2022-09-30 19:42     ` Jeff Layton
@ 2022-09-30 20:23       ` Chuck Lever III
  0 siblings, 0 replies; 16+ messages in thread
From: Chuck Lever III @ 2022-09-30 20:23 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Linux NFS Mailing List



> On Sep 30, 2022, at 3:42 PM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> On Fri, 2022-09-30 at 19:29 +0000, Chuck Lever III wrote:
>> 
>>> On Sep 30, 2022, at 3:15 PM, Jeff Layton <jlayton@kernel.org> wrote:
>>> 
>>> This function is called two reasons:
>>> 
>>> We're either shutting down and purging the filecache, or we've gotten a
>>> notification about a file delete, so we want to go ahead and unhash it
>>> so that it'll get cleaned up when we close.
>>> 
>>> We're either walking the hashtable or doing a lookup in it and we
>>> don't take a reference in either case. What we want to do in both cases
>>> is to try and unhash the object and put it on the dispose list if that
>>> was successful. If it's no longer hashed, then we don't want to touch
>>> it, with the assumption being that something else is already cleaning
>>> up the sentinel reference.
>>> 
>>> Instead of trying to selectively decrement the refcount in this
>>> function, just unhash it, and if that was successful, move it to the
>>> dispose list. Then, the disposal routine will just clean that up as
>>> usual.
>>> 
>>> Also, just make this a void function, drop the WARN_ON_ONCE, and the
>>> comments about deadlocking since the nature of the purported deadlock
>>> is no longer clear.
>>> 
>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>> ---
>>> fs/nfsd/filecache.c | 32 ++++++--------------------------
>>> 1 file changed, 6 insertions(+), 26 deletions(-)
>>> 
>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>> index 58f4d9267f4a..16bd71a3894e 100644
>>> --- a/fs/nfsd/filecache.c
>>> +++ b/fs/nfsd/filecache.c
>>> @@ -408,19 +408,14 @@ nfsd_file_unhash(struct nfsd_file *nf)
>>> /*
>>> * Return true if the file was unhashed.
>>> */
>> 
>> If you're changing the function to return void, the above
>> comment is now stale.
>> 
>>> -static bool
>>> +static void
>>> nfsd_file_unhash_and_dispose(struct nfsd_file *nf, struct list_head *dispose)
>>> {
>>> 	trace_nfsd_file_unhash_and_dispose(nf);
>>> -	if (!nfsd_file_unhash(nf))
>>> -		return false;
>>> -	/* keep final reference for nfsd_file_lru_dispose */
>> 
>> This comment has been stale since nfsd_file_lru_dispose() was
>> renamed or removed. The only trouble I have is there isn't a
>> comment left that explains why we're not decrementing the hash
>> table reference here. ("don't have to" is enough to say about
>> it, but there should be something).
>> 
>> 
> 
> How about this?
> 
> +       if (nfsd_file_unhash(nf)) {
> +               /*
> +                * Unhashing was successful. Transfer it to the dispose list
> +                * so that we can put the sentinel reference for it later.
> +                */

Right idea, but I would say nothing more than "nfsd_file_dispose_list()
will put the sentinel reference later."


> +               nfsd_file_lru_remove(nf);
> +               list_add(&nf->nf_lru, dispose);
> +       }

I was staring at this earlier today and thinking it needed clean
up. This looks right to me.


> In this case, we're basically transferring the sentinel reference to the
> "dispose" list. Later, we'll call nfsd_file_dispose_list and drop it.
> 
> Now that we don't have such onerous spinlocking in this code, we might
> be able to just put each reference as we go instead of deferring it to a
> list and putting them all at the end. That's probably best done later in
> a separate patch however.
> 
> 
>>> -	if (refcount_dec_not_one(&nf->nf_ref))
>>> -		return true;
>>> -
>>> -	nfsd_file_lru_remove(nf);
>>> -	list_add(&nf->nf_lru, dispose);
>>> -	return true;
>>> +	if (nfsd_file_unhash(nf)) {
>>> +		nfsd_file_lru_remove(nf);
>>> +		list_add(&nf->nf_lru, dispose);
>>> +	}
>>> }
>>> 
>>> static void
>>> @@ -564,8 +559,6 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
>>> * @lock: LRU list lock (unused)
>>> * @arg: dispose list
>>> *
>>> - * Note this can deadlock with nfsd_file_cache_purge.
>>> - *
>>> * Return values:
>>> *   %LRU_REMOVED: @item was removed from the LRU
>>> *   %LRU_ROTATE: @item is to be moved to the LRU tail
>>> @@ -750,8 +743,6 @@ nfsd_file_close_inode(struct inode *inode)
>>> *
>>> * Walk the LRU list and close any entries that have not been used since
>>> * the last scan.
>>> - *
>>> - * Note this can deadlock with nfsd_file_cache_purge.
>>> */
>>> static void
>>> nfsd_file_delayed_close(struct work_struct *work)
>>> @@ -893,16 +884,12 @@ nfsd_file_cache_init(void)
>>> 	goto out;
>>> }
>>> 
>>> -/*
>>> - * Note this can deadlock with nfsd_file_lru_cb.
>>> - */
>>> static void
>>> __nfsd_file_cache_purge(struct net *net)
>>> {
>>> 	struct rhashtable_iter iter;
>>> 	struct nfsd_file *nf;
>>> 	LIST_HEAD(dispose);
>>> -	bool del;
>>> 
>>> 	rhashtable_walk_enter(&nfsd_file_rhash_tbl, &iter);
>>> 	do {
>>> @@ -912,14 +899,7 @@ __nfsd_file_cache_purge(struct net *net)
>>> 		while (!IS_ERR_OR_NULL(nf)) {
>>> 			if (net && nf->nf_net != net)
>>> 				continue;
>>> -			del = nfsd_file_unhash_and_dispose(nf, &dispose);
>>> -
>>> -			/*
>>> -			 * Deadlock detected! Something marked this entry as
>>> -			 * unhased, but hasn't removed it from the hash list.
>>> -			 */
>>> -			WARN_ON_ONCE(!del);
>>> -
>>> +			nfsd_file_unhash_and_dispose(nf, &dispose);
>>> 			nf = rhashtable_walk_next(&iter);
>>> 		}
>>> 
>>> -- 
>>> 2.37.3
>>> 
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 
> 
> -- 
> Jeff Layton <jlayton@kernel.org>

--
Chuck Lever




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] nfsd: fix potential race in nfsd_file_close
  2022-09-30 19:15 ` [PATCH 2/3] nfsd: fix potential race in nfsd_file_close Jeff Layton
@ 2022-09-30 20:58   ` Jeff Layton
  2022-09-30 20:59     ` Chuck Lever III
  2022-10-01  5:03   ` NeilBrown
  1 sibling, 1 reply; 16+ messages in thread
From: Jeff Layton @ 2022-09-30 20:58 UTC (permalink / raw)
  To: chuck.lever; +Cc: linux-nfs

On Fri, 2022-09-30 at 15:15 -0400, Jeff Layton wrote:
> Once we call nfsd_file_put, there is no guarantee that "nf" can still be
> safely accessed. That may have been the last reference.
> 
> Change the code to instead check for whether nf_ref is 2 and then unhash
> it and put the reference if we're successful.
> 
> We might occasionally race with another lookup and end up unhashing it
> when it probably shouldn't have been, but that should hopefully be rare
> and will just result in the competing lookup having to create a new
> nfsd_file.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/nfsd/filecache.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index 6237715bd23e..58f4d9267f4a 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -461,12 +461,14 @@ nfsd_file_put(struct nfsd_file *nf)
>   */
>  void nfsd_file_close(struct nfsd_file *nf)
>  {
> -	nfsd_file_put(nf);
> -	if (refcount_dec_if_one(&nf->nf_ref)) {
> -		nfsd_file_unhash(nf);
> -		nfsd_file_lru_remove(nf);
> -		nfsd_file_free(nf);
> +	/* One for the reference being put, and one for the hash */
> +	if (refcount_read(&nf->nf_ref) == 2) {
> +		if (nfsd_file_unhash(nf))
> +			nfsd_file_put_noref(nf);
>  	}
> +	/* put the ref for the stateid */
> +	nfsd_file_put(nf);
> +

Chuck if you're ok with this one, can you fix the stray newline above?

>  }
>  
>  struct nfsd_file *


Thanks,
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] nfsd: fix potential race in nfsd_file_close
  2022-09-30 20:58   ` Jeff Layton
@ 2022-09-30 20:59     ` Chuck Lever III
  0 siblings, 0 replies; 16+ messages in thread
From: Chuck Lever III @ 2022-09-30 20:59 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Linux NFS Mailing List



> On Sep 30, 2022, at 4:58 PM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> On Fri, 2022-09-30 at 15:15 -0400, Jeff Layton wrote:
>> Once we call nfsd_file_put, there is no guarantee that "nf" can still be
>> safely accessed. That may have been the last reference.
>> 
>> Change the code to instead check for whether nf_ref is 2 and then unhash
>> it and put the reference if we're successful.
>> 
>> We might occasionally race with another lookup and end up unhashing it
>> when it probably shouldn't have been, but that should hopefully be rare
>> and will just result in the competing lookup having to create a new
>> nfsd_file.
>> 
>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>> ---
>> fs/nfsd/filecache.c | 12 +++++++-----
>> 1 file changed, 7 insertions(+), 5 deletions(-)
>> 
>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>> index 6237715bd23e..58f4d9267f4a 100644
>> --- a/fs/nfsd/filecache.c
>> +++ b/fs/nfsd/filecache.c
>> @@ -461,12 +461,14 @@ nfsd_file_put(struct nfsd_file *nf)
>>  */
>> void nfsd_file_close(struct nfsd_file *nf)
>> {
>> -	nfsd_file_put(nf);
>> -	if (refcount_dec_if_one(&nf->nf_ref)) {
>> -		nfsd_file_unhash(nf);
>> -		nfsd_file_lru_remove(nf);
>> -		nfsd_file_free(nf);
>> +	/* One for the reference being put, and one for the hash */
>> +	if (refcount_read(&nf->nf_ref) == 2) {
>> +		if (nfsd_file_unhash(nf))
>> +			nfsd_file_put_noref(nf);
>> 	}
>> +	/* put the ref for the stateid */
>> +	nfsd_file_put(nf);
>> +
> 
> Chuck if you're ok with this one, can you fix the stray newline above?

Sure.


>> }
>> 
>> struct nfsd_file *
> 
> 
> Thanks,
> -- 
> Jeff Layton <jlayton@kernel.org>

--
Chuck Lever




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs
  2022-09-30 19:15 ` [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs Jeff Layton
  2022-09-30 19:20   ` Chuck Lever III
@ 2022-10-01  4:44   ` NeilBrown
  2022-10-01  9:47     ` Jeff Layton
  1 sibling, 1 reply; 16+ messages in thread
From: NeilBrown @ 2022-10-01  4:44 UTC (permalink / raw)
  To: Jeff Layton; +Cc: chuck.lever, linux-nfs

On Sat, 01 Oct 2022, Jeff Layton wrote:
> nfsd_file is RCU-freed, so it's possible that one could be found that's
> in the process of being freed and the memory recycled. Ensure we hold
> the rcu_read_lock while attempting to get a reference on the object.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/nfsd/filecache.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index d5c57360b418..6237715bd23e 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  
>  retry:
>  	/* Avoid allocation if the item is already in cache */
> +	rcu_read_lock();
>  	nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
>  				    nfsd_file_rhash_params);
>  	if (nf)
>  		nf = nfsd_file_get(nf);
> +	rcu_read_unlock();

Looks good.

>  	if (nf)
>  		goto wait_for_construction;
>  
> @@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  		goto out_status;
>  	}
>  
> +	rcu_read_lock();
>  	nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
>  					      &key, &new->nf_rhash,
>  					      nfsd_file_rhash_params);
>  	if (!nf) {
> +		rcu_read_unlock();
>  		nf = new;
>  		goto open_file;
>  	}
> -	if (IS_ERR(nf))
> +	if (IS_ERR(nf)) {
> +		rcu_read_unlock();
>  		goto insert_err;
> +	}
>  	nf = nfsd_file_get(nf);
> +	rcu_read_unlock();

Ugh.
Could we make this:
   rcu_read_lock()
   nf = r_l_g_i_k()
   if (!IS_ERR_OR_NULL(nf))
        nf = nfsd_file_get(nf);
   rcu_read_unlock()
   ...
??

NeilBrown

>  	if (nf == NULL) {
>  		nf = new;
>  		goto open_file;
> -- 
> 2.37.3
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] nfsd: fix potential race in nfsd_file_close
  2022-09-30 19:15 ` [PATCH 2/3] nfsd: fix potential race in nfsd_file_close Jeff Layton
  2022-09-30 20:58   ` Jeff Layton
@ 2022-10-01  5:03   ` NeilBrown
  2022-10-01  9:55     ` Jeff Layton
  1 sibling, 1 reply; 16+ messages in thread
From: NeilBrown @ 2022-10-01  5:03 UTC (permalink / raw)
  To: Jeff Layton; +Cc: chuck.lever, linux-nfs

On Sat, 01 Oct 2022, Jeff Layton wrote:
> Once we call nfsd_file_put, there is no guarantee that "nf" can still be
> safely accessed. That may have been the last reference.
> 
> Change the code to instead check for whether nf_ref is 2 and then unhash
> it and put the reference if we're successful.
> 
> We might occasionally race with another lookup and end up unhashing it
> when it probably shouldn't have been, but that should hopefully be rare
> and will just result in the competing lookup having to create a new
> nfsd_file.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/nfsd/filecache.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index 6237715bd23e..58f4d9267f4a 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -461,12 +461,14 @@ nfsd_file_put(struct nfsd_file *nf)
>   */
>  void nfsd_file_close(struct nfsd_file *nf)
>  {
> -	nfsd_file_put(nf);
> -	if (refcount_dec_if_one(&nf->nf_ref)) {
> -		nfsd_file_unhash(nf);
> -		nfsd_file_lru_remove(nf);
> -		nfsd_file_free(nf);
> +	/* One for the reference being put, and one for the hash */
> +	if (refcount_read(&nf->nf_ref) == 2) {
> +		if (nfsd_file_unhash(nf))
> +			nfsd_file_put_noref(nf);
>  	}
> +	/* put the ref for the stateid */
> +	nfsd_file_put(nf);
> +

This looks racy. What if a get happens after the read and before the unhash?

If we unhash the nfsd_file at last close, why does the hash table hold a
counted reference at all?
When it is hashed, set the NFSD_FILE_HASHED flag.  On last-put, if that
flag is set, unhash it.
If you want to unhash it earlier, test/clear the flag and delete from
rhashtable.

NeilBrown


>  }
>  
>  struct nfsd_file *
> -- 
> 2.37.3
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs
  2022-10-01  4:44   ` NeilBrown
@ 2022-10-01  9:47     ` Jeff Layton
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Layton @ 2022-10-01  9:47 UTC (permalink / raw)
  To: NeilBrown; +Cc: chuck.lever, linux-nfs

On Sat, 2022-10-01 at 14:44 +1000, NeilBrown wrote:
> On Sat, 01 Oct 2022, Jeff Layton wrote:
> > nfsd_file is RCU-freed, so it's possible that one could be found that's
> > in the process of being freed and the memory recycled. Ensure we hold
> > the rcu_read_lock while attempting to get a reference on the object.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/nfsd/filecache.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index d5c57360b418..6237715bd23e 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -1077,10 +1077,12 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >  
> >  retry:
> >  	/* Avoid allocation if the item is already in cache */
> > +	rcu_read_lock();
> >  	nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
> >  				    nfsd_file_rhash_params);
> >  	if (nf)
> >  		nf = nfsd_file_get(nf);
> > +	rcu_read_unlock();
> 
> Looks good.
> 
> >  	if (nf)
> >  		goto wait_for_construction;
> >  
> > @@ -1090,16 +1092,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >  		goto out_status;
> >  	}
> >  
> > +	rcu_read_lock();
> >  	nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
> >  					      &key, &new->nf_rhash,
> >  					      nfsd_file_rhash_params);
> >  	if (!nf) {
> > +		rcu_read_unlock();
> >  		nf = new;
> >  		goto open_file;
> >  	}
> > -	if (IS_ERR(nf))
> > +	if (IS_ERR(nf)) {
> > +		rcu_read_unlock();
> >  		goto insert_err;
> > +	}
> >  	nf = nfsd_file_get(nf);
> > +	rcu_read_unlock();
> 
> Ugh.
> Could we make this:
>    rcu_read_lock()
>    nf = r_l_g_i_k()
>    if (!IS_ERR_OR_NULL(nf))
>         nf = nfsd_file_get(nf);
>    rcu_read_unlock()
>    ...
> ??
> 
> NeilBrown
> 

Good point. I'll resend a v2 here.

> >  	if (nf == NULL) {
> >  		nf = new;
> >  		goto open_file;
> > -- 
> > 2.37.3
> > 
> > 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] nfsd: fix potential race in nfsd_file_close
  2022-10-01  5:03   ` NeilBrown
@ 2022-10-01  9:55     ` Jeff Layton
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Layton @ 2022-10-01  9:55 UTC (permalink / raw)
  To: NeilBrown; +Cc: chuck.lever, linux-nfs

On Sat, 2022-10-01 at 15:03 +1000, NeilBrown wrote:
> On Sat, 01 Oct 2022, Jeff Layton wrote:
> > Once we call nfsd_file_put, there is no guarantee that "nf" can still be
> > safely accessed. That may have been the last reference.
> > 
> > Change the code to instead check for whether nf_ref is 2 and then unhash
> > it and put the reference if we're successful.
> > 
> > We might occasionally race with another lookup and end up unhashing it
> > when it probably shouldn't have been, but that should hopefully be rare
> > and will just result in the competing lookup having to create a new
> > nfsd_file.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/nfsd/filecache.c | 12 +++++++-----
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index 6237715bd23e..58f4d9267f4a 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -461,12 +461,14 @@ nfsd_file_put(struct nfsd_file *nf)
> >   */
> >  void nfsd_file_close(struct nfsd_file *nf)
> >  {
> > -	nfsd_file_put(nf);
> > -	if (refcount_dec_if_one(&nf->nf_ref)) {
> > -		nfsd_file_unhash(nf);
> > -		nfsd_file_lru_remove(nf);
> > -		nfsd_file_free(nf);
> > +	/* One for the reference being put, and one for the hash */
> > +	if (refcount_read(&nf->nf_ref) == 2) {
> > +		if (nfsd_file_unhash(nf))
> > +			nfsd_file_put_noref(nf);
> >  	}
> > +	/* put the ref for the stateid */
> > +	nfsd_file_put(nf);
> > +
> 
> This looks racy. What if a get happens after the read and before the unhash?
> 

It depends on whether the "getter" sees the HASHED flag or not in
nfsd_file_do_acquire.

If HASHED is still set, then it'll get a reference to the old soon to be
unhashed nfsd_file. If it's no longer HASHED in nfsd_file_do_acquire, it
will fall into the "Did construction of this file fail?" case, and
either retry the lookup or return nfserr_jukebox.

Either is an acceptable outcome since this should presumably be a rare
occurrence.

> If we unhash the nfsd_file at last close, why does the hash table hold a
> counted reference at all?
> When it is hashed, set the NFSD_FILE_HASHED flag.  On last-put, if that
> flag is set, unhash it.
> If you want to unhash it earlier, test/clear the flag and delete from
> rhashtable.
> 

That's not the way the refcounting works today and I don't see a clear
benefit to making that change. If you want to propose patches to rework
it, I'd be happy to review them though.

> 
> 
> >  }
> >  
> >  struct nfsd_file *
> > -- 
> > 2.37.3
> > 
> > 

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-10-01  9:55 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-30 19:15 [PATCH 0/3] nfsd: filecache fixes Jeff Layton
2022-09-30 19:15 ` [PATCH 1/3] nfsd: nfsd_do_file_acquire should hold rcu_read_lock while getting refs Jeff Layton
2022-09-30 19:20   ` Chuck Lever III
2022-09-30 19:33     ` Jeff Layton
2022-09-30 20:06       ` Chuck Lever III
2022-10-01  4:44   ` NeilBrown
2022-10-01  9:47     ` Jeff Layton
2022-09-30 19:15 ` [PATCH 2/3] nfsd: fix potential race in nfsd_file_close Jeff Layton
2022-09-30 20:58   ` Jeff Layton
2022-09-30 20:59     ` Chuck Lever III
2022-10-01  5:03   ` NeilBrown
2022-10-01  9:55     ` Jeff Layton
2022-09-30 19:15 ` [PATCH 3/3] nfsd: fix nfsd_file_unhash_and_dispose Jeff Layton
2022-09-30 19:29   ` Chuck Lever III
2022-09-30 19:42     ` Jeff Layton
2022-09-30 20:23       ` Chuck Lever III

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).