All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/32] Overhaul NFSD filecache
@ 2022-07-08 18:23 Chuck Lever
  2022-07-08 18:23 ` [PATCH v3 01/32] NFSD: Demote a WARN to a pr_warn() Chuck Lever
                   ` (32 more replies)
  0 siblings, 33 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:23 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

This series overhauls the NFSD filecache, a cache of server-side
"struct file" objects recently used by NFS clients. The purposes of
this overhaul are an immediate improvement in cache scalability in
the number of open files, and preparation for further improvements.

There are three categories of patches in this series:

1. Add observability of cache operation so we can see what we're
doing as changes are made to the code.

2. Improve the scalability of filecache garbage collection,
addressing several bugs along the way.

3. Improve the scalability of the filecache hash table by converting
it to use rhashtable.

These patches are available in the for-next branch of:

https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git 

Changes since v2:
- Fix a cred use-after-free crasher
- Fix a Smatch error reported by Dan Carpenter
- Replace dereferences of nfsd_file::nf_inode
- Further clean-ups and white-space adjustments

Changes since RFC:
- Fixed several crashers
- Adjusted some of the new observability
- Tests with generic/531 now pass
- Fixed bugzilla 387 too, maybe
- Plenty of clean-ups

---

Chuck Lever (32):
      NFSD: Demote a WARN to a pr_warn()
      NFSD: Report filecache LRU size
      NFSD: Report count of calls to nfsd_file_acquire()
      NFSD: Report count of freed filecache items
      NFSD: Report average age of filecache items
      NFSD: Add nfsd_file_lru_dispose_list() helper
      NFSD: Refactor nfsd_file_gc()
      NFSD: Refactor nfsd_file_lru_scan()
      NFSD: Report the number of items evicted by the LRU walk
      NFSD: Record number of flush calls
      NFSD: Zero counters when the filecache is re-initialized
      NFSD: Hook up the filecache stat file
      NFSD: WARN when freeing an item still linked via nf_lru
      NFSD: Trace filecache LRU activity
      NFSD: Leave open files out of the filecache LRU
      NFSD: Fix the filecache LRU shrinker
      NFSD: Never call nfsd_file_gc() in foreground paths
      NFSD: No longer record nf_hashval in the trace log
      NFSD: Remove lockdep assertion from unhash_and_release_locked()
      NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
      NFSD: Refactor __nfsd_file_close_inode()
      NFSD: nfsd_file_hash_remove can compute hashval
      NFSD: Remove nfsd_file::nf_hashval
      NFSD: Replace the "init once" mechanism
      NFSD: Set up an rhashtable for the filecache
      NFSD: Convert the filecache to use rhashtable
      NFSD: Clean up unused code after rhashtable conversion
      NFSD: Separate tracepoints for acquire and create
      NFSD: Move nfsd_file_trace_alloc() tracepoint
      NFSD: Update the nfsd_file_fsnotify_handle_event() tracepoint
      NFSD: NFSv4 CLOSE should release an nfsd_file immediately
      NFSD: Ensure nf_inode is never dereferenced


 fs/nfsd/filecache.c       | 727 ++++++++++++++++++++++++--------------
 fs/nfsd/filecache.h       |   7 +-
 fs/nfsd/nfs4proc.c        |   6 +-
 fs/nfsd/nfs4state.c       |   7 +-
 fs/nfsd/nfsctl.c          |  10 +
 fs/nfsd/trace.h           | 300 +++++++++++++---
 include/trace/events/fs.h |  37 ++
 7 files changed, 774 insertions(+), 320 deletions(-)

--
Chuck Lever


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 01/32] NFSD: Demote a WARN to a pr_warn()
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
@ 2022-07-08 18:23 ` Chuck Lever
  2022-07-08 18:23 ` [PATCH v3 02/32] NFSD: Report filecache LRU size Chuck Lever
                   ` (31 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:23 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

The call trace doesn't add much value, but it sure is noisy.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4proc.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index d267b9bcf1fc..5af9f8d1feb6 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -630,9 +630,9 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	}
 
 	status = nfsd4_process_open2(rqstp, resfh, open);
-	WARN(status && open->op_created,
-	     "nfsd4_process_open2 failed to open newly-created file! status=%u\n",
-	     be32_to_cpu(status));
+	if (status && open->op_created)
+		pr_warn("nfsd4_process_open2 failed to open newly-created file: status=%u\n",
+			be32_to_cpu(status));
 	if (reclaim && !status)
 		nn->somebody_reclaimed = true;
 out:



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 02/32] NFSD: Report filecache LRU size
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
  2022-07-08 18:23 ` [PATCH v3 01/32] NFSD: Demote a WARN to a pr_warn() Chuck Lever
@ 2022-07-08 18:23 ` Chuck Lever
  2022-07-08 18:23 ` [PATCH v3 03/32] NFSD: Report count of calls to nfsd_file_acquire() Chuck Lever
                   ` (30 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:23 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Surface the NFSD filecache's LRU list length to help field
troubleshooters monitor filecache issues.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 9cb2d590c036..a0234d194ec1 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -1068,7 +1068,7 @@ nfsd_file_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 {
 	unsigned int i, count = 0, longest = 0;
-	unsigned long hits = 0;
+	unsigned long lru = 0, hits = 0;
 
 	/*
 	 * No need for spinlocks here since we're not terribly interested in
@@ -1081,6 +1081,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 			count += nfsd_file_hashtbl[i].nfb_count;
 			longest = max(longest, nfsd_file_hashtbl[i].nfb_count);
 		}
+		lru = list_lru_count(&nfsd_file_lru);
 	}
 	mutex_unlock(&nfsd_mutex);
 
@@ -1089,6 +1090,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 
 	seq_printf(m, "total entries: %u\n", count);
 	seq_printf(m, "longest chain: %u\n", longest);
+	seq_printf(m, "lru entries:   %lu\n", lru);
 	seq_printf(m, "cache hits:    %lu\n", hits);
 	return 0;
 }



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 03/32] NFSD: Report count of calls to nfsd_file_acquire()
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
  2022-07-08 18:23 ` [PATCH v3 01/32] NFSD: Demote a WARN to a pr_warn() Chuck Lever
  2022-07-08 18:23 ` [PATCH v3 02/32] NFSD: Report filecache LRU size Chuck Lever
@ 2022-07-08 18:23 ` Chuck Lever
  2022-07-08 18:24 ` [PATCH v3 04/32] NFSD: Report count of freed filecache items Chuck Lever
                   ` (29 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:23 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Count the number of successful acquisitions that did not create a
file (ie, acquisitions that do not result in a compulsory cache
miss). This count can be compared directly with the reported hit
count to compute a hit ratio.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index a0234d194ec1..3359df6c7ac0 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -43,6 +43,7 @@ struct nfsd_fcache_bucket {
 };
 
 static DEFINE_PER_CPU(unsigned long, nfsd_file_cache_hits);
+static DEFINE_PER_CPU(unsigned long, nfsd_file_acquisitions);
 
 struct nfsd_fcache_disposal {
 	struct work_struct work;
@@ -975,6 +976,8 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	}
 out:
 	if (status == nfs_ok) {
+		if (open)
+			this_cpu_inc(nfsd_file_acquisitions);
 		*pnf = nf;
 	} else {
 		nfsd_file_put(nf);
@@ -1067,8 +1070,9 @@ nfsd_file_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
  */
 static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 {
+	unsigned long hits = 0, acquisitions = 0;
 	unsigned int i, count = 0, longest = 0;
-	unsigned long lru = 0, hits = 0;
+	unsigned long lru = 0;
 
 	/*
 	 * No need for spinlocks here since we're not terribly interested in
@@ -1085,13 +1089,16 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 	}
 	mutex_unlock(&nfsd_mutex);
 
-	for_each_possible_cpu(i)
+	for_each_possible_cpu(i) {
 		hits += per_cpu(nfsd_file_cache_hits, i);
+		acquisitions += per_cpu(nfsd_file_acquisitions, i);
+	}
 
 	seq_printf(m, "total entries: %u\n", count);
 	seq_printf(m, "longest chain: %u\n", longest);
 	seq_printf(m, "lru entries:   %lu\n", lru);
 	seq_printf(m, "cache hits:    %lu\n", hits);
+	seq_printf(m, "acquisitions:  %lu\n", acquisitions);
 	return 0;
 }
 



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 04/32] NFSD: Report count of freed filecache items
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (2 preceding siblings ...)
  2022-07-08 18:23 ` [PATCH v3 03/32] NFSD: Report count of calls to nfsd_file_acquire() Chuck Lever
@ 2022-07-08 18:24 ` Chuck Lever
  2022-07-08 18:24 ` [PATCH v3 05/32] NFSD: Report average age of " Chuck Lever
                   ` (28 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:24 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Surface the count of freed nfsd_file items.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 3359df6c7ac0..c28e9577837d 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -44,6 +44,7 @@ struct nfsd_fcache_bucket {
 
 static DEFINE_PER_CPU(unsigned long, nfsd_file_cache_hits);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_acquisitions);
+static DEFINE_PER_CPU(unsigned long, nfsd_file_releases);
 
 struct nfsd_fcache_disposal {
 	struct work_struct work;
@@ -202,6 +203,8 @@ nfsd_file_free(struct nfsd_file *nf)
 {
 	bool flush = false;
 
+	this_cpu_inc(nfsd_file_releases);
+
 	trace_nfsd_file_put_final(nf);
 	if (nf->nf_mark)
 		nfsd_file_mark_put(nf->nf_mark);
@@ -1070,7 +1073,7 @@ nfsd_file_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
  */
 static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 {
-	unsigned long hits = 0, acquisitions = 0;
+	unsigned long hits = 0, acquisitions = 0, releases = 0;
 	unsigned int i, count = 0, longest = 0;
 	unsigned long lru = 0;
 
@@ -1092,6 +1095,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 	for_each_possible_cpu(i) {
 		hits += per_cpu(nfsd_file_cache_hits, i);
 		acquisitions += per_cpu(nfsd_file_acquisitions, i);
+		releases += per_cpu(nfsd_file_releases, i);
 	}
 
 	seq_printf(m, "total entries: %u\n", count);
@@ -1099,6 +1103,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, "lru entries:   %lu\n", lru);
 	seq_printf(m, "cache hits:    %lu\n", hits);
 	seq_printf(m, "acquisitions:  %lu\n", acquisitions);
+	seq_printf(m, "releases:      %lu\n", releases);
 	return 0;
 }
 



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 05/32] NFSD: Report average age of filecache items
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (3 preceding siblings ...)
  2022-07-08 18:24 ` [PATCH v3 04/32] NFSD: Report count of freed filecache items Chuck Lever
@ 2022-07-08 18:24 ` Chuck Lever
  2022-07-08 18:24 ` [PATCH v3 06/32] NFSD: Add nfsd_file_lru_dispose_list() helper Chuck Lever
                   ` (27 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:24 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

This is a measure of how long items stay in the filecache, to help
assess how efficient the cache is.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   11 ++++++++++-
 fs/nfsd/filecache.h |    1 +
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index c28e9577837d..da48c51a2bf0 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -45,6 +45,7 @@ struct nfsd_fcache_bucket {
 static DEFINE_PER_CPU(unsigned long, nfsd_file_cache_hits);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_acquisitions);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_releases);
+static DEFINE_PER_CPU(unsigned long, nfsd_file_total_age);
 
 struct nfsd_fcache_disposal {
 	struct work_struct work;
@@ -178,6 +179,7 @@ nfsd_file_alloc(struct inode *inode, unsigned int may, unsigned int hashval,
 	if (nf) {
 		INIT_HLIST_NODE(&nf->nf_node);
 		INIT_LIST_HEAD(&nf->nf_lru);
+		nf->nf_birthtime = ktime_get();
 		nf->nf_file = NULL;
 		nf->nf_cred = get_current_cred();
 		nf->nf_net = net;
@@ -201,9 +203,11 @@ nfsd_file_alloc(struct inode *inode, unsigned int may, unsigned int hashval,
 static bool
 nfsd_file_free(struct nfsd_file *nf)
 {
+	s64 age = ktime_to_ms(ktime_sub(ktime_get(), nf->nf_birthtime));
 	bool flush = false;
 
 	this_cpu_inc(nfsd_file_releases);
+	this_cpu_add(nfsd_file_total_age, age);
 
 	trace_nfsd_file_put_final(nf);
 	if (nf->nf_mark)
@@ -1075,7 +1079,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 {
 	unsigned long hits = 0, acquisitions = 0, releases = 0;
 	unsigned int i, count = 0, longest = 0;
-	unsigned long lru = 0;
+	unsigned long lru = 0, total_age = 0;
 
 	/*
 	 * No need for spinlocks here since we're not terribly interested in
@@ -1096,6 +1100,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 		hits += per_cpu(nfsd_file_cache_hits, i);
 		acquisitions += per_cpu(nfsd_file_acquisitions, i);
 		releases += per_cpu(nfsd_file_releases, i);
+		total_age += per_cpu(nfsd_file_total_age, i);
 	}
 
 	seq_printf(m, "total entries: %u\n", count);
@@ -1104,6 +1109,10 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, "cache hits:    %lu\n", hits);
 	seq_printf(m, "acquisitions:  %lu\n", acquisitions);
 	seq_printf(m, "releases:      %lu\n", releases);
+	if (releases)
+		seq_printf(m, "mean age (ms): %ld\n", total_age / releases);
+	else
+		seq_printf(m, "mean age (ms): -\n");
 	return 0;
 }
 
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index 1da0c79a5580..d0c42619dc10 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -46,6 +46,7 @@ struct nfsd_file {
 	refcount_t		nf_ref;
 	unsigned char		nf_may;
 	struct nfsd_file_mark	*nf_mark;
+	ktime_t			nf_birthtime;
 };
 
 int nfsd_file_cache_init(void);



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 06/32] NFSD: Add nfsd_file_lru_dispose_list() helper
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (4 preceding siblings ...)
  2022-07-08 18:24 ` [PATCH v3 05/32] NFSD: Report average age of " Chuck Lever
@ 2022-07-08 18:24 ` Chuck Lever
  2022-07-08 18:24 ` [PATCH v3 07/32] NFSD: Refactor nfsd_file_gc() Chuck Lever
                   ` (26 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:24 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Refactor the invariant part of nfsd_file_lru_walk_list() into a
separate helper function.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index da48c51a2bf0..b278030e0a12 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -457,11 +457,31 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
 	return LRU_SKIP;
 }
 
+/*
+ * Unhash items on @dispose immediately, then queue them on the
+ * disposal workqueue to finish releasing them in the background.
+ *
+ * cel: Note that between the time list_lru_shrink_walk runs and
+ * now, these items are in the hash table but marked unhashed.
+ * Why release these outside of lru_cb ? There's no lock ordering
+ * problem since lru_cb currently takes no lock.
+ */
+static void nfsd_file_gc_dispose_list(struct list_head *dispose)
+{
+	struct nfsd_file *nf;
+
+	list_for_each_entry(nf, dispose, nf_lru) {
+		spin_lock(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
+		nfsd_file_do_unhash(nf);
+		spin_unlock(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
+	}
+	nfsd_file_dispose_list_delayed(dispose);
+}
+
 static unsigned long
 nfsd_file_lru_walk_list(struct shrink_control *sc)
 {
 	LIST_HEAD(head);
-	struct nfsd_file *nf;
 	unsigned long ret;
 
 	if (sc)
@@ -471,12 +491,7 @@ nfsd_file_lru_walk_list(struct shrink_control *sc)
 		ret = list_lru_walk(&nfsd_file_lru,
 				nfsd_file_lru_cb,
 				&head, LONG_MAX);
-	list_for_each_entry(nf, &head, nf_lru) {
-		spin_lock(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
-		nfsd_file_do_unhash(nf);
-		spin_unlock(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
-	}
-	nfsd_file_dispose_list_delayed(&head);
+	nfsd_file_gc_dispose_list(&head);
 	return ret;
 }
 



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 07/32] NFSD: Refactor nfsd_file_gc()
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (5 preceding siblings ...)
  2022-07-08 18:24 ` [PATCH v3 06/32] NFSD: Add nfsd_file_lru_dispose_list() helper Chuck Lever
@ 2022-07-08 18:24 ` Chuck Lever
  2022-07-08 18:24 ` [PATCH v3 08/32] NFSD: Refactor nfsd_file_lru_scan() Chuck Lever
                   ` (25 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:24 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Refactor nfsd_file_gc() to use the new list_lru helper.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index b278030e0a12..4e1162f51a70 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -498,7 +498,11 @@ nfsd_file_lru_walk_list(struct shrink_control *sc)
 static void
 nfsd_file_gc(void)
 {
-	nfsd_file_lru_walk_list(NULL);
+	LIST_HEAD(dispose);
+
+	list_lru_walk(&nfsd_file_lru, nfsd_file_lru_cb,
+		      &dispose, LONG_MAX);
+	nfsd_file_gc_dispose_list(&dispose);
 }
 
 static void



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 08/32] NFSD: Refactor nfsd_file_lru_scan()
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (6 preceding siblings ...)
  2022-07-08 18:24 ` [PATCH v3 07/32] NFSD: Refactor nfsd_file_gc() Chuck Lever
@ 2022-07-08 18:24 ` Chuck Lever
  2022-07-08 18:24 ` [PATCH v3 09/32] NFSD: Report the number of items evicted by the LRU walk Chuck Lever
                   ` (24 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:24 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   25 +++++++------------------
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 4e1162f51a70..79cbbbdf8355 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -478,23 +478,6 @@ static void nfsd_file_gc_dispose_list(struct list_head *dispose)
 	nfsd_file_dispose_list_delayed(dispose);
 }
 
-static unsigned long
-nfsd_file_lru_walk_list(struct shrink_control *sc)
-{
-	LIST_HEAD(head);
-	unsigned long ret;
-
-	if (sc)
-		ret = list_lru_shrink_walk(&nfsd_file_lru, sc,
-				nfsd_file_lru_cb, &head);
-	else
-		ret = list_lru_walk(&nfsd_file_lru,
-				nfsd_file_lru_cb,
-				&head, LONG_MAX);
-	nfsd_file_gc_dispose_list(&head);
-	return ret;
-}
-
 static void
 nfsd_file_gc(void)
 {
@@ -521,7 +504,13 @@ nfsd_file_lru_count(struct shrinker *s, struct shrink_control *sc)
 static unsigned long
 nfsd_file_lru_scan(struct shrinker *s, struct shrink_control *sc)
 {
-	return nfsd_file_lru_walk_list(sc);
+	LIST_HEAD(dispose);
+	unsigned long ret;
+
+	ret = list_lru_shrink_walk(&nfsd_file_lru, sc,
+				   nfsd_file_lru_cb, &dispose);
+	nfsd_file_gc_dispose_list(&dispose);
+	return ret;
 }
 
 static struct shrinker	nfsd_file_shrinker = {



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 09/32] NFSD: Report the number of items evicted by the LRU walk
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (7 preceding siblings ...)
  2022-07-08 18:24 ` [PATCH v3 08/32] NFSD: Refactor nfsd_file_lru_scan() Chuck Lever
@ 2022-07-08 18:24 ` Chuck Lever
  2022-07-08 18:24 ` [PATCH v3 10/32] NFSD: Record number of flush calls Chuck Lever
                   ` (23 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:24 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   13 ++++++++++---
 fs/nfsd/trace.h     |   29 +++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 79cbbbdf8355..12f587473913 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -46,6 +46,7 @@ static DEFINE_PER_CPU(unsigned long, nfsd_file_cache_hits);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_acquisitions);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_releases);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_total_age);
+static DEFINE_PER_CPU(unsigned long, nfsd_file_evictions);
 
 struct nfsd_fcache_disposal {
 	struct work_struct work;
@@ -452,6 +453,7 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
 		goto out_skip;
 
 	list_lru_isolate_move(lru, &nf->nf_lru, head);
+	this_cpu_inc(nfsd_file_evictions);
 	return LRU_REMOVED;
 out_skip:
 	return LRU_SKIP;
@@ -482,9 +484,11 @@ static void
 nfsd_file_gc(void)
 {
 	LIST_HEAD(dispose);
+	unsigned long ret;
 
-	list_lru_walk(&nfsd_file_lru, nfsd_file_lru_cb,
-		      &dispose, LONG_MAX);
+	ret = list_lru_walk(&nfsd_file_lru, nfsd_file_lru_cb,
+			    &dispose, LONG_MAX);
+	trace_nfsd_file_gc_removed(ret, list_lru_count(&nfsd_file_lru));
 	nfsd_file_gc_dispose_list(&dispose);
 }
 
@@ -509,6 +513,7 @@ nfsd_file_lru_scan(struct shrinker *s, struct shrink_control *sc)
 
 	ret = list_lru_shrink_walk(&nfsd_file_lru, sc,
 				   nfsd_file_lru_cb, &dispose);
+	trace_nfsd_file_shrinker_removed(ret, list_lru_count(&nfsd_file_lru));
 	nfsd_file_gc_dispose_list(&dispose);
 	return ret;
 }
@@ -1085,7 +1090,7 @@ nfsd_file_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
  */
 static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 {
-	unsigned long hits = 0, acquisitions = 0, releases = 0;
+	unsigned long hits = 0, acquisitions = 0, releases = 0, evictions = 0;
 	unsigned int i, count = 0, longest = 0;
 	unsigned long lru = 0, total_age = 0;
 
@@ -1109,6 +1114,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 		acquisitions += per_cpu(nfsd_file_acquisitions, i);
 		releases += per_cpu(nfsd_file_releases, i);
 		total_age += per_cpu(nfsd_file_total_age, i);
+		evictions += per_cpu(nfsd_file_evictions, i);
 	}
 
 	seq_printf(m, "total entries: %u\n", count);
@@ -1117,6 +1123,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, "cache hits:    %lu\n", hits);
 	seq_printf(m, "acquisitions:  %lu\n", acquisitions);
 	seq_printf(m, "releases:      %lu\n", releases);
+	seq_printf(m, "evictions:     %lu\n", evictions);
 	if (releases)
 		seq_printf(m, "mean age (ms): %ld\n", total_age / releases);
 	else
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 8467fd8f94c2..59dc5b2f4a50 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -897,6 +897,35 @@ TRACE_EVENT(nfsd_file_fsnotify_handle_event,
 			__entry->nlink, __entry->mode, __entry->mask)
 );
 
+DECLARE_EVENT_CLASS(nfsd_file_lruwalk_class,
+	TP_PROTO(
+		unsigned long removed,
+		unsigned long remaining
+	),
+	TP_ARGS(removed, remaining),
+	TP_STRUCT__entry(
+		__field(unsigned long, removed)
+		__field(unsigned long, remaining)
+	),
+	TP_fast_assign(
+		__entry->removed = removed;
+		__entry->remaining = remaining;
+	),
+	TP_printk("%lu entries removed, %lu remaining",
+		__entry->removed, __entry->remaining)
+);
+
+#define DEFINE_NFSD_FILE_LRUWALK_EVENT(name)				\
+DEFINE_EVENT(nfsd_file_lruwalk_class, name,				\
+	TP_PROTO(							\
+		unsigned long removed,					\
+		unsigned long remaining					\
+	),								\
+	TP_ARGS(removed, remaining))
+
+DEFINE_NFSD_FILE_LRUWALK_EVENT(nfsd_file_gc_removed);
+DEFINE_NFSD_FILE_LRUWALK_EVENT(nfsd_file_shrinker_removed);
+
 #include "cache.h"
 
 TRACE_DEFINE_ENUM(RC_DROPIT);



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 10/32] NFSD: Record number of flush calls
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (8 preceding siblings ...)
  2022-07-08 18:24 ` [PATCH v3 09/32] NFSD: Report the number of items evicted by the LRU walk Chuck Lever
@ 2022-07-08 18:24 ` Chuck Lever
  2022-07-08 18:24 ` [PATCH v3 11/32] NFSD: Zero counters when the filecache is re-initialized Chuck Lever
                   ` (22 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:24 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 12f587473913..7b532449b93f 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -46,6 +46,7 @@ static DEFINE_PER_CPU(unsigned long, nfsd_file_cache_hits);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_acquisitions);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_releases);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_total_age);
+static DEFINE_PER_CPU(unsigned long, nfsd_file_pages_flushed);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_evictions);
 
 struct nfsd_fcache_disposal {
@@ -249,7 +250,12 @@ nfsd_file_check_write_error(struct nfsd_file *nf)
 static void
 nfsd_file_flush(struct nfsd_file *nf)
 {
-	if (nf->nf_file && vfs_fsync(nf->nf_file, 1) != 0)
+	struct file *file = nf->nf_file;
+
+	if (!file || !(file->f_mode & FMODE_WRITE))
+		return;
+	this_cpu_add(nfsd_file_pages_flushed, file->f_mapping->nrpages);
+	if (vfs_fsync(file, 1) != 0)
 		nfsd_reset_write_verifier(net_generic(nf->nf_net, nfsd_net_id));
 }
 
@@ -1090,7 +1096,8 @@ nfsd_file_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
  */
 static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 {
-	unsigned long hits = 0, acquisitions = 0, releases = 0, evictions = 0;
+	unsigned long releases = 0, pages_flushed = 0, evictions = 0;
+	unsigned long hits = 0, acquisitions = 0;
 	unsigned int i, count = 0, longest = 0;
 	unsigned long lru = 0, total_age = 0;
 
@@ -1115,6 +1122,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 		releases += per_cpu(nfsd_file_releases, i);
 		total_age += per_cpu(nfsd_file_total_age, i);
 		evictions += per_cpu(nfsd_file_evictions, i);
+		pages_flushed += per_cpu(nfsd_file_pages_flushed, i);
 	}
 
 	seq_printf(m, "total entries: %u\n", count);
@@ -1128,6 +1136,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 		seq_printf(m, "mean age (ms): %ld\n", total_age / releases);
 	else
 		seq_printf(m, "mean age (ms): -\n");
+	seq_printf(m, "pages flushed: %lu\n", pages_flushed);
 	return 0;
 }
 



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 11/32] NFSD: Zero counters when the filecache is re-initialized
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (9 preceding siblings ...)
  2022-07-08 18:24 ` [PATCH v3 10/32] NFSD: Record number of flush calls Chuck Lever
@ 2022-07-08 18:24 ` Chuck Lever
  2022-07-08 18:24 ` [PATCH v3 12/32] NFSD: Hook up the filecache stat file Chuck Lever
                   ` (21 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:24 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

If nfsd_file_cache_init() is called after a shutdown, be sure the
stat counters are reset.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 7b532449b93f..3055a04eeabe 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -830,6 +830,8 @@ nfsd_file_cache_shutdown_net(struct net *net)
 void
 nfsd_file_cache_shutdown(void)
 {
+	int i;
+
 	set_bit(NFSD_FILE_SHUTDOWN, &nfsd_file_lru_flags);
 
 	lease_unregister_notifier(&nfsd_file_lease_notifier);
@@ -853,6 +855,15 @@ nfsd_file_cache_shutdown(void)
 	nfsd_file_hashtbl = NULL;
 	destroy_workqueue(nfsd_filecache_wq);
 	nfsd_filecache_wq = NULL;
+
+	for_each_possible_cpu(i) {
+		per_cpu(nfsd_file_cache_hits, i) = 0;
+		per_cpu(nfsd_file_acquisitions, i) = 0;
+		per_cpu(nfsd_file_releases, i) = 0;
+		per_cpu(nfsd_file_total_age, i) = 0;
+		per_cpu(nfsd_file_pages_flushed, i) = 0;
+		per_cpu(nfsd_file_evictions, i) = 0;
+	}
 }
 
 static bool



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 12/32] NFSD: Hook up the filecache stat file
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (10 preceding siblings ...)
  2022-07-08 18:24 ` [PATCH v3 11/32] NFSD: Zero counters when the filecache is re-initialized Chuck Lever
@ 2022-07-08 18:24 ` Chuck Lever
  2022-07-08 19:16   ` Jeff Layton
  2022-07-08 18:25 ` [PATCH v3 13/32] NFSD: WARN when freeing an item still linked via nf_lru Chuck Lever
                   ` (20 subsequent siblings)
  32 siblings, 1 reply; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:24 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

There has always been the capability of exporting filecache metrics
via /proc, but it was never hooked up. Let's surface these metrics
to enable better observability of the filecache.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfsctl.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 66c352bf61b1..ecc08cf97a86 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -25,6 +25,7 @@
 #include "state.h"
 #include "netns.h"
 #include "pnfs.h"
+#include "filecache.h"
 
 /*
  *	We have a single directory with several nodes in it.
@@ -46,6 +47,7 @@ enum {
 	NFSD_MaxBlkSize,
 	NFSD_MaxConnections,
 	NFSD_SupportedEnctypes,
+	NFSD_Filecache,
 	/*
 	 * The below MUST come last.  Otherwise we leave a hole in nfsd_files[]
 	 * with !CONFIG_NFSD_V4 and simple_fill_super() goes oops
@@ -229,6 +231,13 @@ static const struct file_operations reply_cache_stats_operations = {
 	.release	= single_release,
 };
 
+static const struct file_operations filecache_ops = {
+	.open		= nfsd_file_cache_stats_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
 /*----------------------------------------------------------------------------*/
 /*
  * payload - write methods
@@ -1370,6 +1379,7 @@ static int nfsd_fill_super(struct super_block *sb, struct fs_context *fc)
 		[NFSD_Ports] = {"portlist", &transaction_ops, S_IWUSR|S_IRUGO},
 		[NFSD_MaxBlkSize] = {"max_block_size", &transaction_ops, S_IWUSR|S_IRUGO},
 		[NFSD_MaxConnections] = {"max_connections", &transaction_ops, S_IWUSR|S_IRUGO},
+		[NFSD_Filecache] = {"filecache", &filecache_ops, S_IRUGO},
 #if defined(CONFIG_SUNRPC_GSS) || defined(CONFIG_SUNRPC_GSS_MODULE)
 		[NFSD_SupportedEnctypes] = {"supported_krb5_enctypes", &supported_enctypes_ops, S_IRUGO},
 #endif /* CONFIG_SUNRPC_GSS or CONFIG_SUNRPC_GSS_MODULE */



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 13/32] NFSD: WARN when freeing an item still linked via nf_lru
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (11 preceding siblings ...)
  2022-07-08 18:24 ` [PATCH v3 12/32] NFSD: Hook up the filecache stat file Chuck Lever
@ 2022-07-08 18:25 ` Chuck Lever
  2022-07-08 18:25 ` [PATCH v3 14/32] NFSD: Trace filecache LRU activity Chuck Lever
                   ` (19 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:25 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Add a guardrail to prevent freeing memory that is still on a list.
This includes either a dispose list or the LRU list.

This is the sign of a bug, but this class of bugs can be detected
so that they don't endanger system stability, especially while
debugging.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 3055a04eeabe..8ade3699664c 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -220,6 +220,14 @@ nfsd_file_free(struct nfsd_file *nf)
 		fput(nf->nf_file);
 		flush = true;
 	}
+
+	/*
+	 * If this item is still linked via nf_lru, that's a bug.
+	 * WARN and leak it to preserve system stability.
+	 */
+	if (WARN_ON_ONCE(!list_empty(&nf->nf_lru)))
+		return flush;
+
 	call_rcu(&nf->nf_rcu, nfsd_file_slab_free);
 	return flush;
 }
@@ -349,7 +357,7 @@ nfsd_file_dispose_list(struct list_head *dispose)
 
 	while(!list_empty(dispose)) {
 		nf = list_first_entry(dispose, struct nfsd_file, nf_lru);
-		list_del(&nf->nf_lru);
+		list_del_init(&nf->nf_lru);
 		nfsd_file_flush(nf);
 		nfsd_file_put_noref(nf);
 	}
@@ -363,7 +371,7 @@ nfsd_file_dispose_list_sync(struct list_head *dispose)
 
 	while(!list_empty(dispose)) {
 		nf = list_first_entry(dispose, struct nfsd_file, nf_lru);
-		list_del(&nf->nf_lru);
+		list_del_init(&nf->nf_lru);
 		nfsd_file_flush(nf);
 		if (!refcount_dec_and_test(&nf->nf_ref))
 			continue;



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 14/32] NFSD: Trace filecache LRU activity
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (12 preceding siblings ...)
  2022-07-08 18:25 ` [PATCH v3 13/32] NFSD: WARN when freeing an item still linked via nf_lru Chuck Lever
@ 2022-07-08 18:25 ` Chuck Lever
  2022-07-08 18:25 ` [PATCH v3 15/32] NFSD: Leave open files out of the filecache LRU Chuck Lever
                   ` (18 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:25 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Observe the operation of garbage collection and the lifetime of
filecache items.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   44 +++++++++++++++++++++++++++++++-------------
 fs/nfsd/trace.h     |   39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+), 13 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 8ade3699664c..37373b012276 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -267,6 +267,18 @@ nfsd_file_flush(struct nfsd_file *nf)
 		nfsd_reset_write_verifier(net_generic(nf->nf_net, nfsd_net_id));
 }
 
+static void nfsd_file_lru_add(struct nfsd_file *nf)
+{
+	if (list_lru_add(&nfsd_file_lru, &nf->nf_lru))
+		trace_nfsd_file_lru_add(nf);
+}
+
+static void nfsd_file_lru_remove(struct nfsd_file *nf)
+{
+	if (list_lru_del(&nfsd_file_lru, &nf->nf_lru))
+		trace_nfsd_file_lru_del(nf);
+}
+
 static void
 nfsd_file_do_unhash(struct nfsd_file *nf)
 {
@@ -286,8 +298,7 @@ nfsd_file_unhash(struct nfsd_file *nf)
 {
 	if (test_and_clear_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
 		nfsd_file_do_unhash(nf);
-		if (!list_empty(&nf->nf_lru))
-			list_lru_del(&nfsd_file_lru, &nf->nf_lru);
+		nfsd_file_lru_remove(nf);
 		return true;
 	}
 	return false;
@@ -450,27 +461,34 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
 	 * counter. Here we check the counter and then test and clear the flag.
 	 * That order is deliberate to ensure that we can do this locklessly.
 	 */
-	if (refcount_read(&nf->nf_ref) > 1)
-		goto out_skip;
+	if (refcount_read(&nf->nf_ref) > 1) {
+		trace_nfsd_file_gc_in_use(nf);
+		return LRU_SKIP;
+	}
 
 	/*
 	 * Don't throw out files that are still undergoing I/O or
 	 * that have uncleared errors pending.
 	 */
-	if (nfsd_file_check_writeback(nf))
-		goto out_skip;
+	if (nfsd_file_check_writeback(nf)) {
+		trace_nfsd_file_gc_writeback(nf);
+		return LRU_SKIP;
+	}
 
-	if (test_and_clear_bit(NFSD_FILE_REFERENCED, &nf->nf_flags))
-		goto out_skip;
+	if (test_and_clear_bit(NFSD_FILE_REFERENCED, &nf->nf_flags)) {
+		trace_nfsd_file_gc_referenced(nf);
+		return LRU_SKIP;
+	}
 
-	if (!test_and_clear_bit(NFSD_FILE_HASHED, &nf->nf_flags))
-		goto out_skip;
+	if (!test_and_clear_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
+		trace_nfsd_file_gc_hashed(nf);
+		return LRU_SKIP;
+	}
 
 	list_lru_isolate_move(lru, &nf->nf_lru, head);
 	this_cpu_inc(nfsd_file_evictions);
+	trace_nfsd_file_gc_disposed(nf);
 	return LRU_REMOVED;
-out_skip:
-	return LRU_SKIP;
 }
 
 /*
@@ -1037,7 +1055,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	refcount_inc(&nf->nf_ref);
 	__set_bit(NFSD_FILE_HASHED, &nf->nf_flags);
 	__set_bit(NFSD_FILE_PENDING, &nf->nf_flags);
-	list_lru_add(&nfsd_file_lru, &nf->nf_lru);
+	nfsd_file_lru_add(nf);
 	hlist_add_head_rcu(&nf->nf_node, &nfsd_file_hashtbl[hashval].nfb_head);
 	++nfsd_file_hashtbl[hashval].nfb_count;
 	nfsd_file_hashtbl[hashval].nfb_maxcount = max(nfsd_file_hashtbl[hashval].nfb_maxcount,
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 59dc5b2f4a50..1cc1133371eb 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -897,6 +897,45 @@ TRACE_EVENT(nfsd_file_fsnotify_handle_event,
 			__entry->nlink, __entry->mode, __entry->mask)
 );
 
+DECLARE_EVENT_CLASS(nfsd_file_gc_class,
+	TP_PROTO(
+		const struct nfsd_file *nf
+	),
+	TP_ARGS(nf),
+	TP_STRUCT__entry(
+		__field(void *, nf_inode)
+		__field(void *, nf_file)
+		__field(int, nf_ref)
+		__field(unsigned long, nf_flags)
+	),
+	TP_fast_assign(
+		__entry->nf_inode = nf->nf_inode;
+		__entry->nf_file = nf->nf_file;
+		__entry->nf_ref = refcount_read(&nf->nf_ref);
+		__entry->nf_flags = nf->nf_flags;
+	),
+	TP_printk("inode=%p ref=%d nf_flags=%s nf_file=%p",
+		__entry->nf_inode, __entry->nf_ref,
+		show_nf_flags(__entry->nf_flags),
+		__entry->nf_file
+	)
+);
+
+#define DEFINE_NFSD_FILE_GC_EVENT(name)					\
+DEFINE_EVENT(nfsd_file_gc_class, name,					\
+	TP_PROTO(							\
+		const struct nfsd_file *nf				\
+	),								\
+	TP_ARGS(nf))
+
+DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_add);
+DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_del);
+DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_in_use);
+DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_writeback);
+DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_referenced);
+DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_hashed);
+DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_disposed);
+
 DECLARE_EVENT_CLASS(nfsd_file_lruwalk_class,
 	TP_PROTO(
 		unsigned long removed,



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 15/32] NFSD: Leave open files out of the filecache LRU
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (13 preceding siblings ...)
  2022-07-08 18:25 ` [PATCH v3 14/32] NFSD: Trace filecache LRU activity Chuck Lever
@ 2022-07-08 18:25 ` Chuck Lever
  2022-07-08 19:29   ` Jeff Layton
  2022-07-08 18:25 ` [PATCH v3 16/32] NFSD: Fix the filecache LRU shrinker Chuck Lever
                   ` (17 subsequent siblings)
  32 siblings, 1 reply; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:25 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

There have been reports of problems when running fstests generic/531
against Linux NFS servers with NFSv4. The NFS server that hosts the
test's SCRATCH_DEV suffers from CPU soft lock-ups during the test.
Analysis shows that:

fs/nfsd/filecache.c
 482                 ret = list_lru_walk(&nfsd_file_lru,
 483                                 nfsd_file_lru_cb,
 484                                 &head, LONG_MAX);

causes nfsd_file_gc() to walk the entire length of the filecache LRU
list every time it is called (which is quite frequently). The walk
holds a spinlock the entire time that prevents other nfsd threads
from accessing the filecache.

What's more, for NFSv4 workloads, none of the items that are visited
during this walk may be evicted, since they are all files that are
held OPEN by NFS clients.

Address this by ensuring that open files are not kept on the LRU
list.

Reported-by: Frank van der Linden <fllinden@amazon.com>
Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386
Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   24 +++++++++++++++++++-----
 fs/nfsd/trace.h     |    2 ++
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 37373b012276..6e9e186334ab 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -269,6 +269,7 @@ nfsd_file_flush(struct nfsd_file *nf)
 
 static void nfsd_file_lru_add(struct nfsd_file *nf)
 {
+	set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags);
 	if (list_lru_add(&nfsd_file_lru, &nf->nf_lru))
 		trace_nfsd_file_lru_add(nf);
 }
@@ -298,7 +299,6 @@ nfsd_file_unhash(struct nfsd_file *nf)
 {
 	if (test_and_clear_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
 		nfsd_file_do_unhash(nf);
-		nfsd_file_lru_remove(nf);
 		return true;
 	}
 	return false;
@@ -319,6 +319,7 @@ nfsd_file_unhash_and_release_locked(struct nfsd_file *nf, struct list_head *disp
 	if (refcount_dec_not_one(&nf->nf_ref))
 		return true;
 
+	nfsd_file_lru_remove(nf);
 	list_add(&nf->nf_lru, dispose);
 	return true;
 }
@@ -330,6 +331,7 @@ nfsd_file_put_noref(struct nfsd_file *nf)
 
 	if (refcount_dec_and_test(&nf->nf_ref)) {
 		WARN_ON(test_bit(NFSD_FILE_HASHED, &nf->nf_flags));
+		nfsd_file_lru_remove(nf);
 		nfsd_file_free(nf);
 	}
 }
@@ -339,7 +341,7 @@ nfsd_file_put(struct nfsd_file *nf)
 {
 	might_sleep();
 
-	set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags);
+	nfsd_file_lru_add(nf);
 	if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags) == 0) {
 		nfsd_file_flush(nf);
 		nfsd_file_put_noref(nf);
@@ -439,8 +441,18 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
 	}
 }
 
-/*
+/**
+ * nfsd_file_lru_cb - Examine an entry on the LRU list
+ * @item: LRU entry to examine
+ * @lru: controlling LRU
+ * @lock: LRU list lock (unused)
+ * @arg: dispose list
+ *
  * Note this can deadlock with nfsd_file_cache_purge.
+ *
+ * Return values:
+ *   %LRU_REMOVED: @item was removed from the LRU
+ *   %LRU_SKIP: @item cannot be evicted
  */
 static enum lru_status
 nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
@@ -462,8 +474,9 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
 	 * That order is deliberate to ensure that we can do this locklessly.
 	 */
 	if (refcount_read(&nf->nf_ref) > 1) {
+		list_lru_isolate(lru, &nf->nf_lru);
 		trace_nfsd_file_gc_in_use(nf);
-		return LRU_SKIP;
+		return LRU_REMOVED;
 	}
 
 	/*
@@ -1020,6 +1033,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		goto retry;
 	}
 
+	nfsd_file_lru_remove(nf);
 	this_cpu_inc(nfsd_file_cache_hits);
 
 	if (!(may_flags & NFSD_MAY_NOT_BREAK_LEASE)) {
@@ -1055,7 +1069,6 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	refcount_inc(&nf->nf_ref);
 	__set_bit(NFSD_FILE_HASHED, &nf->nf_flags);
 	__set_bit(NFSD_FILE_PENDING, &nf->nf_flags);
-	nfsd_file_lru_add(nf);
 	hlist_add_head_rcu(&nf->nf_node, &nfsd_file_hashtbl[hashval].nfb_head);
 	++nfsd_file_hashtbl[hashval].nfb_count;
 	nfsd_file_hashtbl[hashval].nfb_maxcount = max(nfsd_file_hashtbl[hashval].nfb_maxcount,
@@ -1080,6 +1093,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	 */
 	if (status != nfs_ok || inode->i_nlink == 0) {
 		bool do_free;
+		nfsd_file_lru_remove(nf);
 		spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
 		do_free = nfsd_file_unhash(nf);
 		spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 1cc1133371eb..54082b868b72 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -929,7 +929,9 @@ DEFINE_EVENT(nfsd_file_gc_class, name,					\
 	TP_ARGS(nf))
 
 DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_add);
+DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_add_disposed);
 DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_del);
+DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_del_disposed);
 DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_in_use);
 DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_writeback);
 DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_referenced);



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 16/32] NFSD: Fix the filecache LRU shrinker
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (14 preceding siblings ...)
  2022-07-08 18:25 ` [PATCH v3 15/32] NFSD: Leave open files out of the filecache LRU Chuck Lever
@ 2022-07-08 18:25 ` Chuck Lever
  2022-07-08 19:37   ` Jeff Layton
  2022-07-08 18:25 ` [PATCH v3 17/32] NFSD: Never call nfsd_file_gc() in foreground paths Chuck Lever
                   ` (16 subsequent siblings)
  32 siblings, 1 reply; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:25 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Without LRU item rotation, the shrinker visits only a few items on
the end of the LRU list, and those would always be long-term OPEN
files for NFSv4 workloads. That makes the filecache shrinker
completely ineffective.

Adopt the same strategy as the inode LRU by using LRU_ROTATE.

Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 6e9e186334ab..bd6ba63f69ae 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -452,6 +452,7 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
  *
  * Return values:
  *   %LRU_REMOVED: @item was removed from the LRU
+ *   %LRU_ROTATED: @item is to be moved to the LRU tail
  *   %LRU_SKIP: @item cannot be evicted
  */
 static enum lru_status
@@ -490,7 +491,7 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
 
 	if (test_and_clear_bit(NFSD_FILE_REFERENCED, &nf->nf_flags)) {
 		trace_nfsd_file_gc_referenced(nf);
-		return LRU_SKIP;
+		return LRU_ROTATE;
 	}
 
 	if (!test_and_clear_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
@@ -532,7 +533,7 @@ nfsd_file_gc(void)
 	unsigned long ret;
 
 	ret = list_lru_walk(&nfsd_file_lru, nfsd_file_lru_cb,
-			    &dispose, LONG_MAX);
+			    &dispose, list_lru_count(&nfsd_file_lru));
 	trace_nfsd_file_gc_removed(ret, list_lru_count(&nfsd_file_lru));
 	nfsd_file_gc_dispose_list(&dispose);
 }



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 17/32] NFSD: Never call nfsd_file_gc() in foreground paths
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (15 preceding siblings ...)
  2022-07-08 18:25 ` [PATCH v3 16/32] NFSD: Fix the filecache LRU shrinker Chuck Lever
@ 2022-07-08 18:25 ` Chuck Lever
  2022-07-08 19:43   ` Jeff Layton
  2022-07-08 18:25 ` [PATCH v3 18/32] NFSD: No longer record nf_hashval in the trace log Chuck Lever
                   ` (15 subsequent siblings)
  32 siblings, 1 reply; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:25 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

The checks in nfsd_file_acquire() and nfsd_file_put() that directly
invoke filecache garbage collection are intended to keep cache
occupancy between a low- and high-watermark. The reason to limit the
capacity of the filecache is to keep filecache lookups reasonably
fast.

However, invoking garbage collection at those points has some
undesirable negative impacts. Files that are held open by NFSv4
clients often push the occupancy of the filecache over these
watermarks. At that point:

- Every call to nfsd_file_acquire() and nfsd_file_put() results in
  an LRU walk. This has the same effect on lookup latency as long
  chains in the hash table.
- Garbage collection will then run on every nfsd thread, causing a
  lot of unnecessary lock contention.
- Limiting cache capacity pushes out files used only by NFSv3
  clients, which are the type of files the filecache is supposed to
  help.

To address those negative impacts, remove the direct calls to the
garbage collector. Subsequent patches will address maintaining
lookup efficiency as cache capacity increases.

Suggested-by: Wang Yugui <wangyugui@e16-tech.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index bd6ba63f69ae..faa8588663d6 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -29,8 +29,6 @@
 #define NFSD_LAUNDRETTE_DELAY		     (2 * HZ)
 
 #define NFSD_FILE_SHUTDOWN		     (1)
-#define NFSD_FILE_LRU_THRESHOLD		     (4096UL)
-#define NFSD_FILE_LRU_LIMIT		     (NFSD_FILE_LRU_THRESHOLD << 2)
 
 /* We only care about NFSD_MAY_READ/WRITE for this cache */
 #define NFSD_FILE_MAY_MASK	(NFSD_MAY_READ|NFSD_MAY_WRITE)
@@ -66,8 +64,6 @@ static struct fsnotify_group		*nfsd_file_fsnotify_group;
 static atomic_long_t			nfsd_filecache_count;
 static struct delayed_work		nfsd_filecache_laundrette;
 
-static void nfsd_file_gc(void);
-
 static void
 nfsd_file_schedule_laundrette(void)
 {
@@ -350,9 +346,6 @@ nfsd_file_put(struct nfsd_file *nf)
 		nfsd_file_schedule_laundrette();
 	} else
 		nfsd_file_put_noref(nf);
-
-	if (atomic_long_read(&nfsd_filecache_count) >= NFSD_FILE_LRU_LIMIT)
-		nfsd_file_gc();
 }
 
 struct nfsd_file *
@@ -1075,8 +1068,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	nfsd_file_hashtbl[hashval].nfb_maxcount = max(nfsd_file_hashtbl[hashval].nfb_maxcount,
 			nfsd_file_hashtbl[hashval].nfb_count);
 	spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
-	if (atomic_long_inc_return(&nfsd_filecache_count) >= NFSD_FILE_LRU_THRESHOLD)
-		nfsd_file_gc();
+	atomic_long_inc(&nfsd_filecache_count);
 
 	nf->nf_mark = nfsd_file_mark_find_or_create(nf);
 	if (nf->nf_mark) {



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 18/32] NFSD: No longer record nf_hashval in the trace log
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (16 preceding siblings ...)
  2022-07-08 18:25 ` [PATCH v3 17/32] NFSD: Never call nfsd_file_gc() in foreground paths Chuck Lever
@ 2022-07-08 18:25 ` Chuck Lever
  2022-07-08 18:25 ` [PATCH v3 19/32] NFSD: Remove lockdep assertion from unhash_and_release_locked() Chuck Lever
                   ` (14 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:25 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

I'm about to replace nfsd_file_hashtbl with an rhashtable. The
individual hash values will no longer be visible or relevant, so
remove them from the tracepoints.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   15 ++++++++-------
 fs/nfsd/trace.h     |   45 +++++++++++++++++++++------------------------
 2 files changed, 29 insertions(+), 31 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index faa8588663d6..32ada8cce2e0 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -595,7 +595,7 @@ nfsd_file_close_inode_sync(struct inode *inode)
 	LIST_HEAD(dispose);
 
 	__nfsd_file_close_inode(inode, hashval, &dispose);
-	trace_nfsd_file_close_inode_sync(inode, hashval, !list_empty(&dispose));
+	trace_nfsd_file_close_inode_sync(inode, !list_empty(&dispose));
 	nfsd_file_dispose_list_sync(&dispose);
 }
 
@@ -615,7 +615,7 @@ nfsd_file_close_inode(struct inode *inode)
 	LIST_HEAD(dispose);
 
 	__nfsd_file_close_inode(inode, hashval, &dispose);
-	trace_nfsd_file_close_inode(inode, hashval, !list_empty(&dispose));
+	trace_nfsd_file_close_inode(inode, !list_empty(&dispose));
 	nfsd_file_dispose_list_delayed(&dispose);
 }
 
@@ -969,7 +969,7 @@ nfsd_file_is_cached(struct inode *inode)
 		}
 	}
 	rcu_read_unlock();
-	trace_nfsd_file_is_cached(inode, hashval, (int)ret);
+	trace_nfsd_file_is_cached(inode, (int)ret);
 	return ret;
 }
 
@@ -1001,9 +1001,8 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 
 	new = nfsd_file_alloc(inode, may_flags, hashval, net);
 	if (!new) {
-		trace_nfsd_file_acquire(rqstp, hashval, inode, may_flags,
-					NULL, nfserr_jukebox);
-		return nfserr_jukebox;
+		status = nfserr_jukebox;
+		goto out_status;
 	}
 
 	spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
@@ -1055,8 +1054,10 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		nf = NULL;
 	}
 
-	trace_nfsd_file_acquire(rqstp, hashval, inode, may_flags, nf, status);
+out_status:
+	trace_nfsd_file_acquire(rqstp, inode, may_flags, nf, status);
 	return status;
+
 open_file:
 	nf = new;
 	/* Take reference for the hashtable */
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 54082b868b72..bb5a17ccf2cd 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -750,7 +750,6 @@ DECLARE_EVENT_CLASS(nfsd_file_class,
 	TP_PROTO(struct nfsd_file *nf),
 	TP_ARGS(nf),
 	TP_STRUCT__entry(
-		__field(unsigned int, nf_hashval)
 		__field(void *, nf_inode)
 		__field(int, nf_ref)
 		__field(unsigned long, nf_flags)
@@ -758,15 +757,13 @@ DECLARE_EVENT_CLASS(nfsd_file_class,
 		__field(struct file *, nf_file)
 	),
 	TP_fast_assign(
-		__entry->nf_hashval = nf->nf_hashval;
 		__entry->nf_inode = nf->nf_inode;
 		__entry->nf_ref = refcount_read(&nf->nf_ref);
 		__entry->nf_flags = nf->nf_flags;
 		__entry->nf_may = nf->nf_may;
 		__entry->nf_file = nf->nf_file;
 	),
-	TP_printk("hash=0x%x inode=%p ref=%d flags=%s may=%s file=%p",
-		__entry->nf_hashval,
+	TP_printk("inode=%p ref=%d flags=%s may=%s nf_file=%p",
 		__entry->nf_inode,
 		__entry->nf_ref,
 		show_nf_flags(__entry->nf_flags),
@@ -786,15 +783,18 @@ DEFINE_NFSD_FILE_EVENT(nfsd_file_put);
 DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash_and_release_locked);
 
 TRACE_EVENT(nfsd_file_acquire,
-	TP_PROTO(struct svc_rqst *rqstp, unsigned int hash,
-		 struct inode *inode, unsigned int may_flags,
-		 struct nfsd_file *nf, __be32 status),
+	TP_PROTO(
+		struct svc_rqst *rqstp,
+		struct inode *inode,
+		unsigned int may_flags,
+		struct nfsd_file *nf,
+		__be32 status
+	),
 
-	TP_ARGS(rqstp, hash, inode, may_flags, nf, status),
+	TP_ARGS(rqstp, inode, may_flags, nf, status),
 
 	TP_STRUCT__entry(
 		__field(u32, xid)
-		__field(unsigned int, hash)
 		__field(void *, inode)
 		__field(unsigned long, may_flags)
 		__field(int, nf_ref)
@@ -806,7 +806,6 @@ TRACE_EVENT(nfsd_file_acquire,
 
 	TP_fast_assign(
 		__entry->xid = be32_to_cpu(rqstp->rq_xid);
-		__entry->hash = hash;
 		__entry->inode = inode;
 		__entry->may_flags = may_flags;
 		__entry->nf_ref = nf ? refcount_read(&nf->nf_ref) : 0;
@@ -816,8 +815,8 @@ TRACE_EVENT(nfsd_file_acquire,
 		__entry->status = be32_to_cpu(status);
 	),
 
-	TP_printk("xid=0x%x hash=0x%x inode=%p may_flags=%s ref=%d nf_flags=%s nf_may=%s nf_file=%p status=%u",
-			__entry->xid, __entry->hash, __entry->inode,
+	TP_printk("xid=0x%x inode=%p may_flags=%s ref=%d nf_flags=%s nf_may=%s nf_file=%p status=%u",
+			__entry->xid, __entry->inode,
 			show_nfsd_may_flags(__entry->may_flags),
 			__entry->nf_ref, show_nf_flags(__entry->nf_flags),
 			show_nfsd_may_flags(__entry->nf_may),
@@ -828,7 +827,6 @@ TRACE_EVENT(nfsd_file_open,
 	TP_PROTO(struct nfsd_file *nf, __be32 status),
 	TP_ARGS(nf, status),
 	TP_STRUCT__entry(
-		__field(unsigned int, nf_hashval)
 		__field(void *, nf_inode)	/* cannot be dereferenced */
 		__field(int, nf_ref)
 		__field(unsigned long, nf_flags)
@@ -836,15 +834,13 @@ TRACE_EVENT(nfsd_file_open,
 		__field(void *, nf_file)	/* cannot be dereferenced */
 	),
 	TP_fast_assign(
-		__entry->nf_hashval = nf->nf_hashval;
 		__entry->nf_inode = nf->nf_inode;
 		__entry->nf_ref = refcount_read(&nf->nf_ref);
 		__entry->nf_flags = nf->nf_flags;
 		__entry->nf_may = nf->nf_may;
 		__entry->nf_file = nf->nf_file;
 	),
-	TP_printk("hash=0x%x inode=%p ref=%d flags=%s may=%s file=%p",
-		__entry->nf_hashval,
+	TP_printk("inode=%p ref=%d flags=%s may=%s file=%p",
 		__entry->nf_inode,
 		__entry->nf_ref,
 		show_nf_flags(__entry->nf_flags),
@@ -853,26 +849,27 @@ TRACE_EVENT(nfsd_file_open,
 )
 
 DECLARE_EVENT_CLASS(nfsd_file_search_class,
-	TP_PROTO(struct inode *inode, unsigned int hash, int found),
-	TP_ARGS(inode, hash, found),
+	TP_PROTO(
+		struct inode *inode,
+		int found
+	),
+	TP_ARGS(inode, found),
 	TP_STRUCT__entry(
 		__field(struct inode *, inode)
-		__field(unsigned int, hash)
 		__field(int, found)
 	),
 	TP_fast_assign(
 		__entry->inode = inode;
-		__entry->hash = hash;
 		__entry->found = found;
 	),
-	TP_printk("hash=0x%x inode=%p found=%d", __entry->hash,
-			__entry->inode, __entry->found)
+	TP_printk("inode=%p found=%d",
+		__entry->inode, __entry->found)
 );
 
 #define DEFINE_NFSD_FILE_SEARCH_EVENT(name)				\
 DEFINE_EVENT(nfsd_file_search_class, name,				\
-	TP_PROTO(struct inode *inode, unsigned int hash, int found),	\
-	TP_ARGS(inode, hash, found))
+	TP_PROTO(struct inode *inode, int found),			\
+	TP_ARGS(inode, found))
 
 DEFINE_NFSD_FILE_SEARCH_EVENT(nfsd_file_close_inode_sync);
 DEFINE_NFSD_FILE_SEARCH_EVENT(nfsd_file_close_inode);



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 19/32] NFSD: Remove lockdep assertion from unhash_and_release_locked()
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (17 preceding siblings ...)
  2022-07-08 18:25 ` [PATCH v3 18/32] NFSD: No longer record nf_hashval in the trace log Chuck Lever
@ 2022-07-08 18:25 ` Chuck Lever
  2022-07-08 18:25 ` [PATCH v3 20/32] NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode Chuck Lever
                   ` (13 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:25 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

IIUC, holding the hash bucket lock is needed only in
nfsd_file_unhash, and there is already a lockdep assertion there.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 32ada8cce2e0..278a13d85e8f 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -306,8 +306,6 @@ nfsd_file_unhash(struct nfsd_file *nf)
 static bool
 nfsd_file_unhash_and_release_locked(struct nfsd_file *nf, struct list_head *dispose)
 {
-	lockdep_assert_held(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
-
 	trace_nfsd_file_unhash_and_release_locked(nf);
 	if (!nfsd_file_unhash(nf))
 		return false;



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 20/32] NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (18 preceding siblings ...)
  2022-07-08 18:25 ` [PATCH v3 19/32] NFSD: Remove lockdep assertion from unhash_and_release_locked() Chuck Lever
@ 2022-07-08 18:25 ` Chuck Lever
  2022-07-08 18:25 ` [PATCH v3 21/32] NFSD: Refactor __nfsd_file_close_inode() Chuck Lever
                   ` (12 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:25 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Remove an unnecessary usage of nf_hashval.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 278a13d85e8f..4143898fff37 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -279,13 +279,17 @@ static void nfsd_file_lru_remove(struct nfsd_file *nf)
 static void
 nfsd_file_do_unhash(struct nfsd_file *nf)
 {
-	lockdep_assert_held(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
+	struct inode *inode = nf->nf_inode;
+	unsigned int hashval = (unsigned int)hash_long(inode->i_ino,
+				NFSD_FILE_HASH_BITS);
+
+	lockdep_assert_held(&nfsd_file_hashtbl[hashval].nfb_lock);
 
 	trace_nfsd_file_unhash(nf);
 
 	if (nfsd_file_check_write_error(nf))
 		nfsd_reset_write_verifier(net_generic(nf->nf_net, nfsd_net_id));
-	--nfsd_file_hashtbl[nf->nf_hashval].nfb_count;
+	--nfsd_file_hashtbl[hashval].nfb_count;
 	hlist_del_rcu(&nf->nf_node);
 	atomic_long_dec(&nfsd_filecache_count);
 }



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 21/32] NFSD: Refactor __nfsd_file_close_inode()
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (19 preceding siblings ...)
  2022-07-08 18:25 ` [PATCH v3 20/32] NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode Chuck Lever
@ 2022-07-08 18:25 ` Chuck Lever
  2022-07-08 18:26 ` [PATCH v3 22/32] NFSD: nfsd_file_hash_remove can compute hashval Chuck Lever
                   ` (11 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:25 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

The code that computes the hashval is the same in both callers.

To prevent them from going stale, reframe the documenting comments
to remove descriptions of the underlying hash table structure, which
is about to be replaced.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   40 +++++++++++++++++++++-------------------
 fs/nfsd/trace.h     |   44 +++++++++++++++++++++++++++++++++-----------
 2 files changed, 54 insertions(+), 30 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 4143898fff37..26d10e3c4e70 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -565,39 +565,44 @@ static struct shrinker	nfsd_file_shrinker = {
 	.seeks = 1,
 };
 
-static void
-__nfsd_file_close_inode(struct inode *inode, unsigned int hashval,
-			struct list_head *dispose)
+/*
+ * Find all cache items across all net namespaces that match @inode and
+ * move them to @dispose. The lookup is atomic wrt nfsd_file_acquire().
+ */
+static unsigned int
+__nfsd_file_close_inode(struct inode *inode, struct list_head *dispose)
 {
+	unsigned int		hashval = (unsigned int)hash_long(inode->i_ino,
+						NFSD_FILE_HASH_BITS);
+	unsigned int		count = 0;
 	struct nfsd_file	*nf;
 	struct hlist_node	*tmp;
 
 	spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
 	hlist_for_each_entry_safe(nf, tmp, &nfsd_file_hashtbl[hashval].nfb_head, nf_node) {
-		if (inode == nf->nf_inode)
+		if (inode == nf->nf_inode) {
 			nfsd_file_unhash_and_release_locked(nf, dispose);
+			count++;
+		}
 	}
 	spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
+	return count;
 }
 
 /**
  * nfsd_file_close_inode_sync - attempt to forcibly close a nfsd_file
  * @inode: inode of the file to attempt to remove
  *
- * Walk the whole hash bucket, looking for any files that correspond to "inode".
- * If any do, then unhash them and put the hashtable reference to them and
- * destroy any that had their last reference put. Also ensure that any of the
- * fputs also have their final __fput done as well.
+ * Unhash and put, then flush and fput all cache items associated with @inode.
  */
 void
 nfsd_file_close_inode_sync(struct inode *inode)
 {
-	unsigned int		hashval = (unsigned int)hash_long(inode->i_ino,
-						NFSD_FILE_HASH_BITS);
 	LIST_HEAD(dispose);
+	unsigned int count;
 
-	__nfsd_file_close_inode(inode, hashval, &dispose);
-	trace_nfsd_file_close_inode_sync(inode, !list_empty(&dispose));
+	count = __nfsd_file_close_inode(inode, &dispose);
+	trace_nfsd_file_close_inode_sync(inode, count);
 	nfsd_file_dispose_list_sync(&dispose);
 }
 
@@ -605,19 +610,16 @@ nfsd_file_close_inode_sync(struct inode *inode)
  * nfsd_file_close_inode - attempt a delayed close of a nfsd_file
  * @inode: inode of the file to attempt to remove
  *
- * Walk the whole hash bucket, looking for any files that correspond to "inode".
- * If any do, then unhash them and put the hashtable reference to them and
- * destroy any that had their last reference put.
+ * Unhash and put all cache item associated with @inode.
  */
 static void
 nfsd_file_close_inode(struct inode *inode)
 {
-	unsigned int		hashval = (unsigned int)hash_long(inode->i_ino,
-						NFSD_FILE_HASH_BITS);
 	LIST_HEAD(dispose);
+	unsigned int count;
 
-	__nfsd_file_close_inode(inode, hashval, &dispose);
-	trace_nfsd_file_close_inode(inode, !list_empty(&dispose));
+	count = __nfsd_file_close_inode(inode, &dispose);
+	trace_nfsd_file_close_inode(inode, count);
 	nfsd_file_dispose_list_delayed(&dispose);
 }
 
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index bb5a17ccf2cd..af609590ac86 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -850,30 +850,52 @@ TRACE_EVENT(nfsd_file_open,
 
 DECLARE_EVENT_CLASS(nfsd_file_search_class,
 	TP_PROTO(
-		struct inode *inode,
-		int found
+		const struct inode *inode,
+		unsigned int count
 	),
-	TP_ARGS(inode, found),
+	TP_ARGS(inode, count),
 	TP_STRUCT__entry(
-		__field(struct inode *, inode)
-		__field(int, found)
+		__field(const struct inode *, inode)
+		__field(unsigned int, count)
 	),
 	TP_fast_assign(
 		__entry->inode = inode;
-		__entry->found = found;
+		__entry->count = count;
 	),
-	TP_printk("inode=%p found=%d",
-		__entry->inode, __entry->found)
+	TP_printk("inode=%p count=%u",
+		__entry->inode, __entry->count)
 );
 
 #define DEFINE_NFSD_FILE_SEARCH_EVENT(name)				\
 DEFINE_EVENT(nfsd_file_search_class, name,				\
-	TP_PROTO(struct inode *inode, int found),			\
-	TP_ARGS(inode, found))
+	TP_PROTO(							\
+		const struct inode *inode,				\
+		unsigned int count					\
+	),								\
+	TP_ARGS(inode, count))
 
 DEFINE_NFSD_FILE_SEARCH_EVENT(nfsd_file_close_inode_sync);
 DEFINE_NFSD_FILE_SEARCH_EVENT(nfsd_file_close_inode);
-DEFINE_NFSD_FILE_SEARCH_EVENT(nfsd_file_is_cached);
+
+TRACE_EVENT(nfsd_file_is_cached,
+	TP_PROTO(
+		const struct inode *inode,
+		int found
+	),
+	TP_ARGS(inode, found),
+	TP_STRUCT__entry(
+		__field(const struct inode *, inode)
+		__field(int, found)
+	),
+	TP_fast_assign(
+		__entry->inode = inode;
+		__entry->found = found;
+	),
+	TP_printk("inode=%p is %scached",
+		__entry->inode,
+		__entry->found ? "" : "not "
+	)
+);
 
 TRACE_EVENT(nfsd_file_fsnotify_handle_event,
 	TP_PROTO(struct inode *inode, u32 mask),



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 22/32] NFSD: nfsd_file_hash_remove can compute hashval
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (20 preceding siblings ...)
  2022-07-08 18:25 ` [PATCH v3 21/32] NFSD: Refactor __nfsd_file_close_inode() Chuck Lever
@ 2022-07-08 18:26 ` Chuck Lever
  2022-07-08 18:26 ` [PATCH v3 23/32] NFSD: Remove nfsd_file::nf_hashval Chuck Lever
                   ` (10 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:26 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Remove an unnecessary use of nf_hashval.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 26d10e3c4e70..490e7d7da7d2 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -294,6 +294,18 @@ nfsd_file_do_unhash(struct nfsd_file *nf)
 	atomic_long_dec(&nfsd_filecache_count);
 }
 
+static void
+nfsd_file_hash_remove(struct nfsd_file *nf)
+{
+	struct inode *inode = nf->nf_inode;
+	unsigned int hashval = (unsigned int)hash_long(inode->i_ino,
+				NFSD_FILE_HASH_BITS);
+
+	spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
+	nfsd_file_do_unhash(nf);
+	spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
+}
+
 static bool
 nfsd_file_unhash(struct nfsd_file *nf)
 {
@@ -513,11 +525,8 @@ static void nfsd_file_gc_dispose_list(struct list_head *dispose)
 {
 	struct nfsd_file *nf;
 
-	list_for_each_entry(nf, dispose, nf_lru) {
-		spin_lock(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
-		nfsd_file_do_unhash(nf);
-		spin_unlock(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
-	}
+	list_for_each_entry(nf, dispose, nf_lru)
+		nfsd_file_hash_remove(nf);
 	nfsd_file_dispose_list_delayed(dispose);
 }
 



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 23/32] NFSD: Remove nfsd_file::nf_hashval
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (21 preceding siblings ...)
  2022-07-08 18:26 ` [PATCH v3 22/32] NFSD: nfsd_file_hash_remove can compute hashval Chuck Lever
@ 2022-07-08 18:26 ` Chuck Lever
  2022-07-08 18:26 ` [PATCH v3 24/32] NFSD: Replace the "init once" mechanism Chuck Lever
                   ` (9 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:26 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

The value in this field can always be computed from nf_inode, thus
it is no longer used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |    6 ++----
 fs/nfsd/filecache.h |    1 -
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 490e7d7da7d2..7e857f291d4a 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -168,8 +168,7 @@ nfsd_file_mark_find_or_create(struct nfsd_file *nf)
 }
 
 static struct nfsd_file *
-nfsd_file_alloc(struct inode *inode, unsigned int may, unsigned int hashval,
-		struct net *net)
+nfsd_file_alloc(struct inode *inode, unsigned int may, struct net *net)
 {
 	struct nfsd_file *nf;
 
@@ -183,7 +182,6 @@ nfsd_file_alloc(struct inode *inode, unsigned int may, unsigned int hashval,
 		nf->nf_net = net;
 		nf->nf_flags = 0;
 		nf->nf_inode = inode;
-		nf->nf_hashval = hashval;
 		refcount_set(&nf->nf_ref, 1);
 		nf->nf_may = may & NFSD_FILE_MAY_MASK;
 		if (may & NFSD_MAY_NOT_BREAK_LEASE) {
@@ -1012,7 +1010,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	if (nf)
 		goto wait_for_construction;
 
-	new = nfsd_file_alloc(inode, may_flags, hashval, net);
+	new = nfsd_file_alloc(inode, may_flags, net);
 	if (!new) {
 		status = nfserr_jukebox;
 		goto out_status;
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index d0c42619dc10..31dc65f82c75 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -42,7 +42,6 @@ struct nfsd_file {
 #define NFSD_FILE_REFERENCED	(4)
 	unsigned long		nf_flags;
 	struct inode		*nf_inode;
-	unsigned int		nf_hashval;
 	refcount_t		nf_ref;
 	unsigned char		nf_may;
 	struct nfsd_file_mark	*nf_mark;



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 24/32] NFSD: Replace the "init once" mechanism
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (22 preceding siblings ...)
  2022-07-08 18:26 ` [PATCH v3 23/32] NFSD: Remove nfsd_file::nf_hashval Chuck Lever
@ 2022-07-08 18:26 ` Chuck Lever
  2022-07-08 18:26 ` [PATCH v3 25/32] NFSD: Set up an rhashtable for the filecache Chuck Lever
                   ` (8 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:26 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

In a moment, the nfsd_file_hashtbl global will be replaced with an
rhashtable. Replace the one or two spots that need to check if the
hash table is available. We can easily reuse the SHUTDOWN flag for
this purpose.

Document that this mechanism relies on callers to hold the
nfsd_mutex to prevent init, shutdown, and purging to run
concurrently.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   42 ++++++++++++++++++++++++++----------------
 1 file changed, 26 insertions(+), 16 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 7e857f291d4a..b075c9223377 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -28,7 +28,7 @@
 #define NFSD_FILE_HASH_SIZE                  (1 << NFSD_FILE_HASH_BITS)
 #define NFSD_LAUNDRETTE_DELAY		     (2 * HZ)
 
-#define NFSD_FILE_SHUTDOWN		     (1)
+#define NFSD_FILE_CACHE_UP		     (0)
 
 /* We only care about NFSD_MAY_READ/WRITE for this cache */
 #define NFSD_FILE_MAY_MASK	(NFSD_MAY_READ|NFSD_MAY_WRITE)
@@ -59,7 +59,7 @@ static struct kmem_cache		*nfsd_file_slab;
 static struct kmem_cache		*nfsd_file_mark_slab;
 static struct nfsd_fcache_bucket	*nfsd_file_hashtbl;
 static struct list_lru			nfsd_file_lru;
-static long				nfsd_file_lru_flags;
+static unsigned long			nfsd_file_flags;
 static struct fsnotify_group		*nfsd_file_fsnotify_group;
 static atomic_long_t			nfsd_filecache_count;
 static struct delayed_work		nfsd_filecache_laundrette;
@@ -67,9 +67,8 @@ static struct delayed_work		nfsd_filecache_laundrette;
 static void
 nfsd_file_schedule_laundrette(void)
 {
-	long count = atomic_long_read(&nfsd_filecache_count);
-
-	if (count == 0 || test_bit(NFSD_FILE_SHUTDOWN, &nfsd_file_lru_flags))
+	if ((atomic_long_read(&nfsd_filecache_count) == 0) ||
+	    test_bit(NFSD_FILE_CACHE_UP, &nfsd_file_flags) == 0)
 		return;
 
 	queue_delayed_work(system_wq, &nfsd_filecache_laundrette,
@@ -704,9 +703,8 @@ nfsd_file_cache_init(void)
 	int		ret = -ENOMEM;
 	unsigned int	i;
 
-	clear_bit(NFSD_FILE_SHUTDOWN, &nfsd_file_lru_flags);
-
-	if (nfsd_file_hashtbl)
+	lockdep_assert_held(&nfsd_mutex);
+	if (test_and_set_bit(NFSD_FILE_CACHE_UP, &nfsd_file_flags) == 1)
 		return 0;
 
 	nfsd_filecache_wq = alloc_workqueue("nfsd_filecache", 0, 0);
@@ -792,8 +790,8 @@ nfsd_file_cache_init(void)
 /*
  * Note this can deadlock with nfsd_file_lru_cb.
  */
-void
-nfsd_file_cache_purge(struct net *net)
+static void
+__nfsd_file_cache_purge(struct net *net)
 {
 	unsigned int		i;
 	struct nfsd_file	*nf;
@@ -801,9 +799,6 @@ nfsd_file_cache_purge(struct net *net)
 	LIST_HEAD(dispose);
 	bool del;
 
-	if (!nfsd_file_hashtbl)
-		return;
-
 	for (i = 0; i < NFSD_FILE_HASH_SIZE; i++) {
 		struct nfsd_fcache_bucket *nfb = &nfsd_file_hashtbl[i];
 
@@ -864,6 +859,19 @@ nfsd_file_cache_start_net(struct net *net)
 	return nn->fcache_disposal ? 0 : -ENOMEM;
 }
 
+/**
+ * nfsd_file_cache_purge - Remove all cache items associated with @net
+ * @net: target net namespace
+ *
+ */
+void
+nfsd_file_cache_purge(struct net *net)
+{
+	lockdep_assert_held(&nfsd_mutex);
+	if (test_bit(NFSD_FILE_CACHE_UP, &nfsd_file_flags) == 1)
+		__nfsd_file_cache_purge(net);
+}
+
 void
 nfsd_file_cache_shutdown_net(struct net *net)
 {
@@ -876,7 +884,9 @@ nfsd_file_cache_shutdown(void)
 {
 	int i;
 
-	set_bit(NFSD_FILE_SHUTDOWN, &nfsd_file_lru_flags);
+	lockdep_assert_held(&nfsd_mutex);
+	if (test_and_clear_bit(NFSD_FILE_CACHE_UP, &nfsd_file_flags) == 0)
+		return;
 
 	lease_unregister_notifier(&nfsd_file_lease_notifier);
 	unregister_shrinker(&nfsd_file_shrinker);
@@ -885,7 +895,7 @@ nfsd_file_cache_shutdown(void)
 	 * calling nfsd_file_cache_purge
 	 */
 	cancel_delayed_work_sync(&nfsd_filecache_laundrette);
-	nfsd_file_cache_purge(NULL);
+	__nfsd_file_cache_purge(NULL);
 	list_lru_destroy(&nfsd_file_lru);
 	rcu_barrier();
 	fsnotify_put_group(nfsd_file_fsnotify_group);
@@ -1163,7 +1173,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 	 * don't end up racing with server shutdown
 	 */
 	mutex_lock(&nfsd_mutex);
-	if (nfsd_file_hashtbl) {
+	if (test_bit(NFSD_FILE_CACHE_UP, &nfsd_file_flags) == 1) {
 		for (i = 0; i < NFSD_FILE_HASH_SIZE; i++) {
 			count += nfsd_file_hashtbl[i].nfb_count;
 			longest = max(longest, nfsd_file_hashtbl[i].nfb_count);



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 25/32] NFSD: Set up an rhashtable for the filecache
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (23 preceding siblings ...)
  2022-07-08 18:26 ` [PATCH v3 24/32] NFSD: Replace the "init once" mechanism Chuck Lever
@ 2022-07-08 18:26 ` Chuck Lever
  2022-07-08 18:26 ` [PATCH v3 26/32] NFSD: Convert the filecache to use rhashtable Chuck Lever
                   ` (7 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:26 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Add code to initialize and tear down an rhashtable. The rhashtable
is not used yet.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |  160 ++++++++++++++++++++++++++++++++++++++++++++-------
 fs/nfsd/filecache.h |    1 
 2 files changed, 140 insertions(+), 21 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index b075c9223377..bc2ec7db4927 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -13,6 +13,7 @@
 #include <linux/fsnotify_backend.h>
 #include <linux/fsnotify.h>
 #include <linux/seq_file.h>
+#include <linux/rhashtable.h>
 
 #include "vfs.h"
 #include "nfsd.h"
@@ -63,6 +64,136 @@ static unsigned long			nfsd_file_flags;
 static struct fsnotify_group		*nfsd_file_fsnotify_group;
 static atomic_long_t			nfsd_filecache_count;
 static struct delayed_work		nfsd_filecache_laundrette;
+static struct rhashtable		nfsd_file_rhash_tbl
+						____cacheline_aligned_in_smp;
+
+enum nfsd_file_lookup_type {
+	NFSD_FILE_KEY_INODE,
+	NFSD_FILE_KEY_FULL,
+};
+
+struct nfsd_file_lookup_key {
+	struct inode			*inode;
+	struct net			*net;
+	const struct cred		*cred;
+	unsigned char			need;
+	enum nfsd_file_lookup_type	type;
+};
+
+/*
+ * The returned hash value is based solely on the address of an in-code
+ * inode, a pointer to a slab-allocated object. The entropy in such a
+ * pointer is concentrated in its middle bits.
+ */
+static u32 nfsd_file_inode_hash(const struct inode *inode, u32 seed)
+{
+	unsigned long ptr = (unsigned long)inode;
+	u32 k;
+
+	k = ptr >> L1_CACHE_SHIFT;
+	k &= 0x00ffffff;
+	return jhash2(&k, 1, seed);
+}
+
+/**
+ * nfsd_file_key_hashfn - Compute the hash value of a lookup key
+ * @data: key on which to compute the hash value
+ * @len: rhash table's key_len parameter (unused)
+ * @seed: rhash table's random seed of the day
+ *
+ * Return value:
+ *   Computed 32-bit hash value
+ */
+static u32 nfsd_file_key_hashfn(const void *data, u32 len, u32 seed)
+{
+	const struct nfsd_file_lookup_key *key = data;
+
+	return nfsd_file_inode_hash(key->inode, seed);
+}
+
+/**
+ * nfsd_file_obj_hashfn - Compute the hash value of an nfsd_file
+ * @data: object on which to compute the hash value
+ * @len: rhash table's key_len parameter (unused)
+ * @seed: rhash table's random seed of the day
+ *
+ * Return value:
+ *   Computed 32-bit hash value
+ */
+static u32 nfsd_file_obj_hashfn(const void *data, u32 len, u32 seed)
+{
+	const struct nfsd_file *nf = data;
+
+	return nfsd_file_inode_hash(nf->nf_inode, seed);
+}
+
+static bool
+nfsd_match_cred(const struct cred *c1, const struct cred *c2)
+{
+	int i;
+
+	if (!uid_eq(c1->fsuid, c2->fsuid))
+		return false;
+	if (!gid_eq(c1->fsgid, c2->fsgid))
+		return false;
+	if (c1->group_info == NULL || c2->group_info == NULL)
+		return c1->group_info == c2->group_info;
+	if (c1->group_info->ngroups != c2->group_info->ngroups)
+		return false;
+	for (i = 0; i < c1->group_info->ngroups; i++) {
+		if (!gid_eq(c1->group_info->gid[i], c2->group_info->gid[i]))
+			return false;
+	}
+	return true;
+}
+
+/**
+ * nfsd_file_obj_cmpfn - Match a cache item against search criteria
+ * @arg: search criteria
+ * @ptr: cache item to check
+ *
+ * Return values:
+ *   %0 - Item matches search criteria
+ *   %1 - Item does not match search criteria
+ */
+static int nfsd_file_obj_cmpfn(struct rhashtable_compare_arg *arg,
+			       const void *ptr)
+{
+	const struct nfsd_file_lookup_key *key = arg->key;
+	const struct nfsd_file *nf = ptr;
+
+	switch (key->type) {
+	case NFSD_FILE_KEY_INODE:
+		if (nf->nf_inode != key->inode)
+			return 1;
+		break;
+	case NFSD_FILE_KEY_FULL:
+		if (nf->nf_inode != key->inode)
+			return 1;
+		if (nf->nf_may != key->need)
+			return 1;
+		if (nf->nf_net != key->net)
+			return 1;
+		if (!nfsd_match_cred(nf->nf_cred, key->cred))
+			return 1;
+		if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags) == 0)
+			return 1;
+		break;
+	}
+	return 0;
+}
+
+static const struct rhashtable_params nfsd_file_rhash_params = {
+	.key_len		= sizeof_field(struct nfsd_file, nf_inode),
+	.key_offset		= offsetof(struct nfsd_file, nf_inode),
+	.head_offset		= offsetof(struct nfsd_file, nf_rhash),
+	.hashfn			= nfsd_file_key_hashfn,
+	.obj_hashfn		= nfsd_file_obj_hashfn,
+	.obj_cmpfn		= nfsd_file_obj_cmpfn,
+	/* Reduce resizing churn on light workloads */
+	.min_size		= 512,		/* buckets */
+	.automatic_shrinking	= true,
+};
 
 static void
 nfsd_file_schedule_laundrette(void)
@@ -700,13 +831,18 @@ static const struct fsnotify_ops nfsd_file_fsnotify_ops = {
 int
 nfsd_file_cache_init(void)
 {
-	int		ret = -ENOMEM;
+	int		ret;
 	unsigned int	i;
 
 	lockdep_assert_held(&nfsd_mutex);
 	if (test_and_set_bit(NFSD_FILE_CACHE_UP, &nfsd_file_flags) == 1)
 		return 0;
 
+	ret = rhashtable_init(&nfsd_file_rhash_tbl, &nfsd_file_rhash_params);
+	if (ret)
+		return ret;
+
+	ret = -ENOMEM;
 	nfsd_filecache_wq = alloc_workqueue("nfsd_filecache", 0, 0);
 	if (!nfsd_filecache_wq)
 		goto out;
@@ -784,6 +920,7 @@ nfsd_file_cache_init(void)
 	nfsd_file_hashtbl = NULL;
 	destroy_workqueue(nfsd_filecache_wq);
 	nfsd_filecache_wq = NULL;
+	rhashtable_destroy(&nfsd_file_rhash_tbl);
 	goto out;
 }
 
@@ -909,6 +1046,7 @@ nfsd_file_cache_shutdown(void)
 	nfsd_file_hashtbl = NULL;
 	destroy_workqueue(nfsd_filecache_wq);
 	nfsd_filecache_wq = NULL;
+	rhashtable_destroy(&nfsd_file_rhash_tbl);
 
 	for_each_possible_cpu(i) {
 		per_cpu(nfsd_file_cache_hits, i) = 0;
@@ -920,26 +1058,6 @@ nfsd_file_cache_shutdown(void)
 	}
 }
 
-static bool
-nfsd_match_cred(const struct cred *c1, const struct cred *c2)
-{
-	int i;
-
-	if (!uid_eq(c1->fsuid, c2->fsuid))
-		return false;
-	if (!gid_eq(c1->fsgid, c2->fsgid))
-		return false;
-	if (c1->group_info == NULL || c2->group_info == NULL)
-		return c1->group_info == c2->group_info;
-	if (c1->group_info->ngroups != c2->group_info->ngroups)
-		return false;
-	for (i = 0; i < c1->group_info->ngroups; i++) {
-		if (!gid_eq(c1->group_info->gid[i], c2->group_info->gid[i]))
-			return false;
-	}
-	return true;
-}
-
 static struct nfsd_file *
 nfsd_file_find_locked(struct inode *inode, unsigned int may_flags,
 			unsigned int hashval, struct net *net)
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index 31dc65f82c75..7fc017e7b09e 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -29,6 +29,7 @@ struct nfsd_file_mark {
  * never be dereferenced, only used for comparison.
  */
 struct nfsd_file {
+	struct rhash_head	nf_rhash;
 	struct hlist_node	nf_node;
 	struct list_head	nf_lru;
 	struct rcu_head		nf_rcu;



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 26/32] NFSD: Convert the filecache to use rhashtable
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (24 preceding siblings ...)
  2022-07-08 18:26 ` [PATCH v3 25/32] NFSD: Set up an rhashtable for the filecache Chuck Lever
@ 2022-07-08 18:26 ` Chuck Lever
  2022-07-08 18:26 ` [PATCH v3 27/32] NFSD: Clean up unused code after rhashtable conversion Chuck Lever
                   ` (6 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:26 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Enable the filecache hash table to start small, then grow with the
workload. Smaller server deployments benefit because there should
be lower memory utilization. Larger server deployments should see
improved scaling with the number of open files.

Suggested-by: Jeff Layton <jlayton@kernel.org>
Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |  265 +++++++++++++++++++++++----------------------------
 fs/nfsd/trace.h     |   63 ++++++++++++
 2 files changed, 179 insertions(+), 149 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index bc2ec7db4927..7ae243ddf631 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -62,7 +62,6 @@ static struct nfsd_fcache_bucket	*nfsd_file_hashtbl;
 static struct list_lru			nfsd_file_lru;
 static unsigned long			nfsd_file_flags;
 static struct fsnotify_group		*nfsd_file_fsnotify_group;
-static atomic_long_t			nfsd_filecache_count;
 static struct delayed_work		nfsd_filecache_laundrette;
 static struct rhashtable		nfsd_file_rhash_tbl
 						____cacheline_aligned_in_smp;
@@ -198,7 +197,7 @@ static const struct rhashtable_params nfsd_file_rhash_params = {
 static void
 nfsd_file_schedule_laundrette(void)
 {
-	if ((atomic_long_read(&nfsd_filecache_count) == 0) ||
+	if ((atomic_read(&nfsd_file_rhash_tbl.nelems) == 0) ||
 	    test_bit(NFSD_FILE_CACHE_UP, &nfsd_file_flags) == 0)
 		return;
 
@@ -298,7 +297,7 @@ nfsd_file_mark_find_or_create(struct nfsd_file *nf)
 }
 
 static struct nfsd_file *
-nfsd_file_alloc(struct inode *inode, unsigned int may, struct net *net)
+nfsd_file_alloc(struct nfsd_file_lookup_key *key, unsigned int may)
 {
 	struct nfsd_file *nf;
 
@@ -309,11 +308,14 @@ nfsd_file_alloc(struct inode *inode, unsigned int may, struct net *net)
 		nf->nf_birthtime = ktime_get();
 		nf->nf_file = NULL;
 		nf->nf_cred = get_current_cred();
-		nf->nf_net = net;
+		nf->nf_net = key->net;
 		nf->nf_flags = 0;
-		nf->nf_inode = inode;
-		refcount_set(&nf->nf_ref, 1);
-		nf->nf_may = may & NFSD_FILE_MAY_MASK;
+		__set_bit(NFSD_FILE_HASHED, &nf->nf_flags);
+		__set_bit(NFSD_FILE_PENDING, &nf->nf_flags);
+		nf->nf_inode = key->inode;
+		/* nf_ref is pre-incremented for hash table */
+		refcount_set(&nf->nf_ref, 2);
+		nf->nf_may = key->need;
 		if (may & NFSD_MAY_NOT_BREAK_LEASE) {
 			if (may & NFSD_MAY_WRITE)
 				__set_bit(NFSD_FILE_BREAK_WRITE, &nf->nf_flags);
@@ -405,40 +407,21 @@ static void nfsd_file_lru_remove(struct nfsd_file *nf)
 }
 
 static void
-nfsd_file_do_unhash(struct nfsd_file *nf)
+nfsd_file_hash_remove(struct nfsd_file *nf)
 {
-	struct inode *inode = nf->nf_inode;
-	unsigned int hashval = (unsigned int)hash_long(inode->i_ino,
-				NFSD_FILE_HASH_BITS);
-
-	lockdep_assert_held(&nfsd_file_hashtbl[hashval].nfb_lock);
-
 	trace_nfsd_file_unhash(nf);
 
 	if (nfsd_file_check_write_error(nf))
 		nfsd_reset_write_verifier(net_generic(nf->nf_net, nfsd_net_id));
-	--nfsd_file_hashtbl[hashval].nfb_count;
-	hlist_del_rcu(&nf->nf_node);
-	atomic_long_dec(&nfsd_filecache_count);
-}
-
-static void
-nfsd_file_hash_remove(struct nfsd_file *nf)
-{
-	struct inode *inode = nf->nf_inode;
-	unsigned int hashval = (unsigned int)hash_long(inode->i_ino,
-				NFSD_FILE_HASH_BITS);
-
-	spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
-	nfsd_file_do_unhash(nf);
-	spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
+	rhashtable_remove_fast(&nfsd_file_rhash_tbl, &nf->nf_rhash,
+			       nfsd_file_rhash_params);
 }
 
 static bool
 nfsd_file_unhash(struct nfsd_file *nf)
 {
 	if (test_and_clear_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
-		nfsd_file_do_unhash(nf);
+		nfsd_file_hash_remove(nf);
 		return true;
 	}
 	return false;
@@ -448,9 +431,9 @@ nfsd_file_unhash(struct nfsd_file *nf)
  * Return true if the file was unhashed.
  */
 static bool
-nfsd_file_unhash_and_release_locked(struct nfsd_file *nf, struct list_head *dispose)
+nfsd_file_unhash_and_dispose(struct nfsd_file *nf, struct list_head *dispose)
 {
-	trace_nfsd_file_unhash_and_release_locked(nf);
+	trace_nfsd_file_unhash_and_dispose(nf);
 	if (!nfsd_file_unhash(nf))
 		return false;
 	/* keep final reference for nfsd_file_lru_dispose */
@@ -709,20 +692,23 @@ static struct shrinker	nfsd_file_shrinker = {
 static unsigned int
 __nfsd_file_close_inode(struct inode *inode, struct list_head *dispose)
 {
-	unsigned int		hashval = (unsigned int)hash_long(inode->i_ino,
-						NFSD_FILE_HASH_BITS);
-	unsigned int		count = 0;
-	struct nfsd_file	*nf;
-	struct hlist_node	*tmp;
-
-	spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
-	hlist_for_each_entry_safe(nf, tmp, &nfsd_file_hashtbl[hashval].nfb_head, nf_node) {
-		if (inode == nf->nf_inode) {
-			nfsd_file_unhash_and_release_locked(nf, dispose);
-			count++;
-		}
-	}
-	spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
+	struct nfsd_file_lookup_key key = {
+		.type	= NFSD_FILE_KEY_INODE,
+		.inode	= inode,
+	};
+	unsigned int count = 0;
+	struct nfsd_file *nf;
+
+	rcu_read_lock();
+	do {
+		nf = rhashtable_lookup(&nfsd_file_rhash_tbl, &key,
+				       nfsd_file_rhash_params);
+		if (!nf)
+			break;
+		nfsd_file_unhash_and_dispose(nf, dispose);
+		count++;
+	} while (1);
+	rcu_read_unlock();
 	return count;
 }
 
@@ -930,30 +916,35 @@ nfsd_file_cache_init(void)
 static void
 __nfsd_file_cache_purge(struct net *net)
 {
-	unsigned int		i;
-	struct nfsd_file	*nf;
-	struct hlist_node	*next;
+	struct rhashtable_iter iter;
+	struct nfsd_file *nf;
 	LIST_HEAD(dispose);
 	bool del;
 
-	for (i = 0; i < NFSD_FILE_HASH_SIZE; i++) {
-		struct nfsd_fcache_bucket *nfb = &nfsd_file_hashtbl[i];
+	rhashtable_walk_enter(&nfsd_file_rhash_tbl, &iter);
+	do {
+		rhashtable_walk_start(&iter);
 
-		spin_lock(&nfb->nfb_lock);
-		hlist_for_each_entry_safe(nf, next, &nfb->nfb_head, nf_node) {
+		nf = rhashtable_walk_next(&iter);
+		while (!IS_ERR_OR_NULL(nf)) {
 			if (net && nf->nf_net != net)
 				continue;
-			del = nfsd_file_unhash_and_release_locked(nf, &dispose);
+			del = nfsd_file_unhash_and_dispose(nf, &dispose);
 
 			/*
 			 * Deadlock detected! Something marked this entry as
 			 * unhased, but hasn't removed it from the hash list.
 			 */
 			WARN_ON_ONCE(!del);
+
+			nf = rhashtable_walk_next(&iter);
 		}
-		spin_unlock(&nfb->nfb_lock);
-		nfsd_file_dispose_list(&dispose);
-	}
+
+		rhashtable_walk_stop(&iter);
+	} while (nf == ERR_PTR(-EAGAIN));
+	rhashtable_walk_exit(&iter);
+
+	nfsd_file_dispose_list(&dispose);
 }
 
 static struct nfsd_fcache_disposal *
@@ -1058,56 +1049,29 @@ nfsd_file_cache_shutdown(void)
 	}
 }
 
-static struct nfsd_file *
-nfsd_file_find_locked(struct inode *inode, unsigned int may_flags,
-			unsigned int hashval, struct net *net)
-{
-	struct nfsd_file *nf;
-	unsigned char need = may_flags & NFSD_FILE_MAY_MASK;
-
-	hlist_for_each_entry_rcu(nf, &nfsd_file_hashtbl[hashval].nfb_head,
-				 nf_node, lockdep_is_held(&nfsd_file_hashtbl[hashval].nfb_lock)) {
-		if (nf->nf_may != need)
-			continue;
-		if (nf->nf_inode != inode)
-			continue;
-		if (nf->nf_net != net)
-			continue;
-		if (!nfsd_match_cred(nf->nf_cred, current_cred()))
-			continue;
-		if (!test_bit(NFSD_FILE_HASHED, &nf->nf_flags))
-			continue;
-		if (nfsd_file_get(nf) != NULL)
-			return nf;
-	}
-	return NULL;
-}
-
 /**
- * nfsd_file_is_cached - are there any cached open files for this fh?
- * @inode: inode of the file to check
+ * nfsd_file_is_cached - are there any cached open files for this inode?
+ * @inode: inode to check
+ *
+ * The lookup matches inodes in all net namespaces and is atomic wrt
+ * nfsd_file_acquire().
  *
- * Scan the hashtable for open files that match this fh. Returns true if there
- * are any, and false if not.
+ * Return values:
+ *   %true: filecache contains at least one file matching this inode
+ *   %false: filecache contains no files matching this inode
  */
 bool
 nfsd_file_is_cached(struct inode *inode)
 {
-	bool			ret = false;
-	struct nfsd_file	*nf;
-	unsigned int		hashval;
-
-        hashval = (unsigned int)hash_long(inode->i_ino, NFSD_FILE_HASH_BITS);
-
-	rcu_read_lock();
-	hlist_for_each_entry_rcu(nf, &nfsd_file_hashtbl[hashval].nfb_head,
-				 nf_node) {
-		if (inode == nf->nf_inode) {
-			ret = true;
-			break;
-		}
-	}
-	rcu_read_unlock();
+	struct nfsd_file_lookup_key key = {
+		.type	= NFSD_FILE_KEY_INODE,
+		.inode	= inode,
+	};
+	bool ret = false;
+
+	if (rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
+				   nfsd_file_rhash_params) != NULL)
+		ret = true;
 	trace_nfsd_file_is_cached(inode, (int)ret);
 	return ret;
 }
@@ -1116,39 +1080,51 @@ static __be32
 nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		     unsigned int may_flags, struct nfsd_file **pnf, bool open)
 {
-	__be32	status;
-	struct net *net = SVC_NET(rqstp);
+	struct nfsd_file_lookup_key key = {
+		.type	= NFSD_FILE_KEY_FULL,
+		.need	= may_flags & NFSD_FILE_MAY_MASK,
+		.net	= SVC_NET(rqstp),
+	};
 	struct nfsd_file *nf, *new;
-	struct inode *inode;
-	unsigned int hashval;
 	bool retry = true;
+	__be32 status;
 
-	/* FIXME: skip this if fh_dentry is already set? */
 	status = fh_verify(rqstp, fhp, S_IFREG,
 				may_flags|NFSD_MAY_OWNER_OVERRIDE);
 	if (status != nfs_ok)
 		return status;
+	key.inode = d_inode(fhp->fh_dentry);
+	key.cred = get_current_cred();
 
-	inode = d_inode(fhp->fh_dentry);
-	hashval = (unsigned int)hash_long(inode->i_ino, NFSD_FILE_HASH_BITS);
 retry:
-	rcu_read_lock();
-	nf = nfsd_file_find_locked(inode, may_flags, hashval, net);
-	rcu_read_unlock();
+	/* Avoid allocation if the item is already in cache */
+	nf = rhashtable_lookup_fast(&nfsd_file_rhash_tbl, &key,
+				    nfsd_file_rhash_params);
+	if (nf)
+		nf = nfsd_file_get(nf);
 	if (nf)
 		goto wait_for_construction;
 
-	new = nfsd_file_alloc(inode, may_flags, net);
+	new = nfsd_file_alloc(&key, may_flags);
 	if (!new) {
 		status = nfserr_jukebox;
 		goto out_status;
 	}
 
-	spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
-	nf = nfsd_file_find_locked(inode, may_flags, hashval, net);
-	if (nf == NULL)
+	nf = rhashtable_lookup_get_insert_key(&nfsd_file_rhash_tbl,
+					      &key, &new->nf_rhash,
+					      nfsd_file_rhash_params);
+	if (!nf) {
+		nf = new;
+		goto open_file;
+	}
+	if (IS_ERR(nf))
+		goto insert_err;
+	nf = nfsd_file_get(nf);
+	if (nf == NULL) {
+		nf = new;
 		goto open_file;
-	spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
+	}
 	nfsd_file_slab_free(&new->nf_rcu);
 
 wait_for_construction:
@@ -1156,6 +1132,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 
 	/* Did construction of this file fail? */
 	if (!test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
+		trace_nfsd_file_cons_err(rqstp, key.inode, may_flags, nf);
 		if (!retry) {
 			status = nfserr_jukebox;
 			goto out;
@@ -1194,22 +1171,11 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	}
 
 out_status:
-	trace_nfsd_file_acquire(rqstp, inode, may_flags, nf, status);
+	put_cred(key.cred);
+	trace_nfsd_file_acquire(rqstp, key.inode, may_flags, nf, status);
 	return status;
 
 open_file:
-	nf = new;
-	/* Take reference for the hashtable */
-	refcount_inc(&nf->nf_ref);
-	__set_bit(NFSD_FILE_HASHED, &nf->nf_flags);
-	__set_bit(NFSD_FILE_PENDING, &nf->nf_flags);
-	hlist_add_head_rcu(&nf->nf_node, &nfsd_file_hashtbl[hashval].nfb_head);
-	++nfsd_file_hashtbl[hashval].nfb_count;
-	nfsd_file_hashtbl[hashval].nfb_maxcount = max(nfsd_file_hashtbl[hashval].nfb_maxcount,
-			nfsd_file_hashtbl[hashval].nfb_count);
-	spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
-	atomic_long_inc(&nfsd_filecache_count);
-
 	nf->nf_mark = nfsd_file_mark_find_or_create(nf);
 	if (nf->nf_mark) {
 		if (open) {
@@ -1224,19 +1190,20 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	 * If construction failed, or we raced with a call to unlink()
 	 * then unhash.
 	 */
-	if (status != nfs_ok || inode->i_nlink == 0) {
-		bool do_free;
-		nfsd_file_lru_remove(nf);
-		spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
-		do_free = nfsd_file_unhash(nf);
-		spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
-		if (do_free)
+	if (status != nfs_ok || key.inode->i_nlink == 0)
+		if (nfsd_file_unhash(nf))
 			nfsd_file_put_noref(nf);
-	}
 	clear_bit_unlock(NFSD_FILE_PENDING, &nf->nf_flags);
 	smp_mb__after_atomic();
 	wake_up_bit(&nf->nf_flags, NFSD_FILE_PENDING);
 	goto out;
+
+insert_err:
+	nfsd_file_slab_free(&new->nf_rcu);
+	trace_nfsd_file_insert_err(rqstp, key.inode, may_flags, PTR_ERR(nf));
+	nf = NULL;
+	status = nfserr_jukebox;
+	goto out_status;
 }
 
 /**
@@ -1282,21 +1249,23 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 {
 	unsigned long releases = 0, pages_flushed = 0, evictions = 0;
 	unsigned long hits = 0, acquisitions = 0;
-	unsigned int i, count = 0, longest = 0;
+	unsigned int i, count = 0, buckets = 0;
 	unsigned long lru = 0, total_age = 0;
 
-	/*
-	 * No need for spinlocks here since we're not terribly interested in
-	 * accuracy. We do take the nfsd_mutex simply to ensure that we
-	 * don't end up racing with server shutdown
-	 */
+	/* Serialize with server shutdown */
 	mutex_lock(&nfsd_mutex);
 	if (test_bit(NFSD_FILE_CACHE_UP, &nfsd_file_flags) == 1) {
-		for (i = 0; i < NFSD_FILE_HASH_SIZE; i++) {
-			count += nfsd_file_hashtbl[i].nfb_count;
-			longest = max(longest, nfsd_file_hashtbl[i].nfb_count);
-		}
+		struct bucket_table *tbl;
+		struct rhashtable *ht;
+
 		lru = list_lru_count(&nfsd_file_lru);
+
+		rcu_read_lock();
+		ht = &nfsd_file_rhash_tbl;
+		count = atomic_read(&ht->nelems);
+		tbl = rht_dereference_rcu(ht->tbl, ht);
+		buckets = tbl->size;
+		rcu_read_unlock();
 	}
 	mutex_unlock(&nfsd_mutex);
 
@@ -1310,7 +1279,7 @@ static int nfsd_file_cache_stats_show(struct seq_file *m, void *v)
 	}
 
 	seq_printf(m, "total entries: %u\n", count);
-	seq_printf(m, "longest chain: %u\n", longest);
+	seq_printf(m, "hash buckets:  %u\n", buckets);
 	seq_printf(m, "lru entries:   %lu\n", lru);
 	seq_printf(m, "cache hits:    %lu\n", hits);
 	seq_printf(m, "acquisitions:  %lu\n", acquisitions);
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index af609590ac86..68b02497233d 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -780,7 +780,7 @@ DEFINE_NFSD_FILE_EVENT(nfsd_file_alloc);
 DEFINE_NFSD_FILE_EVENT(nfsd_file_put_final);
 DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash);
 DEFINE_NFSD_FILE_EVENT(nfsd_file_put);
-DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash_and_release_locked);
+DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash_and_dispose);
 
 TRACE_EVENT(nfsd_file_acquire,
 	TP_PROTO(
@@ -823,6 +823,67 @@ TRACE_EVENT(nfsd_file_acquire,
 			__entry->nf_file, __entry->status)
 );
 
+TRACE_EVENT(nfsd_file_insert_err,
+	TP_PROTO(
+		const struct svc_rqst *rqstp,
+		const struct inode *inode,
+		unsigned int may_flags,
+		long error
+	),
+	TP_ARGS(rqstp, inode, may_flags, error),
+	TP_STRUCT__entry(
+		__field(u32, xid)
+		__field(const void *, inode)
+		__field(unsigned long, may_flags)
+		__field(long, error)
+	),
+	TP_fast_assign(
+		__entry->xid = be32_to_cpu(rqstp->rq_xid);
+		__entry->inode = inode;
+		__entry->may_flags = may_flags;
+		__entry->error = error;
+	),
+	TP_printk("xid=0x%x inode=%p may_flags=%s error=%ld",
+		__entry->xid, __entry->inode,
+		show_nfsd_may_flags(__entry->may_flags),
+		__entry->error
+	)
+);
+
+TRACE_EVENT(nfsd_file_cons_err,
+	TP_PROTO(
+		const struct svc_rqst *rqstp,
+		const struct inode *inode,
+		unsigned int may_flags,
+		const struct nfsd_file *nf
+	),
+	TP_ARGS(rqstp, inode, may_flags, nf),
+	TP_STRUCT__entry(
+		__field(u32, xid)
+		__field(const void *, inode)
+		__field(unsigned long, may_flags)
+		__field(unsigned int, nf_ref)
+		__field(unsigned long, nf_flags)
+		__field(unsigned long, nf_may)
+		__field(const void *, nf_file)
+	),
+	TP_fast_assign(
+		__entry->xid = be32_to_cpu(rqstp->rq_xid);
+		__entry->inode = inode;
+		__entry->may_flags = may_flags;
+		__entry->nf_ref = refcount_read(&nf->nf_ref);
+		__entry->nf_flags = nf->nf_flags;
+		__entry->nf_may = nf->nf_may;
+		__entry->nf_file = nf->nf_file;
+	),
+	TP_printk("xid=0x%x inode=%p may_flags=%s ref=%u nf_flags=%s nf_may=%s nf_file=%p",
+		__entry->xid, __entry->inode,
+		show_nfsd_may_flags(__entry->may_flags), __entry->nf_ref,
+		show_nf_flags(__entry->nf_flags),
+		show_nfsd_may_flags(__entry->nf_may), __entry->nf_file
+	)
+);
+
 TRACE_EVENT(nfsd_file_open,
 	TP_PROTO(struct nfsd_file *nf, __be32 status),
 	TP_ARGS(nf, status),



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 27/32] NFSD: Clean up unused code after rhashtable conversion
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (25 preceding siblings ...)
  2022-07-08 18:26 ` [PATCH v3 26/32] NFSD: Convert the filecache to use rhashtable Chuck Lever
@ 2022-07-08 18:26 ` Chuck Lever
  2022-07-08 18:26 ` [PATCH v3 28/32] NFSD: Separate tracepoints for acquire and create Chuck Lever
                   ` (5 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:26 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   33 +--------------------------------
 fs/nfsd/filecache.h |    1 -
 2 files changed, 1 insertion(+), 33 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 7ae243ddf631..47826fb09b67 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -22,11 +22,6 @@
 #include "filecache.h"
 #include "trace.h"
 
-#define NFSDDBG_FACILITY	NFSDDBG_FH
-
-/* FIXME: dynamically size this for the machine somehow? */
-#define NFSD_FILE_HASH_BITS                   12
-#define NFSD_FILE_HASH_SIZE                  (1 << NFSD_FILE_HASH_BITS)
 #define NFSD_LAUNDRETTE_DELAY		     (2 * HZ)
 
 #define NFSD_FILE_CACHE_UP		     (0)
@@ -34,13 +29,6 @@
 /* We only care about NFSD_MAY_READ/WRITE for this cache */
 #define NFSD_FILE_MAY_MASK	(NFSD_MAY_READ|NFSD_MAY_WRITE)
 
-struct nfsd_fcache_bucket {
-	struct hlist_head	nfb_head;
-	spinlock_t		nfb_lock;
-	unsigned int		nfb_count;
-	unsigned int		nfb_maxcount;
-};
-
 static DEFINE_PER_CPU(unsigned long, nfsd_file_cache_hits);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_acquisitions);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_releases);
@@ -58,7 +46,6 @@ static struct workqueue_struct *nfsd_filecache_wq __read_mostly;
 
 static struct kmem_cache		*nfsd_file_slab;
 static struct kmem_cache		*nfsd_file_mark_slab;
-static struct nfsd_fcache_bucket	*nfsd_file_hashtbl;
 static struct list_lru			nfsd_file_lru;
 static unsigned long			nfsd_file_flags;
 static struct fsnotify_group		*nfsd_file_fsnotify_group;
@@ -303,7 +290,6 @@ nfsd_file_alloc(struct nfsd_file_lookup_key *key, unsigned int may)
 
 	nf = kmem_cache_alloc(nfsd_file_slab, GFP_KERNEL);
 	if (nf) {
-		INIT_HLIST_NODE(&nf->nf_node);
 		INIT_LIST_HEAD(&nf->nf_lru);
 		nf->nf_birthtime = ktime_get();
 		nf->nf_file = NULL;
@@ -817,8 +803,7 @@ static const struct fsnotify_ops nfsd_file_fsnotify_ops = {
 int
 nfsd_file_cache_init(void)
 {
-	int		ret;
-	unsigned int	i;
+	int ret;
 
 	lockdep_assert_held(&nfsd_mutex);
 	if (test_and_set_bit(NFSD_FILE_CACHE_UP, &nfsd_file_flags) == 1)
@@ -833,13 +818,6 @@ nfsd_file_cache_init(void)
 	if (!nfsd_filecache_wq)
 		goto out;
 
-	nfsd_file_hashtbl = kvcalloc(NFSD_FILE_HASH_SIZE,
-				sizeof(*nfsd_file_hashtbl), GFP_KERNEL);
-	if (!nfsd_file_hashtbl) {
-		pr_err("nfsd: unable to allocate nfsd_file_hashtbl\n");
-		goto out_err;
-	}
-
 	nfsd_file_slab = kmem_cache_create("nfsd_file",
 				sizeof(struct nfsd_file), 0, 0, NULL);
 	if (!nfsd_file_slab) {
@@ -883,11 +861,6 @@ nfsd_file_cache_init(void)
 		goto out_notifier;
 	}
 
-	for (i = 0; i < NFSD_FILE_HASH_SIZE; i++) {
-		INIT_HLIST_HEAD(&nfsd_file_hashtbl[i].nfb_head);
-		spin_lock_init(&nfsd_file_hashtbl[i].nfb_lock);
-	}
-
 	INIT_DELAYED_WORK(&nfsd_filecache_laundrette, nfsd_file_gc_worker);
 out:
 	return ret;
@@ -902,8 +875,6 @@ nfsd_file_cache_init(void)
 	nfsd_file_slab = NULL;
 	kmem_cache_destroy(nfsd_file_mark_slab);
 	nfsd_file_mark_slab = NULL;
-	kvfree(nfsd_file_hashtbl);
-	nfsd_file_hashtbl = NULL;
 	destroy_workqueue(nfsd_filecache_wq);
 	nfsd_filecache_wq = NULL;
 	rhashtable_destroy(&nfsd_file_rhash_tbl);
@@ -1033,8 +1004,6 @@ nfsd_file_cache_shutdown(void)
 	fsnotify_wait_marks_destroyed();
 	kmem_cache_destroy(nfsd_file_mark_slab);
 	nfsd_file_mark_slab = NULL;
-	kvfree(nfsd_file_hashtbl);
-	nfsd_file_hashtbl = NULL;
 	destroy_workqueue(nfsd_filecache_wq);
 	nfsd_filecache_wq = NULL;
 	rhashtable_destroy(&nfsd_file_rhash_tbl);
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index 7fc017e7b09e..7b40d5b446e1 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -30,7 +30,6 @@ struct nfsd_file_mark {
  */
 struct nfsd_file {
 	struct rhash_head	nf_rhash;
-	struct hlist_node	nf_node;
 	struct list_head	nf_lru;
 	struct rcu_head		nf_rcu;
 	struct file		*nf_file;



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 28/32] NFSD: Separate tracepoints for acquire and create
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (26 preceding siblings ...)
  2022-07-08 18:26 ` [PATCH v3 27/32] NFSD: Clean up unused code after rhashtable conversion Chuck Lever
@ 2022-07-08 18:26 ` Chuck Lever
  2022-07-08 18:26 ` [PATCH v3 29/32] NFSD: Move nfsd_file_trace_alloc() tracepoint Chuck Lever
                   ` (4 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:26 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

These tracepoints collect different information: the create case does
not open a file, so there's no nf_file available.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |    9 +++++----
 fs/nfsd/nfs4state.c |    1 +
 fs/nfsd/trace.h     |   54 +++++++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 52 insertions(+), 12 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 47826fb09b67..21bc7b9c25f3 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -1046,7 +1046,7 @@ nfsd_file_is_cached(struct inode *inode)
 }
 
 static __be32
-nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
+nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		     unsigned int may_flags, struct nfsd_file **pnf, bool open)
 {
 	struct nfsd_file_lookup_key key = {
@@ -1141,7 +1141,8 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 
 out_status:
 	put_cred(key.cred);
-	trace_nfsd_file_acquire(rqstp, key.inode, may_flags, nf, status);
+	if (open)
+		trace_nfsd_file_acquire(rqstp, key.inode, may_flags, nf, status);
 	return status;
 
 open_file:
@@ -1189,7 +1190,7 @@ __be32
 nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		  unsigned int may_flags, struct nfsd_file **pnf)
 {
-	return nfsd_do_file_acquire(rqstp, fhp, may_flags, pnf, true);
+	return nfsd_file_do_acquire(rqstp, fhp, may_flags, pnf, true);
 }
 
 /**
@@ -1206,7 +1207,7 @@ __be32
 nfsd_file_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		 unsigned int may_flags, struct nfsd_file **pnf)
 {
-	return nfsd_do_file_acquire(rqstp, fhp, may_flags, pnf, false);
+	return nfsd_file_do_acquire(rqstp, fhp, may_flags, pnf, false);
 }
 
 /*
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9409a0dc1b76..3a05c095dfe5 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5104,6 +5104,7 @@ static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp,
 				goto out_put_access;
 			nf->nf_file = open->op_filp;
 			open->op_filp = NULL;
+			trace_nfsd_file_create(rqstp, access, nf);
 		}
 
 		spin_lock(&fp->fi_lock);
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 68b02497233d..1c4cf9d2dd8e 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -784,10 +784,10 @@ DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash_and_dispose);
 
 TRACE_EVENT(nfsd_file_acquire,
 	TP_PROTO(
-		struct svc_rqst *rqstp,
-		struct inode *inode,
+		const struct svc_rqst *rqstp,
+		const struct inode *inode,
 		unsigned int may_flags,
-		struct nfsd_file *nf,
+		const struct nfsd_file *nf,
 		__be32 status
 	),
 
@@ -795,12 +795,12 @@ TRACE_EVENT(nfsd_file_acquire,
 
 	TP_STRUCT__entry(
 		__field(u32, xid)
-		__field(void *, inode)
+		__field(const void *, inode)
 		__field(unsigned long, may_flags)
-		__field(int, nf_ref)
+		__field(unsigned int, nf_ref)
 		__field(unsigned long, nf_flags)
 		__field(unsigned long, nf_may)
-		__field(struct file *, nf_file)
+		__field(const void *, nf_file)
 		__field(u32, status)
 	),
 
@@ -815,12 +815,50 @@ TRACE_EVENT(nfsd_file_acquire,
 		__entry->status = be32_to_cpu(status);
 	),
 
-	TP_printk("xid=0x%x inode=%p may_flags=%s ref=%d nf_flags=%s nf_may=%s nf_file=%p status=%u",
+	TP_printk("xid=0x%x inode=%p may_flags=%s ref=%u nf_flags=%s nf_may=%s nf_file=%p status=%u",
 			__entry->xid, __entry->inode,
 			show_nfsd_may_flags(__entry->may_flags),
 			__entry->nf_ref, show_nf_flags(__entry->nf_flags),
 			show_nfsd_may_flags(__entry->nf_may),
-			__entry->nf_file, __entry->status)
+			__entry->nf_file, __entry->status
+	)
+);
+
+TRACE_EVENT(nfsd_file_create,
+	TP_PROTO(
+		const struct svc_rqst *rqstp,
+		unsigned int may_flags,
+		const struct nfsd_file *nf
+	),
+
+	TP_ARGS(rqstp, may_flags, nf),
+
+	TP_STRUCT__entry(
+		__field(const void *, nf_inode)
+		__field(const void *, nf_file)
+		__field(unsigned long, may_flags)
+		__field(unsigned long, nf_flags)
+		__field(unsigned long, nf_may)
+		__field(unsigned int, nf_ref)
+		__field(u32, xid)
+	),
+
+	TP_fast_assign(
+		__entry->nf_inode = nf->nf_inode;
+		__entry->nf_file = nf->nf_file;
+		__entry->may_flags = may_flags;
+		__entry->nf_flags = nf->nf_flags;
+		__entry->nf_may = nf->nf_may;
+		__entry->nf_ref = refcount_read(&nf->nf_ref);
+		__entry->xid = be32_to_cpu(rqstp->rq_xid);
+	),
+
+	TP_printk("xid=0x%x inode=%p may_flags=%s ref=%u nf_flags=%s nf_may=%s nf_file=%p",
+		__entry->xid, __entry->nf_inode,
+		show_nfsd_may_flags(__entry->may_flags),
+		__entry->nf_ref, show_nf_flags(__entry->nf_flags),
+		show_nfsd_may_flags(__entry->nf_may), __entry->nf_file
+	)
 );
 
 TRACE_EVENT(nfsd_file_insert_err,



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 29/32] NFSD: Move nfsd_file_trace_alloc() tracepoint
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (27 preceding siblings ...)
  2022-07-08 18:26 ` [PATCH v3 28/32] NFSD: Separate tracepoints for acquire and create Chuck Lever
@ 2022-07-08 18:26 ` Chuck Lever
  2022-07-08 18:26 ` [PATCH v3 30/32] NFSD: Update the nfsd_file_fsnotify_handle_event() tracepoint Chuck Lever
                   ` (3 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:26 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

Avoid recording the allocation of an nfsd_file item that is
immediately released because a matching item was already
inserted in the hash.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |    2 +-
 fs/nfsd/trace.h     |   25 ++++++++++++++++++++++++-
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 21bc7b9c25f3..a904364551e8 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -309,7 +309,6 @@ nfsd_file_alloc(struct nfsd_file_lookup_key *key, unsigned int may)
 				__set_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags);
 		}
 		nf->nf_mark = NULL;
-		trace_nfsd_file_alloc(nf);
 	}
 	return nf;
 }
@@ -1146,6 +1145,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	return status;
 
 open_file:
+	trace_nfsd_file_alloc(nf);
 	nf->nf_mark = nfsd_file_mark_find_or_create(nf);
 	if (nf->nf_mark) {
 		if (open) {
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 1c4cf9d2dd8e..96bb6629541e 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -776,12 +776,35 @@ DEFINE_EVENT(nfsd_file_class, name, \
 	TP_PROTO(struct nfsd_file *nf), \
 	TP_ARGS(nf))
 
-DEFINE_NFSD_FILE_EVENT(nfsd_file_alloc);
 DEFINE_NFSD_FILE_EVENT(nfsd_file_put_final);
 DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash);
 DEFINE_NFSD_FILE_EVENT(nfsd_file_put);
 DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash_and_dispose);
 
+TRACE_EVENT(nfsd_file_alloc,
+	TP_PROTO(
+		const struct nfsd_file *nf
+	),
+	TP_ARGS(nf),
+	TP_STRUCT__entry(
+		__field(const void *, nf_inode)
+		__field(unsigned long, nf_flags)
+		__field(unsigned long, nf_may)
+		__field(unsigned int, nf_ref)
+	),
+	TP_fast_assign(
+		__entry->nf_inode = nf->nf_inode;
+		__entry->nf_flags = nf->nf_flags;
+		__entry->nf_ref = refcount_read(&nf->nf_ref);
+		__entry->nf_may = nf->nf_may;
+	),
+	TP_printk("inode=%p ref=%u flags=%s may=%s",
+		__entry->nf_inode, __entry->nf_ref,
+		show_nf_flags(__entry->nf_flags),
+		show_nfsd_may_flags(__entry->nf_may)
+	)
+);
+
 TRACE_EVENT(nfsd_file_acquire,
 	TP_PROTO(
 		const struct svc_rqst *rqstp,



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 30/32] NFSD: Update the nfsd_file_fsnotify_handle_event() tracepoint
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (28 preceding siblings ...)
  2022-07-08 18:26 ` [PATCH v3 29/32] NFSD: Move nfsd_file_trace_alloc() tracepoint Chuck Lever
@ 2022-07-08 18:26 ` Chuck Lever
  2022-07-08 18:27 ` [PATCH v3 31/32] NFSD: NFSv4 CLOSE should release an nfsd_file immediately Chuck Lever
                   ` (2 subsequent siblings)
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:26 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

As a convenience, display the mode and event mask symbolically
rather than numerically.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/trace.h           |   21 ++++++++++++++-------
 include/trace/events/fs.h |   37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 96bb6629541e..f68bcf13be8e 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -9,6 +9,7 @@
 #define _NFSD_TRACE_H
 
 #include <linux/tracepoint.h>
+#include <trace/events/fs.h>
 
 #include "export.h"
 #include "nfsfh.h"
@@ -1020,22 +1021,28 @@ TRACE_EVENT(nfsd_file_is_cached,
 );
 
 TRACE_EVENT(nfsd_file_fsnotify_handle_event,
-	TP_PROTO(struct inode *inode, u32 mask),
+	TP_PROTO(
+		const struct inode *inode,
+		u32 mask
+	),
 	TP_ARGS(inode, mask),
 	TP_STRUCT__entry(
-		__field(struct inode *, inode)
+		__field(const struct inode *, inode)
 		__field(unsigned int, nlink)
-		__field(umode_t, mode)
-		__field(u32, mask)
+		__field(unsigned long, mode)
+		__field(unsigned long, mask)
 	),
 	TP_fast_assign(
 		__entry->inode = inode;
 		__entry->nlink = inode->i_nlink;
-		__entry->mode = inode->i_mode;
+		__entry->mode = inode->i_mode & S_IFMT;
 		__entry->mask = mask;
 	),
-	TP_printk("inode=%p nlink=%u mode=0%ho mask=0x%x", __entry->inode,
-			__entry->nlink, __entry->mode, __entry->mask)
+	TP_printk("inode=%p nlink=%u mode=%s mask=%s",
+		__entry->inode, __entry->nlink,
+		show_fs_file_type(__entry->mode),
+		show_fs_notify_flags(__entry->mask)
+	)
 );
 
 DECLARE_EVENT_CLASS(nfsd_file_gc_class,
diff --git a/include/trace/events/fs.h b/include/trace/events/fs.h
index 738b97f22f36..3c75f85086a2 100644
--- a/include/trace/events/fs.h
+++ b/include/trace/events/fs.h
@@ -120,3 +120,40 @@
 		{ LOOKUP_BENEATH,	"BENEATH" }, \
 		{ LOOKUP_IN_ROOT,	"IN_ROOT" }, \
 		{ LOOKUP_CACHED,	"CACHED" })
+
+#define show_fs_file_type(x) \
+	__print_symbolic(x, \
+		{ S_IFLNK,		"LNK" }, \
+		{ S_IFREG,		"REG" }, \
+		{ S_IFDIR,		"DIR" }, \
+		{ S_IFCHR,		"CHR" }, \
+		{ S_IFBLK,		"BLK" }, \
+		{ S_IFIFO,		"FIFO" }, \
+		{ S_IFSOCK,		"SOCK" })
+
+#define show_fs_notify_flags(x) \
+	__print_flags(x, "|", \
+		{ FS_ACCESS,		"ACCESS" }, \
+		{ FS_MODIFY,		"MODIFY" }, \
+		{ FS_ATTRIB,		"ATTRIB" }, \
+		{ FS_CLOSE_WRITE,	"CLOSE_WRITE" }, \
+		{ FS_CLOSE_NOWRITE,	"CLOSE_NOWRITE" }, \
+		{ FS_OPEN,		"OPEN" }, \
+		{ FS_MOVED_FROM,	"MOVED_FROM" }, \
+		{ FS_MOVED_TO,		"MOVED_TO" }, \
+		{ FS_CREATE,		"CREATE" }, \
+		{ FS_DELETE,		"DELETE" }, \
+		{ FS_DELETE_SELF,	"DELETE_SELF" }, \
+		{ FS_MOVE_SELF,		"MOVE_SELF" }, \
+		{ FS_OPEN_EXEC,		"OPEN_EXEC" }, \
+		{ FS_UNMOUNT,		"UNMOUNT" }, \
+		{ FS_Q_OVERFLOW,	"Q_OVERFLOW" }, \
+		{ FS_ERROR,		"ERROR" }, \
+		{ FS_IN_IGNORED,	"IN_IGNORED" }, \
+		{ FS_OPEN_PERM,		"OPEN_PERM" }, \
+		{ FS_ACCESS_PERM,	"ACCESS_PERM" }, \
+		{ FS_OPEN_EXEC_PERM,	"OPEN_EXEC_PERM" }, \
+		{ FS_EVENT_ON_CHILD,	"EVENT_ON_CHILD" }, \
+		{ FS_RENAME,		"RENAME" }, \
+		{ FS_DN_MULTISHOT,	"DN_MULTISHOT" }, \
+		{ FS_ISDIR,		"ISDIR" })



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 31/32] NFSD: NFSv4 CLOSE should release an nfsd_file immediately
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (29 preceding siblings ...)
  2022-07-08 18:26 ` [PATCH v3 30/32] NFSD: Update the nfsd_file_fsnotify_handle_event() tracepoint Chuck Lever
@ 2022-07-08 18:27 ` Chuck Lever
  2022-07-08 18:27 ` [PATCH v3 32/32] NFSD: Ensure nf_inode is never dereferenced Chuck Lever
  2022-07-08 20:27 ` [PATCH v3 00/32] Overhaul NFSD filecache Jeff Layton
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:27 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

The last close of a file should enable other accessors to open and
use that file immediately. Leaving the file open in the filecache
prevents other users from accessing that file until the filecache
garbage-collects the file -- sometimes that takes several seconds.

Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?387
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |   18 ++++++++++++++++++
 fs/nfsd/filecache.h |    1 +
 fs/nfsd/nfs4state.c |    4 ++--
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index a904364551e8..36d61012d5de 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -458,6 +458,24 @@ nfsd_file_put(struct nfsd_file *nf)
 		nfsd_file_put_noref(nf);
 }
 
+/**
+ * nfsd_file_close - Close an nfsd_file
+ * @nf: nfsd_file to close
+ *
+ * If this is the final reference for @nf, free it immediately.
+ * This reflects an on-the-wire CLOSE or DELEGRETURN into the
+ * VFS and exported filesystem.
+ */
+void nfsd_file_close(struct nfsd_file *nf)
+{
+	nfsd_file_put(nf);
+	if (refcount_dec_if_one(&nf->nf_ref)) {
+		nfsd_file_unhash(nf);
+		nfsd_file_lru_remove(nf);
+		nfsd_file_free(nf);
+	}
+}
+
 struct nfsd_file *
 nfsd_file_get(struct nfsd_file *nf)
 {
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index 7b40d5b446e1..c5ddc877116b 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -54,6 +54,7 @@ void nfsd_file_cache_shutdown(void);
 int nfsd_file_cache_start_net(struct net *net);
 void nfsd_file_cache_shutdown_net(struct net *net);
 void nfsd_file_put(struct nfsd_file *nf);
+void nfsd_file_close(struct nfsd_file *nf);
 struct nfsd_file *nfsd_file_get(struct nfsd_file *nf);
 void nfsd_file_close_inode_sync(struct inode *inode);
 bool nfsd_file_is_cached(struct inode *inode);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 3a05c095dfe5..9d1a3e131c49 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -820,9 +820,9 @@ static void __nfs4_file_put_access(struct nfs4_file *fp, int oflag)
 			swap(f2, fp->fi_fds[O_RDWR]);
 		spin_unlock(&fp->fi_lock);
 		if (f1)
-			nfsd_file_put(f1);
+			nfsd_file_close(f1);
 		if (f2)
-			nfsd_file_put(f2);
+			nfsd_file_close(f2);
 	}
 }
 



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 32/32] NFSD: Ensure nf_inode is never dereferenced
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (30 preceding siblings ...)
  2022-07-08 18:27 ` [PATCH v3 31/32] NFSD: NFSv4 CLOSE should release an nfsd_file immediately Chuck Lever
@ 2022-07-08 18:27 ` Chuck Lever
  2022-07-08 20:27 ` [PATCH v3 00/32] Overhaul NFSD filecache Jeff Layton
  32 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever @ 2022-07-08 18:27 UTC (permalink / raw)
  To: linux-nfs, netdev; +Cc: david, jlayton, tgraf

The documenting comment for struct nf_file states:

/*
 * A representation of a file that has been opened by knfsd. These are hashed
 * in the hashtable by inode pointer value. Note that this object doesn't
 * hold a reference to the inode by itself, so the nf_inode pointer should
 * never be dereferenced, only used for comparison.
 */

Replace the two existing dereferences to make the comment always
true.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/filecache.c |    5 ++---
 fs/nfsd/filecache.h |    2 +-
 fs/nfsd/nfs4state.c |    2 +-
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 36d61012d5de..11099c8918a4 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -228,12 +228,11 @@ nfsd_file_mark_put(struct nfsd_file_mark *nfm)
 }
 
 static struct nfsd_file_mark *
-nfsd_file_mark_find_or_create(struct nfsd_file *nf)
+nfsd_file_mark_find_or_create(struct nfsd_file *nf, struct inode *inode)
 {
 	int			err;
 	struct fsnotify_mark	*mark;
 	struct nfsd_file_mark	*nfm = NULL, *new;
-	struct inode *inode = nf->nf_inode;
 
 	do {
 		fsnotify_group_lock(nfsd_file_fsnotify_group);
@@ -1164,7 +1163,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 
 open_file:
 	trace_nfsd_file_alloc(nf);
-	nf->nf_mark = nfsd_file_mark_find_or_create(nf);
+	nf->nf_mark = nfsd_file_mark_find_or_create(nf, key.inode);
 	if (nf->nf_mark) {
 		if (open) {
 			status = nfsd_open_verified(rqstp, fhp, may_flags,
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index c5ddc877116b..d534b76cb65b 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -41,7 +41,7 @@ struct nfsd_file {
 #define NFSD_FILE_BREAK_WRITE	(3)
 #define NFSD_FILE_REFERENCED	(4)
 	unsigned long		nf_flags;
-	struct inode		*nf_inode;
+	struct inode		*nf_inode;	/* don't deref */
 	refcount_t		nf_ref;
 	unsigned char		nf_may;
 	struct nfsd_file_mark	*nf_mark;
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9d1a3e131c49..994bd11bafe0 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2564,7 +2564,7 @@ static void nfs4_show_fname(struct seq_file *s, struct nfsd_file *f)
 
 static void nfs4_show_superblock(struct seq_file *s, struct nfsd_file *f)
 {
-	struct inode *inode = f->nf_inode;
+	struct inode *inode = file_inode(f->nf_file);
 
 	seq_printf(s, "superblock: \"%02x:%02x:%ld\"",
 					MAJOR(inode->i_sb->s_dev),



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 12/32] NFSD: Hook up the filecache stat file
  2022-07-08 18:24 ` [PATCH v3 12/32] NFSD: Hook up the filecache stat file Chuck Lever
@ 2022-07-08 19:16   ` Jeff Layton
  0 siblings, 0 replies; 42+ messages in thread
From: Jeff Layton @ 2022-07-08 19:16 UTC (permalink / raw)
  To: Chuck Lever, linux-nfs, netdev; +Cc: david, tgraf

On Fri, 2022-07-08 at 14:24 -0400, Chuck Lever wrote:
> There has always been the capability of exporting filecache metrics
> via /proc, but it was never hooked up. Let's surface these metrics
> to enable better observability of the filecache.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/nfsctl.c |   10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 66c352bf61b1..ecc08cf97a86 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -25,6 +25,7 @@
>  #include "state.h"
>  #include "netns.h"
>  #include "pnfs.h"
> +#include "filecache.h"
>  
>  /*
>   *	We have a single directory with several nodes in it.
> @@ -46,6 +47,7 @@ enum {
>  	NFSD_MaxBlkSize,
>  	NFSD_MaxConnections,
>  	NFSD_SupportedEnctypes,
> +	NFSD_Filecache,
>  	/*
>  	 * The below MUST come last.  Otherwise we leave a hole in nfsd_files[]
>  	 * with !CONFIG_NFSD_V4 and simple_fill_super() goes oops
> @@ -229,6 +231,13 @@ static const struct file_operations reply_cache_stats_operations = {
>  	.release	= single_release,
>  };
>  
> +static const struct file_operations filecache_ops = {
> +	.open		= nfsd_file_cache_stats_open,
> +	.read		= seq_read,
> +	.llseek		= seq_lseek,
> +	.release	= single_release,
> +};
> +
>  /*----------------------------------------------------------------------------*/
>  /*
>   * payload - write methods
> @@ -1370,6 +1379,7 @@ static int nfsd_fill_super(struct super_block *sb, struct fs_context *fc)
>  		[NFSD_Ports] = {"portlist", &transaction_ops, S_IWUSR|S_IRUGO},
>  		[NFSD_MaxBlkSize] = {"max_block_size", &transaction_ops, S_IWUSR|S_IRUGO},
>  		[NFSD_MaxConnections] = {"max_connections", &transaction_ops, S_IWUSR|S_IRUGO},
> +		[NFSD_Filecache] = {"filecache", &filecache_ops, S_IRUGO},
>  #if defined(CONFIG_SUNRPC_GSS) || defined(CONFIG_SUNRPC_GSS_MODULE)
>  		[NFSD_SupportedEnctypes] = {"supported_krb5_enctypes", &supported_enctypes_ops, S_IRUGO},
>  #endif /* CONFIG_SUNRPC_GSS or CONFIG_SUNRPC_GSS_MODULE */
> 
> 

<facepalm>
Ouch, that's quite an oversight.
</facepalm>

-- 
Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 15/32] NFSD: Leave open files out of the filecache LRU
  2022-07-08 18:25 ` [PATCH v3 15/32] NFSD: Leave open files out of the filecache LRU Chuck Lever
@ 2022-07-08 19:29   ` Jeff Layton
  2022-07-09 20:45     ` Chuck Lever III
  0 siblings, 1 reply; 42+ messages in thread
From: Jeff Layton @ 2022-07-08 19:29 UTC (permalink / raw)
  To: Chuck Lever, linux-nfs, netdev; +Cc: david, tgraf

On Fri, 2022-07-08 at 14:25 -0400, Chuck Lever wrote:
> There have been reports of problems when running fstests generic/531
> against Linux NFS servers with NFSv4. The NFS server that hosts the
> test's SCRATCH_DEV suffers from CPU soft lock-ups during the test.
> Analysis shows that:
> 
> fs/nfsd/filecache.c
>  482                 ret = list_lru_walk(&nfsd_file_lru,
>  483                                 nfsd_file_lru_cb,
>  484                                 &head, LONG_MAX);
> 
> causes nfsd_file_gc() to walk the entire length of the filecache LRU
> list every time it is called (which is quite frequently). The walk
> holds a spinlock the entire time that prevents other nfsd threads
> from accessing the filecache.
> 
> What's more, for NFSv4 workloads, none of the items that are visited
> during this walk may be evicted, since they are all files that are
> held OPEN by NFS clients.
> 
> Address this by ensuring that open files are not kept on the LRU
> list.
> 
> Reported-by: Frank van der Linden <fllinden@amazon.com>
> Reported-by: Wang Yugui <wangyugui@e16-tech.com>
> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386
> Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/filecache.c |   24 +++++++++++++++++++-----
>  fs/nfsd/trace.h     |    2 ++
>  2 files changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index 37373b012276..6e9e186334ab 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -269,6 +269,7 @@ nfsd_file_flush(struct nfsd_file *nf)
>  
>  static void nfsd_file_lru_add(struct nfsd_file *nf)
>  {
> +	set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags);
>  	if (list_lru_add(&nfsd_file_lru, &nf->nf_lru))
>  		trace_nfsd_file_lru_add(nf);
>  }
> @@ -298,7 +299,6 @@ nfsd_file_unhash(struct nfsd_file *nf)
>  {
>  	if (test_and_clear_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
>  		nfsd_file_do_unhash(nf);
> -		nfsd_file_lru_remove(nf);
>  		return true;
>  	}
>  	return false;
> @@ -319,6 +319,7 @@ nfsd_file_unhash_and_release_locked(struct nfsd_file *nf, struct list_head *disp
>  	if (refcount_dec_not_one(&nf->nf_ref))
>  		return true;
>  
> +	nfsd_file_lru_remove(nf);
>  	list_add(&nf->nf_lru, dispose);
>  	return true;
>  }
> @@ -330,6 +331,7 @@ nfsd_file_put_noref(struct nfsd_file *nf)
>  
>  	if (refcount_dec_and_test(&nf->nf_ref)) {
>  		WARN_ON(test_bit(NFSD_FILE_HASHED, &nf->nf_flags));
> +		nfsd_file_lru_remove(nf);
>  		nfsd_file_free(nf);
>  	}
>  }
> @@ -339,7 +341,7 @@ nfsd_file_put(struct nfsd_file *nf)
>  {
>  	might_sleep();
>  
> -	set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags);
> +	nfsd_file_lru_add(nf);

Do you really want to add this on every put? I would have thought you'd
only want to do this on a 2->1 nf_ref transition.

>  	if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags) == 0) {
>  		nfsd_file_flush(nf);
>  		nfsd_file_put_noref(nf);
> @@ -439,8 +441,18 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
>  	}
>  }
>  
> -/*
> +/**
> + * nfsd_file_lru_cb - Examine an entry on the LRU list
> + * @item: LRU entry to examine
> + * @lru: controlling LRU
> + * @lock: LRU list lock (unused)
> + * @arg: dispose list
> + *
>   * Note this can deadlock with nfsd_file_cache_purge.
> + *
> + * Return values:
> + *   %LRU_REMOVED: @item was removed from the LRU
> + *   %LRU_SKIP: @item cannot be evicted
>   */
>  static enum lru_status
>  nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
> @@ -462,8 +474,9 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
>  	 * That order is deliberate to ensure that we can do this locklessly.
>  	 */
>  	if (refcount_read(&nf->nf_ref) > 1) {
> +		list_lru_isolate(lru, &nf->nf_lru);
>  		trace_nfsd_file_gc_in_use(nf);
> -		return LRU_SKIP;
> +		return LRU_REMOVED;

Interesting. So you wait until the LRU scanner runs to remove these
entries? I expected to see you do this in nfsd_file_get, but this does
seem likely to be more efficient.

>  	}
>  
>  	/*
> @@ -1020,6 +1033,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  		goto retry;
>  	}
>  
> +	nfsd_file_lru_remove(nf);
>  	this_cpu_inc(nfsd_file_cache_hits);
>  
>  	if (!(may_flags & NFSD_MAY_NOT_BREAK_LEASE)) {
> @@ -1055,7 +1069,6 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  	refcount_inc(&nf->nf_ref);
>  	__set_bit(NFSD_FILE_HASHED, &nf->nf_flags);
>  	__set_bit(NFSD_FILE_PENDING, &nf->nf_flags);
> -	nfsd_file_lru_add(nf);
>  	hlist_add_head_rcu(&nf->nf_node, &nfsd_file_hashtbl[hashval].nfb_head);
>  	++nfsd_file_hashtbl[hashval].nfb_count;
>  	nfsd_file_hashtbl[hashval].nfb_maxcount = max(nfsd_file_hashtbl[hashval].nfb_maxcount,
> @@ -1080,6 +1093,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  	 */
>  	if (status != nfs_ok || inode->i_nlink == 0) {
>  		bool do_free;
> +		nfsd_file_lru_remove(nf);
>  		spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
>  		do_free = nfsd_file_unhash(nf);
>  		spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
> diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
> index 1cc1133371eb..54082b868b72 100644
> --- a/fs/nfsd/trace.h
> +++ b/fs/nfsd/trace.h
> @@ -929,7 +929,9 @@ DEFINE_EVENT(nfsd_file_gc_class, name,					\
>  	TP_ARGS(nf))
>  
>  DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_add);
> +DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_add_disposed);
>  DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_del);
> +DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_del_disposed);
>  DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_in_use);
>  DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_writeback);
>  DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_referenced);
> 
> 

-- 
Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 16/32] NFSD: Fix the filecache LRU shrinker
  2022-07-08 18:25 ` [PATCH v3 16/32] NFSD: Fix the filecache LRU shrinker Chuck Lever
@ 2022-07-08 19:37   ` Jeff Layton
  0 siblings, 0 replies; 42+ messages in thread
From: Jeff Layton @ 2022-07-08 19:37 UTC (permalink / raw)
  To: Chuck Lever, linux-nfs, netdev; +Cc: david, tgraf

On Fri, 2022-07-08 at 14:25 -0400, Chuck Lever wrote:
> Without LRU item rotation, the shrinker visits only a few items on
> the end of the LRU list, and those would always be long-term OPEN
> files for NFSv4 workloads. That makes the filecache shrinker
> completely ineffective.
> 
> Adopt the same strategy as the inode LRU by using LRU_ROTATE.
> 
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/filecache.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index 6e9e186334ab..bd6ba63f69ae 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -452,6 +452,7 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
>   *
>   * Return values:
>   *   %LRU_REMOVED: @item was removed from the LRU
> + *   %LRU_ROTATED: @item is to be moved to the LRU tail

%LRU_ROTATE

>   *   %LRU_SKIP: @item cannot be evicted
>   */
>  static enum lru_status
> @@ -490,7 +491,7 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
>  
>  	if (test_and_clear_bit(NFSD_FILE_REFERENCED, &nf->nf_flags)) {
>  		trace_nfsd_file_gc_referenced(nf);
> -		return LRU_SKIP;
> +		return LRU_ROTATE;
>  	}
>  
>  	if (!test_and_clear_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
> @@ -532,7 +533,7 @@ nfsd_file_gc(void)
>  	unsigned long ret;
>  
>  	ret = list_lru_walk(&nfsd_file_lru, nfsd_file_lru_cb,
> -			    &dispose, LONG_MAX);
> +			    &dispose, list_lru_count(&nfsd_file_lru));
>  	trace_nfsd_file_gc_removed(ret, list_lru_count(&nfsd_file_lru));
>  	nfsd_file_gc_dispose_list(&dispose);
>  }
> 
> 

-- 
Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 17/32] NFSD: Never call nfsd_file_gc() in foreground paths
  2022-07-08 18:25 ` [PATCH v3 17/32] NFSD: Never call nfsd_file_gc() in foreground paths Chuck Lever
@ 2022-07-08 19:43   ` Jeff Layton
  2022-07-08 19:45     ` Chuck Lever III
  0 siblings, 1 reply; 42+ messages in thread
From: Jeff Layton @ 2022-07-08 19:43 UTC (permalink / raw)
  To: Chuck Lever, linux-nfs, netdev; +Cc: david, tgraf

On Fri, 2022-07-08 at 14:25 -0400, Chuck Lever wrote:
> The checks in nfsd_file_acquire() and nfsd_file_put() that directly
> invoke filecache garbage collection are intended to keep cache
> occupancy between a low- and high-watermark. The reason to limit the
> capacity of the filecache is to keep filecache lookups reasonably
> fast.
> 
> However, invoking garbage collection at those points has some
> undesirable negative impacts. Files that are held open by NFSv4
> clients often push the occupancy of the filecache over these
> watermarks. At that point:
> 
> - Every call to nfsd_file_acquire() and nfsd_file_put() results in
>   an LRU walk. This has the same effect on lookup latency as long
>   chains in the hash table.
> - Garbage collection will then run on every nfsd thread, causing a
>   lot of unnecessary lock contention.
> - Limiting cache capacity pushes out files used only by NFSv3
>   clients, which are the type of files the filecache is supposed to
>   help.
> 
> To address those negative impacts, remove the direct calls to the
> garbage collector. Subsequent patches will address maintaining
> lookup efficiency as cache capacity increases.
> 
> Suggested-by: Wang Yugui <wangyugui@e16-tech.com>
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/filecache.c |   10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index bd6ba63f69ae..faa8588663d6 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -29,8 +29,6 @@
>  #define NFSD_LAUNDRETTE_DELAY		     (2 * HZ)
>  
>  #define NFSD_FILE_SHUTDOWN		     (1)
> -#define NFSD_FILE_LRU_THRESHOLD		     (4096UL)
> -#define NFSD_FILE_LRU_LIMIT		     (NFSD_FILE_LRU_THRESHOLD << 2)
>  
>  /* We only care about NFSD_MAY_READ/WRITE for this cache */
>  #define NFSD_FILE_MAY_MASK	(NFSD_MAY_READ|NFSD_MAY_WRITE)
> @@ -66,8 +64,6 @@ static struct fsnotify_group		*nfsd_file_fsnotify_group;
>  static atomic_long_t			nfsd_filecache_count;
>  static struct delayed_work		nfsd_filecache_laundrette;
>  
> -static void nfsd_file_gc(void);
> -
>  static void
>  nfsd_file_schedule_laundrette(void)
>  {
> @@ -350,9 +346,6 @@ nfsd_file_put(struct nfsd_file *nf)
>  		nfsd_file_schedule_laundrette();
>  	} else
>  		nfsd_file_put_noref(nf);
> -
> -	if (atomic_long_read(&nfsd_filecache_count) >= NFSD_FILE_LRU_LIMIT)
> -		nfsd_file_gc();

This may be addressed in later patches, but instead of just removing
these, would it be better to instead call
nfsd_file_schedule_laundrette() ?

>  }
>  
>  struct nfsd_file *
> @@ -1075,8 +1068,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  	nfsd_file_hashtbl[hashval].nfb_maxcount = max(nfsd_file_hashtbl[hashval].nfb_maxcount,
>  			nfsd_file_hashtbl[hashval].nfb_count);
>  	spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
> -	if (atomic_long_inc_return(&nfsd_filecache_count) >= NFSD_FILE_LRU_THRESHOLD)
> -		nfsd_file_gc();
> +	atomic_long_inc(&nfsd_filecache_count);
>  
>  	nf->nf_mark = nfsd_file_mark_find_or_create(nf);
>  	if (nf->nf_mark) {
> 
> 

-- 
Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 17/32] NFSD: Never call nfsd_file_gc() in foreground paths
  2022-07-08 19:43   ` Jeff Layton
@ 2022-07-08 19:45     ` Chuck Lever III
  0 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever III @ 2022-07-08 19:45 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Linux NFS Mailing List, netdev, david, tgraf



> On Jul 8, 2022, at 3:43 PM, Jeff Layton <jlayton@redhat.com> wrote:
> 
> On Fri, 2022-07-08 at 14:25 -0400, Chuck Lever wrote:
>> The checks in nfsd_file_acquire() and nfsd_file_put() that directly
>> invoke filecache garbage collection are intended to keep cache
>> occupancy between a low- and high-watermark. The reason to limit the
>> capacity of the filecache is to keep filecache lookups reasonably
>> fast.
>> 
>> However, invoking garbage collection at those points has some
>> undesirable negative impacts. Files that are held open by NFSv4
>> clients often push the occupancy of the filecache over these
>> watermarks. At that point:
>> 
>> - Every call to nfsd_file_acquire() and nfsd_file_put() results in
>> an LRU walk. This has the same effect on lookup latency as long
>> chains in the hash table.
>> - Garbage collection will then run on every nfsd thread, causing a
>> lot of unnecessary lock contention.
>> - Limiting cache capacity pushes out files used only by NFSv3
>> clients, which are the type of files the filecache is supposed to
>> help.
>> 
>> To address those negative impacts, remove the direct calls to the
>> garbage collector. Subsequent patches will address maintaining
>> lookup efficiency as cache capacity increases.
>> 
>> Suggested-by: Wang Yugui <wangyugui@e16-tech.com>
>> Suggested-by: Dave Chinner <david@fromorbit.com>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> fs/nfsd/filecache.c | 10 +---------
>> 1 file changed, 1 insertion(+), 9 deletions(-)
>> 
>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>> index bd6ba63f69ae..faa8588663d6 100644
>> --- a/fs/nfsd/filecache.c
>> +++ b/fs/nfsd/filecache.c
>> @@ -29,8 +29,6 @@
>> #define NFSD_LAUNDRETTE_DELAY		 (2 * HZ)
>> 
>> #define NFSD_FILE_SHUTDOWN		 (1)
>> -#define NFSD_FILE_LRU_THRESHOLD		 (4096UL)
>> -#define NFSD_FILE_LRU_LIMIT		 (NFSD_FILE_LRU_THRESHOLD << 2)
>> 
>> /* We only care about NFSD_MAY_READ/WRITE for this cache */
>> #define NFSD_FILE_MAY_MASK	(NFSD_MAY_READ|NFSD_MAY_WRITE)
>> @@ -66,8 +64,6 @@ static struct fsnotify_group		*nfsd_file_fsnotify_group;
>> static atomic_long_t			nfsd_filecache_count;
>> static struct delayed_work		nfsd_filecache_laundrette;
>> 
>> -static void nfsd_file_gc(void);
>> -
>> static void
>> nfsd_file_schedule_laundrette(void)
>> {
>> @@ -350,9 +346,6 @@ nfsd_file_put(struct nfsd_file *nf)
>> 		nfsd_file_schedule_laundrette();
>> 	} else
>> 		nfsd_file_put_noref(nf);
>> -
>> -	if (atomic_long_read(&nfsd_filecache_count) >= NFSD_FILE_LRU_LIMIT)
>> -		nfsd_file_gc();
> 
> This may be addressed in later patches, but instead of just removing
> these, would it be better to instead call
> nfsd_file_schedule_laundrette() ?

nfsd_file_put() already kicks the laundrette.

I can't see a reason to call the laundrette again; once there are items
on the LRU it seems to run every 2 seconds anyway.


>> }
>> 
>> struct nfsd_file *
>> @@ -1075,8 +1068,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>> 	nfsd_file_hashtbl[hashval].nfb_maxcount = max(nfsd_file_hashtbl[hashval].nfb_maxcount,
>> 			nfsd_file_hashtbl[hashval].nfb_count);
>> 	spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
>> -	if (atomic_long_inc_return(&nfsd_filecache_count) >= NFSD_FILE_LRU_THRESHOLD)
>> -		nfsd_file_gc();
>> +	atomic_long_inc(&nfsd_filecache_count);
>> 
>> 	nf->nf_mark = nfsd_file_mark_find_or_create(nf);
>> 	if (nf->nf_mark) {
>> 
>> 
> 
> -- 
> Jeff Layton <jlayton@redhat.com>

--
Chuck Lever




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 00/32] Overhaul NFSD filecache
  2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
                   ` (31 preceding siblings ...)
  2022-07-08 18:27 ` [PATCH v3 32/32] NFSD: Ensure nf_inode is never dereferenced Chuck Lever
@ 2022-07-08 20:27 ` Jeff Layton
  2022-07-08 20:32   ` Chuck Lever III
  32 siblings, 1 reply; 42+ messages in thread
From: Jeff Layton @ 2022-07-08 20:27 UTC (permalink / raw)
  To: Chuck Lever, linux-nfs, netdev; +Cc: david, tgraf

On Fri, 2022-07-08 at 14:23 -0400, Chuck Lever wrote:
> This series overhauls the NFSD filecache, a cache of server-side
> "struct file" objects recently used by NFS clients. The purposes of
> this overhaul are an immediate improvement in cache scalability in
> the number of open files, and preparation for further improvements.
> 
> There are three categories of patches in this series:
> 
> 1. Add observability of cache operation so we can see what we're
> doing as changes are made to the code.
> 
> 2. Improve the scalability of filecache garbage collection,
> addressing several bugs along the way.
> 
> 3. Improve the scalability of the filecache hash table by converting
> it to use rhashtable.
> 
> These patches are available in the for-next branch of:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git 
> 
> Changes since v2:
> - Fix a cred use-after-free crasher
> - Fix a Smatch error reported by Dan Carpenter
> - Replace dereferences of nfsd_file::nf_inode
> - Further clean-ups and white-space adjustments
> 
> Changes since RFC:
> - Fixed several crashers
> - Adjusted some of the new observability
> - Tests with generic/531 now pass
> - Fixed bugzilla 387 too, maybe
> - Plenty of clean-ups
> 
> ---
> 
> Chuck Lever (32):
>       NFSD: Demote a WARN to a pr_warn()
>       NFSD: Report filecache LRU size
>       NFSD: Report count of calls to nfsd_file_acquire()
>       NFSD: Report count of freed filecache items
>       NFSD: Report average age of filecache items
>       NFSD: Add nfsd_file_lru_dispose_list() helper
>       NFSD: Refactor nfsd_file_gc()
>       NFSD: Refactor nfsd_file_lru_scan()
>       NFSD: Report the number of items evicted by the LRU walk
>       NFSD: Record number of flush calls
>       NFSD: Zero counters when the filecache is re-initialized
>       NFSD: Hook up the filecache stat file
>       NFSD: WARN when freeing an item still linked via nf_lru
>       NFSD: Trace filecache LRU activity
>       NFSD: Leave open files out of the filecache LRU
>       NFSD: Fix the filecache LRU shrinker
>       NFSD: Never call nfsd_file_gc() in foreground paths
>       NFSD: No longer record nf_hashval in the trace log
>       NFSD: Remove lockdep assertion from unhash_and_release_locked()
>       NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
>       NFSD: Refactor __nfsd_file_close_inode()
>       NFSD: nfsd_file_hash_remove can compute hashval
>       NFSD: Remove nfsd_file::nf_hashval
>       NFSD: Replace the "init once" mechanism
>       NFSD: Set up an rhashtable for the filecache
>       NFSD: Convert the filecache to use rhashtable
>       NFSD: Clean up unused code after rhashtable conversion
>       NFSD: Separate tracepoints for acquire and create
>       NFSD: Move nfsd_file_trace_alloc() tracepoint
>       NFSD: Update the nfsd_file_fsnotify_handle_event() tracepoint
>       NFSD: NFSv4 CLOSE should release an nfsd_file immediately
>       NFSD: Ensure nf_inode is never dereferenced
> 
> 
>  fs/nfsd/filecache.c       | 727 ++++++++++++++++++++++++--------------
>  fs/nfsd/filecache.h       |   7 +-
>  fs/nfsd/nfs4proc.c        |   6 +-
>  fs/nfsd/nfs4state.c       |   7 +-
>  fs/nfsd/nfsctl.c          |  10 +
>  fs/nfsd/trace.h           | 300 +++++++++++++---
>  include/trace/events/fs.h |  37 ++
>  7 files changed, 774 insertions(+), 320 deletions(-)
> 
> --
> Chuck Lever
> 

Nice work, Chuck!

You can add this to all but #15 (where I still have a question about
whether adding it to the LRU on every put is the right thing to do).

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 00/32] Overhaul NFSD filecache
  2022-07-08 20:27 ` [PATCH v3 00/32] Overhaul NFSD filecache Jeff Layton
@ 2022-07-08 20:32   ` Chuck Lever III
  0 siblings, 0 replies; 42+ messages in thread
From: Chuck Lever III @ 2022-07-08 20:32 UTC (permalink / raw)
  To: Jeff Layton, Dave Chinner, Trond Myklebust
  Cc: Linux NFS Mailing List, netdev, tgraf



> On Jul 8, 2022, at 4:27 PM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> On Fri, 2022-07-08 at 14:23 -0400, Chuck Lever wrote:
>> This series overhauls the NFSD filecache, a cache of server-side
>> "struct file" objects recently used by NFS clients. The purposes of
>> this overhaul are an immediate improvement in cache scalability in
>> the number of open files, and preparation for further improvements.
>> 
>> There are three categories of patches in this series:
>> 
>> 1. Add observability of cache operation so we can see what we're
>> doing as changes are made to the code.
>> 
>> 2. Improve the scalability of filecache garbage collection,
>> addressing several bugs along the way.
>> 
>> 3. Improve the scalability of the filecache hash table by converting
>> it to use rhashtable.
>> 
>> These patches are available in the for-next branch of:
>> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git 
>> 
>> Changes since v2:
>> - Fix a cred use-after-free crasher
>> - Fix a Smatch error reported by Dan Carpenter
>> - Replace dereferences of nfsd_file::nf_inode
>> - Further clean-ups and white-space adjustments
>> 
>> Changes since RFC:
>> - Fixed several crashers
>> - Adjusted some of the new observability
>> - Tests with generic/531 now pass
>> - Fixed bugzilla 387 too, maybe
>> - Plenty of clean-ups
>> 
>> ---
>> 
>> Chuck Lever (32):
>>      NFSD: Demote a WARN to a pr_warn()
>>      NFSD: Report filecache LRU size
>>      NFSD: Report count of calls to nfsd_file_acquire()
>>      NFSD: Report count of freed filecache items
>>      NFSD: Report average age of filecache items
>>      NFSD: Add nfsd_file_lru_dispose_list() helper
>>      NFSD: Refactor nfsd_file_gc()
>>      NFSD: Refactor nfsd_file_lru_scan()
>>      NFSD: Report the number of items evicted by the LRU walk
>>      NFSD: Record number of flush calls
>>      NFSD: Zero counters when the filecache is re-initialized
>>      NFSD: Hook up the filecache stat file
>>      NFSD: WARN when freeing an item still linked via nf_lru
>>      NFSD: Trace filecache LRU activity
>>      NFSD: Leave open files out of the filecache LRU
>>      NFSD: Fix the filecache LRU shrinker
>>      NFSD: Never call nfsd_file_gc() in foreground paths
>>      NFSD: No longer record nf_hashval in the trace log
>>      NFSD: Remove lockdep assertion from unhash_and_release_locked()
>>      NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
>>      NFSD: Refactor __nfsd_file_close_inode()
>>      NFSD: nfsd_file_hash_remove can compute hashval
>>      NFSD: Remove nfsd_file::nf_hashval
>>      NFSD: Replace the "init once" mechanism
>>      NFSD: Set up an rhashtable for the filecache
>>      NFSD: Convert the filecache to use rhashtable
>>      NFSD: Clean up unused code after rhashtable conversion
>>      NFSD: Separate tracepoints for acquire and create
>>      NFSD: Move nfsd_file_trace_alloc() tracepoint
>>      NFSD: Update the nfsd_file_fsnotify_handle_event() tracepoint
>>      NFSD: NFSv4 CLOSE should release an nfsd_file immediately
>>      NFSD: Ensure nf_inode is never dereferenced
>> 
>> 
>> fs/nfsd/filecache.c       | 727 ++++++++++++++++++++++++--------------
>> fs/nfsd/filecache.h       |   7 +-
>> fs/nfsd/nfs4proc.c        |   6 +-
>> fs/nfsd/nfs4state.c       |   7 +-
>> fs/nfsd/nfsctl.c          |  10 +
>> fs/nfsd/trace.h           | 300 +++++++++++++---
>> include/trace/events/fs.h |  37 ++
>> 7 files changed, 774 insertions(+), 320 deletions(-)
>> 
>> --
>> Chuck Lever
>> 
> 
> Nice work, Chuck!
> 
> You can add this to all but #15 (where I still have a question about
> whether adding it to the LRU on every put is the right thing to do).

Thanks for your review! #15 emulates what other list_lru consumers
appear to do, but I'd like to hear from Trond and/or Dave who
advocated for that approach.


> Reviewed-by: Jeff Layton <jlayton@kernel.org>

--
Chuck Lever




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 15/32] NFSD: Leave open files out of the filecache LRU
  2022-07-08 19:29   ` Jeff Layton
@ 2022-07-09 20:45     ` Chuck Lever III
  2022-07-11 11:39       ` Jeff Layton
  0 siblings, 1 reply; 42+ messages in thread
From: Chuck Lever III @ 2022-07-09 20:45 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Linux NFS Mailing List, netdev, david, tgraf



> On Jul 8, 2022, at 3:29 PM, Jeff Layton <jlayton@redhat.com> wrote:
> 
> On Fri, 2022-07-08 at 14:25 -0400, Chuck Lever wrote:
>> There have been reports of problems when running fstests generic/531
>> against Linux NFS servers with NFSv4. The NFS server that hosts the
>> test's SCRATCH_DEV suffers from CPU soft lock-ups during the test.
>> Analysis shows that:
>> 
>> fs/nfsd/filecache.c
>> 482 ret = list_lru_walk(&nfsd_file_lru,
>> 483 nfsd_file_lru_cb,
>> 484 &head, LONG_MAX);
>> 
>> causes nfsd_file_gc() to walk the entire length of the filecache LRU
>> list every time it is called (which is quite frequently). The walk
>> holds a spinlock the entire time that prevents other nfsd threads
>> from accessing the filecache.
>> 
>> What's more, for NFSv4 workloads, none of the items that are visited
>> during this walk may be evicted, since they are all files that are
>> held OPEN by NFS clients.
>> 
>> Address this by ensuring that open files are not kept on the LRU
>> list.
>> 
>> Reported-by: Frank van der Linden <fllinden@amazon.com>
>> Reported-by: Wang Yugui <wangyugui@e16-tech.com>
>> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386
>> Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> fs/nfsd/filecache.c | 24 +++++++++++++++++++-----
>> fs/nfsd/trace.h | 2 ++
>> 2 files changed, 21 insertions(+), 5 deletions(-)
>> 
>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>> index 37373b012276..6e9e186334ab 100644
>> --- a/fs/nfsd/filecache.c
>> +++ b/fs/nfsd/filecache.c
>> @@ -269,6 +269,7 @@ nfsd_file_flush(struct nfsd_file *nf)
>> 
>> static void nfsd_file_lru_add(struct nfsd_file *nf)
>> {
>> +	set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags);
>> 	if (list_lru_add(&nfsd_file_lru, &nf->nf_lru))
>> 		trace_nfsd_file_lru_add(nf);
>> }
>> @@ -298,7 +299,6 @@ nfsd_file_unhash(struct nfsd_file *nf)
>> {
>> 	if (test_and_clear_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
>> 		nfsd_file_do_unhash(nf);
>> -		nfsd_file_lru_remove(nf);
>> 		return true;
>> 	}
>> 	return false;
>> @@ -319,6 +319,7 @@ nfsd_file_unhash_and_release_locked(struct nfsd_file *nf, struct list_head *disp
>> 	if (refcount_dec_not_one(&nf->nf_ref))
>> 		return true;
>> 
>> +	nfsd_file_lru_remove(nf);
>> 	list_add(&nf->nf_lru, dispose);
>> 	return true;
>> }
>> @@ -330,6 +331,7 @@ nfsd_file_put_noref(struct nfsd_file *nf)
>> 
>> 	if (refcount_dec_and_test(&nf->nf_ref)) {
>> 		WARN_ON(test_bit(NFSD_FILE_HASHED, &nf->nf_flags));
>> +		nfsd_file_lru_remove(nf);
>> 		nfsd_file_free(nf);
>> 	}
>> }
>> @@ -339,7 +341,7 @@ nfsd_file_put(struct nfsd_file *nf)
>> {
>> 	might_sleep();
>> 
>> -	set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags);
>> +	nfsd_file_lru_add(nf);
> 
> Do you really want to add this on every put? I would have thought you'd
> only want to do this on a 2->1 nf_ref transition.

My measurements indicate that 2->1 is the common case, so checking
that this is /not/ a 2->1 transition doesn't confer much if any
benefit.

Under load, I don't see any contention on the LRU locks, which is
where I'd expect to see a problem if this design were not efficient.


>> 	if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags) == 0) {
>> 		nfsd_file_flush(nf);
>> 		nfsd_file_put_noref(nf);
>> @@ -439,8 +441,18 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
>> 	}
>> }
>> 
>> -/*
>> +/**
>> + * nfsd_file_lru_cb - Examine an entry on the LRU list
>> + * @item: LRU entry to examine
>> + * @lru: controlling LRU
>> + * @lock: LRU list lock (unused)
>> + * @arg: dispose list
>> + *
>> * Note this can deadlock with nfsd_file_cache_purge.
>> + *
>> + * Return values:
>> + * %LRU_REMOVED: @item was removed from the LRU
>> + * %LRU_SKIP: @item cannot be evicted
>> */
>> static enum lru_status
>> nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
>> @@ -462,8 +474,9 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
>> 	 * That order is deliberate to ensure that we can do this locklessly.
>> 	 */
>> 	if (refcount_read(&nf->nf_ref) > 1) {
>> +		list_lru_isolate(lru, &nf->nf_lru);
>> 		trace_nfsd_file_gc_in_use(nf);
>> -		return LRU_SKIP;
>> +		return LRU_REMOVED;
> 
> Interesting. So you wait until the LRU scanner runs to remove these
> entries? I expected to see you do this in nfsd_file_get, but this does
> seem likely to be more efficient.
> 
>> 	}
>> 
>> 	/*
>> @@ -1020,6 +1033,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>> 		goto retry;
>> 	}
>> 
>> +	nfsd_file_lru_remove(nf);
>> 	this_cpu_inc(nfsd_file_cache_hits);
>> 
>> 	if (!(may_flags & NFSD_MAY_NOT_BREAK_LEASE)) {
>> @@ -1055,7 +1069,6 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>> 	refcount_inc(&nf->nf_ref);
>> 	__set_bit(NFSD_FILE_HASHED, &nf->nf_flags);
>> 	__set_bit(NFSD_FILE_PENDING, &nf->nf_flags);
>> -	nfsd_file_lru_add(nf);
>> 	hlist_add_head_rcu(&nf->nf_node, &nfsd_file_hashtbl[hashval].nfb_head);
>> 	++nfsd_file_hashtbl[hashval].nfb_count;
>> 	nfsd_file_hashtbl[hashval].nfb_maxcount = max(nfsd_file_hashtbl[hashval].nfb_maxcount,
>> @@ -1080,6 +1093,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>> 	 */
>> 	if (status != nfs_ok || inode->i_nlink == 0) {
>> 		bool do_free;
>> +		nfsd_file_lru_remove(nf);
>> 		spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
>> 		do_free = nfsd_file_unhash(nf);
>> 		spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
>> diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
>> index 1cc1133371eb..54082b868b72 100644
>> --- a/fs/nfsd/trace.h
>> +++ b/fs/nfsd/trace.h
>> @@ -929,7 +929,9 @@ DEFINE_EVENT(nfsd_file_gc_class, name,					\
>> 	TP_ARGS(nf))
>> 
>> DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_add);
>> +DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_add_disposed);
>> DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_del);
>> +DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_del_disposed);
>> DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_in_use);
>> DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_writeback);
>> DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_referenced);
>> 
>> 
> 
> -- 
> Jeff Layton <jlayton@redhat.com>

--
Chuck Lever




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 15/32] NFSD: Leave open files out of the filecache LRU
  2022-07-09 20:45     ` Chuck Lever III
@ 2022-07-11 11:39       ` Jeff Layton
  0 siblings, 0 replies; 42+ messages in thread
From: Jeff Layton @ 2022-07-11 11:39 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: Linux NFS Mailing List, netdev, david, tgraf

On Sat, 2022-07-09 at 20:45 +0000, Chuck Lever III wrote:
> 
> > On Jul 8, 2022, at 3:29 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > 
> > On Fri, 2022-07-08 at 14:25 -0400, Chuck Lever wrote:
> > > There have been reports of problems when running fstests generic/531
> > > against Linux NFS servers with NFSv4. The NFS server that hosts the
> > > test's SCRATCH_DEV suffers from CPU soft lock-ups during the test.
> > > Analysis shows that:
> > > 
> > > fs/nfsd/filecache.c
> > > 482 ret = list_lru_walk(&nfsd_file_lru,
> > > 483 nfsd_file_lru_cb,
> > > 484 &head, LONG_MAX);
> > > 
> > > causes nfsd_file_gc() to walk the entire length of the filecache LRU
> > > list every time it is called (which is quite frequently). The walk
> > > holds a spinlock the entire time that prevents other nfsd threads
> > > from accessing the filecache.
> > > 
> > > What's more, for NFSv4 workloads, none of the items that are visited
> > > during this walk may be evicted, since they are all files that are
> > > held OPEN by NFS clients.
> > > 
> > > Address this by ensuring that open files are not kept on the LRU
> > > list.
> > > 
> > > Reported-by: Frank van der Linden <fllinden@amazon.com>
> > > Reported-by: Wang Yugui <wangyugui@e16-tech.com>
> > > Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386
> > > Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > ---
> > > fs/nfsd/filecache.c | 24 +++++++++++++++++++-----
> > > fs/nfsd/trace.h | 2 ++
> > > 2 files changed, 21 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > > index 37373b012276..6e9e186334ab 100644
> > > --- a/fs/nfsd/filecache.c
> > > +++ b/fs/nfsd/filecache.c
> > > @@ -269,6 +269,7 @@ nfsd_file_flush(struct nfsd_file *nf)
> > > 
> > > static void nfsd_file_lru_add(struct nfsd_file *nf)
> > > {
> > > +	set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags);
> > > 	if (list_lru_add(&nfsd_file_lru, &nf->nf_lru))
> > > 		trace_nfsd_file_lru_add(nf);
> > > }
> > > @@ -298,7 +299,6 @@ nfsd_file_unhash(struct nfsd_file *nf)
> > > {
> > > 	if (test_and_clear_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
> > > 		nfsd_file_do_unhash(nf);
> > > -		nfsd_file_lru_remove(nf);
> > > 		return true;
> > > 	}
> > > 	return false;
> > > @@ -319,6 +319,7 @@ nfsd_file_unhash_and_release_locked(struct nfsd_file *nf, struct list_head *disp
> > > 	if (refcount_dec_not_one(&nf->nf_ref))
> > > 		return true;
> > > 
> > > +	nfsd_file_lru_remove(nf);
> > > 	list_add(&nf->nf_lru, dispose);
> > > 	return true;
> > > }
> > > @@ -330,6 +331,7 @@ nfsd_file_put_noref(struct nfsd_file *nf)
> > > 
> > > 	if (refcount_dec_and_test(&nf->nf_ref)) {
> > > 		WARN_ON(test_bit(NFSD_FILE_HASHED, &nf->nf_flags));
> > > +		nfsd_file_lru_remove(nf);
> > > 		nfsd_file_free(nf);
> > > 	}
> > > }
> > > @@ -339,7 +341,7 @@ nfsd_file_put(struct nfsd_file *nf)
> > > {
> > > 	might_sleep();
> > > 
> > > -	set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags);
> > > +	nfsd_file_lru_add(nf);
> > 
> > Do you really want to add this on every put? I would have thought you'd
> > only want to do this on a 2->1 nf_ref transition.
> 
> My measurements indicate that 2->1 is the common case, so checking
> that this is /not/ a 2->1 transition doesn't confer much if any
> benefit.
> 
> Under load, I don't see any contention on the LRU locks, which is
> where I'd expect to see a problem if this design were not efficient.
> 
> 

Fair enough. I guess the idea is to throw it onto the LRU and the
scanner will just (eventually) take it off again without reaping it.

You can add my Reviewed-by: to this one as well.


> > > 	if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags) == 0) {
> > > 		nfsd_file_flush(nf);
> > > 		nfsd_file_put_noref(nf);
> > > @@ -439,8 +441,18 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
> > > 	}
> > > }
> > > 
> > > -/*
> > > +/**
> > > + * nfsd_file_lru_cb - Examine an entry on the LRU list
> > > + * @item: LRU entry to examine
> > > + * @lru: controlling LRU
> > > + * @lock: LRU list lock (unused)
> > > + * @arg: dispose list
> > > + *
> > > * Note this can deadlock with nfsd_file_cache_purge.
> > > + *
> > > + * Return values:
> > > + * %LRU_REMOVED: @item was removed from the LRU
> > > + * %LRU_SKIP: @item cannot be evicted
> > > */
> > > static enum lru_status
> > > nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
> > > @@ -462,8 +474,9 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
> > > 	 * That order is deliberate to ensure that we can do this locklessly.
> > > 	 */
> > > 	if (refcount_read(&nf->nf_ref) > 1) {
> > > +		list_lru_isolate(lru, &nf->nf_lru);
> > > 		trace_nfsd_file_gc_in_use(nf);
> > > -		return LRU_SKIP;
> > > +		return LRU_REMOVED;
> > 
> > Interesting. So you wait until the LRU scanner runs to remove these
> > entries? I expected to see you do this in nfsd_file_get, but this does
> > seem likely to be more efficient.
> > 
> > > 	}
> > > 
> > > 	/*
> > > @@ -1020,6 +1033,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > > 		goto retry;
> > > 	}
> > > 
> > > +	nfsd_file_lru_remove(nf);
> > > 	this_cpu_inc(nfsd_file_cache_hits);
> > > 
> > > 	if (!(may_flags & NFSD_MAY_NOT_BREAK_LEASE)) {
> > > @@ -1055,7 +1069,6 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > > 	refcount_inc(&nf->nf_ref);
> > > 	__set_bit(NFSD_FILE_HASHED, &nf->nf_flags);
> > > 	__set_bit(NFSD_FILE_PENDING, &nf->nf_flags);
> > > -	nfsd_file_lru_add(nf);
> > > 	hlist_add_head_rcu(&nf->nf_node, &nfsd_file_hashtbl[hashval].nfb_head);
> > > 	++nfsd_file_hashtbl[hashval].nfb_count;
> > > 	nfsd_file_hashtbl[hashval].nfb_maxcount = max(nfsd_file_hashtbl[hashval].nfb_maxcount,
> > > @@ -1080,6 +1093,7 @@ nfsd_do_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > > 	 */
> > > 	if (status != nfs_ok || inode->i_nlink == 0) {
> > > 		bool do_free;
> > > +		nfsd_file_lru_remove(nf);
> > > 		spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
> > > 		do_free = nfsd_file_unhash(nf);
> > > 		spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
> > > diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
> > > index 1cc1133371eb..54082b868b72 100644
> > > --- a/fs/nfsd/trace.h
> > > +++ b/fs/nfsd/trace.h
> > > @@ -929,7 +929,9 @@ DEFINE_EVENT(nfsd_file_gc_class, name,					\
> > > 	TP_ARGS(nf))
> > > 
> > > DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_add);
> > > +DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_add_disposed);
> > > DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_del);
> > > +DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_lru_del_disposed);
> > > DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_in_use);
> > > DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_writeback);
> > > DEFINE_NFSD_FILE_GC_EVENT(nfsd_file_gc_referenced);
> > > 
> > > 
> > 
> > -- 
> > Jeff Layton <jlayton@redhat.com>
> 
> --
> Chuck Lever
> 
> 
> 

-- 
Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2022-07-11 11:43 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-08 18:23 [PATCH v3 00/32] Overhaul NFSD filecache Chuck Lever
2022-07-08 18:23 ` [PATCH v3 01/32] NFSD: Demote a WARN to a pr_warn() Chuck Lever
2022-07-08 18:23 ` [PATCH v3 02/32] NFSD: Report filecache LRU size Chuck Lever
2022-07-08 18:23 ` [PATCH v3 03/32] NFSD: Report count of calls to nfsd_file_acquire() Chuck Lever
2022-07-08 18:24 ` [PATCH v3 04/32] NFSD: Report count of freed filecache items Chuck Lever
2022-07-08 18:24 ` [PATCH v3 05/32] NFSD: Report average age of " Chuck Lever
2022-07-08 18:24 ` [PATCH v3 06/32] NFSD: Add nfsd_file_lru_dispose_list() helper Chuck Lever
2022-07-08 18:24 ` [PATCH v3 07/32] NFSD: Refactor nfsd_file_gc() Chuck Lever
2022-07-08 18:24 ` [PATCH v3 08/32] NFSD: Refactor nfsd_file_lru_scan() Chuck Lever
2022-07-08 18:24 ` [PATCH v3 09/32] NFSD: Report the number of items evicted by the LRU walk Chuck Lever
2022-07-08 18:24 ` [PATCH v3 10/32] NFSD: Record number of flush calls Chuck Lever
2022-07-08 18:24 ` [PATCH v3 11/32] NFSD: Zero counters when the filecache is re-initialized Chuck Lever
2022-07-08 18:24 ` [PATCH v3 12/32] NFSD: Hook up the filecache stat file Chuck Lever
2022-07-08 19:16   ` Jeff Layton
2022-07-08 18:25 ` [PATCH v3 13/32] NFSD: WARN when freeing an item still linked via nf_lru Chuck Lever
2022-07-08 18:25 ` [PATCH v3 14/32] NFSD: Trace filecache LRU activity Chuck Lever
2022-07-08 18:25 ` [PATCH v3 15/32] NFSD: Leave open files out of the filecache LRU Chuck Lever
2022-07-08 19:29   ` Jeff Layton
2022-07-09 20:45     ` Chuck Lever III
2022-07-11 11:39       ` Jeff Layton
2022-07-08 18:25 ` [PATCH v3 16/32] NFSD: Fix the filecache LRU shrinker Chuck Lever
2022-07-08 19:37   ` Jeff Layton
2022-07-08 18:25 ` [PATCH v3 17/32] NFSD: Never call nfsd_file_gc() in foreground paths Chuck Lever
2022-07-08 19:43   ` Jeff Layton
2022-07-08 19:45     ` Chuck Lever III
2022-07-08 18:25 ` [PATCH v3 18/32] NFSD: No longer record nf_hashval in the trace log Chuck Lever
2022-07-08 18:25 ` [PATCH v3 19/32] NFSD: Remove lockdep assertion from unhash_and_release_locked() Chuck Lever
2022-07-08 18:25 ` [PATCH v3 20/32] NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode Chuck Lever
2022-07-08 18:25 ` [PATCH v3 21/32] NFSD: Refactor __nfsd_file_close_inode() Chuck Lever
2022-07-08 18:26 ` [PATCH v3 22/32] NFSD: nfsd_file_hash_remove can compute hashval Chuck Lever
2022-07-08 18:26 ` [PATCH v3 23/32] NFSD: Remove nfsd_file::nf_hashval Chuck Lever
2022-07-08 18:26 ` [PATCH v3 24/32] NFSD: Replace the "init once" mechanism Chuck Lever
2022-07-08 18:26 ` [PATCH v3 25/32] NFSD: Set up an rhashtable for the filecache Chuck Lever
2022-07-08 18:26 ` [PATCH v3 26/32] NFSD: Convert the filecache to use rhashtable Chuck Lever
2022-07-08 18:26 ` [PATCH v3 27/32] NFSD: Clean up unused code after rhashtable conversion Chuck Lever
2022-07-08 18:26 ` [PATCH v3 28/32] NFSD: Separate tracepoints for acquire and create Chuck Lever
2022-07-08 18:26 ` [PATCH v3 29/32] NFSD: Move nfsd_file_trace_alloc() tracepoint Chuck Lever
2022-07-08 18:26 ` [PATCH v3 30/32] NFSD: Update the nfsd_file_fsnotify_handle_event() tracepoint Chuck Lever
2022-07-08 18:27 ` [PATCH v3 31/32] NFSD: NFSv4 CLOSE should release an nfsd_file immediately Chuck Lever
2022-07-08 18:27 ` [PATCH v3 32/32] NFSD: Ensure nf_inode is never dereferenced Chuck Lever
2022-07-08 20:27 ` [PATCH v3 00/32] Overhaul NFSD filecache Jeff Layton
2022-07-08 20:32   ` Chuck Lever III

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.