All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] Bugfixes and tracing enhancements for knfsd
@ 2020-03-01 23:21 Trond Myklebust
  2020-03-01 23:21 ` [PATCH 1/8] nfsd: Don't add locks to closed or closing open stateids Trond Myklebust
  0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2020-03-01 23:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

This series fixes an Oops that has been getting in the way of testing
NFSv4.1 in the last few weeks. It adds a number of tracepoints around
the knfsd cache and upcall mechanism in order to help debugging, then
fixes a few races caused by the cache_listeners_exist() mechanism
and goes back to addressing the garbage collection timeouts.

Finally, it fixes the long standing issue that knfsd will drop requests
without dropping the TCP connection, something which violates the NFSv4
protocol.

Trond Myklebust (8):
  nfsd: Don't add locks to closed or closing open stateids
  nfsd: Add tracing to nfsd_set_fh_dentry()
  nfsd: Add tracepoints for exp_find_key() and exp_get_by_name()
  nfsd: Add tracepoints for update of the expkey and export cache
    entries
  nfsd: export upcalls must not return ESTALE when mountd is down
  SUNRPC/cache: Allow garbage collection of invalid cache entries
  sunrpc: Add tracing for cache events
  sunrpc: Drop the connection when the server drops a request

 fs/nfs/dns_resolve.c              |  11 +--
 fs/nfsd/export.c                  |  45 ++++++++---
 fs/nfsd/nfs4idmap.c               |  14 ++++
 fs/nfsd/nfs4state.c               |  73 ++++++++++--------
 fs/nfsd/nfsfh.c                   |  13 +++-
 fs/nfsd/trace.h                   | 122 +++++++++++++++++++++++++++++
 include/linux/sunrpc/cache.h      |   6 +-
 include/trace/events/sunrpc.h     |  33 ++++++++
 net/sunrpc/auth_gss/svcauth_gss.c |  12 +++
 net/sunrpc/cache.c                | 124 +++++++++++++++++-------------
 net/sunrpc/svc_xprt.c             |  10 +++
 net/sunrpc/svcauth_unix.c         |  12 +++
 12 files changed, 370 insertions(+), 105 deletions(-)

-- 
2.24.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/8] nfsd: Don't add locks to closed or closing open stateids
  2020-03-01 23:21 [PATCH 0/8] Bugfixes and tracing enhancements for knfsd Trond Myklebust
@ 2020-03-01 23:21 ` Trond Myklebust
  2020-03-01 23:21   ` [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry() Trond Myklebust
  0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2020-03-01 23:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

In NFSv4, the lock stateids are tied to the lockowner, and the open stateid,
so that the action of closing the file also results in either an automatic
loss of the locks, or an error of the form NFS4ERR_LOCKS_HELD.

In practice this means we must not add new locks to the open stateid
after the close process has been invoked. In fact doing so, can result
in the following panic:

 kernel BUG at lib/list_debug.c:51!
 invalid opcode: 0000 [#1] SMP NOPTI
 CPU: 2 PID: 1085 Comm: nfsd Not tainted 5.6.0-rc3+ #2
 Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.14410784.B64.1908150010 08/15/2019
 RIP: 0010:__list_del_entry_valid.cold+0x31/0x55
 Code: 1a 3d 9b e8 74 10 c2 ff 0f 0b 48 c7 c7 f0 1a 3d 9b e8 66 10 c2 ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 b0 1a 3d 9b e8 52 10 c2 ff <0f> 0b 48 89 fe 4c 89 c2 48 c7 c7 78 1a 3d 9b e8 3e 10 c2 ff 0f 0b
 RSP: 0018:ffffb296c1d47d90 EFLAGS: 00010246
 RAX: 0000000000000054 RBX: ffff8ba032456ec8 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: ffff8ba039e99cc8 RDI: ffff8ba039e99cc8
 RBP: ffff8ba032456e60 R08: 0000000000000781 R09: 0000000000000003
 R10: 0000000000000000 R11: 0000000000000001 R12: ffff8ba009a4abe0
 R13: ffff8ba032456e8c R14: 0000000000000000 R15: ffff8ba00adb01d8
 FS:  0000000000000000(0000) GS:ffff8ba039e80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007fb213f0b008 CR3: 00000001347de006 CR4: 00000000003606e0
 Call Trace:
  release_lock_stateid+0x2b/0x80 [nfsd]
  nfsd4_free_stateid+0x1e9/0x210 [nfsd]
  nfsd4_proc_compound+0x414/0x700 [nfsd]
  ? nfs4svc_decode_compoundargs+0x407/0x4c0 [nfsd]
  nfsd_dispatch+0xc1/0x200 [nfsd]
  svc_process_common+0x476/0x6f0 [sunrpc]
  ? svc_sock_secure_port+0x12/0x30 [sunrpc]
  ? svc_recv+0x313/0x9c0 [sunrpc]
  ? nfsd_svc+0x2d0/0x2d0 [nfsd]
  svc_process+0xd4/0x110 [sunrpc]
  nfsd+0xe3/0x140 [nfsd]
  kthread+0xf9/0x130
  ? nfsd_destroy+0x50/0x50 [nfsd]
  ? kthread_park+0x90/0x90
  ret_from_fork+0x1f/0x40

The fix is to ensure that lock creation tests for whether or not the
open stateid is unhashed, and to fail if that is the case.

Fixes: 659aefb68eca ("nfsd: Ensure we don't recognise lock stateids after freeing them")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfsd/nfs4state.c | 73 ++++++++++++++++++++++++++-------------------
 1 file changed, 43 insertions(+), 30 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 65cfe9ab47be..1ba4514be18d 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -494,6 +494,8 @@ find_any_file(struct nfs4_file *f)
 {
 	struct nfsd_file *ret;
 
+	if (!f)
+		return NULL;
 	spin_lock(&f->fi_lock);
 	ret = __nfs4_get_fd(f, O_RDWR);
 	if (!ret) {
@@ -1309,6 +1311,12 @@ static void nfs4_put_stateowner(struct nfs4_stateowner *sop)
 	nfs4_free_stateowner(sop);
 }
 
+static bool
+nfs4_ol_stateid_unhashed(const struct nfs4_ol_stateid *stp)
+{
+	return list_empty(&stp->st_perfile);
+}
+
 static bool unhash_ol_stateid(struct nfs4_ol_stateid *stp)
 {
 	struct nfs4_file *fp = stp->st_stid.sc_file;
@@ -1379,9 +1387,11 @@ static bool unhash_lock_stateid(struct nfs4_ol_stateid *stp)
 {
 	lockdep_assert_held(&stp->st_stid.sc_client->cl_lock);
 
+	if (!unhash_ol_stateid(stp))
+		return false;
 	list_del_init(&stp->st_locks);
 	nfs4_unhash_stid(&stp->st_stid);
-	return unhash_ol_stateid(stp);
+	return true;
 }
 
 static void release_lock_stateid(struct nfs4_ol_stateid *stp)
@@ -1446,13 +1456,12 @@ static void release_open_stateid_locks(struct nfs4_ol_stateid *open_stp,
 static bool unhash_open_stateid(struct nfs4_ol_stateid *stp,
 				struct list_head *reaplist)
 {
-	bool unhashed;
-
 	lockdep_assert_held(&stp->st_stid.sc_client->cl_lock);
 
-	unhashed = unhash_ol_stateid(stp);
+	if (!unhash_ol_stateid(stp))
+		return false;
 	release_open_stateid_locks(stp, reaplist);
-	return unhashed;
+	return true;
 }
 
 static void release_open_stateid(struct nfs4_ol_stateid *stp)
@@ -6393,21 +6402,21 @@ alloc_init_lock_stateowner(unsigned int strhashval, struct nfs4_client *clp,
 }
 
 static struct nfs4_ol_stateid *
-find_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fp)
+find_lock_stateid(const struct nfs4_lockowner *lo,
+		  const struct nfs4_ol_stateid *ost)
 {
 	struct nfs4_ol_stateid *lst;
-	struct nfs4_client *clp = lo->lo_owner.so_client;
 
-	lockdep_assert_held(&clp->cl_lock);
+	lockdep_assert_held(&ost->st_stid.sc_client->cl_lock);
 
-	list_for_each_entry(lst, &lo->lo_owner.so_stateids, st_perstateowner) {
-		if (lst->st_stid.sc_type != NFS4_LOCK_STID)
-			continue;
-		if (lst->st_stid.sc_file == fp) {
-			refcount_inc(&lst->st_stid.sc_count);
-			return lst;
+	/* If ost is not hashed, ost->st_locks will not be valid */
+	if (!nfs4_ol_stateid_unhashed(ost))
+		list_for_each_entry(lst, &ost->st_locks, st_locks) {
+			if (lst->st_stateowner == &lo->lo_owner) {
+				refcount_inc(&lst->st_stid.sc_count);
+				return lst;
+			}
 		}
-	}
 	return NULL;
 }
 
@@ -6423,11 +6432,11 @@ init_lock_stateid(struct nfs4_ol_stateid *stp, struct nfs4_lockowner *lo,
 	mutex_lock_nested(&stp->st_mutex, OPEN_STATEID_MUTEX);
 retry:
 	spin_lock(&clp->cl_lock);
-	spin_lock(&fp->fi_lock);
-	retstp = find_lock_stateid(lo, fp);
+	if (nfs4_ol_stateid_unhashed(open_stp))
+		goto out_close;
+	retstp = find_lock_stateid(lo, open_stp);
 	if (retstp)
-		goto out_unlock;
-
+		goto out_found;
 	refcount_inc(&stp->st_stid.sc_count);
 	stp->st_stid.sc_type = NFS4_LOCK_STID;
 	stp->st_stateowner = nfs4_get_stateowner(&lo->lo_owner);
@@ -6436,22 +6445,26 @@ init_lock_stateid(struct nfs4_ol_stateid *stp, struct nfs4_lockowner *lo,
 	stp->st_access_bmap = 0;
 	stp->st_deny_bmap = open_stp->st_deny_bmap;
 	stp->st_openstp = open_stp;
+	spin_lock(&fp->fi_lock);
 	list_add(&stp->st_locks, &open_stp->st_locks);
 	list_add(&stp->st_perstateowner, &lo->lo_owner.so_stateids);
 	list_add(&stp->st_perfile, &fp->fi_stateids);
-out_unlock:
 	spin_unlock(&fp->fi_lock);
 	spin_unlock(&clp->cl_lock);
-	if (retstp) {
-		if (nfsd4_lock_ol_stateid(retstp) != nfs_ok) {
-			nfs4_put_stid(&retstp->st_stid);
-			goto retry;
-		}
-		/* To keep mutex tracking happy */
-		mutex_unlock(&stp->st_mutex);
-		stp = retstp;
-	}
 	return stp;
+out_found:
+	spin_unlock(&clp->cl_lock);
+	if (nfsd4_lock_ol_stateid(retstp) != nfs_ok) {
+		nfs4_put_stid(&retstp->st_stid);
+		goto retry;
+	}
+	/* To keep mutex tracking happy */
+	mutex_unlock(&stp->st_mutex);
+	return retstp;
+out_close:
+	spin_unlock(&clp->cl_lock);
+	mutex_unlock(&stp->st_mutex);
+	return NULL;
 }
 
 static struct nfs4_ol_stateid *
@@ -6466,7 +6479,7 @@ find_or_create_lock_stateid(struct nfs4_lockowner *lo, struct nfs4_file *fi,
 
 	*new = false;
 	spin_lock(&clp->cl_lock);
-	lst = find_lock_stateid(lo, fi);
+	lst = find_lock_stateid(lo, ost);
 	spin_unlock(&clp->cl_lock);
 	if (lst != NULL) {
 		if (nfsd4_lock_ol_stateid(lst) == nfs_ok)
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry()
  2020-03-01 23:21 ` [PATCH 1/8] nfsd: Don't add locks to closed or closing open stateids Trond Myklebust
@ 2020-03-01 23:21   ` Trond Myklebust
  2020-03-01 23:21     ` [PATCH 3/8] nfsd: Add tracepoints for exp_find_key() and exp_get_by_name() Trond Myklebust
  2020-03-03  0:22     ` [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry() Chuck Lever
  0 siblings, 2 replies; 14+ messages in thread
From: Trond Myklebust @ 2020-03-01 23:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

Add tracing to allow us to figure out where any stale filehandle issues
may be originating from.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfsd/nfsfh.c | 13 ++++++++++---
 fs/nfsd/trace.h | 30 ++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c
index b319080288c3..37bc8f5f4514 100644
--- a/fs/nfsd/nfsfh.c
+++ b/fs/nfsd/nfsfh.c
@@ -14,6 +14,7 @@
 #include "nfsd.h"
 #include "vfs.h"
 #include "auth.h"
+#include "trace.h"
 
 #define NFSDDBG_FACILITY		NFSDDBG_FH
 
@@ -209,11 +210,14 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp)
 	}
 
 	error = nfserr_stale;
-	if (PTR_ERR(exp) == -ENOENT)
-		return error;
+	if (IS_ERR(exp)) {
+		trace_nfsd_set_fh_dentry_badexport(rqstp, fhp, PTR_ERR(exp));
+
+		if (PTR_ERR(exp) == -ENOENT)
+			return error;
 
-	if (IS_ERR(exp))
 		return nfserrno(PTR_ERR(exp));
+	}
 
 	if (exp->ex_flags & NFSEXP_NOSUBTREECHECK) {
 		/* Elevate privileges so that the lack of 'r' or 'x'
@@ -267,6 +271,9 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp)
 		dentry = exportfs_decode_fh(exp->ex_path.mnt, fid,
 				data_left, fileid_type,
 				nfsd_acceptable, exp);
+		if (IS_ERR_OR_NULL(dentry))
+			trace_nfsd_set_fh_dentry_badhandle(rqstp, fhp,
+					dentry ?  PTR_ERR(dentry) : -ESTALE);
 	}
 	if (dentry == NULL)
 		goto out;
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 06dd0d337049..9abd1591a841 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -50,6 +50,36 @@ TRACE_EVENT(nfsd_compound_status,
 		__get_str(name), __entry->status)
 )
 
+DECLARE_EVENT_CLASS(nfsd_fh_err_class,
+	TP_PROTO(struct svc_rqst *rqstp,
+		 struct svc_fh	*fhp,
+		 int		status),
+	TP_ARGS(rqstp, fhp, status),
+	TP_STRUCT__entry(
+		__field(u32, xid)
+		__field(u32, fh_hash)
+		__field(int, status)
+	),
+	TP_fast_assign(
+		__entry->xid = be32_to_cpu(rqstp->rq_xid);
+		__entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle);
+		__entry->status = status;
+	),
+	TP_printk("xid=0x%08x fh_hash=0x%08x status=%d",
+		  __entry->xid, __entry->fh_hash,
+		  __entry->status)
+)
+
+#define DEFINE_NFSD_FH_ERR_EVENT(name)		\
+DEFINE_EVENT(nfsd_fh_err_class, nfsd_##name,	\
+	TP_PROTO(struct svc_rqst *rqstp,	\
+		 struct svc_fh	*fhp,		\
+		 int		status),	\
+	TP_ARGS(rqstp, fhp, status))
+
+DEFINE_NFSD_FH_ERR_EVENT(set_fh_dentry_badexport);
+DEFINE_NFSD_FH_ERR_EVENT(set_fh_dentry_badhandle);
+
 DECLARE_EVENT_CLASS(nfsd_io_class,
 	TP_PROTO(struct svc_rqst *rqstp,
 		 struct svc_fh	*fhp,
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/8] nfsd: Add tracepoints for exp_find_key() and exp_get_by_name()
  2020-03-01 23:21   ` [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry() Trond Myklebust
@ 2020-03-01 23:21     ` Trond Myklebust
  2020-03-01 23:21       ` [PATCH 4/8] nfsd: Add tracepoints for update of the expkey and export cache entries Trond Myklebust
  2020-03-03  0:22     ` [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry() Chuck Lever
  1 sibling, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2020-03-01 23:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

Add tracepoints for upcalls.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfsd/export.c |  9 +++++++--
 fs/nfsd/trace.h  | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 15422c951fd1..e867db0bb380 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -23,6 +23,7 @@
 #include "netns.h"
 #include "pnfs.h"
 #include "filecache.h"
+#include "trace.h"
 
 #define NFSDDBG_FACILITY	NFSDDBG_EXPORT
 
@@ -832,8 +833,10 @@ exp_find_key(struct cache_detail *cd, struct auth_domain *clp, int fsid_type,
 	if (ek == NULL)
 		return ERR_PTR(-ENOMEM);
 	err = cache_check(cd, &ek->h, reqp);
-	if (err)
+	if (err) {
+		trace_nfsd_exp_find_key(&key, err);
 		return ERR_PTR(err);
+	}
 	return ek;
 }
 
@@ -855,8 +858,10 @@ exp_get_by_name(struct cache_detail *cd, struct auth_domain *clp,
 	if (exp == NULL)
 		return ERR_PTR(-ENOMEM);
 	err = cache_check(cd, &exp->h, reqp);
-	if (err)
+	if (err) {
+		trace_nfsd_exp_get_by_name(&key, err);
 		return ERR_PTR(err);
+	}
 	return exp;
 }
 
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 9abd1591a841..3b9277d7b5f2 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -9,6 +9,7 @@
 #define _NFSD_TRACE_H
 
 #include <linux/tracepoint.h>
+#include "export.h"
 #include "nfsfh.h"
 
 TRACE_EVENT(nfsd_compound,
@@ -80,6 +81,51 @@ DEFINE_EVENT(nfsd_fh_err_class, nfsd_##name,	\
 DEFINE_NFSD_FH_ERR_EVENT(set_fh_dentry_badexport);
 DEFINE_NFSD_FH_ERR_EVENT(set_fh_dentry_badhandle);
 
+TRACE_EVENT(nfsd_exp_find_key,
+	TP_PROTO(const struct svc_expkey *key,
+		 int status),
+	TP_ARGS(key, status),
+	TP_STRUCT__entry(
+		__field(int, fsidtype)
+		__array(u32, fsid, 6)
+		__string(auth_domain, key->ek_client->name)
+		__field(int, status)
+	),
+	TP_fast_assign(
+		__entry->fsidtype = key->ek_fsidtype;
+		memcpy(__entry->fsid, key->ek_fsid, 4*6);
+		__assign_str(auth_domain, key->ek_client->name);
+		__entry->status = status;
+	),
+	TP_printk("fsid=%x::%s domain=%s status=%d",
+		__entry->fsidtype,
+		__print_array(__entry->fsid, 6, 4),
+		__get_str(auth_domain),
+		__entry->status
+	)
+);
+
+TRACE_EVENT(nfsd_exp_get_by_name,
+	TP_PROTO(const struct svc_export *key,
+		 int status),
+	TP_ARGS(key, status),
+	TP_STRUCT__entry(
+		__string(path, key->ex_path.dentry->d_name.name)
+		__string(auth_domain, key->ex_client->name)
+		__field(int, status)
+	),
+	TP_fast_assign(
+		__assign_str(path, key->ex_path.dentry->d_name.name);
+		__assign_str(auth_domain, key->ex_client->name);
+		__entry->status = status;
+	),
+	TP_printk("path=%s domain=%s status=%d",
+		__get_str(path),
+		__get_str(auth_domain),
+		__entry->status
+	)
+);
+
 DECLARE_EVENT_CLASS(nfsd_io_class,
 	TP_PROTO(struct svc_rqst *rqstp,
 		 struct svc_fh	*fhp,
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/8] nfsd: Add tracepoints for update of the expkey and export cache entries
  2020-03-01 23:21     ` [PATCH 3/8] nfsd: Add tracepoints for exp_find_key() and exp_get_by_name() Trond Myklebust
@ 2020-03-01 23:21       ` Trond Myklebust
  2020-03-01 23:21         ` [PATCH 5/8] nfsd: export upcalls must not return ESTALE when mountd is down Trond Myklebust
  0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2020-03-01 23:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfsd/export.c | 24 +++++++++++++++---------
 fs/nfsd/trace.h  | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index e867db0bb380..6e6cbeb7ac2b 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -141,7 +141,9 @@ static int expkey_parse(struct cache_detail *cd, char *mesg, int mlen)
 	if (len == 0) {
 		set_bit(CACHE_NEGATIVE, &key.h.flags);
 		ek = svc_expkey_update(cd, &key, ek);
-		if (!ek)
+		if (ek)
+			trace_nfsd_expkey_update(ek, NULL);
+		else
 			err = -ENOMEM;
 	} else {
 		err = kern_path(buf, 0, &key.ek_path);
@@ -151,7 +153,9 @@ static int expkey_parse(struct cache_detail *cd, char *mesg, int mlen)
 		dprintk("Found the path %s\n", buf);
 
 		ek = svc_expkey_update(cd, &key, ek);
-		if (!ek)
+		if (ek)
+			trace_nfsd_expkey_update(ek, buf);
+		else
 			err = -ENOMEM;
 		path_put(&key.ek_path);
 	}
@@ -644,15 +648,17 @@ static int svc_export_parse(struct cache_detail *cd, char *mesg, int mlen)
 	}
 
 	expp = svc_export_lookup(&exp);
-	if (expp)
-		expp = svc_export_update(&exp, expp);
-	else
-		err = -ENOMEM;
-	cache_flush();
-	if (expp == NULL)
+	if (!expp) {
 		err = -ENOMEM;
-	else
+		goto out4;
+	}
+	expp = svc_export_update(&exp, expp);
+	if (expp) {
+		trace_nfsd_export_update(expp);
+		cache_flush();
 		exp_put(expp);
+	} else
+		err = -ENOMEM;
 out4:
 	nfsd4_fslocs_free(&exp.ex_fslocs);
 	kfree(exp.ex_uuid);
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 3b9277d7b5f2..78c574251c60 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -105,6 +105,32 @@ TRACE_EVENT(nfsd_exp_find_key,
 	)
 );
 
+TRACE_EVENT(nfsd_expkey_update,
+	TP_PROTO(const struct svc_expkey *key, const char *exp_path),
+	TP_ARGS(key, exp_path),
+	TP_STRUCT__entry(
+		__field(int, fsidtype)
+		__array(u32, fsid, 6)
+		__string(auth_domain, key->ek_client->name)
+		__string(path, exp_path)
+		__field(bool, cache)
+	),
+	TP_fast_assign(
+		__entry->fsidtype = key->ek_fsidtype;
+		memcpy(__entry->fsid, key->ek_fsid, 4*6);
+		__assign_str(auth_domain, key->ek_client->name);
+		__assign_str(path, exp_path);
+		__entry->cache = !test_bit(CACHE_NEGATIVE, &key->h.flags);
+	),
+	TP_printk("fsid=%x::%s domain=%s path=%s cache=%s",
+		__entry->fsidtype,
+		__print_array(__entry->fsid, 6, 4),
+		__get_str(auth_domain),
+		__get_str(path),
+		__entry->cache ? "pos" : "neg"
+	)
+);
+
 TRACE_EVENT(nfsd_exp_get_by_name,
 	TP_PROTO(const struct svc_export *key,
 		 int status),
@@ -126,6 +152,26 @@ TRACE_EVENT(nfsd_exp_get_by_name,
 	)
 );
 
+TRACE_EVENT(nfsd_export_update,
+	TP_PROTO(const struct svc_export *key),
+	TP_ARGS(key),
+	TP_STRUCT__entry(
+		__string(path, key->ex_path.dentry->d_name.name)
+		__string(auth_domain, key->ex_client->name)
+		__field(bool, cache)
+	),
+	TP_fast_assign(
+		__assign_str(path, key->ex_path.dentry->d_name.name);
+		__assign_str(auth_domain, key->ex_client->name);
+		__entry->cache = !test_bit(CACHE_NEGATIVE, &key->h.flags);
+	),
+	TP_printk("path=%s domain=%s cache=%s",
+		__get_str(path),
+		__get_str(auth_domain),
+		__entry->cache ? "pos" : "neg"
+	)
+);
+
 DECLARE_EVENT_CLASS(nfsd_io_class,
 	TP_PROTO(struct svc_rqst *rqstp,
 		 struct svc_fh	*fhp,
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/8] nfsd: export upcalls must not return ESTALE when mountd is down
  2020-03-01 23:21       ` [PATCH 4/8] nfsd: Add tracepoints for update of the expkey and export cache entries Trond Myklebust
@ 2020-03-01 23:21         ` Trond Myklebust
  2020-03-01 23:21           ` [PATCH 6/8] SUNRPC/cache: Allow garbage collection of invalid cache entries Trond Myklebust
  0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2020-03-01 23:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

If the rpc.mountd daemon goes down, then that should not cause all
exports to start failing with ESTALE errors. Let's explicitly
distinguish between the cache upcall cases that need to time out,
and those that do not.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/dns_resolve.c              | 11 ++++---
 fs/nfsd/export.c                  | 12 +++++++
 fs/nfsd/nfs4idmap.c               | 14 ++++++++
 include/linux/sunrpc/cache.h      |  3 ++
 net/sunrpc/auth_gss/svcauth_gss.c | 12 +++++++
 net/sunrpc/cache.c                | 53 +++++++++++++++----------------
 net/sunrpc/svcauth_unix.c         | 12 +++++++
 7 files changed, 85 insertions(+), 32 deletions(-)

diff --git a/fs/nfs/dns_resolve.c b/fs/nfs/dns_resolve.c
index 89bd5581f317..963800037609 100644
--- a/fs/nfs/dns_resolve.c
+++ b/fs/nfs/dns_resolve.c
@@ -152,12 +152,13 @@ static int nfs_dns_upcall(struct cache_detail *cd,
 		struct cache_head *ch)
 {
 	struct nfs_dns_ent *key = container_of(ch, struct nfs_dns_ent, h);
-	int ret;
 
-	ret = nfs_cache_upcall(cd, key->hostname);
-	if (ret)
-		ret = sunrpc_cache_pipe_upcall(cd, ch);
-	return ret;
+	if (test_and_set_bit(CACHE_PENDING, &ch->flags))
+		return 0;
+	if (!nfs_cache_upcall(cd, key->hostname))
+		return 0;
+	clear_bit(CACHE_PENDING, &ch->flags);
+	return sunrpc_cache_pipe_upcall_timeout(cd, ch);
 }
 
 static int nfs_dns_match(struct cache_head *ca,
diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 6e6cbeb7ac2b..cb777fe82988 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -51,6 +51,11 @@ static void expkey_put(struct kref *ref)
 	kfree_rcu(key, ek_rcu);
 }
 
+static int expkey_upcall(struct cache_detail *cd, struct cache_head *h)
+{
+	return sunrpc_cache_pipe_upcall(cd, h);
+}
+
 static void expkey_request(struct cache_detail *cd,
 			   struct cache_head *h,
 			   char **bpp, int *blen)
@@ -254,6 +259,7 @@ static const struct cache_detail svc_expkey_cache_template = {
 	.hash_size	= EXPKEY_HASHMAX,
 	.name		= "nfsd.fh",
 	.cache_put	= expkey_put,
+	.cache_upcall	= expkey_upcall,
 	.cache_request	= expkey_request,
 	.cache_parse	= expkey_parse,
 	.cache_show	= expkey_show,
@@ -335,6 +341,11 @@ static void svc_export_put(struct kref *ref)
 	kfree_rcu(exp, ex_rcu);
 }
 
+static int svc_export_upcall(struct cache_detail *cd, struct cache_head *h)
+{
+	return sunrpc_cache_pipe_upcall(cd, h);
+}
+
 static void svc_export_request(struct cache_detail *cd,
 			       struct cache_head *h,
 			       char **bpp, int *blen)
@@ -774,6 +785,7 @@ static const struct cache_detail svc_export_cache_template = {
 	.hash_size	= EXPORT_HASHMAX,
 	.name		= "nfsd.export",
 	.cache_put	= svc_export_put,
+	.cache_upcall	= svc_export_upcall,
 	.cache_request	= svc_export_request,
 	.cache_parse	= svc_export_parse,
 	.cache_show	= svc_export_show,
diff --git a/fs/nfsd/nfs4idmap.c b/fs/nfsd/nfs4idmap.c
index d1f285245af8..9460be8a8321 100644
--- a/fs/nfsd/nfs4idmap.c
+++ b/fs/nfsd/nfs4idmap.c
@@ -122,6 +122,12 @@ idtoname_hash(struct ent *ent)
 	return hash;
 }
 
+static int
+idtoname_upcall(struct cache_detail *cd, struct cache_head *h)
+{
+	return sunrpc_cache_pipe_upcall_timeout(cd, h);
+}
+
 static void
 idtoname_request(struct cache_detail *cd, struct cache_head *ch, char **bpp,
     int *blen)
@@ -184,6 +190,7 @@ static const struct cache_detail idtoname_cache_template = {
 	.hash_size	= ENT_HASHMAX,
 	.name		= "nfs4.idtoname",
 	.cache_put	= ent_put,
+	.cache_upcall	= idtoname_upcall,
 	.cache_request	= idtoname_request,
 	.cache_parse	= idtoname_parse,
 	.cache_show	= idtoname_show,
@@ -295,6 +302,12 @@ nametoid_hash(struct ent *ent)
 	return hash_str(ent->name, ENT_HASHBITS);
 }
 
+static int
+nametoid_upcall(struct cache_detail *cd, struct cache_head *h)
+{
+	return sunrpc_cache_pipe_upcall_timeout(cd, h);
+}
+
 static void
 nametoid_request(struct cache_detail *cd, struct cache_head *ch, char **bpp,
     int *blen)
@@ -347,6 +360,7 @@ static const struct cache_detail nametoid_cache_template = {
 	.hash_size	= ENT_HASHMAX,
 	.name		= "nfs4.nametoid",
 	.cache_put	= ent_put,
+	.cache_upcall	= nametoid_upcall,
 	.cache_request	= nametoid_request,
 	.cache_parse	= nametoid_parse,
 	.cache_show	= nametoid_show,
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 0f64de7caa39..656882a50991 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -179,6 +179,9 @@ sunrpc_cache_update(struct cache_detail *detail,
 
 extern int
 sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h);
+extern int
+sunrpc_cache_pipe_upcall_timeout(struct cache_detail *detail,
+				 struct cache_head *h);
 
 
 extern void cache_clean_deferred(void *owner);
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 65b67b257302..6d698bf1c16e 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -184,6 +184,11 @@ static struct cache_head *rsi_alloc(void)
 		return NULL;
 }
 
+static int rsi_upcall(struct cache_detail *cd, struct cache_head *h)
+{
+	return sunrpc_cache_pipe_upcall_timeout(cd, h);
+}
+
 static void rsi_request(struct cache_detail *cd,
 		       struct cache_head *h,
 		       char **bpp, int *blen)
@@ -282,6 +287,7 @@ static const struct cache_detail rsi_cache_template = {
 	.hash_size	= RSI_HASHMAX,
 	.name           = "auth.rpcsec.init",
 	.cache_put      = rsi_put,
+	.cache_upcall	= rsi_upcall,
 	.cache_request  = rsi_request,
 	.cache_parse    = rsi_parse,
 	.match		= rsi_match,
@@ -428,6 +434,11 @@ rsc_alloc(void)
 		return NULL;
 }
 
+static int rsc_upcall(struct cache_detail *cd, struct cache_head *h)
+{
+	return -EINVAL;
+}
+
 static int rsc_parse(struct cache_detail *cd,
 		     char *mesg, int mlen)
 {
@@ -554,6 +565,7 @@ static const struct cache_detail rsc_cache_template = {
 	.hash_size	= RSC_HASHMAX,
 	.name		= "auth.rpcsec.context",
 	.cache_put	= rsc_put,
+	.cache_upcall	= rsc_upcall,
 	.cache_parse	= rsc_parse,
 	.match		= rsc_match,
 	.init		= rsc_init,
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index bd843a81afa0..5e14513603cb 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -38,7 +38,6 @@
 
 static bool cache_defer_req(struct cache_req *req, struct cache_head *item);
 static void cache_revisit_request(struct cache_head *item);
-static bool cache_listeners_exist(struct cache_detail *detail);
 
 static void cache_init(struct cache_head *h, struct cache_detail *detail)
 {
@@ -224,13 +223,6 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 }
 EXPORT_SYMBOL_GPL(sunrpc_cache_update);
 
-static int cache_make_upcall(struct cache_detail *cd, struct cache_head *h)
-{
-	if (cd->cache_upcall)
-		return cd->cache_upcall(cd, h);
-	return sunrpc_cache_pipe_upcall(cd, h);
-}
-
 static inline int cache_is_valid(struct cache_head *h)
 {
 	if (!test_bit(CACHE_VALID, &h->flags))
@@ -303,17 +295,14 @@ int cache_check(struct cache_detail *detail,
 		   (h->expiry_time != 0 && age > refresh_age/2)) {
 		dprintk("RPC:       Want update, refage=%lld, age=%lld\n",
 				refresh_age, age);
-		if (!test_and_set_bit(CACHE_PENDING, &h->flags)) {
-			switch (cache_make_upcall(detail, h)) {
-			case -EINVAL:
-				rv = try_to_negate_entry(detail, h);
-				break;
-			case -EAGAIN:
-				cache_fresh_unlocked(h, detail);
-				break;
-			}
-		} else if (!cache_listeners_exist(detail))
+		switch (detail->cache_upcall(detail, h)) {
+		case -EINVAL:
 			rv = try_to_negate_entry(detail, h);
+			break;
+		case -EAGAIN:
+			cache_fresh_unlocked(h, detail);
+			break;
+		}
 	}
 
 	if (rv == -EAGAIN) {
@@ -1195,20 +1184,12 @@ static bool cache_listeners_exist(struct cache_detail *detail)
  *
  * Each request is at most one page long.
  */
-int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h)
+static int cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h)
 {
-
 	char *buf;
 	struct cache_request *crq;
 	int ret = 0;
 
-	if (!detail->cache_request)
-		return -EINVAL;
-
-	if (!cache_listeners_exist(detail)) {
-		warn_no_listener(detail);
-		return -EINVAL;
-	}
 	if (test_bit(CACHE_CLEANED, &h->flags))
 		/* Too late to make an upcall */
 		return -EAGAIN;
@@ -1242,8 +1223,26 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h)
 	}
 	return ret;
 }
+
+int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h)
+{
+	if (test_and_set_bit(CACHE_PENDING, &h->flags))
+		return 0;
+	return cache_pipe_upcall(detail, h);
+}
 EXPORT_SYMBOL_GPL(sunrpc_cache_pipe_upcall);
 
+int sunrpc_cache_pipe_upcall_timeout(struct cache_detail *detail,
+				     struct cache_head *h)
+{
+	if (!cache_listeners_exist(detail)) {
+		warn_no_listener(detail);
+		return -EINVAL;
+	}
+	return sunrpc_cache_pipe_upcall(detail, h);
+}
+EXPORT_SYMBOL_GPL(sunrpc_cache_pipe_upcall_timeout);
+
 /*
  * parse a message from user-space and pass it
  * to an appropriate cache
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index 04aa80a2d752..6c8f802c4261 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -148,6 +148,11 @@ static struct cache_head *ip_map_alloc(void)
 		return NULL;
 }
 
+static int ip_map_upcall(struct cache_detail *cd, struct cache_head *h)
+{
+	return sunrpc_cache_pipe_upcall(cd, h);
+}
+
 static void ip_map_request(struct cache_detail *cd,
 				  struct cache_head *h,
 				  char **bpp, int *blen)
@@ -467,6 +472,11 @@ static struct cache_head *unix_gid_alloc(void)
 		return NULL;
 }
 
+static int unix_gid_upcall(struct cache_detail *cd, struct cache_head *h)
+{
+	return sunrpc_cache_pipe_upcall_timeout(cd, h);
+}
+
 static void unix_gid_request(struct cache_detail *cd,
 			     struct cache_head *h,
 			     char **bpp, int *blen)
@@ -584,6 +594,7 @@ static const struct cache_detail unix_gid_cache_template = {
 	.hash_size	= GID_HASHMAX,
 	.name		= "auth.unix.gid",
 	.cache_put	= unix_gid_put,
+	.cache_upcall	= unix_gid_upcall,
 	.cache_request	= unix_gid_request,
 	.cache_parse	= unix_gid_parse,
 	.cache_show	= unix_gid_show,
@@ -881,6 +892,7 @@ static const struct cache_detail ip_map_cache_template = {
 	.hash_size	= IP_HASHMAX,
 	.name		= "auth.unix.ip",
 	.cache_put	= ip_map_put,
+	.cache_upcall	= ip_map_upcall,
 	.cache_request	= ip_map_request,
 	.cache_parse	= ip_map_parse,
 	.cache_show	= ip_map_show,
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/8] SUNRPC/cache: Allow garbage collection of invalid cache entries
  2020-03-01 23:21         ` [PATCH 5/8] nfsd: export upcalls must not return ESTALE when mountd is down Trond Myklebust
@ 2020-03-01 23:21           ` Trond Myklebust
  2020-03-01 23:21             ` [PATCH 7/8] sunrpc: Add tracing for cache events Trond Myklebust
  0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2020-03-01 23:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

If the cache entry never gets initialised, we want the garbage
collector to be able to evict it. Otherwise if the upcall daemon
fails to initialise the entry, we end up never expiring it.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 include/linux/sunrpc/cache.h |  3 ---
 net/sunrpc/cache.c           | 36 +++++++++++++++++++-----------------
 2 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 656882a50991..532cdbda43da 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -209,9 +209,6 @@ static inline void cache_put(struct cache_head *h, struct cache_detail *cd)
 
 static inline bool cache_is_expired(struct cache_detail *detail, struct cache_head *h)
 {
-	if (!test_bit(CACHE_VALID, &h->flags))
-		return false;
-
 	return  (h->expiry_time < seconds_since_boot()) ||
 		(detail->flush_time >= h->last_refresh);
 }
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 5e14513603cb..b7ddb2affb7e 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -64,13 +64,14 @@ static struct cache_head *sunrpc_cache_find_rcu(struct cache_detail *detail,
 
 	rcu_read_lock();
 	hlist_for_each_entry_rcu(tmp, head, cache_list) {
-		if (detail->match(tmp, key)) {
-			if (cache_is_expired(detail, tmp))
-				continue;
-			tmp = cache_get_rcu(tmp);
-			rcu_read_unlock();
-			return tmp;
-		}
+		if (!detail->match(tmp, key))
+			continue;
+		if (test_bit(CACHE_VALID, &tmp->flags) &&
+		    cache_is_expired(detail, tmp))
+			continue;
+		tmp = cache_get_rcu(tmp);
+		rcu_read_unlock();
+		return tmp;
 	}
 	rcu_read_unlock();
 	return NULL;
@@ -113,17 +114,18 @@ static struct cache_head *sunrpc_cache_add_entry(struct cache_detail *detail,
 
 	/* check if entry appeared while we slept */
 	hlist_for_each_entry_rcu(tmp, head, cache_list) {
-		if (detail->match(tmp, key)) {
-			if (cache_is_expired(detail, tmp)) {
-				sunrpc_begin_cache_remove_entry(tmp, detail);
-				freeme = tmp;
-				break;
-			}
-			cache_get(tmp);
-			spin_unlock(&detail->hash_lock);
-			cache_put(new, detail);
-			return tmp;
+		if (!detail->match(tmp, key))
+			continue;
+		if (test_bit(CACHE_VALID, &tmp->flags) &&
+		    cache_is_expired(detail, tmp)) {
+			sunrpc_begin_cache_remove_entry(tmp, detail);
+			freeme = tmp;
+			break;
 		}
+		cache_get(tmp);
+		spin_unlock(&detail->hash_lock);
+		cache_put(new, detail);
+		return tmp;
 	}
 
 	hlist_add_head_rcu(&new->cache_list, head);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 7/8] sunrpc: Add tracing for cache events
  2020-03-01 23:21           ` [PATCH 6/8] SUNRPC/cache: Allow garbage collection of invalid cache entries Trond Myklebust
@ 2020-03-01 23:21             ` Trond Myklebust
  2020-03-01 23:21               ` [PATCH 8/8] sunrpc: Drop the connection when the server drops a request Trond Myklebust
  0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2020-03-01 23:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

Add basic tracing for debugging the sunrpc cache events.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 include/trace/events/sunrpc.h | 33 +++++++++++++++++++++++++++++++++
 net/sunrpc/cache.c            | 35 ++++++++++++++++++++++++++---------
 2 files changed, 59 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index ee993575d2fa..236d42539c7b 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -1292,6 +1292,39 @@ DECLARE_EVENT_CLASS(svc_deferred_event,
 DEFINE_SVC_DEFERRED_EVENT(drop);
 DEFINE_SVC_DEFERRED_EVENT(revisit);
 
+DECLARE_EVENT_CLASS(cache_event,
+	TP_PROTO(
+		const struct cache_detail *cd,
+		const struct cache_head *h
+	),
+
+	TP_ARGS(cd, h),
+
+	TP_STRUCT__entry(
+		__field(const struct cache_head *, h)
+		__string(name, cd->name)
+	),
+
+	TP_fast_assign(
+		__entry->h = h;
+		__assign_str(name, cd->name);
+	),
+
+	TP_printk("cache=%s entry=%p", __get_str(name), __entry->h)
+);
+#define DEFINE_CACHE_EVENT(name) \
+	DEFINE_EVENT(cache_event, name, \
+			TP_PROTO( \
+				const struct cache_detail *cd, \
+				const struct cache_head *h \
+			), \
+			TP_ARGS(cd, h))
+DEFINE_CACHE_EVENT(cache_entry_expired);
+DEFINE_CACHE_EVENT(cache_entry_upcall);
+DEFINE_CACHE_EVENT(cache_entry_update);
+DEFINE_CACHE_EVENT(cache_entry_make_negative);
+DEFINE_CACHE_EVENT(cache_entry_no_listener);
+
 #endif /* _TRACE_SUNRPC_H */
 
 #include <trace/define_trace.h>
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index b7ddb2affb7e..559a61644037 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -32,6 +32,7 @@
 #include <linux/sunrpc/cache.h>
 #include <linux/sunrpc/stats.h>
 #include <linux/sunrpc/rpc_pipe_fs.h>
+#include <trace/events/sunrpc.h>
 #include "netns.h"
 
 #define	 RPCDBG_FACILITY RPCDBG_CACHE
@@ -119,6 +120,7 @@ static struct cache_head *sunrpc_cache_add_entry(struct cache_detail *detail,
 		if (test_bit(CACHE_VALID, &tmp->flags) &&
 		    cache_is_expired(detail, tmp)) {
 			sunrpc_begin_cache_remove_entry(tmp, detail);
+			trace_cache_entry_expired(detail, tmp);
 			freeme = tmp;
 			break;
 		}
@@ -175,6 +177,24 @@ static void cache_fresh_unlocked(struct cache_head *head,
 	}
 }
 
+static void cache_make_negative(struct cache_detail *detail,
+				struct cache_head *h)
+{
+	set_bit(CACHE_NEGATIVE, &h->flags);
+	trace_cache_entry_make_negative(detail, h);
+}
+
+static void cache_entry_update(struct cache_detail *detail,
+			       struct cache_head *h,
+			       struct cache_head *new)
+{
+	if (!test_bit(CACHE_NEGATIVE, &new->flags)) {
+		detail->update(h, new);
+		trace_cache_entry_update(detail, h);
+	} else
+		cache_make_negative(detail, h);
+}
+
 struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 				       struct cache_head *new, struct cache_head *old, int hash)
 {
@@ -187,10 +207,7 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 	if (!test_bit(CACHE_VALID, &old->flags)) {
 		spin_lock(&detail->hash_lock);
 		if (!test_bit(CACHE_VALID, &old->flags)) {
-			if (test_bit(CACHE_NEGATIVE, &new->flags))
-				set_bit(CACHE_NEGATIVE, &old->flags);
-			else
-				detail->update(old, new);
+			cache_entry_update(detail, old, new);
 			cache_fresh_locked(old, new->expiry_time, detail);
 			spin_unlock(&detail->hash_lock);
 			cache_fresh_unlocked(old, detail);
@@ -208,10 +225,7 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 	detail->init(tmp, old);
 
 	spin_lock(&detail->hash_lock);
-	if (test_bit(CACHE_NEGATIVE, &new->flags))
-		set_bit(CACHE_NEGATIVE, &tmp->flags);
-	else
-		detail->update(tmp, new);
+	cache_entry_update(detail, tmp, new);
 	hlist_add_head(&tmp->cache_list, &detail->hash_table[hash]);
 	detail->entries++;
 	cache_get(tmp);
@@ -253,7 +267,7 @@ static int try_to_negate_entry(struct cache_detail *detail, struct cache_head *h
 	spin_lock(&detail->hash_lock);
 	rv = cache_is_valid(h);
 	if (rv == -EAGAIN) {
-		set_bit(CACHE_NEGATIVE, &h->flags);
+		cache_make_negative(detail, h);
 		cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY,
 				   detail);
 		rv = -ENOENT;
@@ -459,6 +473,7 @@ static int cache_clean(void)
 				continue;
 
 			sunrpc_begin_cache_remove_entry(ch, current_detail);
+			trace_cache_entry_expired(current_detail, ch);
 			rv = 1;
 			break;
 		}
@@ -1214,6 +1229,7 @@ static int cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h)
 	if (test_bit(CACHE_PENDING, &h->flags)) {
 		crq->item = cache_get(h);
 		list_add_tail(&crq->q.list, &detail->queue);
+		trace_cache_entry_upcall(detail, h);
 	} else
 		/* Lost a race, no longer PENDING, so don't enqueue */
 		ret = -EAGAIN;
@@ -1239,6 +1255,7 @@ int sunrpc_cache_pipe_upcall_timeout(struct cache_detail *detail,
 {
 	if (!cache_listeners_exist(detail)) {
 		warn_no_listener(detail);
+		trace_cache_entry_no_listener(detail, h);
 		return -EINVAL;
 	}
 	return sunrpc_cache_pipe_upcall(detail, h);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 8/8] sunrpc: Drop the connection when the server drops a request
  2020-03-01 23:21             ` [PATCH 7/8] sunrpc: Add tracing for cache events Trond Myklebust
@ 2020-03-01 23:21               ` Trond Myklebust
  2020-03-02 16:27                 ` David Wysochanski
  0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2020-03-01 23:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

If a server wants to drop a request, then it should also drop the
connection, in order to let the client know.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 net/sunrpc/svc_xprt.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index de3c077733a7..83a527e56c87 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -873,6 +873,13 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
 }
 EXPORT_SYMBOL_GPL(svc_recv);
 
+static void svc_drop_connection(struct svc_xprt *xprt)
+{
+	if (test_bit(XPT_TEMP, &xprt->xpt_flags) &&
+	    !test_and_set_bit(XPT_CLOSE, &xprt->xpt_flags))
+		svc_xprt_enqueue(xprt);
+}
+
 /*
  * Drop request
  */
@@ -880,6 +887,8 @@ void svc_drop(struct svc_rqst *rqstp)
 {
 	trace_svc_drop(rqstp);
 	dprintk("svc: xprt %p dropped request\n", rqstp->rq_xprt);
+	/* Close the connection when dropping a request */
+	svc_drop_connection(rqstp->rq_xprt);
 	svc_xprt_release(rqstp);
 }
 EXPORT_SYMBOL_GPL(svc_drop);
@@ -1148,6 +1157,7 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
 	if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
 		spin_unlock(&xprt->xpt_lock);
 		dprintk("revisit canceled\n");
+		svc_drop_connection(xprt);
 		svc_xprt_put(xprt);
 		trace_svc_drop_deferred(dr);
 		kfree(dr);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 8/8] sunrpc: Drop the connection when the server drops a request
  2020-03-01 23:21               ` [PATCH 8/8] sunrpc: Drop the connection when the server drops a request Trond Myklebust
@ 2020-03-02 16:27                 ` David Wysochanski
  2020-03-02 20:12                   ` Trond Myklebust
  0 siblings, 1 reply; 14+ messages in thread
From: David Wysochanski @ 2020-03-02 16:27 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: J. Bruce Fields, linux-nfs

On Sun, Mar 1, 2020 at 6:25 PM Trond Myklebust <trondmy@gmail.com> wrote:
>
> If a server wants to drop a request, then it should also drop the
> connection, in order to let the client know.
>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
>  net/sunrpc/svc_xprt.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index de3c077733a7..83a527e56c87 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -873,6 +873,13 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
>  }
>  EXPORT_SYMBOL_GPL(svc_recv);
>
> +static void svc_drop_connection(struct svc_xprt *xprt)
> +{
> +       if (test_bit(XPT_TEMP, &xprt->xpt_flags) &&
> +           !test_and_set_bit(XPT_CLOSE, &xprt->xpt_flags))
> +               svc_xprt_enqueue(xprt);
> +}
> +
>  /*
>   * Drop request
>   */
> @@ -880,6 +887,8 @@ void svc_drop(struct svc_rqst *rqstp)
>  {
>         trace_svc_drop(rqstp);
>         dprintk("svc: xprt %p dropped request\n", rqstp->rq_xprt);
> +       /* Close the connection when dropping a request */
> +       svc_drop_connection(rqstp->rq_xprt);
>         svc_xprt_release(rqstp);
>  }
>  EXPORT_SYMBOL_GPL(svc_drop);
> @@ -1148,6 +1157,7 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
>         if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
>                 spin_unlock(&xprt->xpt_lock);
>                 dprintk("revisit canceled\n");
> +               svc_drop_connection(xprt);
>                 svc_xprt_put(xprt);
>                 trace_svc_drop_deferred(dr);
>                 kfree(dr);
> --
> 2.24.1
>

Trond, back in 2014 you had this NFSv4 only patch that took a more
surgical approach:
https://marc.info/?l=linux-nfs&m=141414531832768&w=2

It looks like discussion died out on it after it was ineffective to
solve a different problem.
Is there a reason why you don't want to do that approach now?


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 8/8] sunrpc: Drop the connection when the server drops a request
  2020-03-02 16:27                 ` David Wysochanski
@ 2020-03-02 20:12                   ` Trond Myklebust
  0 siblings, 0 replies; 14+ messages in thread
From: Trond Myklebust @ 2020-03-02 20:12 UTC (permalink / raw)
  To: dwysocha; +Cc: linux-nfs, bfields

On Mon, 2020-03-02 at 11:27 -0500, David Wysochanski wrote:
> On Sun, Mar 1, 2020 at 6:25 PM Trond Myklebust <trondmy@gmail.com>
> wrote:
> > If a server wants to drop a request, then it should also drop the
> > connection, in order to let the client know.
> > 
> > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > ---
> >  net/sunrpc/svc_xprt.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> > index de3c077733a7..83a527e56c87 100644
> > --- a/net/sunrpc/svc_xprt.c
> > +++ b/net/sunrpc/svc_xprt.c
> > @@ -873,6 +873,13 @@ int svc_recv(struct svc_rqst *rqstp, long
> > timeout)
> >  }
> >  EXPORT_SYMBOL_GPL(svc_recv);
> > 
> > +static void svc_drop_connection(struct svc_xprt *xprt)
> > +{
> > +       if (test_bit(XPT_TEMP, &xprt->xpt_flags) &&
> > +           !test_and_set_bit(XPT_CLOSE, &xprt->xpt_flags))
> > +               svc_xprt_enqueue(xprt);
> > +}
> > +
> >  /*
> >   * Drop request
> >   */
> > @@ -880,6 +887,8 @@ void svc_drop(struct svc_rqst *rqstp)
> >  {
> >         trace_svc_drop(rqstp);
> >         dprintk("svc: xprt %p dropped request\n", rqstp->rq_xprt);
> > +       /* Close the connection when dropping a request */
> > +       svc_drop_connection(rqstp->rq_xprt);
> >         svc_xprt_release(rqstp);
> >  }
> >  EXPORT_SYMBOL_GPL(svc_drop);
> > @@ -1148,6 +1157,7 @@ static void svc_revisit(struct
> > cache_deferred_req *dreq, int too_many)
> >         if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
> >                 spin_unlock(&xprt->xpt_lock);
> >                 dprintk("revisit canceled\n");
> > +               svc_drop_connection(xprt);
> >                 svc_xprt_put(xprt);
> >                 trace_svc_drop_deferred(dr);
> >                 kfree(dr);
> > --
> > 2.24.1
> > 
> 
> Trond, back in 2014 you had this NFSv4 only patch that took a more
> surgical approach:
> https://marc.info/?l=linux-nfs&m=141414531832768&w=2
> 
> It looks like discussion died out on it after it was ineffective to
> solve a different problem.
> Is there a reason why you don't want to do that approach now?
> 

Let me resend this patch with a better proposal. I think the main 2
problems here are really

   1. the svc_revisit() case, where we cancel the revisit. That case
      affects all versions of NFS, and can lead to performance issues.
   2. the NFSv2,v3,v4.0 replay cache, where dropping the replay (e.g.
      after a connection breakage) can cause a performance hit, and for
      something like TCP, which has long (usually 60 second) timeouts it
      could cause the replay to be delayed until after the reply gets
      kicked out of the cache. This is the case where NFSv4.0 can probably
      end up hanging, since the replay won't be forthcoming until a new
      connection breakage occurs.

I think (1) is best served by a patch like this one.
Perhaps (2) is better served by adopting the svc_defer() mechanism?

Hmm... Perhaps 2 patches are in order...

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry()
  2020-03-01 23:21   ` [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry() Trond Myklebust
  2020-03-01 23:21     ` [PATCH 3/8] nfsd: Add tracepoints for exp_find_key() and exp_get_by_name() Trond Myklebust
@ 2020-03-03  0:22     ` Chuck Lever
  2020-03-03  5:59       ` Trond Myklebust
  1 sibling, 1 reply; 14+ messages in thread
From: Chuck Lever @ 2020-03-03  0:22 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Bruce Fields, Linux NFS Mailing List



> On Mar 1, 2020, at 6:21 PM, Trond Myklebust <trondmy@gmail.com> wrote:
> 
> Add tracing to allow us to figure out where any stale filehandle issues
> may be originating from.
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> fs/nfsd/nfsfh.c | 13 ++++++++++---
> fs/nfsd/trace.h | 30 ++++++++++++++++++++++++++++++
> 2 files changed, 40 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c
> index b319080288c3..37bc8f5f4514 100644
> --- a/fs/nfsd/nfsfh.c
> +++ b/fs/nfsd/nfsfh.c
> @@ -14,6 +14,7 @@
> #include "nfsd.h"
> #include "vfs.h"
> #include "auth.h"
> +#include "trace.h"
> 
> #define NFSDDBG_FACILITY		NFSDDBG_FH
> 
> @@ -209,11 +210,14 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp)
> 	}
> 
> 	error = nfserr_stale;
> -	if (PTR_ERR(exp) == -ENOENT)
> -		return error;
> +	if (IS_ERR(exp)) {
> +		trace_nfsd_set_fh_dentry_badexport(rqstp, fhp, PTR_ERR(exp));
> +
> +		if (PTR_ERR(exp) == -ENOENT)
> +			return error;
> 
> -	if (IS_ERR(exp))
> 		return nfserrno(PTR_ERR(exp));
> +	}
> 
> 	if (exp->ex_flags & NFSEXP_NOSUBTREECHECK) {
> 		/* Elevate privileges so that the lack of 'r' or 'x'
> @@ -267,6 +271,9 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp)
> 		dentry = exportfs_decode_fh(exp->ex_path.mnt, fid,
> 				data_left, fileid_type,
> 				nfsd_acceptable, exp);
> +		if (IS_ERR_OR_NULL(dentry))
> +			trace_nfsd_set_fh_dentry_badhandle(rqstp, fhp,
> +					dentry ?  PTR_ERR(dentry) : -ESTALE);

If you'll be respinning this series, a handful of nits:

- the line above has a double space
- the trace point names here are a little long, will result in
  hard-to-read formatting in the trace log
- checkpatch.pl complains about a couple of the later patches,
  where one arm of an "if" statement has braces but the other
  does not


> 	}
> 	if (dentry == NULL)
> 		goto out;
> diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
> index 06dd0d337049..9abd1591a841 100644
> --- a/fs/nfsd/trace.h
> +++ b/fs/nfsd/trace.h
> @@ -50,6 +50,36 @@ TRACE_EVENT(nfsd_compound_status,
> 		__get_str(name), __entry->status)
> )
> 
> +DECLARE_EVENT_CLASS(nfsd_fh_err_class,
> +	TP_PROTO(struct svc_rqst *rqstp,
> +		 struct svc_fh	*fhp,
> +		 int		status),
> +	TP_ARGS(rqstp, fhp, status),
> +	TP_STRUCT__entry(
> +		__field(u32, xid)
> +		__field(u32, fh_hash)
> +		__field(int, status)
> +	),
> +	TP_fast_assign(
> +		__entry->xid = be32_to_cpu(rqstp->rq_xid);
> +		__entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle);
> +		__entry->status = status;
> +	),
> +	TP_printk("xid=0x%08x fh_hash=0x%08x status=%d",
> +		  __entry->xid, __entry->fh_hash,
> +		  __entry->status)
> +)
> +
> +#define DEFINE_NFSD_FH_ERR_EVENT(name)		\
> +DEFINE_EVENT(nfsd_fh_err_class, nfsd_##name,	\
> +	TP_PROTO(struct svc_rqst *rqstp,	\
> +		 struct svc_fh	*fhp,		\
> +		 int		status),	\
> +	TP_ARGS(rqstp, fhp, status))
> +
> +DEFINE_NFSD_FH_ERR_EVENT(set_fh_dentry_badexport);
> +DEFINE_NFSD_FH_ERR_EVENT(set_fh_dentry_badhandle);
> +
> DECLARE_EVENT_CLASS(nfsd_io_class,
> 	TP_PROTO(struct svc_rqst *rqstp,
> 		 struct svc_fh	*fhp,
> -- 
> 2.24.1
> 

--
Chuck Lever
chucklever@gmail.com




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry()
  2020-03-03  0:22     ` [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry() Chuck Lever
@ 2020-03-03  5:59       ` Trond Myklebust
  2020-03-03 13:27         ` Chuck Lever
  0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2020-03-03  5:59 UTC (permalink / raw)
  To: chucklever; +Cc: linux-nfs, bfields

On Mon, 2020-03-02 at 19:22 -0500, Chuck Lever wrote:
> > On Mar 1, 2020, at 6:21 PM, Trond Myklebust <trondmy@gmail.com>
> > wrote:
> > 
> > Add tracing to allow us to figure out where any stale filehandle
> > issues
> > may be originating from.
> > 
> > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > ---
> > fs/nfsd/nfsfh.c | 13 ++++++++++---
> > fs/nfsd/trace.h | 30 ++++++++++++++++++++++++++++++
> > 2 files changed, 40 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c
> > index b319080288c3..37bc8f5f4514 100644
> > --- a/fs/nfsd/nfsfh.c
> > +++ b/fs/nfsd/nfsfh.c
> > @@ -14,6 +14,7 @@
> > #include "nfsd.h"
> > #include "vfs.h"
> > #include "auth.h"
> > +#include "trace.h"
> > 
> > #define NFSDDBG_FACILITY		NFSDDBG_FH
> > 
> > @@ -209,11 +210,14 @@ static __be32 nfsd_set_fh_dentry(struct
> > svc_rqst *rqstp, struct svc_fh *fhp)
> > 	}
> > 
> > 	error = nfserr_stale;
> > -	if (PTR_ERR(exp) == -ENOENT)
> > -		return error;
> > +	if (IS_ERR(exp)) {
> > +		trace_nfsd_set_fh_dentry_badexport(rqstp, fhp,
> > PTR_ERR(exp));
> > +
> > +		if (PTR_ERR(exp) == -ENOENT)
> > +			return error;
> > 
> > -	if (IS_ERR(exp))
> > 		return nfserrno(PTR_ERR(exp));
> > +	}
> > 
> > 	if (exp->ex_flags & NFSEXP_NOSUBTREECHECK) {
> > 		/* Elevate privileges so that the lack of 'r' or 'x'
> > @@ -267,6 +271,9 @@ static __be32 nfsd_set_fh_dentry(struct
> > svc_rqst *rqstp, struct svc_fh *fhp)
> > 		dentry = exportfs_decode_fh(exp->ex_path.mnt, fid,
> > 				data_left, fileid_type,
> > 				nfsd_acceptable, exp);
> > +		if (IS_ERR_OR_NULL(dentry))
> > +			trace_nfsd_set_fh_dentry_badhandle(rqstp, fhp,
> > +					dentry ?  PTR_ERR(dentry) :
> > -ESTALE);
> 
> If you'll be respinning this series, a handful of nits:
> 

I see no need to respin the entire series. Just the last patch.

> - the line above has a double space
> - the trace point names here are a little long, will result in
>   hard-to-read formatting in the trace log
> - checkpatch.pl complains about a couple of the later patches,
>   where one arm of an "if" statement has braces but the other
>   does not
> 
> > 	}
> > 	if (dentry == NULL)
> > 		goto out;
> > diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
> > index 06dd0d337049..9abd1591a841 100644
> > --- a/fs/nfsd/trace.h
> > +++ b/fs/nfsd/trace.h
> > @@ -50,6 +50,36 @@ TRACE_EVENT(nfsd_compound_status,
> > 		__get_str(name), __entry->status)
> > )
> > 
> > +DECLARE_EVENT_CLASS(nfsd_fh_err_class,
> > +	TP_PROTO(struct svc_rqst *rqstp,
> > +		 struct svc_fh	*fhp,
> > +		 int		status),
> > +	TP_ARGS(rqstp, fhp, status),
> > +	TP_STRUCT__entry(
> > +		__field(u32, xid)
> > +		__field(u32, fh_hash)
> > +		__field(int, status)
> > +	),
> > +	TP_fast_assign(
> > +		__entry->xid = be32_to_cpu(rqstp->rq_xid);
> > +		__entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle);
> > +		__entry->status = status;
> > +	),
> > +	TP_printk("xid=0x%08x fh_hash=0x%08x status=%d",
> > +		  __entry->xid, __entry->fh_hash,
> > +		  __entry->status)
> > +)
> > +
> > +#define DEFINE_NFSD_FH_ERR_EVENT(name)		\
> > +DEFINE_EVENT(nfsd_fh_err_class, nfsd_##name,	\
> > +	TP_PROTO(struct svc_rqst *rqstp,	\
> > +		 struct svc_fh	*fhp,		\
> > +		 int		status),	\
> > +	TP_ARGS(rqstp, fhp, status))
> > +
> > +DEFINE_NFSD_FH_ERR_EVENT(set_fh_dentry_badexport);
> > +DEFINE_NFSD_FH_ERR_EVENT(set_fh_dentry_badhandle);
> > +
> > DECLARE_EVENT_CLASS(nfsd_io_class,
> > 	TP_PROTO(struct svc_rqst *rqstp,
> > 		 struct svc_fh	*fhp,
> > -- 
> > 2.24.1
> > 
> 
> --
> Chuck Lever
> chucklever@gmail.com
> 
> 
> 
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry()
  2020-03-03  5:59       ` Trond Myklebust
@ 2020-03-03 13:27         ` Chuck Lever
  0 siblings, 0 replies; 14+ messages in thread
From: Chuck Lever @ 2020-03-03 13:27 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: chucklever, linux-nfs, bfields


> On Mar 3, 2020, at 12:59 AM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Mon, 2020-03-02 at 19:22 -0500, Chuck Lever wrote:
>>> On Mar 1, 2020, at 6:21 PM, Trond Myklebust <trondmy@gmail.com>
>>> wrote:
>>> 
>>> Add tracing to allow us to figure out where any stale filehandle
>>> issues
>>> may be originating from.
>>> 
>>> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
>>> ---
>>> fs/nfsd/nfsfh.c | 13 ++++++++++---
>>> fs/nfsd/trace.h | 30 ++++++++++++++++++++++++++++++
>>> 2 files changed, 40 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c
>>> index b319080288c3..37bc8f5f4514 100644
>>> --- a/fs/nfsd/nfsfh.c
>>> +++ b/fs/nfsd/nfsfh.c
>>> @@ -14,6 +14,7 @@
>>> #include "nfsd.h"
>>> #include "vfs.h"
>>> #include "auth.h"
>>> +#include "trace.h"
>>> 
>>> #define NFSDDBG_FACILITY        NFSDDBG_FH
>>> 
>>> @@ -209,11 +210,14 @@ static __be32 nfsd_set_fh_dentry(struct
>>> svc_rqst *rqstp, struct svc_fh *fhp)
>>>    }
>>> 
>>>    error = nfserr_stale;
>>> -    if (PTR_ERR(exp) == -ENOENT)
>>> -        return error;
>>> +    if (IS_ERR(exp)) {
>>> +        trace_nfsd_set_fh_dentry_badexport(rqstp, fhp,
>>> PTR_ERR(exp));
>>> +
>>> +        if (PTR_ERR(exp) == -ENOENT)
>>> +            return error;
>>> 
>>> -    if (IS_ERR(exp))
>>>        return nfserrno(PTR_ERR(exp));
>>> +    }
>>> 
>>>    if (exp->ex_flags & NFSEXP_NOSUBTREECHECK) {
>>>        /* Elevate privileges so that the lack of 'r' or 'x'
>>> @@ -267,6 +271,9 @@ static __be32 nfsd_set_fh_dentry(struct
>>> svc_rqst *rqstp, struct svc_fh *fhp)
>>>        dentry = exportfs_decode_fh(exp->ex_path.mnt, fid,
>>>                data_left, fileid_type,
>>>                nfsd_acceptable, exp);
>>> +        if (IS_ERR_OR_NULL(dentry))
>>> +            trace_nfsd_set_fh_dentry_badhandle(rqstp, fhp,
>>> +                    dentry ?  PTR_ERR(dentry) :
>>> -ESTALE);
>> 
>> If you'll be respinning this series, a handful of nits:
>> 
> 
> I see no need to respin the entire series. Just the last patch.

Fair enough.


>> - the line above has a double space
>> - the trace point names here are a little long, will result in
>>  hard-to-read formatting in the trace log
>> - checkpatch.pl complains about a couple of the later patches,
>>  where one arm of an "if" statement has braces but the other
>>  does not
>> 
>>>    }
>>>    if (dentry == NULL)
>>>        goto out;
>>> diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
>>> index 06dd0d337049..9abd1591a841 100644
>>> --- a/fs/nfsd/trace.h
>>> +++ b/fs/nfsd/trace.h
>>> @@ -50,6 +50,36 @@ TRACE_EVENT(nfsd_compound_status,
>>>        __get_str(name), __entry->status)
>>> )
>>> 
>>> +DECLARE_EVENT_CLASS(nfsd_fh_err_class,
>>> +    TP_PROTO(struct svc_rqst *rqstp,
>>> +         struct svc_fh    *fhp,
>>> +         int        status),
>>> +    TP_ARGS(rqstp, fhp, status),
>>> +    TP_STRUCT__entry(
>>> +        __field(u32, xid)
>>> +        __field(u32, fh_hash)
>>> +        __field(int, status)
>>> +    ),
>>> +    TP_fast_assign(
>>> +        __entry->xid = be32_to_cpu(rqstp->rq_xid);
>>> +        __entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle);
>>> +        __entry->status = status;
>>> +    ),
>>> +    TP_printk("xid=0x%08x fh_hash=0x%08x status=%d",
>>> +          __entry->xid, __entry->fh_hash,
>>> +          __entry->status)
>>> +)
>>> +
>>> +#define DEFINE_NFSD_FH_ERR_EVENT(name)        \
>>> +DEFINE_EVENT(nfsd_fh_err_class, nfsd_##name,    \
>>> +    TP_PROTO(struct svc_rqst *rqstp,    \
>>> +         struct svc_fh    *fhp,        \
>>> +         int        status),    \
>>> +    TP_ARGS(rqstp, fhp, status))
>>> +
>>> +DEFINE_NFSD_FH_ERR_EVENT(set_fh_dentry_badexport);
>>> +DEFINE_NFSD_FH_ERR_EVENT(set_fh_dentry_badhandle);
>>> +
>>> DECLARE_EVENT_CLASS(nfsd_io_class,
>>>    TP_PROTO(struct svc_rqst *rqstp,
>>>         struct svc_fh    *fhp,
>>> -- 
>>> 2.24.1
>>> 
>> 
>> --
>> Chuck Lever
>> chucklever@gmail.com
>> 
>> 
>> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
> 
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-03-03 13:28 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-01 23:21 [PATCH 0/8] Bugfixes and tracing enhancements for knfsd Trond Myklebust
2020-03-01 23:21 ` [PATCH 1/8] nfsd: Don't add locks to closed or closing open stateids Trond Myklebust
2020-03-01 23:21   ` [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry() Trond Myklebust
2020-03-01 23:21     ` [PATCH 3/8] nfsd: Add tracepoints for exp_find_key() and exp_get_by_name() Trond Myklebust
2020-03-01 23:21       ` [PATCH 4/8] nfsd: Add tracepoints for update of the expkey and export cache entries Trond Myklebust
2020-03-01 23:21         ` [PATCH 5/8] nfsd: export upcalls must not return ESTALE when mountd is down Trond Myklebust
2020-03-01 23:21           ` [PATCH 6/8] SUNRPC/cache: Allow garbage collection of invalid cache entries Trond Myklebust
2020-03-01 23:21             ` [PATCH 7/8] sunrpc: Add tracing for cache events Trond Myklebust
2020-03-01 23:21               ` [PATCH 8/8] sunrpc: Drop the connection when the server drops a request Trond Myklebust
2020-03-02 16:27                 ` David Wysochanski
2020-03-02 20:12                   ` Trond Myklebust
2020-03-03  0:22     ` [PATCH 2/8] nfsd: Add tracing to nfsd_set_fh_dentry() Chuck Lever
2020-03-03  5:59       ` Trond Myklebust
2020-03-03 13:27         ` Chuck Lever

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.