linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/10] exposing knfsd opens to userspace
@ 2019-04-25 14:04 J. Bruce Fields
  2019-04-25 14:04 ` [PATCH 01/10] nfsd: persist nfsd filesystem across mounts J. Bruce Fields
                   ` (12 more replies)
  0 siblings, 13 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 14:04 UTC (permalink / raw)
  To: linux-nfs
  Cc: linux-fsdevel, abe, lsof-l, util-linux, jlayton, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

The following patches expose information about NFSv4 opens held by knfsd
on behalf of NFSv4 clients.  Those are currently invisible to userspace,
unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).

The approach is to add a new directory /proc/fs/nfsd/clients/ with
subdirectories for each active NFSv4 client.  Each subdirectory has an
"info" file with some basic information to help identify the client and
an "opens" directory that lists the opens held by that client.

I got it working by cobbling together some poorly-understood code I
found in libfs, rpc_pipefs and elsewhere.  If anyone wants to wade in
and tell me what I've got wrong, they're more than welcome, but at this
stage I'm more curious for feedback on the interface.

I'm also cc'ing people responsible for lsof and util-linux in case they
have any opinions.

Currently these pseudofiles look like:

  # find /proc/fs/nfsd/clients -type f|xargs tail 
  ==> /proc/fs/nfsd/clients/3741/opens <==
  5cc0cd36/6debfb50/00000001/00000001	rw	--	fd:10:13649	'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
  5cc0cd36/6debfb50/00000003/00000001	r-	--	fd:10:13650	'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'

  ==> /proc/fs/nfsd/clients/3741/info <==
  clientid: 6debfb505cc0cd36
  address: 192.168.122.36:0
  name: Linux NFSv4.2 test2.fieldses.org
  minor version: 2

Each line of the "opens" file is tab-delimited and describes one open,
and the fields are stateid, open access bits, deny bits,
major:minor:ino, and open owner.

So, some random questions:

	- I just copied the major:minor:ino thing from /proc/locks, I
	  suspect we would have picked something different to identify
	  inodes if /proc/locks were done now.  (Mount id and inode?
	  Something else?)

	- The open owner is just an opaque blob of binary data, but
	  clients may choose to include some useful asci-encoded
	  information, so I'm formatting them as strings with non-ascii
	  stuff escaped.  For example, pynfs usually uses the name of
	  the test as the open owner.  But as you see above, the ascii
	  content of the Linux client's open owners is less useful.
	  Also, there's no way I know of to map them back to a file
	  description or process or anything else useful on the client,
	  so perhaps they're of limited interest.

	- I'm not sure about the stateid either.  I did think it might
	  be useful just as a unique identifier for each line.
	  (Actually for that it'd be enough to take just the third of
	  those four numbers making up the stateid--maybe that would be
	  better.)

In the "info" file, the "name" line is the client identifier/client
owner provided by the client, which (like the stateowner) is just opaque
binary data, though as you can see here the Linux client is providing a
readable ascii string.

There's probably a lot more we could add to that info file eventually.

Other stuff to add next:

	- nfsd/clients/#/kill that you can write to to revoke all a
	  client's state if it's wedged somehow.
	- lists of locks and delegations; lower priority since most of
	  that information is already in /proc/locks.

--b.

J. Bruce Fields (10):
  nfsd: persist nfsd filesystem across mounts
  nfsd: rename cl_refcount
  nfsd4: use reference count to free client
  nfsd: add nfsd/clients directory
  nfsd: make client/ directory names small ints
  rpc: replace rpc_filelist by tree_descr
  nfsd4: add a client info file
  nfsd4: add file to display list of client's opens
  nfsd: expose some more information about NFSv4 opens
  nfsd: add more information to client info file

 fs/nfsd/netns.h                |   6 +
 fs/nfsd/nfs4state.c            | 228 ++++++++++++++++++++++++++++++---
 fs/nfsd/nfsctl.c               | 225 +++++++++++++++++++++++++++++++-
 fs/nfsd/nfsd.h                 |  11 ++
 fs/nfsd/state.h                |   9 +-
 fs/seq_file.c                  |  17 +++
 include/linux/seq_file.h       |   2 +
 include/linux/string_helpers.h |   1 +
 lib/string_helpers.c           |   5 +-
 net/sunrpc/rpc_pipe.c          |  37 ++----
 10 files changed, 491 insertions(+), 50 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 01/10] nfsd: persist nfsd filesystem across mounts
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
@ 2019-04-25 14:04 ` J. Bruce Fields
  2019-04-25 14:04 ` [PATCH 02/10] nfsd: rename cl_refcount J. Bruce Fields
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 14:04 UTC (permalink / raw)
  To: linux-nfs
  Cc: linux-fsdevel, abe, lsof-l, util-linux, jlayton, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Keep around one internal mount of the nfsd filesystem so that we can
e.g. add stuff to it when clients come and go, regardless of whether
anyone has it mounted.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/netns.h  |  3 +++
 fs/nfsd/nfsctl.c | 13 +++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 32cb8c027483..cce335e1ec98 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -55,6 +55,9 @@ struct nfsd_net {
 	bool grace_ended;
 	time_t boot_time;
 
+	/* internal mount of the "nfsd" pseudofilesystem: */
+	struct vfsmount *nfsd_mnt;
+
 	/*
 	 * reclaim_str_hashtbl[] holds known client info from previous reset/reboot
 	 * used in reboot/reset lease grace period processing
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index f2feb2d11bae..8d2062428569 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1231,6 +1231,7 @@ unsigned int nfsd_net_id;
 static __net_init int nfsd_init_net(struct net *net)
 {
 	int retval;
+	struct vfsmount *mnt;
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
 	retval = nfsd_export_init(net);
@@ -1248,8 +1249,17 @@ static __net_init int nfsd_init_net(struct net *net)
 
 	atomic_set(&nn->ntf_refcnt, 0);
 	init_waitqueue_head(&nn->ntf_wq);
+
+	mnt =  vfs_kern_mount(&nfsd_fs_type, SB_KERNMOUNT, "nfsd", NULL);
+	if (IS_ERR(mnt)) {
+		retval = PTR_ERR(mnt);
+		goto out_mount_err;
+	}
+	nn->nfsd_mnt = mnt;
 	return 0;
 
+out_mount_err:
+	nfsd_idmap_shutdown(net);
 out_idmap_error:
 	nfsd_export_shutdown(net);
 out_export_error:
@@ -1258,6 +1268,9 @@ static __net_init int nfsd_init_net(struct net *net)
 
 static __net_exit void nfsd_exit_net(struct net *net)
 {
+	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+
+	mntput(nn->nfsd_mnt);
 	nfsd_idmap_shutdown(net);
 	nfsd_export_shutdown(net);
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 02/10] nfsd: rename cl_refcount
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
  2019-04-25 14:04 ` [PATCH 01/10] nfsd: persist nfsd filesystem across mounts J. Bruce Fields
@ 2019-04-25 14:04 ` J. Bruce Fields
  2019-04-25 14:04 ` [PATCH 03/10] nfsd4: use reference count to free client J. Bruce Fields
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 14:04 UTC (permalink / raw)
  To: linux-nfs
  Cc: linux-fsdevel, abe, lsof-l, util-linux, jlayton, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Rename this to a more descriptive name.

Also, I'm going to want a second refcount with a slightly different use.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4state.c | 26 +++++++++++++-------------
 fs/nfsd/state.h     |  2 +-
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 6a45fb00c5fc..9dab61bbd256 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -137,7 +137,7 @@ static __be32 get_client_locked(struct nfs4_client *clp)
 
 	if (is_client_expired(clp))
 		return nfserr_expired;
-	atomic_inc(&clp->cl_refcount);
+	atomic_inc(&clp->cl_rpc_users);
 	return nfs_ok;
 }
 
@@ -169,7 +169,7 @@ static void put_client_renew_locked(struct nfs4_client *clp)
 
 	lockdep_assert_held(&nn->client_lock);
 
-	if (!atomic_dec_and_test(&clp->cl_refcount))
+	if (!atomic_dec_and_test(&clp->cl_rpc_users))
 		return;
 	if (!is_client_expired(clp))
 		renew_client_locked(clp);
@@ -179,7 +179,7 @@ static void put_client_renew(struct nfs4_client *clp)
 {
 	struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 
-	if (!atomic_dec_and_lock(&clp->cl_refcount, &nn->client_lock))
+	if (!atomic_dec_and_lock(&clp->cl_rpc_users, &nn->client_lock))
 		return;
 	if (!is_client_expired(clp))
 		renew_client_locked(clp);
@@ -1847,7 +1847,7 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name)
 	clp->cl_name.len = name.len;
 	INIT_LIST_HEAD(&clp->cl_sessions);
 	idr_init(&clp->cl_stateids);
-	atomic_set(&clp->cl_refcount, 0);
+	atomic_set(&clp->cl_rpc_users, 0);
 	clp->cl_cb_state = NFSD4_CB_UNKNOWN;
 	INIT_LIST_HEAD(&clp->cl_idhash);
 	INIT_LIST_HEAD(&clp->cl_openowners);
@@ -1926,7 +1926,7 @@ unhash_client(struct nfs4_client *clp)
 
 static __be32 mark_client_expired_locked(struct nfs4_client *clp)
 {
-	if (atomic_read(&clp->cl_refcount))
+	if (atomic_read(&clp->cl_rpc_users))
 		return nfserr_jukebox;
 	unhash_client_locked(clp);
 	return nfs_ok;
@@ -4066,7 +4066,7 @@ static __be32 lookup_clientid(clientid_t *clid,
 		spin_unlock(&nn->client_lock);
 		return nfserr_expired;
 	}
-	atomic_inc(&found->cl_refcount);
+	atomic_inc(&found->cl_rpc_users);
 	spin_unlock(&nn->client_lock);
 
 	/* Cache the nfs4_client in cstate! */
@@ -6550,7 +6550,7 @@ nfs4_check_open_reclaim(clientid_t *clid,
 static inline void
 put_client(struct nfs4_client *clp)
 {
-	atomic_dec(&clp->cl_refcount);
+	atomic_dec(&clp->cl_rpc_users);
 }
 
 static struct nfs4_client *
@@ -6668,7 +6668,7 @@ nfsd_inject_add_lock_to_list(struct nfs4_ol_stateid *lst,
 		return;
 
 	lockdep_assert_held(&nn->client_lock);
-	atomic_inc(&clp->cl_refcount);
+	atomic_inc(&clp->cl_rpc_users);
 	list_add(&lst->st_locks, collect);
 }
 
@@ -6697,7 +6697,7 @@ static u64 nfsd_foreach_client_lock(struct nfs4_client *clp, u64 max,
 				 * Despite the fact that these functions deal
 				 * with 64-bit integers for "count", we must
 				 * ensure that it doesn't blow up the
-				 * clp->cl_refcount. Throw a warning if we
+				 * clp->cl_rpc_users. Throw a warning if we
 				 * start to approach INT_MAX here.
 				 */
 				WARN_ON_ONCE(count == (INT_MAX / 2));
@@ -6821,7 +6821,7 @@ nfsd_foreach_client_openowner(struct nfs4_client *clp, u64 max,
 		if (func) {
 			func(oop);
 			if (collect) {
-				atomic_inc(&clp->cl_refcount);
+				atomic_inc(&clp->cl_rpc_users);
 				list_add(&oop->oo_perclient, collect);
 			}
 		}
@@ -6829,7 +6829,7 @@ nfsd_foreach_client_openowner(struct nfs4_client *clp, u64 max,
 		/*
 		 * Despite the fact that these functions deal with
 		 * 64-bit integers for "count", we must ensure that
-		 * it doesn't blow up the clp->cl_refcount. Throw a
+		 * it doesn't blow up the clp->cl_rpc_users. Throw a
 		 * warning if we start to approach INT_MAX here.
 		 */
 		WARN_ON_ONCE(count == (INT_MAX / 2));
@@ -6959,7 +6959,7 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
 			if (dp->dl_time != 0)
 				continue;
 
-			atomic_inc(&clp->cl_refcount);
+			atomic_inc(&clp->cl_rpc_users);
 			WARN_ON(!unhash_delegation_locked(dp));
 			list_add(&dp->dl_recall_lru, victims);
 		}
@@ -6967,7 +6967,7 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
 		/*
 		 * Despite the fact that these functions deal with
 		 * 64-bit integers for "count", we must ensure that
-		 * it doesn't blow up the clp->cl_refcount. Throw a
+		 * it doesn't blow up the clp->cl_rpc_users. Throw a
 		 * warning if we start to approach INT_MAX here.
 		 */
 		WARN_ON_ONCE(count == (INT_MAX / 2));
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 396c76755b03..aa9f1676e88a 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -346,7 +346,7 @@ struct nfs4_client {
 	struct nfsd4_clid_slot	cl_cs_slot;	/* create_session slot */
 	u32			cl_exchange_flags;
 	/* number of rpc's in progress over an associated session: */
-	atomic_t		cl_refcount;
+	atomic_t		cl_rpc_users;
 	struct nfs4_op_map      cl_spo_must_allow;
 
 	/* for nfs41 callbacks */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 03/10] nfsd4: use reference count to free client
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
  2019-04-25 14:04 ` [PATCH 01/10] nfsd: persist nfsd filesystem across mounts J. Bruce Fields
  2019-04-25 14:04 ` [PATCH 02/10] nfsd: rename cl_refcount J. Bruce Fields
@ 2019-04-25 14:04 ` J. Bruce Fields
  2019-04-25 14:04 ` [PATCH 04/10] nfsd: add nfsd/clients directory J. Bruce Fields
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 14:04 UTC (permalink / raw)
  To: linux-nfs
  Cc: linux-fsdevel, abe, lsof-l, util-linux, jlayton, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Keep a second reference count which is what is really used to decide
when to free the client's memory.  File objects under nfsd/client/ will
hold these references.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4state.c | 26 +++++++++++++++++++++-----
 fs/nfsd/state.h     |  1 +
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9dab61bbd256..83d0ee329e14 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1869,6 +1869,24 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name)
 	return NULL;
 }
 
+static void __free_client(struct kref *k)
+{
+	struct nfs4_client *clp = container_of(k, struct nfs4_client, cl_ref);
+
+	free_svc_cred(&clp->cl_cred);
+	kfree(clp->cl_ownerstr_hashtbl);
+	kfree(clp->cl_name.data);
+	idr_destroy(&clp->cl_stateids);
+	if (clp->cl_nfsd_dentry)
+		nfsd_client_rmdir(clp->cl_nfsd_dentry);
+	kmem_cache_free(client_slab, clp);
+}
+
+void drop_client(struct nfs4_client *clp)
+{
+	kref_put(&clp->cl_ref, __free_client);
+}
+
 static void
 free_client(struct nfs4_client *clp)
 {
@@ -1881,11 +1899,7 @@ free_client(struct nfs4_client *clp)
 		free_session(ses);
 	}
 	rpc_destroy_wait_queue(&clp->cl_cb_waitq);
-	free_svc_cred(&clp->cl_cred);
-	kfree(clp->cl_ownerstr_hashtbl);
-	kfree(clp->cl_name.data);
-	idr_destroy(&clp->cl_stateids);
-	kmem_cache_free(client_slab, clp);
+	drop_client(clp);
 }
 
 /* must be called under the client_lock */
@@ -2195,6 +2209,8 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 		free_client(clp);
 		return NULL;
 	}
+
+	kref_init(&clp->cl_ref);
 	nfsd4_init_cb(&clp->cl_cb_null, clp, NULL, NFSPROC4_CLNT_CB_NULL);
 	clp->cl_time = get_seconds();
 	clear_bit(0, &clp->cl_cb_slot_busy);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index aa9f1676e88a..aa26ae520fb6 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -347,6 +347,7 @@ struct nfs4_client {
 	u32			cl_exchange_flags;
 	/* number of rpc's in progress over an associated session: */
 	atomic_t		cl_rpc_users;
+	struct kref		cl_ref;
 	struct nfs4_op_map      cl_spo_must_allow;
 
 	/* for nfs41 callbacks */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 04/10] nfsd: add nfsd/clients directory
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
                   ` (2 preceding siblings ...)
  2019-04-25 14:04 ` [PATCH 03/10] nfsd4: use reference count to free client J. Bruce Fields
@ 2019-04-25 14:04 ` J. Bruce Fields
  2019-04-25 14:04 ` [PATCH 05/10] nfsd: make client/ directory names small ints J. Bruce Fields
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 14:04 UTC (permalink / raw)
  To: linux-nfs
  Cc: linux-fsdevel, abe, lsof-l, util-linux, jlayton, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

I plan to expose some information about nfsv4 clients here.
---
 fs/nfsd/netns.h     |   2 +
 fs/nfsd/nfs4state.c |  23 ++++++----
 fs/nfsd/nfsctl.c    | 108 +++++++++++++++++++++++++++++++++++++++++++-
 fs/nfsd/nfsd.h      |   9 ++++
 fs/nfsd/state.h     |   6 ++-
 5 files changed, 138 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index cce335e1ec98..46d956676480 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -58,6 +58,8 @@ struct nfsd_net {
 	/* internal mount of the "nfsd" pseudofilesystem: */
 	struct vfsmount *nfsd_mnt;
 
+	struct dentry *nfsd_client_dir;
+
 	/*
 	 * reclaim_str_hashtbl[] holds known client info from previous reset/reboot
 	 * used in reboot/reset lease grace period processing
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 83d0ee329e14..dfcb90743861 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1871,20 +1871,19 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name)
 
 static void __free_client(struct kref *k)
 {
-	struct nfs4_client *clp = container_of(k, struct nfs4_client, cl_ref);
+	struct nfsdfs_client *c = container_of(k, struct nfsdfs_client, cl_ref);
+	struct nfs4_client *clp = container_of(c, struct nfs4_client, cl_nfsdfs);
 
 	free_svc_cred(&clp->cl_cred);
 	kfree(clp->cl_ownerstr_hashtbl);
 	kfree(clp->cl_name.data);
 	idr_destroy(&clp->cl_stateids);
-	if (clp->cl_nfsd_dentry)
-		nfsd_client_rmdir(clp->cl_nfsd_dentry);
 	kmem_cache_free(client_slab, clp);
 }
 
 void drop_client(struct nfs4_client *clp)
 {
-	kref_put(&clp->cl_ref, __free_client);
+	kref_put(&clp->cl_nfsdfs.cl_ref, __free_client);
 }
 
 static void
@@ -1899,6 +1898,8 @@ free_client(struct nfs4_client *clp)
 		free_session(ses);
 	}
 	rpc_destroy_wait_queue(&clp->cl_cb_waitq);
+	if (clp->cl_nfsd_dentry)
+		nfsd_client_rmdir(clp->cl_nfsd_dentry);
 	drop_client(clp);
 }
 
@@ -2199,6 +2200,7 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 	struct sockaddr *sa = svc_addr(rqstp);
 	int ret;
 	struct net *net = SVC_NET(rqstp);
+	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
 	clp = alloc_client(name);
 	if (clp == NULL)
@@ -2209,8 +2211,8 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 		free_client(clp);
 		return NULL;
 	}
-
-	kref_init(&clp->cl_ref);
+	gen_clid(clp, nn);
+	kref_init(&clp->cl_nfsdfs.cl_ref);
 	nfsd4_init_cb(&clp->cl_cb_null, clp, NULL, NFSPROC4_CLNT_CB_NULL);
 	clp->cl_time = get_seconds();
 	clear_bit(0, &clp->cl_cb_slot_busy);
@@ -2218,6 +2220,12 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 	rpc_copy_addr((struct sockaddr *) &clp->cl_addr, sa);
 	clp->cl_cb_session = NULL;
 	clp->net = net;
+	clp->cl_nfsd_dentry = nfsd_client_mkdir(nn, &clp->cl_nfsdfs,
+						clp->cl_clientid.cl_id);
+	if (!clp->cl_nfsd_dentry) {
+		free_client(clp);
+		return NULL;
+	}
 	return clp;
 }
 
@@ -2661,7 +2669,6 @@ nfsd4_exchange_id(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	new->cl_spo_must_allow.u.words[0] = exid->spo_must_allow[0];
 	new->cl_spo_must_allow.u.words[1] = exid->spo_must_allow[1];
 
-	gen_clid(new, nn);
 	add_to_unconfirmed(new);
 	swap(new, conf);
 out_copy:
@@ -3404,7 +3411,7 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		copy_clid(new, conf);
 		gen_confirm(new, nn);
 	} else /* case 4 (new client) or cases 2, 3 (client reboot): */
-		gen_clid(new, nn);
+		;
 	new->cl_minorversion = 0;
 	gen_callback(new, setclid, rqstp);
 	add_to_unconfirmed(new);
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 8d2062428569..8a2261cdefee 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -15,6 +15,7 @@
 #include <linux/sunrpc/gss_krb5_enctypes.h>
 #include <linux/sunrpc/rpc_pipe_fs.h>
 #include <linux/module.h>
+#include <linux/fsnotify.h>
 
 #include "idmap.h"
 #include "nfsd.h"
@@ -52,6 +53,7 @@ enum {
 	NFSD_RecoveryDir,
 	NFSD_V4EndGrace,
 #endif
+	NFSD_MaxReserved
 };
 
 /*
@@ -1146,8 +1148,99 @@ static ssize_t write_v4_end_grace(struct file *file, char *buf, size_t size)
  *	populating the filesystem.
  */
 
+/* Basically copying rpc_get_inode. */
+static struct inode *nfsd_get_inode(struct super_block *sb, umode_t mode)
+{
+	struct inode *inode = new_inode(sb);
+	if (!inode)
+		return NULL;
+	/* Following advice from simple_fill_super documentation: */
+	inode->i_ino = iunique(sb, NFSD_MaxReserved);
+	inode->i_mode = mode;
+	inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
+	switch (mode & S_IFMT) {
+	case S_IFDIR:
+		inode->i_fop = &simple_dir_operations;
+		inode->i_op = &simple_dir_inode_operations;
+		inc_nlink(inode);
+	default:
+		break;
+	}
+	return inode;
+}
+
+static int __nfsd_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
+{
+	struct inode *inode;
+
+	inode = nfsd_get_inode(dir->i_sb, mode);
+	if (!inode)
+		return -ENOMEM;
+	d_add(dentry, inode);
+	inc_nlink(dir);
+	fsnotify_mkdir(dir, dentry);
+	return 0;
+}
+
+static struct dentry *nfsd_mkdir(struct dentry *parent, struct nfsdfs_client *ncl, char *name)
+{
+	struct inode *dir = parent->d_inode;
+	struct dentry *dentry;
+	int ret = -ENOMEM;
+
+	inode_lock(dir);
+	dentry = d_alloc_name(parent, name);
+	if (!dentry)
+		goto out_err;
+	ret = __nfsd_mkdir(d_inode(parent), dentry, S_IFDIR | 0600);
+	if (ret)
+		goto out_err;
+	if (ncl) {
+		d_inode(dentry)->i_private = ncl;
+		kref_get(&ncl->cl_ref);
+	}
+out:
+	inode_unlock(dir);
+	return dentry;
+out_err:
+	dentry = ERR_PTR(ret);
+	goto out;
+}
+
+/* on success, returns positive number unique to that client. */
+struct dentry *nfsd_client_mkdir(struct nfsd_net *nn, struct nfsdfs_client *ncl, u32 id)
+{
+	char name[11];
+
+	sprintf(name, "%d", id++);
+
+	return nfsd_mkdir(nn->nfsd_client_dir, ncl, name);
+}
+
+/* Taken from __rpc_rmdir: */
+void nfsd_client_rmdir(struct dentry *dentry)
+{
+	struct inode *dir = d_inode(dentry->d_parent);
+	struct inode *inode = d_inode(dentry);
+	struct nfsdfs_client *ncl = inode->i_private;
+	int ret;
+
+	inode->i_private = NULL;
+	synchronize_rcu();
+	kref_put(&ncl->cl_ref, ncl->cl_release);
+	dget(dentry);
+	ret = simple_rmdir(dir, dentry);
+	WARN_ON_ONCE(ret);
+	d_delete(dentry);
+}
+
 static int nfsd_fill_super(struct super_block * sb, void * data, int silent)
 {
+	struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+							nfsd_net_id);
+	struct dentry *dentry;
+	int ret;
+
 	static const struct tree_descr nfsd_files[] = {
 		[NFSD_List] = {"exports", &exports_nfsd_operations, S_IRUGO},
 		[NFSD_Export_features] = {"export_features",
@@ -1177,7 +1270,20 @@ static int nfsd_fill_super(struct super_block * sb, void * data, int silent)
 		/* last one */ {""}
 	};
 	get_net(sb->s_fs_info);
-	return simple_fill_super(sb, 0x6e667364, nfsd_files);
+	ret = simple_fill_super(sb, 0x6e667364, nfsd_files);
+	if (ret)
+		return ret;
+	dentry = nfsd_mkdir(sb->s_root, NULL, "clients");
+	if (IS_ERR(dentry)) {
+		/* XXX: test: */
+		d_genocide(sb->s_root);
+		shrink_dcache_parent(sb->s_root);
+		dput(sb->s_root);
+		return PTR_ERR(dentry);
+	}
+	nn->nfsd_client_dir = dentry;
+	return 0;
+
 }
 
 static struct dentry *nfsd_mount(struct file_system_type *fs_type,
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 066899929863..fe7418c2b88f 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -21,6 +21,7 @@
 
 #include <uapi/linux/nfsd/debug.h>
 
+#include "netns.h"
 #include "stats.h"
 #include "export.h"
 
@@ -85,6 +86,14 @@ int		nfsd_pool_stats_release(struct inode *, struct file *);
 
 void		nfsd_destroy(struct net *net);
 
+struct nfsdfs_client {
+	struct kref cl_ref;
+	void (*cl_release)(struct kref *kref);
+};
+
+struct dentry *nfsd_client_mkdir(struct nfsd_net *nn, struct nfsdfs_client *ncl, u32 id);
+void nfsd_client_rmdir(struct dentry *dentry);
+
 #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
 #ifdef CONFIG_NFSD_V2_ACL
 extern const struct svc_version nfsd_acl_version2;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index aa26ae520fb6..5a999e4b766e 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -39,6 +39,7 @@
 #include <linux/refcount.h>
 #include <linux/sunrpc/svc_xprt.h>
 #include "nfsfh.h"
+#include "nfsd.h"
 
 typedef struct {
 	u32             cl_boot;
@@ -347,9 +348,12 @@ struct nfs4_client {
 	u32			cl_exchange_flags;
 	/* number of rpc's in progress over an associated session: */
 	atomic_t		cl_rpc_users;
-	struct kref		cl_ref;
+	struct nfsdfs_client	cl_nfsdfs;
 	struct nfs4_op_map      cl_spo_must_allow;
 
+	/* debugging info directory under nfsd/clients/ : */
+	struct dentry		*cl_nfsd_dentry;
+
 	/* for nfs41 callbacks */
 	/* We currently support a single back channel with a single slot */
 	unsigned long		cl_cb_slot_busy;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 05/10] nfsd: make client/ directory names small ints
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
                   ` (3 preceding siblings ...)
  2019-04-25 14:04 ` [PATCH 04/10] nfsd: add nfsd/clients directory J. Bruce Fields
@ 2019-04-25 14:04 ` J. Bruce Fields
  2019-04-25 14:04 ` [PATCH 06/10] rpc: replace rpc_filelist by tree_descr J. Bruce Fields
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 14:04 UTC (permalink / raw)
  To: linux-nfs
  Cc: linux-fsdevel, abe, lsof-l, util-linux, jlayton, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

We want clientid's on the wire to be randomized for reasons explained in
ebd7c72c63ac "nfsd: randomize SETCLIENTID reply to help distinguish
servers".  But I'd rather have mostly small integers for the clients/
directory.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/netns.h     | 1 +
 fs/nfsd/nfs4state.c | 2 +-
 fs/nfsd/nfsctl.c    | 5 +++--
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 46d956676480..888b2891960e 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -121,6 +121,7 @@ struct nfsd_net {
 	 */
 	unsigned int max_connections;
 
+	u32 clientid_base;
 	u32 clientid_counter;
 	u32 clverifier_counter;
 
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index dfcb90743861..de99bfe3e6e2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2221,7 +2221,7 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 	clp->cl_cb_session = NULL;
 	clp->net = net;
 	clp->cl_nfsd_dentry = nfsd_client_mkdir(nn, &clp->cl_nfsdfs,
-						clp->cl_clientid.cl_id);
+			clp->cl_clientid.cl_id - nn->clientid_base);
 	if (!clp->cl_nfsd_dentry) {
 		free_client(clp);
 		return NULL;
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 8a2261cdefee..edf4dada42bf 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1212,7 +1212,7 @@ struct dentry *nfsd_client_mkdir(struct nfsd_net *nn, struct nfsdfs_client *ncl,
 {
 	char name[11];
 
-	sprintf(name, "%d", id++);
+	sprintf(name, "%u", id);
 
 	return nfsd_mkdir(nn->nfsd_client_dir, ncl, name);
 }
@@ -1350,7 +1350,8 @@ static __net_init int nfsd_init_net(struct net *net)
 	nn->nfsd4_grace = 90;
 	nn->somebody_reclaimed = false;
 	nn->clverifier_counter = prandom_u32();
-	nn->clientid_counter = prandom_u32();
+	nn->clientid_base = prandom_u32();
+	nn->clientid_counter = nn->clientid_base + 1;
 	nn->s2s_cp_cl_id = nn->clientid_counter++;
 
 	atomic_set(&nn->ntf_refcnt, 0);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 06/10] rpc: replace rpc_filelist by tree_descr
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
                   ` (4 preceding siblings ...)
  2019-04-25 14:04 ` [PATCH 05/10] nfsd: make client/ directory names small ints J. Bruce Fields
@ 2019-04-25 14:04 ` J. Bruce Fields
  2019-04-25 14:04 ` [PATCH 07/10] nfsd4: add a client info file J. Bruce Fields
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 14:04 UTC (permalink / raw)
  To: linux-nfs
  Cc: linux-fsdevel, abe, lsof-l, util-linux, jlayton, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Use a common structure instead of defining our own here.

The only difference is using an int instead of umode_t for the mode
field; I haven't tried to figure out why tree_descr uses int.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 net/sunrpc/rpc_pipe.c | 37 ++++++++++++++-----------------------
 1 file changed, 14 insertions(+), 23 deletions(-)

diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 69663681bf9d..9af9f4c8fa1e 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -460,15 +460,6 @@ static const struct file_operations rpc_info_operations = {
 };
 
 
-/*
- * Description of fs contents.
- */
-struct rpc_filelist {
-	const char *name;
-	const struct file_operations *i_fop;
-	umode_t mode;
-};
-
 static struct inode *
 rpc_get_inode(struct super_block *sb, umode_t mode)
 {
@@ -648,7 +639,7 @@ static struct dentry *__rpc_lookup_create_exclusive(struct dentry *parent,
  * FIXME: This probably has races.
  */
 static void __rpc_depopulate(struct dentry *parent,
-			     const struct rpc_filelist *files,
+			     const struct tree_descr *files,
 			     int start, int eof)
 {
 	struct inode *dir = d_inode(parent);
@@ -680,7 +671,7 @@ static void __rpc_depopulate(struct dentry *parent,
 }
 
 static void rpc_depopulate(struct dentry *parent,
-			   const struct rpc_filelist *files,
+			   const struct tree_descr *files,
 			   int start, int eof)
 {
 	struct inode *dir = d_inode(parent);
@@ -691,7 +682,7 @@ static void rpc_depopulate(struct dentry *parent,
 }
 
 static int rpc_populate(struct dentry *parent,
-			const struct rpc_filelist *files,
+			const struct tree_descr *files,
 			int start, int eof,
 			void *private)
 {
@@ -711,7 +702,7 @@ static int rpc_populate(struct dentry *parent,
 			case S_IFREG:
 				err = __rpc_create(dir, dentry,
 						files[i].mode,
-						files[i].i_fop,
+						files[i].ops,
 						private);
 				break;
 			case S_IFDIR:
@@ -1015,10 +1006,10 @@ enum {
 	RPCAUTH_EOF
 };
 
-static const struct rpc_filelist authfiles[] = {
+static const struct tree_descr authfiles[] = {
 	[RPCAUTH_info] = {
 		.name = "info",
-		.i_fop = &rpc_info_operations,
+		.ops = &rpc_info_operations,
 		.mode = S_IFREG | 0400,
 	},
 };
@@ -1076,20 +1067,20 @@ int rpc_remove_client_dir(struct rpc_clnt *rpc_client)
 	return rpc_rmdir_depopulate(dentry, rpc_clntdir_depopulate);
 }
 
-static const struct rpc_filelist cache_pipefs_files[3] = {
+static const struct tree_descr cache_pipefs_files[3] = {
 	[0] = {
 		.name = "channel",
-		.i_fop = &cache_file_operations_pipefs,
+		.ops = &cache_file_operations_pipefs,
 		.mode = S_IFREG | 0600,
 	},
 	[1] = {
 		.name = "content",
-		.i_fop = &content_file_operations_pipefs,
+		.ops = &content_file_operations_pipefs,
 		.mode = S_IFREG | 0400,
 	},
 	[2] = {
 		.name = "flush",
-		.i_fop = &cache_flush_operations_pipefs,
+		.ops = &cache_flush_operations_pipefs,
 		.mode = S_IFREG | 0600,
 	},
 };
@@ -1145,7 +1136,7 @@ enum {
 	RPCAUTH_RootEOF
 };
 
-static const struct rpc_filelist files[] = {
+static const struct tree_descr files[] = {
 	[RPCAUTH_lockd] = {
 		.name = "lockd",
 		.mode = S_IFDIR | 0555,
@@ -1242,7 +1233,7 @@ void rpc_put_sb_net(const struct net *net)
 }
 EXPORT_SYMBOL_GPL(rpc_put_sb_net);
 
-static const struct rpc_filelist gssd_dummy_clnt_dir[] = {
+static const struct tree_descr gssd_dummy_clnt_dir[] = {
 	[0] = {
 		.name = "clntXX",
 		.mode = S_IFDIR | 0555,
@@ -1277,10 +1268,10 @@ rpc_dummy_info_show(struct seq_file *m, void *v)
 }
 DEFINE_SHOW_ATTRIBUTE(rpc_dummy_info);
 
-static const struct rpc_filelist gssd_dummy_info_file[] = {
+static const struct tree_descr gssd_dummy_info_file[] = {
 	[0] = {
 		.name = "info",
-		.i_fop = &rpc_dummy_info_fops,
+		.ops = &rpc_dummy_info_fops,
 		.mode = S_IFREG | 0400,
 	},
 };
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 07/10] nfsd4: add a client info file
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
                   ` (5 preceding siblings ...)
  2019-04-25 14:04 ` [PATCH 06/10] rpc: replace rpc_filelist by tree_descr J. Bruce Fields
@ 2019-04-25 14:04 ` J. Bruce Fields
  2019-04-25 14:04 ` [PATCH 08/10] nfsd4: add file to display list of client's opens J. Bruce Fields
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 14:04 UTC (permalink / raw)
  To: linux-nfs
  Cc: linux-fsdevel, abe, lsof-l, util-linux, jlayton, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Add a new nfsd/clients/#/info file with some basic information about
each NFSv4 client.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4state.c |  38 ++++++++++++++-
 fs/nfsd/nfsctl.c    | 115 +++++++++++++++++++++++++++++++++++++++++---
 fs/nfsd/nfsd.h      |   4 +-
 3 files changed, 148 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index de99bfe3e6e2..928705fc8ff5 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2193,6 +2193,41 @@ find_stateid_by_type(struct nfs4_client *cl, stateid_t *t, char typemask)
 	return s;
 }
 
+static int client_info_show(struct seq_file *m, void *v)
+{
+	struct inode *inode = m->private;
+	struct nfsdfs_client *nc;
+	struct nfs4_client *clp;
+	u64 clid;
+
+	nc = get_nfsdfs_client(inode);
+	if (!nc)
+		return -ENXIO;
+	clp = container_of(nc, struct nfs4_client, cl_nfsdfs);
+	memcpy(&clid, &clp->cl_clientid, sizeof(clid));
+	seq_printf(m, "clientid: %llx\n", clid);
+	drop_client(clp);
+
+	return 0;
+}
+
+static int client_info_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, client_info_show, inode);
+}
+
+static const struct file_operations client_info_fops = {
+	.open		= client_info_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+static const struct tree_descr client_files[] = {
+	[0] = {"info", &client_info_fops, S_IRUSR},
+	[1] = {""},
+};
+
 static struct nfs4_client *create_client(struct xdr_netobj name,
 		struct svc_rqst *rqstp, nfs4_verifier *verf)
 {
@@ -2221,7 +2256,8 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
 	clp->cl_cb_session = NULL;
 	clp->net = net;
 	clp->cl_nfsd_dentry = nfsd_client_mkdir(nn, &clp->cl_nfsdfs,
-			clp->cl_clientid.cl_id - nn->clientid_base);
+			clp->cl_clientid.cl_id - nn->clientid_base,
+			client_files);
 	if (!clp->cl_nfsd_dentry) {
 		free_client(clp);
 		return NULL;
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index edf4dada42bf..205a56718f3a 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1207,14 +1207,116 @@ static struct dentry *nfsd_mkdir(struct dentry *parent, struct nfsdfs_client *nc
 	goto out;
 }
 
+static void clear_ncl(struct inode *inode)
+{
+	struct nfsdfs_client *ncl = inode->i_private;
+
+	inode->i_private = NULL;
+	synchronize_rcu();
+	kref_put(&ncl->cl_ref, ncl->cl_release);
+}
+
+
+struct nfsdfs_client *__get_nfsdfs_client(struct inode *inode)
+{
+	struct nfsdfs_client *nc = inode->i_private;
+
+	if (nc)
+		kref_get(&nc->cl_ref);
+	return nc;
+}
+
+struct nfsdfs_client *get_nfsdfs_client(struct inode *inode)
+{
+	struct nfsdfs_client *nc;
+
+	rcu_read_lock();
+	nc = __get_nfsdfs_client(inode);
+	rcu_read_unlock();
+	return nc;
+}
+/* from __rpc_unlink */
+static void nfsdfs_remove_file(struct inode *dir, struct dentry *dentry)
+{
+	int ret;
+
+	clear_ncl(d_inode(dentry));
+	dget(dentry);
+	ret = simple_unlink(dir, dentry);
+	d_delete(dentry);
+	dput(dentry);
+	WARN_ON_ONCE(ret);
+}
+
+static void nfsdfs_remove_files(struct dentry *root)
+{
+	struct dentry *dentry, *tmp;
+
+	list_for_each_entry_safe(dentry, tmp, &root->d_subdirs, d_child) {
+		if (!simple_positive(dentry)) {
+			WARN_ON_ONCE(1); /* I think this can't happen? */
+			continue;
+		}
+		nfsdfs_remove_file(d_inode(root), dentry);
+	}
+}
+
+/* XXX: cut'n'paste from simple_fill_super; figure out if we could share
+ * code instead. */
+static  int nfsdfs_create_files(struct dentry *root,
+					const struct tree_descr *files)
+{
+	struct inode *dir = d_inode(root);
+	struct inode *inode;
+	struct dentry *dentry;
+	int i;
+
+	inode_lock(dir);
+	for (i = 0; files->name && files->name[0]; i++, files++) {
+		if (!files->name)
+			continue;
+		dentry = d_alloc_name(root, files->name);
+		if (!dentry)
+			goto out;
+		inode = nfsd_get_inode(d_inode(root)->i_sb,
+					S_IFREG | files->mode);
+		if (!inode) {
+			dput(dentry);
+			goto out;
+		}
+		inode->i_fop = files->ops;
+		inode->i_private = __get_nfsdfs_client(dir);
+		d_add(dentry, inode);
+		fsnotify_create(dir, dentry);
+	}
+	inode_unlock(dir);
+	return 0;
+out:
+	nfsdfs_remove_files(root);
+	inode_unlock(dir);
+	return -ENOMEM;
+}
+
 /* on success, returns positive number unique to that client. */
-struct dentry *nfsd_client_mkdir(struct nfsd_net *nn, struct nfsdfs_client *ncl, u32 id)
+struct dentry *nfsd_client_mkdir(struct nfsd_net *nn,
+		struct nfsdfs_client *ncl, u32 id,
+		const struct tree_descr *files)
 {
+	struct dentry *dentry;
 	char name[11];
+	int ret;
 
 	sprintf(name, "%u", id);
 
-	return nfsd_mkdir(nn->nfsd_client_dir, ncl, name);
+	dentry = nfsd_mkdir(nn->nfsd_client_dir, ncl, name);
+	if (IS_ERR(dentry)) /* XXX: tossing errors? */
+		return NULL;
+	ret = nfsdfs_create_files(dentry, files);
+	if (ret) {
+		nfsd_client_rmdir(dentry);
+		return NULL;
+	}
+	return dentry;
 }
 
 /* Taken from __rpc_rmdir: */
@@ -1222,16 +1324,16 @@ void nfsd_client_rmdir(struct dentry *dentry)
 {
 	struct inode *dir = d_inode(dentry->d_parent);
 	struct inode *inode = d_inode(dentry);
-	struct nfsdfs_client *ncl = inode->i_private;
 	int ret;
 
-	inode->i_private = NULL;
-	synchronize_rcu();
-	kref_put(&ncl->cl_ref, ncl->cl_release);
+	inode_lock(dir);
+	nfsdfs_remove_files(dentry);
+	clear_ncl(inode);
 	dget(dentry);
 	ret = simple_rmdir(dir, dentry);
 	WARN_ON_ONCE(ret);
 	d_delete(dentry);
+	inode_unlock(dir);
 }
 
 static int nfsd_fill_super(struct super_block * sb, void * data, int silent)
@@ -1275,7 +1377,6 @@ static int nfsd_fill_super(struct super_block * sb, void * data, int silent)
 		return ret;
 	dentry = nfsd_mkdir(sb->s_root, NULL, "clients");
 	if (IS_ERR(dentry)) {
-		/* XXX: test: */
 		d_genocide(sb->s_root);
 		shrink_dcache_parent(sb->s_root);
 		dput(sb->s_root);
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index fe7418c2b88f..a60a2ee2e8c1 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -91,7 +91,9 @@ struct nfsdfs_client {
 	void (*cl_release)(struct kref *kref);
 };
 
-struct dentry *nfsd_client_mkdir(struct nfsd_net *nn, struct nfsdfs_client *ncl, u32 id);
+struct nfsdfs_client *get_nfsdfs_client(struct inode *);
+struct dentry *nfsd_client_mkdir(struct nfsd_net *nn,
+		struct nfsdfs_client *ncl, u32 id, const struct tree_descr *);
 void nfsd_client_rmdir(struct dentry *dentry);
 
 #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 08/10] nfsd4: add file to display list of client's opens
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
                   ` (6 preceding siblings ...)
  2019-04-25 14:04 ` [PATCH 07/10] nfsd4: add a client info file J. Bruce Fields
@ 2019-04-25 14:04 ` J. Bruce Fields
  2019-04-25 18:04   ` Jeff Layton
  2019-04-25 14:04 ` [PATCH 09/10] nfsd: expose some more information about NFSv4 opens J. Bruce Fields
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 14:04 UTC (permalink / raw)
  To: linux-nfs
  Cc: linux-fsdevel, abe, lsof-l, util-linux, jlayton, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Add a nfsd/clients/#/opens file to list some information about all the
opens held by the given client.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4state.c | 100 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 98 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 928705fc8ff5..829d1e5440d3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -684,7 +684,8 @@ struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct kmem_cache *sla
 
 	idr_preload(GFP_KERNEL);
 	spin_lock(&cl->cl_lock);
-	new_id = idr_alloc_cyclic(&cl->cl_stateids, stid, 0, 0, GFP_NOWAIT);
+	/* Reserving 0 for start of file in nfsdfs "opens" file: */
+	new_id = idr_alloc_cyclic(&cl->cl_stateids, stid, 1, 0, GFP_NOWAIT);
 	spin_unlock(&cl->cl_lock);
 	idr_preload_end();
 	if (new_id < 0)
@@ -2223,9 +2224,104 @@ static const struct file_operations client_info_fops = {
 	.release	= single_release,
 };
 
+static void *opens_start(struct seq_file *s, loff_t *pos)
+	__acquires(&clp->cl_lock)
+{
+	struct nfs4_client *clp = s->private;
+	unsigned long id = *pos;
+	void *ret;
+
+	spin_lock(&clp->cl_lock);
+	ret = idr_get_next_ul(&clp->cl_stateids, &id);
+	*pos = id;
+	return ret;
+}
+
+static void *opens_next(struct seq_file *s, void *v, loff_t *pos)
+{
+	struct nfs4_client *clp = s->private;
+	unsigned long id = *pos;
+	void *ret;
+
+	id = *pos;
+	id++;
+	ret = idr_get_next_ul(&clp->cl_stateids, &id);
+	*pos = id;
+	return ret;
+}
+
+static void opens_stop(struct seq_file *s, void *v)
+	__releases(&clp->cl_lock)
+{
+	struct nfs4_client *clp = s->private;
+
+	spin_unlock(&clp->cl_lock);
+}
+
+static int opens_show(struct seq_file *s, void *v)
+{
+	struct nfs4_stid *st = v;
+	struct nfs4_ol_stateid *os;
+	u64 stateid;
+
+	if (st->sc_type != NFS4_OPEN_STID)
+		return 0; /* XXX: or SEQ_SKIP? */
+	os = openlockstateid(st);
+	/* XXX: get info about file, etc., here */
+
+	memcpy(&stateid, &st->sc_stateid, sizeof(stateid));
+	seq_printf(s, "stateid: %llx\n", stateid);
+	return 0;
+}
+
+static struct seq_operations opens_seq_ops = {
+	.start = opens_start,
+	.next = opens_next,
+	.stop = opens_stop,
+	.show = opens_show
+};
+
+static int client_opens_open(struct inode *inode, struct file *file)
+{
+	struct nfsdfs_client *nc;
+	struct seq_file *s;
+	struct nfs4_client *clp;
+	int ret;
+
+	nc = get_nfsdfs_client(inode);
+	if (!nc)
+		return -ENXIO;
+	clp = container_of(nc, struct nfs4_client, cl_nfsdfs);
+
+	ret = seq_open(file, &opens_seq_ops);
+	if (ret)
+		return ret;
+	s = file->private_data;
+	s->private = clp;
+	return 0;
+}
+
+static int client_opens_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *m = file->private_data;
+	struct nfs4_client *clp = m->private;
+
+	/* XXX: alternatively, we could get/drop in seq start/stop */
+	drop_client(clp);
+	return 0;
+}
+
+static const struct file_operations client_opens_fops = {
+	.open		= client_opens_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= client_opens_release,
+};
+
 static const struct tree_descr client_files[] = {
 	[0] = {"info", &client_info_fops, S_IRUSR},
-	[1] = {""},
+	[1] = {"open", &client_opens_fops, S_IRUSR},
+	[2] = {""},
 };
 
 static struct nfs4_client *create_client(struct xdr_netobj name,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 09/10] nfsd: expose some more information about NFSv4 opens
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
                   ` (7 preceding siblings ...)
  2019-04-25 14:04 ` [PATCH 08/10] nfsd4: add file to display list of client's opens J. Bruce Fields
@ 2019-04-25 14:04 ` J. Bruce Fields
  2019-05-02 15:28   ` Benjamin Coddington
  2019-04-25 14:04 ` [PATCH 10/10] nfsd: add more information to client info file J. Bruce Fields
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 14:04 UTC (permalink / raw)
  To: linux-nfs
  Cc: linux-fsdevel, abe, lsof-l, util-linux, jlayton, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Add open modes, device numbers, inode numbers, and open owners to each
line of nfsd/clients/#/opens.

Open owners are totally opaque but seem to sometimes have some useful
ascii strings included, so passing through printable ascii characters
and escaping the rest seems useful while still being machine-readable.
---
 fs/nfsd/nfs4state.c            | 40 ++++++++++++++++++++++++++++------
 fs/nfsd/state.h                |  2 +-
 fs/seq_file.c                  | 17 +++++++++++++++
 include/linux/seq_file.h       |  2 ++
 include/linux/string_helpers.h |  1 +
 lib/string_helpers.c           |  5 +++--
 6 files changed, 57 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 829d1e5440d3..f53621a65e60 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -42,6 +42,7 @@
 #include <linux/sunrpc/svcauth_gss.h>
 #include <linux/sunrpc/addr.h>
 #include <linux/jhash.h>
+#include <linux/string_helpers.h>
 #include "xdr4.h"
 #include "xdr4cb.h"
 #include "vfs.h"
@@ -2261,16 +2262,41 @@ static void opens_stop(struct seq_file *s, void *v)
 static int opens_show(struct seq_file *s, void *v)
 {
 	struct nfs4_stid *st = v;
-	struct nfs4_ol_stateid *os;
-	u64 stateid;
+	struct nfs4_ol_stateid *ols;
+	struct nfs4_file *nf;
+	struct file *file;
+	struct inode *inode;
+	struct nfs4_stateowner *oo;
+	unsigned int access, deny;
 
 	if (st->sc_type != NFS4_OPEN_STID)
 		return 0; /* XXX: or SEQ_SKIP? */
-	os = openlockstateid(st);
-	/* XXX: get info about file, etc., here */
+	ols = openlockstateid(st);
+	oo = ols->st_stateowner;
+	nf = st->sc_file;
+	file = find_any_file(nf);
+	inode = d_inode(file->f_path.dentry);
+
+	seq_printf(s, STATEID_FMT, STATEID_VAL(&st->sc_stateid));
+
+	access = bmap_to_share_mode(ols->st_access_bmap);
+	deny   = bmap_to_share_mode(ols->st_deny_bmap);
+
+	seq_printf(s, "\t\%s\%s",
+		access & NFS4_SHARE_ACCESS_READ ? "r" : "-",
+		access & NFS4_SHARE_ACCESS_WRITE ? "w" : "-");
+	seq_printf(s, "\t\%s\%s",
+		deny & NFS4_SHARE_ACCESS_READ ? "R" : "-",
+		deny & NFS4_SHARE_ACCESS_WRITE ? "W" : "-");
+
+	seq_printf(s, "\t%02x:%02x:%ld\t", MAJOR(inode->i_sb->s_dev),
+					 MINOR(inode->i_sb->s_dev),
+					 inode->i_ino);
+	seq_printf(s, "'");
+	seq_escape_mem(s, oo->so_owner.data, oo->so_owner.len,
+			ESCAPE_NP|ESCAPE_HEX|ESCAPE_NOHIGH, NULL);
+	seq_printf(s, "'\n");
 
-	memcpy(&stateid, &st->sc_stateid, sizeof(stateid));
-	seq_printf(s, "stateid: %llx\n", stateid);
 	return 0;
 }
 
@@ -2320,7 +2346,7 @@ static const struct file_operations client_opens_fops = {
 
 static const struct tree_descr client_files[] = {
 	[0] = {"info", &client_info_fops, S_IRUSR},
-	[1] = {"open", &client_opens_fops, S_IRUSR},
+	[1] = {"opens", &client_opens_fops, S_IRUSR},
 	[2] = {""},
 };
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 5a999e4b766e..4db22433be82 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -56,7 +56,7 @@ typedef struct {
 	stateid_opaque_t        si_opaque;
 } stateid_t;
 
-#define STATEID_FMT	"(%08x/%08x/%08x/%08x)"
+#define STATEID_FMT	"%08x/%08x/%08x/%08x"
 #define STATEID_VAL(s) \
 	(s)->si_opaque.so_clid.cl_boot, \
 	(s)->si_opaque.so_clid.cl_id, \
diff --git a/fs/seq_file.c b/fs/seq_file.c
index 1dea7a8a5255..bb646e0c5b9c 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -383,6 +383,23 @@ void seq_escape(struct seq_file *m, const char *s, const char *esc)
 }
 EXPORT_SYMBOL(seq_escape);
 
+/**
+ * seq_escape_mem - copy data to buffer, escaping some characters
+ *
+ * See string_escape_mem for explanations of the arguments.
+ */
+void seq_escape_mem(struct seq_file *m, const char *src, size_t isz,
+		    unsigned int flags, const char *only)
+{
+	char *buf;
+	size_t size = seq_get_buf(m, &buf);
+	int ret;
+
+	ret = string_escape_mem(src, isz, buf, size, flags, only);
+	seq_commit(m, ret < size ? ret : -1);
+}
+EXPORT_SYMBOL(seq_escape_mem);
+
 void seq_vprintf(struct seq_file *m, const char *f, va_list args)
 {
 	int len;
diff --git a/include/linux/seq_file.h b/include/linux/seq_file.h
index a121982af0f5..de6ace9e5e6b 100644
--- a/include/linux/seq_file.h
+++ b/include/linux/seq_file.h
@@ -127,6 +127,8 @@ void seq_put_hex_ll(struct seq_file *m, const char *delimiter,
 		    unsigned long long v, unsigned int width);
 
 void seq_escape(struct seq_file *m, const char *s, const char *esc);
+void seq_escape_mem(struct seq_file *m, const char *src, size_t isz,
+		    unsigned int flags, const char *only);
 
 void seq_hex_dump(struct seq_file *m, const char *prefix_str, int prefix_type,
 		  int rowsize, int groupsize, const void *buf, size_t len,
diff --git a/include/linux/string_helpers.h b/include/linux/string_helpers.h
index d23c5030901a..54034526aa46 100644
--- a/include/linux/string_helpers.h
+++ b/include/linux/string_helpers.h
@@ -50,6 +50,7 @@ static inline int string_unescape_any_inplace(char *buf)
 #define ESCAPE_NP		0x10
 #define ESCAPE_ANY_NP		(ESCAPE_ANY | ESCAPE_NP)
 #define ESCAPE_HEX		0x20
+#define ESCAPE_NOHIGH		0x40
 
 int string_escape_mem(const char *src, size_t isz, char *dst, size_t osz,
 		unsigned int flags, const char *only);
diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 29c490e5d478..2c3e7b93775e 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -511,8 +511,9 @@ int string_escape_mem(const char *src, size_t isz, char *dst, size_t osz,
 		 * In these cases we just pass through a character to the
 		 * output buffer.
 		 */
-		if ((flags & ESCAPE_NP && isprint(c)) ||
-		    (is_dict && !strchr(only, c))) {
+		if ((  (flags & ESCAPE_NP && isprint(c)) ||
+		       (is_dict && !strchr(only, c))) &&
+		    !(flags & ESCAPE_NOHIGH && (unsigned char)c > 127)) {
 			/* do nothing */
 		} else {
 			if (flags & ESCAPE_SPACE && escape_space(c, &p, end))
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 10/10] nfsd: add more information to client info file
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
                   ` (8 preceding siblings ...)
  2019-04-25 14:04 ` [PATCH 09/10] nfsd: expose some more information about NFSv4 opens J. Bruce Fields
@ 2019-04-25 14:04 ` J. Bruce Fields
  2019-04-25 17:02 ` [PATCH 00/10] exposing knfsd opens to userspace Jeff Layton
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 14:04 UTC (permalink / raw)
  To: linux-nfs
  Cc: linux-fsdevel, abe, lsof-l, util-linux, jlayton, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Add ip address, full client-provided identifier, and minor version.
There's much more that could possibly be useful but this is a start.

As with open owners, client identifiers are opaque binary data, but some
client happen to include some useful information in ascii.
---
 fs/nfsd/nfs4state.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index f53621a65e60..39e1956b1bd2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2208,6 +2208,11 @@ static int client_info_show(struct seq_file *m, void *v)
 	clp = container_of(nc, struct nfs4_client, cl_nfsdfs);
 	memcpy(&clid, &clp->cl_clientid, sizeof(clid));
 	seq_printf(m, "clientid: %llx\n", clid);
+	seq_printf(m, "address: %pISpc\n", (struct sockaddr *)&clp->cl_addr);
+	seq_printf(m, "name: ");
+	seq_escape_mem(m, clp->cl_name.data, clp->cl_name.len,
+			ESCAPE_NP|ESCAPE_HEX|ESCAPE_NOHIGH, NULL);
+	seq_printf(m, "\nminor version: %d\n", clp->cl_minorversion);
 	drop_client(clp);
 
 	return 0;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] exposing knfsd opens to userspace
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
                   ` (9 preceding siblings ...)
  2019-04-25 14:04 ` [PATCH 10/10] nfsd: add more information to client info file J. Bruce Fields
@ 2019-04-25 17:02 ` Jeff Layton
  2019-04-25 20:01   ` J. Bruce Fields
  2019-04-25 18:17 ` Jeff Layton
  2019-04-25 21:08 ` Andreas Dilger
  12 siblings, 1 reply; 30+ messages in thread
From: Jeff Layton @ 2019-04-25 17:02 UTC (permalink / raw)
  To: J. Bruce Fields, linux-nfs; +Cc: linux-fsdevel, abe, lsof-l, util-linux

On Thu, 2019-04-25 at 10:04 -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> The following patches expose information about NFSv4 opens held by knfsd
> on behalf of NFSv4 clients.  Those are currently invisible to userspace,
> unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
> 
> The approach is to add a new directory /proc/fs/nfsd/clients/ with
> subdirectories for each active NFSv4 client.  Each subdirectory has an
> "info" file with some basic information to help identify the client and
> an "opens" directory that lists the opens held by that client.
> 
> I got it working by cobbling together some poorly-understood code I
> found in libfs, rpc_pipefs and elsewhere.  If anyone wants to wade in
> and tell me what I've got wrong, they're more than welcome, but at this
> stage I'm more curious for feedback on the interface.
> 
> I'm also cc'ing people responsible for lsof and util-linux in case they
> have any opinions.
> 
> Currently these pseudofiles look like:
> 
>   # find /proc/fs/nfsd/clients -type f|xargs tail 
>   ==> /proc/fs/nfsd/clients/3741/opens <==
>   5cc0cd36/6debfb50/00000001/00000001	rw	--	fd:10:13649	'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
>   5cc0cd36/6debfb50/00000003/00000001	r-	--	fd:10:13650	'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> 
>   ==> /proc/fs/nfsd/clients/3741/info <==
>   clientid: 6debfb505cc0cd36
>   address: 192.168.122.36:0
>   name: Linux NFSv4.2 test2.fieldses.org
>   minor version: 2
> 
> Each line of the "opens" file is tab-delimited and describes one open,
> and the fields are stateid, open access bits, deny bits,
> major:minor:ino, and open owner.
> 

Nice work! We've needed this for a long time.

One thing we need to consider here from the get-go though is what sort
of ABI guarantee you want for this format. People _will_ write scripts
that scrape this info, so we should take that into account up front.


> So, some random questions:
> 
> 	- I just copied the major:minor:ino thing from /proc/locks, I
> 	  suspect we would have picked something different to identify
> 	  inodes if /proc/locks were done now.  (Mount id and inode?
> 	  Something else?)
> 

That does make it easy to correlate with the info in /proc/locks.

We'd have a dentry here by virtue of the nfs4_file. Should we print a
path in addition to this?

> 	- The open owner is just an opaque blob of binary data, but
> 	  clients may choose to include some useful asci-encoded
> 	  information, so I'm formatting them as strings with non-ascii
> 	  stuff escaped.  For example, pynfs usually uses the name of
> 	  the test as the open owner.  But as you see above, the ascii
> 	  content of the Linux client's open owners is less useful.
> 	  Also, there's no way I know of to map them back to a file
> 	  description or process or anything else useful on the client,
> 	  so perhaps they're of limited interest.
> 
> 	- I'm not sure about the stateid either.  I did think it might
> 	  be useful just as a unique identifier for each line.
> 	  (Actually for that it'd be enough to take just the third of
> 	  those four numbers making up the stateid--maybe that would be
> 	  better.)
> 

It'd be ideal to be able to easily correlate this info with what
wireshark displays. Does wireshark display hashes for openowners? I know
it does for stateids. If so, generating the same hash would be really
nice.

That said, waybe it's best to just dump the raw info out here though and
rely on some postprocessing scripts for viewing it?

> In the "info" file, the "name" line is the client identifier/client
> owner provided by the client, which (like the stateowner) is just opaque
> binary data, though as you can see here the Linux client is providing a
> readable ascii string.
> 
> There's probably a lot more we could add to that info file eventually.
> 
> Other stuff to add next:
> 
> 	- nfsd/clients/#/kill that you can write to to revoke all a
> 	  client's state if it's wedged somehow.

That would also be neat. We have a bit of code to support today that in
the fault injection code, but it'll need some cleanup and wiring it into
a knob here would be better.

> 	- lists of locks and delegations; lower priority since most of
> 	  that information is already in /proc/locks.

Agreed.

> J. Bruce Fields (10):
>   nfsd: persist nfsd filesystem across mounts
>   nfsd: rename cl_refcount
>   nfsd4: use reference count to free client
>   nfsd: add nfsd/clients directory
>   nfsd: make client/ directory names small ints
>   rpc: replace rpc_filelist by tree_descr
>   nfsd4: add a client info file
>   nfsd4: add file to display list of client's opens
>   nfsd: expose some more information about NFSv4 opens
>   nfsd: add more information to client info file
> 
>  fs/nfsd/netns.h                |   6 +
>  fs/nfsd/nfs4state.c            | 228 ++++++++++++++++++++++++++++++---
>  fs/nfsd/nfsctl.c               | 225 +++++++++++++++++++++++++++++++-
>  fs/nfsd/nfsd.h                 |  11 ++
>  fs/nfsd/state.h                |   9 +-
>  fs/seq_file.c                  |  17 +++
>  include/linux/seq_file.h       |   2 +
>  include/linux/string_helpers.h |   1 +
>  lib/string_helpers.c           |   5 +-
>  net/sunrpc/rpc_pipe.c          |  37 ++----
>  10 files changed, 491 insertions(+), 50 deletions(-)
> 

-- 
Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 08/10] nfsd4: add file to display list of client's opens
  2019-04-25 14:04 ` [PATCH 08/10] nfsd4: add file to display list of client's opens J. Bruce Fields
@ 2019-04-25 18:04   ` Jeff Layton
  2019-04-25 20:14     ` J. Bruce Fields
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff Layton @ 2019-04-25 18:04 UTC (permalink / raw)
  To: J. Bruce Fields, linux-nfs; +Cc: linux-fsdevel, abe, lsof-l, util-linux

On Thu, 2019-04-25 at 10:04 -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> Add a nfsd/clients/#/opens file to list some information about all the
> opens held by the given client.
> 
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  fs/nfsd/nfs4state.c | 100 +++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 98 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 928705fc8ff5..829d1e5440d3 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -684,7 +684,8 @@ struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct kmem_cache *sla
>  
>  	idr_preload(GFP_KERNEL);
>  	spin_lock(&cl->cl_lock);
> -	new_id = idr_alloc_cyclic(&cl->cl_stateids, stid, 0, 0, GFP_NOWAIT);
> +	/* Reserving 0 for start of file in nfsdfs "opens" file: */
> +	new_id = idr_alloc_cyclic(&cl->cl_stateids, stid, 1, 0, GFP_NOWAIT);
>  	spin_unlock(&cl->cl_lock);
>  	idr_preload_end();
>  	if (new_id < 0)
> @@ -2223,9 +2224,104 @@ static const struct file_operations client_info_fops = {
>  	.release	= single_release,
>  };
>  
> +static void *opens_start(struct seq_file *s, loff_t *pos)
> +	__acquires(&clp->cl_lock)
> +{
> +	struct nfs4_client *clp = s->private;
> +	unsigned long id = *pos;
> +	void *ret;
> +
> +	spin_lock(&clp->cl_lock);
> +	ret = idr_get_next_ul(&clp->cl_stateids, &id);
> +	*pos = id;
> +	return ret;
> +}
> +
> +static void *opens_next(struct seq_file *s, void *v, loff_t *pos)
> +{
> +	struct nfs4_client *clp = s->private;
> +	unsigned long id = *pos;
> +	void *ret;
> +
> +	id = *pos;
> +	id++;
> +	ret = idr_get_next_ul(&clp->cl_stateids, &id);
> +	*pos = id;
> +	return ret;
> +}
> +
> +static void opens_stop(struct seq_file *s, void *v)
> +	__releases(&clp->cl_lock)
> +{
> +	struct nfs4_client *clp = s->private;
> +
> +	spin_unlock(&clp->cl_lock);
> +}
> +
> +static int opens_show(struct seq_file *s, void *v)
> +{
> +	struct nfs4_stid *st = v;
> +	struct nfs4_ol_stateid *os;
> +	u64 stateid;
> +
> +	if (st->sc_type != NFS4_OPEN_STID)
> +		return 0; /* XXX: or SEQ_SKIP? */
> +	os = openlockstateid(st);
> +	/* XXX: get info about file, etc., here */
> +
> +	memcpy(&stateid, &st->sc_stateid, sizeof(stateid));
> +	seq_printf(s, "stateid: %llx\n", stateid);
> +	return 0;
> +}

More bikeshedding: should we have a "states" file instead of an "opens"
file and print a different set of output for each stateid type?

> +
> +static struct seq_operations opens_seq_ops = {
> +	.start = opens_start,
> +	.next = opens_next,
> +	.stop = opens_stop,
> +	.show = opens_show
> +};
> +
> +static int client_opens_open(struct inode *inode, struct file *file)
> +{
> +	struct nfsdfs_client *nc;
> +	struct seq_file *s;
> +	struct nfs4_client *clp;
> +	int ret;
> +
> +	nc = get_nfsdfs_client(inode);
> +	if (!nc)
> +		return -ENXIO;
> +	clp = container_of(nc, struct nfs4_client, cl_nfsdfs);
> +
> +	ret = seq_open(file, &opens_seq_ops);
> +	if (ret)
> +		return ret;
> +	s = file->private_data;
> +	s->private = clp;
> +	return 0;
> +}
> +
> +static int client_opens_release(struct inode *inode, struct file *file)
> +{
> +	struct seq_file *m = file->private_data;
> +	struct nfs4_client *clp = m->private;
> +
> +	/* XXX: alternatively, we could get/drop in seq start/stop */
> +	drop_client(clp);
> +	return 0;
> +}
> +
> +static const struct file_operations client_opens_fops = {
> +	.open		= client_opens_open,
> +	.read		= seq_read,
> +	.llseek		= seq_lseek,
> +	.release	= client_opens_release,
> +};
> +
>  static const struct tree_descr client_files[] = {
>  	[0] = {"info", &client_info_fops, S_IRUSR},
> -	[1] = {""},
> +	[1] = {"open", &client_opens_fops, S_IRUSR},
> +	[2] = {""},
>  };
>  
>  static struct nfs4_client *create_client(struct xdr_netobj name,

-- 
Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] exposing knfsd opens to userspace
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
                   ` (10 preceding siblings ...)
  2019-04-25 17:02 ` [PATCH 00/10] exposing knfsd opens to userspace Jeff Layton
@ 2019-04-25 18:17 ` Jeff Layton
  2019-04-25 21:08 ` Andreas Dilger
  12 siblings, 0 replies; 30+ messages in thread
From: Jeff Layton @ 2019-04-25 18:17 UTC (permalink / raw)
  To: J. Bruce Fields, linux-nfs; +Cc: linux-fsdevel, abe, lsof-l, util-linux

On Thu, 2019-04-25 at 10:04 -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> The following patches expose information about NFSv4 opens held by knfsd
> on behalf of NFSv4 clients.  Those are currently invisible to userspace,
> unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
> 
> The approach is to add a new directory /proc/fs/nfsd/clients/ with
> subdirectories for each active NFSv4 client.  Each subdirectory has an
> "info" file with some basic information to help identify the client and
> an "opens" directory that lists the opens held by that client.
> 
> I got it working by cobbling together some poorly-understood code I
> found in libfs, rpc_pipefs and elsewhere.  If anyone wants to wade in
> and tell me what I've got wrong, they're more than welcome, but at this
> stage I'm more curious for feedback on the interface.
> 
> I'm also cc'ing people responsible for lsof and util-linux in case they
> have any opinions.
> 
> Currently these pseudofiles look like:
> 
>   # find /proc/fs/nfsd/clients -type f|xargs tail 
>   ==> /proc/fs/nfsd/clients/3741/opens <==
>   5cc0cd36/6debfb50/00000001/00000001	rw	--	fd:10:13649	'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
>   5cc0cd36/6debfb50/00000003/00000001	r-	--	fd:10:13650	'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> 
>   ==> /proc/fs/nfsd/clients/3741/info <==
>   clientid: 6debfb505cc0cd36
>   address: 192.168.122.36:0
>   name: Linux NFSv4.2 test2.fieldses.org
>   minor version: 2
> 
> Each line of the "opens" file is tab-delimited and describes one open,
> and the fields are stateid, open access bits, deny bits,
> major:minor:ino, and open owner.
> 
> So, some random questions:
> 
> 	- I just copied the major:minor:ino thing from /proc/locks, I
> 	  suspect we would have picked something different to identify
> 	  inodes if /proc/locks were done now.  (Mount id and inode?
> 	  Something else?)
> 
> 	- The open owner is just an opaque blob of binary data, but
> 	  clients may choose to include some useful asci-encoded
> 	  information, so I'm formatting them as strings with non-ascii
> 	  stuff escaped.  For example, pynfs usually uses the name of
> 	  the test as the open owner.  But as you see above, the ascii
> 	  content of the Linux client's open owners is less useful.
> 	  Also, there's no way I know of to map them back to a file
> 	  description or process or anything else useful on the client,
> 	  so perhaps they're of limited interest.
> 
> 	- I'm not sure about the stateid either.  I did think it might
> 	  be useful just as a unique identifier for each line.
> 	  (Actually for that it'd be enough to take just the third of
> 	  those four numbers making up the stateid--maybe that would be
> 	  better.)
> 
> In the "info" file, the "name" line is the client identifier/client
> owner provided by the client, which (like the stateowner) is just opaque
> binary data, though as you can see here the Linux client is providing a
> readable ascii string.
> 
> There's probably a lot more we could add to that info file eventually.
> 
> Other stuff to add next:
> 
> 	- nfsd/clients/#/kill that you can write to to revoke all a
> 	  client's state if it's wedged somehow.
> 	- lists of locks and delegations; lower priority since most of
> 	  that information is already in /proc/locks.
> 
> --b.
> 
> J. Bruce Fields (10):
>   nfsd: persist nfsd filesystem across mounts
>   nfsd: rename cl_refcount
>   nfsd4: use reference count to free client
>   nfsd: add nfsd/clients directory
>   nfsd: make client/ directory names small ints
>   rpc: replace rpc_filelist by tree_descr
>   nfsd4: add a client info file
>   nfsd4: add file to display list of client's opens
>   nfsd: expose some more information about NFSv4 opens
>   nfsd: add more information to client info file
> 
>  fs/nfsd/netns.h                |   6 +
>  fs/nfsd/nfs4state.c            | 228 ++++++++++++++++++++++++++++++---
>  fs/nfsd/nfsctl.c               | 225 +++++++++++++++++++++++++++++++-
>  fs/nfsd/nfsd.h                 |  11 ++
>  fs/nfsd/state.h                |   9 +-
>  fs/seq_file.c                  |  17 +++
>  include/linux/seq_file.h       |   2 +
>  include/linux/string_helpers.h |   1 +
>  lib/string_helpers.c           |   5 +-
>  net/sunrpc/rpc_pipe.c          |  37 ++----
>  10 files changed, 491 insertions(+), 50 deletions(-)
> 

I went through the patches and the code all looks fine to me. The only
real questions I think are what sort of info we want to present here,
and how we'll deal with changes to the format in the future.

Maybe on that latter point, we can reserve the right to add fields to
this info later, but do our best to never remove existing ones? Then
tools could be written to just ignore the fields that they don't
understand.

Reviewed-by: Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] exposing knfsd opens to userspace
  2019-04-25 17:02 ` [PATCH 00/10] exposing knfsd opens to userspace Jeff Layton
@ 2019-04-25 20:01   ` J. Bruce Fields
  0 siblings, 0 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 20:01 UTC (permalink / raw)
  To: Jeff Layton
  Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, abe, lsof-l, util-linux

On Thu, Apr 25, 2019 at 01:02:30PM -0400, Jeff Layton wrote:
> On Thu, 2019-04-25 at 10:04 -0400, J. Bruce Fields wrote:
> > From: "J. Bruce Fields" <bfields@redhat.com>
> > 
> > The following patches expose information about NFSv4 opens held by knfsd
> > on behalf of NFSv4 clients.  Those are currently invisible to userspace,
> > unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
> > 
> > The approach is to add a new directory /proc/fs/nfsd/clients/ with
> > subdirectories for each active NFSv4 client.  Each subdirectory has an
> > "info" file with some basic information to help identify the client and
> > an "opens" directory that lists the opens held by that client.
> > 
> > I got it working by cobbling together some poorly-understood code I
> > found in libfs, rpc_pipefs and elsewhere.  If anyone wants to wade in
> > and tell me what I've got wrong, they're more than welcome, but at this
> > stage I'm more curious for feedback on the interface.
> > 
> > I'm also cc'ing people responsible for lsof and util-linux in case they
> > have any opinions.
> > 
> > Currently these pseudofiles look like:
> > 
> >   # find /proc/fs/nfsd/clients -type f|xargs tail 
> >   ==> /proc/fs/nfsd/clients/3741/opens <==
> >   5cc0cd36/6debfb50/00000001/00000001	rw	--	fd:10:13649	'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> >   5cc0cd36/6debfb50/00000003/00000001	r-	--	fd:10:13650	'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> > 
> >   ==> /proc/fs/nfsd/clients/3741/info <==
> >   clientid: 6debfb505cc0cd36
> >   address: 192.168.122.36:0
> >   name: Linux NFSv4.2 test2.fieldses.org
> >   minor version: 2
> > 
> > Each line of the "opens" file is tab-delimited and describes one open,
> > and the fields are stateid, open access bits, deny bits,
> > major:minor:ino, and open owner.
> > 
> 
> Nice work! We've needed this for a long time.
> 
> One thing we need to consider here from the get-go though is what sort
> of ABI guarantee you want for this format. People _will_ write scripts
> that scrape this info, so we should take that into account up front.

There is a man page for the nfsd filesystem, nfsd(7).  I should write up
something to add to that.  If people write code without reading that
then we may still end up boxed in, of course, but it's a start.

What I'm hoping we can count on from readers:

	- they will ignore any unkown files in clients/#/.
	- readers will ignore any lines in clients/#/info starting with
	  an unrecognized keyword.
	- they will ignore any unknown data at the end of
	  clients/#/opens.

That's in approximate decreasing order of my confidence in those rules
being observed, though I don't think any of those are too much to ask.

> > So, some random questions:
> > 
> > 	- I just copied the major:minor:ino thing from /proc/locks, I
> > 	  suspect we would have picked something different to identify
> > 	  inodes if /proc/locks were done now.  (Mount id and inode?
> > 	  Something else?)
> > 
> 
> That does make it easy to correlate with the info in /proc/locks.
> 
> We'd have a dentry here by virtue of the nfs4_file. Should we print a
> path in addition to this?

We could.  It won't be 100% reliable, of course (unlinks, renames), but
it could still be convenient for human readers, and an optimization for
non-human readers trying to find an inode.

The filehandle might be a good idea too.

I wonder if there's any issue with line length, or with quantity of data
emitted by a single seq_file show method.  The open owner can be up to
4K (after escaping), paths and filehandles can be long too.

> > 	- The open owner is just an opaque blob of binary data, but
> > 	  clients may choose to include some useful asci-encoded
> > 	  information, so I'm formatting them as strings with non-ascii
> > 	  stuff escaped.  For example, pynfs usually uses the name of
> > 	  the test as the open owner.  But as you see above, the ascii
> > 	  content of the Linux client's open owners is less useful.
> > 	  Also, there's no way I know of to map them back to a file
> > 	  description or process or anything else useful on the client,
> > 	  so perhaps they're of limited interest.
> > 
> > 	- I'm not sure about the stateid either.  I did think it might
> > 	  be useful just as a unique identifier for each line.
> > 	  (Actually for that it'd be enough to take just the third of
> > 	  those four numbers making up the stateid--maybe that would be
> > 	  better.)
> 
> It'd be ideal to be able to easily correlate this info with what
> wireshark displays. Does wireshark display hashes for openowners? I know
> it does for stateids. If so, generating the same hash would be really
> nice.
> 
> That said, waybe it's best to just dump the raw info out here though and
> rely on some postprocessing scripts for viewing it?

In that case, I think so, as I don't know how committed wireshark is to
the choice of hash.

> > In the "info" file, the "name" line is the client identifier/client
> > owner provided by the client, which (like the stateowner) is just opaque
> > binary data, though as you can see here the Linux client is providing a
> > readable ascii string.
> > 
> > There's probably a lot more we could add to that info file eventually.
> > 
> > Other stuff to add next:
> > 
> > 	- nfsd/clients/#/kill that you can write to to revoke all a
> > 	  client's state if it's wedged somehow.
> 
> That would also be neat. We have a bit of code to support today that in
> the fault injection code, but it'll need some cleanup and wiring it into
> a knob here would be better.

OK, good, I'm working on that.  Looks like fault injection gives up if
there are rpc's in process for the given client, whereas here I'd rather
force the expiry.  Looks like that needs some straightforward waitqueue
logic to wait for the in-progress rpc's.

--b.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 08/10] nfsd4: add file to display list of client's opens
  2019-04-25 18:04   ` Jeff Layton
@ 2019-04-25 20:14     ` J. Bruce Fields
  2019-04-25 21:14       ` Andreas Dilger
  0 siblings, 1 reply; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-25 20:14 UTC (permalink / raw)
  To: Jeff Layton
  Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, abe, lsof-l, util-linux

On Thu, Apr 25, 2019 at 02:04:59PM -0400, Jeff Layton wrote:
> More bikeshedding: should we have a "states" file instead of an "opens"
> file and print a different set of output for each stateid type?

Sure.  The format of the file could be something like

	<stateid> open rw -- <openowner>...
	<stateid> lock r 0-EOF <lockowner>...
	<stateid> deleg r

I wonder if we could put owners on separate lines and do some
heirarchical thing to show owner-stateid relationships?  Hm.  That's
kind of appealing but more work.

I was only planning to do opens for the first iteration, and I think
extending later in separate files is slightly safer.

More trivial, but: it'd lengthen lines and make columns line up less
often.  But if we include a lot of long variable-length fields then
that's kinda hopeless anyway.

--b.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] exposing knfsd opens to userspace
  2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
                   ` (11 preceding siblings ...)
  2019-04-25 18:17 ` Jeff Layton
@ 2019-04-25 21:08 ` Andreas Dilger
  2019-04-25 23:20   ` NeilBrown
  12 siblings, 1 reply; 30+ messages in thread
From: Andreas Dilger @ 2019-04-25 21:08 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-nfs, linux-fsdevel, abe, lsof-l, util-linux, Jeff Layton,
	James Simmons, NeilBrown

[-- Attachment #1: Type: text/plain, Size: 5103 bytes --]

On Apr 25, 2019, at 4:04 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> 
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> The following patches expose information about NFSv4 opens held by knfsd
> on behalf of NFSv4 clients.  Those are currently invisible to userspace,
> unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
> 
> The approach is to add a new directory /proc/fs/nfsd/clients/ with
> subdirectories for each active NFSv4 client.  Each subdirectory has an
> "info" file with some basic information to help identify the client and
> an "opens" directory that lists the opens held by that client.
> 
> I got it working by cobbling together some poorly-understood code I
> found in libfs, rpc_pipefs and elsewhere.  If anyone wants to wade in
> and tell me what I've got wrong, they're more than welcome, but at this
> stage I'm more curious for feedback on the interface.

Is this in procfs, sysfs, or a separate NFSD-specific filesystem?
My understanding is that "complex" files are verboten in procfs and sysfs?
We've been going through a lengthy process to move files out of procfs
into sysfs and debugfs as a result (while trying to maintain some kind of
compatibility in the user tools), but if it is possible to use a separate
filesystem to hold all of the stats/parameters I'd much rather do that
than use debugfs (which has become root-access-only in newer kernels).

Cheers, Andreas

> I'm also cc'ing people responsible for lsof and util-linux in case they
> have any opinions.
> 
> Currently these pseudofiles look like:
> 
>  # find /proc/fs/nfsd/clients -type f|xargs tail
>  ==> /proc/fs/nfsd/clients/3741/opens <==
>  5cc0cd36/6debfb50/00000001/00000001	rw	--	fd:10:13649	'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
>  5cc0cd36/6debfb50/00000003/00000001	r-	--	fd:10:13650	'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> 
>  ==> /proc/fs/nfsd/clients/3741/info <==
>  clientid: 6debfb505cc0cd36
>  address: 192.168.122.36:0
>  name: Linux NFSv4.2 test2.fieldses.org
>  minor version: 2
> 
> Each line of the "opens" file is tab-delimited and describes one open,
> and the fields are stateid, open access bits, deny bits,
> major:minor:ino, and open owner.
> 
> So, some random questions:
> 
> 	- I just copied the major:minor:ino thing from /proc/locks, I
> 	  suspect we would have picked something different to identify
> 	  inodes if /proc/locks were done now.  (Mount id and inode?
> 	  Something else?)
> 
> 	- The open owner is just an opaque blob of binary data, but
> 	  clients may choose to include some useful asci-encoded
> 	  information, so I'm formatting them as strings with non-ascii
> 	  stuff escaped.  For example, pynfs usually uses the name of
> 	  the test as the open owner.  But as you see above, the ascii
> 	  content of the Linux client's open owners is less useful.
> 	  Also, there's no way I know of to map them back to a file
> 	  description or process or anything else useful on the client,
> 	  so perhaps they're of limited interest.
> 
> 	- I'm not sure about the stateid either.  I did think it might
> 	  be useful just as a unique identifier for each line.
> 	  (Actually for that it'd be enough to take just the third of
> 	  those four numbers making up the stateid--maybe that would be
> 	  better.)
> 
> In the "info" file, the "name" line is the client identifier/client
> owner provided by the client, which (like the stateowner) is just opaque
> binary data, though as you can see here the Linux client is providing a
> readable ascii string.
> 
> There's probably a lot more we could add to that info file eventually.
> 
> Other stuff to add next:
> 
> 	- nfsd/clients/#/kill that you can write to to revoke all a
> 	  client's state if it's wedged somehow.
> 	- lists of locks and delegations; lower priority since most of
> 	  that information is already in /proc/locks.
> 
> --b.
> 
> J. Bruce Fields (10):
>  nfsd: persist nfsd filesystem across mounts
>  nfsd: rename cl_refcount
>  nfsd4: use reference count to free client
>  nfsd: add nfsd/clients directory
>  nfsd: make client/ directory names small ints
>  rpc: replace rpc_filelist by tree_descr
>  nfsd4: add a client info file
>  nfsd4: add file to display list of client's opens
>  nfsd: expose some more information about NFSv4 opens
>  nfsd: add more information to client info file
> 
> fs/nfsd/netns.h                |   6 +
> fs/nfsd/nfs4state.c            | 228 ++++++++++++++++++++++++++++++---
> fs/nfsd/nfsctl.c               | 225 +++++++++++++++++++++++++++++++-
> fs/nfsd/nfsd.h                 |  11 ++
> fs/nfsd/state.h                |   9 +-
> fs/seq_file.c                  |  17 +++
> include/linux/seq_file.h       |   2 +
> include/linux/string_helpers.h |   1 +
> lib/string_helpers.c           |   5 +-
> net/sunrpc/rpc_pipe.c          |  37 ++----
> 10 files changed, 491 insertions(+), 50 deletions(-)
> 
> --
> 2.20.1
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 08/10] nfsd4: add file to display list of client's opens
  2019-04-25 20:14     ` J. Bruce Fields
@ 2019-04-25 21:14       ` Andreas Dilger
  2019-04-26  1:18         ` J. Bruce Fields
  0 siblings, 1 reply; 30+ messages in thread
From: Andreas Dilger @ 2019-04-25 21:14 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Jeff Layton, J. Bruce Fields, linux-nfs, linux-fsdevel, abe,
	lsof-l, util-linux

[-- Attachment #1: Type: text/plain, Size: 1413 bytes --]

On Apr 25, 2019, at 10:14 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Thu, Apr 25, 2019 at 02:04:59PM -0400, Jeff Layton wrote:
>> More bikeshedding: should we have a "states" file instead of an "opens"
>> file and print a different set of output for each stateid type?
> 
> Sure.  The format of the file could be something like
> 
> 	<stateid> open rw -- <openowner>...
> 	<stateid> lock r 0-EOF <lockowner>...
> 	<stateid> deleg r
> 
> I wonder if we could put owners on separate lines and do some
> heirarchical thing to show owner-stateid relationships?  Hm.  That's
> kind of appealing but more work.

My suggestion here would be to use YAML-formatted output rather than
space/tab separated positional fields.  That can still be made human
readable, but also machine parseable and extensible if formatted properly.

Something like the following (use https://yaml-online-parser.appspot.com/
to validate the formatting):

- <stateid>: { state: open, mode: rw, owner: <openowner> }
- <stateid>: { state: lock, mode: r, start: 0, end: EOF, owner: <openowner> }


> I was only planning to do opens for the first iteration, and I think
> extending later in separate files is slightly safer.
> 
> More trivial, but: it'd lengthen lines and make columns line up less
> often.  But if we include a lot of long variable-length fields then
> that's kinda hopeless anyway.
> 
> --b.


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] exposing knfsd opens to userspace
  2019-04-25 21:08 ` Andreas Dilger
@ 2019-04-25 23:20   ` NeilBrown
  2019-04-26 11:00     ` Andreas Dilger
  0 siblings, 1 reply; 30+ messages in thread
From: NeilBrown @ 2019-04-25 23:20 UTC (permalink / raw)
  To: Andreas Dilger, J. Bruce Fields
  Cc: linux-nfs, linux-fsdevel, abe, lsof-l, util-linux, Jeff Layton,
	James Simmons

[-- Attachment #1: Type: text/plain, Size: 2616 bytes --]

On Thu, Apr 25 2019, Andreas Dilger wrote:

> On Apr 25, 2019, at 4:04 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>> 
>> From: "J. Bruce Fields" <bfields@redhat.com>
>> 
>> The following patches expose information about NFSv4 opens held by knfsd
>> on behalf of NFSv4 clients.  Those are currently invisible to userspace,
>> unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
>> 
>> The approach is to add a new directory /proc/fs/nfsd/clients/ with
>> subdirectories for each active NFSv4 client.  Each subdirectory has an
>> "info" file with some basic information to help identify the client and
>> an "opens" directory that lists the opens held by that client.
>> 
>> I got it working by cobbling together some poorly-understood code I
>> found in libfs, rpc_pipefs and elsewhere.  If anyone wants to wade in
>> and tell me what I've got wrong, they're more than welcome, but at this
>> stage I'm more curious for feedback on the interface.
>
> Is this in procfs, sysfs, or a separate NFSD-specific filesystem?
> My understanding is that "complex" files are verboten in procfs and sysfs?
> We've been going through a lengthy process to move files out of procfs
> into sysfs and debugfs as a result (while trying to maintain some kind of
> compatibility in the user tools), but if it is possible to use a separate
> filesystem to hold all of the stats/parameters I'd much rather do that
> than use debugfs (which has become root-access-only in newer kernels).

/proc/fs/nfsd is the (standard) mount point for a separate NFSD-specific
filesystem, originally created to replace the nfsd-specific
systemcall.
So the nfsd developers have a fair degree of latitude as to what can go
in there.

But I *don't* think it is a good idea to follow this pattern.  Creating
a separate control filesystem for every different module that thinks it
has different needs doesn't scale well.  We could end up with dozens of
tiny filesystems that all need to be mounted at just the right place.  I
don't think that is healthy for Linus.

Nor do I think we should be stuffing stuff into debugfs that isn't
really for debugging.  That isn't healthy either.

If sysfs doesn't meet our needs, then we need to raise that in
appropriate fora and present a clear case and try to build consensus -
because if we see a problem, then it is likely that others do to.

This is all presumably in the context of Lustre and while lustre is
out-of-tree we don't have a lot of leverage.  So I wouldn't consider
pursuing anything here until we get back upstream.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 08/10] nfsd4: add file to display list of client's opens
  2019-04-25 21:14       ` Andreas Dilger
@ 2019-04-26  1:18         ` J. Bruce Fields
  2019-05-16  0:40           ` J. Bruce Fields
  0 siblings, 1 reply; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-26  1:18 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Jeff Layton, J. Bruce Fields, linux-nfs, linux-fsdevel, abe,
	lsof-l, util-linux

On Thu, Apr 25, 2019 at 11:14:23PM +0200, Andreas Dilger wrote:
> On Apr 25, 2019, at 10:14 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > 
> > On Thu, Apr 25, 2019 at 02:04:59PM -0400, Jeff Layton wrote:
> >> More bikeshedding: should we have a "states" file instead of an "opens"
> >> file and print a different set of output for each stateid type?
> > 
> > Sure.  The format of the file could be something like
> > 
> > 	<stateid> open rw -- <openowner>...
> > 	<stateid> lock r 0-EOF <lockowner>...
> > 	<stateid> deleg r
> > 
> > I wonder if we could put owners on separate lines and do some
> > heirarchical thing to show owner-stateid relationships?  Hm.  That's
> > kind of appealing but more work.
> 
> My suggestion here would be to use YAML-formatted output rather than
> space/tab separated positional fields.  That can still be made human
> readable, but also machine parseable and extensible if formatted properly.

Well, anything we do will be machine-parseable.  But I can believe YAML
would make future extension easier.  It doesn't look like it would be
more complicated to generate.  It uses C-style escaping (like \x32) so
there'd be no change to how we format binary blobs.

The field names make it a tad more verbose but I guess it's not too bad.

> Something like the following (use https://yaml-online-parser.appspot.com/
> to validate the formatting):
> 
> - <stateid>: { state: open, mode: rw, owner: <openowner> }
> - <stateid>: { state: lock, mode: r, start: 0, end: EOF, owner: <openowner> }

I haven't noticed other kernel interfaces using YAML.  But I guess I'm
not coming up with any real objection.

--b.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] exposing knfsd opens to userspace
  2019-04-25 23:20   ` NeilBrown
@ 2019-04-26 11:00     ` Andreas Dilger
  2019-04-26 12:56       ` J. Bruce Fields
  2019-04-27  0:03       ` NeilBrown
  0 siblings, 2 replies; 30+ messages in thread
From: Andreas Dilger @ 2019-04-26 11:00 UTC (permalink / raw)
  To: NeilBrown
  Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, abe, lsof-l,
	util-linux, Jeff Layton, James Simmons

[-- Attachment #1: Type: text/plain, Size: 4298 bytes --]


> On Apr 26, 2019, at 1:20 AM, NeilBrown <neilb@suse.com> wrote:
> 
> On Thu, Apr 25 2019, Andreas Dilger wrote:
> 
>> On Apr 25, 2019, at 4:04 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>> 
>>> From: "J. Bruce Fields" <bfields@redhat.com>
>>> 
>>> The following patches expose information about NFSv4 opens held by knfsd
>>> on behalf of NFSv4 clients.  Those are currently invisible to userspace,
>>> unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
>>> 
>>> The approach is to add a new directory /proc/fs/nfsd/clients/ with
>>> subdirectories for each active NFSv4 client.  Each subdirectory has an
>>> "info" file with some basic information to help identify the client and
>>> an "opens" directory that lists the opens held by that client.
>>> 
>>> I got it working by cobbling together some poorly-understood code I
>>> found in libfs, rpc_pipefs and elsewhere.  If anyone wants to wade in
>>> and tell me what I've got wrong, they're more than welcome, but at this
>>> stage I'm more curious for feedback on the interface.
>> 
>> Is this in procfs, sysfs, or a separate NFSD-specific filesystem?
>> My understanding is that "complex" files are verboten in procfs and sysfs?
>> We've been going through a lengthy process to move files out of procfs
>> into sysfs and debugfs as a result (while trying to maintain some kind of
>> compatibility in the user tools), but if it is possible to use a separate
>> filesystem to hold all of the stats/parameters I'd much rather do that
>> than use debugfs (which has become root-access-only in newer kernels).
> 
> /proc/fs/nfsd is the (standard) mount point for a separate NFSD-specific
> filesystem, originally created to replace the nfsd-specific systemcall.
> So the nfsd developers have a fair degree of latitude as to what can go
> in there.
> 
> But I *don't* think it is a good idea to follow this pattern.  Creating
> a separate control filesystem for every different module that thinks it
> has different needs doesn't scale well.  We could end up with dozens of
> tiny filesystems that all need to be mounted at just the right place.  I
> don't think that is healthy for Linus.
> 
> Nor do I think we should be stuffing stuff into debugfs that isn't
> really for debugging.  That isn't healthy either.
> 
> If sysfs doesn't meet our needs, then we need to raise that in
> appropriate fora and present a clear case and try to build consensus -
> because if we see a problem, then it is likely that others do to.

I definitely *do* see the restrictions sysfs as being a problem, and I'd
guess NFS developers thought the same, since the "one value per file"
paradigm means that any kind of complex data needs to be split over
hundreds or thousands of files, which is very inefficient for userspace to
use.  Consider if /proc/slabinfo had to follow the sysfs paradigm, this would
(on my system) need about 225 directories (one per slab) and 3589 separate
files in total (one per value) that would need to be read every second to
implement "slabtop".  Running strace on "top" shows it taking 0.25s wall time
to open and read the files for only 350 processes on my system, at 2 files
per process ("stat" and "statm"), and those have 44 and 7 values, respectively,
so if it had to follow the sysfs paradigm would make this far worse.

I think it would make a lot more sense to have one file per item of interest,
and make it e.g. a well-structured YAML format ("name: value", with indentation
denoting a hierarchy/grouping of related items) so that it can be both human
and machine readable, easily parsed by scripts using bash or awk, rather than
having an explicit directory+file hierarchy.  Files like /proc/meminfo and
/proc/<pid>/status are already YAML-formatted (or almost so), so it isn't ugly
like XML encoding.

> This is all presumably in the context of Lustre and while lustre is
> out-of-tree we don't have a lot of leverage.  So I wouldn't consider
> pursuing anything here until we get back upstream.

Sure, except that is a catch-22.  We can't discuss what is needed until
the code is in the kernel, but we can't get it into the kernel until the
files it puts in /proc have been moved into /sys?

Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] exposing knfsd opens to userspace
  2019-04-26 11:00     ` Andreas Dilger
@ 2019-04-26 12:56       ` J. Bruce Fields
  2019-04-26 23:55         ` NeilBrown
  2019-04-27  0:03       ` NeilBrown
  1 sibling, 1 reply; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-26 12:56 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: NeilBrown, J. Bruce Fields, linux-nfs, linux-fsdevel, abe,
	lsof-l, util-linux, Jeff Layton, James Simmons

On Fri, Apr 26, 2019 at 01:00:19PM +0200, Andreas Dilger wrote:
> 
> On Apr 26, 2019, at 1:20 AM, NeilBrown <neilb@suse.com> wrote:
> > /proc/fs/nfsd is the (standard) mount point for a separate NFSD-specific
> > filesystem, originally created to replace the nfsd-specific systemcall.
> > So the nfsd developers have a fair degree of latitude as to what can go
> > in there.
> > 
> > But I *don't* think it is a good idea to follow this pattern.  Creating
> > a separate control filesystem for every different module that thinks it
> > has different needs doesn't scale well.  We could end up with dozens of
> > tiny filesystems that all need to be mounted at just the right place.

Aren't we already there?  My laptop, Fedora 29 with everything pretty much
default:

$  findmnt -n -oFSTYPE|sort|uniq -c
      1 autofs
      1 bpf
     11 cgroup
      1 cgroup2
      1 configfs
      1 debugfs
      1 devpts
      1 devtmpfs
      3 ext4
      1 fusectl
      1 fuse.gvfsd-fuse
      1 hugetlbfs
      1 mqueue
      1 proc
      1 pstore
      1 rpc_pipefs
      1 securityfs
      1 selinuxfs
      1 sysfs
      5 tmpfs

> > I don't think that is healthy for Linus.

What are the problems you see?

> > Nor do I think we should be stuffing stuff into debugfs that isn't
> > really for debugging.  That isn't healthy either.
> > 
> > If sysfs doesn't meet our needs, then we need to raise that in
> > appropriate fora and present a clear case and try to build consensus -
> > because if we see a problem, then it is likely that others do to.
> 
> I definitely *do* see the restrictions sysfs as being a problem, and I'd
> guess NFS developers thought the same,

For what it's worth, the nfsd filesystem predated sysfs, just barely.

Looking at the history....  It was actually Al that introduced it in March
2002.  Patrick Mochel added sysfs in October 2002.

But it's true that from the start nfsd didn't really fit the model of a single
(possibly writeable) attribute per file.

(Might be interesting to look at the distribution of new filesystem types over
time, there may have been a peak around then.)

--b.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] exposing knfsd opens to userspace
  2019-04-26 12:56       ` J. Bruce Fields
@ 2019-04-26 23:55         ` NeilBrown
  2019-04-27 19:00           ` J. Bruce Fields
  0 siblings, 1 reply; 30+ messages in thread
From: NeilBrown @ 2019-04-26 23:55 UTC (permalink / raw)
  To: J. Bruce Fields, Andreas Dilger
  Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, abe, lsof-l,
	util-linux, Jeff Layton, James Simmons

[-- Attachment #1: Type: text/plain, Size: 4760 bytes --]

On Fri, Apr 26 2019, J. Bruce Fields wrote:

> On Fri, Apr 26, 2019 at 01:00:19PM +0200, Andreas Dilger wrote:
>> 
>> On Apr 26, 2019, at 1:20 AM, NeilBrown <neilb@suse.com> wrote:
>> > /proc/fs/nfsd is the (standard) mount point for a separate NFSD-specific
>> > filesystem, originally created to replace the nfsd-specific systemcall.
>> > So the nfsd developers have a fair degree of latitude as to what can go
>> > in there.
>> > 
>> > But I *don't* think it is a good idea to follow this pattern.  Creating
>> > a separate control filesystem for every different module that thinks it
>> > has different needs doesn't scale well.  We could end up with dozens of
>> > tiny filesystems that all need to be mounted at just the right place.
>
> Aren't we already there?  My laptop, Fedora 29 with everything pretty much
> default:
>
> $  findmnt -n -oFSTYPE|sort|uniq -c
>       1 autofs
>       1 bpf
>      11 cgroup
>       1 cgroup2
>       1 configfs
>       1 debugfs
>       1 devpts
>       1 devtmpfs
>       3 ext4
>       1 fusectl
>       1 fuse.gvfsd-fuse
>       1 hugetlbfs
>       1 mqueue
>       1 proc
>       1 pstore
>       1 rpc_pipefs
>       1 securityfs
>       1 selinuxfs
>       1 sysfs
>       5 tmpfs

Sometimes I think "Linux is so full of crap, it hardly matters if more
crap is added".  Other times I feel a bit more social responsibility and
want to fight against adding too much more crap.

>
>> > I don't think that is healthy for Linus.
                                       ^^^^^ oops
>
> What are the problems you see?

Fragmentation.
This is exactly the platform problem.  People look at the existing
functionality, decide that it doesn't quite meet their needs, and then
do an implement something that is mostly the same but just different
enough so that they feel more comfortable.

We have 'seq_file' which encourages people to create line-oriented info
files, but if some have tab-separate lines, others have CSV, others have
yaml etc, then there is plenty of room for confusion.

If we could, instead, just agreed that more structured data actually can
make sense in sysfs, and come up with a coherent approach that we can
all tolerate, then life would be much easier.

>
>> > Nor do I think we should be stuffing stuff into debugfs that isn't
>> > really for debugging.  That isn't healthy either.
>> > 
>> > If sysfs doesn't meet our needs, then we need to raise that in
>> > appropriate fora and present a clear case and try to build consensus -
>> > because if we see a problem, then it is likely that others do to.
>> 
>> I definitely *do* see the restrictions sysfs as being a problem, and I'd
>> guess NFS developers thought the same,
>
> For what it's worth, the nfsd filesystem predated sysfs, just barely.
>
> Looking at the history....  It was actually Al that introduced it in March
> 2002.  Patrick Mochel added sysfs in October 2002.

IIRC the nfsd filesystem was added to address the difficulty of
supporting a systemcall in a module.  That basically doesn't work, so
there needed to be some clean interface between some minimal nfsdctl systemcall
code that was always compiled in, and the actually implementation that
could be in a module.  It was Viro who particularly noticed this need,
so it was a filesystem that was chosen to fill the need - filesystems in
modules was already a solved problem.
The system call was (I think) write-only (no non-trivial return data)
and each file just recieved a binary struct that matches the syscall
argument.
Once that was in place, it became a dumping ground for whatever we
wanted.

sysfs was, I think, originally about power management.  It exposed
connections between devices more than the devices themselves.  This
allowed user-space to be able to power-things down when not in use, and
to understand the dependencies.
Since then it grew....

>
> But it's true that from the start nfsd didn't really fit the model of a single
> (possibly writeable) attribute per file.

Depends on what you mean by that.  Original files where write-only and
where slightly complex attributes. Writing performed an action, like
adding an entry to the export table (first you add a client, then add a
client+filesystem to export it).

This idea for a file performing an action, rather than presenting an
attribute, is much the same as the "bind" and "unbind" files you can
find in sysfs.

(see also https://lwn.net/Articles/378884/ for examples of sysfs files
that are not one-attribute-per-file)

NeilBrown

>
> (Might be interesting to look at the distribution of new filesystem types over
> time, there may have been a peak around then.)
>
> --b.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] exposing knfsd opens to userspace
  2019-04-26 11:00     ` Andreas Dilger
  2019-04-26 12:56       ` J. Bruce Fields
@ 2019-04-27  0:03       ` NeilBrown
  1 sibling, 0 replies; 30+ messages in thread
From: NeilBrown @ 2019-04-27  0:03 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, abe, lsof-l,
	util-linux, Jeff Layton, James Simmons

[-- Attachment #1: Type: text/plain, Size: 5028 bytes --]

On Fri, Apr 26 2019, Andreas Dilger wrote:

>> On Apr 26, 2019, at 1:20 AM, NeilBrown <neilb@suse.com> wrote:
>> 
>> On Thu, Apr 25 2019, Andreas Dilger wrote:
>> 
>>> On Apr 25, 2019, at 4:04 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>>> 
>>>> From: "J. Bruce Fields" <bfields@redhat.com>
>>>> 
>>>> The following patches expose information about NFSv4 opens held by knfsd
>>>> on behalf of NFSv4 clients.  Those are currently invisible to userspace,
>>>> unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
>>>> 
>>>> The approach is to add a new directory /proc/fs/nfsd/clients/ with
>>>> subdirectories for each active NFSv4 client.  Each subdirectory has an
>>>> "info" file with some basic information to help identify the client and
>>>> an "opens" directory that lists the opens held by that client.
>>>> 
>>>> I got it working by cobbling together some poorly-understood code I
>>>> found in libfs, rpc_pipefs and elsewhere.  If anyone wants to wade in
>>>> and tell me what I've got wrong, they're more than welcome, but at this
>>>> stage I'm more curious for feedback on the interface.
>>> 
>>> Is this in procfs, sysfs, or a separate NFSD-specific filesystem?
>>> My understanding is that "complex" files are verboten in procfs and sysfs?
>>> We've been going through a lengthy process to move files out of procfs
>>> into sysfs and debugfs as a result (while trying to maintain some kind of
>>> compatibility in the user tools), but if it is possible to use a separate
>>> filesystem to hold all of the stats/parameters I'd much rather do that
>>> than use debugfs (which has become root-access-only in newer kernels).
>> 
>> /proc/fs/nfsd is the (standard) mount point for a separate NFSD-specific
>> filesystem, originally created to replace the nfsd-specific systemcall.
>> So the nfsd developers have a fair degree of latitude as to what can go
>> in there.
>> 
>> But I *don't* think it is a good idea to follow this pattern.  Creating
>> a separate control filesystem for every different module that thinks it
>> has different needs doesn't scale well.  We could end up with dozens of
>> tiny filesystems that all need to be mounted at just the right place.  I
>> don't think that is healthy for Linus.
>> 
>> Nor do I think we should be stuffing stuff into debugfs that isn't
>> really for debugging.  That isn't healthy either.
>> 
>> If sysfs doesn't meet our needs, then we need to raise that in
>> appropriate fora and present a clear case and try to build consensus -
>> because if we see a problem, then it is likely that others do to.
>
> I definitely *do* see the restrictions sysfs as being a problem, and I'd
> guess NFS developers thought the same, since the "one value per file"
> paradigm means that any kind of complex data needs to be split over
> hundreds or thousands of files, which is very inefficient for userspace to
> use.  Consider if /proc/slabinfo had to follow the sysfs paradigm, this would
> (on my system) need about 225 directories (one per slab) and 3589 separate
> files in total (one per value) that would need to be read every second to
> implement "slabtop".  Running strace on "top" shows it taking 0.25s wall time
> to open and read the files for only 350 processes on my system, at 2 files
> per process ("stat" and "statm"), and those have 44 and 7 values, respectively,
> so if it had to follow the sysfs paradigm would make this far worse.
>
> I think it would make a lot more sense to have one file per item of interest,
> and make it e.g. a well-structured YAML format ("name: value", with indentation
> denoting a hierarchy/grouping of related items) so that it can be both human
> and machine readable, easily parsed by scripts using bash or awk, rather than
> having an explicit directory+file hierarchy.  Files like /proc/meminfo and
> /proc/<pid>/status are already YAML-formatted (or almost so), so it isn't ugly
> like XML encoding.

So what are your pain points?  What data do you really want to present in
a structured file?
Look at /proc/self/mountstats on some machine which has an NFS mount.
There would be no problem adding similar information for lustre mounts.

What data do you want to export to user-space, which wouldn't fit there
and doesn't fit the one-value-per-file model.  To make a case, we need
concrete data.

>
>> This is all presumably in the context of Lustre and while lustre is
>> out-of-tree we don't have a lot of leverage.  So I wouldn't consider
>> pursuing anything here until we get back upstream.
>
> Sure, except that is a catch-22.  We can't discuss what is needed until
> the code is in the kernel, but we can't get it into the kernel until the
> files it puts in /proc have been moved into /sys?

Or maybe just removed.  If lustre is usable without some of these files,
then we can land lustre without them, and then start the conversation
about how to export the data that we want exported.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] exposing knfsd opens to userspace
  2019-04-26 23:55         ` NeilBrown
@ 2019-04-27 19:00           ` J. Bruce Fields
  2019-04-28 22:57             ` NeilBrown
  0 siblings, 1 reply; 30+ messages in thread
From: J. Bruce Fields @ 2019-04-27 19:00 UTC (permalink / raw)
  To: NeilBrown
  Cc: Andreas Dilger, J. Bruce Fields, linux-nfs, linux-fsdevel, abe,
	lsof-l, util-linux, Jeff Layton, James Simmons

On Sat, Apr 27, 2019 at 09:55:23AM +1000, NeilBrown wrote:
> On Fri, Apr 26 2019, J. Bruce Fields wrote:
> > But it's true that from the start nfsd didn't really fit the model
> > of a single (possibly writeable) attribute per file.
> 
> Depends on what you mean by that.  Original files where write-only and
> where slightly complex attributes.

Yes I thought it was just those too, but then I looked at the original
commit it also included at least the "exports" file.

> Writing performed an action, like
> adding an entry to the export table (first you add a client, then add a
> client+filesystem to export it).
> 
> This idea for a file performing an action, rather than presenting an
> attribute, is much the same as the "bind" and "unbind" files you can
> find in sysfs.
> 
> (see also https://lwn.net/Articles/378884/ for examples of sysfs files
> that are not one-attribute-per-file)

I'll give that a re-read, thanks.

I did spend maybe a few minutes looking into basing nfsd code on kernfs
and didn't think it was worth it.  I could take a more serious look.

--b.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] exposing knfsd opens to userspace
  2019-04-27 19:00           ` J. Bruce Fields
@ 2019-04-28 22:57             ` NeilBrown
  0 siblings, 0 replies; 30+ messages in thread
From: NeilBrown @ 2019-04-28 22:57 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Andreas Dilger, J. Bruce Fields, linux-nfs, linux-fsdevel, abe,
	lsof-l, util-linux, Jeff Layton, James Simmons

[-- Attachment #1: Type: text/plain, Size: 2422 bytes --]

On Sat, Apr 27 2019, J. Bruce Fields wrote:

> On Sat, Apr 27, 2019 at 09:55:23AM +1000, NeilBrown wrote:
>> On Fri, Apr 26 2019, J. Bruce Fields wrote:
>> > But it's true that from the start nfsd didn't really fit the model
>> > of a single (possibly writeable) attribute per file.
>> 
>> Depends on what you mean by that.  Original files where write-only and
>> where slightly complex attributes.
>
> Yes I thought it was just those too, but then I looked at the original
> commit it also included at least the "exports" file.

Maybe it depends on how one chooses to interpret history - never an
exact science.

The "exports" file pre-existed the nfsd filesystem - it was (and still
is) a file in procfs: /proc/fs/nfs/exports.  So the nfsd filesystem was
not created to provide that file.  It was created to replace a
systemcall.
As I said, it subsequently had a variety of things added to it.  exports
was just the first of these.  At least, that is how I choose to see it.

>
>> Writing performed an action, like
>> adding an entry to the export table (first you add a client, then add a
>> client+filesystem to export it).
>> 
>> This idea for a file performing an action, rather than presenting an
>> attribute, is much the same as the "bind" and "unbind" files you can
>> find in sysfs.
>> 
>> (see also https://lwn.net/Articles/378884/ for examples of sysfs files
>> that are not one-attribute-per-file)
>
> I'll give that a re-read, thanks.
>
> I did spend maybe a few minutes looking into basing nfsd code on kernfs
> and didn't think it was worth it.  I could take a more serious look.

I think that in your use-case it make lots of sense to have a structured
file for the "opens" (similar to /proc/locks and /proc/mounts).
The "info" could reasonably be several attribute files (clientid,
address, name, minor_version), but I don't think it benefits anyone for
'opens' to be a directory full of directories each with a file for each
of the fields.

So using kernfs would mean pushing for allowing structured files in
kernfs.  I'd be happy to support that, but I think you would need to go
into it convinced that you really wanted to persist.

I think using kernfs is mostly about embedding a kobject in everything,
then setting an appropriate ktype with appropriate attributes.  Not
particularly complex, but certainly a bit of work.

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 09/10] nfsd: expose some more information about NFSv4 opens
  2019-04-25 14:04 ` [PATCH 09/10] nfsd: expose some more information about NFSv4 opens J. Bruce Fields
@ 2019-05-02 15:28   ` Benjamin Coddington
  2019-05-02 15:58     ` Andrew W Elble
  0 siblings, 1 reply; 30+ messages in thread
From: Benjamin Coddington @ 2019-05-02 15:28 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-nfs, linux-fsdevel, abe, lsof-l, util-linux, jlayton

On 25 Apr 2019, at 10:04, J. Bruce Fields wrote:

> From: "J. Bruce Fields" <bfields@redhat.com>
>
> Add open modes, device numbers, inode numbers, and open owners to each
> line of nfsd/clients/#/opens.
>
> Open owners are totally opaque but seem to sometimes have some useful
> ascii strings included, so passing through printable ascii characters
> and escaping the rest seems useful while still being machine-readable.
> ---
>  fs/nfsd/nfs4state.c            | 40 
> ++++++++++++++++++++++++++++------
>  fs/nfsd/state.h                |  2 +-
>  fs/seq_file.c                  | 17 +++++++++++++++
>  include/linux/seq_file.h       |  2 ++
>  include/linux/string_helpers.h |  1 +
>  lib/string_helpers.c           |  5 +++--
>  6 files changed, 57 insertions(+), 10 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 829d1e5440d3..f53621a65e60 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -42,6 +42,7 @@
>  #include <linux/sunrpc/svcauth_gss.h>
>  #include <linux/sunrpc/addr.h>
>  #include <linux/jhash.h>
> +#include <linux/string_helpers.h>
>  #include "xdr4.h"
>  #include "xdr4cb.h"
>  #include "vfs.h"
> @@ -2261,16 +2262,41 @@ static void opens_stop(struct seq_file *s, 
> void *v)
>  static int opens_show(struct seq_file *s, void *v)
>  {
>  	struct nfs4_stid *st = v;
> -	struct nfs4_ol_stateid *os;
> -	u64 stateid;
> +	struct nfs4_ol_stateid *ols;
> +	struct nfs4_file *nf;
> +	struct file *file;
> +	struct inode *inode;
> +	struct nfs4_stateowner *oo;
> +	unsigned int access, deny;
>
>  	if (st->sc_type != NFS4_OPEN_STID)
>  		return 0; /* XXX: or SEQ_SKIP? */
> -	os = openlockstateid(st);
> -	/* XXX: get info about file, etc., here */
> +	ols = openlockstateid(st);
> +	oo = ols->st_stateowner;
> +	nf = st->sc_file;
> +	file = find_any_file(nf);
> +	inode = d_inode(file->f_path.dentry);
> +
> +	seq_printf(s, STATEID_FMT, STATEID_VAL(&st->sc_stateid));

Should we match the byte order printed with what wireshark/tshark sees?

For example, this will print:

5ccb016e/6d028c97/00000002/00000002 -w  --  fd:00:9163439   'open 
id:\x00\x00\x00.\x00\x00\x00\x00\x00\x00\x021\x8dp\xbe\xc7'

But, tshark -V prints:

         Opcode: OPEN (18)
             Status: NFS4_OK (0)
             stateid
                 [StateID Hash: 0x8298]
                 seqid: 0x00000002
                 Data: 6e01cb5c978c026d02000000
                 [Data hash (CRC-32): 0x8391069f]

I think this is the first time we've exposed state ids to users from 
knfsd,
I wonder if we should make it easier for everyone that might want to try 
to
correlate this information with what they can see in a packet capture.

Ben

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 09/10] nfsd: expose some more information about NFSv4 opens
  2019-05-02 15:28   ` Benjamin Coddington
@ 2019-05-02 15:58     ` Andrew W Elble
  2019-05-07  1:02       ` J. Bruce Fields
  0 siblings, 1 reply; 30+ messages in thread
From: Andrew W Elble @ 2019-05-02 15:58 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, abe, lsof-l,
	util-linux, jlayton

Benjamin Coddington <bcodding@redhat.com> writes:

> On 25 Apr 2019, at 10:04, J. Bruce Fields wrote:
>
>> From: "J. Bruce Fields" <bfields@redhat.com>
>>
>> Add open modes, device numbers, inode numbers, and open owners to each
>> line of nfsd/clients/#/opens.
>>
>> Open owners are totally opaque but seem to sometimes have some useful
>> ascii strings included, so passing through printable ascii characters
>> and escaping the rest seems useful while still being machine-readable.
>> ---
>>  fs/nfsd/nfs4state.c            | 40
>> ++++++++++++++++++++++++++++------
>>  fs/nfsd/state.h                |  2 +-
>>  fs/seq_file.c                  | 17 +++++++++++++++
>>  include/linux/seq_file.h       |  2 ++
>>  include/linux/string_helpers.h |  1 +
>>  lib/string_helpers.c           |  5 +++--
>>  6 files changed, 57 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index 829d1e5440d3..f53621a65e60 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -42,6 +42,7 @@
>>  #include <linux/sunrpc/svcauth_gss.h>
>>  #include <linux/sunrpc/addr.h>
>>  #include <linux/jhash.h>
>> +#include <linux/string_helpers.h>
>>  #include "xdr4.h"
>>  #include "xdr4cb.h"
>>  #include "vfs.h"
>> @@ -2261,16 +2262,41 @@ static void opens_stop(struct seq_file *s,
>> void *v)
>>  static int opens_show(struct seq_file *s, void *v)
>>  {
>>  	struct nfs4_stid *st = v;
>> -	struct nfs4_ol_stateid *os;
>> -	u64 stateid;
>> +	struct nfs4_ol_stateid *ols;
>> +	struct nfs4_file *nf;
>> +	struct file *file;
>> +	struct inode *inode;
>> +	struct nfs4_stateowner *oo;
>> +	unsigned int access, deny;
>>
>>  	if (st->sc_type != NFS4_OPEN_STID)
>>  		return 0; /* XXX: or SEQ_SKIP? */
>> -	os = openlockstateid(st);
>> -	/* XXX: get info about file, etc., here */
>> +	ols = openlockstateid(st);
>> +	oo = ols->st_stateowner;
>> +	nf = st->sc_file;
>> +	file = find_any_file(nf);

Is there a matching fput() missing somewhere, or did I just not see it...?

>> +	inode = d_inode(file->f_path.dentry);
>> +
>> +	seq_printf(s, STATEID_FMT, STATEID_VAL(&st->sc_stateid));
>
> Should we match the byte order printed with what wireshark/tshark sees?

^^ +1


Thanks,

Andy

-- 
Andrew W. Elble
Infrastructure Engineer
Information and Technology Services
Rochester Institute of Technology
tel: (585)-475-2411 | aweits@rit.edu
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912

CONFIDENTIALITY NOTE: The information transmitted, including
attachments, is intended only for the person(s) or entity to which it
is addressed and may contain confidential and/or privileged material.
Any review, retransmission, dissemination or other use of, or taking of
any action in reliance upon this information by persons or entities
other than the intended recipient is prohibited. If you received this
in error, please contact the sender and destroy any copies of this
information.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 09/10] nfsd: expose some more information about NFSv4 opens
  2019-05-02 15:58     ` Andrew W Elble
@ 2019-05-07  1:02       ` J. Bruce Fields
  0 siblings, 0 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-05-07  1:02 UTC (permalink / raw)
  To: Andrew W Elble
  Cc: Benjamin Coddington, J. Bruce Fields, linux-nfs, linux-fsdevel,
	abe, lsof-l, util-linux, jlayton

On Thu, May 02, 2019 at 11:58:06AM -0400, Andrew W Elble wrote:
> Benjamin Coddington <bcodding@redhat.com> writes:
> 
> > On 25 Apr 2019, at 10:04, J. Bruce Fields wrote:
> >
> >> From: "J. Bruce Fields" <bfields@redhat.com>
> >>
> >> Add open modes, device numbers, inode numbers, and open owners to each
> >> line of nfsd/clients/#/opens.
> >>
> >> Open owners are totally opaque but seem to sometimes have some useful
> >> ascii strings included, so passing through printable ascii characters
> >> and escaping the rest seems useful while still being machine-readable.
> >> ---
> >>  fs/nfsd/nfs4state.c            | 40
> >> ++++++++++++++++++++++++++++------
> >>  fs/nfsd/state.h                |  2 +-
> >>  fs/seq_file.c                  | 17 +++++++++++++++
> >>  include/linux/seq_file.h       |  2 ++
> >>  include/linux/string_helpers.h |  1 +
> >>  lib/string_helpers.c           |  5 +++--
> >>  6 files changed, 57 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> >> index 829d1e5440d3..f53621a65e60 100644
> >> --- a/fs/nfsd/nfs4state.c
> >> +++ b/fs/nfsd/nfs4state.c
> >> @@ -42,6 +42,7 @@
> >>  #include <linux/sunrpc/svcauth_gss.h>
> >>  #include <linux/sunrpc/addr.h>
> >>  #include <linux/jhash.h>
> >> +#include <linux/string_helpers.h>
> >>  #include "xdr4.h"
> >>  #include "xdr4cb.h"
> >>  #include "vfs.h"
> >> @@ -2261,16 +2262,41 @@ static void opens_stop(struct seq_file *s,
> >> void *v)
> >>  static int opens_show(struct seq_file *s, void *v)
> >>  {
> >>  	struct nfs4_stid *st = v;
> >> -	struct nfs4_ol_stateid *os;
> >> -	u64 stateid;
> >> +	struct nfs4_ol_stateid *ols;
> >> +	struct nfs4_file *nf;
> >> +	struct file *file;
> >> +	struct inode *inode;
> >> +	struct nfs4_stateowner *oo;
> >> +	unsigned int access, deny;
> >>
> >>  	if (st->sc_type != NFS4_OPEN_STID)
> >>  		return 0; /* XXX: or SEQ_SKIP? */
> >> -	os = openlockstateid(st);
> >> -	/* XXX: get info about file, etc., here */
> >> +	ols = openlockstateid(st);
> >> +	oo = ols->st_stateowner;
> >> +	nf = st->sc_file;
> >> +	file = find_any_file(nf);
> 
> Is there a matching fput() missing somewhere, or did I just not see it...?

Oops, fixed.

> >> +	inode = d_inode(file->f_path.dentry);
> >> +
> >> +	seq_printf(s, STATEID_FMT, STATEID_VAL(&st->sc_stateid));
> >
> > Should we match the byte order printed with what wireshark/tshark sees?
> 
> ^^ +1

Yeah, I agree, I'm changing that to just be a "%16phN", which should
give us what wireshark does.

Thanks for the review!

--b.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 08/10] nfsd4: add file to display list of client's opens
  2019-04-26  1:18         ` J. Bruce Fields
@ 2019-05-16  0:40           ` J. Bruce Fields
  0 siblings, 0 replies; 30+ messages in thread
From: J. Bruce Fields @ 2019-05-16  0:40 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Jeff Layton, J. Bruce Fields, linux-nfs, linux-fsdevel, abe,
	lsof-l, util-linux

On Thu, Apr 25, 2019 at 09:18:04PM -0400, J. Bruce Fields wrote:
> On Thu, Apr 25, 2019 at 11:14:23PM +0200, Andreas Dilger wrote:
> > On Apr 25, 2019, at 10:14 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > > 
> > > On Thu, Apr 25, 2019 at 02:04:59PM -0400, Jeff Layton wrote:
> > >> More bikeshedding: should we have a "states" file instead of an "opens"
> > >> file and print a different set of output for each stateid type?
> > > 
> > > Sure.  The format of the file could be something like
> > > 
> > > 	<stateid> open rw -- <openowner>...
> > > 	<stateid> lock r 0-EOF <lockowner>...
> > > 	<stateid> deleg r
> > > 
> > > I wonder if we could put owners on separate lines and do some
> > > heirarchical thing to show owner-stateid relationships?  Hm.  That's
> > > kind of appealing but more work.
> > 
> > My suggestion here would be to use YAML-formatted output rather than
> > space/tab separated positional fields.  That can still be made human
> > readable, but also machine parseable and extensible if formatted properly.
> 
> Well, anything we do will be machine-parseable.  But I can believe YAML
> would make future extension easier.  It doesn't look like it would be
> more complicated to generate.  It uses C-style escaping (like \x32) so
> there'd be no change to how we format binary blobs.
> 
> The field names make it a tad more verbose but I guess it's not too bad.

OK, I tried changing "opens" to "states" and using YAML.  Example output:

- 0x020000006a5fdc5c4ad09d9e01000000: { type: open, access: rw, deny: --, superblock: "fd:10:13649", owner: "open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x0046��QH " }
- 0x010000006a5fdc5c4ad09d9e03000000: { type: open, access: r-, deny: --, superblock: "fd:10:13650", owner: "open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x0046��QH" }
- 0x010000006a5fdc5c4ad09d9e04000000: { type: deleg, access: r, superblock: "fd:10:13650" }
- 0x010000006a5fdc5c4ad09d9e06000000: { type: lock, superblock: "fd:10:13649", owner: "lock id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x00\x00" }

The parser Andreas suggested (https://yaml-online-parser.appspot.com/)
accepts these.  It also thinks strings are always in a unicode encoding
of some kind, which they aren't.  The owners are arbitrary series of
bytes but I'd like at least any ascii parts to be human readable, and
I'm a little stuck on how to do that.

--b.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-05-16  1:48 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-25 14:04 [PATCH 00/10] exposing knfsd opens to userspace J. Bruce Fields
2019-04-25 14:04 ` [PATCH 01/10] nfsd: persist nfsd filesystem across mounts J. Bruce Fields
2019-04-25 14:04 ` [PATCH 02/10] nfsd: rename cl_refcount J. Bruce Fields
2019-04-25 14:04 ` [PATCH 03/10] nfsd4: use reference count to free client J. Bruce Fields
2019-04-25 14:04 ` [PATCH 04/10] nfsd: add nfsd/clients directory J. Bruce Fields
2019-04-25 14:04 ` [PATCH 05/10] nfsd: make client/ directory names small ints J. Bruce Fields
2019-04-25 14:04 ` [PATCH 06/10] rpc: replace rpc_filelist by tree_descr J. Bruce Fields
2019-04-25 14:04 ` [PATCH 07/10] nfsd4: add a client info file J. Bruce Fields
2019-04-25 14:04 ` [PATCH 08/10] nfsd4: add file to display list of client's opens J. Bruce Fields
2019-04-25 18:04   ` Jeff Layton
2019-04-25 20:14     ` J. Bruce Fields
2019-04-25 21:14       ` Andreas Dilger
2019-04-26  1:18         ` J. Bruce Fields
2019-05-16  0:40           ` J. Bruce Fields
2019-04-25 14:04 ` [PATCH 09/10] nfsd: expose some more information about NFSv4 opens J. Bruce Fields
2019-05-02 15:28   ` Benjamin Coddington
2019-05-02 15:58     ` Andrew W Elble
2019-05-07  1:02       ` J. Bruce Fields
2019-04-25 14:04 ` [PATCH 10/10] nfsd: add more information to client info file J. Bruce Fields
2019-04-25 17:02 ` [PATCH 00/10] exposing knfsd opens to userspace Jeff Layton
2019-04-25 20:01   ` J. Bruce Fields
2019-04-25 18:17 ` Jeff Layton
2019-04-25 21:08 ` Andreas Dilger
2019-04-25 23:20   ` NeilBrown
2019-04-26 11:00     ` Andreas Dilger
2019-04-26 12:56       ` J. Bruce Fields
2019-04-26 23:55         ` NeilBrown
2019-04-27 19:00           ` J. Bruce Fields
2019-04-28 22:57             ` NeilBrown
2019-04-27  0:03       ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).