Linux-NFS Archive on
From: Andreas Dilger <>
To: NeilBrown <>
Cc: "J. Bruce Fields" <>,
	linux-nfs <>,
	linux-fsdevel <>, Jeff Layton <>,
	James Simmons <>
Subject: Re: [PATCH 00/10] exposing knfsd opens to userspace
Date: Fri, 26 Apr 2019 13:00:19 +0200
Message-ID: <>
In-Reply-To: <>

> On Apr 26, 2019, at 1:20 AM, NeilBrown <> wrote:
> On Thu, Apr 25 2019, Andreas Dilger wrote:
>> On Apr 25, 2019, at 4:04 PM, J. Bruce Fields <> wrote:
>>> From: "J. Bruce Fields" <>
>>> The following patches expose information about NFSv4 opens held by knfsd
>>> on behalf of NFSv4 clients.  Those are currently invisible to userspace,
>>> unlike locks (/proc/locks) and local processes' opens (/proc/<pid>/).
>>> The approach is to add a new directory /proc/fs/nfsd/clients/ with
>>> subdirectories for each active NFSv4 client.  Each subdirectory has an
>>> "info" file with some basic information to help identify the client and
>>> an "opens" directory that lists the opens held by that client.
>>> I got it working by cobbling together some poorly-understood code I
>>> found in libfs, rpc_pipefs and elsewhere.  If anyone wants to wade in
>>> and tell me what I've got wrong, they're more than welcome, but at this
>>> stage I'm more curious for feedback on the interface.
>> Is this in procfs, sysfs, or a separate NFSD-specific filesystem?
>> My understanding is that "complex" files are verboten in procfs and sysfs?
>> We've been going through a lengthy process to move files out of procfs
>> into sysfs and debugfs as a result (while trying to maintain some kind of
>> compatibility in the user tools), but if it is possible to use a separate
>> filesystem to hold all of the stats/parameters I'd much rather do that
>> than use debugfs (which has become root-access-only in newer kernels).
> /proc/fs/nfsd is the (standard) mount point for a separate NFSD-specific
> filesystem, originally created to replace the nfsd-specific systemcall.
> So the nfsd developers have a fair degree of latitude as to what can go
> in there.
> But I *don't* think it is a good idea to follow this pattern.  Creating
> a separate control filesystem for every different module that thinks it
> has different needs doesn't scale well.  We could end up with dozens of
> tiny filesystems that all need to be mounted at just the right place.  I
> don't think that is healthy for Linux.
> Nor do I think we should be stuffing stuff into debugfs that isn't
> really for debugging.  That isn't healthy either.
> If sysfs doesn't meet our needs, then we need to raise that in
> appropriate fora and present a clear case and try to build consensus -
> because if we see a problem, then it is likely that others do too.

I definitely *do* see the restrictions of sysfs as being a problem, and I'd
guess NFS developers thought the same, since the "one value per file"
paradigm means that any kind of complex data needs to be split over
hundreds or thousands of files, which is very inefficient for userspace to
use.  Consider if /proc/slabinfo had to follow the sysfs paradigm: this would
(on my system) need about 225 directories (one per slab) and 3589 separate
files in total (one per value) that would need to be read every second to
implement "slabtop".  Running strace on "top" shows it taking 0.25s wall time
to open and read the files for only 350 processes on my system, at 2 files
per process ("stat" and "statm"), and those have 44 and 7 values, respectively,
so having to follow the sysfs paradigm would make this far worse.
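
To make the file-count arithmetic concrete, here is a rough sketch that
counts how many directories and files a "one value per file" layout would
need for a slabinfo-style table.  The SAMPLE text is made-up illustrative
data in the /proc/slabinfo layout, not output from a real system, and the
function name is hypothetical.

```python
# Sketch only: SAMPLE is invented data in the /proc/slabinfo layout.
SAMPLE = """\
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ext4_inode_cache    5000   5200   1080   30    8 : tunables    0    0    0 : slabdata    174    174      0
dentry             90000  91000    192   21    1 : tunables    0    0    0 : slabdata   4334   4334      0
kmalloc-64          8000   8192     64   64    1 : tunables    0    0    0 : slabdata    128    128      0
"""


def one_value_per_file_cost(text):
    """Return (directories, files) a one-value-per-file layout would need."""
    dirs = 0
    files = 0
    for line in text.splitlines():
        # Skip blank lines, the version line and the column-header comment.
        if not line or line.startswith("#") or line.startswith("slabinfo"):
            continue
        fields = line.split()
        # Count only numeric columns; the slab name and the
        # ':' / 'tunables' / 'slabdata' labels are not values.
        values = [f for f in fields[1:] if f.isdigit()]
        dirs += 1             # one directory per slab
        files += len(values)  # one file per value
    return dirs, files


dirs, files = one_value_per_file_cost(SAMPLE)
print(f"single file: 1 open+read; split layout: {dirs} directories, {files} files")
```

Each of those files costs at least an open, a read and a close per
refresh, where the single-file layout costs one of each.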

I think it would make a lot more sense to have one file per item of interest,
and make it e.g. a well-structured YAML format ("name: value", with indentation
denoting a hierarchy/grouping of related items) so that it can be both human
and machine readable, easily parsed by scripts using bash or awk, rather than
having an explicit directory+file hierarchy.  Files like /proc/meminfo and
/proc/<pid>/status are already YAML-formatted (or almost so), so it isn't ugly
like XML encoding.
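
As a rough illustration of how little code such a format needs on the
consumer side, here is a minimal sketch that parses indented
"name: value" text into nested dictionaries.  It handles only the simple
subset described above (space indentation, a name with an empty value
opens a group), not full YAML, and the field names in SAMPLE are
hypothetical rather than taken from any existing kernel file.

```python
# Sketch only: parses a simple indented "name: value" subset, not full YAML.
SAMPLE = """\
MemTotal: 16384 kB
client:
  name: example
  opens:
    count: 2
uptime: 100
"""


def parse_name_value(text):
    """Parse indented 'name: value' lines into nested dicts of strings."""
    root = {}
    stack = [(-1, root)]  # (indent of the group header, dict for that group)
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        name, _, value = raw.strip().partition(":")
        value = value.strip()
        # Close any groups that are at the same or deeper indentation.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if value == "":
            # A name with no value starts a nested group.
            child = {}
            parent[name] = child
            stack.append((indent, child))
        else:
            parent[name] = value
    return root


stats = parse_name_value(SAMPLE)
print(stats["client"]["opens"]["count"])
```

Values are kept as strings so that units such as "kB" survive; a caller
that wants numbers can convert the fields it cares about.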

> This is all presumably in the context of Lustre and while Lustre is
> out-of-tree we don't have a lot of leverage.  So I wouldn't consider
> pursuing anything here until we get back upstream.

Sure, except that is a catch-22.  We can't discuss what is needed until
the code is in the kernel, but we can't get it into the kernel until the
files it puts in /proc have been moved into /sys?

Cheers, Andreas

Thread overview: 30+ messages
2019-04-25 14:04 J. Bruce Fields
2019-04-25 14:04 ` [PATCH 01/10] nfsd: persist nfsd filesystem across mounts J. Bruce Fields
2019-04-25 14:04 ` [PATCH 02/10] nfsd: rename cl_refcount J. Bruce Fields
2019-04-25 14:04 ` [PATCH 03/10] nfsd4: use reference count to free client J. Bruce Fields
2019-04-25 14:04 ` [PATCH 04/10] nfsd: add nfsd/clients directory J. Bruce Fields
2019-04-25 14:04 ` [PATCH 05/10] nfsd: make client/ directory names small ints J. Bruce Fields
2019-04-25 14:04 ` [PATCH 06/10] rpc: replace rpc_filelist by tree_descr J. Bruce Fields
2019-04-25 14:04 ` [PATCH 07/10] nfsd4: add a client info file J. Bruce Fields
2019-04-25 14:04 ` [PATCH 08/10] nfsd4: add file to display list of client's opens J. Bruce Fields
2019-04-25 18:04   ` Jeff Layton
2019-04-25 20:14     ` J. Bruce Fields
2019-04-25 21:14       ` Andreas Dilger
2019-04-26  1:18         ` J. Bruce Fields
2019-05-16  0:40           ` J. Bruce Fields
2019-04-25 14:04 ` [PATCH 09/10] nfsd: expose some more information about NFSv4 opens J. Bruce Fields
2019-05-02 15:28   ` Benjamin Coddington
2019-05-02 15:58     ` Andrew W Elble
2019-05-07  1:02       ` J. Bruce Fields
2019-04-25 14:04 ` [PATCH 10/10] nfsd: add more information to client info file J. Bruce Fields
2019-04-25 17:02 ` [PATCH 00/10] exposing knfsd opens to userspace Jeff Layton
2019-04-25 20:01   ` J. Bruce Fields
2019-04-25 18:17 ` Jeff Layton
2019-04-25 21:08 ` Andreas Dilger
2019-04-25 23:20   ` NeilBrown
2019-04-26 11:00     ` Andreas Dilger [this message]
2019-04-26 12:56       ` J. Bruce Fields
2019-04-26 23:55         ` NeilBrown
2019-04-27 19:00           ` J. Bruce Fields
2019-04-28 22:57             ` NeilBrown
2019-04-27  0:03       ` NeilBrown
