From: email@example.com (J. Bruce Fields)
To: Jeff Layton <firstname.lastname@example.org>
Cc: "J. Bruce Fields" <email@example.com>, firstname.lastname@example.org,
    email@example.com, firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org
Subject: Re: [PATCH 00/10] exposing knfsd opens to userspace
Date: Thu, 25 Apr 2019 16:01:00 -0400
Message-ID: <20190425200100.GA9889@fieldses.org>
In-Reply-To: <email@example.com>

On Thu, Apr 25, 2019 at 01:02:30PM -0400, Jeff Layton wrote:
> On Thu, 2019-04-25 at 10:04 -0400, J. Bruce Fields wrote:
> > From: "J. Bruce Fields" <firstname.lastname@example.org>
> >
> > The following patches expose information about NFSv4 opens held by knfsd
> > on behalf of NFSv4 clients.  Those are currently invisible to userspace,
> > unlike locks (/proc/locks) and local processes' opens (/proc/<pid>/).
> >
> > The approach is to add a new directory /proc/fs/nfsd/clients/ with
> > subdirectories for each active NFSv4 client.  Each subdirectory has an
> > "info" file with some basic information to help identify the client and
> > an "opens" directory that lists the opens held by that client.
> >
> > I got it working by cobbling together some poorly-understood code I
> > found in libfs, rpc_pipefs and elsewhere.  If anyone wants to wade in
> > and tell me what I've got wrong, they're more than welcome, but at this
> > stage I'm more curious for feedback on the interface.
> >
> > I'm also cc'ing people responsible for lsof and util-linux in case they
> > have any opinions.
> >
> > Currently these pseudofiles look like:
> >
> >   # find /proc/fs/nfsd/clients -type f|xargs tail
> >   ==> /proc/fs/nfsd/clients/3741/opens <==
> >   5cc0cd36/6debfb50/00000001/00000001 rw -- fd:10:13649 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> >   5cc0cd36/6debfb50/00000003/00000001 r- -- fd:10:13650 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> >
> >   ==> /proc/fs/nfsd/clients/3741/info <==
> >   clientid: 6debfb505cc0cd36
> >   address: 192.168.122.36:0
> >   name: Linux NFSv4.2 test2.fieldses.org
> >   minor version: 2
> >
> > Each line of the "opens" file is tab-delimited and describes one open,
> > and the fields are stateid, open access bits, deny bits,
> > major:minor:ino, and open owner.
> >
>
> Nice work! We've needed this for a long time.
>
> One thing we need to consider here from the get-go though is what sort
> of ABI guarantee you want for this format. People _will_ write scripts
> that scrape this info, so we should take that into account up front.

There is a man page for the nfsd filesystem, nfsd(7).  I should write up
something to add to that.  If people write code without reading that
then we may still end up boxed in, of course, but it's a start.

What I'm hoping we can count on from readers:

	- they will ignore any unknown files in clients/#/.
	- they will ignore any lines in clients/#/info starting with an
	  unrecognized keyword.
	- they will ignore any unknown data at the end of
	  clients/#/opens.

That's in approximate decreasing order of my confidence in those rules
being observed, though I don't think any of those are too much to ask.

> > So, some random questions:
> >
> >     - I just copied the major:minor:ino thing from /proc/locks, I
> >       suspect we would have picked something different to identify
> >       inodes if /proc/locks were done now.  (Mount id and inode?
> >       Something else?)
> >
>
> That does make it easy to correlate with the info in /proc/locks.
>
> We'd have a dentry here by virtue of the nfs4_file. Should we print a
> path in addition to this?

We could.  It won't be 100% reliable, of course (unlinks, renames), but
it could still be convenient for human readers, and an optimization for
non-human readers trying to find an inode.

The filehandle might be a good idea too.

I wonder if there's any issue with line length, or with quantity of data
emitted by a single seq_file show method.  The open owner can be up to
4K (after escaping), paths and filehandles can be long too.

> >     - The open owner is just an opaque blob of binary data, but
> >       clients may choose to include some useful ascii-encoded
> >       information, so I'm formatting them as strings with non-ascii
> >       stuff escaped.  For example, pynfs usually uses the name of
> >       the test as the open owner.  But as you see above, the ascii
> >       content of the Linux client's open owners is less useful.
> >       Also, there's no way I know of to map them back to a file
> >       description or process or anything else useful on the client,
> >       so perhaps they're of limited interest.
> >
> >     - I'm not sure about the stateid either.  I did think it might
> >       be useful just as a unique identifier for each line.
> >       (Actually for that it'd be enough to take just the third of
> >       those four numbers making up the stateid--maybe that would be
> >       better.)
>
> It'd be ideal to be able to easily correlate this info with what
> wireshark displays. Does wireshark display hashes for openowners? I know
> it does for stateids. If so, generating the same hash would be really
> nice.
>
> That said, maybe it's best to just dump the raw info out here though and
> rely on some postprocessing scripts for viewing it?

In that case, I think so, as I don't know how committed wireshark is to
the choice of hash.
> > In the "info" file, the "name" line is the client identifier/client
> > owner provided by the client, which (like the stateowner) is just opaque
> > binary data, though as you can see here the Linux client is providing a
> > readable ascii string.
> >
> > There's probably a lot more we could add to that info file eventually.
> >
> > Other stuff to add next:
> >
> >     - nfsd/clients/#/kill that you can write to to revoke all of a
> >       client's state if it's wedged somehow.
>
> That would also be neat. We have a bit of code to support that today in
> the fault injection code, but it'll need some cleanup and wiring it into
> a knob here would be better.

OK, good, I'm working on that.  Looks like fault injection gives up if
there are rpc's in progress for the given client, whereas here I'd rather
force the expiry.  Looks like that needs some straightforward waitqueue
logic to wait for the in-progress rpc's.

--b.
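
For anyone scripting against these files, the three reader rules listed earlier in this message could be followed roughly as in the sketch below. It is only an illustration, not part of the patch series: the function names and the KNOWN_INFO_KEYS set are hypothetical, the field order is the one described for the "opens" file (stateid, access, deny, major:minor:ino, open owner), and it assumes that "unknown data at the end" means extra trailing tab-separated fields on a line and that the escaped open owner never contains a raw tab.

```python
# Sketch of a tolerant reader for the proposed nfsd/clients/ pseudofiles.
# All names here are hypothetical; the formats come from the example output
# quoted in this thread and may change before anything is merged.

# Keywords seen in the example "info" file; any other keyword is skipped.
KNOWN_INFO_KEYS = {"clientid", "address", "name", "minor version"}


def parse_info(text):
    """Return a dict of the recognized "keyword: value" lines of an info file."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so "address: 1.2.3.4:0" stays intact.
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key not in KNOWN_INFO_KEYS:
            continue  # unrecognized keyword (or no colon): ignore the line
        info[key] = value.strip()
    return info


def parse_opens_line(line):
    """Parse one tab-delimited line of an opens file, or return None."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return None  # not a line this reader understands
    # Only the first five fields are interpreted; trailing fields are ignored.
    stateid, access, deny, dev_ino, owner = fields[:5]
    parts = dev_ino.split(":")
    if len(parts) != 3:
        return None
    major, minor, ino = parts  # kept as strings; no numeric base is assumed
    return {
        "stateid": stateid,
        "access": access,
        "deny": deny,
        "major": major,
        "minor": minor,
        "ino": ino,
        "owner": owner,  # still escaped exactly as the kernel printed it
    }
```

A caller would read each clients/#/info file into parse_info() and feed clients/#/opens line by line to parse_opens_line(), skipping any None results and any files in clients/#/ it does not recognize. Keeping major, minor, and ino as strings makes it straightforward to compare them textually with the same triple in /proc/locks.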