linux-kernel.vger.kernel.org archive mirror
From: Ian Kent <raven@themaw.net>
To: Miklos Szeredi <miklos@szeredi.hu>, David Howells <dhowells@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Linux NFS list <linux-nfs@vger.kernel.org>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Anna Schumaker <anna.schumaker@netapp.com>,
	Theodore Ts'o <tytso@mit.edu>,
	Linux API <linux-api@vger.kernel.org>,
	linux-ext4@vger.kernel.org,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Miklos Szeredi <mszeredi@redhat.com>,
	Christian Brauner <christian@brauner.io>,
	Jann Horn <jannh@google.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Karel Zak <kzak@redhat.com>, Jeff Layton <jlayton@redhat.com>,
	linux-fsdevel@vger.kernel.org,
	LSM <linux-security-module@vger.kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 00/13] VFS: Filesystem information [ver #19]
Date: Wed, 01 Apr 2020 13:22:38 +0800
Message-ID: <50caf93782ba1d66bd6acf098fb8dcb0ecc98610.camel@themaw.net>
In-Reply-To: <CAJfpeguaiicjS2StY5m=8H7BCjq6PLxMsWE3Mx_jYR1foDWVTg@mail.gmail.com>

On Wed, 2020-03-18 at 17:05 +0100, Miklos Szeredi wrote:
> On Wed, Mar 18, 2020 at 4:08 PM David Howells <dhowells@redhat.com> wrote:
> 
> > ============================
> > WHY NOT USE PROCFS OR SYSFS?
> > ============================
> > 
> > Why is it better to go with a new system call rather than adding more
> > magic stuff to /proc or /sysfs for each superblock object and each
> > mount object?
> > 
> >  (1) It can be targeted.  It makes it easy to query directly by path.
> >      procfs and sysfs cannot do this easily.
> > 
> >  (2) It's more efficient as we can return specific binary data rather
> >      than making huge text dumps.  Granted, sysfs and procfs could
> >      present the same data, though as lots of little files which have
> >      to be individually opened, read, closed and parsed.
> 
> Asked this a number of times, but you haven't answered yet:  what
> application would require such a high efficiency?

Umm ... systemd and udisks2 and about 4 others.

A problem I've had with autofs for years is that direct mount maps of
any appreciable size cause several key user space applications to
consume all available CPU while autofs is starting or stopping, which
takes a fair while with a very large mount table. I have also seen a
couple of applications affected purely by the size of the mount table,
though not as badly as when autofs is starting or stopping.

Maps of 5,000 to 10,000 entries, which are not uncommon for heavy
autofs users in spite of the problem, can almost be handled, but much
larger than that and you've got a serious problem.
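
As a rough illustration of why table size matters, here is a minimal
sketch in Python. It assumes (this is not stated above) that a consumer
re-reads and re-parses the whole of /proc/self/mountinfo each time the
table changes. The sample lines and the helper names parse_mountinfo()
and lines_parsed_with_full_rescans() are made up for the example; only
the mountinfo field layout comes from proc(5).

```python
# Two made-up lines in the /proc/self/mountinfo layout described in proc(5).
SAMPLE = (
    "22 1 8:1 / / rw,relatime - ext4 /dev/sda1 rw\n"
    "36 22 0:31 / /home/alice rw,relatime shared:5 - autofs /etc/auto.direct rw,fd=7\n"
)


def parse_mountinfo(text):
    """Parse mountinfo-style text into a list of dicts, one per mount."""
    mounts = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Everything before " - " is the per-mount part (with optional
        # tags such as "shared:5"); after it come fstype, source and
        # superblock options.
        left, _, right = line.partition(" - ")
        head = left.split()
        tail = right.split()
        mounts.append({
            "mount_id": int(head[0]),
            "parent_id": int(head[1]),
            "mount_point": head[4],
            "fstype": tail[0],
            "source": tail[1] if len(tail) > 1 else "",
        })
    return mounts


def lines_parsed_with_full_rescans(n_mounts):
    """Total lines parsed if the whole table is re-read after each of
    n_mounts mounts is added one at a time (hypothetical cost model)."""
    return sum(range(1, n_mounts + 1))


if __name__ == "__main__":
    for m in parse_mountinfo(SAMPLE):
        print(m["mount_id"], m["fstype"], m["mount_point"])
    print(lines_parsed_with_full_rescans(10000))
```

Under that assumption the work grows with the square of the table
size: adding 10,000 mounts one at a time means parsing about 50 million
lines in total, in every process that watches the table.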

There are problems with expiration as well but that's more an autofs
problem that I need to fix.

To be clear, it's not autofs that needs the improvement (I need to
deal with this in autofs itself); it's the effect that these large
mount tables have on the rest of user space, and that's quite
significant.

I can't even think about resolving my autofs problem until this
problem is resolved, and handling very large numbers of mounts
as efficiently as possible must be part of that solution for me
and, I think, for the OS overall too.

Ian
> 
> Nobody's suggesting we move stat(2) to proc interfaces, and AFAIK
> nobody suggested we move /proc/PID/* to a binary syscall interface.
> Each one has its place, and I strongly feel that mount info belongs in
> the latter category.    Feel free to prove the opposite.
> 
> >  (3) We wouldn't have the overhead of open and close (even adding a
> >      self-contained readfile() syscall has to do that internally
> 
> Busted: add f_op->readfile() and be done with all that.   For example
> DEFINE_SHOW_ATTRIBUTE() could be trivially moved to that interface.
> 
> We could optimize existing proc, sys, etc. interfaces, but it's not
> been an issue, apparently.
> 
> >  (4) Opening a file in procfs or sysfs has a pathwalk overhead for each
> >      file accessed.  We can use an integer attribute ID instead (yes,
> >      this is similar to ioctl) - but could also use a string ID if that
> >      is preferred.
> > 
> >  (5) Can easily query cross-namespace if, say, a container manager
> >      process is given an fs_context that hasn't yet been mounted into a
> >      namespace - or hasn't even been fully created yet.
> 
> Works with my patch.
> 
> >  (6) Don't have to create/delete a bunch of sysfs/procfs nodes each
> >      time a mount happens or is removed - and since systemd makes much
> >      use of mount namespaces and mount propagation, this will create a
> >      lot of nodes.
> 
> Not true.
> 
> > The argument for doing this through procfs/sysfs/somemagicfs is that
> > someone using a shell can just query the magic files using ordinary
> > text tools, such as cat - and that has merit - but it doesn't solve
> > the query-by-pathname problem.
> > 
> > The suggested way around the query-by-pathname problem is to open the
> > target file O_PATH and then look in a magic directory under procfs
> > corresponding to the fd number to see a set of attribute files[*] laid
> > out.  Bash, however, can't open by O_PATH or O_NOFOLLOW as things
> > stand...
> 
> Bash doesn't have fsinfo(2) either, so that's not really a good
> argument.
> 
> Implementing a utility to show mount attribute(s) by path is trivial
> for the file based interface, while it would need to be updated for
> each extension of fsinfo(2).   Same goes for libc, language bindings,
> etc.
> 
> Thanks,
> Miklos
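
For reference, the "open O_PATH, then look under procfs by fd number"
idea quoted above can be approximated today with the mnt_id field that
/proc/self/fdinfo/<fd> already exposes. The sketch below is only that:
it does not use the attribute directory proposed in the thread (not
shown here), and the helper names parse_mnt_id() and
mount_id_for_path() are made up for the example.

```python
import os


def parse_mnt_id(fdinfo_text):
    """Return the mnt_id value from fdinfo-style text, or None if absent."""
    for line in fdinfo_text.splitlines():
        if line.startswith("mnt_id:"):
            return int(line.split()[1])
    return None


def mount_id_for_path(path):
    """Open path with O_PATH (no read permission needed, symlinks not
    followed) and look up the mount ID via /proc/self/fdinfo.
    Linux only; needs procfs mounted at /proc."""
    fd = os.open(path, os.O_PATH | os.O_NOFOLLOW)
    try:
        with open(f"/proc/self/fdinfo/{fd}") as f:
            return parse_mnt_id(f.read())
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(mount_id_for_path("/"))
```

The returned ID is the first field of the matching line in
/proc/self/mountinfo, so a by-path lookup through the file interface
still costs an open, a read, a close and a table scan.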


Thread overview: 25+ messages
2020-03-18 15:08 [PATCH 00/13] VFS: Filesystem information [ver #19] David Howells
2020-03-18 15:08 ` [PATCH 01/13] fsinfo: Add fsinfo() syscall to query filesystem " David Howells
2020-03-18 15:08 ` [PATCH 02/13] fsinfo: Provide a bitmap of supported features " David Howells
2020-03-18 15:08 ` [PATCH 03/13] fsinfo: Allow retrieval of superblock devname, options and stats " David Howells
2020-03-18 15:08 ` [PATCH 04/13] fsinfo: Allow fsinfo() to look up a mount object by ID " David Howells
2020-03-18 15:08 ` [PATCH 05/13] fsinfo: Add a uniquifier ID to struct mount " David Howells
2020-03-18 15:09 ` [PATCH 06/13] fsinfo: Allow mount information to be queried " David Howells
2020-03-18 15:09 ` [PATCH 07/13] fsinfo: Allow mount topology and propagation info to be retrieved " David Howells
2020-03-18 15:09 ` [PATCH 08/13] fsinfo: Provide notification overrun handling support " David Howells
2020-03-18 15:09 ` [PATCH 09/13] fsinfo: sample: Mount listing program " David Howells
2020-03-18 15:09 ` [PATCH 10/13] fsinfo: Add API documentation " David Howells
2020-03-18 15:09 ` [PATCH 11/13] fsinfo: Add support for AFS " David Howells
2020-03-18 15:09 ` [PATCH 12/13] fsinfo: Example support for Ext4 " David Howells
2020-03-18 15:10 ` [PATCH 13/13] fsinfo: Example support for NFS " David Howells
2020-03-18 16:05 ` [PATCH 00/13] VFS: Filesystem information " Miklos Szeredi
2020-04-01  5:22   ` Ian Kent [this message]
2020-04-01  8:18     ` Miklos Szeredi
2020-04-01  8:27     ` David Howells
2020-04-01  8:37       ` Miklos Szeredi
2020-04-01 12:35         ` Miklos Szeredi
2020-04-01 15:51         ` David Howells
2020-04-02  1:38         ` Ian Kent
2020-04-02 14:14           ` Karel Zak
2020-03-19 10:37 ` David Howells
2020-03-19 12:36   ` Miklos Szeredi
