linux-kernel.vger.kernel.org archive mirror
From: Andy Lutomirski <luto@mit.edu>
To: Nick Piggin <npiggin@suse.de>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Al Viro <viro@ZenIV.linux.org.uk>,
	Ulrich Drepper <drepper@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [rfc] new stat*fs-like syscall?
Date: Thu, 24 Jun 2010 10:08:55 -0400
Message-ID: <4C2366F7.5010200@mit.edu>
In-Reply-To: <20100624131455.GA10441@laptop>

Nick Piggin wrote:
> This has come up a few times in the past, and I'd like to try to get
> an agreement on it. statvfs(2) importantly contains f_flag (mount
> flags), and its use is encouraged over statfs(2). The kernel
> provides a statfs syscall only.
> 
> This means glibc has to provide f_flag support by parsing /proc/mounts
> and stat(2)ing mount points. This is really slow, and /proc/mounts is
> hard for the kernel to provide. It's actually the last scalability
> bottleneck in the core vfs for dbench (samba) after my patches.
> 
> Not only that, but it's racy.
> 
> Other than types, other differences are:
> - statvfs(2) has f_frsize, which seems fairly useless.
> - statvfs(2) has f_favail.
> - statfs(2) f_bsize is optimal transfer block, statvfs(2) f_bsize is fs
>   block size. The latter could be useful for disk space algorithms.
>   Both can be ill-defined.
> - statvfs(2) lacks f_type.
> 
> Is there anything more we should add here? Samba wants a capabilities
> field, with things like sparse files, quotas, compression, encryption,
> case preserving/sensitive.
> 
> Any thoughts?

Something like fsid but actually specified to uniquely identify a 
superblock.  (Currently, fsid seems to be set by the filesystem, and 
nothing in particular ensures that two different filesystems couldn't 
have collisions.)  We could guarantee (or have a flag guaranteeing) that
the pair (fsid, st_ino) actually uniquely identifies an inode.
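To make the idea concrete, here is a minimal user-space sketch of such an
identity pair. It assumes Linux and Python 3.7 or later (where
os.fstatvfs() exposes f_fsid); the helper name file_identity is invented
for illustration. Nothing today guarantees that the pair is unique, which
is exactly the guarantee being asked for.

```python
import os

def file_identity(path):
    """Return (f_fsid, st_ino) for path.

    Hypothetical helper for illustration; not a kernel or libc interface.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Query the same open file both times so the two answers cannot
        # end up describing different objects if the path is replaced.
        ino = os.fstat(fd).st_ino
        fsid = os.fstatvfs(fd).f_fsid
    finally:
        os.close(fd)
    return (fsid, ino)
```

Two paths that yield equal pairs would be treated as the same inode only
if the kernel promised that fsid values never collide across superblocks.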

Similarly, something like fsid that uniquely identifies the vfsmount 
could be useful, although I don't know how easy that would be to provide 
for fstat?fs.

If we could expose the complete set of filesystem mount options so that 
mount(1) didn't have to look at /proc/self/mounts or /etc/mtab, then 
playing with chroots would be that much easier.
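For comparison, this is roughly what user space has to do today to recover
mount options: read /proc/self/mounts and split each line into fields. A
minimal sketch follows; the function names are made up for illustration,
and it decodes only the octal escapes (such as \040 for a space) that the
file uses in its fields.

```python
import re

_OCTAL = re.compile(r"\\([0-7]{3})")

def _unescape(field):
    # /proc/self/mounts writes space, tab, newline and backslash as \ooo.
    return _OCTAL.sub(lambda m: chr(int(m.group(1), 8)), field)

def parse_mounts(text):
    """Parse /proc/self/mounts-style text into a list of dicts."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        source, target, fstype, options = fields[:4]
        entries.append({
            "source": _unescape(source),
            "target": _unescape(target),
            "fstype": fstype,
            "options": options.split(","),
        })
    return entries
```

Calling parse_mounts() on the contents of /proc/self/mounts gives one
entry per mount visible to the process. Inside a chroot that file may be
missing entirely, which is the inconvenience described above.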

Should we expose superblock and vfsmount options separately?  We have 
read-only bind mounts now, but the way they work is rather inscrutable, 
and if stat?fs could say "superblock is read-write but vfsmount is 
readonly" then people might be able to make more sense of what's going on.
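As a small illustration of the current view from user space, the sketch
below checks the ST_RDONLY bit in statvfs f_flag. It assumes a Unix
system where Python provides os.statvfs() and os.ST_RDONLY; the helper
name is invented. There is only the one bit, so the caller gets a single
answer and cannot tell from it alone whether the read-only state belongs
to the superblock or to the vfsmount.

```python
import os

def is_readonly_mount(path):
    """True if statvfs reports ST_RDONLY for the filesystem at path."""
    return bool(os.statvfs(path).f_flag & os.ST_RDONLY)
```

For example, is_readonly_mount("/") returns a plain boolean for the root
filesystem as seen by the calling process.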

--Andy
