From: Casey Schaufler <casey@schaufler-ca.com>
To: Dave Chinner <david@fromorbit.com>, Miklos Szeredi <mszeredi@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-api@vger.kernel.org, linux-man@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	Karel Zak <kzak@redhat.com>, Ian Kent <raven@themaw.net>,
	David Howells <dhowells@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <christian@brauner.io>,
	Amir Goldstein <amir73il@gmail.com>,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	Casey Schaufler <casey@schaufler-ca.com>
Subject: Re: [RFC PATCH] getvalues(2) prototype
Date: Wed, 23 Mar 2022 16:17:03 -0700
Message-ID: <4080a088-8d4a-9631-3374-ded001d35c58@schaufler-ca.com>
In-Reply-To: <20220323225843.GI1609613@dread.disaster.area>

On 3/23/2022 3:58 PM, Dave Chinner wrote:
> On Tue, Mar 22, 2022 at 08:27:12PM +0100, Miklos Szeredi wrote:
>> Add a new userspace API that allows getting multiple short values in a
>> single syscall.
>>
>> This would be useful for the following reasons:
>>
>> - Calling open/read/close for many small files is inefficient.  E.g. on my
>>    desktop invoking lsof(1) results in ~60k open + read + close calls under
>>    /proc and 90% of those are 128 bytes or less.
> How does doing the open/read/close in a single syscall make this any
> more efficient? All it saves is the overhead of a couple of
> syscalls, it doesn't reduce any of the setup or teardown overhead
> needed to read the data itself....
>
>> - Interfaces for getting various attributes and statistics are fragmented.
>>    For files we have basic stat, statx, extended attributes, file attributes
>>    (for which there are two overlapping ioctl interfaces).  For mounts and
>>    superblocks we have stat*fs as well as /proc/$PID/{mountinfo,mountstats}.
>>    The latter also has the problem of not allowing queries on a specific
>>    mount.
> https://xkcd.com/927/
>
>> - Some attributes are cheap to generate, some are expensive.  Allowing
>>    userspace to select which ones it needs should allow optimizing queries.
>>
>> - Adding an ascii namespace should allow easy extension and self
>>    description.
>>
>> - The values can be text or binary, whichever fits best.
>>
>> The interface definition is:
>>
>> struct name_val {
>> 	const char *name;	/* in */
>> 	struct iovec value_in;	/* in */
>> 	struct iovec value_out;	/* out */
>> 	uint32_t error;		/* out */
>> 	uint32_t reserved;
>> };
> Ahhh, XFS_IOC_ATTRMULTI_BY_HANDLE reborn. This is how xfsdump gets
> and sets attributes efficiently when dumping and restoring files -
> it's an interface that allows batches of xattr operations to be run
> on a file in a single syscall.
>
> I've said in the past when discussing things like statx() that maybe
> everything should be addressable via the xattr namespace and
> set/queried via xattr names regardless of how the filesystem stores
> the data. The VFS/filesystem simply translates the name to the
> storage location of the information. It might be held in xattrs, but
> it could just be a flag bit in an inode field.
>
> Then we just get named xattrs in batches from an open fd.
>
>> int getvalues(int dfd, const char *path, struct name_val *vec, size_t num,
>> 	      unsigned int flags);
>>
>> @dfd and @path are used to look up object $ORIGIN.  @vec contains @num
>> name/value descriptors.  @flags contains lookup flags for @path.
>>
>> The syscall returns the number of values filled or an error.
>>
>> A single name/value descriptor has the following fields:
>>
>> @name describes the object whose value is to be returned.  E.g.
>>
>> mnt                    - list of mount parameters
>> mnt:mountpoint         - the mountpoint of the mount of $ORIGIN
>> mntns                  - list of mount IDs reachable from the current root
>> mntns:21:parentid      - parent ID of the mount with ID of 21
>> xattr:security.selinux - the security.selinux extended attribute
>> data:foo/bar           - the data contained in file $ORIGIN/foo/bar
> How are these different from just declaring new xattr namespaces for
> these things? E.g. open any file and list the xattrs in the
> xattr:mount.mnt namespace to get the list of mount parameters for
> that mount.

There is a significant and vocal set of people who dislike xattrs
passionately. I often hear them whinging whenever someone proposes
using them. I think that your suggestion has all the advantages of
the getvalues(2) interface while also addressing its shortcomings.
If we could get it past the anti-xattr crowd we might have something.
You could even provide getvalues() on top of it.
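
To make the layering concrete, here is a rough userspace sketch of a
batched get built on the existing fgetxattr(2) call. It is not the
proposed syscall and not an existing kernel batch API: struct xattr_req
and getxattrs_batch() are hypothetical names invented for illustration,
and the loop still issues one syscall per name. It only shows the
per-entry value/error shape that a getvalues()-style wrapper could
expose on top of xattr names.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical descriptor, loosely modelled on the proposed struct name_val. */
struct xattr_req {
	const char *name;	/* in: full xattr name, e.g. "user.comment" */
	void *buf;		/* in: caller-supplied buffer */
	size_t size;		/* in: capacity of buf in bytes */
	ssize_t len;		/* out: bytes stored, or -1 on failure */
	int error;		/* out: 0 on success, else an errno value */
};

/*
 * Userspace emulation of a batched get: one fgetxattr(2) call per name.
 * Returns the number of values filled. A failure is recorded in the
 * affected entry and does not stop the rest of the batch.
 */
static size_t getxattrs_batch(int fd, struct xattr_req *vec, size_t num)
{
	size_t filled = 0;

	for (size_t i = 0; i < num; i++) {
		ssize_t n = fgetxattr(fd, vec[i].name, vec[i].buf, vec[i].size);

		vec[i].len = n;
		vec[i].error = n < 0 ? errno : 0;
		if (n >= 0)
			filled++;
	}
	return filled;
}
```

A caller would open the file once, fill an array of xattr_req entries
with the names it wants, and check each entry's error field afterwards.
Names in a namespace such as "mount.mnt" would only work if the kernel
grew that namespace; today the call would simply report an error for
them.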

>
> Why do we need a new "xattr in everything but name" interface when
> we could just extend the one we've already got and formalise a new,
> cleaner version of xattr batch APIs that have been around for 20-odd
> years already?
>
> Cheers,
>
> Dave.
>
