linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christian Brauner <christian@brauner.io>
To: David Howells <dhowells@redhat.com>
Cc: viro@zeniv.linux.org.uk, raven@themaw.net, mszeredi@redhat.com,
	linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/25] vfs: syscall: Add fsinfo() to query filesystem information [ver #14]
Date: Wed, 26 Jun 2019 11:58:49 +0200	[thread overview]
Message-ID: <20190626095848.fwcepme3cpwdluwz@brauner.io> (raw)
In-Reply-To: <7560.1561542552@warthog.procyon.org.uk>

On Wed, Jun 26, 2019 at 10:49:12AM +0100, David Howells wrote:
> Christian Brauner <christian@brauner.io> wrote:
> 
> > > +	return sizeof(*p);
> > 
> > Hm, the discrepancy between the function signature returning int and
> > the sizeof operator most likely being size_t is bothering me. It
> > probably doesn't matter but maybe we can avoid that.
> 
> If sizeof(*p) exceeds 4096, the buffer is going to have been overrun by this
> point anyway.

Ok.

> 
> The function can't return size_t, though it could return ssize_t.  I could
> switch it to return long or even store the result in fsinfo_kparams::usage and
> return 0.
> 
> > > +	strlcpy(p->f_fs_name, path->dentry->d_sb->s_type->name,
> > > +		sizeof(p->f_fs_name));
> > 
> > Truncation is acceptable or impossible I assume?
> 
> I'm hoping that file_system_type::name isn't going to exceed 15 chars plus
> NUL.  If it does, it will be truncated.  I don't really want to add an
> individual attribute just for the filesystem driver name.
> 
> > > +#define _gen(X, Y) FSINFO_ATTR_##X: return fsinfo_generic_##Y(path, params->buffer)
> > 
> > I'm really not sure that this helps readability in the switch below... :)
> > 
> > > +
> > > +	switch (params->request) {
> > > +	case _gen(STATFS,		statfs);
> > > +	case _gen(IDS,			ids);
> > > +	case _gen(LIMITS,		limits);
> > > +	case _gen(SUPPORTS,		supports);
> > > +	case _gen(CAPABILITIES,		capabilities);
> > > +	case _gen(TIMESTAMP_INFO,	timestamp_info);
> > > ...
> 
> I'm trying to avoid having to spend multiple lines per case and tabulation
> makes things easier to read.  So
> 
> 	case FSINFO_ATTR_SUPPORTS:		return fsinfo_generic_supports(path, params->buffer);
> 	case FSINFO_ATTR_CAPABILITIES:		return fsinfo_generic_capabilities(path, params->buffer);
> 	case FSINFO_ATTR_TIMESTAMP_INFO:	return fsinfo_generic_timestamp_info(path, params->buffer);
> 
> is a bit on the long side per line, whereas:
> 
> 	case FSINFO_ATTR_SUPPORTS:
> 		return fsinfo_generic_supports(path, params->buffer);
> 	case FSINFO_ATTR_CAPABILITIES:
> 		return fsinfo_generic_capabilities(path, params->buffer);
> 	case FSINFO_ATTR_TIMESTAMP_INFO:
> 		return fsinfo_generic_timestamp_info(path, params->buffer);
> 
> is less readable by interleaving two of the three columns.  (Note that _gen is
> a actually third column as I introduce alternatives later).
> 
> > > +		if (ret <= (int)params->buf_size)
> > 
> > He, and this is where the return value discrepancy hits again. Just
> > doesn't look nice tbh. :)
> 
> No.  That's dealing with signed/unsigned comparison.  It might be better if I
> change this to:
> 
> 		if (IS_ERR_VALUE(ret))
> 			return ret; /* Error */
> 		if ((unsigned int)ret <= params->buf_size)
> 			return ret; /* It fitted */
> 
> In any case, buf_size isn't permitted to be larger than INT_MAX due to a check
> later in the loop.
> 
> > > +		kvfree(params->buffer);
> > 
> > That means callers should always memset fsinfo_kparams or this is an
> > invalid free...
> 
> vfs_info() isn't a public function.  And, in any case, the caller *must*
> provide a buffer here.
> 
> > > + * Return buffer information by requestable attribute.
> > > + *
> > > + * STRUCT indicates a fixed-size structure with only one instance.
> > > ...
> > I honestly have a hard time following the documentation here
> 
> How about:
> 
>  * STRUCT	- a fixed-size structure with only one instance.
>  * STRUCT_N	- a sequence of STRUCTs, indexed by Nth
>  * STRUCT_NM	- a sequence of sequences of STRUCTs, indexed by Nth, Mth
>  * STRING	- a string with only one instance.
>  * STRING_N	- a sequence of STRING, indexed by Nth
>  * STRING_NM	- a sequence of sequences of STRING, indexed by Nth, Mth
>  * OPAQUE	- a blob that can be larger than 4K.
>  * STRUCT_ARRAY - an array of structs that can be larger than 4K
> 
> > and that monster table/macro thing below.  For example, STRUCT_NM
> > corresponds to __FSINFO_NM or what?
> 
> STRUCT_NM -> .type = __FSINFO_STRUCT, .flags = __FSINFO_NM, .size = ...
> 
> If you think this is bad, you should try looking at the device ID tables used
> by the drivers and the attribute tables;-)
> 
> I could spell out the flag and type in the macro defs (such as the body of
> FSINFO_STRING(X,Y) for instance).  It would make it harder to compare macros
> as it wouldn't then tabulate, though.
> 
> > And is this uapi as you're using this in your samples/test below?
> 
> Not exactly.  Each attribute is defined as being a certain type in the
> documentation in the UAPI header, but this is not coded there.  The assumption
> being that if you're using a particular attribute, you'll know what the type
> of the attribute is and you'll structure your code appropriately.
> 
> The reason the sample code has this replicated is that it doesn't really
> attempt to interpret the type per se.  It has a dumper for an individual
> attribute value, but the table tells it whether there should be one of those,
> N of those or N of M(0), M(1), M(2), ... of those so that it can report an
> error if it doesn't see what it expects.
> 
> I could even cheaply provide a meta attribute that dumps the contents of the
> table (just the type info, not the names).
> 
> > > ...
> > > +	FSINFO_STRING		(NAME_ENCODING,		-),
> > > +	FSINFO_STRING		(NAME_CODEPAGE,		-),
> > > +};
> > 
> > Can I complain again that this is really annoying to parse.
> 
> Apparently you can;-)  What would you prefer?  This:
> 
> static const struct fsinfo_attr_info fsinfo_buffer_info[FSINFO_ATTR__NR] = {
> 	[FSINFO_STATFS] = {
> 		.type	= __FSINFO_STRUCT,
> 		.size	= sizeof(struct fsinfo_statfs),
> 	},
> 	[FSINFO_SERVERS] = {
> 		.type	= __FSINFO_STRUCT,
> 		.flags	= __FSINFO_NM,
> 		.size	= sizeof(struct fsinfo_server),
> 	},
> 	...
> };	
> 
> That has 3-5 lines for each 1 in the current code and isn't a great deal more
> readable.

Really, I find this more readable because parsing structs and arrays of
structs is probably still even more common for C programmers then
deciphering nested macros. :) But I won't enforce my own pov. :)

> 
> > if (copy_to_user()) and if (clear_user()) and not if (clear_user() != 0)
> 
> Better "if (copy_to_user() != 0)" since it's not a boolean return value in
> either case.
> 
> > Nit: There's a bunch of name inconsistency for the arguments between the
> > stub and the definition:
> > 
> > SYSCALL_DEFINE5(fsinfo,
> > 		int, dfd, const char __user *, filename,
> > 		struct fsinfo_params __user *, _params,
> > 		void __user *, user_buffer, size_t, user_buf_size)
> 
> Yeah.  C just doesn't care.
> 
> I'll change filename to pathname throughout.  That's at least consistent with
> various glibc manpages for other vfs syscalls.
> 
> _params I can change to params and params as-was to kparams.
> 
> But user_buffer and user_buf_size, I'll keep as I've named them such to avoid
> confusion with kparams->buffer and kparams->scratch_buffer.  However, I
> wouldn't want to call them that in the UAPI.

Yep, it's really just that I prefer the naming to be consistent. :)

> 
> > Do we do SPDX that way? Or isn't this just supposed to be:
> > // <spdxy stuff>
> 
> Look in, say, include/uapi/linux/stat.h or .../fs.h.
> 
> > > +	FSINFO_ATTR__NR
> > 
> > Nit/Bikeshed: FSINFO_ATTR_MAX? Seems more intuitive.
> 
> No.  That would imply a limit that it will never exceed.
> 
> > > +struct fsinfo_u128 {
> > > +#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
> > > +	__u64	hi;
> > > +	__u64	lo;
> > > +#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
> > > +	__u64	lo;
> > > +	__u64	hi;
> > > +#endif
> > > +};
> > 
> > Hm, I know why you do this custom fsinfo_u128 thingy but for userspace
> > that is going to be annoying to operate with, e.g. comparing the
> > size/space of two filesystems etc.
> 
> We don't have a __u128 in the UAPI, and I'm reluctant to use __uint128_t.
> 
> Do you have a better suggestion?
> 
> > > +struct fsinfo_ids {
> > > +	char	f_fs_name[15 + 1];	/* Filesystem name */
> > 
> > You should probably make this a macro so userspace can use it in fs-name
> > length checks too.
> 
> The name length, you mean?  Well, you can use sizeof...
> 
> > > +	FSINFO_CAP__NR
> > 
> > Hm, again, maybe better to use FSINFO_CAP_MAX?
> 
> It's not a limit.

Well, in both cases it's giving the limit of currently supported
attributes. Other places in the kernel do the same (netlink for
example). Anyway, it probably doesn't matter that much.

Christian

  reply	other threads:[~2019-06-26  9:59 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-24 14:08 [PATCH 00/25] VFS: Introduce filesystem information query syscall [ver #14] David Howells
2019-06-24 14:08 ` [PATCH 01/25] vfs: syscall: Add fsinfo() to query filesystem information " David Howells
2019-06-25  8:28   ` Christian Brauner
2019-06-26  9:49   ` David Howells
2019-06-26  9:58     ` Christian Brauner [this message]
2019-06-24 14:09 ` [PATCH 02/25] fsinfo: Add syscalls to other arches " David Howells
2019-06-25  9:31   ` Christian Brauner
2019-06-24 14:09 ` [PATCH 03/25] vfs: Allow fsinfo() to query what's in an fs_context " David Howells
2019-06-25  9:27   ` Christian Brauner
2019-06-26 10:02   ` David Howells
2019-06-26 10:06     ` Christian Brauner
2019-06-24 14:09 ` [PATCH 04/25] vfs: Allow fsinfo() to be used to query an fs parameter description " David Howells
2019-06-25  9:40   ` Christian Brauner
2019-06-24 14:09 ` [PATCH 05/25] vfs: Implement parameter value retrieval with fsinfo() " David Howells
2019-06-25  9:44   ` Christian Brauner
2019-06-24 14:09 ` [PATCH 06/25] fsinfo: Implement retrieval of LSM parameters " David Howells
2019-06-24 14:09 ` [PATCH 07/25] vfs: Introduce a non-repeating system-unique superblock ID " David Howells
2019-06-24 14:09 ` [PATCH 08/25] vfs: Allow fsinfo() to look up a mount object by " David Howells
2019-06-26  9:49   ` Christian Brauner
2019-06-24 14:10 ` [PATCH 09/25] vfs: Add mount notification count " David Howells
2019-06-26  9:52   ` Christian Brauner
2019-06-24 14:10 ` [PATCH 10/25] vfs: Allow mount information to be queried by fsinfo() " David Howells
2019-06-26  9:53   ` Christian Brauner
2019-06-24 14:10 ` [PATCH 11/25] vfs: fsinfo sample: Mount listing program " David Howells
2019-06-24 14:10 ` [PATCH 12/25] fsinfo: Add API documentation " David Howells
2019-06-24 14:10 ` [PATCH 13/25] hugetlbfs: Add support for fsinfo() " David Howells
2019-06-24 14:10 ` [PATCH 14/25] kernfs, cgroup: Add fsinfo support " David Howells
2019-06-24 14:11 ` [PATCH 15/25] fsinfo: Support SELinux superblock parameter retrieval " David Howells
2019-06-24 14:11 ` [PATCH 16/25] fsinfo: Support Smack " David Howells
2019-06-24 14:11 ` [PATCH 17/25] afs: Support fsinfo() " David Howells
2019-06-24 14:11 ` [PATCH 18/25] fsinfo: proc - add sb operation " David Howells
2019-06-24 14:11 ` [PATCH 19/25] fsinfo: autofs " David Howells
2019-06-24 14:11 ` [PATCH 20/25] fsinfo: shmem - add tmpfs " David Howells
2019-06-24 14:11 ` [PATCH 21/25] fsinfo: devpts - add " David Howells
2019-06-24 14:12 ` [PATCH 22/25] fsinfo: pstore " David Howells
2019-06-24 14:12 ` [PATCH 23/25] fsinfo: debugfs " David Howells
2019-06-24 14:12 ` [PATCH 24/25] fsinfo: bpf " David Howells
2019-06-24 14:12 ` [PATCH 25/25] fsinfo: ufs " David Howells
2019-06-26 10:05 ` [PATCH 00/25] VFS: Introduce filesystem information query syscall " Christian Brauner
2019-06-26 10:42   ` Ian Kent
2019-06-26 10:47     ` Christian Brauner
2019-06-27  0:38       ` Ian Kent
2019-06-26 13:19   ` Christian Brauner
2019-06-26 14:31   ` David Howells
2019-06-26 14:50     ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190626095848.fwcepme3cpwdluwz@brauner.io \
    --to=christian@brauner.io \
    --cc=dhowells@redhat.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mszeredi@redhat.com \
    --cc=raven@themaw.net \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).