All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Howells <dhowells@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: dhowells@redhat.com, viro@zeniv.linux.org.uk, raven@themaw.net,
	mszeredi@redhat.com, christian@brauner.io,
	linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/19] vfs: syscall: Add fsinfo() to query filesystem information [ver #16]
Date: Thu, 20 Feb 2020 10:34:35 +0000	[thread overview]
Message-ID: <542411.1582194875@warthog.procyon.org.uk> (raw)
In-Reply-To: <20200219163128.GB9496@magnolia>

Darrick J. Wong <darrick.wong@oracle.com> wrote:

> > +	p->f_blocks.hi	= 0;
> > +	p->f_blocks.lo	= buf.f_blocks;
> 
> Er... are there filesystems (besides that (xfs++)++ one) that require
> u128 counters?  I suspect that the Very Large Fields are for future
> expandability, but I also wonder about the whether it's worth the
> complexity of doing this, since the structures can always be
> version-revved later.

I'm making a relatively cheap allowance for future expansion.  Dave Chinner
has mentioned at one of the LSFs that 16EiB may be exceeded soon (though I
hate to think of fscking such a beastie).  I know that the YFS variant of AFS
supports 96-bit vnode numbers (which I translate to inode numbers).  What I'm
trying to avoid is the problem we have with stat/statfs where under some
circumstances we have to return an error (ERANGE?) because we can't represent
the number if someone asks for an older version of the struct.

Since the buffer is (meant to be) pre-cleared, the filesystem can just ignore
the high word if it's never going to set it.  In fact, fsinfo_generic_statfs
doesn't need to set them either.

> XFS inodes are u64 values...
> ...and the max symlink target length is 1k, not PAGE_SIZE...

Yeah, and AFS(YFS) has 96-bit inode numbers.  The filesystem's fsinfo table is
read first so that the filesystem can override this.

> ...so is the usage model here that XFS should call fsinfo_generic_limits
> to fill out the fsinfo_limits structure, modify the values in
> ctx->buffer as appropriate for XFS, and then return the structure size?

Actually, I should export some these so that you can do that.  I'll export
fsinfo_generic_{timestamp_info,supports,limits} for now.

> > +#define FSINFO_ATTR_VOLUME_ID		0x05	/* Volume ID (string) */
> > +#define FSINFO_ATTR_VOLUME_UUID		0x06	/* Volume UUID (LE uuid) */
> > +#define FSINFO_ATTR_VOLUME_NAME		0x07	/* Volume name (string) */
> 
> I think I've muttered about the distinction between volume id and
> volume name before, but I'm still wondering how confusing that will be
> for users?  Let me check my assumptions, though:
> 
> Volume ID is whatever's in super_block.s_id, which (at least for xfs and
> ext4) is the device name (e.g. "sda1").  I guess that's useful for
> correlating a thing you can call fsinfo() on against strings that were
> logged in dmesg.
>
> Volume name I think is the fs label (e.g. "home"), which I think will
> have to be implemented separately by each filesystem, and that's why
> there's no generic vfs implementation.

Yes.  For AFS, for example, this would be the name of the volume (which may be
changed), whereas the volume ID is the number in the protocol that actually
refers to the volume (which cannot be changed).

And, as you say, for blockdev mounts, the ID is the device name and the volume
name is filesystem specific.

> The 7 -> 0 -> 1 sequence here confused me until I figured out that
> QUERY_TYPE is the mask for QUERY_{PATH,FD}.

Changed to FSINFO_FLAGS_QUERY_MASK.

> > +struct fsinfo_limits {
> > +...
> > +	__u32	__reserved[1];
> 
> I wonder if these structures ought to reserve more space than a single u32...

No need.  Part of the way the interface is designed is that the version number
for a particular VSTRUCT-type attribute is also the length.  So a newer
version is also longer.  All the old fields must be retained and filled in.
New fields are tagged on the end.

If userspace asks for an older version than is supported, it gets a truncated
return.  If it asks for a newer version, the extra fields it is expecting are
all set to 0.  Either way, the length (and thus the version) the kernel
supports is returned - not the length copied.

The __reserved fields are there because they represent padding (the struct is
going to be aligned/padded according to __u64 in this case).  Ideally, I'd
mark the structs __packed, but this messes with the alignment and may make the
compiler do funny tricks to get out any field larger than a byte.

I've renamed them to __padding.

> > +struct fsinfo_supports {
> > +	__u64	stx_attributes;		/* What statx::stx_attributes are supported */
> > +	__u32	stx_mask;		/* What statx::stx_mask bits are supported */
> > +	__u32	ioc_flags;		/* What FS_IOC_* flags are supported */
> 
> "IOC"?  That just means 'ioctl'.  Is this field supposed to return the
> supported FS_IOC_GETFLAGS flags, or the supported FS_IOC_FSGETXATTR
> flags?

FS_IOC_[GS]ETFLAGS is what I meant.

> I suspect it would also be a big help to be able to tell userspace which
> of the flags can be set, and which can be cleared.

How about:

	__u32	fs_ioc_getflags;	/* What FS_IOC_GETFLAGS may return */
	__u32	fs_ioc_setflags_set;	/* What FS_IOC_SETFLAGS may set */
	__u32	fs_ioc_setflags_clear;	/* What FS_IOC_SETFLAGS may clear */

> > +struct fsinfo_timestamp_one {
> > +	__s64	minimum;	/* Minimum timestamp value in seconds */
> > +	__u64	maximum;	/* Maximum timestamp value in seconds */
> 
> Given that time64_t is s64, why is the maximum here u64?

Well, I assume it extremely unlikely that the maximum can be before 1970, so
there doesn't seem any need to allow the maximum to be negative.  Furthermore,
it would be feasible that you could encounter a filesystem with a u64
filesystem that doesn't support dates before 1970.

On the other hand, if Linux is still going when __s64 seconds from 1970 wraps,
it will be impressive, but I'm not sure we'll be around to see it...  Someone
will have to cast a resurrection spell on Arnd to fix that one.

However, since signed/unsigned comparisons may have issues, I can turn it into
an __s64 also if that is preferred.

David


  parent reply	other threads:[~2020-02-20 10:34 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-18 17:04 [PATCH 00/19] VFS: Filesystem information and notifications [ver #16] David Howells
2020-02-18 17:05 ` [PATCH 01/19] vfs: syscall: Add fsinfo() to query filesystem information " David Howells
2020-02-19 16:31   ` Darrick J. Wong
2020-02-19 20:07   ` Jann Horn
2020-02-20 10:34   ` David Howells [this message]
2020-02-20 15:48     ` Darrick J. Wong
2020-02-20 11:03   ` David Howells
2020-02-20 14:54     ` Jann Horn
2020-02-20 15:31       ` Darrick J. Wong
2020-02-18 17:05 ` [PATCH 02/19] fsinfo: Add syscalls to other arches " David Howells
2020-02-21 14:51   ` Christian Brauner
2020-02-21 18:10     ` Geert Uytterhoeven
2020-02-18 17:05 ` [PATCH 03/19] fsinfo: Provide a bitmap of supported features " David Howells
2020-02-19 16:37   ` Darrick J. Wong
2020-02-20 12:22   ` David Howells
2020-02-18 17:05 ` [PATCH 04/19] vfs: Add mount change counter " David Howells
2020-02-21 14:48   ` Christian Brauner
2020-02-18 17:05 ` [PATCH 05/19] vfs: Introduce a non-repeating system-unique superblock ID " David Howells
2020-02-19 16:53   ` Darrick J. Wong
2020-02-20 12:45   ` David Howells
2020-02-18 17:05 ` [PATCH 06/19] vfs: Allow fsinfo() to look up a mount object by " David Howells
2020-02-21 15:09   ` Christian Brauner
2020-02-18 17:05 ` [PATCH 07/19] vfs: Allow mount information to be queried by fsinfo() " David Howells
2020-02-18 17:05 ` [PATCH 08/19] vfs: fsinfo sample: Mount listing program " David Howells
2020-02-18 17:06 ` [PATCH 09/19] fsinfo: Allow the mount topology propogation flags to be retrieved " David Howells
2020-02-18 17:06 ` [PATCH 10/19] fsinfo: Add API documentation " David Howells
2020-02-18 17:06 ` [PATCH 11/19] afs: Support fsinfo() " David Howells
2020-02-19 21:01   ` Jann Horn
2020-02-20 12:58   ` David Howells
2020-02-20 14:58     ` Jann Horn
2020-02-21 13:26     ` David Howells
2020-02-18 17:06 ` [PATCH 12/19] security: Add hooks to rule on setting a superblock or mount watch " David Howells
2020-02-18 17:06 ` [PATCH 13/19] vfs: Add a mount-notification facility " David Howells
2020-02-19 22:40   ` Jann Horn
2020-02-19 22:55     ` Jann Horn
2020-02-21 12:24   ` David Howells
2020-02-21 15:49     ` Jann Horn
2020-02-21 17:06     ` David Howells
2020-02-21 17:36       ` seq_lock and lockdep_is_held() assertions Jann Horn
2020-02-21 18:02         ` John Stultz
2020-02-18 17:06 ` [PATCH 14/19] notifications: sample: Display mount tree change notifications [ver #16] David Howells
2020-02-18 17:06 ` [PATCH 15/19] vfs: Add superblock " David Howells
2020-02-19 23:08   ` Jann Horn
2020-02-21 14:23   ` David Howells
2020-02-21 15:44     ` Jann Horn
2020-02-21 16:33     ` David Howells
2020-02-21 16:41       ` Jann Horn
2020-02-21 17:11       ` David Howells
2020-02-18 17:06 ` [PATCH 16/19] fsinfo: Provide superblock notification counter " David Howells
2020-02-18 17:07 ` [PATCH 17/19] notifications: sample: Display superblock notifications " David Howells
2020-02-18 17:07 ` [PATCH 18/19] ext4: Add example fsinfo information " David Howells
2020-02-19 17:04   ` Darrick J. Wong
2020-02-20  0:53   ` kbuild test robot
2020-02-20  0:53     ` kbuild test robot
2020-02-21 14:43   ` David Howells
2020-02-21 16:26     ` Darrick J. Wong
2020-02-18 17:07 ` [PATCH 19/19] nfs: Add example filesystem " David Howells
2020-02-20  2:13   ` kbuild test robot
2020-02-20  2:13     ` kbuild test robot
2020-02-20  2:20   ` kbuild test robot
2020-02-20  2:20     ` kbuild test robot
2020-02-18 18:12 ` David Howells
2020-02-19 10:23 ` [PATCH 00/19] VFS: Filesystem information and notifications " Stefan Metzmacher
2020-02-19 14:46 ` Christian Brauner
2020-02-19 15:50   ` Darrick J. Wong
2020-02-20  4:42   ` Ian Kent
2020-02-20  9:09     ` Christian Brauner
2020-02-20 11:30       ` Ian Kent
2020-02-19 16:16 ` David Howells
2020-02-21 12:57 ` David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=542411.1582194875@warthog.procyon.org.uk \
    --to=dhowells@redhat.com \
    --cc=christian@brauner.io \
    --cc=darrick.wong@oracle.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mszeredi@redhat.com \
    --cc=raven@themaw.net \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.