All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: Josef Bacik <josef@toxicpanda.com>,
	Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Amir Goldstein <amir73il@gmail.com>, Jan Kara <jack@suse.cz>,
	Chris Mason <clm@fb.com>, David Sterba <dsterba@suse.com>,
	linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 0/3] fanotify support for btrfs sub-volumes
Date: Mon, 6 Nov 2023 14:47:16 +0100	[thread overview]
Message-ID: <20231106-unser-fiskus-9d1eba9fc64c@brauner> (raw)
In-Reply-To: <ZUjcI1SE+a2t8n1v@infradead.org>

On Mon, Nov 06, 2023 at 04:29:23AM -0800, Christoph Hellwig wrote:
> On Mon, Nov 06, 2023 at 11:03:37AM +0100, Christian Brauner wrote:
> > But why do we care?
> > Current code already does need to know it is on a btrfs subvolume. They
> > all know that btrfs subvolumes are special.
> 
> "they all know" is a bit vague.  How do you know "all" code knows?

Granted, an over-generalization but non in any way different from
claiming that currently on one needs to know about btrfs subvolumes or
that the proposed vfsmount solution will make it magically so that no
one needs to care anymore.

Tools will have to change either way is my point. And a lot of tools do
already handle subvolumes specially exactly because of the non-unique
inode situation. And if they don't they still can get confused by seing
st_dev numbers they can't associate with a filesystem.

> > They will need to know that
> > btrfs subvolumes are special in the future even if they were vfsmounts.
> > They would likely end up with another kind of confusion because suddenly
> > vfsmounts have device numbers that aren't associated with the superblock
> > that vfsmount belongs to.
> 
> Let's take a step back.  Posix says st_ino is uniqueue for a given
> st_dev, and per posix a mount mount is defined as any file that
> has a different st_dev from the parent.  So by the Posix definition
> btrfs subvolume roots are mount points, which is am obvios clash
> with the Linux definition based on vfsmounts.

3.229 Mount Point
Either the system root directory or a directory for which the st_dev
field of structure stat differs from that of its parent directory.

I think that's just an argument against mapping subvolumes to vfsmounts.
Because bind-mounts don't change the device number - and they very much
shouldn't.

> 
> > > > If userspace requests STATX_SUBVOLUME in the request mask, the two
> > > > filesystems raise STATX_SUBVOLUME in the statx result mask and then also
> > > > return the _real_ device number of the superblock and stop exposing that
> > > > made up device number.
> > > 
> > > What is a "real" device number?
> > 
> > The device number of the superblock of the btrfs filesystem and not some
> > made-up device number.
> 
> The block device st_dev is just as made up.
> 
> > I care about not making a btrfs specific problem the vfs's problem by
> > hoisting that whole problem space a level up by mapping subvolumes to
> > vfsmounts.
> 
> While I'd love to fix it, and evern more not have more of this
> crap sneak in (*cough* bcachefs, *cough*). І'm ok with that stance.
> But that also means we can't let this creep into the vfs by other
> means, which is what started the thread.

The thing is I'm not even sure there's anything to fix.

This discussion started with btrfs maybe getting an alternative way to
uniquify an inode independent of st_dev.

I'm not sure that is such a massive problem.

If we give both btrfs and bcachefs a single flag in statx() that allows
_interested_ userspace to query whether a file is located on a subvolume
that shouldn't be a problem (We have STATX_ATTR_* which identifies
additional properties that are restricted to few filesytems).

And all the specific gobbledigook can be implemented as an ioctl() -
ideally both btrfs and bcachefs agree on something - that the vfs
doesn't have to care about at all.

I genuinely don't care if they report a fake st_dev from stat(). I
genuinely _do_ care that we don't make vfsmounts privy to this.

Let alone that automounts are a giant paint. Not just do they iirc allow
to create shadow mounts, they also interact with namespace and container
creation.

If you spawn thousands of containers each with a private mount namespace
- which is the default - you now trigger automounts in thousands of
containers when triggering a lookup on btrfs. If you have mount
propagation turned on each automount may also propagate into god knows
how many other mount namespaces. That's just nasty.

IOW, making subvolumes vfsmounts will also have wider semantic
implications for using btrfs as a filesystem.

  reply	other threads:[~2023-11-06 13:47 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-25 13:50 [PATCH 0/3] fanotify support for btrfs sub-volumes Amir Goldstein
2023-10-25 13:50 ` [PATCH 1/3] fs: define a new super operation to get fsid Amir Goldstein
2023-10-25 13:50 ` [PATCH 2/3] btrfs: implement " Amir Goldstein
2023-10-25 13:50 ` [PATCH 3/3] fanotify: support reporting events with fid on btrfs sub-volumes Amir Goldstein
2023-10-25 15:34 ` [PATCH 0/3] fanotify support for " Christoph Hellwig
2023-10-25 17:04   ` Jan Kara
2023-10-27  5:44     ` Christoph Hellwig
2023-10-27 10:58       ` Jan Kara
2023-10-25 21:06   ` Josef Bacik
2023-10-25 23:02     ` Qu Wenruo
2023-10-26  5:49       ` Amir Goldstein
2023-10-27  5:46     ` Christoph Hellwig
2023-10-27 13:17       ` Josef Bacik
2023-10-27 13:47         ` Miklos Szeredi
2023-10-28  5:57         ` Amir Goldstein
2023-10-30 13:25         ` Christoph Hellwig
2023-10-31 12:14           ` Christian Brauner
2023-10-31 12:22             ` Christoph Hellwig
2023-10-31 12:50               ` Christian Brauner
2023-10-31 17:06                 ` Christoph Hellwig
2023-11-01  0:03                   ` Qu Wenruo
2023-11-03 14:21                     ` Christoph Hellwig
2023-11-01  8:16                   ` Christian Brauner
2023-11-01  8:41                     ` Qu Wenruo
2023-11-01  9:52                       ` Christian Brauner
2023-11-02  5:13                         ` Josef Bacik
2023-11-02  8:53                           ` Amir Goldstein
2023-11-02  9:48                           ` Christian Brauner
2023-11-02 12:34                             ` Josef Bacik
2023-11-02 17:07                               ` David Sterba
2023-11-02 20:32                                 ` Josef Bacik
2023-11-03  6:56                                 ` Christian Brauner
2023-11-03 13:52                                   ` Josef Bacik
2023-11-02 11:07                           ` Christian Brauner
2023-11-03 14:28                             ` Christoph Hellwig
2023-11-03 15:47                               ` Christian Brauner
2023-11-06  7:53                                 ` Christoph Hellwig
2023-11-06  8:18                                   ` Qu Wenruo
2023-11-06  9:56                                     ` Christian Brauner
2023-11-06 12:25                                     ` Christoph Hellwig
2023-11-06 10:03                                   ` Christian Brauner
2023-11-06 10:41                                     ` Qu Wenruo
2023-11-06 10:59                                       ` Christian Brauner
2023-11-06 12:30                                         ` Christoph Hellwig
2023-11-06 13:05                                           ` Christian Brauner
2023-11-06 17:10                                             ` Christoph Hellwig
2023-11-07  8:58                                               ` Christian Brauner
2023-11-08  7:56                                                 ` Christoph Hellwig
2023-11-08  8:09                                                   ` Christian Brauner
2023-11-08  8:12                                                     ` Christoph Hellwig
2023-11-08  8:22                                                       ` Christian Brauner
2023-11-08 14:07                                                         ` Christoph Hellwig
2023-11-08 15:57                                                           ` Christian Brauner
2023-11-06 12:29                                     ` Christoph Hellwig
2023-11-06 13:47                                       ` Christian Brauner [this message]
2023-11-06 17:13                                         ` Christoph Hellwig
2023-11-06 22:42                                           ` Josef Bacik
2023-11-07  9:06                                             ` Christian Brauner
2023-11-08  7:52                                               ` Christoph Hellwig
2023-11-08  8:27                                                 ` Christian Brauner
2023-11-08 14:08                                                   ` Christoph Hellwig
2023-11-08 16:16                                                     ` Christian Brauner
2023-11-08 16:20                                                       ` Christian Brauner
2023-11-09  6:55                                                         ` Christoph Hellwig
2023-11-09  9:07                                                           ` Christian Brauner
2023-11-09 14:41                                                             ` Christoph Hellwig
2023-11-10  9:33                                                               ` Christian Brauner
2023-11-10 10:31                                                                 ` Amir Goldstein
2023-11-09  6:53                                                       ` Christoph Hellwig
2023-11-08  7:51                                             ` Christoph Hellwig
2023-11-08 11:08                                               ` Jan Kara
2023-11-08 14:11                                                 ` Christoph Hellwig
2023-11-06  9:03                                 ` Jan Kara
2023-11-06  9:52                                   ` Christian Brauner
2023-11-06 12:22                                     ` Jan Kara
2023-11-03 14:23                       ` Christoph Hellwig
2023-11-03 14:22                     ` Christoph Hellwig
2023-10-25 17:17 ` Amir Goldstein
2023-10-25 18:02   ` Amir Goldstein
2023-10-26 12:17     ` Jan Kara
2023-10-26 12:36       ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231106-unser-fiskus-9d1eba9fc64c@brauner \
    --to=brauner@kernel.org \
    --cc=amir73il@gmail.com \
    --cc=clm@fb.com \
    --cc=dsterba@suse.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.