All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Josef Bacik <josef@toxicpanda.com>,
	Christoph Hellwig <hch@infradead.org>
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Amir Goldstein <amir73il@gmail.com>, Jan Kara <jack@suse.cz>,
	Chris Mason <clm@fb.com>, David Sterba <dsterba@suse.com>,
	linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 0/3] fanotify support for btrfs sub-volumes
Date: Thu, 2 Nov 2023 12:07:47 +0100	[thread overview]
Message-ID: <20231102-schafsfell-denkzettel-08da41113e24@brauner> (raw)
In-Reply-To: <20231102051349.GA3292886@perftesting>

> Btw I'm working on this, mostly to show Christoph it doesn't do what he thinks
> it does.
> 
> However I ran into some weirdness where I need to support the new mount API, so
> that's what I've been doing since I wandered away from this thread.  I should
> have that done tomorrow, and then I was going to do the S_AUTOMOUNT thing ontop
> of that.
> 
> But I have the same questions as you Christian, I'm not entirely sure how this
> is supposed to be better.  Even if they show up in /proc/mounts, it's not going
> to do anything useful for the applications that don't check /proc/mounts to see
> if they've wandered into a new mount.  I also don't quite understand how NFS
> suddenly knows it's wandered into a new mount with a vfsmount.

So the subvolumes-as-vfsmount solution was implemented already a few
years ago. I looked at that patchset and the crucial point in the
solution was patch [1].

show_mountinfo() is called under namespace_sem (see fs/namespace.c).
That thing is crucial for all mount namespaces, mount propagation and so
on. We can't cause IO under that unless we want to allow to trivially
deadlock the whole system by tricking us into talking to an unresponsive
NFS server or similar. And all vfs_getattr*() flavours can legitimately
cause IO even with AT_STATX_DONT_SYNC.

So exposing this via /proc/<pid>/mountinfo doesn't work. But that means
even if you make it a separate vfsmount you need to epose the device
information through another interface.

But at that point we really need to ask if it makes sense to use
vfsmounts per subvolume in the first place:

(1) We pollute /proc/<pid>/mountinfo with a lot of mounts.
(2) By calling ->getattr() from show_mountinfo() we open the whole
    system up to deadlocks.
(3) We change btrfs semantics drastically to the point where they need a
    new mount, module, or Kconfig option.
(4) We make (initial) lookup on btrfs subvolumes more heavyweight
    because you need to create a mount for the subvolume.

So right now, I don't see how we can make this work even if the concept
doesn't seem necessarily wrong.

Even if we were to go through with this and make each subvolume a
vfsmount but then don't expose the ->getattr() device numbers in
/proc/<pid>/mountinfo but instead add a separate retrieval method via
statx() we'd be creating even more confusion for userspace by showing
different device numbers in /proc/<pid>/mountinfo than in statx().

[1]:

Subject:        [PATCH 01/11] VFS: show correct dev num in mountinfo
Date:	 	Wed, 28 Jul 2021 08:37:45 +1000
Message-ID:	<162742546548.32498.10889023150565429936.stgit@noble.brown>

/proc/$PID/mountinfo contains a field for the device number of the
filesystem at each mount.

This is taken from the superblock ->s_dev field, which is correct for
every filesystem except btrfs.  A btrfs filesystem can contain multiple
subvols which each have a different device number.  If (a directory
within) one of these subvols is mounted, the device number reported in
mountinfo will be different from the device number reported by stat().

This confuses some libraries and tools such as, historically, findmnt.
Current findmnt seems to cope with the strangeness.

So instead of using ->s_dev, call vfs_getattr_nosec() and use the ->dev
provided.  As there is no STATX flag to ask for the device number, we
pass a request mask for zero, and also ask the filesystem to avoid
syncing with any remote service.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/proc_namespace.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
index 392ef5162655..f342a0231e9e 100644
--- a/fs/proc_namespace.c
+++ b/fs/proc_namespace.c
@@ -138,10 +138,16 @@ static int show_mountinfo(struct seq_file *m, struct vfsmount *mnt)
 	struct mount *r = real_mount(mnt);
 	struct super_block *sb = mnt->mnt_sb;
 	struct path mnt_path = { .dentry = mnt->mnt_root, .mnt = mnt };
+	struct kstat stat;
 	int err;
 
+	/* We only want ->dev, and there is no STATX flag for that,
+	 * so ask for nothing and assume we get ->dev
+	 */
+	vfs_getattr_nosec(&mnt_path, &stat, 0, AT_STATX_DONT_SYNC);
+
 	seq_printf(m, "%i %i %u:%u ", r->mnt_id, r->mnt_parent->mnt_id,
-		   MAJOR(sb->s_dev), MINOR(sb->s_dev));
+		   MAJOR(stat.dev), MINOR(stat.dev));
 	if (sb->s_op->show_path) {
 		err = sb->s_op->show_path(m, mnt->mnt_root);
 		if (err)


  parent reply	other threads:[~2023-11-02 11:07 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-25 13:50 [PATCH 0/3] fanotify support for btrfs sub-volumes Amir Goldstein
2023-10-25 13:50 ` [PATCH 1/3] fs: define a new super operation to get fsid Amir Goldstein
2023-10-25 13:50 ` [PATCH 2/3] btrfs: implement " Amir Goldstein
2023-10-25 13:50 ` [PATCH 3/3] fanotify: support reporting events with fid on btrfs sub-volumes Amir Goldstein
2023-10-25 15:34 ` [PATCH 0/3] fanotify support for " Christoph Hellwig
2023-10-25 17:04   ` Jan Kara
2023-10-27  5:44     ` Christoph Hellwig
2023-10-27 10:58       ` Jan Kara
2023-10-25 21:06   ` Josef Bacik
2023-10-25 23:02     ` Qu Wenruo
2023-10-26  5:49       ` Amir Goldstein
2023-10-27  5:46     ` Christoph Hellwig
2023-10-27 13:17       ` Josef Bacik
2023-10-27 13:47         ` Miklos Szeredi
2023-10-28  5:57         ` Amir Goldstein
2023-10-30 13:25         ` Christoph Hellwig
2023-10-31 12:14           ` Christian Brauner
2023-10-31 12:22             ` Christoph Hellwig
2023-10-31 12:50               ` Christian Brauner
2023-10-31 17:06                 ` Christoph Hellwig
2023-11-01  0:03                   ` Qu Wenruo
2023-11-03 14:21                     ` Christoph Hellwig
2023-11-01  8:16                   ` Christian Brauner
2023-11-01  8:41                     ` Qu Wenruo
2023-11-01  9:52                       ` Christian Brauner
2023-11-02  5:13                         ` Josef Bacik
2023-11-02  8:53                           ` Amir Goldstein
2023-11-02  9:48                           ` Christian Brauner
2023-11-02 12:34                             ` Josef Bacik
2023-11-02 17:07                               ` David Sterba
2023-11-02 20:32                                 ` Josef Bacik
2023-11-03  6:56                                 ` Christian Brauner
2023-11-03 13:52                                   ` Josef Bacik
2023-11-02 11:07                           ` Christian Brauner [this message]
2023-11-03 14:28                             ` Christoph Hellwig
2023-11-03 15:47                               ` Christian Brauner
2023-11-06  7:53                                 ` Christoph Hellwig
2023-11-06  8:18                                   ` Qu Wenruo
2023-11-06  9:56                                     ` Christian Brauner
2023-11-06 12:25                                     ` Christoph Hellwig
2023-11-06 10:03                                   ` Christian Brauner
2023-11-06 10:41                                     ` Qu Wenruo
2023-11-06 10:59                                       ` Christian Brauner
2023-11-06 12:30                                         ` Christoph Hellwig
2023-11-06 13:05                                           ` Christian Brauner
2023-11-06 17:10                                             ` Christoph Hellwig
2023-11-07  8:58                                               ` Christian Brauner
2023-11-08  7:56                                                 ` Christoph Hellwig
2023-11-08  8:09                                                   ` Christian Brauner
2023-11-08  8:12                                                     ` Christoph Hellwig
2023-11-08  8:22                                                       ` Christian Brauner
2023-11-08 14:07                                                         ` Christoph Hellwig
2023-11-08 15:57                                                           ` Christian Brauner
2023-11-06 12:29                                     ` Christoph Hellwig
2023-11-06 13:47                                       ` Christian Brauner
2023-11-06 17:13                                         ` Christoph Hellwig
2023-11-06 22:42                                           ` Josef Bacik
2023-11-07  9:06                                             ` Christian Brauner
2023-11-08  7:52                                               ` Christoph Hellwig
2023-11-08  8:27                                                 ` Christian Brauner
2023-11-08 14:08                                                   ` Christoph Hellwig
2023-11-08 16:16                                                     ` Christian Brauner
2023-11-08 16:20                                                       ` Christian Brauner
2023-11-09  6:55                                                         ` Christoph Hellwig
2023-11-09  9:07                                                           ` Christian Brauner
2023-11-09 14:41                                                             ` Christoph Hellwig
2023-11-10  9:33                                                               ` Christian Brauner
2023-11-10 10:31                                                                 ` Amir Goldstein
2023-11-09  6:53                                                       ` Christoph Hellwig
2023-11-08  7:51                                             ` Christoph Hellwig
2023-11-08 11:08                                               ` Jan Kara
2023-11-08 14:11                                                 ` Christoph Hellwig
2023-11-06  9:03                                 ` Jan Kara
2023-11-06  9:52                                   ` Christian Brauner
2023-11-06 12:22                                     ` Jan Kara
2023-11-03 14:23                       ` Christoph Hellwig
2023-11-03 14:22                     ` Christoph Hellwig
2023-10-25 17:17 ` Amir Goldstein
2023-10-25 18:02   ` Amir Goldstein
2023-10-26 12:17     ` Jan Kara
2023-10-26 12:36       ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231102-schafsfell-denkzettel-08da41113e24@brauner \
    --to=brauner@kernel.org \
    --cc=amir73il@gmail.com \
    --cc=clm@fb.com \
    --cc=dsterba@suse.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.