All of lore.kernel.org
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Jeff Mahoney <jeffm@suse.com>
Cc: Dave Chinner <david@fromorbit.com>, Mark Fasheh <mfasheh@suse.de>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Linux Btrfs <linux-btrfs@vger.kernel.org>,
	Miklos Szeredi <miklos@szeredi.hu>,
	David Howells <dhowells@redhat.com>, Jan Kara <jack@suse.cz>
Subject: Re: [RFC][PATCH 0/76] vfs: 'views' for filesystems with more than one root
Date: Wed, 6 Jun 2018 12:49:48 +0300	[thread overview]
Message-ID: <CAOQ4uxjUwi7dhTbNk1mK6_=rb_+L0JB1GG_mkGSDrudtocXScg@mail.gmail.com> (raw)
In-Reply-To: <5b2ae799-1595-c262-7b65-41b10c11906d@suse.com>

On Tue, Jun 5, 2018 at 11:17 PM, Jeff Mahoney <jeffm@suse.com> wrote:
> Sorry, just getting back to this.
>
> On 5/9/18 2:41 AM, Dave Chinner wrote:
>> On Tue, May 08, 2018 at 10:06:44PM -0400, Jeff Mahoney wrote:
>>> On 5/8/18 7:38 PM, Dave Chinner wrote:
>>>> On Tue, May 08, 2018 at 11:03:20AM -0700, Mark Fasheh wrote:
>>>>> Hi,
>>>>>
>>>>> The VFS's super_block covers a variety of filesystem functionality. In
>>>>> particular we have a single structure representing both I/O and
>>>>> namespace domains.
>>>>>
>>>>> There are requirements to de-couple this functionality. For example,
>>>>> filesystems with more than one root (such as btrfs subvolumes) can
>>>>> have multiple inode namespaces. This starts to confuse userspace when
>>>>> it notices multiple inodes with the same inode/device tuple on a
>>>>> filesystem.
>>>>

Speaking as someone who joined this discussion late, maybe years after
it started, it would help to get an overview of existing problems and how
fs_view aims to solve them.

I do believe that both Overlayfs and Btrfs can benefit from a layer of
abstraction
in the VFS, but I think it is best if we start with laying all the
common problems
and then see how a solution would look like.

Even the name of the abstraction (fs_view) doesn't make it clear to me what
it is we are abstracting (security context? st_dev? what else?). probably best
to try to describe the abstraction from user POV rather then give sporadic
examples of what MAY go into fs_view.

While at it, need to see if this discussion has any intersections with David
Howell's fs_context work, because if we consider adding sub volume support
to VFS, we may want to leave some reserved bits in the API for it.

[...]

> One thing is clear: If we want to solve the btrfs and overlayfs problems
> in the same way, the view approach with a simple static mapping doesn't
> work.  Sticking something between the inode and superblock doesn't get
> the job done when the belongs to a different file system.  Overlayfs
> needs a per-object remapper, which means a callback that takes a path.
> Suddenly the way we do things in the SUSE kernel doesn't seem so hacky
> anymore.
>

And what is the SUSE way?

> I'm not sure we need the same solution for btrfs and overlayfs.  It's
> not the same problem.  Every object in overlayfs as a unique mapping
> already.  If we report s_dev and i_ino from the inode, it still maps to
> a unique user-visible object.  It may not map back to the overlayfs
> name, but that's a separate issue that's more difficult to solve.  The
> btrfs issue isn't one of presenting an alternative namespace to the
> user.  Btrfs has multiple internal namespaces and no way to publish them
> to the rest of the kernel.
>

FYI, the Overlayfs file/inode mapping is about to change with many
VFS hacks queued for removal, so stay tuned.

[...]

>> My point is that if we are talking about infrastructure to remap
>> what userspace sees from different mountpoint views into a
>> filesystem, then it should be done above the filesystem layers in
>> the VFS so all filesystems behave the same way. And in this case,
>> the vfsmount maps exactly to the "fs_view" that Mark has proposed we
>> add to the superblock.....
>
> It's proposed to be above the superblock with a default view in the
> superblock.  It would sit between the inode and the superblock so we
> have access to it anywhere we already have an inode.  That's the main
> difference.  We already have the inode everywhere it's needed.  Plumbing
> a vfsmount everywhere needed means changing code that only requires an
> inode and doesn't need a vfsmount.
>
> The two biggest problem areas:
> - Writeback tracepoints print a dev/inode pair.  Do we want to plumb a
> vfsmount into __mark_inode_dirty, super_operations->write_inode,
> __writeback_single_inode, writeback_sb_inodes, etc?
> - Audit.  As it happens, most of audit has a path or file that can be
> used.  We do run into problems with fsnotify.  fsnotify_move is called
> from vfs_rename which turns into a can of worms pretty quickly.
>

Can you please elaborate on that problem.
Do you mean when watching a directory for changes, you need to
be able to tell in which fs_view the directory inode that is being watched?

>>> It makes sense for that to be above the
>>> superblock because the file system doesn't care about them.  We're
>>> interested in the inode namespace, which for every other file system can
>>> be described using an inode and a superblock pair, but btrfs has another
>>> layer in the middle: inode -> btrfs_root -> superblock.
>>
>> Which seems to me to be irrelevant if there's a vfsmount per
>> subvolume that can hold per-subvolume information.
>
> I disagree.  There are a ton of places where we only have access to an
> inode and only need access to an inode.  It also doesn't solve the
> overlayfs issue.
>

I have an interest of solving another problem.
In VFS operations where only inode is available, I would like to be able to
report fsnotify events (e.g. fsnotify_move()) only in directories under a
certain subtree root. That could be achieved either by bind mount the subtree
root and passing vfsmount into vfs_rename() or by defining an fs_view on the
subtree and mounting that fs_view.

Thanks,
Amir.

  reply	other threads:[~2018-06-06  9:49 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-08 18:03 [RFC][PATCH 0/76] vfs: 'views' for filesystems with more than one root Mark Fasheh
2018-05-08 18:03 ` [PATCH 01/76] vfs: Introduce struct fs_view Mark Fasheh
2018-05-08 18:03 ` [PATCH 02/76] arch: Use inode_sb() helper instead of inode->i_sb Mark Fasheh
2018-05-08 18:03 ` [PATCH 03/76] drivers: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 04/76] fs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 05/76] include: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 06/76] ipc: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 07/76] kernel: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 08/76] mm: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 09/76] net: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 10/76] security: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 11/76] fs/9p: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 12/76] fs/adfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 13/76] fs/affs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 14/76] fs/afs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 15/76] fs/autofs4: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 16/76] fs/befs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 17/76] fs/bfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 18/76] fs/btrfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 19/76] fs/ceph: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 20/76] fs/cifs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 21/76] fs/coda: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 22/76] fs/configfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 23/76] fs/cramfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 24/76] fs/crypto: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 25/76] fs/ecryptfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 26/76] fs/efivarfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 27/76] fs/efs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 28/76] fs/exofs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 29/76] fs/exportfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 30/76] fs/ext2: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 31/76] fs/ext4: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 32/76] fs/f2fs: " Mark Fasheh
2018-05-10 10:10   ` Chao Yu
2018-05-08 18:03 ` [PATCH 33/76] fs/fat: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 34/76] fs/freevxfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 35/76] fs/fuse: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 36/76] fs/gfs2: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 37/76] fs/hfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 38/76] fs/hfsplus: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 39/76] fs/hostfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 40/76] fs/hpfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 41/76] fs/hugetlbfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 42/76] fs/isofs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 43/76] fs/jbd2: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 44/76] fs/jffs2: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 45/76] fs/jfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 46/76] fs/kernfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 47/76] fs/lockd: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 48/76] fs/minix: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 49/76] fs/nfsd: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 50/76] fs/nfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 51/76] fs/nilfs2: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 52/76] fs/notify: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 53/76] fs/ntfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 54/76] fs/ocfs2: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 55/76] fs/omfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 56/76] fs/openpromfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 57/76] fs/orangefs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 58/76] fs/overlayfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 59/76] fs/proc: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 60/76] fs/qnx4: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 61/76] fs/qnx6: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 62/76] fs/quota: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 63/76] fs/ramfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 64/76] fs/read: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 65/76] fs/reiserfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 66/76] fs/romfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 67/76] fs/squashfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 68/76] fs/sysv: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 69/76] fs/ubifs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 70/76] fs/udf: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 71/76] fs/ufs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 72/76] fs/xfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 73/76] vfs: Move s_dev to to struct fs_view Mark Fasheh
2018-05-08 18:04 ` [PATCH 74/76] fs: Use fs_view device from struct inode Mark Fasheh
2018-05-08 18:04 ` [PATCH 75/76] fs: Use fs view device from struct super_block Mark Fasheh
2018-05-08 18:04 ` [PATCH 76/76] btrfs: Use fs_view in roots, point inodes to it Mark Fasheh
2018-05-08 23:38 ` [RFC][PATCH 0/76] vfs: 'views' for filesystems with more than one root Dave Chinner
2018-05-09  2:06   ` Jeff Mahoney
2018-05-09  6:41     ` Dave Chinner
2018-06-05 20:17       ` Jeff Mahoney
2018-06-06  9:49         ` Amir Goldstein [this message]
2018-06-06 20:42           ` Mark Fasheh
2018-06-07  6:06             ` Amir Goldstein
2018-06-07 20:44               ` Mark Fasheh
2018-06-06 21:19           ` Jeff Mahoney
2018-06-07  6:17             ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxjUwi7dhTbNk1mK6_=rb_+L0JB1GG_mkGSDrudtocXScg@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=david@fromorbit.com \
    --cc=dhowells@redhat.com \
    --cc=jack@suse.cz \
    --cc=jeffm@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mfasheh@suse.de \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.