linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Jeff Mahoney <jeffm@suse.com>
Cc: Dave Chinner <david@fromorbit.com>, Mark Fasheh <mfasheh@suse.de>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Linux Btrfs <linux-btrfs@vger.kernel.org>,
	Miklos Szeredi <miklos@szeredi.hu>,
	David Howells <dhowells@redhat.com>, Jan Kara <jack@suse.cz>
Subject: Re: [RFC][PATCH 0/76] vfs: 'views' for filesystems with more than one root
Date: Wed, 6 Jun 2018 12:49:48 +0300	[thread overview]
Message-ID: <CAOQ4uxjUwi7dhTbNk1mK6_=rb_+L0JB1GG_mkGSDrudtocXScg@mail.gmail.com> (raw)
In-Reply-To: <5b2ae799-1595-c262-7b65-41b10c11906d@suse.com>

On Tue, Jun 5, 2018 at 11:17 PM, Jeff Mahoney <jeffm@suse.com> wrote:
> Sorry, just getting back to this.
>
> On 5/9/18 2:41 AM, Dave Chinner wrote:
>> On Tue, May 08, 2018 at 10:06:44PM -0400, Jeff Mahoney wrote:
>>> On 5/8/18 7:38 PM, Dave Chinner wrote:
>>>> On Tue, May 08, 2018 at 11:03:20AM -0700, Mark Fasheh wrote:
>>>>> Hi,
>>>>>
>>>>> The VFS's super_block covers a variety of filesystem functionality. In
>>>>> particular we have a single structure representing both I/O and
>>>>> namespace domains.
>>>>>
>>>>> There are requirements to de-couple this functionality. For example,
>>>>> filesystems with more than one root (such as btrfs subvolumes) can
>>>>> have multiple inode namespaces. This starts to confuse userspace when
>>>>> it notices multiple inodes with the same inode/device tuple on a
>>>>> filesystem.
>>>>

Speaking as someone who joined this discussion late, maybe years after
it started, it would help to get an overview of existing problems and how
fs_view aims to solve them.

I do believe that both Overlayfs and Btrfs can benefit from a layer of
abstraction
in the VFS, but I think it is best if we start with laying all the
common problems
and then see how a solution would look like.

Even the name of the abstraction (fs_view) doesn't make it clear to me what
it is we are abstracting (security context? st_dev? what else?). probably best
to try to describe the abstraction from user POV rather then give sporadic
examples of what MAY go into fs_view.

While at it, need to see if this discussion has any intersections with David
Howell's fs_context work, because if we consider adding sub volume support
to VFS, we may want to leave some reserved bits in the API for it.

[...]

> One thing is clear: If we want to solve the btrfs and overlayfs problems
> in the same way, the view approach with a simple static mapping doesn't
> work.  Sticking something between the inode and superblock doesn't get
> the job done when the belongs to a different file system.  Overlayfs
> needs a per-object remapper, which means a callback that takes a path.
> Suddenly the way we do things in the SUSE kernel doesn't seem so hacky
> anymore.
>

And what is the SUSE way?

> I'm not sure we need the same solution for btrfs and overlayfs.  It's
> not the same problem.  Every object in overlayfs as a unique mapping
> already.  If we report s_dev and i_ino from the inode, it still maps to
> a unique user-visible object.  It may not map back to the overlayfs
> name, but that's a separate issue that's more difficult to solve.  The
> btrfs issue isn't one of presenting an alternative namespace to the
> user.  Btrfs has multiple internal namespaces and no way to publish them
> to the rest of the kernel.
>

FYI, the Overlayfs file/inode mapping is about to change with many
VFS hacks queued for removal, so stay tuned.

[...]

>> My point is that if we are talking about infrastructure to remap
>> what userspace sees from different mountpoint views into a
>> filesystem, then it should be done above the filesystem layers in
>> the VFS so all filesystems behave the same way. And in this case,
>> the vfsmount maps exactly to the "fs_view" that Mark has proposed we
>> add to the superblock.....
>
> It's proposed to be above the superblock with a default view in the
> superblock.  It would sit between the inode and the superblock so we
> have access to it anywhere we already have an inode.  That's the main
> difference.  We already have the inode everywhere it's needed.  Plumbing
> a vfsmount everywhere needed means changing code that only requires an
> inode and doesn't need a vfsmount.
>
> The two biggest problem areas:
> - Writeback tracepoints print a dev/inode pair.  Do we want to plumb a
> vfsmount into __mark_inode_dirty, super_operations->write_inode,
> __writeback_single_inode, writeback_sb_inodes, etc?
> - Audit.  As it happens, most of audit has a path or file that can be
> used.  We do run into problems with fsnotify.  fsnotify_move is called
> from vfs_rename which turns into a can of worms pretty quickly.
>

Can you please elaborate on that problem.
Do you mean when watching a directory for changes, you need to
be able to tell in which fs_view the directory inode that is being watched?

>>> It makes sense for that to be above the
>>> superblock because the file system doesn't care about them.  We're
>>> interested in the inode namespace, which for every other file system can
>>> be described using an inode and a superblock pair, but btrfs has another
>>> layer in the middle: inode -> btrfs_root -> superblock.
>>
>> Which seems to me to be irrelevant if there's a vfsmount per
>> subvolume that can hold per-subvolume information.
>
> I disagree.  There are a ton of places where we only have access to an
> inode and only need access to an inode.  It also doesn't solve the
> overlayfs issue.
>

I have an interest of solving another problem.
In VFS operations where only inode is available, I would like to be able to
report fsnotify events (e.g. fsnotify_move()) only in directories under a
certain subtree root. That could be achieved either by bind mount the subtree
root and passing vfsmount into vfs_rename() or by defining an fs_view on the
subtree and mounting that fs_view.

Thanks,
Amir.

  reply	other threads:[~2018-06-06  9:49 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-08 18:03 Mark Fasheh
2018-05-08 18:03 ` [PATCH 01/76] vfs: Introduce struct fs_view Mark Fasheh
2018-05-08 18:03 ` [PATCH 02/76] arch: Use inode_sb() helper instead of inode->i_sb Mark Fasheh
2018-05-08 18:03 ` [PATCH 03/76] drivers: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 04/76] fs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 05/76] include: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 06/76] ipc: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 07/76] kernel: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 08/76] mm: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 09/76] net: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 10/76] security: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 11/76] fs/9p: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 12/76] fs/adfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 13/76] fs/affs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 14/76] fs/afs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 15/76] fs/autofs4: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 16/76] fs/befs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 17/76] fs/bfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 18/76] fs/btrfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 19/76] fs/ceph: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 20/76] fs/cifs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 21/76] fs/coda: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 22/76] fs/configfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 23/76] fs/cramfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 24/76] fs/crypto: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 25/76] fs/ecryptfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 26/76] fs/efivarfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 27/76] fs/efs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 28/76] fs/exofs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 29/76] fs/exportfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 30/76] fs/ext2: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 31/76] fs/ext4: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 32/76] fs/f2fs: " Mark Fasheh
2018-05-10 10:10   ` Chao Yu
2018-05-08 18:03 ` [PATCH 33/76] fs/fat: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 34/76] fs/freevxfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 35/76] fs/fuse: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 36/76] fs/gfs2: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 37/76] fs/hfs: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 38/76] fs/hfsplus: " Mark Fasheh
2018-05-08 18:03 ` [PATCH 39/76] fs/hostfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 40/76] fs/hpfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 41/76] fs/hugetlbfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 42/76] fs/isofs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 43/76] fs/jbd2: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 44/76] fs/jffs2: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 45/76] fs/jfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 46/76] fs/kernfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 47/76] fs/lockd: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 48/76] fs/minix: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 49/76] fs/nfsd: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 50/76] fs/nfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 51/76] fs/nilfs2: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 52/76] fs/notify: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 53/76] fs/ntfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 54/76] fs/ocfs2: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 55/76] fs/omfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 56/76] fs/openpromfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 57/76] fs/orangefs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 58/76] fs/overlayfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 59/76] fs/proc: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 60/76] fs/qnx4: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 61/76] fs/qnx6: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 62/76] fs/quota: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 63/76] fs/ramfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 64/76] fs/read: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 65/76] fs/reiserfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 66/76] fs/romfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 67/76] fs/squashfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 68/76] fs/sysv: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 69/76] fs/ubifs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 70/76] fs/udf: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 71/76] fs/ufs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 72/76] fs/xfs: " Mark Fasheh
2018-05-08 18:04 ` [PATCH 73/76] vfs: Move s_dev to to struct fs_view Mark Fasheh
2018-05-08 18:04 ` [PATCH 74/76] fs: Use fs_view device from struct inode Mark Fasheh
2018-05-08 18:04 ` [PATCH 75/76] fs: Use fs view device from struct super_block Mark Fasheh
2018-05-08 18:04 ` [PATCH 76/76] btrfs: Use fs_view in roots, point inodes to it Mark Fasheh
2018-05-08 23:38 ` [RFC][PATCH 0/76] vfs: 'views' for filesystems with more than one root Dave Chinner
2018-05-09  2:06   ` Jeff Mahoney
2018-05-09  6:41     ` Dave Chinner
2018-06-05 20:17       ` Jeff Mahoney
2018-06-06  9:49         ` Amir Goldstein [this message]
2018-06-06 20:42           ` Mark Fasheh
2018-06-07  6:06             ` Amir Goldstein
2018-06-07 20:44               ` Mark Fasheh
2018-06-06 21:19           ` Jeff Mahoney
2018-06-07  6:17             ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxjUwi7dhTbNk1mK6_=rb_+L0JB1GG_mkGSDrudtocXScg@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=david@fromorbit.com \
    --cc=dhowells@redhat.com \
    --cc=jack@suse.cz \
    --cc=jeffm@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mfasheh@suse.de \
    --cc=miklos@szeredi.hu \
    --subject='Re: [RFC][PATCH 0/76] vfs: '\''views'\'' for filesystems with more than one root' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).