From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Djalal Harouni <tixxdz@gmail.com>, Chris Mason <clm@fb.com>,
	Theodore Tso <tytso@mit.edu>,
	Josh Triplett <josh@joshtriplett.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andy Lutomirski <luto@kernel.org>,
	Seth Forshee <seth.forshee@canonical.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	LSM List <linux-security-module@vger.kernel.org>,
	Dongsu Park <dongsu@endocode.com>,
	David Herrmann <dh.herrmann@googlemail.com>,
	Miklos Szeredi <mszeredi@redhat.com>,
	Alban Crequy <alban.crequy@gmail.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Phil Estes <estesp@gmail.com>
Subject: Re: [RFC 1/1] shiftfs: uid/gid shifting bind mount
Date: Mon, 06 Feb 2017 06:41:22 -0800
Message-ID: <1486392082.2474.27.camel@HansenPartnership.com>
In-Reply-To: <CAOQ4uxhQE6y5pRont7ejobU+fzQSiTQaQub8gZT=-K7UAZEbkA@mail.gmail.com>

On Mon, 2017-02-06 at 08:59 +0200, Amir Goldstein wrote:
> On Mon, Feb 6, 2017 at 3:18 AM, James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> > On Sun, 2017-02-05 at 09:51 +0200, Amir Goldstein wrote:
> > > On Sat, Feb 4, 2017 at 9:19 PM, James Bottomley
> > > <James.Bottomley@hansenpartnership.com> wrote:
> > > > This allows any subtree to be uid/gid shifted and bound 
> > > > > elsewhere.  It does this by operating similarly to overlayfs.
> > > > >  Its primary use is for shifting the underlying uids of
> > > > > filesystems used to support unprivileged (uid shifted)
> > > > > containers.  The usual use case here is that the container is
> > > > > operating with a uid shifted unprivileged root but sometimes
> > > > needs to make use of or work with a filesystem image that has
> > > > root at real uid 0.
> > > > 
> > > > The mechanism is to allow any subordinate mount namespace to 
> > > > mount a shiftfs filesystem (by marking it FS_USERNS_MOUNT) but 
> > > > only allowing it to mount marked subtrees (using the -o mark 
> > > > option as root).   Once mounted, the subtree is mapped via the 
> > > > super block user namespace so that the interior ids of the 
> > > > mounting user namespace are the ids written to the filesystem.
> > > > 
> > > > Signed-off-by: James Bottomley <
> > > > James.Bottomley@HansenPartnership.com>
> > > > 
> > > 
> > > James,
> > > 
> > > Allow me to point out some problems in this patch and offer a
> > > slightly different approach.
> > > 
> > > First of all, the subject says "uid/gid shifting bind mount", but
> > > it's not really a bind mount. What it is is a stackable mount and 
> > > 2 levels of stack no less.
> > 
> > The reason for the description is to have it behave exactly like a 
> > bind mount.  You can assert that a bind mount is, in fact, a 
> > stacked mount, but we don't currently.  I'm also not sure where you 
> > get your 2 levels from?
> > 
> 
> A bind mount does not incur recursion into VFS code, a stacked fs 
> does. And there is a programmable limit of stack depth of 2, which 
> stacked fs need to comply with. Your proposed setup has 2 stacked fs, 
> the mark shiftfs by admin and the uid shiftfs by container user. Or
> maybe I misunderstood.

Oh, right, actually, it wouldn't be 2 because once the unprivileged
mount uses the marked filesystem, what it uses is the mnt and dentry
from the underlying filesystem (what you would have got from a path
lookup on it).

That said, it does perform recursive calls to the underlying filesystem
unlike a true bind mount, so I can add the depth easily enough.
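
As a rough user-space illustration of the depth accounting discussed
here (a sketch only, not shiftfs code): the toy_sb type and
toy_stack_on() helper are hypothetical stand-ins, while
FILESYSTEM_MAX_STACK_DEPTH and s_stack_depth are the kernel names.  It
assumes the new superblock takes the lower filesystem's depth plus one
and refuses the mount once the limit of 2 is exceeded.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same name and limit as the kernel's definition in include/linux/fs.h. */
#define FILESYSTEM_MAX_STACK_DEPTH 2

/* Toy stand-in for struct super_block; only the depth field is modelled. */
struct toy_sb {
	int s_stack_depth;
};

/*
 * Record the depth of a stacking filesystem mounted on top of `lower`.
 * Returns 0 on success, or -EINVAL if the stack would become too deep.
 */
static int toy_stack_on(struct toy_sb *upper, const struct toy_sb *lower)
{
	int depth;

	assert(upper != NULL && lower != NULL);

	depth = lower->s_stack_depth + 1;
	if (depth > FILESYSTEM_MAX_STACK_DEPTH)
		return -EINVAL;

	upper->s_stack_depth = depth;
	return 0;
}
```

Under this model, if both the mark mount and the unprivileged mount
stack directly on the underlying filesystem's superblock, each ends up
at depth 1 rather than the second one reaching depth 2.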

> > >  So one thing that is missing is increasing of sb->s_stack_depth 
> > > and that also means that shiftfs cannot be used to recursively 
> > > shift uids in child userns if that was ever the intention.
> > 
> > I can't think of a use case that would ever need that, but perhaps
> > other container people can.
> > 
> > > The other problem is that by forking overlayfs functionality,
> > 
> > So this wouldn't really be the right way to look at it: shiftfs 
> > shares no code with overlayfs at all, so is definitely not a fork. 
> >  The only piece of functionality it has which is similar to 
> > overlayfs is the way it does lookups via a new dentry cache. 
> >  However, that functionality is not unique to overlayfs and if you 
> > look, you'll see that shiftfs_lookup() actually has far more in 
> > common with ecryptfs_lookup().
> 
> That's a good point. All stackable file systems may share similar 
> problems and solutions (e.g. consistent st_ino/st_dev). Perhaps it 
> calls for shared library code or more generic VFS code. At the moment 
> ecryptfs is not seeing much development, so everything happens in 
> overlayfs. If there is going to be more than 1 actively developed
> stackable fs, we need to see about that.

I believe we already do ... if you look at the lookup functions of each
of them, you see the only common thing is encapsulated in a variant of
the lookup_one_len() functions.  After that, even simple things like
our negative dentry handling differs.

> > >  shiftfs is going to miss out on overlayfs bug fixes related to 
> > > user credentials differing from mounter credentials, like fd3220d
> > > ("ovl: update S_ISGID when setting posix ACLs"). I am not sure
> > > that this specific case is relevant to shiftfs, but there could
> > > be others.
> > 
> > OK, so shiftfs doesn't have this bug and the reason why is
> > illustrative: basically shiftfs does three things
> > 
> >    1. lookups via a uid/gid shifted dentry cache
> >    2. shifted credential inode operations permission checks on the
> >       underlying filesystem
> >    3. location marking for unprivileged mount
> > 
> > I think we've already seen that 1. isn't from overlayfs but the
> > functionality could be added to overlayfs, I suppose.  The big 
> > problem is 2.  The overlayfs code emulates the permission checks, 
> > which makes it rather complex (this is where you get your bugs like 
> > the above from).  I did actually look at adding 2. to overlayfs on 
> > the theory that a single layer overlay might be closest to what 
> > this is, but eventually concluded I'd have to take the special 
> > cases and add a whole lot more to them ... it really would increase 
> > the maintenance burden substantially and make the code an
> > > unreadable rat's nest.
> > 
> 
> The use cases for uid shifting are still overwhelming for me.
> I take your word for it that it's going to be a maintenance burden
> to add this functionality to overlayfs.
> 
> > When you think about it this way, it becomes obvious that the clean
> > separation is if shiftfs functionality is layered on top of 
> > overlayfs and when you do that, doing it as its own filesystem is 
> > more logical.
> > 
> 
> Yes, I agree with that statement. This is in line with the solution I
> outlined at the end of my previous email, where single layer 
> overlayfs is used for the host "mark" mount, although I wonder if the 
> same cannot be achieved with a bind mount?

I understand, but once I can't consume overlayfs to construct it, the
idea of trying to use it becomes a negative not a positive.

We could achieve the same thing using bind mounts, if the vfsmount
structure carried a private field, but it doesn't.  I think given the
prevalence of this structure throughout the mount tree, that's a
deliberate decision to keep it thin.

> in host:
> mount -t overlay -o noexec,upper=<origin> container_visible <mark
> location>
> 
> in container:
> mount -t shiftfs -o <mark location> <somewhere in my local mount ns>

So I'm not sure it's a more widespread problem: mount --bind is usable
inside an unprivileged container, which means you can bridge filesystem
subtrees even as only a local container admin.  The problem is
mounting other filesystem types.  Marking a type safe for mounting is
done by the FS_USERNS_MOUNT flag, but for things like shiftfs it means
you do have to restrict the source location.  For most filesystem
types that source will be a device, so they will need some check other
than a mount mark.
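
To make the two-part check concrete, here is a minimal user-space
sketch, under the assumption that an unprivileged mount is allowed only
when the filesystem type carries FS_USERNS_MOUNT and the source subtree
was marked by host root.  The toy_fs_type and toy_source types and the
toy_userns_may_mount() helper are hypothetical and model only the
shiftfs-style case, not device-backed filesystems.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same name as the kernel flag; the bit value here is illustrative. */
#define FS_USERNS_MOUNT 8

/* Toy stand-in for struct file_system_type. */
struct toy_fs_type {
	const char *name;
	int fs_flags;
};

/* Hypothetical source subtree; `marked` models a host-side -o mark mount. */
struct toy_source {
	bool marked;
};

/*
 * Decide whether a mount requested from an unprivileged user namespace
 * may proceed: the type must opt in, and the source must be marked.
 */
static bool toy_userns_may_mount(const struct toy_fs_type *type,
				 const struct toy_source *src)
{
	assert(type != NULL && src != NULL);

	if (!(type->fs_flags & FS_USERNS_MOUNT))
		return false;	/* type not declared safe for userns mounts */

	return src->marked;	/* shiftfs-style restriction on the source */
}
```

A filesystem whose source is a device would need a different test in
place of the `marked` check, which is the point made above.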

James