linux-integrity.vger.kernel.org archive mirror
From: "Serge E. Hallyn" <serge@hallyn.com>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: "Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Christoph Hellwig" <hch@lst.de>,
	linux-fsdevel@vger.kernel.org,
	"John Johansen" <john.johansen@canonical.com>,
	"James Morris" <jmorris@namei.org>,
	"Mimi Zohar" <zohar@linux.ibm.com>,
	"Dmitry Kasatkin" <dmitry.kasatkin@gmail.com>,
	"Stephen Smalley" <stephen.smalley.work@gmail.com>,
	"Casey Schaufler" <casey@schaufler-ca.com>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Andreas Dilger" <adilger.kernel@dilger.ca>,
	"OGAWA Hirofumi" <hirofumi@mail.parknet.co.jp>,
	"Geoffrey Thomas" <geofft@ldpreload.com>,
	"Mrunal Patel" <mpatel@redhat.com>,
	"Josh Triplett" <josh@joshtriplett.org>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Theodore Tso" <tytso@mit.edu>, "Alban Crequy" <alban@kinvolk.io>,
	"Tycho Andersen" <tycho@tycho.ws>,
	"David Howells" <dhowells@redhat.com>,
	"James Bottomley" <James.Bottomley@hansenpartnership.com>,
	"Seth Forshee" <seth.forshee@canonical.com>,
	"Stéphane Graber" <stgraber@ubuntu.com>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Aleksa Sarai" <cyphar@cyphar.com>,
	"Lennart Poettering" <lennart@poettering.net>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	smbarber@chromium.org, "Phil Estes" <estesp@gmail.com>,
	"Serge Hallyn" <serge@hallyn.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Todd Kjos" <tkjos@google.com>,
	"Paul Moore" <paul@paul-moore.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	containers@lists.linux-foundation.org,
	linux-security-module@vger.kernel.org, linux-api@vger.kernel.org,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-integrity@vger.kernel.org, selinux@vger.kernel.org
Subject: Re: [PATCH v6 00/40] idmapped mounts
Date: Tue, 26 Jan 2021 23:40:00 -0600
Message-ID: <20210127054000.GA30832@mail.hallyn.com>
In-Reply-To: <20210121131959.646623-1-christian.brauner@ubuntu.com>

On Thu, Jan 21, 2021 at 02:19:19PM +0100, Christian Brauner wrote:
> Hey everyone,
> 
> The only major change is the updated version of hch's patch to port xfs
> to support idmapped mounts. Thanks again to Christoph for doing that
> work.
> (Otherwise Acked-bys and Reviewed-bys were added and the tree reordered
>  to decouple filesystem specific conversion from the vfs work so they
>  can proceed independently.
>  For a full list of major changes between versions see the end of this
>  cover letter. Please also note the large xfstests testsuite in patch 42
>  that has been kept as part of this series. It verifies correct vfs
>  behavior with and without idmapped mounts including covering newer vfs
>  features such as io_uring.
>  I currently still plan to target the v5.12 merge window.)
> 
> With this patchset we make it possible to attach idmappings to mounts,
> i.e., simply put, different bind mounts can expose the same file or
> directory with different ownership.
> Shifting of ownership on a per-mount basis handles a wide range of
> long standing use-cases. Here are just a few:
> - Shifting of a subset of ownership-less filesystems (vfat) for use by
>   multiple users, effectively allowing for DAC on such devices
>   (systemd, Android, ...)
> - Allow remapping uid/gid on external filesystems or paths (USB sticks,
>   network filesystem, ...) to match the local system's user and groups.
>   (David Howells intends to port AFS as a first candidate.)
> - Shifting of a container rootfs or base image without having to mangle
>   every file (runc, Docker, containerd, k8s, LXD, systemd ...)
> - Sharing of data between host or privileged containers with
>   unprivileged containers (runC, Docker, containerd, k8s, LXD, ...)
> - Data sharing between multiple user namespaces with incompatible maps
>   (LXD, k8s, ...)
> 
> There has been significant interest in this patchset as evidenced by
> users commenting on previous versions of this patchset. They include
> containerd, ChromeOS, systemd, LXD and a range of others. There is
> already a patchset up for containerd, the default Kubernetes container
> runtime https://github.com/containerd/containerd/pull/4734
> to make use of this. systemd intends to use it in their systemd-homed
> implementation for portable home directories. ChromeOS wants to make use
> of it to share data between the host and the Linux containers they run
> on Chrome- and Pixelbooks. There are also a few talks by people who
> are going to make use of this. The most recent one was a CNCF webinar
> https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf
> and there is an upcoming talk during FOSDEM.
> (Fwiw, for fun and since I wanted to do this for a long time I've ported
>  my home directory to be completely portable with a simple service file
>  that now mounts my home directory on an ext4 formatted usb stick with
>  an id mapping mapping all files to the random uid I'm assigned at
>  login.)
> 
> Making it possible to share directories and mounts between users with
> different uids and gids is itself quite an important use-case in
> distributed systems environments. It's of course especially useful in
> general for portable usb sticks, sharing data between multiple users,
> and sharing home directories between multiple users. The last example is
> now elegantly expressed in systemd's homed concept for portable home
> directories. As mentioned above, idmapped mounts also allow data from
> the host to be shared with unprivileged containers, between privileged
> and unprivileged containers simultaneously and in addition also between
> unprivileged containers with different idmappings whenever they are used
> to isolate one container completely from another container.
> 
> We have implemented and proposed multiple solutions to this before. This
> included the introduction of fsid mappings, a tiny filesystem I've
> authored with Seth Forshee that is currently carried in Ubuntu and that
> has proven to be the wrong approach, and the conceptual hack of calling
> override creds directly in the vfs. In addition to some of these
> solutions being hacky, none of these solutions have covered all of the
> above use-cases.
> 
> Idmappings become a property of struct vfsmount instead of being tied to
> a process being inside of a user namespace, which has been the case for
> all other proposed approaches. This also allows passing the user
> namespace down into the filesystems, which is cleaner than violating
> calling conventions by strapping the user namespace information that is
> a property of the mount to the caller's credentials or similar hacks.
> Each mount can have a separate idmapping and idmapped mounts can even be
> created in the initial user namespace unblocking a range of use-cases.
> 
> To this end the vfsmount struct gains a new struct user_namespace
> member. The idmapping of the user namespace becomes the idmapping of the
> mount. A caller that is privileged with respect to the user namespace of
> the superblock of the underlying filesystem can create an idmapped
> mount. In the future, we can enable unprivileged use-cases by checking
> whether the caller is privileged with respect to the user namespace that
> an already idmapped mount has been marked with, allowing them to change
> the idmapping. For now, keep things simple until the need arises.
> Note that with syscall interception it is already possible to intercept
> idmapped mount requests from unprivileged containers and handle them in
> a sufficiently privileged container manager. Support for this is already
> available in LXD and will be available in runC where syscall
> interception is currently in the process of becoming part of the runtime
> spec: https://github.com/opencontainers/runtime-spec/pull/1074.
> 
> The user namespace the mount will be marked with can be specified by
> passing a file descriptor referring to the user namespace as an argument
> to the new mount_setattr() syscall together with the new
> MOUNT_ATTR_IDMAP flag. By default vfsmounts are marked with the initial
> user namespace and no behavioral or performance changes are observed.
> All mapping operations are nops for the initial user namespace. When a
> file/inode is accessed through an idmapped mount the i_uid and i_gid of
> the inode will be remapped according to the user namespace the mount has
> been marked with.
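
For readers who want to see the shape of the interface described above, here is a minimal Python sketch. It is not taken from the patchset. It assumes the struct mount_attr layout, the MOUNT_ATTR_IDMAP value and the syscall number from the uapi headers as later merged in Linux 5.12 (the v6 series may differ). The helper names build_idmap_attr() and idmap_mount() are made up for illustration, and the call itself needs sufficient privilege and a kernel that carries this series.

```python
import ctypes
import os

# Assumed values, from the uapi headers as merged in Linux 5.12.
SYS_MOUNT_SETATTR = 442          # same number on most architectures
MOUNT_ATTR_IDMAP = 0x00100000
AT_EMPTY_PATH = 0x1000


class MountAttr(ctypes.Structure):
    """Mirror of struct mount_attr (four __u64 fields, 32 bytes)."""
    _fields_ = [
        ("attr_set", ctypes.c_uint64),
        ("attr_clr", ctypes.c_uint64),
        ("propagation", ctypes.c_uint64),
        ("userns_fd", ctypes.c_uint64),
    ]


def build_idmap_attr(userns_fd):
    """Build the attribute block asking for the idmapping of userns_fd."""
    return MountAttr(attr_set=MOUNT_ATTR_IDMAP, userns_fd=userns_fd)


def idmap_mount(mount_fd, userns_fd):
    """Hypothetical helper: mark the mount behind mount_fd as idmapped.

    mount_fd would typically be a detached mount from open_tree() and
    userns_fd an open /proc/<pid>/ns/user. Raises OSError on failure.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    attr = build_idmap_attr(userns_fd)
    ret = libc.syscall(
        ctypes.c_long(SYS_MOUNT_SETATTR),
        ctypes.c_int(mount_fd),
        ctypes.c_char_p(b""),            # empty path: operate on mount_fd
        ctypes.c_uint(AT_EMPTY_PATH),
        ctypes.byref(attr),
        ctypes.c_size_t(ctypes.sizeof(attr)),
    )
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Passing the struct size as the last argument is what lets the kernel extend struct mount_attr later without breaking older callers.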
> 
> In order to support idmapped mounts, filesystems need to be changed and
> mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The initial
> version contains fat, ext4, and xfs including a list of examples.
> But patches for other filesystems are actively being worked on and will be
> sent out separately. We are here to see this through and there are
> multiple people involved in converting filesystems. So filesystem
> developers are not left alone with this and are provided with a large
> testsuite to verify that their port is correct.
> 
> There is a simple tool available at
> https://github.com/brauner/mount-idmapped that allows creating idmapped
> mounts so people can play with this patch series. Here are a few
> illustrations:
> 
> 1. Create a simple idmapped mount of another user's home directory
> 
> u1001@f2-vm:/$ sudo ./mount-idmapped --map-mount b:1000:1001:1 /home/ubuntu/ /mnt
> u1001@f2-vm:/$ ls -al /home/ubuntu/
> total 28
> drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 .
> drwxr-xr-x 4 root   root   4096 Oct 28 04:00 ..
> -rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history
> -rw-r--r-- 1 ubuntu ubuntu  220 Feb 25  2020 .bash_logout
> -rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25  2020 .bashrc
> -rw-r--r-- 1 ubuntu ubuntu  807 Feb 25  2020 .profile
> -rw-r--r-- 1 ubuntu ubuntu    0 Oct 16 16:11 .sudo_as_admin_successful
> -rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo

So I assume this falls under the buyer beware warning, but it's
probably important to warn people loudly of the fact that, at this
point, the user with uid 1001 can chmod u+s any binary under /mnt
and then run it from /home/ubuntu with euid=1000.  In other words,
while this has excellent uses, if you *can* use shared group
membership, you should :)
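
To make the scenario concrete, here is a small Python model of it. It is only a sketch under simplifying assumptions: the Inode class and the helper functions are hypothetical stand-ins rather than kernel code, capabilities and group ownership are ignored, and the only facts taken from the example above are the b:1000:1001:1 mapping and the two paths.

```python
from dataclasses import dataclass

S_ISUID = 0o4000


@dataclass
class Inode:
    """Hypothetical stand-in for an on-disk inode."""
    i_uid: int
    mode: int


def map_uid(mapping, kuid):
    """Translate an on-disk uid to the uid seen through a mount.

    mapping is a list of (id_from, id_to, count) extents. None models a
    mount marked with the initial user namespace, where mapping is a no-op.
    Returns None when the uid has no mapping.
    """
    if mapping is None:
        return kuid
    for id_from, id_to, count in mapping:
        if id_from <= kuid < id_from + count:
            return id_to + (kuid - id_from)
    return None


def chmod_setuid(inode, mapping, caller_uid):
    """Set the setuid bit if the caller owns the file as seen via the mount."""
    if map_uid(mapping, inode.i_uid) != caller_uid:
        raise PermissionError("caller does not own the file through this mount")
    inode.mode |= S_ISUID


def euid_after_exec(inode, mapping, caller_uid):
    """Effective uid after exec'ing the file through the given mount."""
    if inode.mode & S_ISUID:
        owner = map_uid(mapping, inode.i_uid)
        if owner is not None:
            return owner
    return caller_uid


mnt = [(1000, 1001, 1)]                    # b:1000:1001:1, as used for /mnt
binary = Inode(i_uid=1000, mode=0o755)     # owned by ubuntu on disk

chmod_setuid(binary, mnt, caller_uid=1001)   # allowed when done via /mnt
print(euid_after_exec(binary, None, 1001))   # run from /home/ubuntu: prints 1000
```

The same chmod attempted through /home/ubuntu (mapping None) raises PermissionError in this model, which is why the idmapped mount is what opens the door.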

Very cool though.


Thread overview: 74+ messages
2021-01-21 13:19 [PATCH v6 00/40] idmapped mounts Christian Brauner
2021-01-21 13:19 ` [PATCH v6 01/40] mount: attach mappings to mounts Christian Brauner
2021-01-21 13:19 ` [PATCH v6 02/40] fs: add id translation helpers Christian Brauner
2021-03-13  0:05   ` Vivek Goyal
2021-01-21 13:19 ` [PATCH v6 03/40] fs: add file and path permissions helpers Christian Brauner
2021-01-22  2:55   ` James Morris
2021-01-21 13:19 ` [PATCH v6 04/40] capability: handle idmapped mounts Christian Brauner
2021-01-22  2:57   ` James Morris
2021-01-21 13:19 ` [PATCH v6 05/39] namei: make permission helpers idmapped mount aware Christian Brauner
2021-01-22  3:02   ` James Morris
2021-01-22 22:26   ` J. Bruce Fields
2021-01-23 13:09     ` Christian Brauner
2021-01-24 22:18       ` J. Bruce Fields
2021-01-24 22:44         ` Christian Brauner
2021-01-21 13:19 ` [PATCH v6 06/40] inode: make init and " Christian Brauner
2021-01-22  3:10   ` James Morris
2021-01-21 13:19 ` [PATCH v6 07/40] attr: handle idmapped mounts Christian Brauner
2021-01-21 13:19 ` [PATCH v6 08/40] acl: " Christian Brauner
2021-01-21 13:19 ` [PATCH v6 09/40] xattr: " Christian Brauner
2021-01-22  3:21   ` James Morris
2021-01-21 13:19 ` [PATCH v6 10/40] commoncap: " Christian Brauner
2021-01-22  3:27   ` James Morris
2021-01-21 13:19 ` [PATCH v6 11/40] stat: " Christian Brauner
2021-01-22  3:28   ` James Morris
2021-01-21 13:19 ` [PATCH v6 12/40] namei: handle idmapped mounts in may_*() helpers Christian Brauner
2021-01-22  3:47   ` James Morris
2021-01-21 13:19 ` [PATCH v6 13/40] namei: introduce struct renamedata Christian Brauner
2021-01-21 13:19 ` [PATCH v6 14/40] namei: prepare for idmapped mounts Christian Brauner
2021-01-21 13:19 ` [PATCH v6 15/40] open: handle idmapped mounts in do_truncate() Christian Brauner
2021-01-22 17:20   ` Christoph Hellwig
2021-01-21 13:19 ` [PATCH v6 16/40] open: handle idmapped mounts Christian Brauner
2021-01-22  4:14   ` James Morris
2021-01-22 17:21   ` Christoph Hellwig
2021-01-21 13:19 ` [PATCH v6 17/40] af_unix: " Christian Brauner
2021-01-22  4:14   ` James Morris
2021-01-21 13:19 ` [PATCH v6 18/40] utimes: " Christian Brauner
2021-01-22  4:15   ` James Morris
2021-01-21 13:19 ` [PATCH v6 19/40] fcntl: " Christian Brauner
2021-01-22  4:17   ` James Morris
2021-01-21 13:19 ` [PATCH v6 20/40] init: " Christian Brauner
2021-01-22 17:23   ` Christoph Hellwig
2021-01-21 13:19 ` [PATCH v6 21/40] ioctl: " Christian Brauner
2021-01-22  4:33   ` James Morris
2021-01-21 13:19 ` [PATCH v6 22/40] would_dump: " Christian Brauner
2021-01-21 13:19 ` [PATCH v6 23/40] exec: " Christian Brauner
2021-01-22  4:35   ` James Morris
2021-01-25 16:39   ` Eric W. Biederman
2021-01-25 16:44     ` Christian Brauner
2021-01-25 17:03       ` Serge E. Hallyn
2021-01-25 17:06         ` Christian Brauner
2021-01-27  5:50       ` Serge E. Hallyn
2021-01-21 13:19 ` [PATCH v6 25/40] apparmor: " Christian Brauner
2021-01-21 13:19 ` [PATCH v6 26/39] ima: " Christian Brauner
2021-01-21 13:19 ` [PATCH v6 27/40] ecryptfs: do not mount on top of " Christian Brauner
2021-01-22  4:37   ` James Morris
2021-01-21 13:19 ` [PATCH v6 28/40] overlayfs: " Christian Brauner
2021-01-22  4:38   ` James Morris
2021-01-21 13:19 ` [PATCH v6 29/40] namespace: take lock_mount_hash() directly when changing flags Christian Brauner
2021-01-21 13:19 ` [PATCH v6 30/40] mount: make {lock,unlock}_mount_hash() static Christian Brauner
2021-01-21 13:19 ` [PATCH v6 31/40] namespace: only take read lock in do_reconfigure_mnt() Christian Brauner
2021-01-21 13:19 ` [PATCH v6 32/40] fs: split out functions to hold writers Christian Brauner
2021-01-21 13:19 ` [PATCH v6 33/40] fs: add attr_flags_to_mnt_flags helper Christian Brauner
2021-01-21 13:19 ` [PATCH v6 34/40] fs: add mount_setattr() Christian Brauner
2021-01-21 13:19 ` [PATCH v6 35/40] fs: introduce MOUNT_ATTR_IDMAP Christian Brauner
2021-01-22 17:33   ` Christoph Hellwig
2021-01-22 17:34     ` Christoph Hellwig
2021-01-21 13:19 ` [PATCH v6 36/40] tests: add mount_setattr() selftests Christian Brauner
2021-01-21 13:19 ` [PATCH v6 37/40] fat: handle idmapped mounts Christian Brauner
2021-01-21 13:19 ` [PATCH v6 38/40] ext4: support " Christian Brauner
2021-01-21 13:19 ` [PATCH v6 39/40] xfs: " Christian Brauner
2021-03-01 20:05   ` Darrick J. Wong
2021-03-01 20:46     ` Christian Brauner
2021-03-03  7:01     ` Christoph Hellwig
2021-01-27  5:40 ` Serge E. Hallyn [this message]
