From: Djalal Harouni <tixxdz@gmail.com>
To: Alexander Viro <viro@zeniv.linux.org.uk>,
	Chris Mason <clm@fb.com>, <tytso@mit.edu>,
	Serge Hallyn <serge.hallyn@canonical.com>,
	Josh Triplett <josh@joshtriplett.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andy Lutomirski <luto@kernel.org>,
	Seth Forshee <seth.forshee@canonical.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	Dongsu Park <dongsu@endocode.com>,
	David Herrmann <dh.herrmann@googlemail.com>,
	Miklos Szeredi <mszeredi@redhat.com>,
	Alban Crequy <alban.crequy@gmail.com>
Cc: Djalal Harouni <tixxdz@gmail.com>, Djalal Harouni <tixxdz@opendz.org>
Subject: [RFC v2 PATCH 1/8] VFS: add CLONE_MNTNS_SHIFT_UIDGID flag to allow mounts to shift their UIDs/GIDs
Date: Wed,  4 May 2016 16:26:47 +0200
Message-ID: <1462372014-3786-2-git-send-email-tixxdz@gmail.com>
In-Reply-To: <1462372014-3786-1-git-send-email-tixxdz@gmail.com>

Add the CLONE_MNTNS_SHIFT_UIDGID flag, a mount namespace flag. When it is
set, mount points on filesystems that support UID/GID shifts have their
UIDs and GIDs shifted by the VFS. The UID and GID mapping rules are per
mount namespace: they follow the rules of the user namespace of the
containing mount namespace. Inodes are supposed to always hold the on-disk
UID/GID values, so the shift is done inside the VFS, as a read shift
applied when the inodes are accessed.
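
To make the "read shift" concrete, here is a minimal userspace sketch, not
the code from this series. It assumes the on-disk ID is interpreted as an ID
inside the user namespace and translated through a single mapping extent of
the kind shown in /proc/<pid>/uid_map. The names id_extent and
model_shift_on_read are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One extent of a user namespace ID map, laid out as in /proc/<pid>/uid_map. */
struct id_extent {
	uint32_t first;		/* first ID as seen inside the namespace */
	uint32_t lower_first;	/* corresponding first ID outside of it */
	uint32_t count;		/* number of consecutive IDs covered */
};

/*
 * Hypothetical read shift (an assumption, not taken from the patch): the
 * on-disk ID is looked up in the extent and the translated value is
 * returned through *shifted. The on-disk value itself is never modified.
 * Returns false when the ID is not covered by the mapping.
 */
bool model_shift_on_read(const struct id_extent *map, uint32_t disk_id,
			 uint32_t *shifted)
{
	assert(map && shifted);

	if (disk_id < map->first || disk_id - map->first >= map->count)
		return false;

	*shifted = map->lower_first + (disk_id - map->first);
	return true;
}
```

With an extent of {0, 100000, 65536}, an on-disk UID of 1000 is reported as
101000 under this model, while the inode keeps 1000.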

This is a preparation patch.

Goal:

	/* (1) */
	clone4(CLONE_NEWNS|CLONE_MNTNS_SHIFT_UIDGID, ...)
	/*
		Setup container base mount namespace, rootfs and mount all
		necessary mount points and filesystems that can't be mounted
		in user namespaces. Filesystems that support uid/gid shifts
		should set the mount parameters.
		mount(..., mount_options=[vfs_shift_uids, vfs_shift_gids])
	*/

	/* (2) */
	/*
		Setup new mount and user namespaces and inherit the
		CLONE_MNTNS_SHIFT_UIDGID flag from (1) into the new mount
		namespace (2).
	*/
	clone4(CLONE_NEWUSER|CLONE_NEWNS|CLONE_MNTNS_SHIFT_UIDGID, ...)
	/*
	   Inodes of mount points here that support UID/GID shifts will
	   automatically have their UID/GID shifted according to the user
	   namespace rules of the current mount namespace (2).
	*/

We create the new user and mount namespaces where:
1) The mount namespace allows mounts inside it that support UID and GID
   shifting to perform the shifts, provided CLONE_MNTNS_SHIFT_UIDGID is
   set in the current mount namespace.

2) The UID and GID mapping is done according to the rules of the user
   namespace of the containing mount namespace. CLONE_MNTNS_SHIFT_UIDGID
   follows the CLONE_NEWUSER|CLONE_NEWNS combination. This ensures that
   only the creator of the mount namespace is able to adjust the user
   namespace mapping rules.

The CLONE_MNTNS_SHIFT_UIDGID flag can be set on the new mount namespace
only if at least one of the following holds:

1) The parent mount namespace already has CLONE_MNTNS_SHIFT_UIDGID set.

2) The caller has CAP_SYS_ADMIN in init_user_ns. Since we start from
   that namespace and inherit some of its mount points, we have to
   protect files from a privileged userns doing:
   clone(CLONE_NEWUSER|CLONE_NEWNS|CLONE_MNTNS_SHIFT_UIDGID...)
   without either condition being met. This is blocked.
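
The rule above can be modelled in userspace as follows. This is a sketch
that mirrors the check this patch adds to copy_mnt_ns(); struct mntns_model
and model_copy_mnt_ns_flags are hypothetical names, and the
admin_in_init_userns parameter stands in for the kernel's
capable(CAP_SYS_ADMIN).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Value proposed by this patch for include/uapi/linux/sched.h. */
#define CLONE_MNTNS_SHIFT_UIDGID 0x00400000UL

/* Hypothetical stand-in for the one mnt_namespace field used here. */
struct mntns_model {
	int flags;
};

/*
 * Model of the new check: the child namespace gets the flag only when it
 * was requested and either the parent already has it or the caller is
 * privileged in init_user_ns. Returns 0 on success, -EPERM otherwise.
 */
int model_copy_mnt_ns_flags(unsigned long clone_flags,
			    const struct mntns_model *parent,
			    bool admin_in_init_userns,
			    struct mntns_model *child)
{
	assert(parent && child);

	child->flags = 0;	/* a fresh namespace starts without flags */

	if (!(clone_flags & CLONE_MNTNS_SHIFT_UIDGID))
		return 0;	/* flag not requested, nothing to set */

	if ((parent->flags & CLONE_MNTNS_SHIFT_UIDGID) || admin_in_init_userns) {
		child->flags |= CLONE_MNTNS_SHIFT_UIDGID;
		return 0;
	}

	return -EPERM;
}
```

An unprivileged caller whose parent namespace lacks the flag gets -EPERM in
this model; once a privileged task has set the flag, children can inherit it
without further privilege.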

If a filesystem was mounted with "vfs_shift_uids" and "vfs_shift_gids"
and shows up in a mount namespace that does not have
CLONE_MNTNS_SHIFT_UIDGID set, then no shift is done. UIDs and GIDs are
not changed at all, and things continue to work as they do now.
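
Put differently, a shift needs both the mount option and the namespace
flag. A minimal sketch of that decision, assuming a hypothetical helper
model_should_shift that is not part of this patch:

```c
#include <stdbool.h>

/* Value proposed by this patch for include/uapi/linux/sched.h. */
#define CLONE_MNTNS_SHIFT_UIDGID 0x00400000UL

/*
 * Hypothetical predicate: shift only when the filesystem was mounted with
 * the relevant option (vfs_shift_uids or vfs_shift_gids) and the current
 * mount namespace carries CLONE_MNTNS_SHIFT_UIDGID.
 */
bool model_should_shift(bool mounted_with_shift_opt, int mnt_ns_flags)
{
	return mounted_with_shift_opt &&
	       (mnt_ns_flags & CLONE_MNTNS_SHIFT_UIDGID) != 0;
}
```

If either input is missing, the predicate is false and the on-disk IDs are
used unchanged.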

Signed-off-by: Dongsu Park <dongsu@endocode.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
---
 fs/mount.h                 |  1 +
 fs/namespace.c             | 20 ++++++++++++++++++++
 include/uapi/linux/sched.h |  1 +
 kernel/fork.c              |  4 ++++
 4 files changed, 26 insertions(+)

diff --git a/fs/mount.h b/fs/mount.h
index 14db05d..1e317eb 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -6,6 +6,7 @@
 
 struct mnt_namespace {
 	atomic_t		count;
+	int			flags;
 	struct ns_common	ns;
 	struct mount *	root;
 	struct list_head	list;
diff --git a/fs/namespace.c b/fs/namespace.c
index 4fb1691..940ecfc 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2774,6 +2774,7 @@ static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns)
 	INIT_LIST_HEAD(&new_ns->list);
 	init_waitqueue_head(&new_ns->poll);
 	new_ns->event = 0;
+	new_ns->flags = 0;
 	new_ns->user_ns = get_user_ns(user_ns);
 	return new_ns;
 }
@@ -2801,6 +2802,25 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns,
 	if (IS_ERR(new_ns))
 		return new_ns;
 
+	if (flags & CLONE_MNTNS_SHIFT_UIDGID) {
+		/*
+		 * If parent has the CLONE_MNTNS_SHIFT_UIDGID flag set
+		 * or current is capable in init_user_ns, then we set the
+		 * CLONE_MNTNS_SHIFT_UIDGID flag and allow mounts inside
+		 * this namespace to shift their UID and GID.
+		 *
+		 * We check the init_user_ns here since we always start from
+		 * that user namespace and mounts are by default available to all
+		 * users. In this regard, only CAP_SYS_ADMIN in init_user_ns is
+		 * allowed to start and propagate the CLONE_MNTNS_SHIFT_UIDGID
+		 * flag to new mount namespaces.
+		 */
+		if ((ns->flags & CLONE_MNTNS_SHIFT_UIDGID) || capable(CAP_SYS_ADMIN))
+			new_ns->flags |= CLONE_MNTNS_SHIFT_UIDGID;
+		else
+			return ERR_PTR(-EPERM);
+	}
+
 	namespace_lock();
 	/* First pass: copy the tree topology */
 	copy_flags = CL_COPY_UNBINDABLE | CL_EXPIRE;
diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index 5f0fe01..9ba2124 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -19,6 +19,7 @@
 #define CLONE_PARENT_SETTID	0x00100000	/* set the TID in the parent */
 #define CLONE_CHILD_CLEARTID	0x00200000	/* clear the TID in the child */
 #define CLONE_DETACHED		0x00400000	/* Unused, ignored */
+#define CLONE_MNTNS_SHIFT_UIDGID     0x00400000      /* If set allows to shift UID and GID for mounts that support it */
 #define CLONE_UNTRACED		0x00800000	/* set if the tracing process can't force CLONE_PTRACE on this clone */
 #define CLONE_CHILD_SETTID	0x01000000	/* set the TID in the child */
 #define CLONE_NEWCGROUP		0x02000000	/* New cgroup namespace */
diff --git a/kernel/fork.c b/kernel/fork.c
index d277e83..41223cd 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1264,6 +1264,10 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	if ((clone_flags & (CLONE_NEWNS|CLONE_FS)) == (CLONE_NEWNS|CLONE_FS))
 		return ERR_PTR(-EINVAL);
 
+	if ((clone_flags & CLONE_MNTNS_SHIFT_UIDGID) &&
+	    !(clone_flags & CLONE_NEWNS))
+		return ERR_PTR(-EINVAL);
+
 	if ((clone_flags & (CLONE_NEWUSER|CLONE_FS)) == (CLONE_NEWUSER|CLONE_FS))
 		return ERR_PTR(-EINVAL);
 
-- 
2.5.5

