From: serge.hallyn@ubuntu.com
To: linux-kernel@vger.kernel.org
Cc: adityakali@google.com, tj@kernel.org, linux-api@vger.kernel.org,
	containers@lists.linux-foundation.org, cgroups@vger.kernel.org,
	lxc-devel@lists.linuxcontainers.org, akpm@linux-foundation.org,
	ebiederm@xmission.com, gregkh@linuxfoundation.org,
	lizefan@huawei.com, hannes@cmpxchg.org,
	Serge Hallyn <serge.hallyn@ubuntu.com>,
	Serge Hallyn <serge.hallyn@canonical.com>
Subject: [PATCH 6/8] cgroup: mount cgroupns-root when inside non-init cgroupns
Date: Tue, 22 Dec 2015 22:23:27 -0600
Message-ID: <1450844609-9194-7-git-send-email-serge.hallyn@ubuntu.com>
In-Reply-To: <1450844609-9194-1-git-send-email-serge.hallyn@ubuntu.com>

From: Serge Hallyn <serge.hallyn@ubuntu.com>

This patch enables cgroup mounting inside a userns when a process
has appropriate privileges. The cgroup filesystem mounted is
rooted at the cgroupns-root. Thus, in a container setup, only
the hierarchy under the cgroupns-root is exposed inside the container.
This allows container management tools to run inside the containers
without depending on any global state.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
---
Changelog:
	20151116 - Don't allow user namespaces to bind new subsystems
	20151118 - postpone the FS_USERNS_MOUNT flag until the
	           last patch, until we can convince ourselves it
		   is safe.
	20151207 - Switch to walking up the kernfs path from kn root.
		 - Group initialized variables
		 - Explain the capable(CAP_SYS_ADMIN) check
		 - Style fixes
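
Illustration (not part of the patch): a minimal userspace sketch of how
a container manager might exercise this path, assuming the whole series
is applied and the caller holds CAP_SYS_ADMIN in the relevant user
namespace. The helper names build_named_opts() and
mount_ns_rooted_cgroup() are hypothetical, and the fallback value for
CLONE_NEWCGROUP is an assumption for libc headers that do not define it
yet. The option string follows the 'none,name=user1' form mentioned in
the comment added by this patch.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>

/* Assumption: older libc headers may not define this flag yet. */
#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000
#endif

/*
 * Build mount options for a named hierarchy with no bound subsystems,
 * e.g. "none,name=user1".  Returns 0 on success, -1 if buf is too small.
 */
int build_named_opts(char *buf, size_t len, const char *name)
{
	int n = snprintf(buf, len, "none,name=%s", name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Enter a new cgroup namespace and mount a named cgroup hierarchy at
 * @target.  With this series, the mount is expected to be rooted at
 * the cgroupns-root rather than at the hierarchy's real root.
 * Returns 0 on success or a negative errno value.
 */
int mount_ns_rooted_cgroup(const char *target, const char *name)
{
	char opts[128];

	if (build_named_opts(opts, sizeof(opts), name) < 0)
		return -ENAMETOOLONG;

	/* The caller's current cgroups become the namespace root. */
	if (unshare(CLONE_NEWCGROUP) < 0)
		return -errno;

	if (mount("cgroup", target, "cgroup", 0, opts) < 0)
		return -errno;

	return 0;
}
```

A caller would invoke something like
mount_ns_rooted_cgroup("/sys/fs/cgroup/user1", "user1") after creating
the target directory, and should expect -EPERM when the privilege
checks added below are not satisfied.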
---
 kernel/cgroup.c |   40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index e85fbf9..99c4443 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1983,6 +1983,7 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 {
 	bool is_v2 = fs_type == &cgroup2_fs_type;
 	struct super_block *pinned_sb = NULL;
+	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
 	struct cgroup_subsys *ss;
 	struct cgroup_root *root;
 	struct cgroup_sb_opts opts;
@@ -1991,6 +1992,14 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 	int i;
 	bool new_sb;
 
+	get_cgroup_ns(ns);
+
+	/* Check if the caller has permission to mount. */
+	if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN)) {
+		put_cgroup_ns(ns);
+		return ERR_PTR(-EPERM);
+	}
+
 	/*
 	 * The first time anyone tries to mount a cgroup, enable the list
 	 * linking each css_set to its tasks and fix up all existing tasks.
@@ -2106,6 +2115,16 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 		goto out_unlock;
 	}
 
+	/*
+	 * We know this subsystem has not yet been bound.  Users in a non-init
+	 * user namespace may only mount hierarchies with no bound subsystems,
+	 * i.e. 'none,name=user1'
+	 */
+	if (!opts.none && !capable(CAP_SYS_ADMIN)) {
+		ret = -EPERM;
+		goto out_unlock;
+	}
+
 	root = kzalloc(sizeof(*root), GFP_KERNEL);
 	if (!root) {
 		ret = -ENOMEM;
@@ -2124,12 +2143,30 @@ out_free:
 	kfree(opts.release_agent);
 	kfree(opts.name);
 
-	if (ret)
+	if (ret) {
+		put_cgroup_ns(ns);
 		return ERR_PTR(ret);
+	}
 out_mount:
 	dentry = kernfs_mount(fs_type, flags, root->kf_root,
 			      is_v2 ? CGROUP2_SUPER_MAGIC : CGROUP_SUPER_MAGIC,
 			      &new_sb);
+
+	/*
+	 * In non-init cgroup namespace, instead of root cgroup's
+	 * dentry, we return the dentry corresponding to the
+	 * cgroupns->root_cgrp.
+	 */
+	if (!IS_ERR(dentry) && ns != &init_cgroup_ns) {
+		struct dentry *nsdentry;
+		struct cgroup *cgrp;
+
+		cgrp = cset_cgroup_from_root(ns->root_cset, root);
+		nsdentry = kernfs_node_dentry(cgrp->kn, dentry->d_sb);
+		dput(dentry);
+		dentry = nsdentry;
+	}
+
 	if (IS_ERR(dentry) || !new_sb)
 		cgroup_put(&root->cgrp);
 
@@ -2142,6 +2179,7 @@ out_mount:
 		deactivate_super(pinned_sb);
 	}
 
+	put_cgroup_ns(ns);
 	return dentry;
 }
 
-- 
1.7.9.5

