From: Tejun Heo <tj@kernel.org>
To: serge.hallyn@ubuntu.com
Cc: linux-kernel@vger.kernel.org, adityakali@google.com,
linux-api@vger.kernel.org, containers@lists.linux-foundation.org,
cgroups@vger.kernel.org, lxc-devel@lists.linuxcontainers.org,
akpm@linux-foundation.org, ebiederm@xmission.com,
gregkh@linuxfoundation.org, lizefan@huawei.com,
hannes@cmpxchg.org, Serge Hallyn <serge.hallyn@canonical.com>
Subject: Re: [PATCH 7/8] cgroup: Add documentation for cgroup namespaces
Date: Mon, 28 Dec 2015 12:47:35 -0500
Message-ID: <20151228174735.GB30165@mtj.duckdns.org>
In-Reply-To: <1450844609-9194-8-git-send-email-serge.hallyn@ubuntu.com>
Hello,
I did some heavy editing of the documentation. How does this look?
Did I miss anything?
Thanks.
---
Documentation/cgroup.txt | 146 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 146 insertions(+)
--- a/Documentation/cgroup.txt
+++ b/Documentation/cgroup.txt
@@ -47,6 +47,11 @@ CONTENTS
5-3. IO
5-3-1. IO Interface Files
5-3-2. Writeback
+6. Namespace
+ 6-1. Basics
+ 6-2. The Root and Views
+ 6-3. Migration and setns(2)
+ 6-4. Interaction with Other Namespaces
P. Information on Kernel Programming
P-1. Filesystem Support for Writeback
D. Deprecated v1 Core Features
@@ -1013,6 +1018,147 @@ writeback as follows.
vm.dirty[_background]_ratio.
+6. Namespace
+
+6-1. Basics
+
+A cgroup namespace provides a mechanism to virtualize the view of the
+"/proc/$PID/cgroup" file. The CLONE_NEWCGROUP clone flag can be used
+with clone(2) and unshare(2) to create a new cgroup namespace. A
+process running inside the cgroup namespace will have its
+"/proc/$PID/cgroup" output restricted to the cgroupns root. The
+cgroupns root is the cgroup of the process at the time of creation of
+the cgroup namespace.
+
+Without cgroup namespace, the "/proc/$PID/cgroup" file shows the
+complete path of the cgroup of a process. In a container setup where
+a set of cgroups and namespaces are intended to isolate processes, the
+"/proc/$PID/cgroup" file may leak system-level information to the
+isolated processes. For example:
+
+ # cat /proc/self/cgroup
+ 0::/batchjobs/container_id1
+
+The path '/batchjobs/container_id1' can be considered system data
+that should not be exposed to the isolated processes. A cgroup
+namespace can be used to restrict visibility of this path. For
+example, before creating a cgroup namespace, one would see:
+
+ # ls -l /proc/self/ns/cgroup
+ lrwxrwxrwx 1 root root 0 2014-07-15 10:37 /proc/self/ns/cgroup -> cgroup:[4026531835]
+ # cat /proc/self/cgroup
+ 0::/batchjobs/container_id1
+
+After unsharing a new namespace, the view changes:
+
+ # ls -l /proc/self/ns/cgroup
+ lrwxrwxrwx 1 root root 0 2014-07-15 10:35 /proc/self/ns/cgroup -> cgroup:[4026532183]
+ # cat /proc/self/cgroup
+ 0::/
+
+When some thread from a multi-threaded process unshares its cgroup
+namespace, the new cgroupns gets applied to the entire process (all
+the threads). This is natural for the v2 hierarchy; however, for the
+legacy hierarchies, this may be unexpected.
+
+A cgroup namespace is alive as long as there are processes inside it.
+When the last process exits, the cgroup namespace is destroyed. The
+cgroupns root and the actual cgroups remain.
+
+
+6-2. The Root and Views
+
+The 'cgroupns root' for a cgroup namespace is the cgroup in which the
+process calling unshare(2) is running. For example, if a process in
+/batchjobs/container_id1 cgroup calls unshare, cgroup
+/batchjobs/container_id1 becomes the cgroupns root. For the
+init_cgroup_ns, this is the real root ('/') cgroup.
+
+The cgroupns root cgroup does not change even if the namespace creator
+process later moves to a different cgroup.
+
+ # ~/unshare -c # unshare cgroupns in some cgroup
+ # cat /proc/self/cgroup
+ 0::/
+ # mkdir sub_cgrp_1
+ # echo 0 > sub_cgrp_1/cgroup.procs
+ # cat /proc/self/cgroup
+ 0::/sub_cgrp_1
+
+Each process gets its namespace-specific view of "/proc/$PID/cgroup".
+
+Processes running inside the cgroup namespace will be able to see
+cgroup paths (in /proc/self/cgroup) only inside their root cgroup.
+From within an unshared cgroupns:
+
+ # sleep 100000 &
+ [1] 7353
+ # echo 7353 > sub_cgrp_1/cgroup.procs
+ # cat /proc/7353/cgroup
+ 0::/sub_cgrp_1
+
+From the initial cgroup namespace, the real cgroup path will be
+visible:
+
+ $ cat /proc/7353/cgroup
+ 0::/batchjobs/container_id1/sub_cgrp_1
+
+From a sibling cgroup namespace (that is, a namespace rooted at a
+different cgroup), the cgroup path relative to its own cgroup
+namespace root will be shown. For instance, if PID 7353's cgroup
+namespace root is at '/batchjobs/container_id2', then it will see
+
+ # cat /proc/7353/cgroup
+ 0::/../container_id2/sub_cgrp_1
+
+Note that the relative path always starts with '/' to indicate that
+it is relative to the cgroup namespace root of the caller.
+
+
+6-3. Migration and setns(2)
+
+Processes inside a cgroup namespace can move into and out of the
+namespace root if they have proper access to external cgroups. For
+example, from inside a namespace with cgroupns root at
+/batchjobs/container_id1, and assuming that the global hierarchy is
+still accessible inside cgroupns:
+
+ # cat /proc/7353/cgroup
+ 0::/sub_cgrp_1
+ # echo 7353 > batchjobs/container_id2/cgroup.procs
+ # cat /proc/7353/cgroup
+ 0::/../container_id2
+
+Note that this kind of setup is not encouraged. A task inside a cgroup
+namespace should only be exposed to its own cgroupns hierarchy.
+
+setns(2) to another cgroup namespace is allowed when both hold:
+
+(a) the process has CAP_SYS_ADMIN against its current user namespace
+(b) the process has CAP_SYS_ADMIN against the target cgroup
+ namespace's userns
+
+No implicit cgroup changes happen when attaching to another cgroup
+namespace. It is expected that someone moves the attaching process
+under the target cgroup namespace root.
+
+
+6-4. Interaction with Other Namespaces
+
+A namespace-specific cgroup hierarchy can be mounted by a process
+running inside a non-init cgroup namespace:
+
+ # mount -t cgroup2 none $MOUNT_POINT
+
+This will mount the unified cgroup hierarchy with cgroupns root as the
+filesystem root. The process needs CAP_SYS_ADMIN against its user and
+mount namespaces.
+
+Virtualizing the /proc/self/cgroup file and restricting the view of
+the cgroup hierarchy through a namespace-private cgroupfs mount
+together provide a properly isolated cgroup view inside the container.
+
+
P. Information on Kernel Programming
This section contains kernel programming information in the areas
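
As an aside on section 6-2, the way a cgroup path is rendered relative
to the reader's cgroupns root can be sketched in a few lines of POSIX
shell. This is only an illustration of the rule as documented, not the
kernel's implementation. The function name cgroup_view and its two
arguments are made up for the example, and both paths are assumed to
be absolute paths as seen from the initial cgroup namespace.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# cgroup_view NS_ROOT CGROUP_PATH
#   NS_ROOT     - cgroupns root of the reader (path in the init namespace)
#   CGROUP_PATH - cgroup of the process being inspected (same convention)
# Prints CGROUP_PATH the way the reader would see it in /proc/$PID/cgroup.
# Note: plain POSIX sh has no "local", so the variables below are global.
cgroup_view() {
    root=${1%/}     # "/" becomes the empty string
    path=${2%/}

    # Drop the leading components that both paths share.
    while [ -n "$root" ] && [ -n "$path" ]; do
        r=${root#/}; r=${r%%/*}     # first component of root
        p=${path#/}; p=${p%%/*}     # first component of path
        [ "$r" = "$p" ] || break
        root=${root#/"$r"}
        path=${path#/"$p"}
    done

    # Every component left in root is one step up from the ns root.
    up=
    while [ -n "$root" ]; do
        r=${root#/}; r=${r%%/*}
        root=${root#/"$r"}
        up="$up/.."
    done

    out="$up$path"
    printf '%s\n' "${out:-/}"       # an empty result means the ns root itself
}

# Reader in the namespace rooted at /batchjobs/container_id1:
cgroup_view /batchjobs/container_id1 /batchjobs/container_id1/sub_cgrp_1
# prints /sub_cgrp_1

# Reader in the initial namespace:
cgroup_view / /batchjobs/container_id1/sub_cgrp_1
# prints /batchjobs/container_id1/sub_cgrp_1

# Same reader as the first call, target under a sibling cgroup:
cgroup_view /batchjobs/container_id1 /batchjobs/container_id2/sub_cgrp_1
# prints /../container_id2/sub_cgrp_1
```

The comparison is done component by component rather than by string
prefix so that, for example, /a/b is not mistaken for an ancestor of
/a/bc.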