From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754607AbbAEWtI (ORCPT );
	Mon, 5 Jan 2015 17:49:08 -0500
Received: from mail-ob0-f169.google.com ([209.85.214.169]:65322 "EHLO
	mail-ob0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753197AbbAEWtF (ORCPT );
	Mon, 5 Jan 2015 17:49:05 -0500
MIME-Version: 1.0
In-Reply-To: <548E17CE.8010704@nod.at>
References: <1417744550-6461-1-git-send-email-adityakali@google.com>
	<1417744550-6461-9-git-send-email-adityakali@google.com>
	<548E17CE.8010704@nod.at>
From: Aditya Kali
Date: Mon, 5 Jan 2015 14:48:44 -0800
Message-ID: 
Subject: Re: [PATCHv3 8/8] cgroup: Add documentation for cgroup namespaces
To: Richard Weinberger
Cc: Tejun Heo , Li Zefan , Serge Hallyn , Andy Lutomirski ,
	"Eric W. Biederman" , cgroups mailinglist ,
	"linux-kernel@vger.kernel.org" , Linux API , Ingo Molnar ,
	Linux Containers , Rohit Jnagal , Vivek Goyal
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Dec 14, 2014 at 3:05 PM, Richard Weinberger wrote:
> Aditya,
>
> I gave your patch set a try, but it does not work for me.
> Maybe you can shed some light on the issues I'm facing.
> Sadly, I have not yet had time to dig into your code.
>
> On 05.12.2014 at 02:55, Aditya Kali wrote:
>> Signed-off-by: Aditya Kali
>> ---
>>  Documentation/cgroups/namespace.txt | 147 ++++++++++++++++++++++++++++++++++++
>>  1 file changed, 147 insertions(+)
>>  create mode 100644 Documentation/cgroups/namespace.txt
>>
>> diff --git a/Documentation/cgroups/namespace.txt b/Documentation/cgroups/namespace.txt
>> new file mode 100644
>> index 0000000..6480379
>> --- /dev/null
>> +++ b/Documentation/cgroups/namespace.txt
>> @@ -0,0 +1,147 @@
>> +	CGroup Namespaces
>> +
>> +CGroup Namespace provides a mechanism to virtualize the view of the
>> +/proc/<pid>/cgroup file. The CLONE_NEWCGROUP clone-flag can be used with
>> +the clone() and unshare() syscalls to create a new cgroup namespace.
>> +A process running inside the cgroup namespace will have its
>> +/proc/<pid>/cgroup output restricted to cgroupns-root. cgroupns-root is
>> +the cgroup of the process at the time of creation of the cgroup namespace.
>> +
>> +Prior to CGroup Namespaces, the /proc/<pid>/cgroup file showed the
>> +complete path of the cgroup of a process. In a container setup (where a
>> +set of cgroups and namespaces is intended to isolate processes), the
>> +/proc/<pid>/cgroup file may leak potential system-level information to
>> +the isolated processes.
>> +
>> +For example:
>> + $ cat /proc/self/cgroup
>> + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/batchjobs/container_id1
>> +
>> +The path '/batchjobs/container_id1' can generally be considered
>> +system-data, and it is desirable to not expose it to the isolated process.
>> +
>> +CGroup Namespaces can be used to restrict visibility of this path.
>> +
>> +For example:
>> + # Before creating cgroup namespace
>> + $ ls -l /proc/self/ns/cgroup
>> + lrwxrwxrwx 1 root root 0 2014-07-15 10:37 /proc/self/ns/cgroup -> cgroup:[4026531835]
>> + $ cat /proc/self/cgroup
>> + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/batchjobs/container_id1
>> +
>> + # unshare(CLONE_NEWCGROUP) and exec /bin/bash
>> + $ ~/unshare -c
>> + [ns]$ ls -l /proc/self/ns/cgroup
>> + lrwxrwxrwx 1 root root 0 2014-07-15 10:35 /proc/self/ns/cgroup -> cgroup:[4026532183]
>> + # From within the new cgroupns, the process sees that it is in the root cgroup
>> + [ns]$ cat /proc/self/cgroup
>> + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/
>> +
>> + # From the global cgroupns:
>> + $ cat /proc/<pid>/cgroup
>> + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/batchjobs/container_id1
>> +
>> + # Unshare cgroupns along with userns and mountns
>> + # The following calls unshare(CLONE_NEWCGROUP|CLONE_NEWUSER|CLONE_NEWNS), then
>> + # sets up the uid/gid map and execs /bin/bash
>> + $ ~/unshare -c -u -m
>
> This command does not issue CLONE_NEWUSER, -U does.
>

I was using a custom unshare binary. But I will update the command line
to be similar to the one in util-linux.
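
For illustration, here is a minimal sketch of what such a helper does
(this is not the exact binary I used; CLONE_NEWCGROUP comes from this
series and is not in the glibc headers yet, so the value below is an
assumption taken from the patches):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000	/* assumed value from this series */
#endif

int main(void)
{
	/* Detach from the current cgroup namespace; the current cgroup
	 * of this process becomes the cgroupns-root. */
	if (unshare(CLONE_NEWCGROUP) < 0) {
		perror("unshare(CLONE_NEWCGROUP)");
		exit(EXIT_FAILURE);
	}
	/* /proc/self/cgroup now reports paths relative to cgroupns-root. */
	execl("/bin/bash", "bash", (char *)NULL);
	perror("execl");
	exit(EXIT_FAILURE);
}
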
>> + # Originally, we were in the /batchjobs/container_id1 cgroup. Mount our
>> + # own cgroup hierarchy.
>> + [ns]$ mount -t cgroup cgroup /tmp/cgroup
>> + [ns]$ ls -l /tmp/cgroup
>> + total 0
>> + -r--r--r-- 1 root root 0 2014-10-13 09:32 cgroup.controllers
>> + -r--r--r-- 1 root root 0 2014-10-13 09:32 cgroup.populated
>> + -rw-r--r-- 1 root root 0 2014-10-13 09:25 cgroup.procs
>> + -rw-r--r-- 1 root root 0 2014-10-13 09:32 cgroup.subtree_control
>
> I've patched libvirt-lxc to issue CLONE_NEWCGROUP and not bind-mount
> cgroupfs into a container. But I'm unable to mount cgroupfs within the
> container; mount(2) is failing with EINVAL. And /proc/self/cgroup still
> shows the cgroup from outside.
>
> ---cut---
> container:/ # ls /sys/fs/cgroup/
> container:/ # mount -t cgroup none /sys/fs/cgroup/

You need to pass the "-o __DEVEL_sane_behavior" flag. Inside the
container, only the unified hierarchy can be mounted, so for now that
flag is needed. I will fix the documentation too.
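
In terms of mount(2), the call that should succeed inside the container
is roughly the following (a sketch; /sys/fs/cgroup is just the example
target from your transcript):

#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	/* "-o __DEVEL_sane_behavior" on the mount command line becomes
	 * the data string of mount(2); without it the kernel returns
	 * EINVAL inside a cgroup namespace, since only the unified
	 * hierarchy can be mounted there. */
	if (mount("none", "/sys/fs/cgroup", "cgroup", 0,
		  "__DEVEL_sane_behavior") < 0) {
		perror("mount");
		return 1;
	}
	return 0;
}
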
> mount: wrong fs type, bad option, bad superblock on none,
>        missing codepage or helper program, or other error
>
>        In some cases useful info is found in syslog - try
>        dmesg | tail or so.
> container:/ # cat /proc/self/cgroup
> 8:memory:/machine/test00.libvirt-lxc
> 7:devices:/machine/test00.libvirt-lxc
> 6:hugetlb:/
> 5:cpuset:/machine/test00.libvirt-lxc
> 4:blkio:/machine/test00.libvirt-lxc
> 3:cpu,cpuacct:/machine/test00.libvirt-lxc
> 2:freezer:/machine/test00.libvirt-lxc
> 1:name=systemd:/user.slice/user-0.slice/session-c2.scope
> container:/ # ls -la /proc/self/ns
> total 0
> dr-x--x--x 2 root root 0 Dec 14 23:02 .
> dr-xr-xr-x 8 root root 0 Dec 14 23:02 ..
> lrwxrwxrwx 1 root root 0 Dec 14 23:02 cgroup -> cgroup:[4026532240]
> lrwxrwxrwx 1 root root 0 Dec 14 23:02 ipc -> ipc:[4026532238]
> lrwxrwxrwx 1 root root 0 Dec 14 23:02 mnt -> mnt:[4026532235]
> lrwxrwxrwx 1 root root 0 Dec 14 23:02 net -> net:[4026532242]
> lrwxrwxrwx 1 root root 0 Dec 14 23:02 pid -> pid:[4026532239]
> lrwxrwxrwx 1 root root 0 Dec 14 23:02 user -> user:[4026532234]
> lrwxrwxrwx 1 root root 0 Dec 14 23:02 uts -> uts:[4026532236]
> container:/ #
>
> # host side
> lxc-os132:~ # ls -la /proc/self/ns
> total 0
> dr-x--x--x 2 root root 0 Dec 14 23:56 .
> dr-xr-xr-x 8 root root 0 Dec 14 23:56 ..
> lrwxrwxrwx 1 root root 0 Dec 14 23:56 cgroup -> cgroup:[4026531835]
> lrwxrwxrwx 1 root root 0 Dec 14 23:56 ipc -> ipc:[4026531839]
> lrwxrwxrwx 1 root root 0 Dec 14 23:56 mnt -> mnt:[4026531840]
> lrwxrwxrwx 1 root root 0 Dec 14 23:56 net -> net:[4026531957]
> lrwxrwxrwx 1 root root 0 Dec 14 23:56 pid -> pid:[4026531836]
> lrwxrwxrwx 1 root root 0 Dec 14 23:56 user -> user:[4026531837]
> lrwxrwxrwx 1 root root 0 Dec 14 23:56 uts -> uts:[4026531838]
> ---cut---
>
> Any ideas?

Please try passing the "-o __DEVEL_sane_behavior" option to the mount
command, as in the sketch above.

> Thanks,
> //richard

Thanks,
--
Aditya