From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753166AbbAGOrw (ORCPT ); Wed, 7 Jan 2015 09:47:52 -0500
Received: from out01.mta.xmission.com ([166.70.13.231]:38923
	"EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751943AbbAGOru (ORCPT );
	Wed, 7 Jan 2015 09:47:50 -0500
From: ebiederm@xmission.com (Eric W. Biederman)
To: Richard Weinberger
Cc: Aditya Kali, Tejun Heo, Li Zefan, Serge Hallyn, Andy Lutomirski,
	cgroups mailinglist, "linux-kernel\@vger.kernel.org", Linux API,
	Ingo Molnar, Linux Containers, Rohit Jnagal, Vivek Goyal
References: <1417744550-6461-1-git-send-email-adityakali@google.com>
	<1417744550-6461-9-git-send-email-adityakali@google.com>
	<548E17CE.8010704@nod.at> <54AB15BD.8020007@nod.at>
	<87lhlgpyxk.fsf@x220.int.ebiederm.org> <54AB2992.6060707@nod.at>
	<54ACFC38.5070007@nod.at>
Date: Wed, 07 Jan 2015 08:45:05 -0600
In-Reply-To: <54ACFC38.5070007@nod.at> (Richard Weinberger's message of
	"Wed, 07 Jan 2015 10:28:24 +0100")
Message-ID: <87fvbmir9q.fsf@x220.int.ebiederm.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [PATCHv3 8/8] cgroup: Add documentation for cgroup namespaces
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Richard Weinberger writes:

> On 07.01.2015 at 00:20, Aditya Kali wrote:
>> I understand your point. But it will add some complexity to the code.
>>
>> Before trying to make it work for non-unified hierarchy cases, I would
>> like to get a clearer idea.
>> What do you expect to be mounted when you run:
>> container:/ # mount -t cgroup none /sys/fs/cgroup/
>> from inside the container?
>>
>> Note that cgroup-namespace won't be able to change the way cgroups are
>> mounted .. i.e., if say cpu and cpuacct subsystems are mounted
>> together at a single mount-point, then we cannot mount them any other
>> way (inside a container or outside). This restriction exists today and
>> cgroup-namespaces won't change that.
>
> I wondered why cgroup namespaces won't change that and looked at your patches
> in more detail.
> What you propose as cgroup namespace is much more a cgroup chroot() than
> a namespace.
> As you pass relative paths into the namespace you depend on the mount structure
> of the host side.
> Hence, the abstraction between namespaces happens on the mount paths of the initial
> cgroupfs. But we really want a new cgroupfs instance within a container and not just
> a cut out of the initial cgroupfs mount.
>
> I fear your approach is oversimplified and won't work for all cases. It may work
> for your specific use case at Google but we really want something generic.
> Eric, what do you think?

I think I probably need to go back upthread and read the patches.

I think it is a reasonable practical requirement that a widely used,
long-term-supported distribution like RHEL 7 be able to run in a Linux
container, bizarre init system and all, and that our abstractions be
such that we are able to migrate such a beast.

There are a couple of issues in play, and I think we need actual testing,
rather than reports that something shouldn't work, before we reject a
set of patches. Aditya, in one of his replies to me, has reported a
configuration that he expects will work, so I think that configuration
needs to be tested. cgroups is a weird beast and the problems tend not
to lie where a person would first expect.

I suspect no one strongly cares whether the cgroup hierarchy is unified
or not. By unified hierarchy I mean that every mount of cgroupfs has
the same directories with the same processes in each directory. I do
think people will care which controllers show up in different mounts of
cgroupfs, and I think that is relevant to process migration.

I am going to segue into the scope of what is achievable with a cgroup
namespace.

- If there are files in cgroupfs that are not safe to delegate, we
  cannot support those files in a container. Last I looked there were
  such files and systemd used them.

- Which controllers share hierarchies of processes to track resources
  is a core cgroup issue and not a cgroup namespace issue. If we find
  problems with using unified hierarchy support, we need to go fix
  cgroups in general, not cgroupfs.

Eric
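
For anyone who wants to test a configuration rather than argue about it,
here is a minimal sketch of how one might check which controllers share a
hierarchy, and whether a process sits at the same path in every hierarchy.
It assumes the "hierarchy-ID:controller-list:path" format of
/proc/<pid>/cgroup. The helper names and the SAMPLE text are hypothetical
and only for illustration; they are not taken from the patches. The path
comparison is only a rough proxy for the definition of unified hierarchy
given above, since it looks at a single process.

```python
from collections import namedtuple

# One line of /proc/<pid>/cgroup: "hierarchy-ID:controller-list:path".
Entry = namedtuple("Entry", "hierarchy_id controllers path")


def parse_proc_cgroup(text):
    """Parse text in the /proc/<pid>/cgroup format into Entry tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only; the path is the remainder.
        hid, controllers, path = line.split(":", 2)
        names = tuple(c for c in controllers.split(",") if c)
        entries.append(Entry(int(hid), names, path))
    return entries


def co_mounted(entries):
    """Return controller groups that share a single hierarchy."""
    return [e.controllers for e in entries if len(e.controllers) > 1]


def same_path_everywhere(entries):
    """True if the process is at the same cgroup path in every hierarchy."""
    return len({e.path for e in entries}) <= 1


# Hypothetical sample; on a real system read /proc/self/cgroup instead.
SAMPLE = """\
4:cpu,cpuacct:/container1
3:memory:/container1
2:freezer:/
"""

entries = parse_proc_cgroup(SAMPLE)
print(co_mounted(entries))            # controller groups mounted together
print(same_path_everywhere(entries))  # rough single-process proxy for "unified"
```

Running the same check inside and outside a container would show whether
the controller grouping differs between the two views.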