From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1757573AbbE3Gun (ORCPT ); Sat, 30 May 2015 02:50:43 -0400
Received: from mail-ie0-f180.google.com ([209.85.223.180]:36075 "EHLO
        mail-ie0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1750799AbbE3Guf (ORCPT );
        Sat, 30 May 2015 02:50:35 -0400
MIME-Version: 1.0
X-Originating-IP: [122.106.150.15]
In-Reply-To: <20150528204051.GB27479@htj.duckdns.org>
References: <1431960667-26593-1-git-send-email-cyphar@cyphar.com>
        <1431960667-26593-9-git-send-email-cyphar@cyphar.com>
        <20150519080055.GA3644@twins.programming.kicks-ass.net>
        <20150528204051.GB27479@htj.duckdns.org>
Date: Sat, 30 May 2015 16:50:34 +1000
Message-ID:
Subject: Re: [PATCH v12 8/8] cgroup: implement the PIDs subsystem
From: Aleksa Sarai
To: Tejun Heo
Cc: Thomas Gleixner, Peter Zijlstra, lizefan@huawei.com, mingo@redhat.com,
        richard@nod.at, Frédéric Weisbecker, linux-kernel@vger.kernel.org,
        cgroups@vger.kernel.org
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

>> That's complete and utter nonsense. What has the parent limit to do
>> with the overflow of the child limit?
>>
>> parent: limit 100 usecnt 80
>> child: limit 10 usecnt 10
>>
>> So moving anything into child is violating the constraints and has to
>> be refused. Anything else is just dirty hackery.
>
> And the one who's moving the process there might as well raise the
> limit in the child all the same. It doesn't make any difference
> without delegation and with delegation we need to restrict migration
> at the exactly same junctions. We can't delegate otherwise. And the
> resource limit for the delegated subtree is enforced from its parent
> which delegatee can't escape how it changes the configuration or moves
> processes around.

Here's a case where we've delegated a subtree, as an example of how a
delegated subtree can't overcome `subtree_parent`'s limit -- and by
extension `parent`'s limit:

  parent: limit=128 usage=64
  -- subtree_parent: limit=64 usage=32
  ---- subtree_child: limit=2 usage=1

If you delegate a subtree (such that a process cannot attach processes
to `parent`), then it is not possible for the subtree to violate
`subtree_parent`'s limit. This is because the ability to migrate a
process mid-fork relies on the ability to *actually* fork in the
_original_ cgroup (`subtree_parent` or `subtree_child` [which requires
the ability to fork in `subtree_parent`]). Once you've hit
`subtree_parent`'s limit, there's no way for you to violate that limit.

The only other method I can think of is if you do the mid-fork thing to
migrate into `subtree_child`, then you migrate the two processes into
`subtree_parent`. This won't help you either, because if you then
continue and try to fork in `subtree_child` and then migrate, you'll be
blocked if the fork would violate `subtree_parent`'s limit.

If you try to attach a process that is mid-fork to `subtree_child`,
you'll bump the usage count to 3 (while this is bad, I can't really
think of any way we can tell can_attach() that the process is mid-fork).
If you do it again (because we don't stop can_attach()), you aren't
blocked by the fact that you're attaching to a cgroup that has already
exceeded its limit, so you'll bump the count to 5 -- this I can
understand would _seem_ to indicate a broken controller. And you /can/
continue this ad infinitum -- up _until_ you run out of the ability to
make new processes inside `subtree_parent` (which *will* happen). At
that point, can_fork() will fail the fork in `subtree_parent`, before
you can attempt to migrate mid-fork.

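To make the accounting concrete, here is a toy model of the hierarchy
above. It is a sketch in Python, not the patch's code: the `Cgroup`
class, `attach()`, `mid_fork_migrate()` and `demo()` are made up for
illustration, and the split of tasks sitting directly in each cgroup
(32 in `parent`, 31 in `subtree_parent`) is an assumption chosen so the
hierarchical usage matches 64/32/1. It assumes fork is checked against
every ancestor's limit while attach is never checked, and it
approximates a mid-fork migration as a checked fork followed by moving
the forking task and its new child.

```python
class Cgroup:
    """Toy pids cgroup: `tasks` counts processes directly in this group."""

    def __init__(self, name, limit, tasks, parent=None):
        self.name = name
        self.limit = limit
        self.tasks = tasks
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    @property
    def usage(self):
        # Hierarchical usage: own tasks plus everything in descendants.
        return self.tasks + sum(child.usage for child in self.children)

    def fork(self):
        """Checked path (stands in for can_fork()): every level needs room."""
        cg = self
        while cg is not None:
            if cg.usage + 1 > cg.limit:
                return False
            cg = cg.parent
        self.tasks += 1
        return True


def attach(src, dst, count):
    """Unchecked path (stands in for can_attach()): limits are not consulted."""
    if src.tasks < count:
        raise ValueError("not enough tasks in source cgroup")
    src.tasks -= count
    dst.tasks += count


def mid_fork_migrate(src, dst):
    """Approximation: checked fork in src, then move the forking task and
    its new child into dst without any limit check."""
    if not src.fork():
        return False
    attach(src, dst, 2)
    return True


def demo():
    # Assumed direct task counts; hierarchical usage is 64 / 32 / 1.
    parent = Cgroup("parent", limit=128, tasks=32)
    subtree_parent = Cgroup("subtree_parent", limit=64, tasks=31, parent=parent)
    subtree_child = Cgroup("subtree_child", limit=2, tasks=1, parent=subtree_parent)

    seen = []
    for _ in range(2):
        mid_fork_migrate(subtree_parent, subtree_child)
        seen.append(subtree_child.usage)

    # Use up the remaining room in subtree_parent with ordinary forks.
    while subtree_parent.fork():
        pass

    # With subtree_parent full, the fork half of the trick is refused.
    blocked = not mid_fork_migrate(subtree_parent, subtree_child)
    return seen, subtree_parent.usage, subtree_child.usage, blocked


print(demo())  # ([3, 5], 64, 5, True)
```

In this model `subtree_child` ends up over its own limit of 2, but
`subtree_parent` stops at exactly 64: moving tasks around inside the
subtree never changes `subtree_parent`'s hierarchical usage, and only
the checked fork path can raise it.
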
And I just want to point out that if you have the ability to attach
processes to `subtree_child`, then you *already* have the right to
violate its set limit through attach anyway (or by just changing the
limit) -- so the fact you can do this mid-fork isn't untoward at all.
If a user has the ability to just disable the cgroup's limit, then why
should that same user be hampered when attempting to attach processes
to said cgroup (which is an administrative operation -- so you'd assume
that they're clever enough to know that migration into a cgroup may bump
usage so it's greater than the limit [or that they just RTFM'd])?

--
Aleksa Sarai (cyphar)
www.cyphar.com