Date: Tue, 26 Aug 2008 09:41:27 -0400
From: Vivek Goyal
To: Paul Menage
Cc: righi.andrea@gmail.com, KAMEZAWA Hiroyuki, Balbir Singh, linux kernel mailing list, Dhaval Giani, Kazunaga Ikeno, Morton Andrew Morton, Thomas Graf, Ulrich Drepper, Steve Olivieri
Subject: Re: [RFC] [PATCH -mm] cgroup: uid-based rules to add processes efficiently in the right cgroup
Message-ID: <20080826134127.GA30312@redhat.com>
In-Reply-To: <6599ad830808251754l146588dax65aeff2cc22ac0c1@mail.gmail.com>

On Mon, Aug 25, 2008 at 05:54:39PM -0700, Paul Menage wrote:
> On Tue, Aug 19, 2008 at 5:57 AM, Vivek Goyal wrote:
> >
> > Same thing will happen if we implement the daemon in user space. A task
> > who does seteuid(), can be swept away to a different cgroup based on
> > rules specified in /etc/cgrules.conf.
>
> Yes, I'm not so keen on a daemon magically pulling things into a
> cgroup based on uid either, for the same reasons.
>
> But a user-space based solution can be much more flexible (e.g. easier
> to configure it to only move tasks from certain source cgroups).
>
> >
> > What do you mean by risk? This is the policy set up by system admin and
> > behaviour would seem consistent as per the policy. If an admin decides
> > that tasks of user "apache" should run into /container/cpu/apache cgroup and
> > if a "root" task does seteuid(apache), then it makes sense to move task
> > to /container/cpu/apache.
>
> The kind of unexpected behaviour I was imagining was when some other
> daemon (e.g. ftpd?) unexpectedly does a setuid to one of the
> magically-controlled users, and results in that daemon being pulled
> into the specified cgroup. For something like cpu maybe that's mostly
> benign (but what moves it back into its original group after it
> switches back to root?)

Once ftpd does seteuid() or setreuid() again to switch its effective user
back to "root", it will be moved back to the original group (root's group).

So the basic question is: if a program temporarily changes its effective
user id to user B, should all resource consumption then be charged to
user B's resources, or should it continue to be charged to the original
cgroup?

I would think that we should move the task temporarily to B's cgroup and
bring it back again upon identity change. At the same time, I can also
understand that this behavior can probably be considered over-intrusive
and some people might want to avoid that.

Two things come to my mind.

- Users who find it too intrusive can just shut down the rules-based
  daemon.

- Or, we can implement selective movement of tasks by the daemon, as
  suggested by you. This will make the system more complex but provides
  more flexibility, in the sense that users can keep the daemon running
  and at the same time control movement of certain tasks.
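
The move-and-return behaviour discussed above can be sketched roughly as
follows. This is only an illustration in Python, not the daemon's actual
code: the function name destination_cgroup, the in-memory rules mapping
(standing in for whatever /etc/cgrules.conf ends up containing), the
movable_sources parameter and the root cgroup path are all assumptions made
for the example.

```python
from typing import Dict, Optional, Set


def destination_cgroup(euid_name: str,
                       current: str,
                       rules: Dict[str, str],
                       movable_sources: Optional[Set[str]] = None) -> str:
    """Return the cgroup a task should be in after an effective-uid change.

    Hypothetical helper: `rules` maps a user name to a cgroup path, and
    `movable_sources`, if given, limits which source cgroups the daemon is
    allowed to pull tasks out of (the "selective movement" option).
    """
    # Selective movement: tasks outside the listed source cgroups stay put.
    if movable_sources is not None and current not in movable_sources:
        return current
    # No rule for this user: leave the task where it is.
    return rules.get(euid_name, current)


# Assumed rule set; only the apache path appears in the discussion above.
rules = {"root": "/container/cpu/root", "apache": "/container/cpu/apache"}

# A root task does seteuid(apache) and is moved to apache's cgroup.
cg = destination_cgroup("apache", "/container/cpu/root", rules)
# It switches back to root and returns to root's cgroup.
cg = destination_cgroup("root", cg, rules)
```

With movable_sources left as None the daemon moves every task that matches
a rule; passing a set of cgroup paths models the more selective variant, at
the cost of one more piece of configuration to maintain.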
> but for other subsystems it could be more
> painful (memory, device access, etc).
>
> >
> > Exactly what kind of scenario do you have in mind when you want the policy
> > to be enforced selectively based on task (tid)?
>
> I was thinking of something like possibly a per-cgroup file (that also
> affected child cgroups) rather than a global file. That would also
> automatically handle multiple hierarchies.
>

So there can be two kinds of controls.

- Create a per-cgroup file, say "group_pinned". Writing 1 to
  "group_pinned" means the daemon will not move tasks out of this cgroup
  upon effective uid/gid changes.

- Provide more fine-grained control, where task movement is controlled
  per thread id rather than per cgroup. In that case every cgroup will
  contain another file, "tasks_pinned", which lists all the tids that
  cannot be moved from this cgroup by the daemon. By default this file
  will be empty and all the tids are movable.

I think initially we can keep things simple and implement "group_pinned",
which provides coarse control over the whole group and pins all the tasks
in that cgroup.

Thoughts?

Thanks
Vivek
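
For illustration, the check a user-space daemon might make before moving a
task could look roughly like the Python sketch below. Neither file exists
today, so everything here is an assumption: the helper name
daemon_may_move, treating a missing file as "not pinned", and storing
whitespace-separated tids in "tasks_pinned". Inheritance by child cgroups
is not modelled.

```python
import os


def _read(path: str) -> str:
    """Return the file's contents, or an empty string if it does not exist."""
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        return ""


def daemon_may_move(cgroup_dir: str, tid: int) -> bool:
    """Decide whether the daemon may move `tid` out of `cgroup_dir`.

    Hypothetical check against the proposed control files.
    """
    # Coarse control: "1" in group_pinned pins every task in this cgroup.
    if _read(os.path.join(cgroup_dir, "group_pinned")).strip() == "1":
        return False
    # Fine-grained control: tasks_pinned lists the tids that must stay.
    pinned = _read(os.path.join(cgroup_dir, "tasks_pinned")).split()
    return str(tid) not in pinned
```

Checking "group_pinned" first keeps the common case to a single small
read; the per-tid list is only consulted when the whole group is not
pinned.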