From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1422966AbXCOQ5v (ORCPT );
	Thu, 15 Mar 2007 12:57:51 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1422967AbXCOQ5v (ORCPT );
	Thu, 15 Mar 2007 12:57:51 -0400
Received: from e32.co.us.ibm.com ([32.97.110.150]:54198 "EHLO
	e32.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1422966AbXCOQ5u (ORCPT );
	Thu, 15 Mar 2007 12:57:50 -0400
Date: Thu, 15 Mar 2007 22:34:35 +0530
From: Srivatsa Vaddagiri
To: "Paul Menage"
Cc: xemul@sw.ru, dev@sw.ru, pj@sgi.com, sam@vilain.net,
	ebiederm@xmission.com, winget@google.com, serue@us.ibm.com,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	ckrm-tech@lists.sourceforge.net, containers@lists.osdl.org
Subject: Re: Summary of resource management discussion
Message-ID: <20070315170435.GA28692@in.ibm.com>
Reply-To: vatsa@in.ibm.com
References: <20070312124226.GD17151@in.ibm.com> <6599ad830703150424t3478cd55mf9d2699f3669c9f0@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <6599ad830703150424t3478cd55mf9d2699f3669c9f0@mail.gmail.com>
User-Agent: Mutt/1.5.11
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 15, 2007 at 04:24:37AM -0700, Paul Menage wrote:
> If there really was a grouping that was always guaranteed to match the
> way you wanted to group tasks for e.g. resource control, then yes, it
> would be great to use it. But I don't see an obvious candidate. The
> pid namespace is not it, IMO.

In vserver context, what is the "normal" case then? At least for Linux
Vserver, the pid namespace seems to be the normal unit of resource
control (as per Herbert).

Even if one wanted to manage an arbitrary group of tasks in vserver
context, IMHO it's still possible to construct that arbitrary group using
the existing pointer, ns[/task]proxy, and not break existing namespace
semantics/functionality.

So the normal case I see is:

 pid_ns1  uts_ns1  cpu_ctl_space1     pid_ns2  uts_ns2  cpu_ctl_space2
    ^        ^         (50%)             ^        ^         (50%)
    |        |           ^               |        |           ^
    |        |           |               |        |           |
 ---------------------------          ---------------------------
 |       task_proxy1       |          |       task_proxy2       |
 |       (Vserver1)        |          |       (Vserver2)        |
 ---------------------------          ---------------------------

But if someone wanted to manage the cpu resource differently, and say
that postgres tasks from both vservers should be in the same cpu resource
class, the above becomes:

 pid_ns1  uts_ns1  cpu_ctl_space1     pid_ns1  uts_ns1  cpu_ctl_space2
    ^        ^         (25%)             ^        ^         (50%)
    |        |           ^               |        |           ^
    |        |           |               |        |           |
 ---------------------------          --------------------------------
 |       task_proxy1       |          |         task_proxy2          |
 |       (Vserver1)        |          | (postgres tasks in VServer1) |
 ---------------------------          --------------------------------

 pid_ns2  uts_ns2  cpu_ctl_space3     pid_ns2  uts_ns2  cpu_ctl_space2
    ^        ^         (25%)             ^        ^         (50%)
    |        |           ^               |        |           ^
    |        |           |               |        |           |
 ---------------------------          --------------------------------
 |       task_proxy3       |          |         task_proxy4          |
 |       (Vserver2)        |          | (postgres tasks in VServer2) |
 ---------------------------          --------------------------------

(the best I could draw using ASCII art!)

The benefit I see in this approach is that it avoids introducing
additional pointers in struct task_struct and additional structures
(struct container etc) in the kernel, while we can still retain the same
user interfaces you had in your patches.

Do you see any drawbacks in doing it this way? What will break if we do
this?

> Resource control (and other kinds of task grouping behaviour) shouldn't
> require virtualization.

Certainly. AFAICS, nsproxy[.c] is unconditionally available in the
kernel (even if virtualization support is not enabled).
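
To make the second diagram concrete, here is a rough userspace sketch in
C of how the four proxies could be laid out. This is only an
illustration: the struct layouts, the share_pct field, the helper
functions and the "httpd" task names are all made up for the example and
are not the kernel's real definitions. The only thing taken from the
diagram is which proxy points at which namespace and cpu_ctl_space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct pid_ns { int id; };
struct uts_ns { int id; };
struct cpu_ctl_space { int share_pct; };	/* CPU share in percent */

/* One proxy per distinct (pid_ns, uts_ns, cpu_ctl_space) combination. */
struct task_proxy {
	struct pid_ns *pid_ns;
	struct uts_ns *uts_ns;
	struct cpu_ctl_space *cpu_ctl;
};

/* A task carries a single pointer; no extra per-task container pointer. */
struct task {
	const char *comm;
	struct task_proxy *proxy;
};

static struct pid_ns pid_ns1 = { 1 }, pid_ns2 = { 2 };
static struct uts_ns uts_ns1 = { 1 }, uts_ns2 = { 2 };
static struct cpu_ctl_space cpu_ctl_space1 = { 25 };
static struct cpu_ctl_space cpu_ctl_space2 = { 50 };	/* shared postgres class */
static struct cpu_ctl_space cpu_ctl_space3 = { 25 };

/* The four proxies from the second diagram. */
static struct task_proxy task_proxy1 = { &pid_ns1, &uts_ns1, &cpu_ctl_space1 };
static struct task_proxy task_proxy2 = { &pid_ns1, &uts_ns1, &cpu_ctl_space2 };
static struct task_proxy task_proxy3 = { &pid_ns2, &uts_ns2, &cpu_ctl_space3 };
static struct task_proxy task_proxy4 = { &pid_ns2, &uts_ns2, &cpu_ctl_space2 };

/* Example tasks; "httpd" is just a placeholder for a non-postgres task. */
struct task httpd_vs1    = { "httpd",    &task_proxy1 };
struct task postgres_vs1 = { "postgres", &task_proxy2 };
struct task httpd_vs2    = { "httpd",    &task_proxy3 };
struct task postgres_vs2 = { "postgres", &task_proxy4 };

/* CPU share of the resource class the task currently belongs to. */
int task_cpu_share(const struct task *t)
{
	return t->proxy->cpu_ctl->share_pct;
}

/* Two tasks are in the same cpu class if they share a cpu_ctl_space. */
int same_cpu_class(const struct task *a, const struct task *b)
{
	return a->proxy->cpu_ctl == b->proxy->cpu_ctl;
}

/* Namespace membership is unaffected by the cpu class regrouping. */
int same_pid_ns(const struct task *a, const struct task *b)
{
	return a->proxy->pid_ns == b->proxy->pid_ns;
}

/* Sanity check of the layout drawn in the diagram. */
void check_layout(void)
{
	assert(same_cpu_class(&postgres_vs1, &postgres_vs2));
	assert(!same_pid_ns(&postgres_vs1, &postgres_vs2));
	assert(same_pid_ns(&httpd_vs1, &postgres_vs1));
}
```

The point of the sketch is that the postgres tasks of both vservers end
up in one cpu class while each stays in its own vserver's pid and uts
namespaces, at the cost of two extra proxy objects.
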
When reused purely for resource control, I see that as a special case of
virtualization where only resources are virtualized and namespaces are
not.

I think an interesting question would be: what more task-grouping
behavior do you want to implement using an additional pointer that you
can't implement by reusing ->task_proxy?

> >a. Paul Menage's patches:
> >
> >	(tsk->containers->container[cpu_ctlr.subsys_id] - X)->cpu_limit
>
> So what's the '-X' that you're referring to

Oh, that's to seek the pointer back to the beginning of the cpulimit
structure (the subsys pointer in 'struct container' points to a structure
embedded in a larger structure; -X gets you to the larger structure).

> >6. As tasks move around namespaces/resource-classes, their
> >   tsk->nsproxy/containers object will change. Do we simple create
> >   a new nsproxy/containers object or optimize storage by searching
> >   for one which matches the task's new requirements?
>
> I think the latter.

Yes, me too. But maybe, to keep it simple in initial versions, we should
avoid that optimisation and at the same time gather statistics on
duplicates?

-- 
Regards,
vatsa
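
P.S. For anyone puzzled by the '-X' above, here is a minimal userspace
sketch of the idea, written in the style of the kernel's container_of()
idiom. The struct names, field names, array size and subsystem id below
are invented for the example and are not taken from Paul's patches; only
the "subtract the member offset to reach the enclosing structure" step
is the point.

```c
#include <stddef.h>

/* Simplified version of the container_of() idiom: step back from a
 * pointer to an embedded member to the structure that encloses it. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical generic per-subsystem state that 'struct container' points at. */
struct subsys_state {
	int refcnt;
};

/* Hypothetical controller-private structure embedding the generic state. */
struct cpu_limit_state {
	int cpu_limit;			/* limit in percent */
	struct subsys_state css;	/* embedded member; 'X' is its offset */
};

#define NR_SUBSYS		4	/* arbitrary size for the sketch */
#define CPU_CTLR_SUBSYS_ID	1	/* arbitrary id for the sketch */

struct container {
	struct subsys_state *subsys[NR_SUBSYS];
};

/* Equivalent of (container[subsys_id] - X): recover the larger structure. */
struct cpu_limit_state *container_cpu_limit(struct container *c)
{
	return container_of(c->subsys[CPU_CTLR_SUBSYS_ID],
			    struct cpu_limit_state, css);
}

/* The 'X' in the expression: offset of the embedded member. */
size_t demo_offset(void)
{
	return offsetof(struct cpu_limit_state, css);
}

/* Build one container, then read the limit back through the embedded pointer. */
int demo_cpu_limit(void)
{
	static struct cpu_limit_state cl = { .cpu_limit = 50 };
	struct container c = { { NULL } };

	c.subsys[CPU_CTLR_SUBSYS_ID] = &cl.css;
	return container_cpu_limit(&c)->cpu_limit;
}
```

Placing css after cpu_limit in the sketch is deliberate, so that the
offset is non-zero and the subtraction actually does something.
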