From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1030567AbXCFN2m (ORCPT );
	Tue, 6 Mar 2007 08:28:42 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1030611AbXCFN2l (ORCPT );
	Tue, 6 Mar 2007 08:28:41 -0500
Received: from MAIL.13thfloor.at ([213.145.232.33]:39215 "EHLO
	MAIL.13thfloor.at" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1030567AbXCFN2k (ORCPT );
	Tue, 6 Mar 2007 08:28:40 -0500
Date: Tue, 6 Mar 2007 14:28:39 +0100
From: Herbert Poetzl
To: Srivatsa Vaddagiri
Cc: Paul Jackson, ckrm-tech@lists.sourceforge.net,
	linux-kernel@vger.kernel.org, xemul@sw.ru, ebiederm@xmission.com,
	winget@google.com, containers@lists.osdl.org, menage@google.com,
	akpm@linux-foundation.org
Subject: Re: [ckrm-tech] [PATCH 0/2] resource control file system - aka
	containers on top of nsproxy!
Message-ID: <20070306132837.GA15495@MAIL.13thfloor.at>
Mail-Followup-To: Srivatsa Vaddagiri, Paul Jackson,
	ckrm-tech@lists.sourceforge.net, linux-kernel@vger.kernel.org,
	xemul@sw.ru, ebiederm@xmission.com, winget@google.com,
	containers@lists.osdl.org, menage@google.com,
	akpm@linux-foundation.org
References: <20070301133543.GK15509@in.ibm.com>
	<20070301113900.a7dace47.pj@sgi.com>
	<20070303093655.GA1028@in.ibm.com>
	<20070303173244.GA16051@MAIL.13thfloor.at>
	<20070305173401.GA17044@in.ibm.com>
	<20070305183937.GC22445@MAIL.13thfloor.at>
	<20070306103940.GA2336@in.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20070306103940.GA2336@in.ibm.com>
User-Agent: Mutt/1.5.11
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 06, 2007 at 04:09:40PM +0530, Srivatsa Vaddagiri wrote:
> On Mon, Mar 05, 2007 at 07:39:37PM +0100, Herbert Poetzl wrote:
> > > That's why nsproxy has pointers to resource control objects,
> > > rather than embedding resource control information in nsproxy
> > > itself.
> >
> > which makes it a (name)space, no?
>
> I tend to agree, yes!
>
> > > This will let different nsproxy structures share the same resource
> > > control objects (ctlr_data) and thus be governed by the same
> > > parameters.
> >
> > as it is currently done for vfs, uts, ipc and soon
> > pid and network l2/l3, yes?
>
> yes (by vfs do you mean mnt_ns?)

yep

> > > Where else do you think the resource control information for a
> > > container should be stored?
> >
> > an alternative for that is to keep the resource
> > stuff as part of a 'context' structure, and keep
> > a reference from the task to that (one less
> > indirection, as we had for vfs before)
>
> something like:
>
> struct resource_context {
> 	int cpu_limit;
> 	int rss_limit;
> 	/* all other limits here */
> };
>
> struct task_struct {
> 	...
> 	struct resource_context *rc;
> };
>
> ?
>
> With this approach, it makes it hard to have task-groupings that are
> unique to each resource.

that is correct ...
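roughly, the two layouts compared (just a sketch for
illustration; the array bound and the usage fields are
made up here, not taken from any posted patch):

	/* flat context: one task-grouping covers all resources,
	 * reached with a single dereference from the task */
	struct resource_context {
		int cpu_limit;
		int rss_limit;
		/* all other limits here */
	};

	/* per-controller objects hung off nsproxy (the ctlr_data
	 * mentioned above): two tasks can share the same cpu
	 * object while pointing at different mem objects */
	struct nsproxy {
		/* existing namespace pointers ... */
		void *ctlr_data[MAX_CTLRS];	/* MAX_CTLRS: made-up bound */
	};

the second layout is what permits the kind of overlapping
assignment in the example below ...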
> For ex: let's say that CPU and Memory need to be divided as follows:
>
> 	CPU : C1 (70%), C2 (30%)
> 	Mem : M1 (60%), M2 (40%)
>
> Tasks T1, T2, T3, T4 are assigned to these resource classes as
> follows:
>
> 	C1 : T1, T3
> 	C2 : T2, T4
> 	M1 : T1, T4
> 	M2 : T2, T3
>
> We had a lengthy discussion on this requirement here:
>
> 	http://lkml.org/lkml/2006/11/6/95
> 	http://lkml.org/lkml/2006/11/1/239
>
> Linus also has expressed a similar view here:
>
> 	http://lwn.net/Articles/94573/

you probably could get that flexibility by grouping certain
limits into a separate struct, but IMHO the real-world use of
this is limited, because the resource limits usually fulfill
only one purpose: protection from malicious users and DoS
prevention

groups like Memory, Disk Space, Sockets might make sense,
though we never had a single request for any overlap in the
resource management (while we have quite a few users of
overlapping Network spaces)

> Paul Menage's patches (and their clone, rcfs) allow this flexibility
> by simply mounting different hierarchies:
>
> 	mount -t container -o cpu none /dev/cpu
> 	mount -t container -o mem none /dev/mem
>
> The task-groups created under /dev/cpu can be completely independent
> of the task-groups created under /dev/mem.
>
> Lumping together all resource parameters in one struct (like
> resource_context above) makes it difficult to provide this feature.
>
> Now can we live w/o this flexibility? Maybe, I don't know for sure.
> Since (the stability of) the user interface is in question, we need
> to take a careful decision here.

I don't like the dev/filesystem interface at all,
but I can probably live with it :)

> > > then other dereferences (->ctlr_data[] and ->limit) should be
> > > fast, as they should be in the cache?
> >
> > please provide real-world numbers from testing ...
>
> What kind of testing did you have in mind?

for example, implement RSS/VM limits and run memory-intensive
tests like kernel building or so, and see that the accounting
and limit checks do not add measurable overhead (a sketch of
such a check is in the PS below) ...

something similar could be done for socket/ipc accounting
and multithreaded network tests (apache comes to mind)

HTH,
Herbert

> --
> Regards,
> vatsa
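PS: the kind of hot-path check I mean above, as a rough
sketch on top of the flat resource_context layout (the
function name, the rss_usage field and the exact hook point
are made up, not from any posted patch):

	/* hypothetical accounting hook, e.g. called whenever
	 * pages get charged to a task; with the flat context
	 * this costs one dereference plus a compare */
	static inline int rc_charge_rss(struct task_struct *tsk, int pages)
	{
		struct resource_context *rc = tsk->rc;

		if (rc->rss_usage + pages > rc->rss_limit)
			return -ENOMEM;		/* over limit, refuse */
		rc->rss_usage += pages;
		return 0;
	}

a kernel build with and without such checks compiled in should
show whether the extra dereference and compare disappear in the
noise or not ...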