All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Jackson <pj@sgi.com>
To: Herbert Poetzl <herbert@13thfloor.at>
Cc: dev@openvz.org, ckrm-tech@lists.sourceforge.net,
	linux-kernel@vger.kernel.org, xemul@sw.ru, ebiederm@xmission.com,
	winget@google.com, containers@lists.osdl.org, menage@google.com,
	akpm@linux-foundation.org
Subject: Re: [PATCH 0/2] resource control file system - aka containers on top of nsproxy!
Date: Sat, 3 Mar 2007 13:22:44 -0800	[thread overview]
Message-ID: <20070303132244.80b5cb40.pj@sgi.com> (raw)
In-Reply-To: <20070303174553.GB16051@MAIL.13thfloor.at>

Herbert wrote:
> I agree here, there is not much difference for the
> following aspects:

Whether two somewhat similar needs should be met by one shared
mechanism, or two distinct mechanisms, cannot really be decided by
listing the similarities.

One has to determine if there are any significant differences in
needs that are too difficult for a shared mechanism to provide.

A couple of things you wrote in your second message might
touch on such possible significant differences:
 - resources must be hierarchically suballocated, and
 - key resource management code hooks can't cause hot cache lines.

In a later message, Herbert wrote:
> well, the thing is, as nsproxy is working now, you
> will get a new one (with a changed subset of entries)
> every time a task does a clone() with one of the 
> space flags set, which means, that you will end up
> with quite a lot of them, but resource limits have
> to address a group of them, not a single nsproxy
> (or act in a deeply hierarchical way which is not
> there atm, and probably will never be, as it simply
> adds too much overhead)

I still can't claim to have my head around this, but what you write
here, Herbert, writes here touches on what I suspect is a key
difference between namespaces and resources that would make it
impractical to accomplish both with a shared mechanism for aggregating
tasks.

It is a natural and desirable capability when managing resources, which
are relatively scarce (that's why they're worth all this trouble)
commodities of which we have some limited amount, to subdivide
allowances of them.  Some group of tasks gets the right to use certain
memory pages or cpu time slices, and in turn suballocates that allotment
to some subgroup of itself.  This naturally leads to a hierarchy of
allocated resources.

There is no such necessary, hierarchy for name spaces. One name space
might be derived from another at setup, by some arbitrary conventions,
but once initialized, this way or that, they are separate name spaces,
or at least naturally can (must?) be separate.

The cpuset hierarchy is an important part of the API that cpusets
presents to user space, where that hierarchy reflects the suballocation
of resources.  If B is a child of A in the cpuset hierarchy, then
the CPUs and Memory Nodes allowed to B -must- be a subset of those allowed
to A.  That is the key semantic of the cpuset hierarchy.  This includes
forcing the removal of a resource from B if for some reason it must
be removed from A, in order to preserve the hierarchical suballocation,
which requirement is causing a fair bit of hard work for the cpu hot
unplug folks.

I am quite willing to believe that name spaces has no need for such a
hierarchy, and further that it probably never will have such ... "too
much overhead" as you say.


> > It should have the same perf overhead as the original
> > container patches (basically a double dereference -
> > task->containers/nsproxy->cpuset - required to get to the 
> > cpuset from a task).
> 
> on every limit accounting or check? I think that
> is quite a lot of overhead ...

Do either of these dereferences require locks?

The two critical resources that cpusets manages, memory pages and
time slices on a cpu, cannot afford such dereferences or locking
in the key code paths (allocating a page or scheduling a cpu.)  The
existing cpuset code is down to one RCU guarded dereference of
current->cpuset in the page allocation code path (and then only on
systems actively using cpusets), and no such dereferences at all in the
scheduler path.

It took a fair bit of hard work (for someone of my modest abilities)
to get that far; I doubt we can accept much regression on this point.

Most likely the other folks doing resource management will have similar
concerns in many cases - memory pages and cpu slices are not the only
resources we're trying to manage on critical code paths.

In short - the issues seem to be:
 - resources need to be hierarchical, name spaces don't (can't?), and
 - no hot cache lines allowed by the resource hooks in key code paths.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

  reply	other threads:[~2007-03-03 21:22 UTC|newest]

Thread overview: 125+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-01 13:35 [PATCH 0/2] resource control file system - aka containers on top of nsproxy! Srivatsa Vaddagiri
2007-03-01 13:45 ` [PATCH 1/2] rcfs core patch Srivatsa Vaddagiri
2007-03-01 16:31   ` Serge E. Hallyn
2007-03-01 16:46     ` Srivatsa Vaddagiri
2007-03-02  5:06   ` [ckrm-tech] " Balbir Singh
2007-03-03  9:38     ` Srivatsa Vaddagiri
2007-03-08  3:12   ` Eric W. Biederman
2007-03-08  9:10     ` Paul Menage
2007-03-09  0:38       ` Herbert Poetzl
2007-03-09  9:07         ` Kirill Korotaev
2007-03-09 13:29           ` Herbert Poetzl
2007-03-09 17:57         ` Srivatsa Vaddagiri
2007-03-10  1:19           ` Herbert Poetzl
2007-03-11 16:36             ` Serge E. Hallyn
2007-03-12 23:16               ` Herbert Poetzl
2007-03-08 10:13     ` Srivatsa Vaddagiri
2007-03-09  0:48       ` Herbert Poetzl
2007-03-09  2:35         ` Paul Jackson
2007-03-09  9:23         ` Kirill Korotaev
2007-03-09  9:38           ` Paul Jackson
2007-03-09 13:21           ` Herbert Poetzl
2007-03-11 17:09             ` Kirill Korotaev
2007-03-12 23:00               ` Herbert Poetzl
2007-03-13  8:28                 ` Kirill Korotaev
2007-03-13 13:55                   ` Herbert Poetzl
2007-03-13 14:11                     ` [ckrm-tech] " Srivatsa Vaddagiri
2007-03-13 15:52                       ` Herbert Poetzl
2007-03-09 18:14         ` Srivatsa Vaddagiri
2007-03-09 19:25           ` Paul Jackson
2007-03-10  1:00             ` Herbert Poetzl
2007-03-10  1:31               ` Paul Jackson
2007-03-10  0:56           ` Herbert Poetzl
2007-03-09 16:16       ` Serge E. Hallyn
2007-03-01 13:50 ` [PATCH 2/2] cpu_accounting controller Srivatsa Vaddagiri
2007-03-01 19:39 ` [PATCH 0/2] resource control file system - aka containers on top of nsproxy! Paul Jackson
2007-03-02 15:45   ` Kirill Korotaev
2007-03-02 16:52     ` Andrew Morton
2007-03-02 17:25       ` Kirill Korotaev
2007-03-03 17:45     ` Herbert Poetzl
2007-03-03 21:22       ` Paul Jackson [this message]
2007-03-05 17:47         ` Srivatsa Vaddagiri
2007-03-03  9:36   ` Srivatsa Vaddagiri
2007-03-03 10:21     ` Paul Jackson
2007-03-05 17:02       ` Srivatsa Vaddagiri
2007-03-03 17:32     ` Herbert Poetzl
2007-03-05 17:34       ` [ckrm-tech] " Srivatsa Vaddagiri
2007-03-05 18:39         ` Herbert Poetzl
2007-03-06 10:39           ` Srivatsa Vaddagiri
2007-03-06 13:28             ` Herbert Poetzl
2007-03-06 16:21               ` Srivatsa Vaddagiri
2007-03-07  2:32 ` Paul Menage
2007-03-07 17:30   ` Srivatsa Vaddagiri
2007-03-07 17:29     ` Paul Menage
2007-03-07 17:52       ` [ckrm-tech] " Srivatsa Vaddagiri
2007-03-07 17:32     ` Srivatsa Vaddagiri
2007-03-07 17:43     ` Serge E. Hallyn
2007-03-07 17:46       ` Paul Menage
2007-03-07 23:16         ` Eric W. Biederman
2007-03-08 11:39           ` Srivatsa Vaddagiri
2007-03-07 18:00       ` Srivatsa Vaddagiri
2007-03-07 20:58         ` Serge E. Hallyn
2007-03-07 21:20           ` Paul Menage
2007-03-07 21:59             ` Serge E. Hallyn
2007-03-07 22:13               ` Dave Hansen
2007-03-07 23:13                 ` Eric W. Biederman
2007-03-12 14:11               ` [ckrm-tech] " Srivatsa Vaddagiri
2007-03-07 22:32             ` Eric W. Biederman
2007-03-07 23:18               ` Paul Menage
2007-03-08  0:35                 ` Sam Vilain
2007-03-08  0:42                   ` Paul Menage
2007-03-08  0:53                     ` Sam Vilain
2007-03-08  0:58                       ` [ckrm-tech] " Paul Menage
2007-03-08  1:32                         ` Eric W. Biederman
2007-03-08  1:35                           ` Paul Menage
2007-03-08  2:25                             ` Eric W. Biederman
2007-03-09  0:56                             ` Herbert Poetzl
2007-03-09  0:53                           ` Herbert Poetzl
2007-03-09 18:19                             ` Srivatsa Vaddagiri
2007-03-09 19:36                               ` Paul Jackson
2007-03-09 21:52                               ` Herbert Poetzl
2007-03-09 22:06                                 ` Paul Jackson
2007-03-12 14:01                                   ` Srivatsa Vaddagiri
2007-03-12 15:15                                     ` Srivatsa Vaddagiri
2007-03-12 20:26                                     ` Paul Jackson
2007-03-09  4:30                           ` Paul Jackson
2007-03-08  2:47                         ` Sam Vilain
2007-03-08  2:57                           ` Paul Menage
2007-03-08  3:32                             ` Sam Vilain
2007-03-08  6:10                               ` Matt Helsley
2007-03-08  6:44                                 ` Eric W. Biederman
2007-03-09  1:06                                   ` Herbert Poetzl
2007-03-10  9:06                                     ` Sam Vilain
2007-03-11 21:15                                       ` Paul Jackson
2007-03-12  9:35                                         ` Sam Vilain
2007-03-12 10:00                                         ` Paul Menage
2007-03-12 23:21                                           ` Herbert Poetzl
2007-03-13  2:25                                             ` Paul Menage
2007-03-13 15:57                                               ` Herbert Poetzl
2007-03-09  4:37                                 ` Paul Jackson
2007-03-08  6:32                               ` Eric W. Biederman
2007-03-08  9:10                               ` Paul Menage
2007-03-09 16:50                                 ` Serge E. Hallyn
2007-03-22 14:08                                   ` Srivatsa Vaddagiri
2007-03-22 14:39                                     ` Serge E. Hallyn
2007-03-22 14:56                                       ` Srivatsa Vaddagiri
2007-03-09  4:27                       ` Paul Jackson
2007-03-10  8:52                         ` Sam Vilain
2007-03-10  9:11                           ` Paul Jackson
2007-03-09 16:34             ` Srivatsa Vaddagiri
2007-03-09 16:41               ` [ckrm-tech] " Srivatsa Vaddagiri
2007-03-09 22:09               ` Paul Menage
2007-03-10  2:02                 ` Srivatsa Vaddagiri
2007-03-10  3:19                   ` [ckrm-tech] " Srivatsa Vaddagiri
2007-03-12 15:07                 ` Srivatsa Vaddagiri
2007-03-12 15:56                   ` Serge E. Hallyn
2007-03-12 16:20                     ` Srivatsa Vaddagiri
2007-03-12 17:25                       ` Serge E. Hallyn
2007-03-12 21:15                       ` Sam Vilain
2007-03-12 23:31                       ` Herbert Poetzl
2007-03-13  2:22                         ` Srivatsa Vaddagiri
2007-03-08  0:50     ` Sam Vilain
2007-03-08 11:30       ` Srivatsa Vaddagiri
2007-03-09  1:16         ` Herbert Poetzl
2007-03-09 18:41           ` Srivatsa Vaddagiri
2007-03-10  2:03             ` Herbert Poetzl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070303132244.80b5cb40.pj@sgi.com \
    --to=pj@sgi.com \
    --cc=akpm@linux-foundation.org \
    --cc=ckrm-tech@lists.sourceforge.net \
    --cc=containers@lists.osdl.org \
    --cc=dev@openvz.org \
    --cc=ebiederm@xmission.com \
    --cc=herbert@13thfloor.at \
    --cc=linux-kernel@vger.kernel.org \
    --cc=menage@google.com \
    --cc=winget@google.com \
    --cc=xemul@sw.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.