linux-kernel.vger.kernel.org archive mirror
From: Vivek Goyal <vgoyal@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>,
	containers@lists.linux-foundation.org, cgroups@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Kay Sievers <kay.sievers@vrfy.org>,
	Lennart Poettering <lennart@poettering.net>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	linux-kernel@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [RFD] cgroup: about multiple hierarchies
Date: Wed, 22 Feb 2012 11:38:58 -0500
Message-ID: <20120222163858.GB4128@redhat.com>
In-Reply-To: <20120221211938.GE12236@google.com>

On Tue, Feb 21, 2012 at 01:19:38PM -0800, Tejun Heo wrote:

[..]
> 3. Head towards single hierarchy with the pie-in-the-sky goal of
>    merging things into process hierarchy in some distant future.
> 
>    The first step would be herding people to use a unified hierarchy
>    (ie. all subsystems mounted on a single cgroup tree) which is
>    controlled by single entity in userland (be it systemd or cgroupd,
>    cgroup-kit or whatever); however, even if we exclude supporting
>    orthogonal categorizations, there are good number of non-trivial
>    hurdles to clear before this can be realized.

Apart from orthogonal categorizations, one advantage of multiple
hierarchies is that you don't have to use a controller if you don't
want to. (Just don't create a cgroup in that controller's hierarchy.)

This is not ideal, but in practice it might be helpful. Cgroups might
not come cheap, and different controllers might have different overheads
associated with them. For example, in the blkio controller we can end
up idling a lot as the number of cgroups increases. In that case a
better way might be to use blkio cgroups selectively: take any workload
which is destroying the performance of others and move it out into a
separate blkio group.
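
To make the "selective" use concrete, here is a rough Python sketch of
moving one misbehaving process into its own blkio group. The mount
point /sys/fs/cgroup/blkio, the group name and the weight value are
assumptions for illustration only; blkio.weight and tasks are the usual
blkio control files, but check your own setup before relying on this.

```python
import os


def isolate_in_blkio_group(pid, group, weight=100,
                           blkio_root="/sys/fs/cgroup/blkio"):
    """Create a blkio cgroup and move a single process into it.

    blkio_root is an assumed mount point; pass the real one for your
    system. The weight is illustrative, not a recommendation.
    """
    group_dir = os.path.join(blkio_root, group)
    # On a real cgroup filesystem, mkdir creates the group and the
    # kernel populates its control files.
    os.makedirs(group_dir, exist_ok=True)
    with open(os.path.join(group_dir, "blkio.weight"), "w") as f:
        f.write(str(weight))
    # Writing a PID to "tasks" moves that task into the group.
    with open(os.path.join(group_dir, "tasks"), "w") as f:
        f.write(str(pid))
    return group_dir
```

Everything that is not moved this way stays in the root group, so only
the offending workload pays the price of its own blkio group.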

This is not an ideal situation, but that's how things currently are.

By default systemd creates cgroups only in the cpu hierarchy (apart from
the named systemd hierarchy it uses to keep track of groups/processes).
By default it does not make use of other controllers or put any
restrictions on processes/services apart from cpu. Having a separate
hierarchy for every controller at least allows that easily.
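
One way to see which hierarchies a process actually sits in is to read
/proc/<pid>/cgroup, where each line has the form
"hierarchy-ID:controller-list:path". A small Python sketch of parsing
it follows; the sample text and the service path in it are made up for
illustration, not taken from a real system.

```python
def parse_proc_cgroup(text):
    """Map each controller list (or name=...) to its cgroup path."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first two colons only; the path may contain ':'.
        _hier_id, controllers, path = line.split(":", 2)
        entries[controllers] = path
    return entries


# Hypothetical contents of /proc/<pid>/cgroup for a service that is
# tracked in the named systemd hierarchy and grouped only under cpu.
sample = "2:cpu:/system/sshd.service\n1:name=systemd:/system/sshd.service\n"
membership = parse_proc_cgroup(sample)
```

In this made-up sample there is no blkio or memory entry at all, which
is the "not using a controller" case described above.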

> 
>    Most importantly, we would need to clean up how nesting is handled
>    across different subsystems.  Handling internal and leaf nodes as
>    equals simply can't work.  Membership should be recursive, and for
>    subsystems which can't support proper nesting, the right thing to
>    do would be somehow ensuring that only single node in the path from
>    root to leaf is active for the controller.  We may even have to
>    introduce an alternative of operation to support this (yuck).
> 
>    This path would require the most amount of work and we would be
>    excluding a feature - support for multiple orthogonal
>    categorizations - which has been available till now, probably
>    through deprecation process spanning years; however, this at least
>    gives us hope that we may reach sanity in the end, how distant that
>    end may be.  Oh, hope. :)

Yes, this is something that needs to be cleaned up. Everybody seems to
have dealt with hierarchy in their own way.

For the blkio controller, we initially provided fully nested hierarchies
like the cpu controller, but the implementation became too complex (CFQ
is already complicated, and implementing fully nested hierarchies made
it much more complicated without any significant gain). So I converted
it to a flat model where internally we treat the whole hierarchy as
flat. (It might have been a bad decision though.)

So we can convert the blkio controller to a fully nested hierarchy at
the expense of more complex code in CFQ. I think the memory cgroup
controller provides both flat and hierarchical modes. Keeping it fully
hierarchical also increases the cost, as we need to traverse a lot more
pointers for simple things like nested stats. On a system running
both systemd and libvirt, every virtual machine is already 3-4 levels
deep in the cgroup hierarchy.
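
To illustrate the pointer-chasing cost, here is a toy Python model (not
kernel code) of updating a stat in flat versus fully nested mode. The
Group class, the function names and the group names are all made up
for this sketch.

```python
class Group:
    """Toy cgroup node: a name, a parent pointer and one byte counter."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.bytes = 0


def charge_flat(group, nbytes):
    """Flat model: only the group itself is updated. Returns nodes touched."""
    group.bytes += nbytes
    return 1


def charge_nested(group, nbytes):
    """Nested model: walk parent pointers and update every ancestor."""
    touched = 0
    node = group
    while node is not None:
        node.bytes += nbytes
        node = node.parent
        touched += 1
    return touched


# Hypothetical layout for a VM sitting four levels below the root.
root = Group("/")
machine = Group("machine", root)
libvirt = Group("libvirt", machine)
qemu = Group("qemu", libvirt)
vm = Group("vm1", qemu)
```

With this layout, charge_flat(vm, n) touches one node while
charge_nested(vm, n) touches five, and that extra walk is paid on every
update.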

Trying to make all the controllers uniform in their treatment of the
cgroup hierarchy sounds like a good thing to do. Once that is done,
one can probably see whether it is worth putting all the controllers
in a single hierarchy.

Thanks
Vivek
