From: Tejun Heo <tj@kernel.org>
To: Glauber Costa <glommer@parallels.com>
Cc: Colin Cross <ccross@google.com>,
	cgroups@vger.kernel.org, lkml <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Paul Turner <pjt@google.com>
Subject: Re: [PATCH v5 00/11] per-cgroup cpu-stat
Date: Tue, 22 Jan 2013 17:02:26 -0800
Message-ID: <20130123010226.GF5359@htj.dyndns.org>
In-Reply-To: <50FD3123.6090003@parallels.com>

Hello,

On Mon, Jan 21, 2013 at 04:14:27PM +0400, Glauber Costa wrote:
> > Android userspace is currently using both cpu and cpuacct, and not
> > co-mounting them.  They are used for fundamentally different uses such
> > that creating a single hierarchy for both of them while maintaining
> > the existing behavior is not possible.
> > 
> > We use the cpu cgroup primarily as a priority container.  A simple
> > view is that each thread is assigned to a foreground cgroup when it is
> > user-visible, and a background cgroup when it is not.  The foreground
> > cgroup is assigned a significantly higher cpu.shares value such that
> > when each group is fully loaded the background group will get 5% and
> > the foreground group will get 95%.
> > 
> > We use the cpuacct cgroup to measure cpu usage per uid, primarily to
> > estimate one cause of battery usage.  Each uid gets a cgroup, and when
> > spawning a task for a new uid we put it in the appropriate cgroup.
> 
> As we are all in a way sons of Linus the Great, the fact that you have
> this usecase should be by itself a reason for us not to deprecate it.
> 
> I still view this, however, as a not common use case. And from the
> scheduler PoV, we still have all the duplicate hierarchy walks. So
> assuming we would carry on all the changes in this patchset, except the
> deprecation, would it be okay for you?
> 
> This way we could take steps to make sure the scheduler codepaths for
> cpuacct are not taken during normal comounted operation, and you could
> still have your setup unchanged.
> 
> Tejun, any words here?

I think the only thing we can do is to keep cpuacct around.  We can
still optimize comounted cpu and cpuacct as the usual case.  That
said, I'd really like to avoid growing new use cases for separate
hierarchies for cpu and cpuacct (well, any controller actually).
Having multiple hierarchies is fundamentally broken in that we can't
say whether a given resource belongs to a certain cgroup independently
from the current task, and we're definitely moving towards a unified
hierarchy.

We are not gonna break multiple hierarchies but won't go the extra mile
to optimize them or enable new features on them, so it would be best to
move away from them.

Maybe we can generate a warning message on separate mounts?
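
For illustration only, the condition such a warning would key on (cpu and
cpuacct attached to different hierarchies) can also be checked from
userspace. The sketch below is not kernel code and is not part of the
patchset; it assumes the usual /proc/mounts line layout, in which a cgroup
mount lists its attached controllers among the mount options. The function
names and the sample mount points are hypothetical.

```python
def cgroup_mounts(mounts_text):
    """Map each cgroup mount point to the set of its mount options."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 4 and fields[2] == "cgroup":
            result[fields[1]] = set(fields[3].split(","))
    return result


def cpu_cpuacct_separate(mounts_text):
    """Return True if cpu and cpuacct are both mounted, but never together."""
    mounts = cgroup_mounts(mounts_text)
    cpu = {m for m, opts in mounts.items() if "cpu" in opts}
    acct = {m for m, opts in mounts.items() if "cpuacct" in opts}
    return bool(cpu) and bool(acct) and cpu.isdisjoint(acct)


# Hypothetical sample inputs; real mount points vary by system.
COMOUNTED = "cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,cpu,cpuacct 0 0\n"
SEPARATE = (
    "none /dev/cpuctl cgroup rw,relatime,cpu 0 0\n"
    "none /acct cgroup rw,relatime,cpuacct 0 0\n"
)
```

On a live system the input would be the contents of /proc/mounts, e.g.
cpu_cpuacct_separate(open("/proc/mounts").read()).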

Thanks.

-- 
tejun
