From: Patrick Bellasi <>
To: Tejun Heo <>
	Ingo Molnar <>,
	Peter Zijlstra <>,
	"Rafael J . Wysocki" <>,
	Viresh Kumar <>,
	Vincent Guittot <>,
	Paul Turner <>,
	Dietmar Eggemann <>,
	Morten Rasmussen <>,
	Juri Lelli <>,
	Joel Fernandes <>,
	Steve Muckle <>
Subject: Re: [PATCH 4/7] sched/core: uclamp: add utilization clamping to the CPU controller
Date: Tue, 10 Apr 2018 18:16:12 +0100
Message-ID: <20180410171612.GJ14248@e110439-lin>
In-Reply-To: <>

Hi Tejun,

On 09-Apr 15:24, Tejun Heo wrote:
> On Mon, Apr 09, 2018 at 05:56:12PM +0100, Patrick Bellasi wrote:
> > This patch extends the CPU controller by adding a couple of new attributes,
> > util_min and util_max, which can be used to enforce frequency boosting and
> > capping. Specifically:
> > 
> > - util_min: defines the minimum CPU utilization which should be considered,
> > 	    e.g. when  schedutil selects the frequency for a CPU while a
> > 	    task in this group is RUNNABLE.
> > 	    i.e. the task will run at least at a minimum frequency which
> > 	         corresponds to the min_util utilization
> > 
> > - util_max: defines the maximum CPU utilization which should be considered,
> > 	    e.g. when schedutil selects the frequency for a CPU while a
> > 	    task in this group is RUNNABLE.
> > 	    i.e. the task will run up to a maximum frequency which
> > 	         corresponds to the max_util utilization
> I'm not too enthusiastic about util_min/max given that it can easily
> be read as actual utilization based bandwidth control when what's
> actually implemented, IIUC, is affecting CPU frequency selection.

Right now we are basically affecting the frequency selection.
However, the next step is to use this same interface to possibly bias
task placement.

The idea is that:

- the util_min value can be used to possibly avoid CPUs which have
  a (maybe temporarily) limited capacity, for example, due to thermal

- a util_max value can be used to possibly identify tasks which can
  be co-scheduled together in a (maybe) limited capacity CPU since
  they are more likely "less important" tasks.

Thus, since this is a new user-space API, we would like to find a
concept which is generic enough to express the current requirement but
also easily accommodate future extensions.
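
As a rough illustration of the frequency-selection effect described above,
here is a minimal sketch. It is not code from the patch: clamp_util() and
pick_freq_khz() are hypothetical names, and the linear utilization-to-frequency
mapping is a simplification (a real governor adds headroom and snaps to the
available frequency steps).

```c
#include <assert.h>

/* Hypothetical helper: restrict a utilization value to [util_min, util_max]. */
static unsigned long clamp_util(unsigned long util,
				unsigned long util_min,
				unsigned long util_max)
{
	assert(util_min <= util_max);
	if (util < util_min)
		return util_min;
	if (util > util_max)
		return util_max;
	return util;
}

/*
 * Simplified frequency selection: scale the maximum frequency by the
 * clamped utilization relative to the CPU capacity. Headroom and the
 * discrete frequency table are deliberately left out of this sketch.
 */
static unsigned long pick_freq_khz(unsigned long max_freq_khz,
				   unsigned long util,
				   unsigned long capacity,
				   unsigned long util_min,
				   unsigned long util_max)
{
	unsigned long clamped = clamp_util(util, util_min, util_max);

	return max_freq_khz * clamped / capacity;
}
```

With a capacity of 1024, a lightly loaded task with util_min = 512 is raised
to half of the maximum frequency, while a busy task with util_max = 256 is
held at a quarter of it.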

> Maybe something like cpu.freq.min/max are better names?

IMO this is too platform-specific.

I agree that utilization is maybe too much of an implementation detail,
but perhaps this can be solved by using a more generic range.

What about using values in the [0..100] range which define:

   a percentage of the maximum available capacity
         for the CPUs in the target system

Do you think this can work?
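
A percentage interface like this could be mapped onto an internal capacity
scale with very little code. The sketch below assumes a full-scale value of
1024 (the value the scheduler uses for SCHED_CAPACITY_SCALE); the macro and
function names are hypothetical and the rounding policy (round down) is just
one possible choice.

```c
#include <assert.h>

/* Assumed full-scale capacity value for the biggest CPU in the system. */
#define CAPACITY_SCALE 1024UL

/* Hypothetical helper: map a percentage in [0..100] to a capacity value. */
static unsigned long pct_to_capacity(unsigned int pct)
{
	assert(pct <= 100);
	/* Integer division rounds down, e.g. 10% maps to 102, not 102.4. */
	return pct * CAPACITY_SCALE / 100;
}
```

Under these assumptions, writing 50 would request half of the maximum
capacity (512) regardless of which frequencies the platform actually exposes.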

> > These attributes:
> > a) are tunable at all hierarchy levels, i.e. at root group level too, thus
> >    allowing to define the minimum and maximum frequency constraints for all
> >    otherwise non-classified tasks (e.g. autogroups) and to be a sort-of
> >    replacement for cpufreq's powersave, ondemand and performance
> >    governors.
> This is a problem which exists for all other interfaces.  For
> historical and other reasons, at least till now, we've opted to put
> everything at system level outside of cgroup interface.  We might
> change this in the future and duplicate system-level information and
> interfaces in the root cgroup but we wanna do that in a more systematic
> fashion than adding a one-off knob in the cgroup root.

I see, I think we can easily come up with a procfs/sysfs interface
usable to define system-wide values.

Any suggestion for something already existing which I can use as a
reference?

> Besides, if a feature makes sense at the system level which is the
> cgroup root, it makes sense without cgroup mounted or enabled, so it
> needs a place outside cgroup one way or the other.

Indeed, and it makes perfect sense now that we also have a
non-cgroup-based primary API.

> > b) allow to create subgroups of tasks which are not violating the
> >    utilization constraints defined by the parent group.
> Tying creation / config operations to the config propagation doesn't
> work well with delegation and is inconsistent with what other
> controllers are doing.  For cases where the propagated config being
> visible in a sub cgroup is necessary, please add .effective files.

I'm not sure I understand this point: do you mean that we should not
enforce "consistency rules" among parent-child groups?

I need to look more closely into this "effective" concept.
Meanwhile, could you give a simple example?

> > Tasks on a subgroup can only be more boosted and/or capped, which is
> Less boosted.  .low at a parent level must set the upper bound of .low
> that all its descendants can have.

Is that a mandatory requirement? Or, given a proper justification,
could you also accept what I'm proposing?

I've always thought that what I'm proposing makes more sense for the
general case, but perhaps I just need to go back and check the
use-cases we have on hand more carefully to see whether it's really
required or not.
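
For reference, the alternative model discussed above (a parent's value acts
as an upper bound for its descendants, writes are never rejected, and the
propagated result is what a .effective file would report) might look roughly
like the sketch below. The struct and function names are hypothetical and
only illustrate the propagation rule; this is not how any existing controller
is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical group node: a requested clamp value plus a parent link. */
struct clamp_group {
	const struct clamp_group *parent;	/* NULL for the root */
	unsigned int requested;			/* value written by the user */
};

/*
 * Effective value: the requested value, capped by every ancestor's
 * requested value. The requested value itself is left untouched.
 */
static unsigned int effective_clamp(const struct clamp_group *g)
{
	unsigned int eff;

	assert(g != NULL);
	eff = g->requested;
	for (g = g->parent; g != NULL; g = g->parent)
		if (g->requested < eff)
			eff = g->requested;
	return eff;
}
```

In this model a child that requests 90 under a parent set to 80 keeps 90 as
its configured value but reports 80 as its effective one, so creating or
configuring a subgroup never fails because of the parent's settings.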

Thanks for the prompt feedback!

#include <best/regards.h>

Patrick Bellasi
