LKML Archive on lore.kernel.org
From: Tejun Heo <tj@kernel.org>
To: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Paul Turner <pjt@google.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Joel Fernandes <joelaf@google.com>,
	Steve Muckle <smuckle@google.com>
Subject: Re: [PATCH 4/7] sched/core: uclamp: add utilization clamping to the CPU controller
Date: Tue, 10 Apr 2018 13:05:14 -0700
Message-ID: <20180410200514.GA793541@devbig577.frc2.facebook.com>
In-Reply-To: <20180410171612.GJ14248@e110439-lin>

Hello,

On Tue, Apr 10, 2018 at 06:16:12PM +0100, Patrick Bellasi wrote:
> > I'm not too enthusiastic about util_min/max given that it can easily
> > be read as actual utilization based bandwidth control when what's
> > actually implemented, IIUC, is affecting CPU frequency selection.
> 
> Right now we are basically affecting the frequency selection.
> However, the next step is to use this same interface to possibly bias
> task placement.
> 
> The idea is that:
> 
> - the util_min value can be used to possibly avoid CPUs which have
>   a (maybe temporarily) limited capacity, for example, due to thermal
>   pressure.
> 
> - a util_max value can be used to possibly identify tasks which can
>   be co-scheduled together in a (maybe) limited capacity CPU since
>   they are more likely "less important" tasks.
> 
> Thus, since this is a new user-space API, we would like to find a
> concept which is generic enough to express the current requirement but
> also easily accommodate future extensions.

I'm not sure we can overload the meanings like that on the same
interface.  Right now, it doesn't say anything about bandwidth (or
utilization) allocation.  It just limits the frequency range that the
cpu the task ends up on can be in, and what you're describing above
is a third, different thing.  It isn't clear that these can all be
overloaded onto the same interface.

> > Maybe something like cpu.freq.min/max are better names?
> 
> IMO this is too platform specific.
> 
> I agree that utilization is maybe too much an implementation detail,
> but perhaps this can be solved by using a more generic range.
> 
> What about using values in the [0..100] range which define:
> 
>    a percentage of the maximum available capacity
>          for the CPUs in the target system
> 
> Do you think this can work?

Yeah, sure, it's more that right now the intention isn't clear.  A
cgroup control knob which limits cpu frequency range while the cgroup
is on a cpu is a very different thing from a cgroup knob which
restricts what tasks can be scheduled on the same cpu.  They're
actually incompatible.  Doing the latter actively breaks the former.
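
To make the [0..100] proposal quoted above concrete, here is a minimal
sketch of how a percentage knob could be mapped onto capacity units and
used to clamp a utilization value before it feeds frequency selection.
It assumes the scheduler's usual 0..1024 capacity scale
(SCHED_CAPACITY_SCALE); percent_to_capacity() and clamp_util() are
hypothetical names, not functions from the patchset.

```c
#include <assert.h>

/* Assumption: capacity uses a 0..1024 scale, like SCHED_CAPACITY_SCALE. */
#define CAPACITY_SCALE 1024u

/* Hypothetical helper: map a [0..100] percentage to capacity units. */
static unsigned int percent_to_capacity(unsigned int percent)
{
	assert(percent <= 100);
	/* Round to the nearest capacity unit. */
	return (percent * CAPACITY_SCALE + 50) / 100;
}

/*
 * Hypothetical helper: clamp a utilization value (in capacity units)
 * into the range described by a min/max pair of percentages.
 */
static unsigned int clamp_util(unsigned int util,
			       unsigned int min_pct, unsigned int max_pct)
{
	unsigned int lo = percent_to_capacity(min_pct);
	unsigned int hi = percent_to_capacity(max_pct);

	if (util < lo)
		return lo;
	if (util > hi)
		return hi;
	return util;
}
```

Note that this only covers the frequency-range reading of the knob.
The task placement reading discussed above would need different logic,
which is exactly the overloading concern.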

> > This is a problem which exists for all other interfaces.  For
> > historical and other reasons, at least till now, we've opted to put
> > everything at system level outside of cgroup interface.  We might
> > change this in the future and duplicate system-level information and
> > interfaces in the root cgroup but we wanna do that in a more systematic
> > fashion than adding a one-off knob in the cgroup root.
> 
> I see, I think we can easily come up with a procfs/sysfs interface
> usable to define system-wide values.
> 
> Any suggestion for something already existing which I can use as a
> reference?

Most system level interfaces are there with a long history and things
aren't that consistent.  One route could be finding an interface
implementing a nearby feature and staying consistent with that.

> > Tying creation / config operations to the config propagation doesn't
> > work well with delegation and is inconsistent with what other
> > controllers are doing.  For cases where the propagated config being
> > visible in a sub cgroup is necessary, please add .effective files.
> 
> I'm not sure I understand this point: do you mean that we should not
> enforce "consistency rules" among parent-child groups?

You should.  It just shouldn't make configurations fail cuz that ends
up breaking delegations.

> I have to look better into this "effective" concept.
> Meanwhile, can you make a simple example?

There's a recent cpuset patchset posted by Waiman Long.  Googling for
lkml cpuset and Waiman Long should find it easily.
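
As a rough illustration of the ".effective" idea (not the cpuset
implementation referred to above), the sketch below keeps the value
written to a knob separate from the value actually applied.  A write
never fails because of the parent's setting; the effective value is
simply capped by the parent's effective value.  struct group and both
functions are hypothetical, and the group array is assumed to list
parents before their children.

```c
#include <stddef.h>

/* Hypothetical per-cgroup state for a single clamp knob. */
struct group {
	struct group *parent;   /* NULL for the root */
	unsigned int requested; /* value written by the user, kept as-is */
	unsigned int effective; /* value actually applied (".effective") */
};

/* The effective value is the request, capped by the parent's effective value. */
static void update_effective(struct group *g)
{
	unsigned int limit = g->parent ? g->parent->effective : g->requested;

	g->effective = g->requested < limit ? g->requested : limit;
}

/*
 * Recompute effective values for every group.
 * Assumption: groups[] is ordered so that parents precede children.
 */
static void refresh_all(struct group *groups[], size_t n)
{
	for (size_t i = 0; i < n; i++)
		update_effective(groups[i]);
}
```

With a parent requesting 80 and a child requesting 100, the child's
write is accepted and its requested value stays 100, while its
effective value becomes 80.  If the parent is later raised, the child's
effective value follows without the child being reconfigured, which is
what keeps delegation working.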

> > > Tasks on a subgroup can only be more boosted and/or capped, which is
> > 
> > Less boosted.  .low at a parent level must set the upper bound of .low
> > that all its descendants can have.
> 
> Is that a mandatory requirement? Or based on a proper justification
> you can also accept what I'm proposing?
>
> I've always been more of the idea that what I'm proposing could make
> more sense for a general case but perhaps I just need to go back and
> better check the use-cases we have on hand to see if it's really
> required or not.

Yeah, I think we want to stick to those semantics.  That's what the
memory controller does and it'd be really confusing to flip the
directions on different controllers.

Thanks.

-- 
tejun


Thread overview: 32+ messages
2018-04-09 16:56 [PATCH 0/7] Add utilization clamping support Patrick Bellasi
2018-04-09 16:56 ` [PATCH 1/7] sched/core: uclamp: add CPU clamp groups accounting Patrick Bellasi
2018-04-13  8:26   ` Peter Zijlstra
2018-04-13 10:22     ` Peter Zijlstra
2018-04-13 11:04       ` Patrick Bellasi
2018-04-13 11:15         ` Peter Zijlstra
2018-04-13  8:40   ` Peter Zijlstra
2018-04-13 11:17     ` Patrick Bellasi
2018-04-13 11:29       ` Peter Zijlstra
2018-04-13 11:33         ` Patrick Bellasi
2018-04-13  8:43   ` Peter Zijlstra
2018-04-13 11:15     ` Patrick Bellasi
2018-04-13 11:36       ` Peter Zijlstra
2018-04-13 11:47         ` Patrick Bellasi
2018-04-13 11:52           ` Patrick Bellasi
2018-04-13 12:44           ` Peter Zijlstra
2018-04-13  9:30   ` Peter Zijlstra
2018-04-13  9:38     ` Peter Zijlstra
2018-04-13  9:46   ` Peter Zijlstra
2018-04-13 11:08     ` Patrick Bellasi
2018-04-13 11:19       ` Peter Zijlstra
2018-04-09 16:56 ` [PATCH 2/7] sched/core: uclamp: map TASK clamp values into CPU clamp groups Patrick Bellasi
2018-04-09 16:56 ` [PATCH 3/7] sched/core: uclamp: extend sched_setattr to support utilization clamping Patrick Bellasi
2018-04-09 16:56 ` [PATCH 4/7] sched/core: uclamp: add utilization clamping to the CPU controller Patrick Bellasi
2018-04-09 22:24   ` Tejun Heo
2018-04-10 17:16     ` Patrick Bellasi
2018-04-10 20:05       ` Tejun Heo [this message]
2018-04-21 21:08         ` Joel Fernandes
2018-04-26 18:58           ` Tejun Heo
2018-04-09 16:56 ` [PATCH 5/7] sched/core: uclamp: use TG clamps to restrict TASK clamps Patrick Bellasi
2018-04-09 16:56 ` [PATCH 6/7] sched/cpufreq: uclamp: add utilization clamping for FAIR tasks Patrick Bellasi
2018-04-09 16:56 ` [PATCH 7/7] sched/cpufreq: uclamp: add utilization clamping for RT tasks Patrick Bellasi
