linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Bellasi <patrick.bellasi@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	Paul Turner <pjt@google.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	John Stultz <john.stultz@linaro.org>,
	Todd Kjos <tkjos@android.com>, Tim Murray <timmurray@google.com>,
	Andres Oportus <andresoportus@google.com>,
	Joel Fernandes <joelaf@google.com>,
	Juri Lelli <juri.lelli@arm.com>,
	Chris Redpath <chris.redpath@arm.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>
Subject: Re: [RFC v3 0/5] Add capacity capping support to the CPU controller
Date: Thu, 13 Apr 2017 11:34:27 +0100	[thread overview]
Message-ID: <20170413103427.GA18854@e110439-lin> (raw)
In-Reply-To: <20170412161423.jktdg6tacp7wwpno@hirez.programming.kicks-ass.net>

On 12-Apr 18:14, Peter Zijlstra wrote:
> On Wed, Apr 12, 2017 at 03:43:10PM +0100, Patrick Bellasi wrote:
> > On 12-Apr 16:34, Peter Zijlstra wrote:
> > > On Wed, Apr 12, 2017 at 02:27:41PM +0100, Patrick Bellasi wrote:
> > > > On 12-Apr 14:48, Peter Zijlstra wrote:
> > > > > On Tue, Apr 11, 2017 at 06:58:33PM +0100, Patrick Bellasi wrote:
> > > > > > >     illustrated per your above points in that it affects both, while in
> > > > > > >     fact it actually modifies another metric, namely util_avg.
> > > > > > 
> > > > > > I don't see it modifying in any direct way util_avg.
> > > > > 
> > > > > The point is that clamps called 'capacity' are applied to util. So while
> > > > > you don't modify util directly, you do modify the util signal (for one
> > > > > consumer).
> > > > 
> > > > Right, but this consumer (i.e. schedutil) it's already translating
> > > > the util_avg into a next_freq (which ultimately it's a capacity).

                                                               ^^^^^^^^
                                                                [REF1]

> > > > 
> > > > Thus, I don't see a big misfit in that code path to "filter" this
> > > > translation with a capacity clamp.
> > > 
> > > Still strikes me as odd though.
> > 
> > Can you better elaborate on they why?
> 
> Because capacity is, as you pointed out earlier, a relative measure of
> inter CPU performance (which isn't otherwise exposed to userspace
> afaik).

Perhaps, since I'm biased by EAS concepts which are still not
mainline, I was not clear on specifying what I meant by "capacity" in
[REF1].

My fault, sorry, perhaps it's worth if I start by reviewing some
concepts and see if we can establish a common language.


.:: Mainline

If we look at mainline, "capacity" is actually a concept used to
represent the computational bandwidth available in a CPU, when running
at the highest OPP (let's consider SMP systems to keep it simple).

But things are already a bit more complicated. Specifically, looking
at update_cpu_capacity(), we distinguish between:

- cpu_rq(cpu)->cpu_capacity_orig
  which is the bandwidth available at the max OPP.

- cpu_rq(cpu)->cpu_capacity
  which discounts from the previous metrics the "average" bandwidth used
  by RT tasks, but not (yet) DEADLINE tasks afaics.

Thus, "capacity" is already a polymorphic concept:
  we use cpu_capacity_orig to cap the cpu utilization of CFS tasks
  in cpu_util()
but
  this cpu utilization is a signal which converge to "current capacity"
  in  ___update_load_avg()

The "current capacity" (capacity_curr, but just in some comments) is actually
the computational bandwidth available at a certain OPP.

Thus, we already have in mainline a concepts of capacity which refers to the
bandwidth available in a certain OPP. The "current capacity" is what we
ultimately use to scale PELT depending on the current OPP.


.:: EAS

Looking at EAS, and specifically the energy model, we describe each
OPP using a:

  struct capacity_state {
          unsigned long cap;      /* compute capacity */
          unsigned long power;    /* power consumption at this compute capacity */
  };

Where again we find a usage of the "current capacity", i.e. the
computational bandwidth available at each OPP.


.:: Current Capacity

In [REF1] I was referring to the concept of "current capacity", which is what
schedutil is after. There we need translate cfs.avg.util_avg into an OPP, which
ultimately is a suitable level of "current capacity" to satisfy the
CPU bandwidth requested by CFS tasks.

> While the utilization thing is a per task running signal.

Which still is converging to the "current capacity", at least before
Vincent's patches.

> There is no direct relation between the two.

Give the previous definitions, can we say that there is a relation between task
utilization and "current capacity"?

     Sum(task_utilization) = cpu_utilization
            <= "current capacity" (cpufreq_schedutil::get_next_freq())    [1]
            <= cpu_capacity_orig

> The two main uses for the util signal are:
> 
>   OPP selection: the aggregate util of all runnable tasks for a
>     particular CPU is used to select an OPP for said CPU [*], against
>     whatever max-freq that CPU has. Capacity doesn't really come into play
>     here.

The OPP selected has to provide a suitable amount of "current capacity" to
accommodate the required utilization.

>   Task placement: capacity comes into play in so far that we want to
>     make sure our task fits.

This two usages are not completely independent, at least when EAS is
in use. In EAS we can evaluate/compare scenarios like:

  "should I increase the capacity of CPUx or wakeup CPUy"

Thus, we use capacity indexes to estimate energy deltas by
moving a task and, by consequence, changing a CPU's OPP.

Which means: expected "capacity" variations are affecting OPP selections.

> And I'm not at all sure we want to have both uses of our utilization
> controlled by the one knob. They're quite distinct.

The proposed knobs, for example capacity_min, are used to clamp the
scheduler/schedutil view on what is the required "current capacity" by
modifying the previous relation [1] to be:

                 Sum(task_utilization) = cpu_utilization
            clamp(cpu_utilization, capacity_min, capacity_max)
                        <= "current capacity"
                        <= cpu_capacity_orig

In [1] we already have a transformation from the cpu_utilization
domain to the "current capacity" domain. Here we are just adding a
clamping filter around that transformation.

I hope this is useful to find some common ground, perhaps the naming
capacity_{min,max} is unfortunate and we can find a better one.
However, we should first agree on the utility of the proposed
clamping concept... ;-)

-- 
#include <best/regards.h>

Patrick Bellasi

      reply	other threads:[~2017-04-13 10:34 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-28 14:38 [RFC v3 0/5] Add capacity capping support to the CPU controller Patrick Bellasi
2017-02-28 14:38 ` [RFC v3 1/5] sched/core: add capacity constraints to " Patrick Bellasi
2017-03-13 10:46   ` Joel Fernandes (Google)
2017-03-15 11:20     ` Patrick Bellasi
2017-03-15 13:20       ` Joel Fernandes
2017-03-15 16:10         ` Paul E. McKenney
2017-03-15 16:44           ` Patrick Bellasi
2017-03-15 17:24             ` Paul E. McKenney
2017-03-15 17:57               ` Patrick Bellasi
2017-03-20 17:15   ` Tejun Heo
2017-03-20 17:36     ` Tejun Heo
2017-03-20 18:08     ` Patrick Bellasi
2017-03-23  0:28       ` Joel Fernandes (Google)
2017-03-23 10:32         ` Patrick Bellasi
2017-03-23 16:01           ` Tejun Heo
2017-03-23 18:15             ` Patrick Bellasi
2017-03-23 18:39               ` Tejun Heo
2017-03-24  6:37                 ` Joel Fernandes (Google)
2017-03-24 15:00                   ` Tejun Heo
2017-03-30 21:13                 ` Paul Turner
2017-03-24  7:02           ` Joel Fernandes (Google)
2017-03-30 21:15       ` Paul Turner
2017-04-01 16:25         ` Patrick Bellasi
2017-02-28 14:38 ` [RFC v3 2/5] sched/core: track CPU's capacity_{min,max} Patrick Bellasi
2017-02-28 14:38 ` [RFC v3 3/5] sched/core: sync capacity_{min,max} between slow and fast paths Patrick Bellasi
2017-02-28 14:38 ` [RFC v3 4/5] sched/{core,cpufreq_schedutil}: add capacity clamping for FAIR tasks Patrick Bellasi
2017-02-28 14:38 ` [RFC v3 5/5] sched/{core,cpufreq_schedutil}: add capacity clamping for RT/DL tasks Patrick Bellasi
2017-03-13 10:08   ` Joel Fernandes (Google)
2017-03-15 11:40     ` Patrick Bellasi
2017-03-15 12:59       ` Joel Fernandes
2017-03-15 14:44         ` Juri Lelli
2017-03-15 16:13           ` Joel Fernandes
2017-03-15 16:24             ` Juri Lelli
2017-03-15 23:40               ` Joel Fernandes
2017-03-16 11:16                 ` Juri Lelli
2017-03-16 12:27                   ` Patrick Bellasi
2017-03-16 12:44                     ` Juri Lelli
2017-03-16 16:58                       ` Joel Fernandes
2017-03-16 17:17                         ` Juri Lelli
2017-03-15 11:41 ` [RFC v3 0/5] Add capacity capping support to the CPU controller Rafael J. Wysocki
2017-03-15 12:59   ` Patrick Bellasi
2017-03-16  1:04     ` Rafael J. Wysocki
2017-03-16  3:15       ` Joel Fernandes
2017-03-20 22:51         ` Rafael J. Wysocki
2017-03-21 11:01           ` Patrick Bellasi
2017-03-24 23:52             ` Rafael J. Wysocki
2017-03-16 12:23       ` Patrick Bellasi
2017-03-20 14:51 ` Tejun Heo
2017-03-20 17:22   ` Patrick Bellasi
2017-04-10  7:36     ` Peter Zijlstra
2017-04-11 17:58       ` Patrick Bellasi
2017-04-12 12:10         ` Peter Zijlstra
2017-04-12 13:55           ` Patrick Bellasi
2017-04-12 15:37             ` Peter Zijlstra
2017-04-13 11:33               ` Patrick Bellasi
2017-04-12 12:15         ` Peter Zijlstra
2017-04-12 13:34           ` Patrick Bellasi
2017-04-12 14:41             ` Peter Zijlstra
2017-04-12 12:22         ` Peter Zijlstra
2017-04-12 13:24           ` Patrick Bellasi
2017-04-12 12:48         ` Peter Zijlstra
2017-04-12 13:27           ` Patrick Bellasi
2017-04-12 14:34             ` Peter Zijlstra
2017-04-12 14:43               ` Patrick Bellasi
2017-04-12 16:14                 ` Peter Zijlstra
2017-04-13 10:34                   ` Patrick Bellasi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170413103427.GA18854@e110439-lin \
    --to=patrick.bellasi@arm.com \
    --cc=andresoportus@google.com \
    --cc=chris.redpath@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=joelaf@google.com \
    --cc=john.stultz@linaro.org \
    --cc=juri.lelli@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=morten.rasmussen@arm.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=timmurray@google.com \
    --cc=tj@kernel.org \
    --cc=tkjos@android.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).