From: Patrick Bellasi <patrick.bellasi@arm.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Linux PM list <linux-pm@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@arm.com>,
	Steve Muckle <steve.muckle@linaro.org>,
	ACPI Devel Maling List <linux-acpi@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Michael Turquette <mturquette@baylibre.com>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH v6 7/7][Update] cpufreq: schedutil: New governor based on scheduler utilization data
Date: Fri, 18 Mar 2016 12:34:09 +0000
Message-ID: <20160318123409.GA900@e105326-lin>
In-Reply-To: <1614814.usHvZ58O6A@vostro.rjw.lan>

Hi Rafael, all,
I have (yet another) consideration regarding the definition of the
margin for the frequency selection.

On 17-Mar 17:01, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH] cpufreq: schedutil: New governor based on scheduler utilization data
> 
> Add a new cpufreq scaling governor, called "schedutil", that uses
> scheduler-provided CPU utilization information as input for making
> its decisions.
> 
> Doing that is possible after commit 34e2c555f3e1 (cpufreq: Add
> mechanism for registering utilization update callbacks) that
> introduced cpufreq_update_util() called by the scheduler on
> utilization changes (from CFS) and RT/DL task status updates.
> In particular, CPU frequency scaling decisions may be based on
> the the utilization data passed to cpufreq_update_util() by CFS.
> 
> The new governor is relatively simple.
> 
> The frequency selection formula used by it depends on whether or not
> the utilization is frequency-invariant.  In the frequency-invariant
> case the new CPU frequency is given by
> 
> 	next_freq = 1.25 * max_freq * util / max
> 
> where util and max are the last two arguments of cpufreq_update_util().
> In turn, if util is not frequency-invariant, the maximum frequency in
> the above formula is replaced with the current frequency of the CPU:
> 
> 	next_freq = 1.25 * curr_freq * util / max
> 
> The coefficient 1.25 corresponds to the frequency tipping point at
> (util / max) = 0.8.


In both of these formulas the OPP jump is driven by a margin which is
effectively proportional to the capacity of the current OPP.
For example, if we consider a simple system with this set of OPPs:

  [200, 400, 600, 800, 1000] MHz

and we apply the formula for the frequency-invariant case, we get:

   util/max    min_opp    min_util   margin
        1.0       1000        0.80      20%
        0.8        800        0.64      16%
        0.6        600        0.48      12%
        0.4        400        0.32       8%
        0.2        200        0.16       4%

Where:
- min_opp: the minimum OPP which can satisfy the (util/max) capacity
  request
- min_util: the minimum utilization value which effectively triggers
  a switch to the next higher OPP
- margin: the effective capacity margin to remain at min_opp

This means that when running at the lowest OPP we can build up to 16%
utilization (i.e. 4% less than the capacity of the min_opp) before
jumping to the next OPP. But, for example, when at the 800MHz
OPP we need to build up just 4% utilization (i.e. 16% less than the
capacity of that OPP) to jump up.
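
As a rough illustration (not code from the patch), the switching points in
the table above can be reproduced with the frequency-invariant formula and
the example OPP set. The names opp_mhz, raw_next_freq and pick_opp are
made up for this sketch, utilization is expressed as an integer percentage
of max, and it is assumed that the governor picks the lowest OPP at or
above the computed frequency.

```c
#include <stddef.h>

/* OPP table from the example above, in MHz, sorted ascending. */
static const unsigned int opp_mhz[] = { 200, 400, 600, 800, 1000 };
#define NR_OPPS (sizeof(opp_mhz) / sizeof(opp_mhz[0]))

/* next_freq = 1.25 * max_freq * util / max, in integer arithmetic. */
unsigned int raw_next_freq(unsigned int max_freq, unsigned int util,
			   unsigned int max)
{
	return (max_freq + max_freq / 4) * util / max;
}

/*
 * Assumption: the lowest OPP that is >= the requested frequency is
 * selected, clamped to the highest OPP available.
 */
unsigned int pick_opp(unsigned int util, unsigned int max)
{
	unsigned int target = raw_next_freq(opp_mhz[NR_OPPS - 1], util, max);
	size_t i;

	for (i = 0; i < NR_OPPS - 1; i++)
		if (opp_mhz[i] >= target)
			break;

	return opp_mhz[i];
}
```

With max = 100, pick_opp(16, 100) still selects 200MHz while
pick_opp(17, 100) moves to 400MHz, and pick_opp(64, 100) selects 800MHz
while pick_opp(65, 100) moves to 1000MHz, matching the min_util column.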

This is a really simple example, with OPPs that are equally distributed.
However, the question is: does it really make sense to have different
effective margins for different starting OPPs?

AFAIU, this solution biases the frequency selection towards higher
OPPs.  The higher the utilization of a CPU, the more likely we are to
run at an OPP higher than the minimum one (min_opp).
The advantage is a reduced time to reach the highest OPP, which can be
beneficial for performance-oriented workloads. The disadvantage is
a quite likely reduction of residencies on mid-range OPPs.

We should also consider that, at least in its current implementation,
PELT "builds up" more slowly when running at lower OPPs, which further
amplifies this imbalance in OPP residencies.

IMO, biasing the selection of one OPP over another is something which
sounds more like a "policy" than a "mechanism". Since the goal here
should be to provide just a mechanism, perhaps a different approach
can be evaluated.

Have we ever considered using a "constant margin" for each OPP?

The value of such a margin can still be defined as a (configurable)
percentage of the max (or min) OPP. But once defined, the same
margin can be used to decide when to switch to the next OPP.

In the previous example, considering a 5% margin wrt the max capacity,
these are the new margins:

   util/max    min_opp    min_util   margin
        1.0       1000        0.95       5%
        0.8        800        0.75       5%
        0.6        600        0.55       5%
        0.4        400        0.35       5%
        0.2        200        0.15       5%

That means that, whether running at the lowest OPP or at a mid-range
one, we always need to build up the same amount of utilization before
switching to the next one.

How does this translate into residency times? This is still affected by
PELT's behavior when running at different OPPs, but IMO it should
improve the fairness of OPP selection a bit.

Moreover, from an implementation standpoint, what is now a couple of
multiplications and a comparison can potentially be reduced to a single
comparison, e.g.

   next_freq = util > (curr_cap - margin)
               ?  curr_freq + 1
               :  curr_freq

where "curr_freq + 1" stands for the next higher OPP, and margin is
pre-computed to be, for example, 51 (i.e. 5% of 1024), as is
(curr_cap - margin), which can be cached at each OPP change.
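
A minimal sketch of how the cached comparison could look, under a few
assumptions: capacities are on the 0..1024 scale, OPPs are kept in an
array sorted by capacity, and only the step-up decision is modeled. The
struct opp_state type and the opp_set/opp_next helpers are hypothetical
names used for illustration only, not existing kernel interfaces.

```c
#define OPP_MARGIN 51U	/* roughly 5% of 1024, as in the example above */

/* Hypothetical per-policy state; not an existing kernel structure. */
struct opp_state {
	const unsigned int *cap;	/* capacity of each OPP, ascending */
	unsigned int nr_opps;		/* number of entries in cap[] */
	unsigned int cur;		/* index of the current OPP */
	unsigned int up_threshold;	/* cached cap[cur] - OPP_MARGIN */
};

/* Switch to OPP idx and refresh the cached threshold. */
void opp_set(struct opp_state *s, unsigned int idx)
{
	s->cur = idx;
	s->up_threshold = s->cap[idx] > OPP_MARGIN ?
			  s->cap[idx] - OPP_MARGIN : 0;
}

/*
 * Single comparison on the fast path: step up by one OPP when util
 * exceeds the cached threshold, otherwise stay. Stepping down is
 * deliberately left out of this sketch.
 */
unsigned int opp_next(struct opp_state *s, unsigned int util)
{
	if (util > s->up_threshold && s->cur + 1 < s->nr_opps)
		opp_set(s, s->cur + 1);

	return s->cur;
}
```

The subtraction is done only in opp_set(), i.e. once per OPP change, so
the per-update cost is one comparison plus a bounds check.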

-- 
#include <best/regards.h>

Patrick Bellasi
