Linux-PM Archive on lore.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: Douglas Raillard <douglas.raillard@arm.com>
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	mingo@redhat.com, rjw@rjwysocki.net, viresh.kumar@linaro.org,
	juri.lelli@redhat.com, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, qperret@qperret.net,
	patrick.bellasi@matbug.net, dh.han@samsung.com
Subject: Re: [RFC PATCH v3 0/6] sched/cpufreq: Make schedutil energy aware
Date: Thu, 17 Oct 2019 11:50:15 +0200
Message-ID: <20191017095015.GI2311@hirez.programming.kicks-ass.net>
In-Reply-To: <a1ce67d7-62c3-b78b-1d87-23ef4dbc2274@arm.com>

On Mon, Oct 14, 2019 at 04:50:24PM +0100, Douglas Raillard wrote:

> I posted some numbers based on a similar experiment on the v2 of that series that
> are still applicable:
> 
> TL;DR the rt-app negative slack is divided by 1.75 by this series, with an
> increase of 3% in total energy consumption. There is a burst every 0.6s, and
> the average power consumption increase is proportional to the average number
> of bursts.
> 
> 
> The workload is an rt-app task ramping up from 5% to 75% util in one big step,
> pinned on a big core. The whole cycle is 0.6s long (0.3s at 5% followed by 0.3s at 75%).
> This cycle is repeated 20 times and the average of boosting is taken.
> 
> The test device is a Google Pixel 3 (Qcom Snapdragon 845) phone.
> It has a lot more OPPs than a hikey 960, so gradations in boosting
> are better reflected on frequency selection.
> 
> avg slack (higher=better):
>     Average time between task sleep and its next periodic activation.
>     See rt-app doc: https://github.com/scheduler-tools/rt-app/blob/9a50d76f726d7c325c82ac8c7ed9ed70e1c97937/doc/tutorial.txt#L631
> 
> avg negative slack (lower in absolute value=better):
>     Same as avg slack, but only taking into account negative values.
>     Negative slack means a task activation did not have enough time to complete before the next
>     periodic activation fired, which is what we want to avoid.
> 
> boost energy overhead (lower=better):
>     Extra power consumption induced by ramp boost, assuming a continuous OPP space (an infinite number of OPPs)
>     and single-CPU policies. In practice, a fixed number of OPPs decreases this value, and more CPUs per policy increase it,
>     since boost(policy) = max(boost(cpu) for each cpu of policy).
> 
> Without ramp boost:
> +--------------------+--------------------+
> |avg slack (us)      |avg negative slack  |
> |                    |(us)                |
> +--------------------+--------------------+
> |6598.72             |-10217.13           |
> |6595.49             |-10200.13           |
> |6613.72             |-10401.06           |
> |6600.29             |-9860.872           |
> |6605.53             |-10057.64           |
> |6612.05             |-10267.50           |
> |6599.01             |-9939.60            |
> |6593.79             |-9445.633           |
> |6613.56             |-10276.75           |
> |6595.44             |-9751.770           |
> +--------------------+--------------------+
> |average                                  |
> +--------------------+--------------------+
> |6602.76             |-10041.81           |
> +--------------------+--------------------+
> 
> 
> With ramp boost enabled:
> +--------------------+--------------------+--------------------+
> |boost energy        |avg slack (us)      |avg negative slack  |
> |overhead (%)        |                    |(us)                |
> +--------------------+--------------------+--------------------+
> |3.05                |7148.93             |-5664.26            |
> |3.04                |7144.69             |-5667.77            |
> |3.05                |7149.05             |-5698.31            |
> |2.97                |7126.71             |-6040.23            |
> |3.02                |7140.28             |-5826.78            |
> |3.03                |7135.11             |-5749.62            |
> |3.05                |7140.24             |-5750.0             |
> |3.05                |7144.84             |-5667.04            |
> |3.07                |7157.30             |-5656.65            |
> |3.06                |7154.65             |-5653.76            |
> +--------------------+--------------------+--------------------+
> |average                                                       |
> +--------------------+--------------------+--------------------+
> |3.039000            |7144.18             |-5737.44            |
> +--------------------+--------------------+--------------------+
> 
> 
> The negative slack is due to missed activations while the utilization signals
> increase during the big utilization step. Ramp boost is designed to boost frequency during
> that phase, which results in 1.75 times less negative slack, for an extra power
> consumption of about 3%.
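
As a sanity check on the quoted figures, the per-run values in the two tables can be averaged directly. The following is a minimal Python sketch; the variable names are illustrative, and the numbers are copied from the tables above:

```python
from statistics import mean

# avg negative slack (us) per run, without ramp boost (first table)
neg_slack_without = [
    -10217.13, -10200.13, -10401.06, -9860.872, -10057.64,
    -10267.50, -9939.60, -9445.633, -10276.75, -9751.770,
]

# avg negative slack (us) per run, with ramp boost (second table)
neg_slack_with = [
    -5664.26, -5667.77, -5698.31, -6040.23, -5826.78,
    -5749.62, -5750.0, -5667.04, -5656.65, -5653.76,
]

# boost energy overhead (%) per run, with ramp boost (second table)
energy_overhead = [3.05, 3.04, 3.05, 2.97, 3.02, 3.03, 3.05, 3.05, 3.07, 3.06]

mean_without = mean(neg_slack_without)   # about -10041.81 us
mean_with = mean(neg_slack_with)         # about -5737.44 us
mean_overhead = mean(energy_overhead)    # about 3.04 %

# Factor by which ramp boost reduces the negative slack
reduction_factor = mean_without / mean_with   # about 1.75
```

The reduction factor comes out at about 1.75, and the mean energy overhead at about 3.04%, which matches the "average" rows of both tables.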

OK, so I think I see what it is doing, and why.

Normally we use (map_util_freq):

	freq = C * max_freq * util / max ; C=1.25

But here, when util is increasing, we effectively increase our C to
allow picking a higher OPP. Because of that higher OPP we finish our
work sooner (avg slack increases) and miss our activation less often
(avg neg slack decreases).
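
To make the effect of C concrete, here is a small user-space model in Python. It is a sketch under stated assumptions, not the kernel code: the OPP table, the capacity scale of 1024 and the boosted value C=1.5 are hypothetical, and the series itself derives its boost from 'util - util_est' rather than from a fixed constant.

```python
# Hypothetical OPP table in kHz; not taken from any real device.
OPPS_KHZ = [300_000, 600_000, 900_000, 1_200_000, 1_500_000, 1_800_000]
MAX_CAP = 1024  # assumed full-scale utilization


def map_util_freq(util, max_freq, max_cap, c=1.25):
    """Model of: freq = C * max_freq * util / max."""
    return c * max_freq * util / max_cap


def pick_opp(target_khz, opps):
    """Lowest OPP at or above the target, clamped to the highest OPP."""
    for freq in sorted(opps):
        if freq >= target_khz:
            return freq
    return max(opps)


util = 512  # half of MAX_CAP
max_freq = max(OPPS_KHZ)

# Default margin: the target is 1.25 * 1_800_000 * 512 / 1024 = 1_125_000 kHz
default_target = map_util_freq(util, max_freq, MAX_CAP)
default_opp = pick_opp(default_target, OPPS_KHZ)   # 1_200_000

# Hypothetical raised margin while util is ramping up: the target is 1_350_000 kHz
boosted_target = map_util_freq(util, max_freq, MAX_CAP, c=1.5)
boosted_opp = pick_opp(boosted_target, OPPS_KHZ)   # 1_500_000
```

With the same util, the larger C moves the request one OPP higher in this table, which is the mechanism by which the work finishes sooner at a higher energy cost.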

Now, the thing is, we use map_util_freq() in more places, and should we
not reflect this increase in C for all of them? That is, why is this
patch changing get_next_freq() and not map_util_freq()?

I don't think that question is answered in the Changelogs.

Exactly because it does change the energy consumption (it must), should
that not also be reflected in the EAS logic?

I'm still thinking about the exact means you're using to raise C; that
is, the 'util - util_est' as cost_margin. It hurts my brain still.


