linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vincent Donnefort <vincent.donnefort@arm.com>
To: peterz@infradead.org, mingo@redhat.com, vincent.guittot@linaro.org
Cc: linux-kernel@vger.kernel.org, dietmar.eggemann@arm.com,
	valentin.schneider@arm.com, morten.rasmussen@arm.com,
	chris.redpath@arm.com, qperret@google.com, lukasz.luba@arm.com,
	Vincent Donnefort <vincent.donnefort@arm.com>
Subject: [PATCH 4/4] sched/fair: Remove the energy margin in feec()
Date: Thu,  9 Dec 2021 16:11:59 +0000	[thread overview]
Message-ID: <20211209161159.1596018-5-vincent.donnefort@arm.com> (raw)
In-Reply-To: <20211209161159.1596018-1-vincent.donnefort@arm.com>

find_energy_efficient_cpu() integrates a margin to protect tasks from
bouncing back and forth from a CPU to another. This margin is set as being
6% of the total current energy estimated on the system. This however does
not work for two reasons:

1. The energy estimation is not a good absolute value:

The function, compute_energy() used in feec() is a good estimation for
task placement as it allows to compare the energy with and without a task.
The computed delta will give a good overview of the cost for a certain
task placement. It, however, doesn't work as an absolute estimation for
the total energy of the system. First it adds the contribution to idle
CPUs into the energy, second it mixes util_avg with util_est values.
util_avg represents integrates the near history for a CPU usage,
it doesn't tell at all what the current utilization is. A system that has
been quite busy in the near past will hold a very high energy and then a
high margin preventing any task migration to a lower capacity CPU, wasting
energy. It even creates a negative feedback loop: by holding the tasks on
a less efficient CPU, the margin contributes in keeping the energy high.

2. The margin handicaps small tasks:

On a system where the workload is composed mostly of small tasks (which is
often the case on Android), the overall energy will be high enough to
create a margin none of those tasks can cross. e.g. On a Pixel4, a small
utilization of 5% on all the CPUs creates a global estimated energy of 140
joules, as per the Energy Model declaration of that same device. This
means, after applying the 6% margin that any migration must save more than
8 joules to happen. No task with a utilization lower than 40 would then be
able to migrate away from the biggest CPU of the system.

The 6% of the overall system energy was brought by the following patch:

 (eb92692b2544 sched/fair: Speed-up energy-aware wake-ups)

It was previously 6% of the prev_cpu energy. Also, the following one
made this margin value conditional on the clusters where the task fits:

 (8d4c97c105ca sched/fair: Only compute base_energy_pd if necessary)

We could simply revert that margin change to what it was, but the original
version didn't have strong grounds neither and as demonstrated in (1.) the
estimated energy isn't a good absolute value. Instead, removing it
completely. It is indeed, made possible by recent changes that improved
energy estimation comparison fairness (sched/fair: Remove task_util from
effective utilization in feec()) and task utilization stabilization
(sched/fair: Decay task util_avg during migration)

Without a margin, we could have feared bouncing between CPUs. But running
LISA's eas_behaviour test coverage on three different platforms (Hikey960,
RB-5 and DB-845) showed no issue and even fixed previously known failures.

Removing the energy margin enables more energy-optimized placements for a
more energy efficient system.

Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c40252a96671..a36d37f0b227 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6755,7 +6755,6 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 	struct root_domain *rd = cpu_rq(smp_processor_id())->rd;
 	int cpu, best_energy_cpu = prev_cpu, target = -1;
 	unsigned long cpu_cap, cpu_thermal_cap, util;
-	unsigned long base_energy = 0;
 	struct sched_domain *sd;
 	struct perf_domain *pd;
 
@@ -6838,7 +6837,6 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 		/* Compute the 'base' energy of the pd, without @p */
 		base_energy_pd = compute_energy(p, -1, pd, cpu_thermal_cap,
 						pd_busy_time);
-		base_energy += base_energy_pd;
 
 		/* Evaluate the energy impact of using prev_cpu. */
 		if (compute_prev_delta) {
@@ -6867,12 +6865,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 	}
 	rcu_read_unlock();
 
-	/*
-	 * Pick the best CPU if prev_cpu cannot be used, or if it saves at
-	 * least 6% of the energy used by prev_cpu.
-	 */
-	if ((prev_delta == ULONG_MAX) ||
-	    (prev_delta - best_delta) > ((prev_delta + base_energy) >> 4))
+	if (best_delta < prev_delta)
 		target = best_energy_cpu;
 
 	return target;
-- 
2.25.1


      parent reply	other threads:[~2021-12-09 16:12 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-09 16:11 [PATCH 0/4] feec() energy margin removal Vincent Donnefort
2021-12-09 16:11 ` [PATCH 1/4] sched/fair: Provide u64 read for 32-bits arch helper Vincent Donnefort
2021-12-09 16:11 ` [PATCH 2/4] sched/fair: Decay task PELT values during migration Vincent Donnefort
2021-12-20 11:26   ` Dietmar Eggemann
2021-12-20 16:09     ` Vincent Donnefort
2021-12-21 12:46       ` Dietmar Eggemann
2021-12-09 16:11 ` [PATCH 3/4] sched/fair: Remove task_util from effective utilization in feec() Vincent Donnefort
2022-01-04 17:27   ` Dietmar Eggemann
2022-01-12 10:20     ` Vincent Donnefort
2021-12-09 16:11 ` Vincent Donnefort [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211209161159.1596018-5-vincent.donnefort@arm.com \
    --to=vincent.donnefort@arm.com \
    --cc=chris.redpath@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lukasz.luba@arm.com \
    --cc=mingo@redhat.com \
    --cc=morten.rasmussen@arm.com \
    --cc=peterz@infradead.org \
    --cc=qperret@google.com \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).