linux-kernel.vger.kernel.org archive mirror
* [PATCH] sched/fair: Speed-up energy-aware wake-ups
@ 2019-09-12  9:44 Quentin Perret
  2019-09-13 22:43 ` [tip: sched/core] " tip-bot2 for Quentin Perret
  2019-09-20  3:02 ` [PATCH] " Pavan Kondeti
  0 siblings, 2 replies; 8+ messages in thread
From: Quentin Perret @ 2019-09-12  9:44 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot
  Cc: dietmar.eggemann, juri.lelli, rjw, morten.rasmussen,
	valentin.schneider, qais.yousef, tkjos, linux-kernel, qperret

From: Quentin Perret <quentin.perret@arm.com>

EAS computes the energy impact of migrating a waking task when deciding
on which CPU it should run. However, the current approach is known to
have a high algorithmic complexity, which can result in prohibitively
high wake-up latencies on systems with complex energy models, such as
systems with per-CPU DVFS. On such systems, the algorithm complexity is
in O(n^2) (ignoring the cost of searching for performance states in the
EM) with 'n' the number of CPUs.

To address this, re-factor the EAS wake-up path to compute the energy
'delta' (with and without the task) on a per-performance domain basis,
rather than system-wide, which brings the complexity down to O(n).
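
To put rough numbers on this, here is a stand-alone toy model (not
kernel code; the evaluation counts are simplified assumptions) of how
many perf domain evaluations a single wake-up costs in each scheme,
assuming one pd per CPU:

  #include <stdio.h>

  int main(void)
  {
          int n = 8;      /* CPUs == perf domains, e.g. per-CPU DVFS */

          /*
           * Old scheme: up to n+1 candidate CPUs (one max-spare-cap
           * CPU per pd, plus prev_cpu), each triggering a walk of
           * all n pds in compute_energy().
           */
          int old_evals = (n + 1) * n;

          /*
           * New scheme: per pd, one 'base' evaluation without the
           * task plus one for that pd's candidate; prev_cpu adds
           * one more evaluation in its own pd.
           */
          int new_evals = 2 * n + 1;

          printf("old: %d pd evaluations, new: %d\n", old_evals, new_evals);
          return 0;
  }

With n = 8, the toy model gives 72 versus 17 evaluations per wake-up.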

No functional changes intended.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---

Test results
~~~~~~~~~~~~

* Setup: Tested on a Google Pixel 3, with a Snapdragon 845 (4+4 CPUs,
  A55/A75). Base kernel is 5.3-rc5 + Pixel3 specific patches. Android
  userspace, no graphics.

* Test case:  Run a periodic rt-app task, with 16ms period, ramping down
  from 70% to 10%, in 5% steps of 500 ms each (json avail. at [1]).
  Frequencies of all CPUs are pinned to max (using scaling_min_freq
  CPUFreq sysfs entries) to reduce variability. The time to run
  select_task_rq_fair() is measured using the function profiler
  (/sys/kernel/debug/tracing/trace_stat/function*). See the test script
  for more details [2].

Test 1:
I hacked the DT to 'fake' per-CPU DVFS. That is, we end up with one
CPUFreq policy per CPU (8 policies in total). Since all frequencies are
pinned to max for the test, this should have no impact on the actual
frequency selection, but it does affect the EAS energy calculations.

      +---------------------------+----------------------------------+
      | Without patch             | With patch                       |
+-----+-----+----------+----------+-----+-----------------+----------+
| CPU | Hit | Avg (us) | s^2 (us) | Hit | Avg (us)        | s^2 (us) |
|-----+-----+----------+----------+-----+-----------------+----------+
|  0  | 274 | 38.303   | 1750.239 | 401 | 14.126 (-63.1%) | 146.625  |
|  1  | 197 | 49.529   | 1695.852 | 314 | 16.135 (-67.4%) | 167.525  |
|  2  | 142 | 34.296   | 1758.665 | 302 | 14.133 (-58.8%) | 130.071  |
|  3  | 172 | 31.734   | 1490.975 | 641 | 14.637 (-53.9%) | 139.189  |
|  4  | 316 | 7.834    | 178.217  | 425 | 5.413  (-30.9%) | 20.803   |
|  5  | 447 | 8.424    | 144.638  | 556 | 5.929  (-29.6%) | 27.301   |
|  6  | 581 | 14.886   | 346.793  | 456 | 5.711  (-61.6%) | 23.124   |
|  7  | 456 | 10.005   | 211.187  | 997 | 4.708  (-52.9%) | 21.144   |
+-----+-----+----------+----------+-----+-----------------+----------+
             * Hit, Avg and s^2 are as reported by the function profiler

Test 2:
I also ran the same test with a normal DT, with 2 CPUFreq policies, to
see if this causes regressions in the most common case.

      +---------------------------+----------------------------------+
      | Without patch             | With patch                       |
+-----+-----+----------+----------+-----+-----------------+----------+
| CPU | Hit | Avg (us) | s^2 (us) | Hit | Avg (us)        | s^2 (us) |
|-----+-----+----------+----------+-----+-----------------+----------+
|  0  | 345 | 22.184   | 215.321  | 580 | 18.635 (-16.0%) | 146.892  |
|  1  | 358 | 18.597   | 200.596  | 438 | 12.934 (-30.5%) | 104.604  |
|  2  | 359 | 25.566   | 200.217  | 397 | 10.826 (-57.7%) | 74.021   |
|  3  | 362 | 16.881   | 200.291  | 718 | 11.455 (-32.1%) | 102.280  |
|  4  | 457 | 3.822    | 9.895    | 757 | 4.616  (+20.8%) | 13.369   |
|  5  | 344 | 4.301    | 7.121    | 594 | 5.320  (+23.7%) | 18.798   |
|  6  | 472 | 4.326    | 7.849    | 464 | 5.648  (+30.6%) | 22.022   |
|  7  | 331 | 4.630    | 13.937   | 408 | 5.299  (+14.4%) | 18.273   |
+-----+-----+----------+----------+-----+-----------------+----------+
             * Hit, Avg and s^2 are as reported by the function profiler

In addition to these two tests, I also ran 50 iterations of the Lisa
EAS functional test suite [3] with this patch applied on Arm Juno r0,
Arm Juno r2, Arm TC2 and Hikey960, and could not see any regressions
(all EAS functional tests are passing).

[1] https://paste.debian.net/1100055/
[2] https://paste.debian.net/1100057/
[3] https://github.com/ARM-software/lisa/blob/master/lisa/tests/scheduler/eas_behaviour.py
---
 kernel/sched/fair.c | 110 ++++++++++++++++++++------------------------
 1 file changed, 50 insertions(+), 60 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bc9cfeaac8bd..8a521087e11d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6277,69 +6277,55 @@ static unsigned long cpu_util_next(int cpu, struct task_struct *p, int dst_cpu)
 }
 
 /*
- * compute_energy(): Estimates the energy that would be consumed if @p was
+ * compute_energy(): Estimates the energy that @pd would consume if @p was
  * migrated to @dst_cpu. compute_energy() predicts what will be the utilization
- * landscape of the * CPUs after the task migration, and uses the Energy Model
+ * landscape of @pd's CPUs after the task migration, and uses the Energy Model
  * to compute what would be the energy if we decided to actually migrate that
  * task.
  */
 static long
 compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)
 {
-	unsigned int max_util, util_cfs, cpu_util, cpu_cap;
-	unsigned long sum_util, energy = 0;
-	struct task_struct *tsk;
+	struct cpumask *pd_mask = perf_domain_span(pd);
+	unsigned long cpu_cap = arch_scale_cpu_capacity(cpumask_first(pd_mask));
+	unsigned long max_util = 0, sum_util = 0;
 	int cpu;
 
-	for (; pd; pd = pd->next) {
-		struct cpumask *pd_mask = perf_domain_span(pd);
+	/*
+	 * The capacity state of CPUs of the current rd can be driven by CPUs
+	 * of another rd if they belong to the same pd. So, account for the
+	 * utilization of these CPUs too by masking pd with cpu_online_mask
+	 * instead of the rd span.
+	 *
+	 * If an entire pd is outside of the current rd, it will not appear in
+	 * its pd list and will not be accounted by compute_energy().
+	 */
+	for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
+		unsigned long cpu_util, util_cfs = cpu_util_next(cpu, p, dst_cpu);
+		struct task_struct *tsk = cpu == dst_cpu ? p : NULL;
 
 		/*
-		 * The energy model mandates all the CPUs of a performance
-		 * domain have the same capacity.
+		 * Busy time computation: utilization clamping is not
+		 * required since the ratio (sum_util / cpu_capacity)
+		 * is already enough to scale the EM reported power
+		 * consumption at the (eventually clamped) cpu_capacity.
 		 */
-		cpu_cap = arch_scale_cpu_capacity(cpumask_first(pd_mask));
-		max_util = sum_util = 0;
+		sum_util += schedutil_cpu_util(cpu, util_cfs, cpu_cap,
+					       ENERGY_UTIL, NULL);
 
 		/*
-		 * The capacity state of CPUs of the current rd can be driven by
-		 * CPUs of another rd if they belong to the same performance
-		 * domain. So, account for the utilization of these CPUs too
-		 * by masking pd with cpu_online_mask instead of the rd span.
-		 *
-		 * If an entire performance domain is outside of the current rd,
-		 * it will not appear in its pd list and will not be accounted
-		 * by compute_energy().
+		 * Performance domain frequency: utilization clamping
+		 * must be considered since it affects the selection
+		 * of the performance domain frequency.
+		 * NOTE: in case RT tasks are running, by default the
+		 * FREQUENCY_UTIL's utilization can be max OPP.
 		 */
-		for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
-			util_cfs = cpu_util_next(cpu, p, dst_cpu);
-
-			/*
-			 * Busy time computation: utilization clamping is not
-			 * required since the ratio (sum_util / cpu_capacity)
-			 * is already enough to scale the EM reported power
-			 * consumption at the (eventually clamped) cpu_capacity.
-			 */
-			sum_util += schedutil_cpu_util(cpu, util_cfs, cpu_cap,
-						       ENERGY_UTIL, NULL);
-
-			/*
-			 * Performance domain frequency: utilization clamping
-			 * must be considered since it affects the selection
-			 * of the performance domain frequency.
-			 * NOTE: in case RT tasks are running, by default the
-			 * FREQUENCY_UTIL's utilization can be max OPP.
-			 */
-			tsk = cpu == dst_cpu ? p : NULL;
-			cpu_util = schedutil_cpu_util(cpu, util_cfs, cpu_cap,
-						      FREQUENCY_UTIL, tsk);
-			max_util = max(max_util, cpu_util);
-		}
-
-		energy += em_pd_energy(pd->em_pd, max_util, sum_util);
+		cpu_util = schedutil_cpu_util(cpu, util_cfs, cpu_cap,
+					      FREQUENCY_UTIL, tsk);
+		max_util = max(max_util, cpu_util);
 	}
 
-	return energy;
+	return em_pd_energy(pd->em_pd, max_util, sum_util);
 }
 
 /*
@@ -6381,21 +6367,19 @@ compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)
  * other use-cases too. So, until someone finds a better way to solve this,
  * let's keep things simple by re-using the existing slow path.
  */
-
 static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 {
-	unsigned long prev_energy = ULONG_MAX, best_energy = ULONG_MAX;
+	unsigned long prev_delta = ULONG_MAX, best_delta = ULONG_MAX;
 	struct root_domain *rd = cpu_rq(smp_processor_id())->rd;
+	unsigned long cpu_cap, util, base_energy = 0;
 	int cpu, best_energy_cpu = prev_cpu;
-	struct perf_domain *head, *pd;
-	unsigned long cpu_cap, util;
 	struct sched_domain *sd;
+	struct perf_domain *pd;
 
 	rcu_read_lock();
 	pd = rcu_dereference(rd->pd);
 	if (!pd || READ_ONCE(rd->overutilized))
 		goto fail;
-	head = pd;
 
 	/*
 	 * Energy-aware wake-up happens on the lowest sched_domain starting
@@ -6412,9 +6396,14 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 		goto unlock;
 
 	for (; pd; pd = pd->next) {
-		unsigned long cur_energy, spare_cap, max_spare_cap = 0;
+		unsigned long cur_delta, spare_cap, max_spare_cap = 0;
+		unsigned long base_energy_pd;
 		int max_spare_cap_cpu = -1;
 
+		/* Compute the 'base' energy of the pd, without @p */
+		base_energy_pd = compute_energy(p, -1, pd);
+		base_energy += base_energy_pd;
+
 		for_each_cpu_and(cpu, perf_domain_span(pd), sched_domain_span(sd)) {
 			if (!cpumask_test_cpu(cpu, p->cpus_ptr))
 				continue;
@@ -6427,9 +6416,9 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 
 			/* Always use prev_cpu as a candidate. */
 			if (cpu == prev_cpu) {
-				prev_energy = compute_energy(p, prev_cpu, head);
-				best_energy = min(best_energy, prev_energy);
-				continue;
+				prev_delta = compute_energy(p, prev_cpu, pd);
+				prev_delta -= base_energy_pd;
+				best_delta = min(best_delta, prev_delta);
 			}
 
 			/*
@@ -6445,9 +6434,10 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 
 		/* Evaluate the energy impact of using this CPU. */
 		if (max_spare_cap_cpu >= 0) {
-			cur_energy = compute_energy(p, max_spare_cap_cpu, head);
-			if (cur_energy < best_energy) {
-				best_energy = cur_energy;
+			cur_delta = compute_energy(p, max_spare_cap_cpu, pd);
+			cur_delta -= base_energy_pd;
+			if (cur_delta < best_delta) {
+				best_delta = cur_delta;
 				best_energy_cpu = max_spare_cap_cpu;
 			}
 		}
@@ -6459,10 +6449,10 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 	 * Pick the best CPU if prev_cpu cannot be used, or if it saves at
 	 * least 6% of the energy used by prev_cpu.
 	 */
-	if (prev_energy == ULONG_MAX)
+	if (prev_delta == ULONG_MAX)
 		return best_energy_cpu;
 
-	if ((prev_energy - best_energy) > (prev_energy >> 4))
+	if ((prev_delta - best_delta) > ((prev_delta + base_energy) >> 4))
 		return best_energy_cpu;
 
 	return prev_cpu;
-- 
2.22.1



* [tip: sched/core] sched/fair: Speed-up energy-aware wake-ups
  2019-09-12  9:44 [PATCH] sched/fair: Speed-up energy-aware wake-ups Quentin Perret
@ 2019-09-13 22:43 ` tip-bot2 for Quentin Perret
  2019-09-20  3:02 ` [PATCH] " Pavan Kondeti
  1 sibling, 0 replies; 8+ messages in thread
From: tip-bot2 for Quentin Perret @ 2019-09-13 22:43 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Quentin Perret, Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
	dietmar.eggemann, juri.lelli, morten.rasmussen, qais.yousef,
	qperret, rjw, tkjos, valentin.schneider, vincent.guittot,
	Ingo Molnar, Borislav Petkov, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     eb92692b2544d3f415887dbbc98499843dfe568b
Gitweb:        https://git.kernel.org/tip/eb92692b2544d3f415887dbbc98499843dfe568b
Author:        Quentin Perret <quentin.perret@arm.com>
AuthorDate:    Thu, 12 Sep 2019 11:44:04 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 13 Sep 2019 07:45:17 +02:00

sched/fair: Speed-up energy-aware wake-ups

EAS computes the energy impact of migrating a waking task when deciding
on which CPU it should run. However, the current approach is known to
have a high algorithmic complexity, which can result in prohibitively
high wake-up latencies on systems with complex energy models, such as
systems with per-CPU DVFS. On such systems, the algorithm complexity is
in O(n^2) (ignoring the cost of searching for performance states in the
EM) with 'n' the number of CPUs.

To address this, re-factor the EAS wake-up path to compute the energy
'delta' (with and without the task) on a per-performance domain basis,
rather than system-wide, which brings the complexity down to O(n).

No functional changes intended.

Test results
~~~~~~~~~~~~

* Setup: Tested on a Google Pixel 3, with a Snapdragon 845 (4+4 CPUs,
  A55/A75). Base kernel is 5.3-rc5 + Pixel3 specific patches. Android
  userspace, no graphics.

* Test case:  Run a periodic rt-app task, with 16ms period, ramping down
  from 70% to 10%, in 5% steps of 500 ms each (json avail. at [1]).
  Frequencies of all CPUs are pinned to max (using scaling_min_freq
  CPUFreq sysfs entries) to reduce variability. The time to run
  select_task_rq_fair() is measured using the function profiler
  (/sys/kernel/debug/tracing/trace_stat/function*). See the test script
  for more details [2].

Test 1:

I hacked the DT to 'fake' per-CPU DVFS. That is, we end up with one
CPUFreq policy per CPU (8 policies in total). Since all frequencies are
pinned to max for the test, this should have no impact on the actual
frequency selection, but it does affect the EAS energy calculations.

      +---------------------------+----------------------------------+
      | Without patch             | With patch                       |
+-----+-----+----------+----------+-----+-----------------+----------+
| CPU | Hit | Avg (us) | s^2 (us) | Hit | Avg (us)        | s^2 (us) |
|-----+-----+----------+----------+-----+-----------------+----------+
|  0  | 274 | 38.303   | 1750.239 | 401 | 14.126 (-63.1%) | 146.625  |
|  1  | 197 | 49.529   | 1695.852 | 314 | 16.135 (-67.4%) | 167.525  |
|  2  | 142 | 34.296   | 1758.665 | 302 | 14.133 (-58.8%) | 130.071  |
|  3  | 172 | 31.734   | 1490.975 | 641 | 14.637 (-53.9%) | 139.189  |
|  4  | 316 | 7.834    | 178.217  | 425 | 5.413  (-30.9%) | 20.803   |
|  5  | 447 | 8.424    | 144.638  | 556 | 5.929  (-29.6%) | 27.301   |
|  6  | 581 | 14.886   | 346.793  | 456 | 5.711  (-61.6%) | 23.124   |
|  7  | 456 | 10.005   | 211.187  | 997 | 4.708  (-52.9%) | 21.144   |
+-----+-----+----------+----------+-----+-----------------+----------+
             * Hit, Avg and s^2 are as reported by the function profiler

Test 2:
I also ran the same test with a normal DT, with 2 CPUFreq policies, to
see if this causes regressions in the most common case.

      +---------------------------+----------------------------------+
      | Without patch             | With patch                       |
+-----+-----+----------+----------+-----+-----------------+----------+
| CPU | Hit | Avg (us) | s^2 (us) | Hit | Avg (us)        | s^2 (us) |
|-----+-----+----------+----------+-----+-----------------+----------+
|  0  | 345 | 22.184   | 215.321  | 580 | 18.635 (-16.0%) | 146.892  |
|  1  | 358 | 18.597   | 200.596  | 438 | 12.934 (-30.5%) | 104.604  |
|  2  | 359 | 25.566   | 200.217  | 397 | 10.826 (-57.7%) | 74.021   |
|  3  | 362 | 16.881   | 200.291  | 718 | 11.455 (-32.1%) | 102.280  |
|  4  | 457 | 3.822    | 9.895    | 757 | 4.616  (+20.8%) | 13.369   |
|  5  | 344 | 4.301    | 7.121    | 594 | 5.320  (+23.7%) | 18.798   |
|  6  | 472 | 4.326    | 7.849    | 464 | 5.648  (+30.6%) | 22.022   |
|  7  | 331 | 4.630    | 13.937   | 408 | 5.299  (+14.4%) | 18.273   |
+-----+-----+----------+----------+-----+-----------------+----------+
             * Hit, Avg and s^2 are as reported by the function profiler

In addition to these two tests, I also ran 50 iterations of the Lisa
EAS functional test suite [3] with this patch applied on Arm Juno r0,
Arm Juno r2, Arm TC2 and Hikey960, and could not see any regressions
(all EAS functional tests are passing).

 [1] https://paste.debian.net/1100055/
 [2] https://paste.debian.net/1100057/
 [3] https://github.com/ARM-software/lisa/blob/master/lisa/tests/scheduler/eas_behaviour.py

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: juri.lelli@redhat.com
Cc: morten.rasmussen@arm.com
Cc: qais.yousef@arm.com
Cc: qperret@qperret.net
Cc: rjw@rjwysocki.net
Cc: tkjos@google.com
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Link: https://lkml.kernel.org/r/20190912094404.13802-1-qperret@qperret.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 110 +++++++++++++++++++------------------------
 1 file changed, 50 insertions(+), 60 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1054d2c..8b66511 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6251,69 +6251,55 @@ static unsigned long cpu_util_next(int cpu, struct task_struct *p, int dst_cpu)
 }
 
 /*
- * compute_energy(): Estimates the energy that would be consumed if @p was
+ * compute_energy(): Estimates the energy that @pd would consume if @p was
  * migrated to @dst_cpu. compute_energy() predicts what will be the utilization
- * landscape of the * CPUs after the task migration, and uses the Energy Model
+ * landscape of @pd's CPUs after the task migration, and uses the Energy Model
  * to compute what would be the energy if we decided to actually migrate that
  * task.
  */
 static long
 compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)
 {
-	unsigned int max_util, util_cfs, cpu_util, cpu_cap;
-	unsigned long sum_util, energy = 0;
-	struct task_struct *tsk;
+	struct cpumask *pd_mask = perf_domain_span(pd);
+	unsigned long cpu_cap = arch_scale_cpu_capacity(cpumask_first(pd_mask));
+	unsigned long max_util = 0, sum_util = 0;
 	int cpu;
 
-	for (; pd; pd = pd->next) {
-		struct cpumask *pd_mask = perf_domain_span(pd);
+	/*
+	 * The capacity state of CPUs of the current rd can be driven by CPUs
+	 * of another rd if they belong to the same pd. So, account for the
+	 * utilization of these CPUs too by masking pd with cpu_online_mask
+	 * instead of the rd span.
+	 *
+	 * If an entire pd is outside of the current rd, it will not appear in
+	 * its pd list and will not be accounted by compute_energy().
+	 */
+	for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
+		unsigned long cpu_util, util_cfs = cpu_util_next(cpu, p, dst_cpu);
+		struct task_struct *tsk = cpu == dst_cpu ? p : NULL;
 
 		/*
-		 * The energy model mandates all the CPUs of a performance
-		 * domain have the same capacity.
+		 * Busy time computation: utilization clamping is not
+		 * required since the ratio (sum_util / cpu_capacity)
+		 * is already enough to scale the EM reported power
+		 * consumption at the (eventually clamped) cpu_capacity.
 		 */
-		cpu_cap = arch_scale_cpu_capacity(cpumask_first(pd_mask));
-		max_util = sum_util = 0;
+		sum_util += schedutil_cpu_util(cpu, util_cfs, cpu_cap,
+					       ENERGY_UTIL, NULL);
 
 		/*
-		 * The capacity state of CPUs of the current rd can be driven by
-		 * CPUs of another rd if they belong to the same performance
-		 * domain. So, account for the utilization of these CPUs too
-		 * by masking pd with cpu_online_mask instead of the rd span.
-		 *
-		 * If an entire performance domain is outside of the current rd,
-		 * it will not appear in its pd list and will not be accounted
-		 * by compute_energy().
+		 * Performance domain frequency: utilization clamping
+		 * must be considered since it affects the selection
+		 * of the performance domain frequency.
+		 * NOTE: in case RT tasks are running, by default the
+		 * FREQUENCY_UTIL's utilization can be max OPP.
 		 */
-		for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
-			util_cfs = cpu_util_next(cpu, p, dst_cpu);
-
-			/*
-			 * Busy time computation: utilization clamping is not
-			 * required since the ratio (sum_util / cpu_capacity)
-			 * is already enough to scale the EM reported power
-			 * consumption at the (eventually clamped) cpu_capacity.
-			 */
-			sum_util += schedutil_cpu_util(cpu, util_cfs, cpu_cap,
-						       ENERGY_UTIL, NULL);
-
-			/*
-			 * Performance domain frequency: utilization clamping
-			 * must be considered since it affects the selection
-			 * of the performance domain frequency.
-			 * NOTE: in case RT tasks are running, by default the
-			 * FREQUENCY_UTIL's utilization can be max OPP.
-			 */
-			tsk = cpu == dst_cpu ? p : NULL;
-			cpu_util = schedutil_cpu_util(cpu, util_cfs, cpu_cap,
-						      FREQUENCY_UTIL, tsk);
-			max_util = max(max_util, cpu_util);
-		}
-
-		energy += em_pd_energy(pd->em_pd, max_util, sum_util);
+		cpu_util = schedutil_cpu_util(cpu, util_cfs, cpu_cap,
+					      FREQUENCY_UTIL, tsk);
+		max_util = max(max_util, cpu_util);
 	}
 
-	return energy;
+	return em_pd_energy(pd->em_pd, max_util, sum_util);
 }
 
 /*
@@ -6355,21 +6341,19 @@ compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)
  * other use-cases too. So, until someone finds a better way to solve this,
  * let's keep things simple by re-using the existing slow path.
  */
-
 static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 {
-	unsigned long prev_energy = ULONG_MAX, best_energy = ULONG_MAX;
+	unsigned long prev_delta = ULONG_MAX, best_delta = ULONG_MAX;
 	struct root_domain *rd = cpu_rq(smp_processor_id())->rd;
+	unsigned long cpu_cap, util, base_energy = 0;
 	int cpu, best_energy_cpu = prev_cpu;
-	struct perf_domain *head, *pd;
-	unsigned long cpu_cap, util;
 	struct sched_domain *sd;
+	struct perf_domain *pd;
 
 	rcu_read_lock();
 	pd = rcu_dereference(rd->pd);
 	if (!pd || READ_ONCE(rd->overutilized))
 		goto fail;
-	head = pd;
 
 	/*
 	 * Energy-aware wake-up happens on the lowest sched_domain starting
@@ -6386,9 +6370,14 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 		goto unlock;
 
 	for (; pd; pd = pd->next) {
-		unsigned long cur_energy, spare_cap, max_spare_cap = 0;
+		unsigned long cur_delta, spare_cap, max_spare_cap = 0;
+		unsigned long base_energy_pd;
 		int max_spare_cap_cpu = -1;
 
+		/* Compute the 'base' energy of the pd, without @p */
+		base_energy_pd = compute_energy(p, -1, pd);
+		base_energy += base_energy_pd;
+
 		for_each_cpu_and(cpu, perf_domain_span(pd), sched_domain_span(sd)) {
 			if (!cpumask_test_cpu(cpu, p->cpus_ptr))
 				continue;
@@ -6401,9 +6390,9 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 
 			/* Always use prev_cpu as a candidate. */
 			if (cpu == prev_cpu) {
-				prev_energy = compute_energy(p, prev_cpu, head);
-				best_energy = min(best_energy, prev_energy);
-				continue;
+				prev_delta = compute_energy(p, prev_cpu, pd);
+				prev_delta -= base_energy_pd;
+				best_delta = min(best_delta, prev_delta);
 			}
 
 			/*
@@ -6419,9 +6408,10 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 
 		/* Evaluate the energy impact of using this CPU. */
 		if (max_spare_cap_cpu >= 0) {
-			cur_energy = compute_energy(p, max_spare_cap_cpu, head);
-			if (cur_energy < best_energy) {
-				best_energy = cur_energy;
+			cur_delta = compute_energy(p, max_spare_cap_cpu, pd);
+			cur_delta -= base_energy_pd;
+			if (cur_delta < best_delta) {
+				best_delta = cur_delta;
 				best_energy_cpu = max_spare_cap_cpu;
 			}
 		}
@@ -6433,10 +6423,10 @@ unlock:
 	 * Pick the best CPU if prev_cpu cannot be used, or if it saves at
 	 * least 6% of the energy used by prev_cpu.
 	 */
-	if (prev_energy == ULONG_MAX)
+	if (prev_delta == ULONG_MAX)
 		return best_energy_cpu;
 
-	if ((prev_energy - best_energy) > (prev_energy >> 4))
+	if ((prev_delta - best_delta) > ((prev_delta + base_energy) >> 4))
 		return best_energy_cpu;
 
 	return prev_cpu;


* Re: [PATCH] sched/fair: Speed-up energy-aware wake-ups
  2019-09-12  9:44 [PATCH] sched/fair: Speed-up energy-aware wake-ups Quentin Perret
  2019-09-13 22:43 ` [tip: sched/core] " tip-bot2 for Quentin Perret
@ 2019-09-20  3:02 ` Pavan Kondeti
  2019-09-20  9:41   ` Quentin Perret
  1 sibling, 1 reply; 8+ messages in thread
From: Pavan Kondeti @ 2019-09-20  3:02 UTC (permalink / raw)
  To: Quentin Perret
  Cc: peterz, mingo, vincent.guittot, dietmar.eggemann, juri.lelli,
	rjw, morten.rasmussen, valentin.schneider, qais.yousef, tkjos,
	linux-kernel

Hi Quentin,

On Thu, Sep 12, 2019 at 11:44:04AM +0200, Quentin Perret wrote:
> From: Quentin Perret <quentin.perret@arm.com>
> 
> EAS computes the energy impact of migrating a waking task when deciding
> on which CPU it should run. However, the current approach is known to
> have a high algorithmic complexity, which can result in prohibitively
> high wake-up latencies on systems with complex energy models, such as
> systems with per-CPU DVFS. On such systems, the algorithm complexity is
> in O(n^2) (ignoring the cost of searching for performance states in the
> EM) with 'n' the number of CPUs.
> 
> To address this, re-factor the EAS wake-up path to compute the energy
> 'delta' (with and without the task) on a per-performance domain basis,
> rather than system-wide, which brings the complexity down to O(n).
> 
> No functional changes intended.
> 
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
>  

[snip]

>  /*
> @@ -6381,21 +6367,19 @@ compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)
>   * other use-cases too. So, until someone finds a better way to solve this,
>   * let's keep things simple by re-using the existing slow path.
>   */
> -
>  static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  {
> -	unsigned long prev_energy = ULONG_MAX, best_energy = ULONG_MAX;
> +	unsigned long prev_delta = ULONG_MAX, best_delta = ULONG_MAX;
>  	struct root_domain *rd = cpu_rq(smp_processor_id())->rd;
> +	unsigned long cpu_cap, util, base_energy = 0;
>  	int cpu, best_energy_cpu = prev_cpu;
> -	struct perf_domain *head, *pd;
> -	unsigned long cpu_cap, util;
>  	struct sched_domain *sd;
> +	struct perf_domain *pd;
>  
>  	rcu_read_lock();
>  	pd = rcu_dereference(rd->pd);
>  	if (!pd || READ_ONCE(rd->overutilized))
>  		goto fail;
> -	head = pd;
>  
>  	/*
>  	 * Energy-aware wake-up happens on the lowest sched_domain starting
> @@ -6412,9 +6396,14 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  		goto unlock;
>  
>  	for (; pd; pd = pd->next) {
> -		unsigned long cur_energy, spare_cap, max_spare_cap = 0;
> +		unsigned long cur_delta, spare_cap, max_spare_cap = 0;
> +		unsigned long base_energy_pd;
>  		int max_spare_cap_cpu = -1;
>  
> +		/* Compute the 'base' energy of the pd, without @p */
> +		base_energy_pd = compute_energy(p, -1, pd);
> +		base_energy += base_energy_pd;
> +
>  		for_each_cpu_and(cpu, perf_domain_span(pd), sched_domain_span(sd)) {
>  			if (!cpumask_test_cpu(cpu, p->cpus_ptr))
>  				continue;
> @@ -6427,9 +6416,9 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  
>  			/* Always use prev_cpu as a candidate. */
>  			if (cpu == prev_cpu) {
> -				prev_energy = compute_energy(p, prev_cpu, head);
> -				best_energy = min(best_energy, prev_energy);
> -				continue;
> +				prev_delta = compute_energy(p, prev_cpu, pd);
> +				prev_delta -= base_energy_pd;
> +				best_delta = min(best_delta, prev_delta);
>  			}

Earlier, we were not checking the spare capacity for the prev_cpu. Now
that the continue statement is removed, prev_cpu could also be the
max_spare_cap_cpu. Actually, that makes sense: there is no reason why we
would want to select another CPU which has less spare capacity than the
previous CPU.

Is this behavior intentional?

When prev_cpu == max_spare_cap_cpu, we are evaluating the energy again for the
same CPU below. That could be skipped by returning prev_cpu when
prev_cpu == max_spare_cap_cpu.
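
To make the redundancy concrete, here is a stand-alone sketch (the stub
and counter are hypothetical; only the call pattern mirrors the code
above):

  #include <stdio.h>

  static int nr_evals;

  /* Stand-in for compute_energy(p, cpu, pd): only counts calls. */
  static long compute_energy_stub(int cpu)
  {
          nr_evals++;
          return 1000 + cpu;
  }

  int main(void)
  {
          int prev_cpu = 2;
          int max_spare_cap_cpu = 2;  /* prev_cpu has max spare cap */
          long prev_delta, cur_delta;

          prev_delta = compute_energy_stub(prev_cpu);         /* prev_cpu check */
          cur_delta = compute_energy_stub(max_spare_cap_cpu); /* same CPU again */

          printf("deltas: %ld/%ld, evals: %d (the second is redundant)\n",
                 prev_delta, cur_delta, nr_evals);
          return 0;
  }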

>  
>  			/*
> @@ -6445,9 +6434,10 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  
>  		/* Evaluate the energy impact of using this CPU. */
>  		if (max_spare_cap_cpu >= 0) {
> -			cur_energy = compute_energy(p, max_spare_cap_cpu, head);
> -			if (cur_energy < best_energy) {
> -				best_energy = cur_energy;
> +			cur_delta = compute_energy(p, max_spare_cap_cpu, pd);
> +			cur_delta -= base_energy_pd;
> +			if (cur_delta < best_delta) {
> +				best_delta = cur_delta;
>  				best_energy_cpu = max_spare_cap_cpu;
>  			}
>  		}
> @@ -6459,10 +6449,10 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  	 * Pick the best CPU if prev_cpu cannot be used, or if it saves at
>  	 * least 6% of the energy used by prev_cpu.
>  	 */
> -	if (prev_energy == ULONG_MAX)
> +	if (prev_delta == ULONG_MAX)
>  		return best_energy_cpu;
>  
> -	if ((prev_energy - best_energy) > (prev_energy >> 4))
> +	if ((prev_delta - best_delta) > ((prev_delta + base_energy) >> 4))
>  		return best_energy_cpu;
>  
>  	return prev_cpu;
> -- 
> 2.22.1
> 

Thanks,
Pavan

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.



* Re: [PATCH] sched/fair: Speed-up energy-aware wake-ups
  2019-09-20  3:02 ` [PATCH] " Pavan Kondeti
@ 2019-09-20  9:41   ` Quentin Perret
  2019-09-20 10:33     ` Pavan Kondeti
  2019-09-27  8:10     ` [tip: sched/urgent] sched/fair: Avoid redundant EAS calculation tip-bot2 for Quentin Perret
  0 siblings, 2 replies; 8+ messages in thread
From: Quentin Perret @ 2019-09-20  9:41 UTC (permalink / raw)
  To: Pavan Kondeti
  Cc: peterz, mingo, vincent.guittot, dietmar.eggemann, juri.lelli,
	rjw, morten.rasmussen, valentin.schneider, qais.yousef, tkjos,
	linux-kernel

Hi Pavan,

On Friday 20 Sep 2019 at 08:32:15 (+0530), Pavan Kondeti wrote:
> Earlier, we were not checking the spare capacity for the prev_cpu. Now
> that the continue statement is removed, prev_cpu could also be the
> max_spare_cap_cpu. Actually, that makes sense: there is no reason why we
> would want to select another CPU which has less spare capacity than the
> previous CPU.
> 
> Is this behavior intentional?

The intent was indeed to not compute the energy for another CPU in
prev_cpu's perf domain if prev_cpu is the one with max spare cap -- it
is useless to do so since this other CPU cannot 'beat' prev_cpu and
will never be chosen in the end.

But I did miss that we'd end up computing the energy for prev_cpu
twice ... Harmless but useless. So yeah, let's optimize that case too :)

> When prev_cpu == max_spare_cap_cpu, we are evaluating the energy again for the
> same CPU below. That could be skipped by returning prev_cpu when
> prev_cpu == max_spare_cap_cpu.

Right, something like the patch below? My test results are still
looking good with it applied.

Thanks for the careful review,
Quentin
---
From 7b8258287f180a2c383ebe397e8129f5f898ffbe Mon Sep 17 00:00:00 2001
From: Quentin Perret <qperret@qperret.net>
Date: Fri, 20 Sep 2019 09:07:20 +0100
Subject: [PATCH] sched/fair: Avoid redundant EAS calculation

The EAS wake-up path computes the system energy for several CPU
candidates: the CPU with maximum spare capacity in each performance
domain, and the prev_cpu. However, if prev_cpu also happens to be the
CPU with maximum spare capacity in its performance domain, the energy
calculation is still done twice, unnecessarily.

Add a condition to filter out this corner case before doing the energy
calculation.

Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Quentin Perret <qperret@qperret.net>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d4bbf68c3161..7399382bc291 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6412,7 +6412,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 		}
 
 		/* Evaluate the energy impact of using this CPU. */
-		if (max_spare_cap_cpu >= 0) {
+		if (max_spare_cap_cpu >= 0 && max_spare_cap_cpu != prev_cpu) {
 			cur_delta = compute_energy(p, max_spare_cap_cpu, pd);
 			cur_delta -= base_energy_pd;
 			if (cur_delta < best_delta) {
-- 
2.22.1



* Re: [PATCH] sched/fair: Speed-up energy-aware wake-ups
  2019-09-20  9:41   ` Quentin Perret
@ 2019-09-20 10:33     ` Pavan Kondeti
  2019-09-20 11:23       ` Quentin Perret
  2019-09-27  8:10     ` [tip: sched/urgent] sched/fair: Avoid redundant EAS calculation tip-bot2 for Quentin Perret
  1 sibling, 1 reply; 8+ messages in thread
From: Pavan Kondeti @ 2019-09-20 10:33 UTC (permalink / raw)
  To: Quentin Perret
  Cc: peterz, mingo, vincent.guittot, dietmar.eggemann, juri.lelli,
	rjw, morten.rasmussen, valentin.schneider, qais.yousef, tkjos,
	linux-kernel

Hi Quentin,

On Fri, Sep 20, 2019 at 11:41:15AM +0200, Quentin Perret wrote:
> Hi Pavan,
> 
> On Friday 20 Sep 2019 at 08:32:15 (+0530), Pavan Kondeti wrote:
> > Earlier, we were not checking the spare capacity for the prev_cpu. Now
> > that the continue statement is removed, prev_cpu could also be the
> > max_spare_cap_cpu. Actually, that makes sense: there is no reason why we
> > would want to select another CPU which has less spare capacity than the
> > previous CPU.
> > 
> > Is this behavior intentional?
> 
> The intent was indeed to not compute the energy for another CPU in
> prev_cpu's perf domain if prev_cpu is the one with max spare cap -- it
> is useless to do so since this other CPU cannot 'beat' prev_cpu and
> will never be chosen in the end.

Yes. Selecting the prev_cpu is the correct decision.

> 
> But I did miss that we'd end up computing the energy for prev_cpu
> twice ... Harmless but useless. So yeah, let's optimize that case too :)
> 
> > When prev_cpu == max_spare_cap_cpu, we are evaluating the energy again for the
> > same CPU below. That could be skipped by returning prev_cpu when
> > prev_cpu == max_spare_cap_cpu.
> 
> Right, something like the patch below? My test results are still
> looking good with it applied.
> 
> Thanks for the careful review,
> Quentin
> ---
> From 7b8258287f180a2c383ebe397e8129f5f898ffbe Mon Sep 17 00:00:00 2001
> From: Quentin Perret <qperret@qperret.net>
> Date: Fri, 20 Sep 2019 09:07:20 +0100
> Subject: [PATCH] sched/fair: Avoid redundant EAS calculation
> 
> The EAS wake-up path computes the system energy for several CPU
> candidates: the CPU with maximum spare capacity in each performance
> domain, and the prev_cpu. However, if prev_cpu also happens to be the
> CPU with maximum spare capacity in its performance domain, the energy
> calculation is still done twice, unnecessarily.
> 
> Add a condition to filter out this corner case before doing the energy
> calculation.
> 
> Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
> Signed-off-by: Quentin Perret <qperret@qperret.net>
> ---
>  kernel/sched/fair.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d4bbf68c3161..7399382bc291 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6412,7 +6412,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  		}
>  
>  		/* Evaluate the energy impact of using this CPU. */
> -		if (max_spare_cap_cpu >= 0) {
> +		if (max_spare_cap_cpu >= 0 && max_spare_cap_cpu != prev_cpu) {
>  			cur_delta = compute_energy(p, max_spare_cap_cpu, pd);
>  			cur_delta -= base_energy_pd;
>  			if (cur_delta < best_delta) {
> -- 
> 2.22.1
> 

+1. Looks good to me.

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.



* Re: [PATCH] sched/fair: Speed-up energy-aware wake-ups
  2019-09-20 10:33     ` Pavan Kondeti
@ 2019-09-20 11:23       ` Quentin Perret
  2019-09-25  8:41         ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: Quentin Perret @ 2019-09-20 11:23 UTC (permalink / raw)
  To: Pavan Kondeti
  Cc: peterz, mingo, vincent.guittot, dietmar.eggemann, juri.lelli,
	rjw, morten.rasmussen, valentin.schneider, qais.yousef, tkjos,
	linux-kernel

On Friday 20 Sep 2019 at 16:03:38 (+0530), Pavan Kondeti wrote:
> +1. Looks good to me.

Cool, thanks.

Peter/Ingo, is there anything else I should do? Should I resend the
patch 'properly' (that is, not inline)?

Thanks,
Quentin


* Re: [PATCH] sched/fair: Speed-up energy-aware wake-ups
  2019-09-20 11:23       ` Quentin Perret
@ 2019-09-25  8:41         ` Peter Zijlstra
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2019-09-25  8:41 UTC (permalink / raw)
  To: Quentin Perret
  Cc: Pavan Kondeti, mingo, vincent.guittot, dietmar.eggemann,
	juri.lelli, rjw, morten.rasmussen, valentin.schneider,
	qais.yousef, tkjos, linux-kernel

On Fri, Sep 20, 2019 at 01:23:00PM +0200, Quentin Perret wrote:
> On Friday 20 Sep 2019 at 16:03:38 (+0530), Pavan Kondeti wrote:
> > +1. Looks good to me.
> 
> Cool, thanks.
> 
> Peter/Ingo, is there anything else I should do? Should I resend the
> patch 'properly' (that is, not inline)?

Got it, thanks!


* [tip: sched/urgent] sched/fair: Avoid redundant EAS calculation
  2019-09-20  9:41   ` Quentin Perret
  2019-09-20 10:33     ` Pavan Kondeti
@ 2019-09-27  8:10     ` tip-bot2 for Quentin Perret
  1 sibling, 0 replies; 8+ messages in thread
From: tip-bot2 for Quentin Perret @ 2019-09-27  8:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Pavan Kondeti, Quentin Perret, Peter Zijlstra (Intel),
	Linus Torvalds, Thomas Gleixner, dietmar.eggemann, juri.lelli,
	morten.rasmussen, qais.yousef, rjw, tkjos, valentin.schneider,
	vincent.guittot, Ingo Molnar, Borislav Petkov, linux-kernel

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     4892f51ad54ddff2883a60b6ad4323c1f632a9d6
Gitweb:        https://git.kernel.org/tip/4892f51ad54ddff2883a60b6ad4323c1f632a9d6
Author:        Quentin Perret <qperret@qperret.net>
AuthorDate:    Fri, 20 Sep 2019 11:41:15 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 25 Sep 2019 17:42:32 +02:00

sched/fair: Avoid redundant EAS calculation

The EAS wake-up path computes the system energy for several CPU
candidates: the CPU with maximum spare capacity in each performance
domain, and the prev_cpu. However, if prev_cpu also happens to be the
CPU with maximum spare capacity in its performance domain, the energy
calculation is still done twice, unnecessarily.

Add a condition to filter out this corner case before doing the energy
calculation.

Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Quentin Perret <qperret@qperret.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: juri.lelli@redhat.com
Cc: morten.rasmussen@arm.com
Cc: qais.yousef@arm.com
Cc: rjw@rjwysocki.net
Cc: tkjos@google.com
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Fixes: eb92692b2544 ("sched/fair: Speed-up energy-aware wake-ups")
Link: https://lkml.kernel.org/r/20190920094115.GA11503@qperret.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index dfdac90..83ab35e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6389,7 +6389,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 		}
 
 		/* Evaluate the energy impact of using this CPU. */
-		if (max_spare_cap_cpu >= 0) {
+		if (max_spare_cap_cpu >= 0 && max_spare_cap_cpu != prev_cpu) {
 			cur_delta = compute_energy(p, max_spare_cap_cpu, pd);
 			cur_delta -= base_energy_pd;
 			if (cur_delta < best_delta) {

