* [PATCH v5 0/6] sched: Clean-ups and asymmetric cpu capacity support
@ 2016-10-14 13:41 Morten Rasmussen
From: Morten Rasmussen @ 2016-10-14 13:41 UTC
  To: peterz, mingo
  Cc: dietmar.eggemann, yuyang.du, vincent.guittot, mgalbraith,
	sgurrappadi, freedom.tan, keita.kobayashi.ym, linux-kernel,
	Morten Rasmussen

Hi,

The scheduler currently does little to help performance on systems with
asymmetric compute capacities (read: ARM big.LITTLE). This series improves
the situation with a few tweaks, mainly to the task wake-up path, so that
compute capacity is considered at wake-up and not just whether a cpu is
idle. This gives us consistent, and potentially higher, throughput in
partially utilized scenarios. SMP behaviour and performance should be
unaffected.

Test 0:
	for i in `seq 1 10`; \
	       do sysbench --test=cpu --max-time=3 --num-threads=1 run; \
	       done \
	| awk '{if ($4=="events:") {print $5; sum +=$5; runs +=1}} \
	       END {print "Average events: " sum/runs}'

Target: ARM TC2 (2xA15+3xA7)

	(Higher is better)
tip:	Average events: 116.3
patch:	Average events: 217.9

Target: ARM Juno (2xA57+4xA53)

	(Higher is better)
tip:	Average events: 2063.2
patch:	Average events: 2684.1

Test 1:
	perf stat --null --repeat 10 -- \
	perf bench sched messaging -g 50 -l 5000

Target: Intel IVB-EP (2 sockets * 10 cores * 2 threads)

tip:    4.815292358 seconds time elapsed  ( +-  0.77% ) 
patch:  4.855237141 seconds time elapsed  ( +-  1.00% )

Target: ARM TC2 A7-only (3xA7) (-l 1000)

tip:    63.888583172 seconds time elapsed ( +-  0.08% ) 
patch:  63.841030289 seconds time elapsed ( +-  0.23% ) 

Target: ARM Juno A53-only (4xA53) (-l 1000)

tip:    37.252267738 seconds time elapsed ( +-  0.24% ) 
patch:  37.480712902 seconds time elapsed ( +-  0.26% ) 

Notes:

Active migration of tasks away from small capacity cpus isn't addressed
in this set, although it is necessary for consistent throughput in other
scenarios on asymmetric cpu capacity systems.

The infrastructure to enable capacity awareness for arm64 and arm is not
provided here but will be based on Juri's DT bindings patch set [1]. A
combined preview branch is available [2]. The test results above are
based on [2].

[1] https://lkml.org/lkml/2016/7/19/419
[2] git://linux-arm.org/linux-power.git capacity_awareness_v5_arm64_v1

Patch    1: Fix task utilization for wake-up decisions.
Patch  2-5: Improve capacity awareness.
Patch    6: Comment fix.

Tested-by: Koan-Sin Tan <freedom.tan@mediatek.com>
Tested-by: Keita Kobayashi <keita.kobayashi.ym@renesas.com>

v5:

- Changed peak utilization tracking to only update when tasks are
  dequeued to sleep as suggested by Patrick Bellasi.

- Fixed wrong use of task_util_peak() in cpu_util_wake().

- Added comment fix for previously merged patch.

v4: https://lkml.org/lkml/2016/8/31/292

- Removed patches already in tip/sched/core.

- Fixed wrong use of capacity_of() instead of capacity_orig_of() as
  reported by Wanpeng Li.

- Re-implemented the fix for task wake-up utilization. Instead of
  estimating the utilization, it is now computed and updated correctly.

- Introduced peak utilization tracking to compensate for decay in
  wake-up placement decisions.

- Removed pointless spare capacity selection criteria in
  find_idlest_group() as pointed out by Vincent and added a comment
  describing when we use spare capacity instead of least load.

v3: https://lkml.org/lkml/2016/7/25/245

- Changed SD_ASYM_CPUCAPACITY sched_domain flag semantics as suggested
  by PeterZ.

- Dropped arm specific patches for setting cpu capacity as these are
  superseded by Juri's patches [2].

- Changed capacity-aware pulling during load-balance to use sched_group
  min capacity instead of max as suggested by Sai.

v2: https://lkml.org/lkml/2016/6/22/614

- Dropped patch ignoring wakee_flips for pid=0 for now as we cannot
  distinguish cpu time spent processing irqs from idle time.

- Dropped disabling WAKE_AFFINE as suggested by Vincent to allow more
  scenarios to use the fast path (select_idle_sibling()). Asymmetric wake
  conditions adjusted accordingly.

- Changed use of new SD_ASYM_CPUCAPACITY slightly. Now enables
  SD_BALANCE_WAKE.

- Minor clean-ups and rebased to more recent tip/sched/core.

v1: https://lkml.org/lkml/2014/5/23/621

Morten Rasmussen (6):
  sched/fair: Compute task/cpu utilization at wake-up correctly
  sched/fair: Consider spare capacity in find_idlest_group()
  sched: Add per-cpu min capacity to sched_group_capacity
  sched/fair: Avoid pulling tasks from non-overloaded higher capacity
    groups
  sched/fair: Track peak per-entity utilization
  sched/fair: Fix wrong comment for capacity_margin

 include/linux/sched.h |   2 +-
 kernel/sched/core.c   |   3 +-
 kernel/sched/fair.c   | 146 ++++++++++++++++++++++++++++++++++++++++++++------
 kernel/sched/sched.h  |   3 +-
 4 files changed, 135 insertions(+), 19 deletions(-)

-- 
2.7.4


* [PATCH v5 1/6] sched/fair: Compute task/cpu utilization at wake-up correctly
From: Morten Rasmussen @ 2016-10-14 13:41 UTC
  To: peterz, mingo
  Cc: dietmar.eggemann, yuyang.du, vincent.guittot, mgalbraith,
	sgurrappadi, freedom.tan, keita.kobayashi.ym, linux-kernel,
	Morten Rasmussen

At task wake-up load-tracking isn't updated until the task is enqueued.
The task's own view of its utilization contribution may therefore not be
aligned with its contribution to the cfs_rq load-tracking which may have
been updated in the meantime. Basically, the task's own utilization
hasn't yet accounted for the sleep decay, while the cfs_rq may have
(partially). Estimating the cfs_rq utilization in case the task is
migrated at wake-up as task_rq(p)->cfs.avg.util_avg - p->se.avg.util_avg
is therefore incorrect as the two load-tracking signals aren't time
synchronized (different last update).

To solve this problem, this patch synchronizes the task utilization with
its previous rq before the task utilization is used in the wake-up path.
Currently the update/synchronization is done _after_ the task has been
placed by select_task_rq_fair(). The synchronization is done without
having to take the rq lock using the existing mechanism used in
remove_entity_load_avg().
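
To make the problem concrete, here is a minimal standalone sketch (not
kernel code; it assumes PELT's ~32ms half-life and made-up utilization
values) of why subtracting a stale task signal from a decayed cfs_rq
signal goes wrong, and how decaying the task signal to the same
last_update_time fixes it:

#include <stdio.h>
#include <math.h>

int main(void)
{
        double y = pow(0.5, 1.0 / 32.0);  /* per-1ms PELT decay factor */
        double task_util = 400.0;         /* task util_avg, frozen at dequeue */
        double other_util = 300.0;        /* other (blocked) load on the cfs_rq */
        int slept_ms = 100;               /* cfs_rq kept decaying meanwhile */

        /* cfs_rq util_avg has decayed all contributions, the task's included */
        double cfs_now = (task_util + other_util) * pow(y, slept_ms);

        /* Naive: subtract the stale, non-decayed task signal */
        double naive = cfs_now - task_util;
        /* Synced: decay the task signal to the same point in time first */
        double synced = cfs_now - task_util * pow(y, slept_ms);

        printf("naive:  %6.1f (negative, would clamp to 0)\n", naive);
        printf("synced: %6.1f (matches decayed other_util %.1f)\n",
               synced, other_util * pow(y, slept_ms));
        return 0;
}

With these numbers the naive estimate comes out around -320 while the
synchronized estimate correctly leaves ~34, the decayed contribution of
the remaining load.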

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
---
 kernel/sched/fair.c | 39 +++++++++++++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 543b2f291152..f4abdbdeab50 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3205,13 +3205,25 @@ static inline u64 cfs_rq_last_update_time(struct cfs_rq *cfs_rq)
 #endif
 
 /*
+ * Synchronize entity load avg of dequeued entity without locking
+ * the previous rq.
+ */
+void sync_entity_load_avg(struct sched_entity *se)
+{
+	struct cfs_rq *cfs_rq = cfs_rq_of(se);
+	u64 last_update_time;
+
+	last_update_time = cfs_rq_last_update_time(cfs_rq);
+	__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0, 0, NULL);
+}
+
+/*
  * Task first catches up with cfs_rq, and then subtract
  * itself from the cfs_rq (task must be off the queue now).
  */
 void remove_entity_load_avg(struct sched_entity *se)
 {
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
-	u64 last_update_time;
 
 	/*
 	 * tasks cannot exit without having gone through wake_up_new_task() ->
@@ -3223,9 +3235,7 @@ void remove_entity_load_avg(struct sched_entity *se)
 	 * calls this.
 	 */
 
-	last_update_time = cfs_rq_last_update_time(cfs_rq);
-
-	__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0, 0, NULL);
+	sync_entity_load_avg(se);
 	atomic_long_add(se->avg.load_avg, &cfs_rq->removed_load_avg);
 	atomic_long_add(se->avg.util_avg, &cfs_rq->removed_util_avg);
 }
@@ -5579,6 +5589,24 @@ static inline int task_util(struct task_struct *p)
 }
 
 /*
+ * cpu_util_wake: Compute cpu utilization with any contributions from
+ * the waking task p removed.
+ */
+static int cpu_util_wake(int cpu, struct task_struct *p)
+{
+	unsigned long util, capacity;
+
+	/* Task has no contribution or is new */
+	if (cpu != task_cpu(p) || !p->se.avg.last_update_time)
+		return cpu_util(cpu);
+
+	capacity = capacity_orig_of(cpu);
+	util = max_t(long, cpu_rq(cpu)->cfs.avg.util_avg - task_util(p), 0);
+
+	return (util >= capacity) ? capacity : util;
+}
+
+/*
  * Disable WAKE_AFFINE in the case where task @p doesn't fit in the
  * capacity of either the waking CPU @cpu or the previous CPU @prev_cpu.
  *
@@ -5596,6 +5624,9 @@ static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
 	if (max_cap - min_cap < max_cap >> 3)
 		return 0;
 
+	/* Bring task utilization in sync with prev_cpu */
+	sync_entity_load_avg(&p->se);
+
 	return min_cap * 1024 < task_util(p) * capacity_margin;
 }
 
-- 
2.7.4


* [PATCH v5 2/6] sched/fair: Consider spare capacity in find_idlest_group()
From: Morten Rasmussen @ 2016-10-14 13:41 UTC
  To: peterz, mingo
  Cc: dietmar.eggemann, yuyang.du, vincent.guittot, mgalbraith,
	sgurrappadi, freedom.tan, keita.kobayashi.ym, linux-kernel,
	Morten Rasmussen

In low-utilization scenarios comparing relative loads in
find_idlest_group() doesn't always lead to the optimal choice. Systems
with groups containing different numbers of cpus and/or cpus of
different compute capacity are significantly better off when considering
spare capacity rather than relative load in those scenarios.

In addition to the existing load-based search, an alternative
spare-capacity-based candidate sched_group is found and selected instead
if sufficient spare capacity exists. If not, the existing behaviour is
preserved.
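
A standalone sketch of the resulting selection logic (illustrative
numbers only; imbalance mirrors the 100 + (imbalance_pct-100)/2 scaling
used by find_idlest_group()):

#include <stdio.h>

int main(void)
{
        unsigned long task_util = 300;   /* waking task's utilization */
        unsigned long this_spare = 250;  /* max spare capacity in local group */
        unsigned long most_spare = 600;  /* best spare capacity elsewhere */
        int imbalance = 112;             /* e.g. imbalance_pct = 125 */

        if (this_spare > task_util / 2 &&
            imbalance * this_spare > 100 * most_spare)
                printf("stay local (return NULL)\n");
        else if (most_spare > task_util / 2)
                printf("use most_spare_sg (spare-capacity path)\n");
        else
                printf("fall back to the least-load comparison\n");
        return 0;
}

Here the remote group offers more than twice the local spare capacity,
so the spare-capacity path wins; had most_spare been below
task_util/2 = 150, the existing least-load comparison would decide.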

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
---
 kernel/sched/fair.c | 50 +++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f4abdbdeab50..cf84e37b3b4a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5203,6 +5203,14 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
 	return 1;
 }
 
+static inline int task_util(struct task_struct *p);
+static int cpu_util_wake(int cpu, struct task_struct *p);
+
+static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
+{
+	return capacity_orig_of(cpu) - cpu_util_wake(cpu, p);
+}
+
 /*
  * find_idlest_group finds and returns the least busy CPU group within the
  * domain.
@@ -5212,7 +5220,9 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 		  int this_cpu, int sd_flag)
 {
 	struct sched_group *idlest = NULL, *group = sd->groups;
+	struct sched_group *most_spare_sg = NULL;
 	unsigned long min_load = ULONG_MAX, this_load = 0;
+	unsigned long most_spare = 0, this_spare = 0;
 	int load_idx = sd->forkexec_idx;
 	int imbalance = 100 + (sd->imbalance_pct-100)/2;
 
@@ -5220,7 +5230,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 		load_idx = sd->wake_idx;
 
 	do {
-		unsigned long load, avg_load;
+		unsigned long load, avg_load, spare_cap, max_spare_cap;
 		int local_group;
 		int i;
 
@@ -5232,8 +5242,12 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 		local_group = cpumask_test_cpu(this_cpu,
 					       sched_group_cpus(group));
 
-		/* Tally up the load of all CPUs in the group */
+		/*
+		 * Tally up the load of all CPUs in the group and find
+		 * the group containing the cpu with most spare capacity.
+		 */
 		avg_load = 0;
+		max_spare_cap = 0;
 
 		for_each_cpu(i, sched_group_cpus(group)) {
 			/* Bias balancing toward cpus of our domain */
@@ -5243,6 +5257,11 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 				load = target_load(i, load_idx);
 
 			avg_load += load;
+
+			spare_cap = capacity_spare_wake(i, p);
+
+			if (spare_cap > max_spare_cap)
+				max_spare_cap = spare_cap;
 		}
 
 		/* Adjust by relative CPU capacity of the group */
@@ -5250,12 +5269,33 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 
 		if (local_group) {
 			this_load = avg_load;
-		} else if (avg_load < min_load) {
-			min_load = avg_load;
-			idlest = group;
+			this_spare = max_spare_cap;
+		} else {
+			if (avg_load < min_load) {
+				min_load = avg_load;
+				idlest = group;
+			}
+
+			if (most_spare < max_spare_cap) {
+				most_spare = max_spare_cap;
+				most_spare_sg = group;
+			}
 		}
 	} while (group = group->next, group != sd->groups);
 
+	/*
+	 * The cross-over point between using spare capacity or least load
+	 * is too conservative for high utilization tasks on partially
+	 * utilized systems if we require spare_capacity > task_util(p),
+	 * so we allow for some task stuffing by using
+	 * spare_capacity > task_util(p)/2.
+	 */
+	if (this_spare > task_util(p) / 2 &&
+	    imbalance*this_spare > 100*most_spare)
+		return NULL;
+	else if (most_spare > task_util(p) / 2)
+		return most_spare_sg;
+
 	if (!idlest || 100*this_load < imbalance*min_load)
 		return NULL;
 	return idlest;
-- 
2.7.4


* [PATCH v5 3/6] sched: Add per-cpu min capacity to sched_group_capacity
From: Morten Rasmussen @ 2016-10-14 13:41 UTC
  To: peterz, mingo
  Cc: dietmar.eggemann, yuyang.du, vincent.guittot, mgalbraith,
	sgurrappadi, freedom.tan, keita.kobayashi.ym, linux-kernel,
	Morten Rasmussen

struct sched_group_capacity currently represents the compute capacity
sum of all cpus in the sched_group. Unless it is divided by the
group_weight to get the average capacity per cpu, it hides differences in
cpu capacity for mixed capacity systems (e.g. high RT/IRQ utilization or
ARM big.LITTLE). But even the average may not be sufficient if the group
covers cpus of different capacities. Instead, by extending struct
sched_group_capacity to indicate the min per-cpu capacity in the group, a
suitable group for a given task utilization can more easily be found,
such that cpus with reduced capacity can be avoided for tasks with high
utilization (not implemented by this patch).
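
A standalone sketch of what min_capacity captures that the existing sum
(and its average) hides; the per-cpu capacities are illustrative
big.LITTLE-like values, not taken from the patch:

#include <stdio.h>

int main(void)
{
        /* e.g. 2 big cpus at 1024 and 4 little cpus at 446 */
        unsigned long cpu_cap[] = { 1024, 1024, 446, 446, 446, 446 };
        unsigned long capacity = 0, min_capacity = ~0UL;
        unsigned int i;

        for (i = 0; i < sizeof(cpu_cap) / sizeof(cpu_cap[0]); i++) {
                capacity += cpu_cap[i];
                if (cpu_cap[i] < min_capacity)
                        min_capacity = cpu_cap[i];
        }

        /* avg = 638 still looks "medium"; min = 446 exposes the weakest cpu */
        printf("sum=%lu avg=%lu min=%lu\n", capacity, capacity / i, min_capacity);
        return 0;
}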

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
---
 kernel/sched/core.c  |  3 ++-
 kernel/sched/fair.c  | 17 ++++++++++++-----
 kernel/sched/sched.h |  3 ++-
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index aae08cedd75e..50526a20ab94 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5703,7 +5703,7 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 		printk(KERN_CONT " %*pbl",
 		       cpumask_pr_args(sched_group_cpus(group)));
 		if (group->sgc->capacity != SCHED_CAPACITY_SCALE) {
-			printk(KERN_CONT " (cpu_capacity = %d)",
+			printk(KERN_CONT " (cpu_capacity = %lu)",
 				group->sgc->capacity);
 		}
 
@@ -6180,6 +6180,7 @@ build_overlap_sched_groups(struct sched_domain *sd, int cpu)
 		 * die on a /0 trap.
 		 */
 		sg->sgc->capacity = SCHED_CAPACITY_SCALE * cpumask_weight(sg_span);
+		sg->sgc->min_capacity = SCHED_CAPACITY_SCALE;
 
 		/*
 		 * Make sure the first group of this domain contains the
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cf84e37b3b4a..28e42cb41d7b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6905,13 +6905,14 @@ static void update_cpu_capacity(struct sched_domain *sd, int cpu)
 
 	cpu_rq(cpu)->cpu_capacity = capacity;
 	sdg->sgc->capacity = capacity;
+	sdg->sgc->min_capacity = capacity;
 }
 
 void update_group_capacity(struct sched_domain *sd, int cpu)
 {
 	struct sched_domain *child = sd->child;
 	struct sched_group *group, *sdg = sd->groups;
-	unsigned long capacity;
+	unsigned long capacity, min_capacity;
 	unsigned long interval;
 
 	interval = msecs_to_jiffies(sd->balance_interval);
@@ -6924,6 +6925,7 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
 	}
 
 	capacity = 0;
+	min_capacity = ULONG_MAX;
 
 	if (child->flags & SD_OVERLAP) {
 		/*
@@ -6948,11 +6950,12 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
 			 */
 			if (unlikely(!rq->sd)) {
 				capacity += capacity_of(cpu);
-				continue;
+			} else {
+				sgc = rq->sd->groups->sgc;
+				capacity += sgc->capacity;
 			}
 
-			sgc = rq->sd->groups->sgc;
-			capacity += sgc->capacity;
+			min_capacity = min(capacity, min_capacity);
 		}
 	} else  {
 		/*
@@ -6962,12 +6965,16 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
 
 		group = child->groups;
 		do {
-			capacity += group->sgc->capacity;
+			struct sched_group_capacity *sgc = group->sgc;
+
+			capacity += sgc->capacity;
+			min_capacity = min(sgc->min_capacity, min_capacity);
 			group = group->next;
 		} while (group != child->groups);
 	}
 
 	sdg->sgc->capacity = capacity;
+	sdg->sgc->min_capacity = min_capacity;
 }
 
 /*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 19b99869809d..b3b3ecbbb494 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -892,7 +892,8 @@ struct sched_group_capacity {
 	 * CPU capacity of this group, SCHED_CAPACITY_SCALE being max capacity
 	 * for a single CPU.
 	 */
-	unsigned int capacity;
+	unsigned long capacity;
+	unsigned long min_capacity; /* Min per-cpu capacity in group */
 	unsigned long next_update;
 	int imbalance; /* XXX unrelated to capacity but shared group state */
 
-- 
2.7.4


* [PATCH v5 4/6] sched/fair: Avoid pulling tasks from non-overloaded higher capacity groups
From: Morten Rasmussen @ 2016-10-14 13:41 UTC
  To: peterz, mingo
  Cc: dietmar.eggemann, yuyang.du, vincent.guittot, mgalbraith,
	sgurrappadi, freedom.tan, keita.kobayashi.ym, linux-kernel,
	Morten Rasmussen

For asymmetric cpu capacity systems it is counter-productive for
throughput if low capacity cpus are pulling tasks from non-overloaded
cpus with higher capacity. The assumption is that higher cpu capacity is
preferred over running alone in a group with lower cpu capacity.

This patch rejects higher cpu capacity groups with one or fewer tasks per
cpu as the potential busiest group; pulling from them could otherwise lead
to a series of failing load-balancing attempts ending in a force-migration.
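
A standalone sketch of the capacity comparison behind the rejection; the
helper mirrors the group_smaller_cpu_capacity() test introduced by this
patch, the capacities are illustrative, and capacity_margin = 1280 (~20%)
as elsewhere in the series:

#include <stdio.h>
#include <stdbool.h>

static unsigned int capacity_margin = 1280;     /* ~20% */

static bool group_smaller_cpu_capacity(unsigned long sg_min_cap,
                                       unsigned long ref_min_cap)
{
        return sg_min_cap * capacity_margin < ref_min_cap * 1024;
}

int main(void)
{
        /* local group of little cpus vs. candidate group of big cpus */
        printf("little(446) vs big(1024): %d\n",
               group_smaller_cpu_capacity(446, 1024));  /* 1: don't pull */
        /* equal-capacity groups never trigger the rejection */
        printf("big(1024) vs big(1024): %d\n",
               group_smaller_cpu_capacity(1024, 1024)); /* 0 */
        return 0;
}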

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
---
 kernel/sched/fair.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 28e42cb41d7b..a5efafda23ef 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7069,6 +7069,17 @@ group_is_overloaded(struct lb_env *env, struct sg_lb_stats *sgs)
 	return false;
 }
 
+/*
+ * group_smaller_cpu_capacity: Returns true if sched_group sg has smaller
+ * per-cpu capacity than sched_group ref.
+ */
+static inline bool
+group_smaller_cpu_capacity(struct sched_group *sg, struct sched_group *ref)
+{
+	return sg->sgc->min_capacity * capacity_margin <
+						ref->sgc->min_capacity * 1024;
+}
+
 static inline enum
 group_type group_classify(struct sched_group *group,
 			  struct sg_lb_stats *sgs)
@@ -7172,6 +7183,20 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 	if (sgs->avg_load <= busiest->avg_load)
 		return false;
 
+	if (!(env->sd->flags & SD_ASYM_CPUCAPACITY))
+		goto asym_packing;
+
+	/*
+	 * Candidate sg has no more than one task per cpu and
+	 * has higher per-cpu capacity. Migrating tasks to less
+	 * capable cpus may harm throughput. Maximize throughput,
+	 * power/energy consequences are not considered.
+	 */
+	if (sgs->sum_nr_running <= sgs->group_weight &&
+	    group_smaller_cpu_capacity(sds->local, sg))
+		return false;
+
+asym_packing:
 	/* This is the busiest node in its class. */
 	if (!(env->sd->flags & SD_ASYM_PACKING))
 		return true;
-- 
2.7.4


* [PATCH v5 5/6] sched/fair: Track peak per-entity utilization
From: Morten Rasmussen @ 2016-10-14 13:41 UTC
  To: peterz, mingo
  Cc: dietmar.eggemann, yuyang.du, vincent.guittot, mgalbraith,
	sgurrappadi, freedom.tan, keita.kobayashi.ym, linux-kernel,
	Morten Rasmussen

When using PELT (per-entity load tracking) utilization to place tasks at
wake-up, the decayed utilization (due to sleep) leads to under-estimation
of the true utilization of the task. This could mean putting the task on
a cpu with less available capacity than is actually needed. This issue
can be mitigated by using 'peak' utilization instead of the decayed
utilization for placement decisions, e.g. at task wake-up.

The 'peak' utilization metric, util_peak, tracks util_avg when the task
is running and retains its previous value while the task is
blocked/waiting on the rq. It is instantly updated to track util_avg
again as soon as the task is running again.
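
A standalone sketch of the intended behaviour (illustrative numbers; the
decay approximates PELT's ~32ms half-life): util_avg decays while the
task sleeps, util_peak holds the value snapshotted at dequeue:

#include <stdio.h>
#include <math.h>

int main(void)
{
        double y = pow(0.5, 1.0 / 32.0);  /* per-1ms PELT decay factor */
        double util_avg = 500.0;          /* util while the task was running */
        double util_peak = util_avg;      /* snapshot taken at dequeue */
        int ms;

        for (ms = 0; ms <= 96; ms += 32)
                printf("slept %2dms: util_avg=%5.1f util_peak=%5.1f\n",
                       ms, util_avg * pow(y, ms), util_peak);
        return 0;
}

After 96ms of sleep util_avg has dropped to 62.5 while util_peak still
reports 500, so wake-up placement sees the task's capacity need from
before the sleep rather than the decayed value.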

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
---
 include/linux/sched.h |  2 +-
 kernel/sched/fair.c   | 23 +++++++++++++++++------
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ad51978ff15e..988d7f48604e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1294,7 +1294,7 @@ struct load_weight {
 struct sched_avg {
 	u64 last_update_time, load_sum;
 	u32 util_sum, period_contrib;
-	unsigned long load_avg, util_avg;
+	unsigned long load_avg, util_avg, util_peak;
 };
 
 #ifdef CONFIG_SCHEDSTATS
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a5efafda23ef..e0abff77764f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -696,6 +696,7 @@ void init_entity_runnable_average(struct sched_entity *se)
 	 * At this point, util_avg won't be used in select_task_rq_fair anyway
 	 */
 	sa->util_avg = 0;
+	sa->util_peak = 0;
 	sa->util_sum = 0;
 	/* when this task enqueue'ed, it will contribute to its cfs_rq's load_avg */
 }
@@ -747,6 +748,7 @@ void post_init_entity_util_avg(struct sched_entity *se)
 		} else {
 			sa->util_avg = cap;
 		}
+		sa->util_peak = sa->util_avg;
 		sa->util_sum = sa->util_avg * LOAD_AVG_MAX;
 	}
 
@@ -3515,6 +3517,10 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 */
 	if ((flags & (DEQUEUE_SAVE | DEQUEUE_MOVE)) == DEQUEUE_SAVE)
 		update_min_vruntime(cfs_rq);
+
+	/* Save peak PELT utilization for task to help wake-up decisions */
+	if (flags & DEQUEUE_SLEEP && entity_is_task(se))
+		se->avg.util_peak = se->avg.util_avg;
 }
 
 /*
@@ -5203,7 +5209,7 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
 	return 1;
 }
 
-static inline int task_util(struct task_struct *p);
+static inline int task_util_peak(struct task_struct *p);
 static int cpu_util_wake(int cpu, struct task_struct *p);
 
 static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
@@ -5286,14 +5292,14 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 	/*
 	 * The cross-over point between using spare capacity or least load
 	 * is too conservative for high utilization tasks on partially
-	 * utilized systems if we require spare_capacity > task_util(p),
+	 * utilized systems if we require spare_capacity > task_util_peak(p),
 	 * so we allow for some task stuffing by using
-	 * spare_capacity > task_util(p)/2.
+	 * spare_capacity > task_util_peak(p)/2.
 	 */
-	if (this_spare > task_util(p) / 2 &&
+	if (this_spare > task_util_peak(p) / 2 &&
 	    imbalance*this_spare > 100*most_spare)
 		return NULL;
-	else if (most_spare > task_util(p) / 2)
+	else if (most_spare > task_util_peak(p) / 2)
 		return most_spare_sg;
 
 	if (!idlest || 100*this_load < imbalance*min_load)
@@ -5628,6 +5634,11 @@ static inline int task_util(struct task_struct *p)
 	return p->se.avg.util_avg;
 }
 
+static inline int task_util_peak(struct task_struct *p)
+{
+	return p->se.avg.util_peak;
+}
+
 /*
  * cpu_util_wake: Compute cpu utilization with any contributions from
  * the waking task p removed.
@@ -5667,7 +5678,7 @@ static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
 	/* Bring task utilization in sync with prev_cpu */
 	sync_entity_load_avg(&p->se);
 
-	return min_cap * 1024 < task_util(p) * capacity_margin;
+	return min_cap * 1024 < task_util_peak(p) * capacity_margin;
 }
 
 /*
-- 
2.7.4


* [PATCH v5 6/6] sched/fair: Fix wrong comment for capacity_margin
From: Morten Rasmussen @ 2016-10-14 13:41 UTC
  To: peterz, mingo
  Cc: dietmar.eggemann, yuyang.du, vincent.guittot, mgalbraith,
	sgurrappadi, freedom.tan, keita.kobayashi.ym, linux-kernel,
	Morten Rasmussen

The comment for capacity_margin introduced in "sched/fair: Let
asymmetric cpu configurations balance at wake-up" got its usage the
wrong way round.
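
For reference, with capacity_margin = 1280 the corrected condition
util * margin < capacity * 1024 reduces to
util < capacity * 1024/1280 = 0.8 * capacity, i.e. a utilization fits a
cpu as long as it stays below ~80% of the capacity, which is the ~20%
margin noted next to the definition.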

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e0abff77764f..08408d6405a5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -116,7 +116,7 @@ unsigned int sysctl_sched_cfs_bandwidth_slice = 5000UL;
 
 /*
  * The margin used when comparing utilization with CPU capacity:
- * util * 1024 < capacity * margin
+ * util * margin < capacity * 1024
  */
 unsigned int capacity_margin = 1280; /* ~20% */
 
-- 
2.7.4


* Re: [PATCH v5 5/6] sched/fair: Track peak per-entity utilization
From: Morten Rasmussen @ 2016-10-17  8:51 UTC
  To: peterz, mingo
  Cc: dietmar.eggemann, yuyang.du, vincent.guittot, mgalbraith,
	sgurrappadi, freedom.tan, keita.kobayashi.ym, linux-kernel

On Fri, Oct 14, 2016 at 02:41:11PM +0100, Morten Rasmussen wrote:
> @@ -3515,6 +3517,10 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>  	 */
>  	if ((flags & (DEQUEUE_SAVE | DEQUEUE_MOVE)) == DEQUEUE_SAVE)
>  		update_min_vruntime(cfs_rq);
> +
> +	/* Save peak PELT utilization for task to help wake-up decisions */
> +	if (flags & DEQUEUE_SLEEP && entity_is_task(se))
> +		se->avg.util_peak = se->avg.util_avg;
>  }
>  
>  /*

The friendly kbuild robot swiftly pointed out that this doesn't build
for !CONFIG_SMP. The replacement patch below moves this bit inside
dequeue_entity_load_avg(), which should be equivalent and does not break
!CONFIG_SMP.

----8<---

From 36966c83cc3493332d92dcadb795eebc8c300558 Mon Sep 17 00:00:00 2001
From: Morten Rasmussen <morten.rasmussen@arm.com>
Date: Wed, 17 Aug 2016 15:30:43 +0100
Subject: [PATCH v5 5/6] sched/fair: Track peak per-entity utilization

When using PELT (per-entity load tracking) utilization to place tasks at
wake-up, the decayed utilization (due to sleep) leads to under-estimation
of the true utilization of the task. This could mean putting the task on
a cpu with less available capacity than is actually needed. This issue
can be mitigated by using 'peak' utilization instead of the decayed
utilization for placement decisions, e.g. at task wake-up.

The 'peak' utilization metric, util_peak, tracks util_avg when the task
is running and retains its previous value while the task is
blocked/waiting on the rq. It is instantly updated to track util_avg
again as soon as the task is running again.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
---
 include/linux/sched.h |  2 +-
 kernel/sched/fair.c   | 23 +++++++++++++++++------
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ad51978ff15e..988d7f48604e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1294,7 +1294,7 @@ struct load_weight {
 struct sched_avg {
 	u64 last_update_time, load_sum;
 	u32 util_sum, period_contrib;
-	unsigned long load_avg, util_avg;
+	unsigned long load_avg, util_avg, util_peak;
 };
 
 #ifdef CONFIG_SCHEDSTATS
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a5efafda23ef..84b767399d61 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -696,6 +696,7 @@ void init_entity_runnable_average(struct sched_entity *se)
 	 * At this point, util_avg won't be used in select_task_rq_fair anyway
 	 */
 	sa->util_avg = 0;
+	sa->util_peak = 0;
 	sa->util_sum = 0;
 	/* when this task enqueue'ed, it will contribute to its cfs_rq's load_avg */
 }
@@ -747,6 +748,7 @@ void post_init_entity_util_avg(struct sched_entity *se)
 		} else {
 			sa->util_avg = cap;
 		}
+		sa->util_peak = sa->util_avg;
 		sa->util_sum = sa->util_avg * LOAD_AVG_MAX;
 	}
 
@@ -3181,6 +3183,10 @@ dequeue_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		max_t(long, cfs_rq->runnable_load_avg - se->avg.load_avg, 0);
 	cfs_rq->runnable_load_sum =
 		max_t(s64,  cfs_rq->runnable_load_sum - se->avg.load_sum, 0);
+
+	/* Save peak PELT utilization for task to help wake-up decisions */
+	if (entity_is_task(se))
+		se->avg.util_peak = se->avg.util_avg;
 }
 
 #ifndef CONFIG_64BIT
@@ -5203,7 +5209,7 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
 	return 1;
 }
 
-static inline int task_util(struct task_struct *p);
+static inline int task_util_peak(struct task_struct *p);
 static int cpu_util_wake(int cpu, struct task_struct *p);
 
 static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
@@ -5286,14 +5292,14 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 	/*
 	 * The cross-over point between using spare capacity or least load
 	 * is too conservative for high utilization tasks on partially
-	 * utilized systems if we require spare_capacity > task_util(p),
+	 * utilized systems if we require spare_capacity > task_util_peak(p),
 	 * so we allow for some task stuffing by using
-	 * spare_capacity > task_util(p)/2.
+	 * spare_capacity > task_util_peak(p)/2.
 	 */
-	if (this_spare > task_util(p) / 2 &&
+	if (this_spare > task_util_peak(p) / 2 &&
 	    imbalance*this_spare > 100*most_spare)
 		return NULL;
-	else if (most_spare > task_util(p) / 2)
+	else if (most_spare > task_util_peak(p) / 2)
 		return most_spare_sg;
 
 	if (!idlest || 100*this_load < imbalance*min_load)
@@ -5628,6 +5634,11 @@ static inline int task_util(struct task_struct *p)
 	return p->se.avg.util_avg;
 }
 
+static inline int task_util_peak(struct task_struct *p)
+{
+	return p->se.avg.util_peak;
+}
+
 /*
  * cpu_util_wake: Compute cpu utilization with any contributions from
  * the waking task p removed.
@@ -5667,7 +5678,7 @@ static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
 	/* Bring task utilization in sync with prev_cpu */
 	sync_entity_load_avg(&p->se);
 
-	return min_cap * 1024 < task_util(p) * capacity_margin;
+	return min_cap * 1024 < task_util_peak(p) * capacity_margin;
 }
 
 /*
-- 
2.7.4


* [tip:sched/core] sched/fair: Compute task/cpu utilization at wake-up correctly
From: tip-bot for Morten Rasmussen @ 2016-11-16 12:12 UTC
  To: linux-tip-commits
  Cc: mingo, tglx, hpa, peterz, linux-kernel, torvalds, morten.rasmussen

Commit-ID:  104cb16d9eb684f071d5bf3aa87c0d01af259b7c
Gitweb:     http://git.kernel.org/tip/104cb16d9eb684f071d5bf3aa87c0d01af259b7c
Author:     Morten Rasmussen <morten.rasmussen@arm.com>
AuthorDate: Fri, 14 Oct 2016 14:41:07 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 16 Nov 2016 10:29:05 +0100

sched/fair: Compute task/cpu utilization at wake-up correctly

At task wake-up load-tracking isn't updated until the task is enqueued.
The task's own view of its utilization contribution may therefore not be
aligned with its contribution to the cfs_rq load-tracking which may have
been updated in the meantime. Basically, the task's own utilization
hasn't yet accounted for the sleep decay, while the cfs_rq may have
(partially). Estimating the cfs_rq utilization in case the task is
migrated at wake-up as task_rq(p)->cfs.avg.util_avg - p->se.avg.util_avg
is therefore incorrect as the two load-tracking signals aren't time
synchronized (different last update).

To solve this problem, this patch synchronizes the task utilization with
its previous rq before the task utilization is used in the wake-up path.
Currently the update/synchronization is done _after_ the task has been
placed by select_task_rq_fair(). The synchronization is done without
having to take the rq lock using the existing mechanism used in
remove_entity_load_avg().

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-2-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 39 +++++++++++++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3cf446c..b05d691 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3199,13 +3199,25 @@ static inline u64 cfs_rq_last_update_time(struct cfs_rq *cfs_rq)
 #endif
 
 /*
+ * Synchronize entity load avg of dequeued entity without locking
+ * the previous rq.
+ */
+void sync_entity_load_avg(struct sched_entity *se)
+{
+	struct cfs_rq *cfs_rq = cfs_rq_of(se);
+	u64 last_update_time;
+
+	last_update_time = cfs_rq_last_update_time(cfs_rq);
+	__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0, 0, NULL);
+}
+
+/*
  * Task first catches up with cfs_rq, and then subtract
  * itself from the cfs_rq (task must be off the queue now).
  */
 void remove_entity_load_avg(struct sched_entity *se)
 {
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
-	u64 last_update_time;
 
 	/*
 	 * tasks cannot exit without having gone through wake_up_new_task() ->
@@ -3217,9 +3229,7 @@ void remove_entity_load_avg(struct sched_entity *se)
 	 * calls this.
 	 */
 
-	last_update_time = cfs_rq_last_update_time(cfs_rq);
-
-	__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0, 0, NULL);
+	sync_entity_load_avg(se);
 	atomic_long_add(se->avg.load_avg, &cfs_rq->removed_load_avg);
 	atomic_long_add(se->avg.util_avg, &cfs_rq->removed_util_avg);
 }
@@ -5583,6 +5593,24 @@ static inline int task_util(struct task_struct *p)
 }
 
 /*
+ * cpu_util_wake: Compute cpu utilization with any contributions from
+ * the waking task p removed.
+ */
+static int cpu_util_wake(int cpu, struct task_struct *p)
+{
+	unsigned long util, capacity;
+
+	/* Task has no contribution or is new */
+	if (cpu != task_cpu(p) || !p->se.avg.last_update_time)
+		return cpu_util(cpu);
+
+	capacity = capacity_orig_of(cpu);
+	util = max_t(long, cpu_rq(cpu)->cfs.avg.util_avg - task_util(p), 0);
+
+	return (util >= capacity) ? capacity : util;
+}
+
+/*
  * Disable WAKE_AFFINE in the case where task @p doesn't fit in the
  * capacity of either the waking CPU @cpu or the previous CPU @prev_cpu.
  *
@@ -5600,6 +5628,9 @@ static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
 	if (max_cap - min_cap < max_cap >> 3)
 		return 0;
 
+	/* Bring task utilization in sync with prev_cpu */
+	sync_entity_load_avg(&p->se);
+
 	return min_cap * 1024 < task_util(p) * capacity_margin;
 }
 


* [tip:sched/core] sched/fair: Consider spare capacity in find_idlest_group()
From: tip-bot for Morten Rasmussen @ 2016-11-16 12:13 UTC
  To: linux-tip-commits
  Cc: tglx, hpa, torvalds, morten.rasmussen, linux-kernel, mingo, peterz

Commit-ID:  6a0b19c0f39a7a7b7fb77d3867a733136ff059a3
Gitweb:     http://git.kernel.org/tip/6a0b19c0f39a7a7b7fb77d3867a733136ff059a3
Author:     Morten Rasmussen <morten.rasmussen@arm.com>
AuthorDate: Fri, 14 Oct 2016 14:41:08 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 16 Nov 2016 10:29:05 +0100

sched/fair: Consider spare capacity in find_idlest_group()

In low-utilization scenarios comparing relative loads in
find_idlest_group() doesn't always lead to the optimal choice. Systems
with groups containing different numbers of cpus and/or cpus of
different compute capacity are significantly better off when considering
spare capacity rather than relative load in those scenarios.

In addition to the existing load-based search, an alternative
spare-capacity-based candidate sched_group is found and selected instead
if sufficient spare capacity exists. If not, the existing behaviour is
preserved.

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-3-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 50 +++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b05d691..1ad3706 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5202,6 +5202,14 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
 	return 1;
 }
 
+static inline int task_util(struct task_struct *p);
+static int cpu_util_wake(int cpu, struct task_struct *p);
+
+static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
+{
+	return capacity_orig_of(cpu) - cpu_util_wake(cpu, p);
+}
+
 /*
  * find_idlest_group finds and returns the least busy CPU group within the
  * domain.
@@ -5211,7 +5219,9 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 		  int this_cpu, int sd_flag)
 {
 	struct sched_group *idlest = NULL, *group = sd->groups;
+	struct sched_group *most_spare_sg = NULL;
 	unsigned long min_load = ULONG_MAX, this_load = 0;
+	unsigned long most_spare = 0, this_spare = 0;
 	int load_idx = sd->forkexec_idx;
 	int imbalance = 100 + (sd->imbalance_pct-100)/2;
 
@@ -5219,7 +5229,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 		load_idx = sd->wake_idx;
 
 	do {
-		unsigned long load, avg_load;
+		unsigned long load, avg_load, spare_cap, max_spare_cap;
 		int local_group;
 		int i;
 
@@ -5231,8 +5241,12 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 		local_group = cpumask_test_cpu(this_cpu,
 					       sched_group_cpus(group));
 
-		/* Tally up the load of all CPUs in the group */
+		/*
+		 * Tally up the load of all CPUs in the group and find
+		 * the group containing the CPU with most spare capacity.
+		 */
 		avg_load = 0;
+		max_spare_cap = 0;
 
 		for_each_cpu(i, sched_group_cpus(group)) {
 			/* Bias balancing toward cpus of our domain */
@@ -5242,6 +5256,11 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 				load = target_load(i, load_idx);
 
 			avg_load += load;
+
+			spare_cap = capacity_spare_wake(i, p);
+
+			if (spare_cap > max_spare_cap)
+				max_spare_cap = spare_cap;
 		}
 
 		/* Adjust by relative CPU capacity of the group */
@@ -5249,12 +5268,33 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 
 		if (local_group) {
 			this_load = avg_load;
-		} else if (avg_load < min_load) {
-			min_load = avg_load;
-			idlest = group;
+			this_spare = max_spare_cap;
+		} else {
+			if (avg_load < min_load) {
+				min_load = avg_load;
+				idlest = group;
+			}
+
+			if (most_spare < max_spare_cap) {
+				most_spare = max_spare_cap;
+				most_spare_sg = group;
+			}
 		}
 	} while (group = group->next, group != sd->groups);
 
+	/*
+	 * The cross-over point between using spare capacity or least load
+	 * is too conservative for high utilization tasks on partially
+	 * utilized systems if we require spare_capacity > task_util(p),
+	 * so we allow for some task stuffing by using
+	 * spare_capacity > task_util(p)/2.
+	 */
+	if (this_spare > task_util(p) / 2 &&
+	    imbalance*this_spare > 100*most_spare)
+		return NULL;
+	else if (most_spare > task_util(p) / 2)
+		return most_spare_sg;
+
 	if (!idlest || 100*this_load < imbalance*min_load)
 		return NULL;
 	return idlest;


* [tip:sched/core] sched/fair: Add per-CPU min capacity to sched_group_capacity
From: tip-bot for Morten Rasmussen @ 2016-11-16 12:13 UTC
  To: linux-tip-commits
  Cc: tglx, mingo, linux-kernel, morten.rasmussen, torvalds, peterz, hpa

Commit-ID:  bf475ce0a3dd75b5d1df6c6c14ae25168caa15ac
Gitweb:     http://git.kernel.org/tip/bf475ce0a3dd75b5d1df6c6c14ae25168caa15ac
Author:     Morten Rasmussen <morten.rasmussen@arm.com>
AuthorDate: Fri, 14 Oct 2016 14:41:09 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 16 Nov 2016 10:29:06 +0100

sched/fair: Add per-CPU min capacity to sched_group_capacity

struct sched_group_capacity currently represents the compute capacity
sum of all CPUs in the sched_group.

Unless it is divided by the group_weight to get the average capacity
per CPU, it hides differences in CPU capacity for mixed capacity systems
(e.g. high RT/IRQ utilization or ARM big.LITTLE).

But even the average may not be sufficient if the group covers CPUs of
different capacities.

Instead, by extending struct sched_group_capacity to indicate the min
per-CPU capacity in the group, a suitable group for a given task
utilization can more easily be found, such that CPUs with reduced
capacity can be avoided for tasks with high utilization (not implemented
by this patch).

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-4-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c  |  3 ++-
 kernel/sched/fair.c  | 17 ++++++++++++-----
 kernel/sched/sched.h |  3 ++-
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f3cfa0d..6bf1fd3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5708,7 +5708,7 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 		printk(KERN_CONT " %*pbl",
 		       cpumask_pr_args(sched_group_cpus(group)));
 		if (group->sgc->capacity != SCHED_CAPACITY_SCALE) {
-			printk(KERN_CONT " (cpu_capacity = %d)",
+			printk(KERN_CONT " (cpu_capacity = %lu)",
 				group->sgc->capacity);
 		}
 
@@ -6185,6 +6185,7 @@ build_overlap_sched_groups(struct sched_domain *sd, int cpu)
 		 * die on a /0 trap.
 		 */
 		sg->sgc->capacity = SCHED_CAPACITY_SCALE * cpumask_weight(sg_span);
+		sg->sgc->min_capacity = SCHED_CAPACITY_SCALE;
 
 		/*
 		 * Make sure the first group of this domain contains the
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1ad3706..faf8f18 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6909,13 +6909,14 @@ static void update_cpu_capacity(struct sched_domain *sd, int cpu)
 
 	cpu_rq(cpu)->cpu_capacity = capacity;
 	sdg->sgc->capacity = capacity;
+	sdg->sgc->min_capacity = capacity;
 }
 
 void update_group_capacity(struct sched_domain *sd, int cpu)
 {
 	struct sched_domain *child = sd->child;
 	struct sched_group *group, *sdg = sd->groups;
-	unsigned long capacity;
+	unsigned long capacity, min_capacity;
 	unsigned long interval;
 
 	interval = msecs_to_jiffies(sd->balance_interval);
@@ -6928,6 +6929,7 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
 	}
 
 	capacity = 0;
+	min_capacity = ULONG_MAX;
 
 	if (child->flags & SD_OVERLAP) {
 		/*
@@ -6952,11 +6954,12 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
 			 */
 			if (unlikely(!rq->sd)) {
 				capacity += capacity_of(cpu);
-				continue;
+			} else {
+				sgc = rq->sd->groups->sgc;
+				capacity += sgc->capacity;
 			}
 
-			sgc = rq->sd->groups->sgc;
-			capacity += sgc->capacity;
+			min_capacity = min(capacity, min_capacity);
 		}
 	} else  {
 		/*
@@ -6966,12 +6969,16 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
 
 		group = child->groups;
 		do {
-			capacity += group->sgc->capacity;
+			struct sched_group_capacity *sgc = group->sgc;
+
+			capacity += sgc->capacity;
+			min_capacity = min(sgc->min_capacity, min_capacity);
 			group = group->next;
 		} while (group != child->groups);
 	}
 
 	sdg->sgc->capacity = capacity;
+	sdg->sgc->min_capacity = min_capacity;
 }
 
 /*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 055f935..345c1cc 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -892,7 +892,8 @@ struct sched_group_capacity {
 	 * CPU capacity of this group, SCHED_CAPACITY_SCALE being max capacity
 	 * for a single CPU.
 	 */
-	unsigned int capacity;
+	unsigned long capacity;
+	unsigned long min_capacity; /* Min per-CPU capacity in group */
 	unsigned long next_update;
 	int imbalance; /* XXX unrelated to capacity but shared group state */
 


* [tip:sched/core] sched/fair: Avoid pulling tasks from non-overloaded higher capacity groups
From: tip-bot for Morten Rasmussen @ 2016-11-16 12:14 UTC
  To: linux-tip-commits
  Cc: tglx, linux-kernel, torvalds, hpa, morten.rasmussen, peterz, mingo

Commit-ID:  9e0994c0a1c1f82c705f1f66388e1bcffcee8bb9
Gitweb:     http://git.kernel.org/tip/9e0994c0a1c1f82c705f1f66388e1bcffcee8bb9
Author:     Morten Rasmussen <morten.rasmussen@arm.com>
AuthorDate: Fri, 14 Oct 2016 14:41:10 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 16 Nov 2016 10:29:06 +0100

sched/fair: Avoid pulling tasks from non-overloaded higher capacity groups

For asymmetric CPU capacity systems it is counter-productive for
throughput if low capacity CPUs are pulling tasks from non-overloaded
CPUs with higher capacity. The assumption is that higher CPU capacity is
preferred over running alone in a group with lower CPU capacity.

This patch rejects higher CPU capacity groups with one or fewer tasks per
CPU as the potential busiest group; pulling from them could otherwise lead
to a series of failing load-balancing attempts ending in a force-migration.

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-5-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index faf8f18..ee39bfd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7073,6 +7073,17 @@ group_is_overloaded(struct lb_env *env, struct sg_lb_stats *sgs)
 	return false;
 }
 
+/*
+ * group_smaller_cpu_capacity: Returns true if sched_group sg has smaller
+ * per-CPU capacity than sched_group ref.
+ */
+static inline bool
+group_smaller_cpu_capacity(struct sched_group *sg, struct sched_group *ref)
+{
+	return sg->sgc->min_capacity * capacity_margin <
+						ref->sgc->min_capacity * 1024;
+}
+
 static inline enum
 group_type group_classify(struct sched_group *group,
 			  struct sg_lb_stats *sgs)
@@ -7176,6 +7187,20 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 	if (sgs->avg_load <= busiest->avg_load)
 		return false;
 
+	if (!(env->sd->flags & SD_ASYM_CPUCAPACITY))
+		goto asym_packing;
+
+	/*
+	 * Candidate sg has no more than one task per CPU and
+	 * has higher per-CPU capacity. Migrating tasks to less
+	 * capable CPUs may harm throughput. Maximize throughput,
+	 * power/energy consequences are not considered.
+	 */
+	if (sgs->sum_nr_running <= sgs->group_weight &&
+	    group_smaller_cpu_capacity(sds->local, sg))
+		return false;
+
+asym_packing:
 	/* This is the busiest node in its class. */
 	if (!(env->sd->flags & SD_ASYM_PACKING))
 		return true;


* [tip:sched/core] sched/fair: Fix incorrect comment for capacity_margin
From: tip-bot for Morten Rasmussen @ 2016-11-16 12:14 UTC
  To: linux-tip-commits
  Cc: torvalds, hpa, morten.rasmussen, peterz, tglx, mingo, linux-kernel

Commit-ID:  893c5d2279041afeb593f1fa8edd9d02edf5b7cb
Gitweb:     http://git.kernel.org/tip/893c5d2279041afeb593f1fa8edd9d02edf5b7cb
Author:     Morten Rasmussen <morten.rasmussen@arm.com>
AuthorDate: Fri, 14 Oct 2016 14:41:12 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 16 Nov 2016 10:29:07 +0100

sched/fair: Fix incorrect comment for capacity_margin

The comment for capacity_margin introduced in:

  3273163c6775 ("sched/fair: Let asymmetric CPU configurations balance at wake-up")

... got its usage the wrong way round - fix it.

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-7-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ee39bfd..5e6c00a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -109,7 +109,7 @@ unsigned int sysctl_sched_cfs_bandwidth_slice = 5000UL;
 
 /*
  * The margin used when comparing utilization with CPU capacity:
- * util * 1024 < capacity * margin
+ * util * margin < capacity * 1024
  */
 unsigned int capacity_margin = 1280; /* ~20% */
 
