* [PATCH v3 00/13] sched: Clean-ups and asymmetric cpu capacity support
@ 2016-07-25 13:34 Morten Rasmussen
  2016-07-25 13:34 ` [PATCH v3 01/13] sched: Fix power to capacity renaming in comment Morten Rasmussen
                   ` (12 more replies)
  0 siblings, 13 replies; 45+ messages in thread
From: Morten Rasmussen @ 2016-07-25 13:34 UTC (permalink / raw)
  To: peterz, mingo
  Cc: dietmar.eggemann, yuyang.du, vincent.guittot, mgalbraith,
	sgurrappadi, freedom.tan, keita.kobayashi.ym, linux-kernel,
	Morten Rasmussen

Hi,

The scheduler currently does little to help performance on systems with
asymmetric compute capacities (read ARM big.LITTLE). This series improves the
situation with a few tweaks, mainly to the task wake-up path, so that on these
systems it considers compute capacity at wake-up and not just whether a cpu is
idle. This gives us consistent, and potentially higher, throughput in
partially utilized scenarios. SMP behaviour and performance should be
unaffected.

Test 0:
	for i in `seq 1 10`; \
	       do sysbench --test=cpu --max-time=3 --num-threads=1 run; \
	       done \
	| awk '{if ($4=="events:") {print $5; sum +=$5; runs +=1}} \
	       END {print "Average events: " sum/runs}'

Target: ARM TC2 (2xA15+3xA7)

	(Higher is better)
tip:	Average events: 151.8 
patch:	Average events: 217.9

Target: ARM Juno (2xA57+4xA53)

	(Higher is better)
tip:	Average events: 1737.7
patch:	Average events: 1952.5

Test 1:
	perf stat --null --repeat 10 -- \
	perf bench sched messaging -g 50 -l 5000

Target: Intel IVB-EP (2*10*2)

tip:    4.632455615 seconds time elapsed ( +-  0.84% )
patch:  4.532470694 seconds time elapsed ( +-  1.28% )

Target: ARM TC2 A7-only (3xA7) (-l 1000)

tip:    61.554834175 seconds time elapsed ( +-  0.11% )
patch:  62.633350367 seconds time elapsed ( +-  0.12% )
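
The relative change between tip and patch follows directly from the
figures quoted above. A small sketch, assuming a POSIX shell with awk;
rel_change is a hypothetical helper name and not part of this series:

```shell
#!/bin/sh
# rel_change TIP PATCH: print the change of PATCH relative to TIP in percent.
# Hypothetical helper; the inputs below are the figures quoted in this mail.
rel_change() {
	awk -v tip="$1" -v patch="$2" \
	    'BEGIN { printf "%+.1f%%\n", (patch - tip) / tip * 100 }'
}

rel_change 151.8 217.9               # Test 0, TC2 (events, higher is better)
rel_change 1737.7 1952.5             # Test 0, Juno (events, higher is better)
rel_change 4.632455615 4.532470694   # Test 1, IVB-EP (seconds, lower is better)
rel_change 61.554834175 62.633350367 # Test 1, TC2 A7-only (seconds, lower is better)
```

This prints +43.5%, +12.4%, -2.2% and +1.8% respectively. The Test 1
figures should be read against the +- variation that perf stat reports
for each run.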

Notes:

Active migration of tasks away from small capacity cpus isn't addressed
in this set although it is necessary for consistent throughput in other
scenarios on asymmetric cpu capacity systems.

The infrastructure to enable capacity awareness for arm64 and arm is not
provided here but will be based on Juri's DT bindings patch set [1]. A
combined preview branch is available [2]. Test results above are based on
[2].

[1] https://lkml.org/lkml/2016/7/19/419
[2] git://linux-arm.org/linux-power.git capacity_awareness_v3_arm64_v1
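
To try the preview branch [2], something along these lines should work.
This is a sketch only: fetch_preview is a hypothetical helper name, and
it assumes the server still serves that branch over git://.

```shell
#!/bin/sh
# fetch_preview DIR: clone only the preview branch [2] into DIR.
# Hypothetical helper; repository URL and branch name are taken from [2].
fetch_preview() {
	git clone --single-branch \
	    --branch capacity_awareness_v3_arm64_v1 \
	    git://linux-arm.org/linux-power.git "$1"
}

# Usage (needs network access):
#   fetch_preview linux-power
```

Using --single-branch avoids fetching the other branches in the
repository.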

Patch   1-4: Generic fixes and clean-ups.
Patch  5-13: Improve capacity awareness.

Tested-by: Koan-Sin Tan <freedom.tan@mediatek.com>
Tested-by: Keita Kobayashi <keita.kobayashi.ym@renesas.com>

v3:

- Changed SD_ASYM_CPUCAPACITY sched_domain flag semantics as suggested
  by PeterZ.

- Dropped arm specific patches for setting cpu capacity as these are
  superseded by Juri's patches [1].

- Changed capacity-aware pulling during load-balance to use sched_group
  min capacity instead of max as suggested by Sai.

v2: https://lkml.org/lkml/2016/6/22/614

- Dropped patch ignoring wakee_flips for pid=0 for now as we can not
  distinguish cpu time processing irqs from idle time.

- Dropped disabling WAKE_AFFINE as suggested by Vincent to allow more
  scenarios to use fast-path (select_idle_sibling()). Asymmetric wake
  conditions adjusted accordingly.

- Changed use of new SD_ASYM_CPUCAPACITY slightly. Now enables
  SD_BALANCE_WAKE.

- Minor clean-ups and rebased to more recent tip/sched/core.

v1: https://lkml.org/lkml/2014/5/23/621

Dietmar Eggemann (1):
  sched: Store maximum per-cpu capacity in root domain

Morten Rasmussen (12):
  sched: Fix power to capacity renaming in comment
  sched/fair: Consistent use of prev_cpu in wakeup path
  sched/fair: Optimize find_idlest_cpu() when there is no choice
  sched/core: Remove unnecessary null-pointer check
  sched: Introduce SD_ASYM_CPUCAPACITY sched_domain topology flag
  sched/core: Pass child domain into sd_init
  sched: Enable SD_BALANCE_WAKE for asymmetric capacity systems
  sched/fair: Let asymmetric cpu configurations balance at wake-up
  sched/fair: Compute task/cpu utilization at wake-up more correctly
  sched/fair: Consider spare capacity in find_idlest_group()
  sched: Add per-cpu min capacity to sched_group_capacity
  sched/fair: Avoid pulling tasks from non-overloaded higher capacity
    groups

 include/linux/sched.h |   3 +-
 kernel/sched/core.c   |  37 +++++++--
 kernel/sched/fair.c   | 213 ++++++++++++++++++++++++++++++++++++++++++++------
 kernel/sched/sched.h  |   5 +-
 4 files changed, 227 insertions(+), 31 deletions(-)

-- 
1.9.1


end of thread, other threads:[~2016-08-22 11:29 UTC | newest]

Thread overview: 45+ messages
2016-07-25 13:34 [PATCH v3 00/13] sched: Clean-ups and asymmetric cpu capacity support Morten Rasmussen
2016-07-25 13:34 ` [PATCH v3 01/13] sched: Fix power to capacity renaming in comment Morten Rasmussen
2016-07-25 13:34 ` [PATCH v3 02/13] sched/fair: Consistent use of prev_cpu in wakeup path Morten Rasmussen
2016-07-25 13:34 ` [PATCH v3 03/13] sched/fair: Optimize find_idlest_cpu() when there is no choice Morten Rasmussen
2016-07-25 13:34 ` [PATCH v3 04/13] sched/core: Remove unnecessary null-pointer check Morten Rasmussen
2016-08-18 10:56   ` [tip:sched/core] sched/core: Remove unnecessary NULL-pointer check tip-bot for Morten Rasmussen
2016-07-25 13:34 ` [PATCH v3 05/13] sched: Introduce SD_ASYM_CPUCAPACITY sched_domain topology flag Morten Rasmussen
2016-08-15 10:54   ` Peter Zijlstra
2016-08-15 11:43     ` Morten Rasmussen
2016-08-18 10:56     ` [tip:sched/core] sched/core: Clarify SD_flags comment tip-bot for Peter Zijlstra
2016-08-17  8:42   ` [PATCH v3 05/13] sched: Introduce SD_ASYM_CPUCAPACITY sched_domain topology flag Wanpeng Li
2016-08-17  9:23     ` Morten Rasmussen
2016-08-17  9:26       ` Wanpeng Li
2016-08-18 10:56   ` [tip:sched/core] sched/core: " tip-bot for Morten Rasmussen
2016-07-25 13:34 ` [PATCH v3 06/13] sched/core: Pass child domain into sd_init Morten Rasmussen
2016-08-18 10:57   ` [tip:sched/core] sched/core: Pass child domain into sd_init() tip-bot for Morten Rasmussen
2016-07-25 13:34 ` [PATCH v3 07/13] sched: Enable SD_BALANCE_WAKE for asymmetric capacity systems Morten Rasmussen
2016-08-18 10:57   ` [tip:sched/core] sched/core: " tip-bot for Morten Rasmussen
2016-07-25 13:34 ` [PATCH v3 08/13] sched: Store maximum per-cpu capacity in root domain Morten Rasmussen
2016-08-01 18:53   ` Dietmar Eggemann
2016-08-16 12:24     ` Vincent Guittot
2016-08-18 10:58     ` [tip:sched/core] sched/core: Store maximum per-CPU " tip-bot for Dietmar Eggemann
2016-07-25 13:34 ` [PATCH v3 09/13] sched/fair: Let asymmetric cpu configurations balance at wake-up Morten Rasmussen
2016-08-15 13:39   ` Peter Zijlstra
2016-08-15 15:01     ` Morten Rasmussen
2016-08-15 15:10       ` Peter Zijlstra
2016-08-15 15:30         ` Morten Rasmussen
2016-08-18 10:58   ` [tip:sched/core] sched/fair: Let asymmetric CPU " tip-bot for Morten Rasmussen
2016-07-25 13:34 ` [PATCH v3 10/13] sched/fair: Compute task/cpu utilization at wake-up more correctly Morten Rasmussen
2016-08-15 14:23   ` Peter Zijlstra
2016-08-15 15:42     ` Morten Rasmussen
2016-08-18  8:40       ` Morten Rasmussen
2016-08-18 10:24         ` Morten Rasmussen
2016-08-18 11:46           ` Wanpeng Li
2016-08-18 13:45             ` Morten Rasmussen
2016-08-19  1:43               ` Wanpeng Li
2016-08-19 14:03                 ` Morten Rasmussen
2016-08-22  1:48                   ` Wanpeng Li
2016-08-22 11:29                     ` Morten Rasmussen
2016-07-25 13:34 ` [PATCH v3 11/13] sched/fair: Consider spare capacity in find_idlest_group() Morten Rasmussen
2016-08-16 13:57   ` Vincent Guittot
2016-08-18 11:16     ` Morten Rasmussen
2016-08-18 12:28       ` Peter Zijlstra
2016-07-25 13:34 ` [PATCH v3 12/13] sched: Add per-cpu min capacity to sched_group_capacity Morten Rasmussen
2016-07-25 13:34 ` [PATCH v3 13/13] sched/fair: Avoid pulling tasks from non-overloaded higher capacity groups Morten Rasmussen
