From: Vincent Guittot
Date: Wed, 13 Jul 2016 14:06:17 +0200
Subject: Re: [PATCH v2 00/13] sched: Clean-ups and asymmetric cpu capacity support
To: Morten Rasmussen
Cc: Peter Zijlstra, "mingo@redhat.com", Dietmar Eggemann, Yuyang Du,
 mgalbraith@suse.de, linux-kernel
In-Reply-To: <1466615004-3503-1-git-send-email-morten.rasmussen@arm.com>
References: <1466615004-3503-1-git-send-email-morten.rasmussen@arm.com>

Hi Morten,

On 22 June 2016 at 19:03, Morten Rasmussen wrote:
> Hi,
>
> The scheduler is currently not doing much to help performance on systems with
> asymmetric compute capacities (read ARM big.LITTLE). This series improves the
> situation with a few tweaks, mainly to the task wake-up path, so that compute
> capacity is considered at wake-up and not just whether a cpu is idle on these
> systems. This gives us consistent, and potentially higher, throughput in
> partially utilized scenarios. SMP behaviour and performance should be
> unaffected.
>
> Test 0:
>  for i in `seq 1 10`; \
>     do sysbench --test=cpu --max-time=3 --num-threads=1 run; \
>     done \
>  | awk '{if ($4=="events:") {print $5; sum +=$5; runs +=1}} \
>     END {print "Average events: " sum/runs}'
>
> Target: ARM TC2 (2xA15+3xA7)
>
> (Higher is better)
> tip:   Average events: 146.9
> patch: Average events: 217.9
>
> Test 1:
>  perf stat --null --repeat 10 -- \
>  perf bench sched messaging -g 50 -l 5000
>
> Target: Intel IVB-EP (2*10*2)
>
> tip:   4.861970420 seconds time elapsed ( +- 1.39% )
> patch: 4.886204224 seconds time elapsed ( +- 0.75% )
>
> Target: ARM TC2 A7-only (3xA7) (-l 1000)
>
> tip:   61.485682596 seconds time elapsed ( +- 0.07% )
> patch: 62.667950130 seconds time elapsed ( +- 0.36% )
>
> More analysis:
>
> Statistics from a mixed periodic task workload (rt-app) containing both
> big and little tasks, single run on ARM TC2:
>
> tu   = Task utilization big/little
> pcpu = Previous cpu big/little
> tcpu = This (waker) cpu big/little
> dl   = New cpu is little
> db   = New cpu is big
> sis  = New cpu chosen by select_idle_sibling()
> figc = New cpu chosen by find_idlest_*()
> ww   = wake_wide(task) count for figc wakeups
> bw   = sd_flag & SD_BALANCE_WAKE (non-fork/exec wake)
>        for figc wakeups
>
> case  tu pcpu tcpu     dl    db   sis  figc    ww    bw
> 1     l  l    l       122    68    28   162   161   161
> 2     l  l    b        11     4     0    15    15    15
> 3     l  b    l         0   252     8   244   244   244
> 4     l  b    b        36  1928   711  1253  1016  1016
> 5     b  l    l         5    19     0    24    22    24
> 6     b  l    b         5     1     0     6     0     6
> 7     b  b    l         0    31     0    31    31    31
> 8     b  b    b         1   194   109    86    59    59
> -------------------------------------------------------
>                       180  2497   856  1821

I'm not sure how to interpret all these statistics.

> Cases 1-4 and 8 are fine to be served by select_idle_sibling() as both
> this_cpu and prev_cpu are suitable cpus for the task. However, as the
> figc column reveals, those cases are often served by find_idlest_*()
> anyway due to wake_wide() sending the wakeup that way when
> SD_BALANCE_WAKE is set on the sched_domains.
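To check that I follow the wake-up routing you describe, here is a minimal
user-space sketch of the decision as I understand it. The helper names, the
idle-cpu short cut and the ~80% capacity-fit margin are my own assumptions
for illustration; they are not taken from the patches themselves:

/*
 * Toy model of the wake-up path selection discussed above. All names and
 * thresholds are illustrative assumptions, not the kernel implementation.
 */
#include <stdbool.h>
#include <stdio.h>

struct cpu { int id; unsigned long capacity; bool idle; };

/* Assumed ~80% margin: the task "fits" if util * 1280 < capacity * 1024. */
static bool fits_capacity(unsigned long task_util, unsigned long cap)
{
	return task_util * 1280 < cap * 1024;
}

static int select_cpu(unsigned long task_util, bool wide,
		      bool sd_balance_wake,
		      const struct cpu *prev, const struct cpu *waker)
{
	/* Affine wake only if the task fits both prev and waker cpu. */
	bool affine = !wide &&
		      fits_capacity(task_util, prev->capacity) &&
		      fits_capacity(task_util, waker->capacity);

	if (affine || !sd_balance_wake) {
		/* fast path: select_idle_sibling()-like, stay near prev/waker */
		return prev->idle ? prev->id : waker->id;
	}
	/* slow path: find_idlest_*()-like scan, not modelled here */
	return -1;
}

int main(void)
{
	struct cpu little = { .id = 2, .capacity = 430,  .idle = true };
	struct cpu big    = { .id = 0, .capacity = 1024, .idle = false };

	/* A big task whose prev cpu is little should take the slow path. */
	printf("big task:    %d\n", select_cpu(800, false, true, &little, &big));
	/* A small task can stay on the fast path and keep its prev cpu. */
	printf("little task: %d\n", select_cpu(100, false, true, &little, &big));
	return 0;
}

With numbers in that ballpark, a big task waking on a little prev_cpu falls
through to the slow path, which seems consistent with what cases 5 and 7
show in your table; please correct me if I have the routing wrong.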
>
> Pulling in the wakee_flip patch (dropped in v2) from v1 shifts a
> significant share of the wakeups from figc to sis:
>
> case  tu pcpu tcpu     dl    db   sis  figc    ww    bw
> 1     l  l    l       537     8   537     8     6     6
> 2     l  l    b        49    11    32    28    28    28
> 3     l  b    l         4   323   322     5     5     5
> 4     l  b    b         1  1910  1209   702   458   456
> 5     b  l    l         0     5     0     5     1     5
> 6     b  l    b         0     0     0     0     0     0
> 7     b  b    l         0    32     0    32     2    32
> 8     b  b    b         0   198   168    30    13    13
> -------------------------------------------------------
>                       591  2487  2268   810
>
> Notes:
>
> Active migration of tasks away from small capacity cpus isn't addressed
> in this set, although it is necessary for consistent throughput in other
> scenarios on asymmetric cpu capacity systems.
>
> The infrastructure to enable capacity awareness for arm64 is not provided here
> but will be based on Juri's DT bindings patch set [1]. A combined preview
> branch is available [2].
>
> [1] https://lkml.org/lkml/2016/6/15/291
> [2] git://linux-arm.org/linux-power.git capacity_awareness_v2_arm64_v1
>
> Patch 1-3:   Generic fixes and clean-ups.
> Patch 4-11:  Improve capacity awareness.
> Patch 12-13: Arch features for arm to enable asymmetric capacity support.
>
> v2:
>
> - Dropped the patch ignoring wakee_flips for pid=0 for now, as we cannot
>   distinguish cpu time spent processing irqs from idle time.
>
> - Dropped disabling WAKE_AFFINE as suggested by Vincent, to allow more
>   scenarios to use the fast path (select_idle_sibling()). The asymmetric
>   wake conditions have been adjusted accordingly.
>
> - Changed the use of the new SD_ASYM_CPUCAPACITY flag slightly. It now
>   enables SD_BALANCE_WAKE.
>
> - Minor clean-ups and rebased onto a more recent tip/sched/core.
>
> v1: https://lkml.org/lkml/2014/5/23/621
>
> Dietmar Eggemann (1):
>   sched: Store maximum per-cpu capacity in root domain
>
> Morten Rasmussen (12):
>   sched: Fix power to capacity renaming in comment
>   sched/fair: Consistent use of prev_cpu in wakeup path
>   sched/fair: Optimize find_idlest_cpu() when there is no choice
>   sched: Introduce SD_ASYM_CPUCAPACITY sched_domain topology flag
>   sched: Enable SD_BALANCE_WAKE for asymmetric capacity systems
>   sched/fair: Let asymmetric cpu configurations balance at wake-up
>   sched/fair: Compute task/cpu utilization at wake-up more correctly
>   sched/fair: Consider spare capacity in find_idlest_group()
>   sched: Add per-cpu max capacity to sched_group_capacity
>   sched/fair: Avoid pulling tasks from non-overloaded higher capacity
>     groups
>   arm: Set SD_ASYM_CPUCAPACITY for big.LITTLE platforms
>   arm: Update arch_scale_cpu_capacity() to reflect change to define
>
>  arch/arm/include/asm/topology.h |   5 +
>  arch/arm/kernel/topology.c      |  25 ++++-
>  include/linux/sched.h           |   3 +-
>  kernel/sched/core.c             |  21 +++-
>  kernel/sched/fair.c             | 212 +++++++++++++++++++++++++++++++++++-----
>  kernel/sched/sched.h            |   5 +-
>  6 files changed, 241 insertions(+), 30 deletions(-)
>
> --
> 1.9.1
>
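One more question on the find_idlest_group() spare-capacity change: below is
the behaviour I am assuming the patch aims for, again as a stand-alone toy
model with made-up cpu capacities and utilizations rather than the actual
kernel code:

/*
 * Toy model: pick a cpu by largest spare capacity instead of lowest load.
 * The capacities (1024 big, 430 little) and utilizations are made up for
 * a TC2-like setup and only serve to illustrate the idea.
 */
#include <stdio.h>

struct cpu_stat { int id; unsigned long capacity; unsigned long util; };

static long spare_capacity(const struct cpu_stat *c)
{
	return (long)c->capacity - (long)c->util;
}

static int most_spare_cpu(const struct cpu_stat *cpus, int nr)
{
	int best = -1;
	long best_spare = -1;

	for (int i = 0; i < nr; i++) {
		long spare = spare_capacity(&cpus[i]);
		if (spare > best_spare) {
			best_spare = spare;
			best = cpus[i].id;
		}
	}
	return best;
}

int main(void)
{
	/* Two big cpus (capacity 1024) and three little cpus (capacity 430). */
	struct cpu_stat cpus[] = {
		{ 0, 1024, 600 }, { 1, 1024, 100 },
		{ 2,  430,  50 }, { 3,  430, 400 }, { 4,  430, 430 },
	};

	/* Lowest absolute load would be cpu 2; cpu 1 has the most headroom. */
	printf("most spare capacity: cpu %d\n",
	       most_spare_cpu(cpus, (int)(sizeof(cpus) / sizeof(cpus[0]))));
	return 0;
}

If that reading is right, a big task ends up on cpu 1 even though cpu 2
carries the lowest load, which is the part I wanted to confirm before
looking at the individual patches.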