From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1759860AbcHaKwI (ORCPT ); Wed, 31 Aug 2016 06:52:08 -0400
Received: from foss.arm.com ([217.140.101.70]:57560 "EHLO foss.arm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1758116AbcHaKwG (ORCPT ); Wed, 31 Aug 2016 06:52:06 -0400
From: Morten Rasmussen
To: peterz@infradead.org, mingo@redhat.com
Cc: dietmar.eggemann@arm.com, yuyang.du@intel.com,
	vincent.guittot@linaro.org, mgalbraith@suse.de,
	sgurrappadi@nvidia.com, freedom.tan@mediatek.com,
	keita.kobayashi.ym@renesas.com, linux-kernel@vger.kernel.org,
	Morten Rasmussen
Subject: [PATCH v4 0/5] sched: Clean-ups and asymmetric cpu capacity support
Date: Wed, 31 Aug 2016 11:52:14 +0100
Message-Id: <1472640739-8778-1-git-send-email-morten.rasmussen@arm.com>
X-Mailer: git-send-email 1.9.1
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

The scheduler is currently not doing much to help performance on systems
with asymmetric compute capacities (read ARM big.LITTLE). This series
improves the situation with a few tweaks, mainly to the task wake-up path,
so that on these systems it considers compute capacity at wake-up and not
just whether a cpu is idle. This gives us consistent, and potentially
higher, throughput in partially utilized scenarios. SMP behaviour and
performance should be unaffected.
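To make the wake-up idea concrete, here is a minimal stand-alone sketch in
plain C. It is not the code from this series: struct cpu_info,
task_fits_cpu(), pick_wake_cpu() and the roughly 20% headroom margin are all
hypothetical names and values chosen for illustration. It only shows the
difference between "any idle cpu" and "an idle cpu the task actually fits
on", falling back to the biggest idle cpu when nothing fits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu snapshot; not a kernel data structure. */
struct cpu_info {
	unsigned long capacity_orig;	/* max compute capacity, 1024 = biggest cpu */
	bool idle;
};

/*
 * Assumed fitness rule: the task fits if its utilization leaves about 20%
 * headroom on the cpu. The 1280/1024 margin is illustrative only.
 */
static bool task_fits_cpu(unsigned long task_util, unsigned long capacity_orig)
{
	assert(capacity_orig > 0);
	return task_util * 1280 < capacity_orig * 1024;
}

/*
 * Pick a cpu for a waking task. Prefer the first idle cpu the task fits
 * on; if it fits on none, fall back to the idle cpu with the highest
 * capacity. Returns -1 when no cpu is idle.
 */
static int pick_wake_cpu(const struct cpu_info *cpus, size_t nr_cpus,
			 unsigned long task_util)
{
	int first_fit = -1, biggest_idle = -1;
	size_t i;

	for (i = 0; i < nr_cpus; i++) {
		if (!cpus[i].idle)
			continue;

		/* Remember the highest-capacity idle cpu as a fallback. */
		if (biggest_idle < 0 ||
		    cpus[i].capacity_orig > cpus[biggest_idle].capacity_orig)
			biggest_idle = (int)i;

		/* An idle cpu alone is not enough: the task must fit too. */
		if (first_fit < 0 &&
		    task_fits_cpu(task_util, cpus[i].capacity_orig))
			first_fit = (int)i;
	}

	return first_fit >= 0 ? first_fit : biggest_idle;
}
```

With two idle little cpus (capacity 430, an illustrative value) and one idle
big cpu (capacity 1024), a task with utilization 100 stays on the first
little cpu, while a task with utilization 600 is placed on the big cpu even
though the little cpus are idle.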

Test 0:
	for i in `seq 1 10`; \
	       do sysbench --test=cpu --max-time=3 --num-threads=1 run; \
	       done \
	| awk '{if ($4=="events:") {print $5; sum +=$5; runs +=1}} \
	       END {print "Average events: " sum/runs}'

Target: ARM TC2 (2xA15+3xA7)

	(Higher is better)
tip:	Average events: 126.9
patch:	Average events: 217.9

Target: ARM Juno (2xA57+4xA53)

	(Higher is better)
tip:	Average events: 2082.6
patch:	Average events: 2687.5

Test 1:
	perf stat --null --repeat 10 -- \
	perf bench sched messaging -g 50 -l 5000

Target: Intel IVB-EP (2*10*2)

tip:    4.652802973 seconds time elapsed ( +-  0.99% )
patch:  4.643020680 seconds time elapsed ( +-  1.26% )

Target: ARM TC2 A7-only (3xA7) (-l 1000)

tip:    61.902516175 seconds time elapsed ( +-  0.22% )
patch:  63.178903751 seconds time elapsed ( +-  0.30% )

Target: ARM Juno A53-only (4xA53) (-l 1000)

tip:    37.919193364 seconds time elapsed ( +-  0.29% )
patch:  37.568717760 seconds time elapsed ( +-  0.13% )

Notes:

Active migration of tasks away from small capacity cpus isn't addressed
in this set although it is necessary for consistent throughput in other
scenarios on asymmetric cpu capacity systems.

The infrastructure to enable capacity awareness for arm64 and arm is not
provided here but will be based on Juri's DT bindings patch set [1]. A
combined preview branch is available [2]. Test results above are based
on [2].

[1] https://lkml.org/lkml/2016/7/19/419
[2] git://linux-arm.org/linux-power.git capacity_awareness_v4_arm64_v1

Patch   1: Fix task utilization for wake-up decisions.
Patch 2-5: Improve capacity awareness.

Tested-by: Koan-Sin Tan
Tested-by: Keita Kobayashi

v4:

- Removed patches already in tip/sched/core.

- Fixed wrong use of capacity_of() instead of capacity_orig_of() as
  reported by Wanpeng Li.

- Re-implemented fix for task wake-up utilization. Instead of estimating
  the utilization it is now computed and updated correctly.

- Introduced peak utilization tracking to compensate for decay in wake-up
  placement decisions.

- Removed pointless spare capacity selection criteria in
  find_idlest_group() as pointed out by Vincent and added a comment
  describing when we use spare capacity instead of least load.

v3: https://lkml.org/lkml/2016/7/25/245

- Changed SD_ASYM_CPUCAPACITY sched_domain flag semantics as suggested by
  PeterZ.

- Dropped arm specific patches for setting cpu capacity as these are
  superseded by Juri's patches [2].

- Changed capacity-aware pulling during load-balance to use sched_group
  min capacity instead of max as suggested by Sai.

v2: https://lkml.org/lkml/2016/6/22/614

- Dropped patch ignoring wakee_flips for pid=0 for now as we can not
  distinguish cpu time processing irqs from idle time.

- Dropped disabling WAKE_AFFINE as suggested by Vincent to allow more
  scenarios to use fast-path (select_idle_sibling()). Asymmetric wake
  conditions adjusted accordingly.

- Changed use of new SD_ASYM_CPUCAPACITY slightly. Now enables
  SD_BALANCE_WAKE.

- Minor clean-ups and rebased to more recent tip/sched/core.

v1: https://lkml.org/lkml/2014/5/23/621

Morten Rasmussen (5):
  sched/fair: Compute task/cpu utilization at wake-up correctly
  sched/fair: Consider spare capacity in find_idlest_group()
  sched: Add per-cpu min capacity to sched_group_capacity
  sched/fair: Avoid pulling tasks from non-overloaded higher capacity
    groups
  sched/fair: Track peak per-entity utilization

 include/linux/sched.h |   2 +-
 kernel/sched/core.c   |   3 +-
 kernel/sched/fair.c   | 142 ++++++++++++++++++++++++++++++++++++++++++++------
 kernel/sched/sched.h  |   3 +-
 4 files changed, 130 insertions(+), 20 deletions(-)

-- 
1.9.1