From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752013AbdKHJuM (ORCPT ); Wed, 8 Nov 2017 04:50:12 -0500 Received: from terminus.zytor.com ([65.50.211.136]:53477 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751886AbdKHJuI (ORCPT ); Wed, 8 Nov 2017 04:50:08 -0500 Date: Wed, 8 Nov 2017 01:46:28 -0800 From: tip-bot for Patrick Bellasi Message-ID: Cc: juri.lelli@redhat.com, linux-kernel@vger.kernel.org, hpa@zytor.com, vincent.guittot@linaro.org, mingo@kernel.org, torvalds@linux-foundation.org, brendan.jackman@arm.com, patrick.bellasi@arm.com, chris.redpath@arm.com, dietmar.eggemann@arm.com, morten.rasmussen@arm.com, tglx@linutronix.de, peterz@infradead.org Reply-To: hpa@zytor.com, vincent.guittot@linaro.org, mingo@kernel.org, juri.lelli@redhat.com, linux-kernel@vger.kernel.org, tglx@linutronix.de, morten.rasmussen@arm.com, dietmar.eggemann@arm.com, peterz@infradead.org, brendan.jackman@arm.com, torvalds@linux-foundation.org, chris.redpath@arm.com, patrick.bellasi@arm.com In-Reply-To: <20171107160658.4839-1-patrick.bellasi@arm.com> References: <20171107160658.4839-1-patrick.bellasi@arm.com> To: linux-tip-commits@vger.kernel.org Subject: [tip:sched/core] sched/core: Optimize sched_feat() for !SCHED_DEBUG builds Git-Commit-ID: 692ee9a79c14c9f707eeb03754a26b9427c0e005 X-Mailer: tip-git-log-daemon Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Commit-ID: 692ee9a79c14c9f707eeb03754a26b9427c0e005 Gitweb: https://git.kernel.org/tip/692ee9a79c14c9f707eeb03754a26b9427c0e005 Author: Patrick Bellasi AuthorDate: Tue, 7 Nov 2017 16:06:58 +0000 Committer: Ingo Molnar CommitDate: Wed, 8 Nov 2017 10:17:29 +0100 sched/core: Optimize sched_feat() for !SCHED_DEBUG builds When the kernel is compiled with !SCHED_DEBUG support, we expect that all SCHED_FEAT are turned into compile time constants being propagated to support compiler optimizations. Specifically, we expect that code blocks like this: if (sched_feat(FEATURE_NAME) [&& ]) { /* FEATURE CODE */ } are turned into dead-code in case FEATURE_NAME defaults to FALSE, and thus being removed by the compiler from the finale image. For this mechanism to properly work it's required for the compiler to have full access, from each translation unit, to whatever is the value defined by the sched_feat macro. This macro is defined as: #define sched_feat(x) (sysctl_sched_features & (1UL << __SCHED_FEAT_##x)) and thus, the compiler can optimize that code only if the value of sysctl_sched_features is visible within each translation unit. Since: 029632fbb sched: Make separate sched*.c translation units the scheduler code has been split into separate translation units however the definition of sysctl_sched_features is part of kernel/sched/core.c while, for all the other scheduler modules, it is visible only via kernel/sched/sched.h as an: extern const_debug unsigned int sysctl_sched_features Unfortunately, an extern reference does not allow the compiler to apply constants propagation. Thus, on !SCHED_DEBUG kernel we still end up with code to load a memory reference and (eventually) doing an unconditional jump of a chunk of code. This mechanism is unavoidable when sched_features can be turned on and off at run-time. However, this is not the case for "production" kernels compiled with !SCHED_DEBUG. In this case, sysctl_sched_features is just a constant value which cannot be changed at run-time and thus memory loads and jumps can be avoided altogether. This patch fixes the case of !SCHED_DEBUG kernel by declaring a local version of the sysctl_sched_features constant for each translation unit. This will ultimately allow the compiler to perform constants propagation and dead-code pruning. Tests have been done, with !SCHED_DEBUG on a v4.14-rc8 with and without the patch, by running 30 iterations of: perf bench sched messaging --pipe --thread --group 4 --loop 50000 on a 40 cores Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz using the powersave governor to rule out variations due to frequency scaling. Statistics on the reported completion time: count mean std min 99% max v4.14-rc8 30.0 15.7831 0.176032 15.442 16.01226 16.014 v4.14-rc8+patch 30.0 15.5033 0.189681 15.232 15.93938 15.962 show a 1.8% speedup on average completion time and 0.5% speedup in the 99 percentile. Signed-off-by: Patrick Bellasi Signed-off-by: Chris Redpath Reviewed-by: Dietmar Eggemenn Reviewed-by: Brendan Jackman Acked-by: Peter Zijlstra Cc: Juri Lelli Cc: Linus Torvalds Cc: Morten Rasmussen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Vincent Guittot Link: http://lkml.kernel.org/r/20171107160658.4839-1-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar --- kernel/sched/core.c | 9 ++++++--- kernel/sched/sched.h | 25 ++++++++++++++++++++++--- 2 files changed, 28 insertions(+), 6 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 1a55c84..a39a081 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -43,18 +43,21 @@ DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues); +#if defined(CONFIG_SCHED_DEBUG) && defined(HAVE_JUMP_LABEL) /* * Debugging: various feature bits + * + * If SCHED_DEBUG is disabled, each compilation unit has its own copy of + * sysctl_sched_features, defined in sched.h, to allow constants propagation + * at compile time and compiler optimization based on features default. */ - #define SCHED_FEAT(name, enabled) \ (1UL << __SCHED_FEAT_##name) * enabled | - const_debug unsigned int sysctl_sched_features = #include "features.h" 0; - #undef SCHED_FEAT +#endif /* * Number of tasks to iterate in a single balance run. diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 58787e3..e84fa6f 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -1233,8 +1233,6 @@ static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu) # define const_debug const #endif -extern const_debug unsigned int sysctl_sched_features; - #define SCHED_FEAT(name, enabled) \ __SCHED_FEAT_##name , @@ -1246,6 +1244,13 @@ enum { #undef SCHED_FEAT #if defined(CONFIG_SCHED_DEBUG) && defined(HAVE_JUMP_LABEL) + +/* + * To support run-time toggling of sched features, all the translation units + * (but core.c) reference the sysctl_sched_features defined in core.c. + */ +extern const_debug unsigned int sysctl_sched_features; + #define SCHED_FEAT(name, enabled) \ static __always_inline bool static_branch_##name(struct static_key *key) \ { \ @@ -1253,13 +1258,27 @@ static __always_inline bool static_branch_##name(struct static_key *key) \ } #include "features.h" - #undef SCHED_FEAT extern struct static_key sched_feat_keys[__SCHED_FEAT_NR]; #define sched_feat(x) (static_branch_##x(&sched_feat_keys[__SCHED_FEAT_##x])) + #else /* !(SCHED_DEBUG && HAVE_JUMP_LABEL) */ + +/* + * Each translation unit has its own copy of sysctl_sched_features to allow + * constants propagation at compile time and compiler optimization based on + * features default. + */ +#define SCHED_FEAT(name, enabled) \ + (1UL << __SCHED_FEAT_##name) * enabled | +static const_debug unsigned int sysctl_sched_features = +#include "features.h" + 0; +#undef SCHED_FEAT + #define sched_feat(x) (sysctl_sched_features & (1UL << __SCHED_FEAT_##x)) + #endif /* SCHED_DEBUG && HAVE_JUMP_LABEL */ extern struct static_key_false sched_numa_balancing;