From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Metcalf
To: "Peter Zijlstra (Intel)", Frederic Weisbecker, "Paul E. McKenney", "Rafael J. Wysocki", Martin Schwidefsky, Ingo Molnar
CC: Chris Metcalf
Subject: [PATCH v5] nohz: set isolcpus when nohz_full is set
Date: Thu, 9 Apr 2015 12:59:39 -0400
Message-ID: <1428598779-24244-1-git-send-email-cmetcalf@ezchip.com>
X-Mailer: git-send-email 2.1.2
In-Reply-To: <20150409124524.GA17709@lerouge>
References: <20150409124524.GA17709@lerouge>
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

nohz_full is only useful with isolcpus also set, since otherwise the
scheduler has to run periodically to try to determine whether to steal
work from other cores.

Accordingly, when booting with nohz_full=xxx on the command line, we
should act as if isolcpus=xxx was also set, and set (or extend) the
isolcpus set to include the nohz_full cpus.

Signed-off-by: Chris Metcalf
---
Frederic wrote:
> cpu_isolated_map is allocated and filled early (__setup or sched_init())
> before tick_init() and tick_init() is before sched_init_smp() which first uses
> cpu_isolated_map(). So we can call some sched_isolated_map_add(struct cpumask *cpumask)
> from tick_nohz_init().

I'll re-send a v4 of the patch without your suggestion, just renaming
the methods to tick_nohz_full_cpumask_andnot() etc, since I still think
that that model is easier to understand - we tweak isolcpus in exactly
the spot where we first put it to use.  And, we do need those
tick_nohz_full_cpumask_xxx() accessors in other places anyway -- see
my earlier patch for the tilegx network driver to remove the nohz_full
cores from the set of cores that get interrupted by the driver, for
example.

That said, I'm not opposed to your idea, and we could certainly do it
that way if that's the consensus.  For reference, here's what it looks
like when fleshed out; I'm calling it v5 to be sort of clear about this,
but either v4 or v5 would be fine.
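
To make the ordering argument in the quote above concrete, here is a
rough userspace sketch (not kernel code). It assumes at most 64 cpus,
one bit per cpu, and every sketch_* name is made up for illustration.
It also assumes the scheduler derives its balancing set as "possible
cpus minus isolated cpus", read once during SMP init.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static uint64_t sketch_isolated_map;  /* stands in for cpu_isolated_map */
static uint64_t sketch_domain_cpus;   /* cpus the scheduler would balance across */
static bool sketch_map_consumed;      /* set once the scheduler has read the map */

/* Step 1: early parameter parsing fills the map (isolcpus=). */
static void sketch_isolcpus_setup(uint64_t isolcpus)
{
	sketch_isolated_map = isolcpus;
}

/* Step 2: tick init extends the map with the nohz_full cpus. */
static void sketch_tick_nohz_init(uint64_t nohz_full)
{
	/* Extending the map after it has been read would have no effect. */
	assert(!sketch_map_consumed);
	sketch_isolated_map |= nohz_full;
}

/* Step 3: SMP scheduler init reads the map exactly once. */
static void sketch_sched_init_smp(uint64_t possible_cpus)
{
	sketch_domain_cpus = possible_cpus & ~sketch_isolated_map;
	sketch_map_consumed = true;
}
```

Calling the three steps in that order with isolcpus=1, nohz_full=2-3 and
four possible cpus leaves only cpu 0 in the balancing set; calling step 2
after step 3 trips the assert, which is the ordering constraint the
approach depends on.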

I left the sched_isolated_map_add() function enabled in all kernel
configurations, not just NO_HZ_FULL, since it's pretty trivial and it
felt like the #ifdefs to disable it conditionally would be noisier than
the benefit to kernel size.

 include/linux/sched.h    | 1 +
 kernel/sched/core.c      | 5 +++++
 kernel/time/tick-sched.c | 3 +++
 3 files changed, 9 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6d77432e14ff..18a961b9beba 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -323,6 +323,7 @@ struct task_struct;
 extern int lockdep_tasklist_lock_is_held(void);
 #endif /* #ifdef CONFIG_PROVE_RCU */
 
+extern void sched_isolated_map_add(const struct cpumask *);
 extern void sched_init(void);
 extern void sched_init_smp(void);
 extern asmlinkage void schedule_tail(struct task_struct *prev);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..b055c5e0e65c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5824,6 +5824,11 @@ static int __init isolated_cpu_setup(char *str)
 
 __setup("isolcpus=", isolated_cpu_setup);
 
+void sched_isolated_map_add(const struct cpumask *cpumask)
+{
+	cpumask_or(cpu_isolated_map, cpu_isolated_map, cpumask);
+}
+
 struct s_data {
 	struct sched_domain ** __percpu sd;
 	struct root_domain *rd;
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index a4c4edac4528..b0092d02ca3f 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -385,6 +385,9 @@ void __init tick_nohz_init(void)
 	for_each_cpu(cpu, tick_nohz_full_mask)
 		context_tracking_cpu_set(cpu);
 
+	/* It's not meaningful to be nohz without disabling the scheduler. */
+	sched_isolated_map_add(tick_nohz_full_mask);
+
 	cpu_notifier(tick_nohz_cpu_down_callback, 0);
 	pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
 		cpumask_pr_args(tick_nohz_full_mask));
-- 
2.1.2
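
For anyone who wants to check the "set (or extend)" behaviour of the
cpumask_or() call in the patch without booting a kernel, here is a
minimal userspace sketch. It is only a model: model_cpumask_t, the
64-cpu limit and all model_* names are assumptions made for this
example, not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per cpu, up to 64 cpus. */
typedef uint64_t model_cpumask_t;

#define MODEL_NR_CPUS 64

/* Build a mask covering cpus first..last inclusive, like parsing "first-last". */
static model_cpumask_t model_cpu_range(unsigned int first, unsigned int last)
{
	model_cpumask_t mask = 0;
	unsigned int cpu;

	assert(first <= last && last < MODEL_NR_CPUS);
	for (cpu = first; cpu <= last; cpu++)
		mask |= (model_cpumask_t)1 << cpu;
	return mask;
}

/* Stand-in for the kernel's cpu_isolated_map. */
static model_cpumask_t model_isolated_map;

/* Mirrors sched_isolated_map_add(): map = map | mask, never clears a bit. */
static void model_isolated_map_add(model_cpumask_t mask)
{
	model_isolated_map |= mask;
}

/* Returns nonzero if the given cpu is in the isolated map. */
static int model_cpu_isolated(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	return (int)((model_isolated_map >> cpu) & 1);
}
```

With the map preset to cpus 1-2 (as if isolcpus=1-2) and
model_isolated_map_add(model_cpu_range(2, 5)) standing in for
nohz_full=2-5, the map ends up covering cpus 1-5: the overlap on cpu 2
is harmless and cpu 1 stays isolated, since a union only ever adds bits.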