From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Ingo Molnar <mingo@kernel.org>, Peter Zijlstra <peterz@infradead.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Rik van Riel <riel@surriel.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Wang <wang.yi59@zte.com.cn>,
	zhong.weidong@zte.com.cn,
	Yi Liu <liu.yi24@zte.com.cn>,
	Frederic Weisbecker <frederic@kernel.org>
Subject: [PATCH v3 2/3] sched/core: Don't mix isolcpus and housekeeping CPUs
Date: Fri, 26 Oct 2018 00:12:22 +0530
Message-ID: <1540492943-17147-3-git-send-email-srikar@linux.vnet.ibm.com>
In-Reply-To: <1540492943-17147-1-git-send-email-srikar@linux.vnet.ibm.com>

The load balancer and the NUMA balancer are not supposed to work on
isolcpus. Currently, when setting cpus_allowed for a task, there is no
check whether the requested cpumask contains CPUs from both isolcpus
and housekeeping CPUs. If the user passes such a mix, the NUMA balancer
can pick an isolated CPU to schedule on.

With this change, if a combination of isolcpus and housekeeping CPUs is
provided, the mask is restricted to the housekeeping CPUs only.

For example, on a system with 32 CPUs:

$ grep -o "isolcpus=[,,1-9]*" /proc/cmdline
isolcpus=1,5,9,13
$ grep -i cpus_allowed /proc/$$/status
Cpus_allowed:	ffffdddd
Cpus_allowed_list:	0,2-4,6-8,10-12,14-31

Running "perf bench numa mem --no-data_rand_walk -p 4 -t 8 -G 0 -P 3072
-T 0 -l 50 -c -s 1000", which calls sched_setaffinity with all CPUs in
the system:

Without patch
-------------
$ for i in $(pgrep -f perf); do grep -i cpus_allowed_list /proc/$i/task/*/status ; done | head -n 10
Cpus_allowed_list:	0,2-4,6-8,10-12,14-31
/proc/2107/task/2107/status:Cpus_allowed_list:	0-31
/proc/2107/task/2196/status:Cpus_allowed_list:	0-31
/proc/2107/task/2197/status:Cpus_allowed_list:	0-31
/proc/2107/task/2198/status:Cpus_allowed_list:	0-31
/proc/2107/task/2199/status:Cpus_allowed_list:	0-31
/proc/2107/task/2200/status:Cpus_allowed_list:	0-31
/proc/2107/task/2201/status:Cpus_allowed_list:	0-31
/proc/2107/task/2202/status:Cpus_allowed_list:	0-31
/proc/2107/task/2203/status:Cpus_allowed_list:	0-31

With patch
----------
$ for i in $(pgrep -f perf); do grep -i cpus_allowed_list /proc/$i/task/*/status ; done | head -n 10
Cpus_allowed_list:	0,2-4,6-8,10-12,14-31
/proc/18591/task/18591/status:Cpus_allowed_list:	0,2-4,6-8,10-12,14-31
/proc/18591/task/18603/status:Cpus_allowed_list:	0,2-4,6-8,10-12,14-31
/proc/18591/task/18604/status:Cpus_allowed_list:	0,2-4,6-8,10-12,14-31
/proc/18591/task/18605/status:Cpus_allowed_list:	0,2-4,6-8,10-12,14-31
/proc/18591/task/18606/status:Cpus_allowed_list:	0,2-4,6-8,10-12,14-31
/proc/18591/task/18607/status:Cpus_allowed_list:	0,2-4,6-8,10-12,14-31
/proc/18591/task/18608/status:Cpus_allowed_list:	0,2-4,6-8,10-12,14-31
/proc/18591/task/18609/status:Cpus_allowed_list:	0,2-4,6-8,10-12,14-31
/proc/18591/task/18610/status:Cpus_allowed_list:	0,2-4,6-8,10-12,14-31

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v2->v3:
The detection is moved from sched_setaffinity() to
set_cpus_allowed_common(), which covers all cases where a task's
cpus_allowed is set.
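
For reference, a minimal user-space sketch (not part of the patch) of
what the perf bench reproducer effectively does: request affinity to
every CPU, then read back the mask the kernel actually applied. On the
example system above, with the patch applied the read-back list should
show 0,2-4,6-8,10-12,14-31 rather than 0-31.

/* Illustrative reproducer sketch, not taken from the patch. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	cpu_set_t mask;
	long nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
	int cpu;

	/* Ask for affinity to every CPU, isolated ones included. */
	CPU_ZERO(&mask);
	for (cpu = 0; cpu < nr_cpus && cpu < CPU_SETSIZE; cpu++)
		CPU_SET(cpu, &mask);
	if (sched_setaffinity(0, sizeof(mask), &mask))
		perror("sched_setaffinity");

	/* Read back and print the mask the kernel actually applied. */
	CPU_ZERO(&mask);
	if (sched_getaffinity(0, sizeof(mask), &mask))
		perror("sched_getaffinity");
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &mask))
			printf("%d ", cpu);
	printf("\n");
	return 0;
}
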
 kernel/sched/core.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3064e0f..37e62b8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1003,7 +1003,19 @@ static int migration_cpu_stop(void *data)
  */
 void set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_mask)
 {
-	cpumask_copy(&p->cpus_allowed, new_mask);
+	const struct cpumask *hk_mask = housekeeping_cpumask(HK_FLAG_DOMAIN);
+
+	/*
+	 * If the cpumask provided has CPUs that are part of isolated and
+	 * housekeeping_cpumask, then restrict it to just the CPUs that
+	 * are part of the housekeeping_cpumask.
+	 */
+	if (!cpumask_subset(new_mask, hk_mask) &&
+	    cpumask_intersects(new_mask, hk_mask))
+		cpumask_and(&p->cpus_allowed, new_mask, hk_mask);
+	else
+		cpumask_copy(&p->cpus_allowed, new_mask);
+
 	p->nr_cpus_allowed = cpumask_weight(new_mask);
 }
--
1.8.3.1
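
To make the intended behaviour easier to review, here is a small
user-space model of the decision above, an illustrative sketch using
plain 32-bit masks for the 32-CPU example rather than kernel cpumasks:
a mask mixing isolated and housekeeping CPUs is trimmed to the
housekeeping CPUs, while a mask that is purely housekeeping or purely
isolated is left as requested.

/* User-space model of the set_cpus_allowed_common() decision; not kernel code. */
#include <stdio.h>

static unsigned int effective_mask(unsigned int new_mask, unsigned int hk_mask)
{
	int subset     = (new_mask & ~hk_mask) == 0;	/* cpumask_subset()     */
	int intersects = (new_mask & hk_mask) != 0;	/* cpumask_intersects() */

	/* Mixed isolated + housekeeping: trim to the housekeeping CPUs. */
	if (!subset && intersects)
		return new_mask & hk_mask;

	/* Purely housekeeping or purely isolated: keep as requested. */
	return new_mask;
}

int main(void)
{
	unsigned int hk = 0xffffdddd;	/* CPUs 1,5,9,13 isolated, as in the example */

	printf("all CPUs          -> %#x\n", effective_mask(0xffffffffu, hk)); /* 0xffffdddd */
	printf("housekeeping only -> %#x\n", effective_mask(0x0000000du, hk)); /* unchanged  */
	printf("isolated only     -> %#x\n", effective_mask(0x00000002u, hk)); /* unchanged  */
	return 0;
}
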
Thread overview: 6+ messages

  2018-10-25 18:42 [PATCH v3 0/3] isolcpus Srikar Dronamraju
  2018-10-25 18:42 ` [PATCH v3 1/3] sched/core: Warn if cpumask has a mix of isolcpus and housekeeping CPUs Srikar Dronamraju
  2018-10-26  8:32   ` Peter Zijlstra
  2018-10-25 18:42 ` [PATCH v3 2/3] sched/core: Don't mix isolcpus and housekeeping CPUs Srikar Dronamraju [this message]
  2018-10-26  8:33   ` Peter Zijlstra
  2018-10-25 18:42 ` [PATCH v3 3/3] sched/core: Error out if cpumask has a mix of isolcpus and housekeeping CPUs Srikar Dronamraju