From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Ingo Molnar <mingo@kernel.org>, Peter Zijlstra <peterz@infradead.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
Mel Gorman <mgorman@techsingularity.net>,
Rik van Riel <riel@surriel.com>,
Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>, Wang <wang.yi59@zte.com.cn>,
zhong.weidong@zte.com.cn, Yi Liu <liu.yi24@zte.com.cn>,
Frederic Weisbecker <frederic@kernel.org>
Subject: [PATCH v3 2/3] sched/core: Don't mix isolcpus and housekeeping CPUs
Date: Fri, 26 Oct 2018 00:12:22 +0530
Message-ID: <1540492943-17147-3-git-send-email-srikar@linux.vnet.ibm.com>
In-Reply-To: <1540492943-17147-1-git-send-email-srikar@linux.vnet.ibm.com>
The load balancer and the NUMA balancer are not supposed to work on
isolcpus. Currently, when setting cpus_allowed for a task, there is no
check to see whether the requested cpumask has CPUs from both isolcpus and
housekeeping CPUs. If a user passes a mix of isolcpus and housekeeping
CPUs, the NUMA balancer can pick an isolcpu to schedule the task on. With
this change, if a combination of isolcpus and housekeeping CPUs is
provided, the mask is restricted to housekeeping CPUs only.
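The restriction rule described above can be modelled in user space. What follows is a minimal sketch, not kernel code: it assumes at most 32 CPUs so that a cpumask fits in a uint32_t, and the names mask32_t, mask_subset, mask_intersects and restrict_to_housekeeping are hypothetical stand-ins for struct cpumask, cpumask_subset(), cpumask_intersects() and the cpumask_and()/cpumask_copy() choice.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-CPU stand-in for struct cpumask: bit n represents CPU n. */
typedef uint32_t mask32_t;

/* True if every CPU in a is also in b (models cpumask_subset()). */
bool mask_subset(mask32_t a, mask32_t b)
{
	return (a & ~b) == 0;
}

/* True if a and b share at least one CPU (models cpumask_intersects()). */
bool mask_intersects(mask32_t a, mask32_t b)
{
	return (a & b) != 0;
}

/*
 * Return the mask a task would end up with: a request that mixes isolated
 * and housekeeping CPUs is trimmed to its housekeeping part; a request
 * that is purely housekeeping or purely isolated is kept unchanged.
 */
mask32_t restrict_to_housekeeping(mask32_t requested, mask32_t hk)
{
	if (!mask_subset(requested, hk) && mask_intersects(requested, hk))
		return requested & hk;
	return requested;
}
```

With a housekeeping mask of 0xffffdddd (CPUs 1, 5, 9 and 13 isolated), a request for all 32 CPUs (0xffffffff) is trimmed to 0xffffdddd, while a request for isolated CPUs only, such as 0x00000022 (CPUs 1 and 5), is left unchanged.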
For example, on a system with 32 CPUs:
$ grep -o "isolcpus=[,0-9]*" /proc/cmdline
isolcpus=1,5,9,13
$ grep -i cpus_allowed /proc/$$/status
Cpus_allowed: ffffdddd
Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
Running "perf bench numa mem --no-data_rand_walk -p 4 -t 8 -G 0 -P 3072
-T 0 -l 50 -c -s 1000", which calls sched_setaffinity() with all CPUs in
the system:
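The kind of affinity request the benchmark makes can be reproduced with a few lines of C. This is a minimal sketch under stated assumptions, not the perf implementation: the helper names request_all_cpus and allowed_cpu_count are hypothetical, and the CPU count is taken from sysconf(_SC_NPROCESSORS_CONF).

```c
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

/*
 * Ask the kernel to allow the calling thread on every configured CPU.
 * Returns 0 on success, -1 on failure (errno is set by
 * sched_setaffinity() in that case).
 */
int request_all_cpus(void)
{
	cpu_set_t set;
	long ncpus = sysconf(_SC_NPROCESSORS_CONF);

	if (ncpus < 1)
		return -1;

	CPU_ZERO(&set);
	for (long cpu = 0; cpu < ncpus && cpu < CPU_SETSIZE; cpu++)
		CPU_SET(cpu, &set);

	/* pid 0 means "the calling thread". */
	return sched_setaffinity(0, sizeof(set), &set);
}

/*
 * Return how many CPUs the calling thread is currently allowed on,
 * or -1 if the affinity mask cannot be read.
 */
int allowed_cpu_count(void)
{
	cpu_set_t set;

	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	return CPU_COUNT(&set);
}
```

Calling request_all_cpus() and then allowed_cpu_count() shows what the kernel actually stored. On a kernel with this patch and isolcpus= set, the count would be expected to be lower than the number of configured CPUs.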
Without patch
------------
$ for i in $(pgrep -f perf); do grep -i cpus_allowed_list /proc/$i/task/*/status ; done | head -n 10
Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
/proc/2107/task/2107/status:Cpus_allowed_list: 0-31
/proc/2107/task/2196/status:Cpus_allowed_list: 0-31
/proc/2107/task/2197/status:Cpus_allowed_list: 0-31
/proc/2107/task/2198/status:Cpus_allowed_list: 0-31
/proc/2107/task/2199/status:Cpus_allowed_list: 0-31
/proc/2107/task/2200/status:Cpus_allowed_list: 0-31
/proc/2107/task/2201/status:Cpus_allowed_list: 0-31
/proc/2107/task/2202/status:Cpus_allowed_list: 0-31
/proc/2107/task/2203/status:Cpus_allowed_list: 0-31
With patch
----------
$ for i in $(pgrep -f perf); do grep -i cpus_allowed_list /proc/$i/task/*/status ; done | head -n 10
Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
/proc/18591/task/18591/status:Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
/proc/18591/task/18603/status:Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
/proc/18591/task/18604/status:Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
/proc/18591/task/18605/status:Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
/proc/18591/task/18606/status:Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
/proc/18591/task/18607/status:Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
/proc/18591/task/18608/status:Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
/proc/18591/task/18609/status:Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
/proc/18591/task/18610/status:Cpus_allowed_list: 0,2-4,6-8,10-12,14-31
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v2->v3:
The actual detection is moved from sched_setaffinity() to
set_cpus_allowed_common(). This covers all cases where a task's
cpus_allowed is set.
kernel/sched/core.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3064e0f..37e62b8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1003,7 +1003,19 @@ static int migration_cpu_stop(void *data)
  */
 void set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_mask)
 {
-	cpumask_copy(&p->cpus_allowed, new_mask);
+	const struct cpumask *hk_mask = housekeeping_cpumask(HK_FLAG_DOMAIN);
+
+	/*
+	 * If the cpumask provided has CPUs that are part of isolated and
+	 * housekeeping_cpumask, then restrict it to just the CPUs that
+	 * are part of the housekeeping_cpumask.
+	 */
+	if (!cpumask_subset(new_mask, hk_mask) &&
+	    cpumask_intersects(new_mask, hk_mask))
+		cpumask_and(&p->cpus_allowed, new_mask, hk_mask);
+	else
+		cpumask_copy(&p->cpus_allowed, new_mask);
+
 	p->nr_cpus_allowed = cpumask_weight(new_mask);
 }
--
1.8.3.1