From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Valentin Schneider <valentin.schneider@arm.com>,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Cc: lizefan@huawei.com, tj@kernel.org, hannes@cmpxchg.org,
mingo@kernel.org, peterz@infradead.org,
vincent.guittot@linaro.org, morten.rasmussen@arm.com,
qperret@google.com, stable@vger.kernel.org
Subject: Re: [PATCH v4 1/2] sched/topology: Don't try to build empty sched domains
Date: Thu, 24 Oct 2019 18:19:38 +0200 [thread overview]
Message-ID: <27d8a51d-a899-67f4-8b74-224281f25cef@arm.com> (raw)
In-Reply-To: <20191023153745.19515-2-valentin.schneider@arm.com>
On 23/10/2019 17:37, Valentin Schneider wrote:
> Turns out hotplugging CPUs that are in exclusive cpusets can lead to the
> cpuset code feeding empty cpumasks to the sched domain rebuild machinery.
> This leads to the following splat:
[...]
> The faulty line in question is
>
> cap = arch_scale_cpu_capacity(cpumask_first(cpu_map));
>
> and we're not checking the return value against nr_cpu_ids (we shouldn't
> have to!), which leads to the above.
>
> Prevent generate_sched_domains() from returning empty cpumasks, and add
> some assertion in build_sched_domains() to scream bloody murder if it
> happens again.
>
> The above splat was obtained on my Juno r0 with:
>
> cgcreate -g cpuset:asym
> cgset -r cpuset.cpus=0-3 asym
> cgset -r cpuset.mems=0 asym
> cgset -r cpuset.cpu_exclusive=1 asym
>
> cgcreate -g cpuset:smp
> cgset -r cpuset.cpus=4-5 smp
> cgset -r cpuset.mems=0 smp
> cgset -r cpuset.cpu_exclusive=1 smp
>
> cgset -r cpuset.sched_load_balance=0 .
>
> echo 0 > /sys/devices/system/cpu/cpu4/online
> echo 0 > /sys/devices/system/cpu/cpu5/online
>
> Cc: <stable@vger.kernel.org>
> Fixes: 05484e098448 ("sched/topology: Add SD_ASYM_CPUCAPACITY flag detection")
Sorry for being picky but IMHO you should also mention that it fixes
f9a25f776d78 ("cpusets: Rebuild root domain deadline accounting
information")
Tested it on a hikey620 (8 CPus SMP) with v5.4-rc4 and a local fix for
asym_cpu_capacity_level().
2 exclusive cpusets [0-3] and [4-7], hp'ing out [0-3] and then hp'ing in
[0] again.
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 5a174ae6ecf3..8f83e8e3ea9a 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2203,8 +2203,19 @@ void partition_sched_domains_locked(int
ndoms_new, cpumask_var_t doms_new[],
for (i = 0; i < ndoms_cur; i++) {
for (j = 0; j < n && !new_topology; j++) {
if (cpumask_equal(doms_cur[i], doms_new[j]) &&
- dattrs_equal(dattr_cur, i, dattr_new, j))
+ dattrs_equal(dattr_cur, i, dattr_new, j)) {
+ struct root_domain *rd;
+
+ /*
+ * This domain won't be destroyed and as such
+ * its dl_bw->total_bw needs to be cleared. It
+ * will be recomputed in function
+ * update_tasks_root_domain().
+ */
+ rd = cpu_rq(cpumask_any(doms_cur[i]))->rd;
We have an issue here if doms_cur[i] is empty.
+ dl_clear_root_domain(rd);
goto match1;
There is yet another similar issue behind the first one
(asym_cpu_capacity_level()).
342 static bool build_perf_domains(const struct cpumask *cpu_map)
343 {
344 int i, nr_pd = 0, nr_cs = 0, nr_cpus = cpumask_weight(cpu_map);
345 struct perf_domain *pd = NULL, *tmp;
346 int cpu = cpumask_first(cpu_map); <--- !!!
347 struct root_domain *rd = cpu_rq(cpu)->rd; <--- !!!
348 struct cpufreq_policy *policy;
349 struct cpufreq_governor *gov;
...
406 tmp = rd->pd; <--- !!!
Caught when running hikey620 (8 CPus SMP) with v5.4-rc4 and a local fix
for asym_cpu_capacity_level() with CONFIG_ENERGY_MODEL=y.
There might be other places in build_sched_domains() suffering from the
same issue. So I assume it's wise to not call it with an empty cpu_map
and warn if done so.
[...]
next prev parent reply other threads:[~2019-10-24 16:20 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-23 15:37 [PATCH v4 0/2] sched/topology: Asymmetric topologies fixes Valentin Schneider
2019-10-23 15:37 ` [PATCH v4 1/2] sched/topology: Don't try to build empty sched domains Valentin Schneider
2019-10-24 16:19 ` Dietmar Eggemann [this message]
2019-10-24 16:45 ` Valentin Schneider
2019-10-29 9:52 ` [tip: sched/urgent] " tip-bot2 for Valentin Schneider
2019-10-31 16:23 ` [PATCH v4 1/2] " Michal Koutný
2019-10-31 17:23 ` Valentin Schneider
2019-11-01 10:08 ` Michal Koutný
2019-10-23 15:37 ` [PATCH v4 2/2] sched/topology: Allow sched_asym_cpucapacity to be disabled Valentin Schneider
2019-10-29 9:52 ` [tip: sched/urgent] " tip-bot2 for Valentin Schneider
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=27d8a51d-a899-67f4-8b74-224281f25cef@arm.com \
--to=dietmar.eggemann@arm.com \
--cc=cgroups@vger.kernel.org \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lizefan@huawei.com \
--cc=mingo@kernel.org \
--cc=morten.rasmussen@arm.com \
--cc=peterz@infradead.org \
--cc=qperret@google.com \
--cc=stable@vger.kernel.org \
--cc=tj@kernel.org \
--cc=valentin.schneider@arm.com \
--cc=vincent.guittot@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).