From: Valentin Schneider <valentin.schneider@arm.com>
To: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Cc: lizefan@huawei.com, tj@kernel.org, hannes@cmpxchg.org,
mingo@kernel.org, peterz@infradead.org,
vincent.guittot@linaro.org, Dietmar.Eggemann@arm.com,
morten.rasmussen@arm.com, qperret@google.com,
stable@vger.kernel.org
Subject: [PATCH v4 1/2] sched/topology: Don't try to build empty sched domains
Date: Wed, 23 Oct 2019 16:37:44 +0100
Message-ID: <20191023153745.19515-2-valentin.schneider@arm.com>
In-Reply-To: <20191023153745.19515-1-valentin.schneider@arm.com>
Turns out hotplugging CPUs that are in exclusive cpusets can lead to the
cpuset code feeding empty cpumasks to the sched domain rebuild machinery.
This leads to the following splat:
[ 30.618174] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 30.623697] Modules linked in:
[ 30.626731] CPU: 0 PID: 235 Comm: kworker/5:2 Not tainted 5.4.0-rc1-00005-g8d495477d62e #23
[ 30.635003] Hardware name: ARM Juno development board (r0) (DT)
[ 30.640877] Workqueue: events cpuset_hotplug_workfn
[ 30.645713] pstate: 60000005 (nZCv daif -PAN -UAO)
[ 30.650464] pc : build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969)
[ 30.655126] lr : build_sched_domains (kernel/sched/topology.c:1966)
[...]
[ 30.742047] Call trace:
[ 30.744474] build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969)
[ 30.748793] partition_sched_domains_locked (kernel/sched/topology.c:2250)
[ 30.753971] rebuild_sched_domains_locked (./include/linux/bitmap.h:370 ./include/linux/cpumask.h:538 kernel/cgroup/cpuset.c:955 kernel/cgroup/cpuset.c:978 kernel/cgroup/cpuset.c:1019)
[ 30.758977] rebuild_sched_domains (kernel/cgroup/cpuset.c:1032)
[ 30.763209] cpuset_hotplug_workfn (kernel/cgroup/cpuset.c:3205 (discriminator 2))
[ 30.767613] process_one_work (./arch/arm64/include/asm/jump_label.h:21 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:114 kernel/workqueue.c:2274)
[ 30.771586] worker_thread (./include/linux/compiler.h:199 ./include/linux/list.h:268 kernel/workqueue.c:2416)
[ 30.775217] kthread (kernel/kthread.c:255)
[ 30.778418] ret_from_fork (arch/arm64/kernel/entry.S:1167)
[ 30.781965] Code: f860dae2 912802d6 aa1603e1 12800000 (f8616853)
The faulty line in question is:

  cap = arch_scale_cpu_capacity(cpumask_first(cpu_map));

and we're not checking the return value of cpumask_first() against
nr_cpu_ids (we shouldn't have to!), so an empty cpu_map makes us look up
the capacity of a CPU that doesn't exist, which leads to the above.
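
To illustrate the failure mode outside the kernel, here is a minimal
userspace sketch. Everything in it is a hypothetical stand-in, not kernel
code: struct mask, mask_first(), the capacity[] table and its values are
made up for the example, and NR_CPU_IDS is assumed to be 6. It only models
the documented behaviour that a "first set bit" lookup on an empty mask
returns a value >= nr_cpu_ids, which is not a valid index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for nr_cpu_ids; 6 is an assumption for this sketch. */
#define NR_CPU_IDS 6u

/* Toy replacement for struct cpumask: one bit per CPU. */
struct mask {
	unsigned long bits;
};

static void mask_set(struct mask *m, unsigned int cpu)
{
	assert(cpu < NR_CPU_IDS);
	m->bits |= 1UL << cpu;
}

/*
 * Models cpumask_first(): returns the lowest set CPU number, or
 * NR_CPU_IDS when the mask is empty. The "empty" value is NOT a valid index.
 */
static unsigned int mask_first(const struct mask *m)
{
	for (unsigned int cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (m->bits & (1UL << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/* Made-up per-CPU capacities, standing in for arch_scale_cpu_capacity(). */
static const unsigned long capacity[NR_CPU_IDS] = {
	400, 1024, 1024, 400, 400, 400
};

/*
 * Looks up the capacity of the first CPU in the mask. Returns false for an
 * empty mask instead of reading capacity[NR_CPU_IDS], which would be out of
 * bounds (the userspace analogue of the oops above).
 */
static bool first_cpu_capacity(const struct mask *m, unsigned long *cap)
{
	unsigned int cpu = mask_first(m);

	if (cpu >= NR_CPU_IDS)
		return false;

	*cap = capacity[cpu];
	return true;
}
```

Calling first_cpu_capacity() on a zeroed struct mask returns false, whereas
the unguarded form capacity[mask_first(m)] would read one element past the
end of the table. The patch does not add such a check at the call site; it
keeps empty masks from reaching build_sched_domains() in the first place.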
Prevent generate_sched_domains() from returning empty cpumasks, and add
an assertion in build_sched_domains() to scream bloody murder if it
happens again.
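
The two-sided approach described above (filter at the producer, assert at
the consumer) can be sketched in plain C as follows. This is only an
illustration under stated assumptions: struct toy_cpuset, collect_domains()
and build_domain() are hypothetical names, a cpumask is reduced to an
unsigned long bitmask, and the -1 return value merely stands in for the
kernel's WARN_ON() plus error path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cpuset; only the fields this sketch needs. */
struct toy_cpuset {
	unsigned long effective_cpus;	/* bitmask of CPUs still usable by the set */
	bool load_balance;		/* models is_sched_load_balance() */
};

/*
 * Producer side: collect only the cpusets that both want load balancing
 * and still own at least one CPU. A cpuset whose CPUs were all hotplugged
 * out has an empty mask and is skipped. Returns the number of masks stored.
 */
static size_t collect_domains(const struct toy_cpuset *sets, size_t n,
			      unsigned long *out)
{
	size_t count = 0;

	assert(sets && out);

	for (size_t i = 0; i < n; i++) {
		if (sets[i].load_balance && sets[i].effective_cpus != 0)
			out[count++] = sets[i].effective_cpus;
	}
	return count;
}

/*
 * Consumer side: refuse to build anything from an empty map. Here this is
 * a plain error return; it loosely mirrors the early bail-out in the patch.
 */
static int build_domain(unsigned long cpu_map)
{
	if (cpu_map == 0)
		return -1;
	return 0;
}
```

With three toy cpusets of which one has lost all its CPUs and one has load
balancing disabled, collect_domains() hands back a single mask, and
build_domain() still rejects an empty map should one slip through by
another path.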
The above splat was obtained on my Juno r0 with:

  cgcreate -g cpuset:asym
  cgset -r cpuset.cpus=0-3 asym
  cgset -r cpuset.mems=0 asym
  cgset -r cpuset.cpu_exclusive=1 asym

  cgcreate -g cpuset:smp
  cgset -r cpuset.cpus=4-5 smp
  cgset -r cpuset.mems=0 smp
  cgset -r cpuset.cpu_exclusive=1 smp

  cgset -r cpuset.sched_load_balance=0 .

  echo 0 > /sys/devices/system/cpu/cpu4/online
  echo 0 > /sys/devices/system/cpu/cpu5/online

Cc: <stable@vger.kernel.org>
Fixes: 05484e098448 ("sched/topology: Add SD_ASYM_CPUCAPACITY flag detection")
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
kernel/cgroup/cpuset.c | 3 ++-
kernel/sched/topology.c | 5 ++++-
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index c52bc91f882b..c87ee6412b36 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -798,7 +798,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
cpumask_subset(cp->cpus_allowed, top_cpuset.effective_cpus))
continue;
- if (is_sched_load_balance(cp))
+ if (is_sched_load_balance(cp) &&
+ !cpumask_empty(cp->effective_cpus))
csa[csn++] = cp;
/* skip @cp's subtree if not a partition root */
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 3623ffe85d18..2e7af755e17a 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1945,7 +1945,7 @@ static struct sched_domain_topology_level
static int
build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *attr)
{
- enum s_alloc alloc_state;
+ enum s_alloc alloc_state = sa_none;
struct sched_domain *sd;
struct s_data d;
struct rq *rq = NULL;
@@ -1953,6 +1953,9 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
struct sched_domain_topology_level *tl_asym;
bool has_asym = false;
+ if (WARN_ON(cpumask_empty(cpu_map)))
+ goto error;
+
alloc_state = __visit_domain_allocation_hell(&d, cpu_map);
if (alloc_state != sa_rootdomain)
goto error;
--
2.22.0
Thread overview: 10+ messages
2019-10-23 15:37 [PATCH v4 0/2] sched/topology: Asymmetric topologies fixes Valentin Schneider
2019-10-23 15:37 ` Valentin Schneider [this message]
2019-10-24 16:19 ` [PATCH v4 1/2] sched/topology: Don't try to build empty sched domains Dietmar Eggemann
2019-10-24 16:45 ` Valentin Schneider
2019-10-29 9:52 ` [tip: sched/urgent] " tip-bot2 for Valentin Schneider
2019-10-31 16:23 ` [PATCH v4 1/2] " Michal Koutný
2019-10-31 17:23 ` Valentin Schneider
2019-11-01 10:08 ` Michal Koutný
2019-10-23 15:37 ` [PATCH v4 2/2] sched/topology: Allow sched_asym_cpucapacity to be disabled Valentin Schneider
2019-10-29 9:52 ` [tip: sched/urgent] " tip-bot2 for Valentin Schneider