* [PATCH v4 0/2] sched/topology: Asymmetric topologies fixes
  2019-10-23 15:37 ` Valentin Schneider

From: Valentin Schneider
To: linux-kernel, cgroups
Cc: lizefan, tj, hannes, mingo, peterz, vincent.guittot,
    Dietmar.Eggemann, morten.rasmussen, qperret

Hi,

I got a nice splat while testing out the toggling of
sched_asym_cpucapacity, so this is a cpuset fix plus a topology patch.
Details are in the logs.

v2 changes:
- Use static_branch_{inc,dec} rather than enable/disable

v3 changes:
- New patch: add fix for empty cpumap in sched domain rebuild
- Move static_branch_dec outside of RCU read-side section (Quentin)

v4 changes:
- Patch 1/2: Directly tweak the cpuset array (Dietmar)
- Patch 2/2: Add an example to the changelog (Dietmar)

Cheers,
Valentin

Valentin Schneider (2):
  sched/topology: Don't try to build empty sched domains
  sched/topology: Allow sched_asym_cpucapacity to be disabled

 kernel/cgroup/cpuset.c  |  3 ++-
 kernel/sched/topology.c | 11 +++++++++--
 2 files changed, 11 insertions(+), 3 deletions(-)

-- 
2.22.0
* [PATCH v4 1/2] sched/topology: Don't try to build empty sched domains
  2019-10-23 15:37 ` Valentin Schneider

From: Valentin Schneider
To: linux-kernel, cgroups
Cc: lizefan, tj, hannes, mingo, peterz, vincent.guittot,
    Dietmar.Eggemann, morten.rasmussen, qperret, stable

Turns out hotplugging CPUs that are in exclusive cpusets can lead to the
cpuset code feeding empty cpumasks to the sched domain rebuild machinery.
This leads to the following splat:

  [   30.618174] Internal error: Oops: 96000004 [#1] PREEMPT SMP
  [   30.623697] Modules linked in:
  [   30.626731] CPU: 0 PID: 235 Comm: kworker/5:2 Not tainted 5.4.0-rc1-00005-g8d495477d62e #23
  [   30.635003] Hardware name: ARM Juno development board (r0) (DT)
  [   30.640877] Workqueue: events cpuset_hotplug_workfn
  [   30.645713] pstate: 60000005 (nZCv daif -PAN -UAO)
  [   30.650464] pc : build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969)
  [   30.655126] lr : build_sched_domains (kernel/sched/topology.c:1966)
  [...]
  [   30.742047] Call trace:
  [   30.744474]  build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969)
  [   30.748793]  partition_sched_domains_locked (kernel/sched/topology.c:2250)
  [   30.753971]  rebuild_sched_domains_locked (./include/linux/bitmap.h:370 ./include/linux/cpumask.h:538 kernel/cgroup/cpuset.c:955 kernel/cgroup/cpuset.c:978 kernel/cgroup/cpuset.c:1019)
  [   30.758977]  rebuild_sched_domains (kernel/cgroup/cpuset.c:1032)
  [   30.763209]  cpuset_hotplug_workfn (kernel/cgroup/cpuset.c:3205 (discriminator 2))
  [   30.767613]  process_one_work (./arch/arm64/include/asm/jump_label.h:21 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:114 kernel/workqueue.c:2274)
  [   30.771586]  worker_thread (./include/linux/compiler.h:199 ./include/linux/list.h:268 kernel/workqueue.c:2416)
  [   30.775217]  kthread (kernel/kthread.c:255)
  [   30.778418]  ret_from_fork (arch/arm64/kernel/entry.S:1167)
  [   30.781965] Code: f860dae2 912802d6 aa1603e1 12800000 (f8616853)

The faulty line in question is

  cap = arch_scale_cpu_capacity(cpumask_first(cpu_map));

and we're not checking the return value against nr_cpu_ids (we shouldn't
have to!), which leads to the above.

Prevent generate_sched_domains() from returning empty cpumasks, and add
some assertion in build_sched_domains() to scream bloody murder if it
happens again.

The above splat was obtained on my Juno r0 with:

  cgcreate -g cpuset:asym
  cgset -r cpuset.cpus=0-3 asym
  cgset -r cpuset.mems=0 asym
  cgset -r cpuset.cpu_exclusive=1 asym

  cgcreate -g cpuset:smp
  cgset -r cpuset.cpus=4-5 smp
  cgset -r cpuset.mems=0 smp
  cgset -r cpuset.cpu_exclusive=1 smp

  cgset -r cpuset.sched_load_balance=0 .

  echo 0 > /sys/devices/system/cpu/cpu4/online
  echo 0 > /sys/devices/system/cpu/cpu5/online

Cc: <stable@vger.kernel.org>
Fixes: 05484e098448 ("sched/topology: Add SD_ASYM_CPUCAPACITY flag detection")
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/cgroup/cpuset.c  | 3 ++-
 kernel/sched/topology.c | 5 ++++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index c52bc91f882b..c87ee6412b36 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -798,7 +798,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
 		    cpumask_subset(cp->cpus_allowed, top_cpuset.effective_cpus))
 			continue;
 
-		if (is_sched_load_balance(cp))
+		if (is_sched_load_balance(cp) &&
+		    !cpumask_empty(cp->effective_cpus))
 			csa[csn++] = cp;
 
 		/* skip @cp's subtree if not a partition root */
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 3623ffe85d18..2e7af755e17a 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1945,7 +1945,7 @@ static struct sched_domain_topology_level
 static int
 build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *attr)
 {
-	enum s_alloc alloc_state;
+	enum s_alloc alloc_state = sa_none;
 	struct sched_domain *sd;
 	struct s_data d;
 	struct rq *rq = NULL;
@@ -1953,6 +1953,9 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	struct sched_domain_topology_level *tl_asym;
 	bool has_asym = false;
 
+	if (WARN_ON(cpumask_empty(cpu_map)))
+		goto error;
+
 	alloc_state = __visit_domain_allocation_hell(&d, cpu_map);
 	if (alloc_state != sa_rootdomain)
 		goto error;
-- 
2.22.0
* Re: [PATCH v4 1/2] sched/topology: Don't try to build empty sched domains
  2019-10-24 16:19 ` Dietmar Eggemann

From: Dietmar Eggemann
To: Valentin Schneider, linux-kernel, cgroups
Cc: lizefan, tj, hannes, mingo, peterz, vincent.guittot,
    morten.rasmussen, qperret, stable

On 23/10/2019 17:37, Valentin Schneider wrote:
> Turns out hotplugging CPUs that are in exclusive cpusets can lead to the
> cpuset code feeding empty cpumasks to the sched domain rebuild machinery.
> This leads to the following splat:

[...]

> The faulty line in question is
>
>   cap = arch_scale_cpu_capacity(cpumask_first(cpu_map));
>
> and we're not checking the return value against nr_cpu_ids (we shouldn't
> have to!), which leads to the above.
>
> Prevent generate_sched_domains() from returning empty cpumasks, and add
> some assertion in build_sched_domains() to scream bloody murder if it
> happens again.
>
> The above splat was obtained on my Juno r0 with:
>
>   cgcreate -g cpuset:asym
>   cgset -r cpuset.cpus=0-3 asym
>   cgset -r cpuset.mems=0 asym
>   cgset -r cpuset.cpu_exclusive=1 asym
>
>   cgcreate -g cpuset:smp
>   cgset -r cpuset.cpus=4-5 smp
>   cgset -r cpuset.mems=0 smp
>   cgset -r cpuset.cpu_exclusive=1 smp
>
>   cgset -r cpuset.sched_load_balance=0 .
>
>   echo 0 > /sys/devices/system/cpu/cpu4/online
>   echo 0 > /sys/devices/system/cpu/cpu5/online
>
> Cc: <stable@vger.kernel.org>
> Fixes: 05484e098448 ("sched/topology: Add SD_ASYM_CPUCAPACITY flag detection")

Sorry for being picky, but IMHO you should also mention that it fixes:

  f9a25f776d78 ("cpusets: Rebuild root domain deadline accounting
  information")

Tested it on a hikey620 (8 CPUs SMP) with v5.4-rc4 and a local fix for
asym_cpu_capacity_level(): 2 exclusive cpusets [0-3] and [4-7],
hotplugging out [0-3] and then hotplugging in [0] again.

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 5a174ae6ecf3..8f83e8e3ea9a 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2203,8 +2203,19 @@ void partition_sched_domains_locked(int ndoms_new, cpumask_var_t doms_new[],
 	for (i = 0; i < ndoms_cur; i++) {
 		for (j = 0; j < n && !new_topology; j++) {
 			if (cpumask_equal(doms_cur[i], doms_new[j]) &&
-			    dattrs_equal(dattr_cur, i, dattr_new, j))
+			    dattrs_equal(dattr_cur, i, dattr_new, j)) {
+				struct root_domain *rd;
+
+				/*
+				 * This domain won't be destroyed and as such
+				 * its dl_bw->total_bw needs to be cleared. It
+				 * will be recomputed in function
+				 * update_tasks_root_domain().
+				 */
+				rd = cpu_rq(cpumask_any(doms_cur[i]))->rd;

We have an issue here if doms_cur[i] is empty.

+				dl_clear_root_domain(rd);
 				goto match1;

There is yet another similar issue behind the first one
(asym_cpu_capacity_level()):

  342 static bool build_perf_domains(const struct cpumask *cpu_map)
  343 {
  344 	int i, nr_pd = 0, nr_cs = 0, nr_cpus = cpumask_weight(cpu_map);
  345 	struct perf_domain *pd = NULL, *tmp;
  346 	int cpu = cpumask_first(cpu_map);          <--- !!!
  347 	struct root_domain *rd = cpu_rq(cpu)->rd;  <--- !!!
  348 	struct cpufreq_policy *policy;
  349 	struct cpufreq_governor *gov;
  ...
  406 	tmp = rd->pd;                              <--- !!!

Caught when running hikey620 (8 CPUs SMP) with v5.4-rc4 and a local fix
for asym_cpu_capacity_level() with CONFIG_ENERGY_MODEL=y.

There might be other places in build_sched_domains() suffering from the
same issue, so I assume it's wise to not call it with an empty cpu_map,
and to warn if done so.

[...]
* Re: [PATCH v4 1/2] sched/topology: Don't try to build empty sched domains
  2019-10-24 16:45 ` Valentin Schneider

From: Valentin Schneider
To: Dietmar Eggemann, linux-kernel, cgroups
Cc: lizefan, tj, hannes, mingo, peterz, vincent.guittot,
    morten.rasmussen, qperret, stable

On 24/10/2019 17:19, Dietmar Eggemann wrote:
> Sorry for being picky but IMHO you should also mention that it fixes
>
> f9a25f776d78 ("cpusets: Rebuild root domain deadline accounting
> information")

I can append the following to the changelog, although I'd like some
feedback from the cgroup folks before doing a respin:

"""
Note that commit f9a25f776d78 ("cpusets: Rebuild root domain deadline
accounting information") introduced a similar issue. Since doms_new is
assigned to doms_cur without any filtering, we can end up with an empty
cpumask in the doms_cur array. The next time we go through a rebuild,
this will break on:

  rd = cpu_rq(cpumask_any(doms_cur[i]))->rd;

If there wasn't enough already, this is yet another argument for *not*
handing over empty cpumasks to the sched domain rebuild.
"""

I tagged the commit that introduces the static key with "Fixes:" because
it was introduced earlier - I don't think it would make sense to have two
"Fixes:" lines? In any case, it'll now be listed in the changelog.
* [tip: sched/urgent] sched/topology: Don't try to build empty sched domains
  2019-10-29  9:52 ` tip-bot2 for Valentin Schneider

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     cd1cb3350561d2bf544ddfef76fbf0b1c9c7178f
Gitweb:        https://git.kernel.org/tip/cd1cb3350561d2bf544ddfef76fbf0b1c9c7178f
Author:        Valentin Schneider <valentin.schneider@arm.com>
AuthorDate:    Wed, 23 Oct 2019 16:37:44 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 29 Oct 2019 09:58:45 +01:00

sched/topology: Don't try to build empty sched domains

Turns out hotplugging CPUs that are in exclusive cpusets can lead to the
cpuset code feeding empty cpumasks to the sched domain rebuild machinery.

This leads to the following splat:

    Internal error: Oops: 96000004 [#1] PREEMPT SMP
    Modules linked in:
    CPU: 0 PID: 235 Comm: kworker/5:2 Not tainted 5.4.0-rc1-00005-g8d495477d62e #23
    Hardware name: ARM Juno development board (r0) (DT)
    Workqueue: events cpuset_hotplug_workfn
    pstate: 60000005 (nZCv daif -PAN -UAO)
    pc : build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969)
    lr : build_sched_domains (kernel/sched/topology.c:1966)
    Call trace:
     build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969)
     partition_sched_domains_locked (kernel/sched/topology.c:2250)
     rebuild_sched_domains_locked (./include/linux/bitmap.h:370 ./include/linux/cpumask.h:538 kernel/cgroup/cpuset.c:955 kernel/cgroup/cpuset.c:978 kernel/cgroup/cpuset.c:1019)
     rebuild_sched_domains (kernel/cgroup/cpuset.c:1032)
     cpuset_hotplug_workfn (kernel/cgroup/cpuset.c:3205 (discriminator 2))
     process_one_work (./arch/arm64/include/asm/jump_label.h:21 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:114 kernel/workqueue.c:2274)
     worker_thread (./include/linux/compiler.h:199 ./include/linux/list.h:268 kernel/workqueue.c:2416)
     kthread (kernel/kthread.c:255)
     ret_from_fork (arch/arm64/kernel/entry.S:1167)
    Code: f860dae2 912802d6 aa1603e1 12800000 (f8616853)

The faulty line in question is:

  cap = arch_scale_cpu_capacity(cpumask_first(cpu_map));

and we're not checking the return value against nr_cpu_ids (we shouldn't
have to!), which leads to the above.

Prevent generate_sched_domains() from returning empty cpumasks, and add
some assertion in build_sched_domains() to scream bloody murder if it
happens again.

The above splat was obtained on my Juno r0 with the following reproducer:

  $ cgcreate -g cpuset:asym
  $ cgset -r cpuset.cpus=0-3 asym
  $ cgset -r cpuset.mems=0 asym
  $ cgset -r cpuset.cpu_exclusive=1 asym

  $ cgcreate -g cpuset:smp
  $ cgset -r cpuset.cpus=4-5 smp
  $ cgset -r cpuset.mems=0 smp
  $ cgset -r cpuset.cpu_exclusive=1 smp

  $ cgset -r cpuset.sched_load_balance=0 .

  $ echo 0 > /sys/devices/system/cpu/cpu4/online
  $ echo 0 > /sys/devices/system/cpu/cpu5/online

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar.Eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hannes@cmpxchg.org
Cc: lizefan@huawei.com
Cc: morten.rasmussen@arm.com
Cc: qperret@google.com
Cc: tj@kernel.org
Cc: vincent.guittot@linaro.org
Fixes: 05484e098448 ("sched/topology: Add SD_ASYM_CPUCAPACITY flag detection")
Link: https://lkml.kernel.org/r/20191023153745.19515-2-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/cgroup/cpuset.c  | 3 ++-
 kernel/sched/topology.c | 5 ++++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index c52bc91..c87ee64 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -798,7 +798,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
 		    cpumask_subset(cp->cpus_allowed, top_cpuset.effective_cpus))
 			continue;
 
-		if (is_sched_load_balance(cp))
+		if (is_sched_load_balance(cp) &&
+		    !cpumask_empty(cp->effective_cpus))
 			csa[csn++] = cp;
 
 		/* skip @cp's subtree if not a partition root */
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index b5667a2..9318acf 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1948,7 +1948,7 @@ next_level:
 static int
 build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *attr)
 {
-	enum s_alloc alloc_state;
+	enum s_alloc alloc_state = sa_none;
 	struct sched_domain *sd;
 	struct s_data d;
 	struct rq *rq = NULL;
@@ -1956,6 +1956,9 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	struct sched_domain_topology_level *tl_asym;
 	bool has_asym = false;
 
+	if (WARN_ON(cpumask_empty(cpu_map)))
+		goto error;
+
 	alloc_state = __visit_domain_allocation_hell(&d, cpu_map);
 	if (alloc_state != sa_rootdomain)
 		goto error;
* Re: [PATCH v4 1/2] sched/topology: Don't try to build empty sched domains
  2019-10-31 16:23 ` Michal Koutný

From: Michal Koutný
To: Valentin Schneider
Cc: linux-kernel, cgroups, lizefan, tj, hannes, mingo, peterz,
    vincent.guittot, Dietmar.Eggemann, morten.rasmussen, qperret, stable

On Wed, Oct 23, 2019 at 04:37:44PM +0100, Valentin Schneider wrote:
> Prevent generate_sched_domains() from returning empty cpumasks, and add
> some assertion in build_sched_domains() to scream bloody murder if it
> happens again.

Good catch. It makes sense to prune the empty domains in
generate_sched_domains() already.

> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index c52bc91f882b..c87ee6412b36 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -798,7 +798,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
>  		    cpumask_subset(cp->cpus_allowed, top_cpuset.effective_cpus))
>  			continue;
>  
> -		if (is_sched_load_balance(cp))
> +		if (is_sched_load_balance(cp) &&
> +		    !cpumask_empty(cp->effective_cpus))
>  			csa[csn++] = cp;

If I didn't overlook anything, cp->effective_cpus can contain CPUs
excluded by housekeeping_cpumask(HK_FLAG_DOMAIN) later, i.e. possibly
still returning domains with empty cpusets.

I'd suggest moving the emptiness check down into the loop where domain
cpumasks are ultimately constructed.

Michal
* Re: [PATCH v4 1/2] sched/topology: Don't try to build empty sched domains
  2019-10-31 17:23 ` Valentin Schneider

From: Valentin Schneider
To: Michal Koutný
Cc: linux-kernel, cgroups, lizefan, tj, hannes, mingo, peterz,
    vincent.guittot, Dietmar.Eggemann, morten.rasmussen, qperret, stable

Hi Michal,

On 31/10/2019 17:23, Michal Koutný wrote:
> On Wed, Oct 23, 2019 at 04:37:44PM +0100, Valentin Schneider wrote:
>> Prevent generate_sched_domains() from returning empty cpumasks, and add
>> some assertion in build_sched_domains() to scream bloody murder if it
>> happens again.
> Good catch. It makes sense to prune the empty domains in
> generate_sched_domains already.
>
> If I didn't overlook anything, cp->effective_cpus can contain CPUs
> exluded by housekeeping_cpumask(HK_FLAG_DOMAIN) later, i.e. possibly
> still returning domains with empty cpusets.
>
> I'd suggest moving the emptiness check down into the loop where domain
> cpumasks are ultimately constructed.

Ah, wasn't aware of this - thanks for having a look!

I think I need to have the check before the final cpumask gets built,
because at this point the cpumask array is already built and it's handed
off directly to the sched domain rebuild. Do you reckon the following
would work?

----8<----
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index c87ee6412b36..e4c10785dc7c 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -798,8 +798,14 @@ static int generate_sched_domains(cpumask_var_t **domains,
 		    cpumask_subset(cp->cpus_allowed, top_cpuset.effective_cpus))
 			continue;
 
+		/*
+		 * Skip cpusets that would lead to an empty sched domain.
+		 * That could be because effective_cpus is empty, or because
+		 * it's only spanning CPUs outside the housekeeping mask.
+		 */
 		if (is_sched_load_balance(cp) &&
-		    !cpumask_empty(cp->effective_cpus))
+		    cpumask_intersects(cp->effective_cpus,
+				       housekeeping_cpumask(HK_FLAG_DOMAIN)))
 			csa[csn++] = cp;
 
 		/* skip @cp's subtree if not a partition root */
* Re: [PATCH v4 1/2] sched/topology: Don't try to build empty sched domains
  2019-11-01 10:08 ` Michal Koutný

From: Michal Koutný
To: Valentin Schneider
Cc: linux-kernel, cgroups, lizefan, tj, hannes, mingo, peterz,
    vincent.guittot, Dietmar.Eggemann, morten.rasmussen, qperret, stable

On Thu, Oct 31, 2019 at 06:23:12PM +0100, Valentin Schneider wrote:
> Do you reckon the following would work?

LGTM (i.e. the cpuset will be skipped if no CPUs taking part in load
balancing remain in it after a hot(un)plug event).

Michal
* [PATCH v4 2/2] sched/topology: Allow sched_asym_cpucapacity to be disabled
  2019-10-23 15:37 ` Valentin Schneider

From: Valentin Schneider
To: linux-kernel, cgroups
Cc: lizefan, tj, hannes, mingo, peterz, vincent.guittot,
    Dietmar.Eggemann, morten.rasmussen, qperret, stable, Dietmar Eggemann

While the static key is correctly initialized as being disabled, it will
remain forever enabled once turned on. This means that if we start with an
asymmetric system and hotplug out enough CPUs to end up with an SMP system,
the static key will remain set - which is obviously wrong. We should detect
this and turn off things like misfit migration and capacity aware wakeups.

As Quentin pointed out, having separate root domains makes this slightly
trickier. We could have exclusive cpusets that create an SMP island - IOW,
the domains within this root domain will not see any asymmetry. This means
we can't just disable the key on domain destruction, we need to count how
many asymmetric root domains we have.

Consider the following example using Juno r0 which is 2+4 big.LITTLE, where
two identical cpusets are created: they both span both big and LITTLE CPUs:

  asym0    asym1
  [       ][       ]
   L  L  B  L  L  B

  cgcreate -g cpuset:asym0
  cgset -r cpuset.cpus=0,1,3 asym0
  cgset -r cpuset.mems=0 asym0
  cgset -r cpuset.cpu_exclusive=1 asym0

  cgcreate -g cpuset:asym1
  cgset -r cpuset.cpus=2,4,5 asym1
  cgset -r cpuset.mems=0 asym1
  cgset -r cpuset.cpu_exclusive=1 asym1

  cgset -r cpuset.sched_load_balance=0 .

(the CPU numbering may look odd because on the Juno LITTLEs are CPUs 0,3-5
and bigs are CPUs 1-2)

If we make one of those SMP (IOW remove asymmetry) by e.g. hotplugging its
big core, we would end up with an SMP cpuset and an asymmetric cpuset - the
static key must remain set, because we still have one asymmetric root
domain. With the above example, this could be done with:

  echo 0 > /sys/devices/system/cpu/cpu2/online

Which would result in:

  asym0    asym1
  [       ][    ]
   L  L  B  L  L

When both SMP and asymmetric cpusets are present, all CPUs will observe
sched_asym_cpucapacity being set (it is system-wide), but not all CPUs
observe asymmetry in their sched domain hierarchy:

  per_cpu(sd_asym_cpucapacity, <any CPU in asym0>) == <some SD at DIE level>
  per_cpu(sd_asym_cpucapacity, <any CPU in asym1>) == NULL

Change the simple key enablement to an increment, and decrement the key
counter when destroying domains that cover asymmetric CPUs.

Cc: <stable@vger.kernel.org>
Fixes: df054e8445a4 ("sched/topology: Add static_key for asymmetric CPU capacity optimizations")
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/sched/topology.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 2e7af755e17a..6ec1e595b1d4 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2026,7 +2026,7 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	rcu_read_unlock();
 
 	if (has_asym)
-		static_branch_enable_cpuslocked(&sched_asym_cpucapacity);
+		static_branch_inc_cpuslocked(&sched_asym_cpucapacity);
 
 	if (rq && sched_debug_enabled) {
 		pr_info("root domain span: %*pbl (max cpu_capacity = %lu)\n",
@@ -2121,9 +2121,12 @@ int sched_init_domains(const struct cpumask *cpu_map)
  */
 static void detach_destroy_domains(const struct cpumask *cpu_map)
 {
+	unsigned int cpu = cpumask_any(cpu_map);
 	int i;
 
+	if (rcu_access_pointer(per_cpu(sd_asym_cpucapacity, cpu)))
+		static_branch_dec_cpuslocked(&sched_asym_cpucapacity);
+
 	rcu_read_lock();
 	for_each_cpu(i, cpu_map)
 		cpu_attach_domain(NULL, &def_root_domain, i);
-- 
2.22.0
* [tip: sched/urgent] sched/topology: Allow sched_asym_cpucapacity to be disabled
  2019-10-29  9:52 ` tip-bot2 for Valentin Schneider

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     e284df705cf1eeedb5ec3a66ed82d17a64659150
Gitweb:        https://git.kernel.org/tip/e284df705cf1eeedb5ec3a66ed82d17a64659150
Author:        Valentin Schneider <valentin.schneider@arm.com>
AuthorDate:    Wed, 23 Oct 2019 16:37:45 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 29 Oct 2019 09:58:46 +01:00

sched/topology: Allow sched_asym_cpucapacity to be disabled

While the static key is correctly initialized as being disabled, it will
remain forever enabled once turned on. This means that if we start with an
asymmetric system and hotplug out enough CPUs to end up with an SMP system,
the static key will remain set - which is obviously wrong. We should detect
this and turn off things like misfit migration and capacity aware wakeups.

As Quentin pointed out, having separate root domains makes this slightly
trickier. We could have exclusive cpusets that create an SMP island - IOW,
the domains within this root domain will not see any asymmetry. This means
we can't just disable the key on domain destruction, we need to count how
many asymmetric root domains we have.

Consider the following example using Juno r0 which is 2+4 big.LITTLE, where
two identical cpusets are created: they both span both big and LITTLE CPUs:

  asym0    asym1
  [       ][       ]
   L  L  B  L  L  B

  $ cgcreate -g cpuset:asym0
  $ cgset -r cpuset.cpus=0,1,3 asym0
  $ cgset -r cpuset.mems=0 asym0
  $ cgset -r cpuset.cpu_exclusive=1 asym0

  $ cgcreate -g cpuset:asym1
  $ cgset -r cpuset.cpus=2,4,5 asym1
  $ cgset -r cpuset.mems=0 asym1
  $ cgset -r cpuset.cpu_exclusive=1 asym1

  $ cgset -r cpuset.sched_load_balance=0 .

(the CPU numbering may look odd because on the Juno LITTLEs are CPUs 0,3-5
and bigs are CPUs 1-2)

If we make one of those SMP (IOW remove asymmetry) by e.g. hotplugging its
big core, we would end up with an SMP cpuset and an asymmetric cpuset - the
static key must remain set, because we still have one asymmetric root
domain. With the above example, this could be done with:

  $ echo 0 > /sys/devices/system/cpu/cpu2/online

Which would result in:

  asym0    asym1
  [       ][    ]
   L  L  B  L  L

When both SMP and asymmetric cpusets are present, all CPUs will observe
sched_asym_cpucapacity being set (it is system-wide), but not all CPUs
observe asymmetry in their sched domain hierarchy:

  per_cpu(sd_asym_cpucapacity, <any CPU in asym0>) == <some SD at DIE level>
  per_cpu(sd_asym_cpucapacity, <any CPU in asym1>) == NULL

Change the simple key enablement to an increment, and decrement the key
counter when destroying domains that cover asymmetric CPUs.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Dietmar.Eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hannes@cmpxchg.org
Cc: lizefan@huawei.com
Cc: morten.rasmussen@arm.com
Cc: qperret@google.com
Cc: tj@kernel.org
Cc: vincent.guittot@linaro.org
Fixes: df054e8445a4 ("sched/topology: Add static_key for asymmetric CPU capacity optimizations")
Link: https://lkml.kernel.org/r/20191023153745.19515-3-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/topology.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 9318acf..49b835f 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2029,7 +2029,7 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	rcu_read_unlock();
 
 	if (has_asym)
-		static_branch_enable_cpuslocked(&sched_asym_cpucapacity);
+		static_branch_inc_cpuslocked(&sched_asym_cpucapacity);
 
 	if (rq && sched_debug_enabled) {
 		pr_info("root domain span: %*pbl (max cpu_capacity = %lu)\n",
@@ -2124,8 +2124,12 @@ int sched_init_domains(const struct cpumask *cpu_map)
  */
 static void detach_destroy_domains(const struct cpumask *cpu_map)
 {
+	unsigned int cpu = cpumask_any(cpu_map);
 	int i;
 
+	if (rcu_access_pointer(per_cpu(sd_asym_cpucapacity, cpu)))
+		static_branch_dec_cpuslocked(&sched_asym_cpucapacity);
+
 	rcu_read_lock();
 	for_each_cpu(i, cpu_map)
 		cpu_attach_domain(NULL, &def_root_domain, i);