* [PATCH 0/2] Update cpu_cpu_mask on CPU online/offline
@ 2021-07-01 9:07 Srikar Dronamraju
2021-07-01 9:07 ` [PATCH 1/2] powerpc/numa: Print debug statements only when required Srikar Dronamraju
2021-07-01 9:07 ` [PATCH 2/2] powerpc/numa: Update cpu_cpu_map on CPU online/offline Srikar Dronamraju
0 siblings, 2 replies; 3+ messages in thread
From: Srikar Dronamraju @ 2021-07-01 9:07 UTC (permalink / raw)
To: Michael Ellerman
Cc: Nathan Lynch, Gautham R Shenoy, Vincent Guittot,
Srikar Dronamraju, Peter Zijlstra, Geetika Moolchandani,
Ingo Molnar, Laurent Dufour, linuxppc-dev, Valentin Schneider
When simultaneously running CPU online/offline with CPU add/remove in a
loop, we see a WARNING messages.
WARNING: CPU: 13 PID: 1142 at kernel/sched/topology.c:898 build_sched_domains+0xd48/0x1720
Modules linked in: rpadlpar_io rpaphp mptcp_diag xsk_diag tcp_diag udp_diag
raw_diag inet_diag unix_diag af_packet_diag netlink_diag bonding tls
nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set rfkill nf_tables nfnetlink
pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs
libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc scsi_transport_fc ibmveth
dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse
CPU: 13 PID: 1142 Comm: kworker/13:2 Not tainted 5.13.0-rc6+ #28
Workqueue: events cpuset_hotplug_workfn
NIP: c0000000001caac8 LR: c0000000001caac4 CTR: 00000000007088ec
REGS: c00000005596f220 TRAP: 0700 Not tainted (5.13.0-rc6+)
MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 48828222 XER: 00000009
CFAR: c0000000001ea698 IRQMASK: 0
GPR00: c0000000001caac4 c00000005596f4c0 c000000001c4a400 0000000000000036
GPR04: 00000000fffdffff c00000005596f1d0 0000000000000027 c0000018cfd07f90
GPR08: 0000000000000023 0000000000000001 0000000000000027 c0000018fe68ffe8
GPR12: 0000000000008000 c00000001e9d1880 c00000013a047200 0000000000000800
GPR16: c000000001d3c7d0 0000000000000240 0000000000000048 c000000010aacd18
GPR20: 0000000000000001 c000000010aacc18 c00000013a047c00 c000000139ec2400
GPR24: 0000000000000280 c000000139ec2520 c000000136c1b400 c000000001c93060
GPR28: c00000013a047c20 c000000001d3c6c0 c000000001c978a0 000000000000000d
NIP [c0000000001caac8] build_sched_domains+0xd48/0x1720
LR [c0000000001caac4] build_sched_domains+0xd44/0x1720
Call Trace:
[c00000005596f4c0] [c0000000001caac4] build_sched_domains+0xd44/0x1720 (unreliable)
[c00000005596f670] [c0000000001cc5ec] partition_sched_domains_locked+0x3ac/0x4b0
[c00000005596f710] [c0000000002804e4] rebuild_sched_domains_locked+0x404/0x9e0
[c00000005596f810] [c000000000283e60] rebuild_sched_domains+0x40/0x70
[c00000005596f840] [c000000000284124] cpuset_hotplug_workfn+0x294/0xf10
[c00000005596fc60] [c000000000175040] process_one_work+0x290/0x590
[c00000005596fd00] [c0000000001753c8] worker_thread+0x88/0x620
[c00000005596fda0] [c000000000181704] kthread+0x194/0x1a0
[c00000005596fe10] [c00000000000ccec] ret_from_kernel_thread+0x5c/0x70
Instruction dump:
485af049 60000000 2fa30800 409e0028 80fe0000 e89a00f8 e86100e8 38da0120
7f88e378 7ce53b78 4801fb91 60000000 <0fe00000> 39000000 38e00000 38c00000
This was because cpu_cpu_mask() was not getting updated on CPU
online/offline but would be only updated when add/remove of CPUs.
Other cpumasks get updated both on CPU online/offline and add/remove
Update cpu_cpu_mask() on CPU online/offline too.
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Srikar Dronamraju (2):
powerpc/numa: Print debug statements only when required
powerpc/numa: Update cpu_cpu_map on CPU online/offline
arch/powerpc/include/asm/topology.h | 12 ++++++++++++
arch/powerpc/kernel/smp.c | 3 +++
arch/powerpc/mm/numa.c | 18 +++++++-----------
3 files changed, 22 insertions(+), 11 deletions(-)
--
2.27.0
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH 1/2] powerpc/numa: Print debug statements only when required
2021-07-01 9:07 [PATCH 0/2] Update cpu_cpu_mask on CPU online/offline Srikar Dronamraju
@ 2021-07-01 9:07 ` Srikar Dronamraju
2021-07-01 9:07 ` [PATCH 2/2] powerpc/numa: Update cpu_cpu_map on CPU online/offline Srikar Dronamraju
1 sibling, 0 replies; 3+ messages in thread
From: Srikar Dronamraju @ 2021-07-01 9:07 UTC (permalink / raw)
To: Michael Ellerman
Cc: Nathan Lynch, Gautham R Shenoy, Vincent Guittot,
Srikar Dronamraju, Peter Zijlstra, Geetika Moolchandani,
Ingo Molnar, Laurent Dufour, linuxppc-dev, Valentin Schneider
Currently, a debug message gets printed every time an attempt to
add(remove) a CPU. However this is redundant if the CPU is already added
(removed) from the node.
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
arch/powerpc/mm/numa.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 6d0d89127190..f68dbe4e982c 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -141,10 +141,11 @@ static void map_cpu_to_node(int cpu, int node)
{
update_numa_cpu_lookup_table(cpu, node);
- dbg("adding cpu %d to node %d\n", cpu, node);
- if (!(cpumask_test_cpu(cpu, node_to_cpumask_map[node])))
+ if (!(cpumask_test_cpu(cpu, node_to_cpumask_map[node]))) {
+ dbg("adding cpu %d to node %d\n", cpu, node);
cpumask_set_cpu(cpu, node_to_cpumask_map[node]);
+ }
}
#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_PPC_SPLPAR)
@@ -152,13 +153,11 @@ static void unmap_cpu_from_node(unsigned long cpu)
{
int node = numa_cpu_lookup_table[cpu];
- dbg("removing cpu %lu from node %d\n", cpu, node);
-
if (cpumask_test_cpu(cpu, node_to_cpumask_map[node])) {
cpumask_clear_cpu(cpu, node_to_cpumask_map[node]);
+ dbg("removing cpu %lu from node %d\n", cpu, node);
} else {
- printk(KERN_ERR "WARNING: cpu %lu not found in node %d\n",
- cpu, node);
+ pr_err("WARNING: cpu %lu not found in node %d\n", cpu, node);
}
}
#endif /* CONFIG_HOTPLUG_CPU || CONFIG_PPC_SPLPAR */
--
2.27.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH 2/2] powerpc/numa: Update cpu_cpu_map on CPU online/offline
2021-07-01 9:07 [PATCH 0/2] Update cpu_cpu_mask on CPU online/offline Srikar Dronamraju
2021-07-01 9:07 ` [PATCH 1/2] powerpc/numa: Print debug statements only when required Srikar Dronamraju
@ 2021-07-01 9:07 ` Srikar Dronamraju
1 sibling, 0 replies; 3+ messages in thread
From: Srikar Dronamraju @ 2021-07-01 9:07 UTC (permalink / raw)
To: Michael Ellerman
Cc: Nathan Lynch, Gautham R Shenoy, Vincent Guittot,
Srikar Dronamraju, Peter Zijlstra, Geetika Moolchandani,
Ingo Molnar, Laurent Dufour, linuxppc-dev, Valentin Schneider
cpu_cpu_map holds all the CPUs in the DIE. However in PowerPC, when
onlining/offlining of CPUs, this mask doesn't get updated. This mask
is however updated when CPUs are added/removed. So when both
operations like online/offline of CPUs and adding/removing of CPUs are
done simultaneously, then cpumaps end up broken.
WARNING: CPU: 13 PID: 1142 at kernel/sched/topology.c:898
build_sched_domains+0xd48/0x1720
Modules linked in: rpadlpar_io rpaphp mptcp_diag xsk_diag tcp_diag
udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag
bonding tls nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib
nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct
nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set
rfkill nf_tables nfnetlink pseries_rng xts vmx_crypto uio_pdrv_genirq
uio binfmt_misc ip_tables xfs libcrc32c dm_service_time sd_mod t10_pi sg
ibmvfc scsi_transport_fc ibmveth dm_multipath dm_mirror dm_region_hash
dm_log dm_mod fuse
CPU: 13 PID: 1142 Comm: kworker/13:2 Not tainted 5.13.0-rc6+ #28
Workqueue: events cpuset_hotplug_workfn
NIP: c0000000001caac8 LR: c0000000001caac4 CTR: 00000000007088ec
REGS: c00000005596f220 TRAP: 0700 Not tainted (5.13.0-rc6+)
MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 48828222 XER:
00000009
CFAR: c0000000001ea698 IRQMASK: 0
GPR00: c0000000001caac4 c00000005596f4c0 c000000001c4a400 0000000000000036
GPR04: 00000000fffdffff c00000005596f1d0 0000000000000027 c0000018cfd07f90
GPR08: 0000000000000023 0000000000000001 0000000000000027 c0000018fe68ffe8
GPR12: 0000000000008000 c00000001e9d1880 c00000013a047200 0000000000000800
GPR16: c000000001d3c7d0 0000000000000240 0000000000000048 c000000010aacd18
GPR20: 0000000000000001 c000000010aacc18 c00000013a047c00 c000000139ec2400
GPR24: 0000000000000280 c000000139ec2520 c000000136c1b400 c000000001c93060
GPR28: c00000013a047c20 c000000001d3c6c0 c000000001c978a0 000000000000000d
NIP [c0000000001caac8] build_sched_domains+0xd48/0x1720
LR [c0000000001caac4] build_sched_domains+0xd44/0x1720
Call Trace:
[c00000005596f4c0] [c0000000001caac4] build_sched_domains+0xd44/0x1720 (unreliable)
[c00000005596f670] [c0000000001cc5ec] partition_sched_domains_locked+0x3ac/0x4b0
[c00000005596f710] [c0000000002804e4] rebuild_sched_domains_locked+0x404/0x9e0
[c00000005596f810] [c000000000283e60] rebuild_sched_domains+0x40/0x70
[c00000005596f840] [c000000000284124] cpuset_hotplug_workfn+0x294/0xf10
[c00000005596fc60] [c000000000175040] process_one_work+0x290/0x590
[c00000005596fd00] [c0000000001753c8] worker_thread+0x88/0x620
[c00000005596fda0] [c000000000181704] kthread+0x194/0x1a0
[c00000005596fe10] [c00000000000ccec] ret_from_kernel_thread+0x5c/0x70
Instruction dump:
485af049 60000000 2fa30800 409e0028 80fe0000 e89a00f8 e86100e8 38da0120
7f88e378 7ce53b78 4801fb91 60000000 <0fe00000> 39000000 38e00000 38c00000
Fix this by updating cpu_cpu_map aka cpumask_of_node() on every CPU
online/offline.
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/topology.h | 12 ++++++++++++
arch/powerpc/kernel/smp.c | 3 +++
arch/powerpc/mm/numa.c | 7 ++-----
3 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index e4db64c0e184..2f0a4d7b95f6 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -65,6 +65,11 @@ static inline int early_cpu_to_node(int cpu)
int of_drconf_to_nid_single(struct drmem_lmb *lmb);
+extern void map_cpu_to_node(int cpu, int node);
+#ifdef CONFIG_HOTPLUG_CPU
+extern void unmap_cpu_from_node(unsigned long cpu);
+#endif /* CONFIG_HOTPLUG_CPU */
+
#else
static inline int early_cpu_to_node(int cpu) { return 0; }
@@ -93,6 +98,13 @@ static inline int of_drconf_to_nid_single(struct drmem_lmb *lmb)
return first_online_node;
}
+#ifdef CONFIG_SMP
+static inline void map_cpu_to_node(int cpu, int node) {}
+#ifdef CONFIG_HOTPLUG_CPU
+static inline void unmap_cpu_from_node(unsigned long cpu) {}
+#endif /* CONFIG_HOTPLUG_CPU */
+#endif /* CONFIG_SMP */
+
#endif /* CONFIG_NUMA */
#if defined(CONFIG_NUMA) && defined(CONFIG_PPC_SPLPAR)
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 6c6e4d934d86..e562cca13d66 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1407,6 +1407,8 @@ static void remove_cpu_from_masks(int cpu)
struct cpumask *(*mask_fn)(int) = cpu_sibling_mask;
int i;
+ unmap_cpu_from_node(cpu);
+
if (shared_caches)
mask_fn = cpu_l2_cache_mask;
@@ -1491,6 +1493,7 @@ static void add_cpu_to_masks(int cpu)
* This CPU will not be in the online mask yet so we need to manually
* add it to it's own thread sibling mask.
*/
+ map_cpu_to_node(cpu, cpu_to_node(cpu));
cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
for (i = first_thread; i < first_thread + threads_per_core; i++)
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index f68dbe4e982c..7eb5252d274e 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -137,7 +137,7 @@ static void reset_numa_cpu_lookup_table(void)
numa_cpu_lookup_table[cpu] = -1;
}
-static void map_cpu_to_node(int cpu, int node)
+void map_cpu_to_node(int cpu, int node)
{
update_numa_cpu_lookup_table(cpu, node);
@@ -149,7 +149,7 @@ static void map_cpu_to_node(int cpu, int node)
}
#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_PPC_SPLPAR)
-static void unmap_cpu_from_node(unsigned long cpu)
+void unmap_cpu_from_node(unsigned long cpu)
{
int node = numa_cpu_lookup_table[cpu];
@@ -597,9 +597,6 @@ static int ppc_numa_cpu_prepare(unsigned int cpu)
static int ppc_numa_cpu_dead(unsigned int cpu)
{
-#ifdef CONFIG_HOTPLUG_CPU
- unmap_cpu_from_node(cpu);
-#endif
return 0;
}
--
2.27.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2021-07-01 9:09 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-01 9:07 [PATCH 0/2] Update cpu_cpu_mask on CPU online/offline Srikar Dronamraju
2021-07-01 9:07 ` [PATCH 1/2] powerpc/numa: Print debug statements only when required Srikar Dronamraju
2021-07-01 9:07 ` [PATCH 2/2] powerpc/numa: Update cpu_cpu_map on CPU online/offline Srikar Dronamraju
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.