* [PATCH] powerpc/pseries: Perform full re-add of CPU for topology update post-migration
@ 2018-10-29 18:43 Nathan Fontenot
2019-01-28 15:41 ` Michael Bringmann
2019-02-08 13:02 ` Michael Ellerman
0 siblings, 2 replies; 6+ messages in thread
From: Nathan Fontenot @ 2018-10-29 18:43 UTC
To: linuxppc-dev; +Cc: ldufour
On pseries systems, performing a partition migration can result in
altering the nodes a CPU is assigned to on the destination system. For
example, pre-migration on the source system CPUs are in nodes 1 and 3,
post-migration on the destination system CPUs are in nodes 2 and 3.

Handling the node change for a CPU can cause corruption in the slab
cache if we hit a timing where a CPU's node is changed while cache_reap()
is invoked. The corruption occurs because the slab cache code appears
to rely on the CPU and slab cache pages being on the same node.

The current dynamic updating of a CPU's node done in arch/powerpc/mm/numa.c
does not prevent us from hitting this scenario.

Changing the device tree property update notification handler that
recognizes an affinity change for a CPU to do a full DLPAR remove and
add of the CPU instead of dynamically changing its node resolves this
issue.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
---
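Not part of the patch: a minimal userspace sketch of the remove-then-add
sequence that dlpar_cpu_readd() performs. The names struct model_cpu,
model_read_drc_index(), model_cpu_remove() and model_cpu_add() are
hypothetical stand-ins for get_cpu_device(), of_property_read_u32(),
dlpar_cpu_remove_by_index() and dlpar_cpu_add(); none of them exist in the
kernel. Unlike the function as posted, the sketch returns early when the CPU
lookup or the "ibm,my-drc-index" read fails. Whether the kernel function
needs those checks is an assumption, not something this thread settles.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a CPU; not the kernel's struct device. */
struct model_cpu {
	bool present;		/* CPU is currently added */
	bool has_drc_index;	/* models a readable "ibm,my-drc-index" */
	uint32_t drc_index;
	int node;		/* node recorded when the CPU was last added */
	int fw_node;		/* node reported by firmware after migration */
};

/* Stand-in for the property read: 0 on success, -EINVAL if missing. */
static int model_read_drc_index(const struct model_cpu *cpu, uint32_t *out)
{
	if (!cpu->has_drc_index)
		return -EINVAL;
	*out = cpu->drc_index;
	return 0;
}

/* Stand-in for DLPAR remove by index. */
static int model_cpu_remove(struct model_cpu *cpu, uint32_t drc_index)
{
	if (!cpu->present || cpu->drc_index != drc_index)
		return -ENODEV;
	cpu->present = false;
	return 0;
}

/* Stand-in for DLPAR add: the node is taken fresh from the firmware view. */
static int model_cpu_add(struct model_cpu *cpu, uint32_t drc_index)
{
	if (cpu->present || cpu->drc_index != drc_index)
		return -EEXIST;
	cpu->node = cpu->fw_node;
	cpu->present = true;
	return 0;
}

/* Remove the CPU, then add it back; stop at the first failing step. */
int model_cpu_readd(struct model_cpu *cpu)
{
	uint32_t drc_index;
	int rc;

	if (cpu == NULL)
		return -ENODEV;

	rc = model_read_drc_index(cpu, &drc_index);
	if (rc)
		return rc;

	rc = model_cpu_remove(cpu, drc_index);
	if (rc)
		return rc;

	return model_cpu_add(cpu, drc_index);
}
```

In this model the node is only ever written while the CPU is absent, which
is the property the full remove and add is meant to provide: a present CPU
never changes node underneath its users.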
arch/powerpc/include/asm/topology.h | 2 ++
arch/powerpc/mm/numa.c | 9 +--------
arch/powerpc/platforms/pseries/hotplug-cpu.c | 19 +++++++++++++++++++
3 files changed, 22 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index a4a718dbfec6..f85e2b01c3df 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -132,6 +132,8 @@ static inline void shared_proc_topology_init(void) {}
 #define topology_sibling_cpumask(cpu)	(per_cpu(cpu_sibling_map, cpu))
 #define topology_core_cpumask(cpu)	(per_cpu(cpu_core_map, cpu))
 #define topology_core_id(cpu)		(cpu_to_core_id(cpu))
+
+int dlpar_cpu_readd(int cpu);
 #endif
 #endif

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 693ae1c1acba..bb6a7b56bef7 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1461,13 +1461,6 @@ static void reset_topology_timer(void)

 #ifdef CONFIG_SMP

-static void stage_topology_update(int core_id)
-{
-	cpumask_or(&cpu_associativity_changes_mask,
-		&cpu_associativity_changes_mask, cpu_sibling_mask(core_id));
-	reset_topology_timer();
-}
-
 static int dt_update_callback(struct notifier_block *nb,
 				unsigned long action, void *data)
 {
@@ -1480,7 +1473,7 @@ static int dt_update_callback(struct notifier_block *nb,
 		    !of_prop_cmp(update->prop->name, "ibm,associativity")) {
 			u32 core_id;
 			of_property_read_u32(update->dn, "reg", &core_id);
-			stage_topology_update(core_id);
+			rc = dlpar_cpu_readd(core_id);
 			rc = NOTIFY_OK;
 		}
 		break;
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 2f8e62163602..97feb6e79f1a 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -802,6 +802,25 @@ static int dlpar_cpu_add_by_count(u32 cpus_to_add)
 	return rc;
 }

+int dlpar_cpu_readd(int cpu)
+{
+	struct device_node *dn;
+	struct device *dev;
+	u32 drc_index;
+	int rc;
+
+	dev = get_cpu_device(cpu);
+	dn = dev->of_node;
+
+	rc = of_property_read_u32(dn, "ibm,my-drc-index", &drc_index);
+
+	rc = dlpar_cpu_remove_by_index(drc_index);
+	if (!rc)
+		rc = dlpar_cpu_add(drc_index);
+
+	return rc;
+}
+
 int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
 {
 	u32 count, drc_index;
* Re: [PATCH] powerpc/pseries: Perform full re-add of CPU for topology update post-migration
2018-10-29 18:43 [PATCH] powerpc/pseries: Perform full re-add of CPU for topology update post-migration Nathan Fontenot
@ 2019-01-28 15:41 ` Michael Bringmann
2019-01-29 9:37 ` Michael Ellerman
2019-02-08 13:02 ` Michael Ellerman
1 sibling, 1 reply; 6+ messages in thread
From: Michael Bringmann @ 2019-01-28 15:41 UTC
To: Nathan Fontenot, linuxppc-dev; +Cc: ldufour
On 10/29/18 1:43 PM, Nathan Fontenot wrote:
> On pseries systems, performing a partition migration can result in
> altering the nodes a CPU is assigned to on the destination system. For
> example, pre-migration on the source system CPUs are in nodes 1 and 3,
> post-migration on the destination system CPUs are in nodes 2 and 3.
>
> Handling the node change for a CPU can cause corruption in the slab
> cache if we hit a timing where a CPU's node is changed while cache_reap()
> is invoked. The corruption occurs because the slab cache code appears
> to rely on the CPU and slab cache pages being on the same node.
>
> The current dynamic updating of a CPU's node done in arch/powerpc/mm/numa.c
> does not prevent us from hitting this scenario.
>
> Changing the device tree property update notification handler that
> recognizes an affinity change for a CPU to do a full DLPAR remove and
> add of the CPU instead of dynamically changing its node resolves this
> issue.
>
> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/topology.h | 2 ++
> arch/powerpc/mm/numa.c | 9 +--------
> arch/powerpc/platforms/pseries/hotplug-cpu.c | 19 +++++++++++++++++++
> 3 files changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index a4a718dbfec6..f85e2b01c3df 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -132,6 +132,8 @@ static inline void shared_proc_topology_init(void) {}
> #define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
> #define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
> #define topology_core_id(cpu) (cpu_to_core_id(cpu))
> +
> +int dlpar_cpu_readd(int cpu);
> #endif
> #endif
>
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 693ae1c1acba..bb6a7b56bef7 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1461,13 +1461,6 @@ static void reset_topology_timer(void)
>
> #ifdef CONFIG_SMP
>
> -static void stage_topology_update(int core_id)
> -{
> - cpumask_or(&cpu_associativity_changes_mask,
> - &cpu_associativity_changes_mask, cpu_sibling_mask(core_id));
> - reset_topology_timer();
> -}
> -
> static int dt_update_callback(struct notifier_block *nb,
> unsigned long action, void *data)
> {
> @@ -1480,7 +1473,7 @@ static int dt_update_callback(struct notifier_block *nb,
> !of_prop_cmp(update->prop->name, "ibm,associativity")) {
> u32 core_id;
> of_property_read_u32(update->dn, "reg", &core_id);
> - stage_topology_update(core_id);
> + rc = dlpar_cpu_readd(core_id);
> rc = NOTIFY_OK;
> }
> break;
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index 2f8e62163602..97feb6e79f1a 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -802,6 +802,25 @@ static int dlpar_cpu_add_by_count(u32 cpus_to_add)
> return rc;
> }
>
> +int dlpar_cpu_readd(int cpu)
> +{
> + struct device_node *dn;
> + struct device *dev;
> + u32 drc_index;
> + int rc;
> +
> + dev = get_cpu_device(cpu);
> + dn = dev->of_node;
> +
> + rc = of_property_read_u32(dn, "ibm,my-drc-index", &drc_index);
> +
> + rc = dlpar_cpu_remove_by_index(drc_index);
> + if (!rc)
> + rc = dlpar_cpu_add(drc_index);
> +
> + return rc;
> +}
> +
> int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
> {
> u32 count, drc_index;
>
>
--
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line 363-5196
External: (512) 286-5196
Cell: (512) 466-0650
mwb@linux.vnet.ibm.com
* Re: [PATCH] powerpc/pseries: Perform full re-add of CPU for topology update post-migration
2019-01-28 15:41 ` Michael Bringmann
@ 2019-01-29 9:37 ` Michael Ellerman
2019-01-29 16:12 ` Michael Bringmann
0 siblings, 1 reply; 6+ messages in thread
From: Michael Ellerman @ 2019-01-29 9:37 UTC
To: Michael Bringmann, Nathan Fontenot, linuxppc-dev; +Cc: ldufour
Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
> On 10/29/18 1:43 PM, Nathan Fontenot wrote:
>> On pseries systems, performing a partition migration can result in
>> altering the nodes a CPU is assigned to on the destination system. For
>> example, pre-migration on the source system CPUs are in nodes 1 and 3,
>> post-migration on the destination system CPUs are in nodes 2 and 3.
>>
>> Handling the node change for a CPU can cause corruption in the slab
>> cache if we hit a timing where a CPU's node is changed while cache_reap()
>> is invoked. The corruption occurs because the slab cache code appears
>> to rely on the CPU and slab cache pages being on the same node.
>>
>> The current dynamic updating of a CPU's node done in arch/powerpc/mm/numa.c
>> does not prevent us from hitting this scenario.
>>
>> Changing the device tree property update notification handler that
>> recognizes an affinity change for a CPU to do a full DLPAR remove and
>> add of the CPU instead of dynamically changing its node resolves this
>> issue.
>>
>> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> Signed-off-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
Are you sure that's what you meant? ie. you wrote some of the patch?
What I'd like is to get a Tested-by from you.
cheers
* Re: [PATCH] powerpc/pseries: Perform full re-add of CPU for topology update post-migration
2019-01-29 9:37 ` Michael Ellerman
@ 2019-01-29 16:12 ` Michael Bringmann
2019-01-30 12:28 ` Michael Ellerman
0 siblings, 1 reply; 6+ messages in thread
From: Michael Bringmann @ 2019-01-29 16:12 UTC
To: Michael Ellerman, Nathan Fontenot, linuxppc-dev; +Cc: ldufour
On 1/29/19 3:37 AM, Michael Ellerman wrote:
> Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
>
>> On 10/29/18 1:43 PM, Nathan Fontenot wrote:
>>> On pseries systems, performing a partition migration can result in
>>> altering the nodes a CPU is assigned to on the destination system. For
>>> example, pre-migration on the source system CPUs are in nodes 1 and 3,
>>> post-migration on the destination system CPUs are in nodes 2 and 3.
>>>
>>> Handling the node change for a CPU can cause corruption in the slab
>>> cache if we hit a timing where a CPU's node is changed while cache_reap()
>>> is invoked. The corruption occurs because the slab cache code appears
>>> to rely on the CPU and slab cache pages being on the same node.
>>>
>>> The current dynamic updating of a CPU's node done in arch/powerpc/mm/numa.c
>>> does not prevent us from hitting this scenario.
>>>
>>> Changing the device tree property update notification handler that
>>> recognizes an affinity change for a CPU to do a full DLPAR remove and
>>> add of the CPU instead of dynamically changing its node resolves this
>>> issue.
>>>
>>> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>> Signed-off-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
Tested-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
>
> Are you sure that's what you meant? ie. you wrote some of the patch?
>
> What I'd like is to get a Tested-by from you.
>
> cheers
>
>
--
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line 363-5196
External: (512) 286-5196
Cell: (512) 466-0650
mwb@linux.vnet.ibm.com
* Re: [PATCH] powerpc/pseries: Perform full re-add of CPU for topology update post-migration
2019-01-29 16:12 ` Michael Bringmann
@ 2019-01-30 12:28 ` Michael Ellerman
0 siblings, 0 replies; 6+ messages in thread
From: Michael Ellerman @ 2019-01-30 12:28 UTC
To: Michael Bringmann, Nathan Fontenot, linuxppc-dev; +Cc: ldufour
Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
> On 1/29/19 3:37 AM, Michael Ellerman wrote:
>> Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
>>
>>> On 10/29/18 1:43 PM, Nathan Fontenot wrote:
>>>> On pseries systems, performing a partition migration can result in
>>>> altering the nodes a CPU is assigned to on the destination system. For
>>>> example, pre-migration on the source system CPUs are in nodes 1 and 3,
>>>> post-migration on the destination system CPUs are in nodes 2 and 3.
>>>>
>>>> Handling the node change for a CPU can cause corruption in the slab
>>>> cache if we hit a timing where a CPU's node is changed while cache_reap()
>>>> is invoked. The corruption occurs because the slab cache code appears
>>>> to rely on the CPU and slab cache pages being on the same node.
>>>>
>>>> The current dynamic updating of a CPU's node done in arch/powerpc/mm/numa.c
>>>> does not prevent us from hitting this scenario.
>>>>
>>>> Changing the device tree property update notification handler that
>>>> recognizes an affinity change for a CPU to do a full DLPAR remove and
>>>> add of the CPU instead of dynamically changing its node resolves this
>>>> issue.
>>>>
>>>> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>> Signed-off-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
>
> Tested-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
Thanks.
cheers
* Re: powerpc/pseries: Perform full re-add of CPU for topology update post-migration
2018-10-29 18:43 [PATCH] powerpc/pseries: Perform full re-add of CPU for topology update post-migration Nathan Fontenot
2019-01-28 15:41 ` Michael Bringmann
@ 2019-02-08 13:02 ` Michael Ellerman
1 sibling, 0 replies; 6+ messages in thread
From: Michael Ellerman @ 2019-02-08 13:02 UTC
To: Nathan Fontenot, linuxppc-dev; +Cc: ldufour
On Mon, 2018-10-29 at 18:43:36 UTC, Nathan Fontenot wrote:
> On pseries systems, performing a partition migration can result in
> altering the nodes a CPU is assigned to on the destination system. For
> example, pre-migration on the source system CPUs are in nodes 1 and 3,
> post-migration on the destination system CPUs are in nodes 2 and 3.
>
> Handling the node change for a CPU can cause corruption in the slab
> cache if we hit a timing where a CPU's node is changed while cache_reap()
> is invoked. The corruption occurs because the slab cache code appears
> to rely on the CPU and slab cache pages being on the same node.
>
> The current dynamic updating of a CPU's node done in arch/powerpc/mm/numa.c
> does not prevent us from hitting this scenario.
>
> Changing the device tree property update notification handler that
> recognizes an affinity change for a CPU to do a full DLPAR remove and
> add of the CPU instead of dynamically changing its node resolves this
> issue.
>
> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> Signed-off-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
> Tested-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/81b61324922c67f73813d8a9c175f3c1
cheers