* [PATCH 0/2] disable NUMA affinity reassignments at runtime
@ 2019-04-18 18:56 Nathan Lynch
  2019-04-18 18:56 ` [PATCH 1/2] powerpc/numa: improve control of topology updates Nathan Lynch
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Nathan Lynch @ 2019-04-18 18:56 UTC
  To: linuxppc-dev; +Cc: srikar.dronamraju, mmc, mwb, julietk

Changing cpu <-> node relationships at runtime, as the pseries
platform code attempts to do for LPM, PRRN, and VPHN, is essentially
unsupported by core subsystems. [1]

While more significant changes (i.e. discarding all that code) are
likely in store, these patches are a minimally invasive way to disable
the problematic behavior. They should be suitable for backporting to
-stable and distros, and they improve on the current situation.

Note: this doesn't affect use of VPHN at boot time for detecting
shared processor node assignments. Only runtime VPHN-initiated
reassignments are disabled.

[1] E.g. see the discussion here:
    https://lore.kernel.org/lkml/20180831115350.GC8437@linux.vnet.ibm.com/T/#u
 
Nathan Lynch (2):
  powerpc/numa: improve control of topology updates
  powerpc/numa: document topology_updates_enabled, disable by default

 arch/powerpc/mm/numa.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

-- 
2.20.1



* [PATCH 1/2] powerpc/numa: improve control of topology updates
  2019-04-18 18:56 [PATCH 0/2] disable NUMA affinity reassignments at runtime Nathan Lynch
@ 2019-04-18 18:56 ` Nathan Lynch
  2019-04-21 14:19   ` [1/2] " Michael Ellerman
  2019-04-18 18:56 ` [PATCH 2/2] powerpc/numa: document topology_updates_enabled, disable by default Nathan Lynch
  2019-04-18 20:30 ` [PATCH 0/2] disable NUMA affinity reassignments at runtime Michal Suchánek
  2 siblings, 1 reply; 6+ messages in thread
From: Nathan Lynch @ 2019-04-18 18:56 UTC
  To: linuxppc-dev; +Cc: srikar.dronamraju, mmc, mwb, julietk

When booted with "topology_updates=off", or when "off" is written to
/proc/powerpc/topology_updates, NUMA reassignments are inhibited for
PRRN and VPHN events. However, migration and suspend unconditionally
re-enable reassignments via start_topology_update(). This is
incoherent.

Check the topology_updates_enabled flag in
start/stop_topology_update() so that callers of those APIs need not be
aware of whether reassignments are enabled. This allows the
administrative decision on reassignments to remain in force across
migrations and suspensions.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
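Not part of the patch: a minimal userspace sketch of the gating
described above, for illustration only. The names mirror numa.c, but
the bodies are stand-ins (a single prrn_enabled flag models all of the
PRRN/VPHN state) and the -1 error return is an assumption replacing
-EINVAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace model only; nothing here is kernel code. */
static bool topology_updates_enabled = true;	/* administrative switch */
static bool prrn_enabled;			/* stands in for PRRN/VPHN state */

/* Callers (migration, suspend) may call these unconditionally. */
static int start_topology_update(void)
{
	if (!topology_updates_enabled)
		return 0;	/* the administrator said no: stay off */
	prrn_enabled = true;
	return 0;
}

static int stop_topology_update(void)
{
	if (!topology_updates_enabled)
		return 0;	/* nothing was started, nothing to stop */
	prrn_enabled = false;
	return 0;
}

/* Model of topology_write(): 0 on success, -1 for unrecognized input. */
static int topology_write(const char *kbuf)
{
	assert(kbuf != NULL);

	if (!strncmp(kbuf, "on", 2)) {
		/* Set the flag first, otherwise start would be a no-op. */
		topology_updates_enabled = true;
		start_topology_update();
	} else if (!strncmp(kbuf, "off", 3)) {
		/* Stop while the flag still permits it, then clear it. */
		stop_topology_update();
		topology_updates_enabled = false;
	} else {
		return -1;
	}
	return 0;
}
```

The ordering in topology_write() matters in this model: swapping the
two statements in either branch would turn the start or stop call into
an early return.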
 arch/powerpc/mm/numa.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index f976676004ad..48c9a97eb2c3 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1498,6 +1498,9 @@ int start_topology_update(void)
 {
 	int rc = 0;
 
+	if (!topology_updates_enabled)
+		return 0;
+
 	if (firmware_has_feature(FW_FEATURE_PRRN)) {
 		if (!prrn_enabled) {
 			prrn_enabled = 1;
@@ -1531,6 +1534,9 @@ int stop_topology_update(void)
 {
 	int rc = 0;
 
+	if (!topology_updates_enabled)
+		return 0;
+
 	if (prrn_enabled) {
 		prrn_enabled = 0;
 #ifdef CONFIG_SMP
@@ -1588,11 +1594,13 @@ static ssize_t topology_write(struct file *file, const char __user *buf,
 
 	kbuf[read_len] = '\0';
 
-	if (!strncmp(kbuf, "on", 2))
+	if (!strncmp(kbuf, "on", 2)) {
+		topology_updates_enabled = true;
 		start_topology_update();
-	else if (!strncmp(kbuf, "off", 3))
+	} else if (!strncmp(kbuf, "off", 3)) {
 		stop_topology_update();
-	else
+		topology_updates_enabled = false;
+	} else
 		return -EINVAL;
 
 	return count;
@@ -1607,9 +1615,7 @@ static const struct file_operations topology_ops = {
 
 static int topology_update_init(void)
 {
-	/* Do not poll for changes if disabled at boot */
-	if (topology_updates_enabled)
-		start_topology_update();
+	start_topology_update();
 
 	if (vphn_enabled)
 		topology_schedule_update();
-- 
2.20.1



* [PATCH 2/2] powerpc/numa: document topology_updates_enabled, disable by default
  2019-04-18 18:56 [PATCH 0/2] disable NUMA affinity reassignments at runtime Nathan Lynch
  2019-04-18 18:56 ` [PATCH 1/2] powerpc/numa: improve control of topology updates Nathan Lynch
@ 2019-04-18 18:56 ` Nathan Lynch
  2019-04-18 20:30 ` [PATCH 0/2] disable NUMA affinity reassignments at runtime Michal Suchánek
  2 siblings, 0 replies; 6+ messages in thread
From: Nathan Lynch @ 2019-04-18 18:56 UTC
  To: linuxppc-dev; +Cc: srikar.dronamraju, mmc, mwb, julietk

Changing the NUMA associations for CPUs and memory at runtime is
basically unsupported by the core mm, scheduler etc. We see all manner
of crashes, warnings and instability when the pseries code tries to do
this. Disable this behavior by default, and document the switch a bit.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
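Not part of the patch: a small userspace sketch of how the boot
parameter behaves once the default is flipped, for illustration only.
It models early_topology_updates() without the pr_warn() call; the
const qualifier on the argument is an assumption made so string
literals can be passed directly.

```c
#include <stdbool.h>
#include <string.h>

/* Userspace model only; zero-initialized, so updates are off by default. */
static bool topology_updates_enabled;

static int early_topology_updates(const char *p)
{
	if (!p)
		return 0;	/* no value given: keep the default */

	if (!strcmp(p, "on"))
		topology_updates_enabled = true;

	/* Any other value, "off" included, leaves updates disabled. */
	return 0;
}
```

In this model "topology_updates=off" is accepted but has no effect,
since disabled is already the default; only an exact "on" changes the
state.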
 arch/powerpc/mm/numa.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 48c9a97eb2c3..71af382ce1d5 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -908,16 +908,22 @@ static int __init early_numa(char *p)
 }
 early_param("numa", early_numa);
 
-static bool topology_updates_enabled = true;
+/*
+ * The platform can inform us through one of several mechanisms
+ * (post-migration device tree updates, PRRN or VPHN) that the NUMA
+ * assignment of a resource has changed. This controls whether we act
+ * on that. Disabled by default.
+ */
+static bool topology_updates_enabled;
 
 static int __init early_topology_updates(char *p)
 {
 	if (!p)
 		return 0;
 
-	if (!strcmp(p, "off")) {
-		pr_info("Disabling topology updates\n");
-		topology_updates_enabled = false;
+	if (!strcmp(p, "on")) {
+		pr_warn("Caution: enabling topology updates\n");
+		topology_updates_enabled = true;
 	}
 
 	return 0;
-- 
2.20.1



* Re: [PATCH 0/2] disable NUMA affinity reassignments at runtime
  2019-04-18 18:56 [PATCH 0/2] disable NUMA affinity reassignments at runtime Nathan Lynch
  2019-04-18 18:56 ` [PATCH 1/2] powerpc/numa: improve control of topology updates Nathan Lynch
  2019-04-18 18:56 ` [PATCH 2/2] powerpc/numa: document topology_updates_enabled, disable by default Nathan Lynch
@ 2019-04-18 20:30 ` Michal Suchánek
  2019-04-18 22:37   ` Nathan Lynch
  2 siblings, 1 reply; 6+ messages in thread
From: Michal Suchánek @ 2019-04-18 20:30 UTC
  To: Nathan Lynch; +Cc: srikar.dronamraju, mmc, linuxppc-dev, mwb, julietk

On Thu, 18 Apr 2019 13:56:56 -0500
Nathan Lynch <nathanl@linux.ibm.com> wrote:

Hello,

> Changing cpu <-> node relationships at runtime, as the pseries
> platform code attempts to do for LPM, PRRN, and VPHN is essentially
> unsupported by core subsystems. [1]

Wasn't there a patch that solves the discrepancy by removing and
re-adding the updated CPUs?

http://patchwork.ozlabs.org/patch/1051761/

Thanks

Michal


* Re: [PATCH 0/2] disable NUMA affinity reassignments at runtime
  2019-04-18 20:30 ` [PATCH 0/2] disable NUMA affinity reassignments at runtime Michal Suchánek
@ 2019-04-18 22:37   ` Nathan Lynch
  0 siblings, 0 replies; 6+ messages in thread
From: Nathan Lynch @ 2019-04-18 22:37 UTC
  To: Michal Suchánek; +Cc: srikar.dronamraju, mmc, linuxppc-dev, mwb, julietk

Michal Suchánek <msuchanek@suse.de> writes:

> On Thu, 18 Apr 2019 13:56:56 -0500
> Nathan Lynch <nathanl@linux.ibm.com> wrote:
>
> Hello,
>
>> Changing cpu <-> node relationships at runtime, as the pseries
>> platform code attempts to do for LPM, PRRN, and VPHN is essentially
>> unsupported by core subsystems. [1]
>
> Wasn't there a patch that solves the discrepancy by removing and
> re-adding the updated CPUs?
>
> http://patchwork.ozlabs.org/patch/1051761/

In our testing it seems that changing the result of cpu_to_node() for a
given cpu id, even with an intervening offline/online, leads to crashes
and assertions in the slab allocator. There have been some ideas floated
to sidestep that but I think there are larger issues to consider.

Even if changing CPU node assignments were possible without
destabilizing the system, it's not all that useful without updating
memory/LMB affinity as well. (VPHN is an exception.)

Furthermore, I'm not aware of any effort to make the numa/affinity APIs
at the system call level account for the possibility that the cpu/mem
<-> node relationship could be dynamic. Nor is there any facility for
notifying applications of changes. Even if the kernel were to properly
support this internally, NUMA-aware applications are the ones that will
suffer unless appropriate APIs are provided to them.

To me it seems this all needs more careful consideration and work, and
much of it should happen outside of this particular arch/platform
context.
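
To illustrate the application-side concern, here is a toy userspace
model, not real NUMA API usage. It assumes an application that looks up
the cpu -> node mapping once at startup and caches it; every name in it
(platform_node_of_cpu, app_init, platform_reassign, app_stale_entries)
is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical platform view of cpu -> node; stands in for cpu_to_node(). */
static int platform_node_of_cpu[NR_CPUS] = { 0, 0, 1, 1 };

/* The application's private copy, filled in once at startup. */
static int app_cached_node[NR_CPUS];

static void app_init(void)
{
	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		app_cached_node[cpu] = platform_node_of_cpu[cpu];
}

/* A runtime reassignment; the application receives no notification. */
static void platform_reassign(size_t cpu, int new_node)
{
	assert(cpu < NR_CPUS);
	platform_node_of_cpu[cpu] = new_node;
}

/* Count CPUs whose cached node no longer matches the platform's view. */
static size_t app_stale_entries(void)
{
	size_t stale = 0;

	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		if (app_cached_node[cpu] != platform_node_of_cpu[cpu])
			stale++;
	return stale;
}
```

After app_init(), a single platform_reassign() call leaves the cache
stale, and nothing in the model gives the application a way to find
out short of polling.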



* Re: [1/2] powerpc/numa: improve control of topology updates
  2019-04-18 18:56 ` [PATCH 1/2] powerpc/numa: improve control of topology updates Nathan Lynch
@ 2019-04-21 14:19   ` Michael Ellerman
  0 siblings, 0 replies; 6+ messages in thread
From: Michael Ellerman @ 2019-04-21 14:19 UTC
  To: Nathan Lynch, linuxppc-dev; +Cc: srikar.dronamraju, mmc, mwb, julietk

On Thu, 2019-04-18 at 18:56:57 UTC, Nathan Lynch wrote:
> When booted with "topology_updates=off", or when "off" is written to
> /proc/powerpc/topology_updates, NUMA reassignments are inhibited for
> PRRN and VPHN events. However, migration and suspend unconditionally
> re-enable reassignments via start_topology_update(). This is
> incoherent.
> 
> Check the topology_updates_enabled flag in
> start/stop_topology_update() so that callers of those APIs need not be
> aware of whether reassignments are enabled. This allows the
> administrative decision on reassignments to remain in force across
> migrations and suspensions.
> 
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/2d4d9b308f8f8dec68f6dbbff18c68ec

cheers

