* [PATCH v2 0/4] Early node associativity
@ 2019-08-29  5:50 Srikar Dronamraju
  2019-08-29  5:50 ` [PATCH v2 1/4] powerpc/vphn: Check for error from hcall_vphn Srikar Dronamraju
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Srikar Dronamraju @ 2019-08-29  5:50 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Nathan Lynch, Srikar Dronamraju, Nicholas Piggin, Abdul Haleem,
	Satheesh Rajendran, linuxppc-dev

Abdul reported a warning on a shared lpar:
"WARNING: workqueue cpumask: online intersect > possible intersect".
This happens because the per node workqueue possible cpumask is set very
early in the boot process, even before the system has queried the home
node associativity. However, the per node workqueue online cpumask gets
updated dynamically. Hence there is a chance that the per node workqueue
online cpumask ends up being a superset of the per node workqueue
possible cpumask.

Link for v1: https://patchwork.ozlabs.org/patch/1151658
Changelog: v1->v2
 - Handled comments from Nathan Lynch.

Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>

Srikar Dronamraju (4):
  powerpc/vphn: Check for error from hcall_vphn
  powerpc/numa: Handle extra hcall_vphn error cases
  powerpc/numa: Early request for home node associativity
  powerpc/numa: Remove late request for home node associativity

 arch/powerpc/include/asm/topology.h   |  4 --
 arch/powerpc/kernel/smp.c             |  5 ---
 arch/powerpc/mm/numa.c                | 77 ++++++++++++++++++++++++-----------
 arch/powerpc/platforms/pseries/vphn.c |  3 +-
 4 files changed, 56 insertions(+), 33 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/4] powerpc/vphn: Check for error from hcall_vphn
  2019-08-29  5:50 [PATCH v2 0/4] Early node associativity Srikar Dronamraju
@ 2019-08-29  5:50 ` Srikar Dronamraju
  2019-08-29  5:50 ` [PATCH v2 2/4] powerpc/numa: Handle extra hcall_vphn error cases Srikar Dronamraju
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Srikar Dronamraju @ 2019-08-29  5:50 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Nathan Lynch, linuxppc-dev, Srikar Dronamraju, Nicholas Piggin

There is no value in unpacking the associativity array if the
H_HOME_NODE_ASSOCIATIVITY hcall has returned an error.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
---
Changelog (v1->v2):
- Split the patch into two (suggested by Nathan).

 arch/powerpc/platforms/pseries/vphn.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vphn.c b/arch/powerpc/platforms/pseries/vphn.c
index 3f07bf6..cca474a 100644
--- a/arch/powerpc/platforms/pseries/vphn.c
+++ b/arch/powerpc/platforms/pseries/vphn.c
@@ -82,7 +82,8 @@ long hcall_vphn(unsigned long cpu, u64 flags, __be32 *associativity)
 	long retbuf[PLPAR_HCALL9_BUFSIZE] = {0};
 
 	rc = plpar_hcall9(H_HOME_NODE_ASSOCIATIVITY, retbuf, flags, cpu);
-	vphn_unpack_associativity(retbuf, associativity);
+	if (rc == H_SUCCESS)
+		vphn_unpack_associativity(retbuf, associativity);
 
 	return rc;
 }
-- 
1.8.3.1



* [PATCH v2 2/4] powerpc/numa: Handle extra hcall_vphn error cases
  2019-08-29  5:50 [PATCH v2 0/4] Early node associativity Srikar Dronamraju
  2019-08-29  5:50 ` [PATCH v2 1/4] powerpc/vphn: Check for error from hcall_vphn Srikar Dronamraju
@ 2019-08-29  5:50 ` Srikar Dronamraju
  2019-08-29  5:50 ` [PATCH v2 3/4] powerpc/numa: Early request for home node associativity Srikar Dronamraju
  2019-08-29  5:50 ` [PATCH v2 4/4] powerpc/numa: Remove late " Srikar Dronamraju
  3 siblings, 0 replies; 9+ messages in thread
From: Srikar Dronamraju @ 2019-08-29  5:50 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Nathan Lynch, Satheesh Rajendran, linuxppc-dev,
	Srikar Dronamraju, Nicholas Piggin

Currently the code handles only the H_FUNCTION, H_SUCCESS and H_HARDWARE
return codes. However hcall_vphn can return other codes as well. Now the
H_PARAMETER return code is also handled, and all remaining return codes
are handled under the default case.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
---
Changelog (v1->v2):
 Handled comments from Nathan:
  - Split patch from patch 1.
  - Corrected a problem where I missed calling stop_topology_update().
  - Using pr_err_ratelimited instead of printk.

 arch/powerpc/mm/numa.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 50d68d2..8fbe57c 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1191,23 +1191,30 @@ static long vphn_get_associativity(unsigned long cpu,
 				VPHN_FLAG_VCPU, associativity);
 
 	switch (rc) {
+	case H_SUCCESS:
+		dbg("VPHN hcall succeeded. Reset polling...\n");
+		timed_topology_update(0);
+		goto out;
+
 	case H_FUNCTION:
-		printk_once(KERN_INFO
-			"VPHN is not supported. Disabling polling...\n");
-		stop_topology_update();
+		pr_err_ratelimited("VPHN unsupported. Disabling polling...\n");
 		break;
 	case H_HARDWARE:
-		printk(KERN_ERR
-			"hcall_vphn() experienced a hardware fault "
+		pr_err_ratelimited("hcall_vphn() experienced a hardware fault "
 			"preventing VPHN. Disabling polling...\n");
-		stop_topology_update();
 		break;
-	case H_SUCCESS:
-		dbg("VPHN hcall succeeded. Reset polling...\n");
-		timed_topology_update(0);
+	case H_PARAMETER:
+		pr_err_ratelimited("hcall_vphn() was passed an invalid parameter. "
+			"Disabling polling...\n");
+		break;
+	default:
+		pr_err_ratelimited("hcall_vphn() returned %ld. Disabling polling...\n"
+			, rc);
 		break;
 	}
 
+	stop_topology_update();
+out:
 	return rc;
 }
 
-- 
1.8.3.1



* [PATCH v2 3/4] powerpc/numa: Early request for home node associativity
  2019-08-29  5:50 [PATCH v2 0/4] Early node associativity Srikar Dronamraju
  2019-08-29  5:50 ` [PATCH v2 1/4] powerpc/vphn: Check for error from hcall_vphn Srikar Dronamraju
  2019-08-29  5:50 ` [PATCH v2 2/4] powerpc/numa: Handle extra hcall_vphn error cases Srikar Dronamraju
@ 2019-08-29  5:50 ` Srikar Dronamraju
  2019-09-05 20:04   ` Nathan Lynch
  2019-08-29  5:50 ` [PATCH v2 4/4] powerpc/numa: Remove late " Srikar Dronamraju
  3 siblings, 1 reply; 9+ messages in thread
From: Srikar Dronamraju @ 2019-08-29  5:50 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Nathan Lynch, Satheesh Rajendran, linuxppc-dev,
	Srikar Dronamraju, Nicholas Piggin

Currently the kernel detects if it is running on a shared lpar platform
and requests home node associativity before the scheduler sched_domains
are set up. However, between the time NUMA setup is initialized and the
request for home node associativity, the workqueue subsystem initializes
its per node cpumasks. The per node workqueue possible cpumask may turn
invalid after the home node associativity update, resulting in weird
situations like the workqueue possible cpumask being a subset of the
workqueue online cpumask.

This can be fixed by requesting home node associativity earlier, just
before NUMA setup. However, at NUMA setup time the kernel may not be in
a position to detect whether it is running on a shared lpar platform.
So request home node associativity unconditionally and, if the request
fails, fall back on the device tree property.

While here, fix a problem where of_node_put() could be called even when
of_get_cpu_node() was not successful.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
---
Changelog (v1->v2):
- Handled comments from Nathan Lynch
  * Don't depend on pacas being set up for the hwid

 arch/powerpc/mm/numa.c | 43 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 8fbe57c..de4a1a1 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -461,14 +461,33 @@ static int of_drconf_to_nid_single(struct drmem_lmb *lmb)
 	return nid;
 }
 
+static int vphn_get_nid(unsigned long cpu, bool get_hwid)
+{
+	__be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
+	long rc, hwid;
+
+	if (get_hwid)
+		hwid = get_hard_smp_processor_id(cpu);
+	else
+		hwid = cpu_to_phys_id[cpu];
+
+	rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity);
+	if (rc == H_SUCCESS)
+		return associativity_to_nid(associativity);
+
+	return NUMA_NO_NODE;
+}
+
 /*
  * Figure out to which domain a cpu belongs and stick it there.
+ * cpu_to_phys_id is only valid between smp_setup_cpu_maps() and
+ * smp_setup_pacas(). If called outside this window, set get_hwid to true.
  * Return the id of the domain used.
  */
-static int numa_setup_cpu(unsigned long lcpu)
+static int numa_setup_cpu(unsigned long lcpu, bool get_hwid)
 {
 	int nid = NUMA_NO_NODE;
-	struct device_node *cpu;
+	struct device_node *cpu = NULL;
 
 	/*
 	 * If a valid cpu-to-node mapping is already available, use it
@@ -480,6 +499,20 @@ static int numa_setup_cpu(unsigned long lcpu)
 		return nid;
 	}
 
+	/*
+	 * On a shared lpar, device tree will not have node associativity.
+	 * At this time lppaca, or its __old_status field may not be
+	 * updated. Hence kernel cannot detect if its on a shared lpar. So
+	 * request an explicit associativity irrespective of whether the
+	 * lpar is shared or dedicated. Use the device tree property as a
+	 * fallback.
+	 */
+	if (firmware_has_feature(FW_FEATURE_VPHN))
+		nid = vphn_get_nid(lcpu, get_hwid);
+
+	if (nid != NUMA_NO_NODE)
+		goto out_present;
+
 	cpu = of_get_cpu_node(lcpu, NULL);
 
 	if (!cpu) {
@@ -491,13 +524,13 @@ static int numa_setup_cpu(unsigned long lcpu)
 	}
 
 	nid = of_node_to_nid_single(cpu);
+	of_node_put(cpu);
 
 out_present:
 	if (nid < 0 || !node_possible(nid))
 		nid = first_online_node;
 
 	map_cpu_to_node(lcpu, nid);
-	of_node_put(cpu);
 out:
 	return nid;
 }
@@ -528,7 +561,7 @@ static int ppc_numa_cpu_prepare(unsigned int cpu)
 {
 	int nid;
 
-	nid = numa_setup_cpu(cpu);
+	nid = numa_setup_cpu(cpu, true);
 	verify_cpu_node_mapping(cpu, nid);
 	return 0;
 }
@@ -875,7 +908,7 @@ void __init mem_topology_setup(void)
 	reset_numa_cpu_lookup_table();
 
 	for_each_present_cpu(cpu)
-		numa_setup_cpu(cpu);
+		numa_setup_cpu(cpu, false);
 }
 
 void __init initmem_init(void)
-- 
1.8.3.1



* [PATCH v2 4/4] powerpc/numa: Remove late request for home node associativity
  2019-08-29  5:50 [PATCH v2 0/4] Early node associativity Srikar Dronamraju
                   ` (2 preceding siblings ...)
  2019-08-29  5:50 ` [PATCH v2 3/4] powerpc/numa: Early request for home node associativity Srikar Dronamraju
@ 2019-08-29  5:50 ` Srikar Dronamraju
  3 siblings, 0 replies; 9+ messages in thread
From: Srikar Dronamraju @ 2019-08-29  5:50 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Nathan Lynch, Satheesh Rajendran, linuxppc-dev,
	Srikar Dronamraju, Nicholas Piggin

With the previous commit ("powerpc/numa: Early request for home node
associativity"), commit 2ea626306810 ("powerpc/topology: Get topology
for shared processors at boot"), which requested home node associativity
late in boot, becomes redundant.

Hence remove the late request for home node associativity.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/topology.h | 4 ----
 arch/powerpc/kernel/smp.c           | 5 -----
 arch/powerpc/mm/numa.c              | 9 ---------
 3 files changed, 18 deletions(-)

diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index 2f7e1ea..9bd396f 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -98,7 +98,6 @@ static inline int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
 extern int prrn_is_enabled(void);
 extern int find_and_online_cpu_nid(int cpu);
 extern int timed_topology_update(int nsecs);
-extern void __init shared_proc_topology_init(void);
 #else
 static inline int start_topology_update(void)
 {
@@ -121,9 +120,6 @@ static inline int timed_topology_update(int nsecs)
 	return 0;
 }
 
-#ifdef CONFIG_SMP
-static inline void shared_proc_topology_init(void) {}
-#endif
 #endif /* CONFIG_NUMA && CONFIG_PPC_SPLPAR */
 
 #include <asm-generic/topology.h>
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index ea6adbf..cdd39a0 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1359,11 +1359,6 @@ void __init smp_cpus_done(unsigned int max_cpus)
 	if (smp_ops && smp_ops->bringup_done)
 		smp_ops->bringup_done();
 
-	/*
-	 * On a shared LPAR, associativity needs to be requested.
-	 * Hence, get numa topology before dumping cpu topology
-	 */
-	shared_proc_topology_init();
 	dump_numa_cpu_topology();
 
 #ifdef CONFIG_SCHED_SMT
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index de4a1a1..a20617a 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1608,15 +1608,6 @@ int prrn_is_enabled(void)
 	return prrn_enabled;
 }
 
-void __init shared_proc_topology_init(void)
-{
-	if (lppaca_shared_proc(get_lppaca())) {
-		bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
-			    nr_cpumask_bits);
-		numa_update_cpu_topology(false);
-	}
-}
-
 static int topology_read(struct seq_file *file, void *v)
 {
 	if (vphn_enabled || prrn_enabled)
-- 
1.8.3.1



* Re: [PATCH v2 3/4] powerpc/numa: Early request for home node associativity
  2019-08-29  5:50 ` [PATCH v2 3/4] powerpc/numa: Early request for home node associativity Srikar Dronamraju
@ 2019-09-05 20:04   ` Nathan Lynch
  2019-09-06  3:41     ` Srikar Dronamraju
  0 siblings, 1 reply; 9+ messages in thread
From: Nathan Lynch @ 2019-09-05 20:04 UTC (permalink / raw)
  To: Srikar Dronamraju; +Cc: Satheesh Rajendran, linuxppc-dev, Nicholas Piggin

Hi Srikar,

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> Currently the kernel detects if its running on a shared lpar platform
> and requests home node associativity before the scheduler sched_domains
> are setup. However between the time NUMA setup is initialized and the
> request for home node associativity, workqueue initializes its per node
> cpumask. The per node workqueue possible cpumask may turn invalid
> after home node associativity resulting in weird situations like
> workqueue possible cpumask being a subset of workqueue online cpumask.
>
> This can be fixed by requesting home node associativity earlier just
> before NUMA setup. However at the NUMA setup time, kernel may not be in
> a position to detect if its running on a shared lpar platform. So
> request for home node associativity and if the request fails, fallback
> on the device tree property.
>
> While here, fix a problem where of_node_put could be called even when
> of_get_cpu_node was not successful.

of_node_put() handles NULL arguments, so this should not be necessary.

> +static int vphn_get_nid(unsigned long cpu, bool get_hwid)

[...]

> +static int numa_setup_cpu(unsigned long lcpu, bool get_hwid)

[...]

> @@ -528,7 +561,7 @@ static int ppc_numa_cpu_prepare(unsigned int cpu)
>  {
>  	int nid;
>  
> -	nid = numa_setup_cpu(cpu);
> +	nid = numa_setup_cpu(cpu, true);
>  	verify_cpu_node_mapping(cpu, nid);
>  	return 0;
>  }
> @@ -875,7 +908,7 @@ void __init mem_topology_setup(void)
>  	reset_numa_cpu_lookup_table();
>  
>  	for_each_present_cpu(cpu)
> -		numa_setup_cpu(cpu);
> +		numa_setup_cpu(cpu, false);
>  }

I'm open to other points of view here, but I would prefer two separate
functions, something like vphn_get_nid() for runtime and
vphn_get_nid_early() (which could be __init) for boot-time
initialization. Propagating a somewhat unexpressive boolean flag through
two levels of function calls in this code is unappealing...

Regardless, I have an annoying question :-) Isn't it possible that,
while Linux is calling vphn_get_nid() for each logical cpu in sequence,
the platform could change a virtual processor's node assignment,
potentially causing sibling threads to get different node assignments
and producing an incoherent topology (which then leads to sched domain
assertions etc)?

If so, I think more care is needed. The algorithm should make the vphn
call only once per cpu node, I think?


* Re: [PATCH v2 3/4] powerpc/numa: Early request for home node associativity
  2019-09-05 20:04   ` Nathan Lynch
@ 2019-09-06  3:41     ` Srikar Dronamraju
  2019-09-06 13:49       ` Srikar Dronamraju
  2019-09-06 21:34       ` Nathan Lynch
  0 siblings, 2 replies; 9+ messages in thread
From: Srikar Dronamraju @ 2019-09-06  3:41 UTC (permalink / raw)
  To: Nathan Lynch; +Cc: Satheesh Rajendran, linuxppc-dev, Nicholas Piggin

> >
> > While here, fix a problem where of_node_put could be called even when
> > of_get_cpu_node was not successful.
> 
> of_node_put() handles NULL arguments, so this should not be necessary.
> 

Ok 

> > @@ -875,7 +908,7 @@ void __init mem_topology_setup(void)
> >  	reset_numa_cpu_lookup_table();
> >  
> >  	for_each_present_cpu(cpu)
> > -		numa_setup_cpu(cpu);
> > +		numa_setup_cpu(cpu, false);
> >  }
> 
> I'm open to other points of view here, but I would prefer two separate
> functions, something like vphn_get_nid() for runtime and
> vphn_get_nid_early() (which could be __init) for boot-time
> initialization. Propagating a somewhat unexpressive boolean flag through
> two levels of function calls in this code is unappealing...
> 

Somehow I am not convinced that we need to duplicate the function just
to avoid passing a bool.

If propagating a boolean flag through two levels of function calls is an
issue, we could resolve the hwid within numa_setup_cpu() itself.

Something like this:

static int numa_setup_cpu(unsigned long lcpu, bool get_hwid)
{
....

	if (firmware_has_feature(FW_FEATURE_VPHN)) {
		long hwid;

		if (get_hwid)
			hwid = get_hard_smp_processor_id(lcpu);
		else
			hwid = cpu_to_phys_id[lcpu];

		nid = vphn_get_nid(lcpu, hwid);
	}

....
Would this help?

> Regardless, I have an annoying question :-) Isn't it possible that,
> while Linux is calling vphn_get_nid() for each logical cpu in sequence,
> the platform could change a virtual processor's node assignment,
> potentially causing sibling threads to get different node assignments
> and producing an incoherent topology (which then leads to sched domain
> assertions etc)?
> 

Right, it's certainly possible for the node assignment to change while
we iterate through the siblings. Do you have any recommendations?

> If so, I think more care is needed. The algorithm should make the vphn
> call only once per cpu node, I think?

I didn't get "once per cpu node". How do we know which cpus are part of
that cpu node? Or did you mean once per cpu core?

-- 
Thanks and Regards
Srikar Dronamraju



* Re: [PATCH v2 3/4] powerpc/numa: Early request for home node associativity
  2019-09-06  3:41     ` Srikar Dronamraju
@ 2019-09-06 13:49       ` Srikar Dronamraju
  2019-09-06 21:34       ` Nathan Lynch
  1 sibling, 0 replies; 9+ messages in thread
From: Srikar Dronamraju @ 2019-09-06 13:49 UTC (permalink / raw)
  To: Nathan Lynch; +Cc: Satheesh Rajendran, linuxppc-dev, Nicholas Piggin

> > Regardless, I have an annoying question :-) Isn't it possible that,
> > while Linux is calling vphn_get_nid() for each logical cpu in sequence,
> > the platform could change a virtual processor's node assignment,
> > potentially causing sibling threads to get different node assignments
> > and producing an incoherent topology (which then leads to sched domain
> > assertions etc)?
> > 
> 
> Right, its certainly possible for node assignment to change while we iterate
> through the siblings. Do you have an recommendations?
> 

One thing that I forgot to add: we already cache the cpu_to_node
mapping. If the mapping is already present, we don't look up the nid
again.

However it is still possible that, in the first iteration where we cache
the nids, we end up with different nids for different siblings if the
node associativity changes in between.

> > If so, I think more care is needed. The algorithm should make the vphn
> > call only once per cpu node, I think?
> 
> I didn't get "once per cpu node", How do we know which all cpus are part of
> that cpu node? Or did you mean once per cpu core?

-- 
Thanks and Regards
Srikar Dronamraju



* Re: [PATCH v2 3/4] powerpc/numa: Early request for home node associativity
  2019-09-06  3:41     ` Srikar Dronamraju
  2019-09-06 13:49       ` Srikar Dronamraju
@ 2019-09-06 21:34       ` Nathan Lynch
  1 sibling, 0 replies; 9+ messages in thread
From: Nathan Lynch @ 2019-09-06 21:34 UTC (permalink / raw)
  To: Srikar Dronamraju; +Cc: Satheesh Rajendran, linuxppc-dev, Nicholas Piggin

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
>> Regardless, I have an annoying question :-) Isn't it possible that,
>> while Linux is calling vphn_get_nid() for each logical cpu in sequence,
>> the platform could change a virtual processor's node assignment,
>> potentially causing sibling threads to get different node assignments
>> and producing an incoherent topology (which then leads to sched domain
>> assertions etc)?
>> 
>
> Right, its certainly possible for node assignment to change while we iterate
> through the siblings. Do you have an recommendations?
>
>> If so, I think more care is needed. The algorithm should make the vphn
>> call only once per cpu node, I think?
>
> I didn't get "once per cpu node", How do we know which all cpus are part of
> that cpu node? Or did you mean once per cpu core?

Sorry, "node" is overloaded in this context.  I meant once per cpu
device node from the device tree, each of which I believe corresponds to
a core, yes.

