linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 0/3] Offline memoryless cpuless node 0
@ 2020-05-12 13:29 Srikar Dronamraju
  2020-05-12 13:29 ` [PATCH v4 1/3] powerpc/numa: Set numa_node for all possible cpus Srikar Dronamraju
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Srikar Dronamraju @ 2020-05-12 13:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Srikar Dronamraju, linuxppc-dev, linux-mm, linux-kernel,
	Michal Hocko, Mel Gorman, Vlastimil Babka, Kirill A. Shutemov,
	Christopher Lameter, Michael Ellerman, Linus Torvalds,
	Gautham R Shenoy, Satheesh Rajendran, David Hildenbrand

Changelog v3:->v4:
- Resolved comments from Christopher.
Link v3: http://lore.kernel.org/lkml/20200501031128.19584-1-srikar@linux.vnet.ibm.com/t/#u

Changelog v2:->v3:
- Resolved comments from Gautham.
Link v2: https://lore.kernel.org/linuxppc-dev/20200428093836.27190-1-srikar@linux.vnet.ibm.com/t/#u

Changelog v1:->v2:
- Rebased to v5.7-rc3
- Updated the changelog.
Link v1: https://lore.kernel.org/linuxppc-dev/20200311110237.5731-1-srikar@linux.vnet.ibm.com/t/#u

A Linux kernel configured with CONFIG_NUMA, on a system with multiple
possible nodes, marks node 0 as online at boot. However, in practice,
there are systems where node 0 is memoryless and cpuless.

This can cause
1. numa_balancing to be enabled on systems with only one node that has
memory and CPUs.
2. Existence of a dummy (cpuless and memoryless) node, which can confuse
users/scripts looking at output of lscpu / numactl.

This patchset corrects this anomaly.

This should only affect systems that have CONFIG_HAVE_MEMORYLESS_NODES.
Currently only 2 architectures, ia64 and powerpc, have this config.

Note: Patch 3 in this patch series depends on patches 1 and 2.
Without patches 1 and 2, patch 3 might crash powerpc.

v5.7-rc3
------------------
 available: 2 nodes (0,2)
 node 0 cpus:
 node 0 size: 0 MB
 node 0 free: 0 MB
 node 2 cpus: 0 1 2 3 4 5 6 7
 node 2 size: 32625 MB
 node 2 free: 31490 MB
 node distances:
 node   0   2
   0:  10  20
   2:  20  10

proc and sys files
------------------
 /sys/devices/system/node/online:            0,2
 /proc/sys/kernel/numa_balancing:            1
 /sys/devices/system/node/has_cpu:           2
 /sys/devices/system/node/has_memory:        2
 /sys/devices/system/node/has_normal_memory: 2
 /sys/devices/system/node/possible:          0-31

v5.7-rc3 + patches
------------------
 available: 1 nodes (2)
 node 2 cpus: 0 1 2 3 4 5 6 7
 node 2 size: 32625 MB
 node 2 free: 31487 MB
 node distances:
 node   2
   2:  10

proc and sys files
------------------
/sys/devices/system/node/online:            2
/proc/sys/kernel/numa_balancing:            0
/sys/devices/system/node/has_cpu:           2
/sys/devices/system/node/has_memory:        2
/sys/devices/system/node/has_normal_memory: 2
/sys/devices/system/node/possible:          0-31

Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Christopher Lameter <cl@linux.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: David Hildenbrand <david@redhat.com>

Srikar Dronamraju (3):
  powerpc/numa: Set numa_node for all possible cpus
  powerpc/numa: Prefer node id queried from vphn
  mm/page_alloc: Keep memoryless cpuless node 0 offline

 arch/powerpc/mm/numa.c | 32 ++++++++++++++++++++++----------
 mm/page_alloc.c        |  4 +++-
 2 files changed, 25 insertions(+), 11 deletions(-)

-- 
1.8.3.1



* [PATCH v4 1/3] powerpc/numa: Set numa_node for all possible cpus
  2020-05-12 13:29 [PATCH v4 0/3] Offline memoryless cpuless node 0 Srikar Dronamraju
@ 2020-05-12 13:29 ` Srikar Dronamraju
  2020-05-12 13:29 ` [PATCH v4 2/3] powerpc/numa: Prefer node id queried from vphn Srikar Dronamraju
  2020-05-12 13:29 ` [PATCH v4 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline Srikar Dronamraju
  2 siblings, 0 replies; 7+ messages in thread
From: Srikar Dronamraju @ 2020-05-12 13:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Srikar Dronamraju, linuxppc-dev, linux-mm, linux-kernel,
	Michal Hocko, Mel Gorman, Vlastimil Babka, Kirill A. Shutemov,
	Christopher Lameter, Michael Ellerman, Linus Torvalds,
	Gautham R Shenoy, Satheesh Rajendran, David Hildenbrand

A Powerpc system with multiple possible nodes and with CONFIG_NUMA
enabled always used to have a node 0, even if node 0 does not have any
cpus or memory attached to it. As per PAPR, node affinity of a cpu is
only available once it's present / online. For all cpus that are
possible but not present, cpu_to_node() would point to node 0.

To ensure a cpuless, memoryless dummy node is not online, powerpc needs
to make sure cpu_to_node() for all possible but not present cpus is set
to a proper node.
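
The mapping described above can be sketched in userspace C. This is only
an illustration under simplifying assumptions, not the kernel code: the
struct, the array sizes and every sketch_* name are hypothetical, and the
real numa_setup_cpu() does considerably more work for present cpus.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 16

/* Hypothetical stand-ins for kernel state. */
struct sketch_topology {
	bool present[SKETCH_NR_CPUS];      /* models cpu_present() */
	int  firmware_nid[SKETCH_NR_CPUS]; /* node reported for present cpus */
	int  cpu_node[SKETCH_NR_CPUS];     /* models cpu_to_node() */
	int  first_online_node;            /* models first_online_node */
};

/* Models the early return for cpus that are possible but not present. */
static int sketch_numa_setup_cpu(struct sketch_topology *t, int cpu)
{
	if (!t->present[cpu]) {
		t->cpu_node[cpu] = t->first_online_node;
		return t->first_online_node;
	}

	/* Present cpus keep whatever node the firmware reported. */
	t->cpu_node[cpu] = t->firmware_nid[cpu];
	return t->cpu_node[cpu];
}

/* Models walking all possible cpus instead of only the present ones. */
static void sketch_setup_all(struct sketch_topology *t)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sketch_numa_setup_cpu(t, cpu);
}
```

With 8 present cpus on node 2 and first_online_node set to 2, every
entry of cpu_node[] ends up as 2, so no cpu refers to node 0 any more.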

Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Christopher Lameter <cl@linux.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v3:->v4:
- Resolved comments from Christopher.
Link v3: http://lore.kernel.org/lkml/20200501031128.19584-1-srikar@linux.vnet.ibm.com/t/#u

Changelog v1:->v2:
- Rebased to v5.7-rc3

 arch/powerpc/mm/numa.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 9fcf2d1..5b7918c 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -506,6 +506,11 @@ static int numa_setup_cpu(unsigned long lcpu)
 	int fcpu = cpu_first_thread_sibling(lcpu);
 	int nid = NUMA_NO_NODE;
 
+	if (!cpu_present(lcpu)) {
+		set_cpu_numa_node(lcpu, first_online_node);
+		return first_online_node;
+	}
+
 	/*
 	 * If a valid cpu-to-node mapping is already available, use it
 	 * directly instead of querying the firmware, since it represents
@@ -931,8 +936,17 @@ void __init mem_topology_setup(void)
 
 	reset_numa_cpu_lookup_table();
 
-	for_each_present_cpu(cpu)
+	for_each_possible_cpu(cpu) {
+		/*
+		 * Powerpc with CONFIG_NUMA always used to have a node 0,
+		 * even if it was memoryless or cpuless. For all cpus that
+		 * are possible but not present, cpu_to_node() would point
+		 * to node 0. To remove a cpuless, memoryless dummy node,
+		 * powerpc need to make sure all possible but not present
+		 * cpu_to_node are set to a proper node.
+		 */
 		numa_setup_cpu(cpu);
+	}
 }
 
 void __init initmem_init(void)
-- 
1.8.3.1



* [PATCH v4 2/3] powerpc/numa: Prefer node id queried from vphn
  2020-05-12 13:29 [PATCH v4 0/3] Offline memoryless cpuless node 0 Srikar Dronamraju
  2020-05-12 13:29 ` [PATCH v4 1/3] powerpc/numa: Set numa_node for all possible cpus Srikar Dronamraju
@ 2020-05-12 13:29 ` Srikar Dronamraju
  2020-05-13  3:58   ` Gautham R Shenoy
  2020-05-12 13:29 ` [PATCH v4 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline Srikar Dronamraju
  2 siblings, 1 reply; 7+ messages in thread
From: Srikar Dronamraju @ 2020-05-12 13:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Srikar Dronamraju, linuxppc-dev, linux-mm, linux-kernel,
	Michal Hocko, Mel Gorman, Vlastimil Babka, Kirill A. Shutemov,
	Christopher Lameter, Michael Ellerman, Linus Torvalds,
	Gautham R Shenoy, Satheesh Rajendran, David Hildenbrand

The node id queried from the static device tree may not be correct.
For example, it may always show 0 on a shared processor. Hence prefer
the node id queried from vphn, and fall back to the device tree based
node id if the vphn query fails.
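
The preference order can be sketched in userspace C as below. This is an
illustration only: the struct and the sketch_* helpers are hypothetical,
the query results are plain fields instead of firmware calls, and
SKETCH_NO_NODE stands in for NUMA_NO_NODE.

```c
#define SKETCH_NO_NODE (-1) /* models NUMA_NO_NODE */

/* Hypothetical per-cpu query results. */
struct sketch_cpu_info {
	int vphn_nid; /* SKETCH_NO_NODE when the vphn query fails */
	int dt_nid;   /* node id from the static device tree */
};

/* Prefer the vphn answer; use the device tree only as a fallback. */
static int sketch_pick_nid(const struct sketch_cpu_info *c)
{
	int nid = c->vphn_nid;

	if (nid == SKETCH_NO_NODE)
		nid = c->dt_nid;
	return nid;
}

/*
 * Mark nid online in a small bitmask (one bit per node). As in the
 * hunk below, only node ids greater than 0 are onlined here.
 */
static unsigned long sketch_online(unsigned long online_mask, int nid)
{
	if (nid > 0)
		online_mask |= 1UL << nid;
	return online_mask;
}
```

For a shared processor whose device tree entry says node 0 but whose
vphn query says node 2, sketch_pick_nid() returns 2 and only bit 2 is
set in the online mask.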

Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Christopher Lameter <cl@linux.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v2:->v3:
- Resolved comments from Gautham.
Link v2: https://lore.kernel.org/linuxppc-dev/20200428093836.27190-1-srikar@linux.vnet.ibm.com/t/#u

Changelog v1:->v2:
- Rebased to v5.7-rc3

 arch/powerpc/mm/numa.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index b3615b7..2815313 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -719,20 +719,20 @@ static int __init parse_numa_properties(void)
 	 */
 	for_each_present_cpu(i) {
 		struct device_node *cpu;
-		int nid;
-
-		cpu = of_get_cpu_node(i, NULL);
-		BUG_ON(!cpu);
-		nid = of_node_to_nid_single(cpu);
-		of_node_put(cpu);
+		int nid = vphn_get_nid(i);
 
 		/*
 		 * Don't fall back to default_nid yet -- we will plug
 		 * cpus into nodes once the memory scan has discovered
 		 * the topology.
 		 */
-		if (nid < 0)
-			continue;
-		node_set_online(nid);
+		if (nid == NUMA_NO_NODE) {
+			cpu = of_get_cpu_node(i, NULL);
+			BUG_ON(!cpu);
+			nid = of_node_to_nid_single(cpu);
+			of_node_put(cpu);
+		}
+
+		if (likely(nid > 0))
+			node_set_online(nid);
 	}
 
 	get_n_mem_cells(&n_mem_addr_cells, &n_mem_size_cells);
-- 
1.8.3.1



* [PATCH v4 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
  2020-05-12 13:29 [PATCH v4 0/3] Offline memoryless cpuless node 0 Srikar Dronamraju
  2020-05-12 13:29 ` [PATCH v4 1/3] powerpc/numa: Set numa_node for all possible cpus Srikar Dronamraju
  2020-05-12 13:29 ` [PATCH v4 2/3] powerpc/numa: Prefer node id queried from vphn Srikar Dronamraju
@ 2020-05-12 13:29 ` Srikar Dronamraju
  2020-05-12 16:31   ` Christopher Lameter
  2 siblings, 1 reply; 7+ messages in thread
From: Srikar Dronamraju @ 2020-05-12 13:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Srikar Dronamraju, linuxppc-dev, linux-mm, linux-kernel,
	Michal Hocko, Mel Gorman, Vlastimil Babka, Kirill A. Shutemov,
	Christopher Lameter, Michael Ellerman, Linus Torvalds,
	Gautham R Shenoy, Satheesh Rajendran, David Hildenbrand

Currently, a Linux kernel with CONFIG_NUMA on a system with multiple
possible nodes marks node 0 as online at boot. However, in practice,
there are systems which have node 0 as memoryless and cpuless.

This can cause numa_balancing to be enabled on systems with only one node
with memory and CPUs. The existence of this dummy node, which is cpuless
and memoryless, can confuse users/scripts looking at output of lscpu /
numactl.

By initializing N_ONLINE to NODE_MASK_NONE, let's stop assuming that node 0
is always online.
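
The effect of the initial N_ONLINE value can be sketched in userspace C.
This is an illustration only: sketch_nodemask is a hypothetical one-word
stand-in for nodemask_t, and the sketch assumes that automatic NUMA
balancing is switched on at boot when more than one node is online.

```c
#include <stdbool.h>

/* One bit per node; a hypothetical stand-in for nodemask_t. */
typedef unsigned long sketch_nodemask;

#define SKETCH_MASK_NODE0 1UL /* old default: node 0 pre-marked online */
#define SKETCH_MASK_NONE  0UL /* new default under CONFIG_NUMA */

/* Models node_set_online() for a node detected by arch code. */
static sketch_nodemask sketch_node_set_online(sketch_nodemask m, int nid)
{
	return m | (1UL << nid);
}

/* Models num_online_nodes(): count the set bits. */
static int sketch_num_online_nodes(sketch_nodemask m)
{
	int n = 0;

	for (; m; m &= m - 1) /* clear the lowest set bit each round */
		n++;
	return n;
}

/* Assumption: balancing is enabled when more than one node is online. */
static bool sketch_balancing_enabled(sketch_nodemask m)
{
	return sketch_num_online_nodes(m) > 1;
}
```

Starting from SKETCH_MASK_NODE0 and onlining node 2 gives two online
nodes, so balancing is enabled. Starting from SKETCH_MASK_NONE gives one
online node and balancing stays off, which matches the outputs below.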

v5.7-rc3
------------------
 available: 2 nodes (0,2)
 node 0 cpus:
 node 0 size: 0 MB
 node 0 free: 0 MB
 node 2 cpus: 0 1 2 3 4 5 6 7
 node 2 size: 32625 MB
 node 2 free: 31490 MB
 node distances:
 node   0   2
   0:  10  20
   2:  20  10

proc and sys files
------------------
 /sys/devices/system/node/online:            0,2
 /proc/sys/kernel/numa_balancing:            1
 /sys/devices/system/node/has_cpu:           2
 /sys/devices/system/node/has_memory:        2
 /sys/devices/system/node/has_normal_memory: 2
 /sys/devices/system/node/possible:          0-31

v5.7-rc3 + patch
------------------
 available: 1 nodes (2)
 node 2 cpus: 0 1 2 3 4 5 6 7
 node 2 size: 32625 MB
 node 2 free: 31487 MB
 node distances:
 node   2
   2:  10

proc and sys files
------------------
/sys/devices/system/node/online:            2
/proc/sys/kernel/numa_balancing:            0
/sys/devices/system/node/has_cpu:           2
/sys/devices/system/node/has_memory:        2
/sys/devices/system/node/has_normal_memory: 2
/sys/devices/system/node/possible:          0-31

Note: On Powerpc, cpu_to_node of possible but not present cpus would
previously return 0. Hence this commit depends on commit ("powerpc/numa: Set
numa_node for all possible cpus") and commit ("powerpc/numa: Prefer node id
queried from vphn"). Without those 2 commits, a Powerpc system might crash.

Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Christopher Lameter <cl@linux.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v1:->v2:
- Rebased to v5.7-rc3
Link v2: https://lore.kernel.org/linuxppc-dev/20200428093836.27190-1-srikar@linux.vnet.ibm.com/t/#u

 mm/page_alloc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 69827d4..03b8959 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -116,8 +116,10 @@ struct pcpu_drain {
  */
 nodemask_t node_states[NR_NODE_STATES] __read_mostly = {
 	[N_POSSIBLE] = NODE_MASK_ALL,
+#ifdef CONFIG_NUMA
+	[N_ONLINE] = NODE_MASK_NONE,
+#else
 	[N_ONLINE] = { { [0] = 1UL } },
-#ifndef CONFIG_NUMA
 	[N_NORMAL_MEMORY] = { { [0] = 1UL } },
 #ifdef CONFIG_HIGHMEM
 	[N_HIGH_MEMORY] = { { [0] = 1UL } },
-- 
1.8.3.1



* Re: [PATCH v4 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
  2020-05-12 13:29 ` [PATCH v4 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline Srikar Dronamraju
@ 2020-05-12 16:31   ` Christopher Lameter
  2020-05-13  7:46     ` Srikar Dronamraju
  0 siblings, 1 reply; 7+ messages in thread
From: Christopher Lameter @ 2020-05-12 16:31 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Andrew Morton, linuxppc-dev, linux-mm, linux-kernel,
	Michal Hocko, Mel Gorman, Vlastimil Babka, Kirill A. Shutemov,
	Michael Ellerman, Linus Torvalds, Gautham R Shenoy,
	Satheesh Rajendran, David Hildenbrand

On Tue, 12 May 2020, Srikar Dronamraju wrote:

> +#ifdef CONFIG_NUMA
> +	[N_ONLINE] = NODE_MASK_NONE,

Again. Same issue as before. If you do this then you do a global change
for all architectures. You need to put something in the early boot
sequence (in a non architecture specific way) that sets the first node
online by default.

You have fixed the issue in your earlier patches for the powerpc
architecture. What about the other architectures?

Or did I miss something?



* Re: [PATCH v4 2/3] powerpc/numa: Prefer node id queried from vphn
  2020-05-12 13:29 ` [PATCH v4 2/3] powerpc/numa: Prefer node id queried from vphn Srikar Dronamraju
@ 2020-05-13  3:58   ` Gautham R Shenoy
  0 siblings, 0 replies; 7+ messages in thread
From: Gautham R Shenoy @ 2020-05-13  3:58 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Andrew Morton, linuxppc-dev, linux-mm, linux-kernel,
	Michal Hocko, Mel Gorman, Vlastimil Babka, Kirill A. Shutemov,
	Christopher Lameter, Michael Ellerman, Linus Torvalds,
	Gautham R Shenoy, Satheesh Rajendran, David Hildenbrand

On Tue, May 12, 2020 at 06:59:36PM +0530, Srikar Dronamraju wrote:
> The node id queried from the static device tree may not be correct.
> For example, it may always show 0 on a shared processor. Hence prefer
> the node id queried from vphn, and fall back to the device tree based
> node id if the vphn query fails.
> 
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
> Cc: Christopher Lameter <cl@linux.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
> Cc: David Hildenbrand <david@redhat.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

Looks good to me.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

> ---
> Changelog v2:->v3:
> - Resolved comments from Gautham.
> Link v2: https://lore.kernel.org/linuxppc-dev/20200428093836.27190-1-srikar@linux.vnet.ibm.com/t/#u
> 
> Changelog v1:->v2:
> - Rebased to v5.7-rc3
> 
>  arch/powerpc/mm/numa.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index b3615b7..2815313 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -719,20 +719,20 @@ static int __init parse_numa_properties(void)
>  	 */
>  	for_each_present_cpu(i) {
>  		struct device_node *cpu;
> -		int nid;
> -
> -		cpu = of_get_cpu_node(i, NULL);
> -		BUG_ON(!cpu);
> -		nid = of_node_to_nid_single(cpu);
> -		of_node_put(cpu);
> +		int nid = vphn_get_nid(i);
> 
>  		/*
>  		 * Don't fall back to default_nid yet -- we will plug
>  		 * cpus into nodes once the memory scan has discovered
>  		 * the topology.
>  		 */
> -		if (nid < 0)
> -			continue;
> -		node_set_online(nid);
> +		if (nid == NUMA_NO_NODE) {
> +			cpu = of_get_cpu_node(i, NULL);
> +			BUG_ON(!cpu);
> +			nid = of_node_to_nid_single(cpu);
> +			of_node_put(cpu);
> +		}
> +
> +		if (likely(nid > 0))
> +			node_set_online(nid);
>  	}
> 
>  	get_n_mem_cells(&n_mem_addr_cells, &n_mem_size_cells);
> -- 
> 1.8.3.1
> 


* Re: [PATCH v4 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
  2020-05-12 16:31   ` Christopher Lameter
@ 2020-05-13  7:46     ` Srikar Dronamraju
  0 siblings, 0 replies; 7+ messages in thread
From: Srikar Dronamraju @ 2020-05-13  7:46 UTC (permalink / raw)
  To: Christopher Lameter
  Cc: Andrew Morton, linuxppc-dev, linux-mm, linux-kernel,
	Michal Hocko, Mel Gorman, Vlastimil Babka, Kirill A. Shutemov,
	Michael Ellerman, Linus Torvalds, Gautham R Shenoy,
	Satheesh Rajendran, David Hildenbrand

* Christopher Lameter <cl@linux.com> [2020-05-12 16:31:26]:

> On Tue, 12 May 2020, Srikar Dronamraju wrote:
> 
> > +#ifdef CONFIG_NUMA
> > +	[N_ONLINE] = NODE_MASK_NONE,
> 
> Again. Same issue as before. If you do this then you do a global change
> for all architectures. You need to put something in the early boot
> sequence (in a non architecture specific way) that sets the first node
> online by default.
> 

I did respond to that earlier.

> You have fixed the issue in your earlier patches for the powerpc
> architecture. What about the other architectures?
> 
> Or did I miss something?
> 

Here are my assumptions, please do correct me if any of them are wrong.
1. My other patches for Powerpc don't change when the nodes are being
onlined. They only change the cpu_to_node numbering of the offline cpus.
In this respect Powerpc, due to its PAPR compliance, may differ slightly
from other archs in that the cpu binding of the node is not known till
CPUs are onlined.

2. Currently the nodes are onlined (in all arch specific code) as soon as
they are detected. This onlining is unconditional: there are no checks
on whether the node number is 0, i.e. I don't see any special checks that
restrict or allow node 0 being onlined / offlined. It's considered no
more special than any other online node.

3. If we were to expect node 0 to be always online, then why do we have
first_online_node? We could always hard code it to 0.

4. I tried enabling CONFIG_HAVE_MEMORYLESS_NODES on x86, but that doesn't
seem to be possible. It looks to me like something like that is only
possible on powerpc and IA64.

5. Without my patch, on a regular numa system, node 0 would be onlined by
default during structure initialization. When the nodes get detected, node 0
and other nodes would again be onlined. The only drawback is that if node 0
wasn't supposed to be online, it will still end up being marked online.
With the proposed patch, only the nodes that get detected would be
onlined.
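
Point 3 can be illustrated with a small userspace sketch of what a
first-online-node lookup does: it returns the lowest set bit of the
online mask, which is 0 only while node 0 is actually online. The
function name, the one-word mask and SKETCH_MAX_NODES are hypothetical
simplifications, not the kernel's nodemask implementation.

```c
#define SKETCH_MAX_NODES 32

/*
 * Return the lowest node id set in online_mask, or SKETCH_MAX_NODES
 * if no node is online at all.
 */
static int sketch_first_online_node(unsigned long online_mask)
{
	for (int nid = 0; nid < SKETCH_MAX_NODES; nid++)
		if (online_mask & (1UL << nid))
			return nid;
	return SKETCH_MAX_NODES;
}
```

With nodes 0 and 2 online (mask 0x5) the lookup yields 0. With only
node 2 online (mask 0x4) it yields 2, which is the value a hard-coded 0
would get wrong.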

I think the node onlining is already pretty early in boot. I don't know of
any other mechanism to move the onlining further up and in a non
architecture specific way. However if you have ideas, please do let me know.

-- 
Thanks and Regards
Srikar Dronamraju

