* [PATCH] x86, numa: fix boot without RAM on node0 again
@ 2010-07-07 21:35 Yinghai Lu
  0 siblings, 0 replies; 6+ messages in thread
From: Yinghai Lu @ 2010-07-07 21:35 UTC
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: Tejun Heo, Andrew Morton, Denys Vlasenko, Lee Schermerhorn, linux-kernel


|commit e534c7c5f8d6e9fc46f57fab067c7e48d8ceb172
|Author: Lee Schermerhorn <lee.schermerhorn@hp.com>
|Date:   Wed May 26 14:44:58 2010 -0700
|
|    numa: x86_64: use generic percpu var numa_node_id() implementation
|
|    x86 arch specific changes to use generic numa_node_id() based on generic
|    percpu variable infrastructure.  Back out x86's custom version of
|    numa_node_id()

broke NUMA systems that have no RAM on node 0 when MEMORY_HOTPLUG is enabled,
because cpu_up() calls cpu_to_node() before per_cpu(numa_node) is set up for
the APs.

When node 0 has no RAM, x86 already rounds its CPUs to the nearest node with
RAM in x86_cpu_to_node_map, but per_cpu(numa_node) is not set up for the APs
until c_init.

So when cpu_up() later calls cpu_to_node(), it gets 0 again and brings node 0
online even though there is no RAM on it. As a result, none of the APs can be
booted, and a panic follows later.
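
The ordering problem described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not kernel
code: every model_* name is hypothetical, CPU 0 is assumed to be the boot CPU,
and node 1 is assumed to be the nearest node with RAM.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS  4
#define MODEL_NR_NODES 2

/* All names below are hypothetical stand-ins, not kernel definitions. */

/* Models x86_cpu_to_node_map: node 0 has no RAM, so every CPU is already
 * mapped to node 1, the nearest node with memory. */
static const int model_early_map[MODEL_NR_CPUS] = { 1, 1, 1, 1 };

/* Models per_cpu(numa_node): zero-initialized, so an AP reads node 0
 * until something writes the real value. */
static int model_numa_node[MODEL_NR_CPUS];

static const bool model_node_has_ram[MODEL_NR_NODES] = { false, true };
static bool model_node_online[MODEL_NR_NODES] = { false, true };

/* Models cpu_to_node(): it reads only the per-CPU variable. */
static int model_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return model_numa_node[cpu];
}

/* Models the MEMORY_HOTPLUG step in cpu_up(): bring the CPU's node online.
 * Returns the node that was used. */
static int model_cpu_up(int cpu)
{
	int nid = model_cpu_to_node(cpu);

	model_node_online[nid] = true;
	return nid;
}

/* Models the old code: only the boot CPU (CPU 0 here) is fixed up early. */
static void model_setup_boot_cpu_only(void)
{
	model_numa_node[0] = model_early_map[0];
}
```

In this model, calling model_setup_boot_cpu_only() and then model_cpu_up(1)
returns node 0 for the AP and marks the memoryless node online, which mirrors
the failure mode reported here.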

[    1.611101] On node 0 totalpages: 0
.........
[    2.608558] On node 0 totalpages: 0
[    2.612065] Brought up 1 CPUs
[    2.615199] Total of 1 processors activated (3990.31 BogoMIPS).
...
[   93.225341] calling  loop_init+0x0/0x1a4 @ 1
[   93.229314] PERCPU: allocation failed, size=80 align=8, failed to populate
[   93.246539] Pid: 1, comm: swapper Tainted: G        W   2.6.35-rc4-tip-yh-04371-gd64e6c4-dirty #354
[   93.264621] Call Trace:
[   93.266533]  [<ffffffff81125e43>] pcpu_alloc+0x83a/0x8e7
[   93.270710]  [<ffffffff81125f15>] __alloc_percpu+0x10/0x12
[   93.285849]  [<ffffffff8140786c>] alloc_disk_node+0x94/0x16d
[   93.291811]  [<ffffffff81407956>] alloc_disk+0x11/0x13
[   93.306157]  [<ffffffff81503e51>] loop_alloc+0xa7/0x180
[   93.310538]  [<ffffffff8277ef48>] loop_init+0x9b/0x1a4
[   93.324909]  [<ffffffff8277eead>] ? loop_init+0x0/0x1a4
[   93.329650]  [<ffffffff810001f2>] do_one_initcall+0x57/0x136
[   93.345197]  [<ffffffff827486d0>] kernel_init+0x184/0x20e
[   93.348146]  [<ffffffff81034954>] kernel_thread_helper+0x4/0x10
[   93.365194]  [<ffffffff81c7cc3c>] ? restore_args+0x0/0x30
[   93.369305]  [<ffffffff8274854c>] ? kernel_init+0x0/0x20e
[   93.386011]  [<ffffffff81034950>] ? kernel_thread_helper+0x0/0x10
[   93.392047] loop: out of memory
...

Try to assign per_cpu(numa_node) early
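
The idea of assigning the per-CPU node early for every CPU can be sketched in
the same kind of user-space model. Again, this is an illustration under
assumptions rather than the kernel implementation: the fix_* names are
hypothetical, and the early map is assumed to point every CPU at node 1.

```c
#include <assert.h>

#define FIX_NR_CPUS 4

/* All names below are hypothetical stand-ins, not kernel definitions. */

/* Models the early CPU-to-node map, already rounded to a node with RAM. */
static const int fix_early_map[FIX_NR_CPUS] = { 1, 1, 1, 1 };

/* Models per_cpu(numa_node), which starts out as 0 for every CPU. */
static int fix_numa_node[FIX_NR_CPUS];

/* Models the patched per-CPU loop: every CPU, not just the boot CPU,
 * gets its node recorded before any CPU is brought up. */
static void fix_setup_all_cpus(void)
{
	int cpu;

	for (cpu = 0; cpu < FIX_NR_CPUS; cpu++)
		fix_numa_node[cpu] = fix_early_map[cpu];
}

/* Models cpu_to_node(): it reads only the per-CPU variable. */
static int fix_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < FIX_NR_CPUS);
	return fix_numa_node[cpu];
}
```

After fix_setup_all_cpus() runs, fix_cpu_to_node() returns node 1 for the APs
as well as for the boot CPU, so a later bring-up step would no longer see the
memoryless node 0.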

Signed-off-by: Yinghai <yinghai@kernel.org>

---
 arch/x86/kernel/setup_percpu.c |   17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

Index: linux-2.6/arch/x86/kernel/setup_percpu.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup_percpu.c
+++ linux-2.6/arch/x86/kernel/setup_percpu.c
@@ -238,6 +238,15 @@ void __init setup_per_cpu_areas(void)
 #ifdef CONFIG_NUMA
 		per_cpu(x86_cpu_to_node_map, cpu) =
 			early_per_cpu_map(x86_cpu_to_node_map, cpu);
+		/*
+		 * make sure boot cpu numa_node is right, when boot cpu is on
+		 *  the node that doesn't have mem installed
+		 * also cpu_up() will call cpu_to_node() for APs when
+		 *  MEMORY_HOTPLUG is defined, before per_cpu(numa_node) is set
+		 *  up later with c_init aka intel_init/amd_init
+		 * So set them all (boot cpu and all APs)
+		 */
+		set_cpu_numa_node(cpu, early_cpu_to_node(cpu));
 #endif
 #endif
 		/*
@@ -257,14 +266,6 @@ void __init setup_per_cpu_areas(void)
 	early_per_cpu_ptr(x86_cpu_to_node_map) = NULL;
 #endif
 
-#if defined(CONFIG_X86_64) && defined(CONFIG_NUMA)
-	/*
-	 * make sure boot cpu numa_node is right, when boot cpu is on the
-	 * node that doesn't have mem installed
-	 */
-	set_cpu_numa_node(boot_cpu_id, early_cpu_to_node(boot_cpu_id));
-#endif
-
 	/* Setup node to cpumask map */
 	setup_node_to_cpumask_map();
 


* Re: [PATCH] x86, numa: fix boot without RAM on node0 again
  2010-07-20 19:28   ` Andrew Morton
@ 2010-07-20 20:06     ` H. Peter Anvin
  0 siblings, 0 replies; 6+ messages in thread
From: H. Peter Anvin @ 2010-07-20 20:06 UTC
  To: Andrew Morton
  Cc: Yinghai Lu, Thomas Gleixner, Ingo Molnar, Tejun Heo,
	Denys Vlasenko, Lee Schermerhorn, linux-kernel

On 07/20/2010 12:28 PM, Andrew Morton wrote:
> 
> And this patch is missing my patch which fixed up that block comment.
> 
> I'll send both these patches into Linus today.  I'm a bit behind due to
> various linux-next catastrophes.
> 
> Catastrophes which have been known about for up to a week, yet all the
> offending patches are still in linux-next, unaltered.  Go figure.
> 

Thank you.

	-hpa


* Re: [PATCH] x86, numa: fix boot without RAM on node0 again
  2010-07-20 19:16 ` H. Peter Anvin
  2010-07-20 19:24   ` Yinghai Lu
@ 2010-07-20 19:28   ` Andrew Morton
  2010-07-20 20:06     ` H. Peter Anvin
  1 sibling, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2010-07-20 19:28 UTC
  To: H. Peter Anvin
  Cc: Yinghai Lu, Thomas Gleixner, Ingo Molnar, Tejun Heo,
	Denys Vlasenko, Lee Schermerhorn, linux-kernel

On Tue, 20 Jul 2010 12:16:00 -0700
"H. Peter Anvin" <hpa@zytor.com> wrote:

> On 07/20/2010 11:35 AM, Yinghai Lu wrote:
> 
> That was not what Linus (and I) asked you to do.  He asked specifically
> that you resend them under a separate 0/2 cover with an explanation.
> 

And this patch is missing my patch which fixed up that block comment.

I'll send both these patches into Linus today.  I'm a bit behind due to
various linux-next catastrophes.

Catastrophes which have been known about for up to a week, yet all the
offending patches are still in linux-next, unaltered.  Go figure.



* Re: [PATCH] x86, numa: fix boot without RAM on node0 again
  2010-07-20 19:16 ` H. Peter Anvin
@ 2010-07-20 19:24   ` Yinghai Lu
  2010-07-20 19:28   ` Andrew Morton
  1 sibling, 0 replies; 6+ messages in thread
From: Yinghai Lu @ 2010-07-20 19:24 UTC
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Tejun Heo, Andrew Morton,
	Denys Vlasenko, Lee Schermerhorn, linux-kernel

On 07/20/2010 12:16 PM, H. Peter Anvin wrote:
> On 07/20/2010 11:35 AM, Yinghai Lu wrote:
> 
> That was not what Linus (and I) asked you to do.  He asked specifically
> that you resend them under a separate 0/2 cover with an explanation.

Even though those two patches are not related to each other.

Yinghai



* Re: [PATCH] x86, numa: fix boot without RAM on node0 again
  2010-07-20 18:35 Yinghai Lu
@ 2010-07-20 19:16 ` H. Peter Anvin
  2010-07-20 19:24   ` Yinghai Lu
  2010-07-20 19:28   ` Andrew Morton
  0 siblings, 2 replies; 6+ messages in thread
From: H. Peter Anvin @ 2010-07-20 19:16 UTC
  To: Yinghai Lu
  Cc: Thomas Gleixner, Ingo Molnar, Tejun Heo, Andrew Morton,
	Denys Vlasenko, Lee Schermerhorn, linux-kernel

On 07/20/2010 11:35 AM, Yinghai Lu wrote:

That was not what Linus (and I) asked you to do.  He asked specifically
that you resend them under a separate 0/2 cover with an explanation.

	-hpa


* [PATCH] x86, numa: fix boot without RAM on node0 again
@ 2010-07-20 18:35 Yinghai Lu
  2010-07-20 19:16 ` H. Peter Anvin
  0 siblings, 1 reply; 6+ messages in thread
From: Yinghai Lu @ 2010-07-20 18:35 UTC
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Tejun Heo, Andrew Morton,
	Denys Vlasenko, Lee Schermerhorn, linux-kernel


|commit e534c7c5f8d6e9fc46f57fab067c7e48d8ceb172
|Author: Lee Schermerhorn <lee.schermerhorn@hp.com>
|Date:   Wed May 26 14:44:58 2010 -0700
|
|    numa: x86_64: use generic percpu var numa_node_id() implementation
|
|    x86 arch specific changes to use generic numa_node_id() based on generic
|    percpu variable infrastructure.  Back out x86's custom version of
|    numa_node_id()

broke NUMA systems that have no RAM on node 0 when MEMORY_HOTPLUG is enabled,
because cpu_up() calls cpu_to_node() before per_cpu(numa_node) is set up for
the APs.

When node 0 has no RAM, x86 already rounds its CPUs to the nearest node with
RAM in x86_cpu_to_node_map, but per_cpu(numa_node) is not set up for the APs
until c_init.

So when cpu_up() later calls cpu_to_node(), it gets 0 again and brings node 0
online even though there is no RAM on it. As a result, none of the APs can be
booted, and a panic follows later.

[    1.611101] On node 0 totalpages: 0
.........
[    2.608558] On node 0 totalpages: 0
[    2.612065] Brought up 1 CPUs
[    2.615199] Total of 1 processors activated (3990.31 BogoMIPS).
...
[   93.225341] calling  loop_init+0x0/0x1a4 @ 1
[   93.229314] PERCPU: allocation failed, size=80 align=8, failed to populate
[   93.246539] Pid: 1, comm: swapper Tainted: G        W   2.6.35-rc4-tip-yh-04371-gd64e6c4-dirty #354
[   93.264621] Call Trace:
[   93.266533]  [<ffffffff81125e43>] pcpu_alloc+0x83a/0x8e7
[   93.270710]  [<ffffffff81125f15>] __alloc_percpu+0x10/0x12
[   93.285849]  [<ffffffff8140786c>] alloc_disk_node+0x94/0x16d
[   93.291811]  [<ffffffff81407956>] alloc_disk+0x11/0x13
[   93.306157]  [<ffffffff81503e51>] loop_alloc+0xa7/0x180
[   93.310538]  [<ffffffff8277ef48>] loop_init+0x9b/0x1a4
[   93.324909]  [<ffffffff8277eead>] ? loop_init+0x0/0x1a4
[   93.329650]  [<ffffffff810001f2>] do_one_initcall+0x57/0x136
[   93.345197]  [<ffffffff827486d0>] kernel_init+0x184/0x20e
[   93.348146]  [<ffffffff81034954>] kernel_thread_helper+0x4/0x10
[   93.365194]  [<ffffffff81c7cc3c>] ? restore_args+0x0/0x30
[   93.369305]  [<ffffffff8274854c>] ? kernel_init+0x0/0x20e
[   93.386011]  [<ffffffff81034950>] ? kernel_thread_helper+0x0/0x10
[   93.392047] loop: out of memory
...

Try to assign per_cpu(numa_node) early

Signed-off-by: Yinghai <yinghai@kernel.org>

---
 arch/x86/kernel/setup_percpu.c |   17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

Index: linux-2.6/arch/x86/kernel/setup_percpu.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup_percpu.c
+++ linux-2.6/arch/x86/kernel/setup_percpu.c
@@ -238,6 +238,15 @@ void __init setup_per_cpu_areas(void)
 #ifdef CONFIG_NUMA
 		per_cpu(x86_cpu_to_node_map, cpu) =
 			early_per_cpu_map(x86_cpu_to_node_map, cpu);
+		/*
+		 * make sure boot cpu numa_node is right, when boot cpu is on
+		 *  the node that doesn't have mem installed
+		 * also cpu_up() will call cpu_to_node() for APs when
+		 *  MEMORY_HOTPLUG is defined, before per_cpu(numa_node) is set
+		 *  up later with c_init aka intel_init/amd_init
+		 * So set them all (boot cpu and all APs)
+		 */
+		set_cpu_numa_node(cpu, early_cpu_to_node(cpu));
 #endif
 #endif
 		/*
@@ -257,14 +266,6 @@ void __init setup_per_cpu_areas(void)
 	early_per_cpu_ptr(x86_cpu_to_node_map) = NULL;
 #endif
 
-#if defined(CONFIG_X86_64) && defined(CONFIG_NUMA)
-	/*
-	 * make sure boot cpu numa_node is right, when boot cpu is on the
-	 * node that doesn't have mem installed
-	 */
-	set_cpu_numa_node(boot_cpu_id, early_cpu_to_node(boot_cpu_id));
-#endif
-
 	/* Setup node to cpumask map */
 	setup_node_to_cpumask_map();
 


Thread overview: 6+ messages
2010-07-07 21:35 [PATCH] x86, numa: fix boot without RAM on node0 again Yinghai Lu
2010-07-20 18:35 Yinghai Lu
2010-07-20 19:16 ` H. Peter Anvin
2010-07-20 19:24   ` Yinghai Lu
2010-07-20 19:28   ` Andrew Morton
2010-07-20 20:06     ` H. Peter Anvin
