* [PATCH] Fix early panic issue on machines with memless node
@ 2009-05-05  3:15 Zhang, Yanmin
  2009-05-05  3:32 ` David Rientjes
  0 siblings, 1 reply; 12+ messages in thread
From: Zhang, Yanmin @ 2009-05-05  3:15 UTC (permalink / raw)
  To: Jack Steiner, David Rientjes; +Cc: alex.shi, LKML, Ingo Molnar, Andi Kleen

Kernel 2.6.30-rc4 panics with boot parameter mem=2G on a Nehalem machine.
The machine has 2 nodes and every node has about 3G of memory.

Alex Shi did a good bisect and located the bad patch.

commit dc098551918093901d8ac8936e9d1a1b891b56ed
Author: Jack Steiner <steiner@sgi.com>
Date:   Fri Apr 17 09:22:42 2009 -0500

    x86/uv: fix init of memory-less nodes
    
    Add support for nodes that have cpus but no memory.
    The current code was failing to add these nodes
    to the nodes_present_map.
    
    v2: Fixes case caught by David Rientjes - missed support
        for the x2apic SRAT table.
    
    [ Impact: fix potential boot crash on memory-less UV nodes. ]
    
    Reported-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Jack Steiner <steiner@sgi.com>
    LKML-Reference: <20090417142242.GA23743@sgi.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>



With the earlyprintk boot parameter, we captured the dump info below.

<6>bootmem::alloc_bootmem_core nid=0 size=0 [0 pages] align=1000 goal=1000000 lim0
PANIC: early exception 06 rip 10:ffffffff80a2fbe4 error 0 cr2 0
Pid: 0, comm: swapper Not tainted 2.6.30-rc4-ymz #3
Call Trace:                                        
 [<ffffffff80a1a195>] ? early_idt_handler+0x55/0x68  
 [<ffffffff80a2fbe4>] ? alloc_bootmem_core+0x91/0x2ae
 [<ffffffff80a2fbdc>] ? alloc_bootmem_core+0x89/0x2ae     
 [<ffffffff80a2fe74>] ? ___alloc_bootmem_nopanic+0x73/0xab
 [<ffffffff80a2af73>] ? early_node_mem+0x54/0x78      
 [<ffffffff80a2b0ed>] ? setup_node_bootmem+0x156/0x282
 [<ffffffff80a2b880>] ? acpi_scan_nodes+0x207/0x303
 [<ffffffff80a2b255>] ? initmem_init+0x3c/0x14c
 [<ffffffff80a1e33b>] ? setup_arch+0x5ba/0x760       
 [<ffffffff80a2e904>] ? cgroup_init_subsys+0xfc/0x105
 [<ffffffff80a2ea5f>] ? cgroup_init_early+0x152/0x163
 [<ffffffff80a1a915>] ? start_kernel+0x84/0x35e      
 [<ffffffff80a1a37e>] ? x86_64_start_kernel+0xe5/0xeb
RIP alloc_bootmem_core+0x91/0x2ae

Consider the call chain below:
acpi_scan_nodes =>
		setup_node_bootmem
			 (twice) => early_node_mem

At the beginning, acpi_scan_nodes filters out memless nodes by calling
unparse_node. Patch dc098551918 actually adds the node back.
acpi_scan_nodes has many comments around unparse_node.

The patch below fixes it by checking node memory. Another method is just
to revert the bad patch.

David Rientjes, Jack Steiner,
Would you check whether the patch below satisfies your original objective?


Signed-off-by: Shi Alex <alex.shi@intel.com>
Signed-off-by: Zhang Yanmin <yanmin.zhang@linux.intel.com>


---

--- linux-2.6.30-rc4/arch/x86/mm/numa_64.c	2009-05-05 09:20:05.000000000 +0800
+++ linux-2.6.30-rc4_memlessnode/arch/x86/mm/numa_64.c	2009-05-05 10:28:34.000000000 +0800
@@ -199,6 +199,10 @@ void __init setup_node_bootmem(int nodei
 	start_pfn = start >> PAGE_SHIFT;
 	last_pfn = end >> PAGE_SHIFT;
 
+	bootmap_pages = bootmem_bootmap_pages(last_pfn - start_pfn);
+	if (bootmap_pages == 0)
+		return;
+
 	node_data[nodeid] = early_node_mem(nodeid, start, end, pgdat_size,
 					   SMP_CACHE_BYTES);
 	if (node_data[nodeid] == NULL)
@@ -219,7 +223,6 @@ void __init setup_node_bootmem(int nodei
 	 * early_node_mem will get that with find_e820_area instead
 	 * of alloc_bootmem, that could clash with reserved range
 	 */
-	bootmap_pages = bootmem_bootmap_pages(last_pfn - start_pfn);
 	nid = phys_to_nid(nodedata_phys);
 	if (nid == nodeid)
 		bootmap_start = roundup(nodedata_phys + pgdat_size, PAGE_SIZE);
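
The guard added above depends on bootmem_bootmap_pages() returning zero for a
node whose range spans no pages. The following is a minimal user-space sketch
of that arithmetic, not the kernel code itself: the 4K page size, the helper
names and node_bootmap_pages() are assumptions made for illustration.

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Bytes for a bitmap with one bit per page, rounded up to a whole long. */
static unsigned long bootmap_bytes(unsigned long pages)
{
	unsigned long bytes = (pages + 7) / 8;
	unsigned long align = sizeof(long);

	return (bytes + align - 1) & ~(align - 1);
}

/* Whole pages needed to hold that bitmap. */
static unsigned long bootmap_pages(unsigned long pages)
{
	return (bootmap_bytes(pages) + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* Hypothetical wrapper: 0 means the [start, end) range spans no pages. */
static unsigned long node_bootmap_pages(unsigned long start, unsigned long end)
{
	assert(end >= start);
	return bootmap_pages((end >> PAGE_SHIFT) - (start >> PAGE_SHIFT));
}
```

With an empty range the result is 0, which is the condition the patch uses to
return before early_node_mem is called.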




* Re: [PATCH] Fix early panic issue on machines with memless node
  2009-05-05  3:15 [PATCH] Fix early panic issue on machines with memless node Zhang, Yanmin
@ 2009-05-05  3:32 ` David Rientjes
  2009-05-05  5:55   ` Zhang, Yanmin
  2009-05-05 16:36   ` Jack Steiner
  0 siblings, 2 replies; 12+ messages in thread
From: David Rientjes @ 2009-05-05  3:32 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Jack Steiner, alex.shi, LKML, Ingo Molnar, Andi Kleen


On Tue, 5 May 2009, Zhang, Yanmin wrote:

> Kernel 2.6.30-rc4 panic with boot parameter mem=2G on Nehalem machine.
> The machines has 2 nodes and every node has about 3G memory.
> 
> Alex Shi did a good bisect and located the bad patch.
> 
> commit dc098551918093901d8ac8936e9d1a1b891b56ed
> Author: Jack Steiner <steiner@sgi.com>
> Date:   Fri Apr 17 09:22:42 2009 -0500
> 
>     x86/uv: fix init of memory-less nodes
>     
>     Add support for nodes that have cpus but no memory.
>     The current code was failing to add these nodes
>     to the nodes_present_map.
>     
>     v2: Fixes case caught by David Rientjes - missed support
>         for the x2apic SRAT table.
>     
>     [ Impact: fix potential boot crash on memory-less UV nodes. ]
>     
>     Reported-by: David Rientjes <rientjes@google.com>
>     Signed-off-by: Jack Steiner <steiner@sgi.com>
>     LKML-Reference: <20090417142242.GA23743@sgi.com>
>     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> 
> 
> 
> With earlyprintk boot parameter, we captured below dump info.
> 
> <6>bootmem::alloc_bootmem_core nid=0 size=0 [0 pages] align=1000 goal=1000000 lim0
> PANIC: early exception 06 rip 10:ffffffff80a2fbe4 error 0 cr2 0
> Pid: 0, comm: swapper Not tainted 2.6.30-rc4-ymz #3
> Call Trace:                                        
>  [<ffffffff80a1a195>] ? early_idt_handler+0x55/0x68  
>  [<ffffffff80a2fbe4>] ? alloc_bootmem_core+0x91/0x2ae
>  [<ffffffff80a2fbdc>] ? alloc_bootmem_core+0x89/0x2ae     
>  [<ffffffff80a2fe74>] ? ___alloc_bootmem_nopanic+0x73/0xab
>  [<ffffffff80a2af73>] ? early_node_mem+0x54/0x78      
>  [<ffffffff80a2b0ed>] ? setup_node_bootmem+0x156/0x282
>  [<ffffffff80a2b880>] ? acpi_scan_nodes+0x207/0x303
>  [<ffffffff80a2b255>] ? initmem_init+0x3c/0x14c
>  [<ffffffff80a1e33b>] ? setup_arch+0x5ba/0x760       
>  [<ffffffff80a2e904>] ? cgroup_init_subsys+0xfc/0x105
>  [<ffffffff80a2ea5f>] ? cgroup_init_early+0x152/0x163
>  [<ffffffff80a1a915>] ? start_kernel+0x84/0x35e      
>  [<ffffffff80a1a37e>] ? x86_64_start_kernel+0xe5/0xeb
> RIP alloc_bootmem_core+0x91/0x2ae
> 
> Consider below call chain:
> acpi_scan_nodes =>
> 		setup_node_bootmem
> 			 (twice) => early_node_mem
> 
> At the beginning, acpi_scan_nodes filters out memless nodes by calling
> unparse_node. Patch dc098551918 adds the node back actually.
> acpi_scan_nodes has many comments around unparse_node.
> 
> Below patch fixes it with node memory checking. Another method is just
> to revert the bad patch.
> 
> David Rientjes, Jack Steiner,
> Would you check if below patch satisfy your original objective?
> 

Could you try this instead?


srat: do not register nodes beyond e820 map

The mem= option will truncate the memory map at a specified address so 
it's not possible to register nodes with memory beyond the e820 upper 
bound.

unparse_node() is only called when the node had memory associated with 
it, although with the mem= option it is no longer addressable.

Signed-off-by: David Rientjes <rientjes@google.com>
---
diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -361,6 +361,7 @@ static void __init unparse_node(int node)
 {
 	int i;
 	node_clear(node, nodes_parsed);
+	node_clear(node, cpu_nodes_parsed);
 	for (i = 0; i < MAX_LOCAL_APIC; i++) {
 		if (apicid_to_node[i] == node)
 			apicid_to_node[i] = NUMA_NO_NODE;
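
The effect of the one-line change can be sketched with plain bitmasks. This is
a simplified user-space model, not the srat_64.c code: struct srat_state, the
clear_cpu_mask flag and nodes_to_register() are hypothetical, and treating the
registered set as the union of the two masks is an assumption about how
cpu-only nodes get added back.

```c
#include <assert.h>

#define MAX_NODES 8

/* Hypothetical stand-ins for the kernel's nodemask variables. */
struct srat_state {
	unsigned int nodes_parsed;     /* nodes SRAT described memory for */
	unsigned int cpu_nodes_parsed; /* nodes SRAT described cpus for */
};

/* clear_cpu_mask = 0 models the old behaviour, 1 models the patched one. */
static void unparse_node(struct srat_state *s, int node, int clear_cpu_mask)
{
	assert(node >= 0 && node < MAX_NODES);
	s->nodes_parsed &= ~(1u << node);
	if (clear_cpu_mask)
		s->cpu_nodes_parsed &= ~(1u << node);
}

/* Assumed: memory nodes plus cpu-only nodes are the ones set up later. */
static unsigned int nodes_to_register(const struct srat_state *s)
{
	return s->nodes_parsed | s->cpu_nodes_parsed;
}
```

In this model, unparsing node 1 without clearing the cpu mask leaves node 1 in
the registered set, while clearing both masks drops it.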


* Re: [PATCH] Fix early panic issue on machines with memless node
  2009-05-05  3:32 ` David Rientjes
@ 2009-05-05  5:55   ` Zhang, Yanmin
  2009-05-05 16:36   ` Jack Steiner
  1 sibling, 0 replies; 12+ messages in thread
From: Zhang, Yanmin @ 2009-05-05  5:55 UTC (permalink / raw)
  To: David Rientjes; +Cc: Jack Steiner, alex.shi, LKML, Ingo Molnar, Andi Kleen

On Mon, 2009-05-04 at 20:32 -0700, David Rientjes wrote:
> On Tue, 5 May 2009, Zhang, Yanmin wrote:
> 
> > Kernel 2.6.30-rc4 panic with boot parameter mem=2G on Nehalem machine.
> > The machines has 2 nodes and every node has about 3G memory.
> > 
> > Alex Shi did a good bisect and located the bad patch.
> > 
> > commit dc098551918093901d8ac8936e9d1a1b891b56ed
> > Author: Jack Steiner <steiner@sgi.com>
> > Date:   Fri Apr 17 09:22:42 2009 -0500
> > 
> >     x86/uv: fix init of memory-less nodes
> >     
> >     Add support for nodes that have cpus but no memory.
> >     The current code was failing to add these nodes
> >     to the nodes_present_map.
> >     
> >     v2: Fixes case caught by David Rientjes - missed support
> >         for the x2apic SRAT table.
> >     
> >     [ Impact: fix potential boot crash on memory-less UV nodes. ]
> >     
> >     Reported-by: David Rientjes <rientjes@google.com>
> >     Signed-off-by: Jack Steiner <steiner@sgi.com>
> >     LKML-Reference: <20090417142242.GA23743@sgi.com>
> >     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> > 
> > 
> > 
> > With earlyprintk boot parameter, we captured below dump info.
> > 
> > <6>bootmem::alloc_bootmem_core nid=0 size=0 [0 pages] align=1000 goal=1000000 lim0
> > PANIC: early exception 06 rip 10:ffffffff80a2fbe4 error 0 cr2 0
> > Pid: 0, comm: swapper Not tainted 2.6.30-rc4-ymz #3
> > Call Trace:                                        
> >  [<ffffffff80a1a195>] ? early_idt_handler+0x55/0x68  
> >  [<ffffffff80a2fbe4>] ? alloc_bootmem_core+0x91/0x2ae
> >  [<ffffffff80a2fbdc>] ? alloc_bootmem_core+0x89/0x2ae     
> >  [<ffffffff80a2fe74>] ? ___alloc_bootmem_nopanic+0x73/0xab
> >  [<ffffffff80a2af73>] ? early_node_mem+0x54/0x78      
> >  [<ffffffff80a2b0ed>] ? setup_node_bootmem+0x156/0x282
> >  [<ffffffff80a2b880>] ? acpi_scan_nodes+0x207/0x303
> >  [<ffffffff80a2b255>] ? initmem_init+0x3c/0x14c
> >  [<ffffffff80a1e33b>] ? setup_arch+0x5ba/0x760       
> >  [<ffffffff80a2e904>] ? cgroup_init_subsys+0xfc/0x105
> >  [<ffffffff80a2ea5f>] ? cgroup_init_early+0x152/0x163
> >  [<ffffffff80a1a915>] ? start_kernel+0x84/0x35e      
> >  [<ffffffff80a1a37e>] ? x86_64_start_kernel+0xe5/0xeb
> > RIP alloc_bootmem_core+0x91/0x2ae
> > 
> > Consider below call chain:
> > acpi_scan_nodes =>
> > 		setup_node_bootmem
> > 			 (twice) => early_node_mem
> > 
> > At the beginning, acpi_scan_nodes filters out memless nodes by calling
> > unparse_node. Patch dc098551918 adds the node back actually.
> > acpi_scan_nodes has many comments around unparse_node.
> > 
> > Below patch fixes it with node memory checking. Another method is just
> > to revert the bad patch.
> > 
> > David Rientjes, Jack Steiner,
> > Would you check if below patch satisfy your original objective?
> > 
> 
> Could you try this instead?
> 
> 
> srat: do not register nodes beyond e820 map
> 
> The mem= option will truncate the memory map at a specified address so 
> it's not possible to register nodes with memory beyond the e820 upper 
> bound.
> 
> unparse_node() is only called when the node had memory associated with 
> it, although with the mem= option it is no longer addressable.
> 
> Signed-off-by: David Rientjes <rientjes@google.com>
Your patch does fix the hang issue.



> ---
> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> --- a/arch/x86/mm/srat_64.c
> +++ b/arch/x86/mm/srat_64.c
> @@ -361,6 +361,7 @@ static void __init unparse_node(int node)
>  {
>  	int i;
>  	node_clear(node, nodes_parsed);
> +	node_clear(node, cpu_nodes_parsed);
>  	for (i = 0; i < MAX_LOCAL_APIC; i++) {
>  		if (apicid_to_node[i] == node)
>  			apicid_to_node[i] = NUMA_NO_NODE;



* Re: [PATCH] Fix early panic issue on machines with memless node
  2009-05-05  3:32 ` David Rientjes
  2009-05-05  5:55   ` Zhang, Yanmin
@ 2009-05-05 16:36   ` Jack Steiner
  2009-05-05 19:50     ` [patch] srat: do not register nodes beyond e820 map David Rientjes
  2009-05-05 19:52     ` [PATCH] Fix early panic issue on machines with memless node David Rientjes
  1 sibling, 2 replies; 12+ messages in thread
From: Jack Steiner @ 2009-05-05 16:36 UTC (permalink / raw)
  To: David Rientjes; +Cc: Zhang, Yanmin, alex.shi, LKML, Ingo Molnar, Andi Kleen

On Mon, May 04, 2009 at 08:32:36PM -0700, David Rientjes wrote:
> On Tue, 5 May 2009, Zhang, Yanmin wrote:
> 
> > Kernel 2.6.30-rc4 panic with boot parameter mem=2G on Nehalem machine.
> > The machines has 2 nodes and every node has about 3G memory.
> > 
> > Alex Shi did a good bisect and located the bad patch.
> > 
> > commit dc098551918093901d8ac8936e9d1a1b891b56ed
> > Author: Jack Steiner <steiner@sgi.com>
> > Date:   Fri Apr 17 09:22:42 2009 -0500
> > 
> >     x86/uv: fix init of memory-less nodes
> >     
> >     Add support for nodes that have cpus but no memory.
> >     The current code was failing to add these nodes
> >     to the nodes_present_map.
> >     
> >     v2: Fixes case caught by David Rientjes - missed support
> >         for the x2apic SRAT table.
> >     
> >     [ Impact: fix potential boot crash on memory-less UV nodes. ]
> >     
> >     Reported-by: David Rientjes <rientjes@google.com>
> >     Signed-off-by: Jack Steiner <steiner@sgi.com>
> >     LKML-Reference: <20090417142242.GA23743@sgi.com>
> >     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> > 
> > 
> > 
> > With earlyprintk boot parameter, we captured below dump info.
> > 
> > <6>bootmem::alloc_bootmem_core nid=0 size=0 [0 pages] align=1000 goal=1000000 lim0
> > PANIC: early exception 06 rip 10:ffffffff80a2fbe4 error 0 cr2 0
> > Pid: 0, comm: swapper Not tainted 2.6.30-rc4-ymz #3
> > Call Trace:                                        
> >  [<ffffffff80a1a195>] ? early_idt_handler+0x55/0x68  
> >  [<ffffffff80a2fbe4>] ? alloc_bootmem_core+0x91/0x2ae
> >  [<ffffffff80a2fbdc>] ? alloc_bootmem_core+0x89/0x2ae     
> >  [<ffffffff80a2fe74>] ? ___alloc_bootmem_nopanic+0x73/0xab
> >  [<ffffffff80a2af73>] ? early_node_mem+0x54/0x78      
> >  [<ffffffff80a2b0ed>] ? setup_node_bootmem+0x156/0x282
> >  [<ffffffff80a2b880>] ? acpi_scan_nodes+0x207/0x303
> >  [<ffffffff80a2b255>] ? initmem_init+0x3c/0x14c
> >  [<ffffffff80a1e33b>] ? setup_arch+0x5ba/0x760       
> >  [<ffffffff80a2e904>] ? cgroup_init_subsys+0xfc/0x105
> >  [<ffffffff80a2ea5f>] ? cgroup_init_early+0x152/0x163
> >  [<ffffffff80a1a915>] ? start_kernel+0x84/0x35e      
> >  [<ffffffff80a1a37e>] ? x86_64_start_kernel+0xe5/0xeb
> > RIP alloc_bootmem_core+0x91/0x2ae
> > 
> > Consider below call chain:
> > acpi_scan_nodes =>
> > 		setup_node_bootmem
> > 			 (twice) => early_node_mem
> > 
> > At the beginning, acpi_scan_nodes filters out memless nodes by calling
> > unparse_node. Patch dc098551918 adds the node back actually.
> > acpi_scan_nodes has many comments around unparse_node.
> > 
> > Below patch fixes it with node memory checking. Another method is just
> > to revert the bad patch.
> > 
> > David Rientjes, Jack Steiner,
> > Would you check if below patch satisfy your original objective?
> > 
> 
> Could you try this instead?

I was able to duplicate your original problem. Your patch below solves the
problem. AFAICT, it causes no new regressions in the various configurations
that I'm testing. (I'll add "mem=2G" to the configs that I test.)

However, I see a new regression that was not present a couple of weeks ago.
Configurations that have nodes with cpus and no memory panic during
boot. This occurs both with and without your patch and is not related to "mem=".

I need to isolate the problem, but here is the stack trace:
	Pid: 0, comm: swapper Not tainted 2.6.30-rc4-next-20090505-medusa #12
	Call Trace:
	 [<ffffffff806b919e>] early_idt_handler+0x5e/0x71
	 [<ffffffff802920fe>] ? build_zonelists_node+0x4c/0x8d
	 [<ffffffff8029333f>] __build_all_zonelists+0x1ae/0x55a
	 [<ffffffff80293932>] build_all_zonelists+0x1b5/0x263
	 [<ffffffff806b9b6e>] start_kernel+0x17a/0x3c5
	 [<ffffffff806b9140>] ? early_idt_handler+0x0/0x71
	 [<ffffffff806b92a7>] x86_64_start_reservations+0xae/0xb2
	 [<ffffffff806b93fd>] x86_64_start_kernel+0x152/0x161



> 
> 
> srat: do not register nodes beyond e820 map
> 
> The mem= option will truncate the memory map at a specified address so 
> it's not possible to register nodes with memory beyond the e820 upper 
> bound.
> 
> unparse_node() is only called when the node had memory associated with 
> it, although with the mem= option it is no longer addressable.
> 
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> --- a/arch/x86/mm/srat_64.c
> +++ b/arch/x86/mm/srat_64.c
> @@ -361,6 +361,7 @@ static void __init unparse_node(int node)
>  {
>  	int i;
>  	node_clear(node, nodes_parsed);
> +	node_clear(node, cpu_nodes_parsed);
>  	for (i = 0; i < MAX_LOCAL_APIC; i++) {
>  		if (apicid_to_node[i] == node)
>  			apicid_to_node[i] = NUMA_NO_NODE;


* [patch] srat: do not register nodes beyond e820 map
  2009-05-05 16:36   ` Jack Steiner
@ 2009-05-05 19:50     ` David Rientjes
  2009-05-06  8:58       ` [tip:x86/urgent] x86, " tip-bot for David Rientjes
  2009-05-05 19:52     ` [PATCH] Fix early panic issue on machines with memless node David Rientjes
  1 sibling, 1 reply; 12+ messages in thread
From: David Rientjes @ 2009-05-05 19:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Zhang, Yanmin, alex.shi, linux-kernel, Andi Kleen, Jack Steiner

The mem= option will truncate the memory map at a specified address so 
it's not possible to register nodes with memory beyond the e820 upper 
bound.

unparse_node() is only called when the node had memory associated with 
it, although with the mem= option it is no longer addressable.

Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 Ingo, this fixes the problem reported by "Zhang, Yanmin" and is needed
 for 2.6.30-rc5.

 arch/x86/mm/srat_64.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -361,6 +361,7 @@ static void __init unparse_node(int node)
 {
 	int i;
 	node_clear(node, nodes_parsed);
+	node_clear(node, cpu_nodes_parsed);
 	for (i = 0; i < MAX_LOCAL_APIC; i++) {
 		if (apicid_to_node[i] == node)
 			apicid_to_node[i] = NUMA_NO_NODE;


* Re: [PATCH] Fix early panic issue on machines with memless node
  2009-05-05 16:36   ` Jack Steiner
  2009-05-05 19:50     ` [patch] srat: do not register nodes beyond e820 map David Rientjes
@ 2009-05-05 19:52     ` David Rientjes
  2009-05-05 20:27       ` Jack Steiner
  2009-05-06  8:50       ` Ingo Molnar
  1 sibling, 2 replies; 12+ messages in thread
From: David Rientjes @ 2009-05-05 19:52 UTC (permalink / raw)
  To: Jack Steiner; +Cc: Zhang, Yanmin, alex.shi, LKML, Ingo Molnar, Andi Kleen

On Tue, 5 May 2009, Jack Steiner wrote:

> I was able to duplicate your original problem. Your patch below solves the
> problem. AFAICT, it causes no new regressions to the various configurations
> that I'm testing. (I'll add the "mem=2G" to my configs that I test).
> 

Great, it would be helpful to catch these problems before 2.6.30 is 
released.  I've passed my patch along to Ingo.

> However, I see a new regression that was not present a couple of weeks ago.
> Configurations that have nodes with cpus and no memory panic during
> boot. This occurs both with and without your patch and is not related to "mem=".
> 
> I need to isolate the problem but here is the stack trace. :
> 	Pid: 0, comm: swapper Not tainted 2.6.30-rc4-next-20090505-medusa #12
> 	Call Trace:
> 	 [<ffffffff806b919e>] early_idt_handler+0x5e/0x71
> 	 [<ffffffff802920fe>] ? build_zonelists_node+0x4c/0x8d
> 	 [<ffffffff8029333f>] __build_all_zonelists+0x1ae/0x55a
> 	 [<ffffffff80293932>] build_all_zonelists+0x1b5/0x263
> 	 [<ffffffff806b9b6e>] start_kernel+0x17a/0x3c5
> 	 [<ffffffff806b9140>] ? early_idt_handler+0x0/0x71
> 	 [<ffffffff806b92a7>] x86_64_start_reservations+0xae/0xb2
> 	 [<ffffffff806b93fd>] x86_64_start_kernel+0x152/0x161
> 

Please post your .config since it apparently differs from x86_64 defconfig 
judging by my debugging symbols and also the full output of the panic.


* Re: [PATCH] Fix early panic issue on machines with memless node
  2009-05-05 19:52     ` [PATCH] Fix early panic issue on machines with memless node David Rientjes
@ 2009-05-05 20:27       ` Jack Steiner
  2009-05-05 20:41         ` David Rientjes
  2009-05-06  5:19         ` Zhang, Yanmin
  2009-05-06  8:50       ` Ingo Molnar
  1 sibling, 2 replies; 12+ messages in thread
From: Jack Steiner @ 2009-05-05 20:27 UTC (permalink / raw)
  To: David Rientjes; +Cc: Zhang, Yanmin, alex.shi, LKML, Ingo Molnar, Andi Kleen

On Tue, May 05, 2009 at 12:52:54PM -0700, David Rientjes wrote:
> On Tue, 5 May 2009, Jack Steiner wrote:
> 
> > I was able to duplicate your original problem. Your patch below solves the
> > problem. AFAICT, it causes no new regressions to the various configurations
> > that I'm testing. (I'll add the "mem=2G" to my configs that I test).
> > 
> 
> Great, it would be helpful to catch these problems before 2.6.30 is 
> released.  I've passed my patch along to Ingo.
> 
> > However, I see a new regression that was not present a couple of weeks ago.
> > Configurations that have nodes with cpus and no memory panic during
> > boot. This occurs both with and without your patch and is not related to "mem=".
> > 
> > I need to isolate the problem but here is the stack trace. :
> > 	Pid: 0, comm: swapper Not tainted 2.6.30-rc4-next-20090505-medusa #12
> > 	Call Trace:
> > 	 [<ffffffff806b919e>] early_idt_handler+0x5e/0x71
> > 	 [<ffffffff802920fe>] ? build_zonelists_node+0x4c/0x8d
> > 	 [<ffffffff8029333f>] __build_all_zonelists+0x1ae/0x55a
> > 	 [<ffffffff80293932>] build_all_zonelists+0x1b5/0x263
> > 	 [<ffffffff806b9b6e>] start_kernel+0x17a/0x3c5
> > 	 [<ffffffff806b9140>] ? early_idt_handler+0x0/0x71
> > 	 [<ffffffff806b92a7>] x86_64_start_reservations+0xae/0xb2
> > 	 [<ffffffff806b93fd>] x86_64_start_kernel+0x152/0x161
> > 
> 
> Please post your .config since it apparently differs from x86_64 defconfig 
> judging by my debugging symbols and also the full output of the panic.

I suspect I misled you when I mentioned "configurations". I did not mean
the .config file. I use a more-or-less standard .config file.

I do much of my testing on a system simulator. Using a simulator config file,
I specify the system configuration such as number of nodes, sockets per node,
cpus per socket, memory per socket, address map, boot options, etc. This
makes it easy to quickly test a lot of strange (but real) configurations.

The configuration above that is failing is a 2-socket Nehalem blade that has no
memory on socket 0. All memory is located on socket 1.  The panic is caused by a
null dereference of NODE_DATA(0).

Still looking....




--- jack



* Re: [PATCH] Fix early panic issue on machines with memless node
  2009-05-05 20:27       ` Jack Steiner
@ 2009-05-05 20:41         ` David Rientjes
  2009-05-06  5:19         ` Zhang, Yanmin
  1 sibling, 0 replies; 12+ messages in thread
From: David Rientjes @ 2009-05-05 20:41 UTC (permalink / raw)
  To: Jack Steiner; +Cc: Zhang, Yanmin, alex.shi, LKML, Ingo Molnar, Andi Kleen

On Tue, 5 May 2009, Jack Steiner wrote:

> I suspect I misled you when I mentioned "configurations". I did not mean
> the .config file. I use a more-or-less standard .config file.
> 

Your .config would still let me build a kernel with debugging symbols so 
that your offsets actually made sense to my gdb.

> I do much of my testing on a system simulator. Using a simulator config file,
> I specify the system configuration such as number of nodes, sockets per node,
> cpus per socket, memory per socket, address map, boot options, etc. This
> makes it easy to quickly test a lot of strange (but real) configurations.
> 
> The configuration above that is failing is a 2-socket Nehalem blade that has no
> memory on socket 0. All memory is located on socket 1.  The panic is caused by a
> null dereference of NODE_DATA(0).
> 

A NULL dereference of NODE_DATA(0) should never happen even with 
!CONFIG_NEED_MULTIPLE_NODES.  So when you say the panic is caused by that 
(and I'm speculating since all I've seen is a call trace and not the 
entire panic), I'm assuming it's because NODE_DATA(0)->node_zones + offset 
is NULL because node 0 has no memory?


* Re: [PATCH] Fix early panic issue on machines with memless node
  2009-05-05 20:27       ` Jack Steiner
  2009-05-05 20:41         ` David Rientjes
@ 2009-05-06  5:19         ` Zhang, Yanmin
  2009-05-06 14:38           ` Jack Steiner
  1 sibling, 1 reply; 12+ messages in thread
From: Zhang, Yanmin @ 2009-05-06  5:19 UTC (permalink / raw)
  To: Jack Steiner; +Cc: David Rientjes, alex.shi, LKML, Ingo Molnar, Andi Kleen

On Tue, 2009-05-05 at 15:27 -0500, Jack Steiner wrote:
> On Tue, May 05, 2009 at 12:52:54PM -0700, David Rientjes wrote:
> > On Tue, 5 May 2009, Jack Steiner wrote:
> > 
> > > I was able to duplicate your original problem. Your patch below solves the
> > > > problem. AFAICT, it causes no new regressions to the various configurations
> > > that I'm testing. (I'll add the "mem=2G" to my configs that I test).
> > > 
> > 
> > Great, it would be helpful to catch these problems before 2.6.30 is 
> > released.  I've passed my patch along to Ingo.
> > 
> > > However, I see a new regression that was not present a couple of weeks ago.
> > > Configurations that have nodes with cpus and no memory panic during
> > > boot. This occurs both with and without your patch and is not related to "mem=".
> > > 
> > > I need to isolate the problem but here is the stack trace. :
> > > 	Pid: 0, comm: swapper Not tainted 2.6.30-rc4-next-20090505-medusa #12
> > > 	Call Trace:
> > > 	 [<ffffffff806b919e>] early_idt_handler+0x5e/0x71
> > > 	 [<ffffffff802920fe>] ? build_zonelists_node+0x4c/0x8d
> > > 	 [<ffffffff8029333f>] __build_all_zonelists+0x1ae/0x55a
> > > 	 [<ffffffff80293932>] build_all_zonelists+0x1b5/0x263
> > > 	 [<ffffffff806b9b6e>] start_kernel+0x17a/0x3c5
> > > 	 [<ffffffff806b9140>] ? early_idt_handler+0x0/0x71
> > > 	 [<ffffffff806b92a7>] x86_64_start_reservations+0xae/0xb2
> > > 	 [<ffffffff806b93fd>] x86_64_start_kernel+0x152/0x161
> > > 
> > 
> > Please post your .config since it apparently differs from x86_64 defconfig 
> > judging by my debugging symbols and also the full output of the panic.
> 
> > I suspect I misled you when I mentioned "configurations". I did not mean
> the .config file. I use a more-or-less standard .config file.
> 
> I do much of my testing on a system simulator. Using a simulator config file,
> I specify the system configuration such as number of nodes, sockets per node,
> cpus per socket, memory per socket, address map, boot options, etc. This
> makes it easy to quickly test a lot of strange (but real) configurations.
> 
> > The configuration above that is failing is a 2-socket Nehalem blade that has no
> memory on socket 0. All memory is located on socket 1.  The panic is caused by a
> null dereference of NODE_DATA(0).
> 
> Still looking....
It seems that in function setup_node_bootmem, the check

        if (!end)
                return;

stops the initialization of node_data[nodeid]. Later, the kernel panics when
build_zonelists dereferences NODE_DATA(0).

Although a node is called memoryless, it mostly has small blocks of memory, so
function acpi_scan_nodes marks them offline. However, if the node info comes
from acpi_numa_processor_affinity_init, the node might have no memory at all,
and acpi_scan_nodes doesn't mark it offline.

The logic is confusing with patch dc09855191809. Could you revert it and retest?
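
The failure mode described above can be sketched in user space. This is only
an illustrative model under stated assumptions: struct pglist, setup_node()
and total_pages() are hypothetical names, and the NULL check in the walker is
there to show where the dereference would go wrong, not as a proposed fix.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 2
#define PAGE_SHIFT 12

/* Hypothetical, much reduced stand-in for the per-node data. */
struct pglist {
	unsigned long start_pfn;
	unsigned long last_pfn;
};

static struct pglist node_storage[MAX_NODES];
static struct pglist *node_data[MAX_NODES]; /* NULL until a node is set up */

static void setup_node(int nodeid, unsigned long start, unsigned long end)
{
	assert(nodeid >= 0 && nodeid < MAX_NODES);
	if (!end)
		return; /* early return: node_data[nodeid] stays NULL */
	node_storage[nodeid].start_pfn = start >> PAGE_SHIFT;
	node_storage[nodeid].last_pfn = end >> PAGE_SHIFT;
	node_data[nodeid] = &node_storage[nodeid];
}

/* A later walker must skip nodes that were never set up. */
static unsigned long total_pages(void)
{
	unsigned long sum = 0;

	for (int n = 0; n < MAX_NODES; n++) {
		if (node_data[n] == NULL)
			continue; /* without this check, node 0 would fault */
		sum += node_data[n]->last_pfn - node_data[n]->start_pfn;
	}
	return sum;
}
```

Calling setup_node() with end == 0 for node 0 leaves its slot NULL, so any
later code that assumes every listed node has data would dereference NULL.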




* Re: [PATCH] Fix early panic issue on machines with memless node
  2009-05-05 19:52     ` [PATCH] Fix early panic issue on machines with memless node David Rientjes
  2009-05-05 20:27       ` Jack Steiner
@ 2009-05-06  8:50       ` Ingo Molnar
  1 sibling, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2009-05-06  8:50 UTC (permalink / raw)
  To: David Rientjes; +Cc: Jack Steiner, Zhang, Yanmin, alex.shi, LKML, Andi Kleen


* David Rientjes <rientjes@google.com> wrote:

> On Tue, 5 May 2009, Jack Steiner wrote:
> 
> > I was able to duplicate your original problem. Your patch below solves the
> > problem. AFAICT, it causes no new regressions to the various configurations
> > that I'm testing. (I'll add the "mem=2G" to my configs that I test).
> > 
> 
> Great, it would be helpful to catch these problems before 2.6.30 
> is released.  I've passed my patch along to Ingo.

Good catch - i've queued it up in x86/urgent, thanks!

	Ingo


* [tip:x86/urgent] x86, srat: do not register nodes beyond e820 map
  2009-05-05 19:50     ` [patch] srat: do not register nodes beyond e820 map David Rientjes
@ 2009-05-06  8:58       ` tip-bot for David Rientjes
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for David Rientjes @ 2009-05-06  8:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, steiner, yanmin_zhang, tglx, rientjes, mingo

Commit-ID:  7eccf7b227b6d3b1745b937ce35efc9c27f9b0e5
Gitweb:     http://git.kernel.org/tip/7eccf7b227b6d3b1745b937ce35efc9c27f9b0e5
Author:     David Rientjes <rientjes@google.com>
AuthorDate: Tue, 5 May 2009 12:50:02 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Wed, 6 May 2009 10:49:07 +0200

x86, srat: do not register nodes beyond e820 map

The mem= option will truncate the memory map at a specified address so
it's not possible to register nodes with memory beyond the e820 upper
bound.

unparse_node() is only called when the node had memory associated with
it, although with the mem= option it is no longer addressable.

[ Impact: fix boot hang on certain (large) systems ]

Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <alpine.DEB.2.00.0905051248150.20021@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 arch/x86/mm/srat_64.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index 33c5fa5..0176595 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -361,6 +361,7 @@ static void __init unparse_node(int node)
 {
 	int i;
 	node_clear(node, nodes_parsed);
+	node_clear(node, cpu_nodes_parsed);
 	for (i = 0; i < MAX_LOCAL_APIC; i++) {
 		if (apicid_to_node[i] == node)
 			apicid_to_node[i] = NUMA_NO_NODE;


* Re: [PATCH] Fix early panic issue on machines with memless node
  2009-05-06  5:19         ` Zhang, Yanmin
@ 2009-05-06 14:38           ` Jack Steiner
  0 siblings, 0 replies; 12+ messages in thread
From: Jack Steiner @ 2009-05-06 14:38 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: David Rientjes, alex.shi, LKML, Ingo Molnar, Andi Kleen

On Wed, May 06, 2009 at 01:19:52PM +0800, Zhang, Yanmin wrote:
> On Tue, 2009-05-05 at 15:27 -0500, Jack Steiner wrote:
> > On Tue, May 05, 2009 at 12:52:54PM -0700, David Rientjes wrote:
> > > On Tue, 5 May 2009, Jack Steiner wrote:
> > > 
> > > > I was able to duplicate your original problem. Your patch below solves the
> > > > problem. AFAICT, it causes no new regressions to the various configurations
> > > > that I'm testing. (I'll add the "mem=2G" to my configs that I test).
> > > > 
> > > 
> > > Great, it would be helpful to catch these problems before 2.6.30 is 
> > > released.  I've passed my patch along to Ingo.
> > > 
> > > > However, I see a new regression that was not present a couple of weeks ago.
> > > > Configurations that have nodes with cpus and no memory panic during
> > > > boot. This occurs both with and without your patch and is not related to "mem=".
> > > > 
> > > > I need to isolate the problem but here is the stack trace:
> > > > 	Pid: 0, comm: swapper Not tainted 2.6.30-rc4-next-20090505-medusa #12
> > > > 	Call Trace:
> > > > 	 [<ffffffff806b919e>] early_idt_handler+0x5e/0x71
> > > > 	 [<ffffffff802920fe>] ? build_zonelists_node+0x4c/0x8d
> > > > 	 [<ffffffff8029333f>] __build_all_zonelists+0x1ae/0x55a
> > > > 	 [<ffffffff80293932>] build_all_zonelists+0x1b5/0x263
> > > > 	 [<ffffffff806b9b6e>] start_kernel+0x17a/0x3c5
> > > > 	 [<ffffffff806b9140>] ? early_idt_handler+0x0/0x71
> > > > 	 [<ffffffff806b92a7>] x86_64_start_reservations+0xae/0xb2
> > > > 	 [<ffffffff806b93fd>] x86_64_start_kernel+0x152/0x161
> > > > 
> > > 
> > > Please post your .config since it apparently differs from x86_64 defconfig 
> > > judging by my debugging symbols and also the full output of the panic.
> > 
> > I suspect I misled you when I mentioned "configurations". I did not mean
> > the .config file. I use a more-or-less standard .config file.
> > 
> > I do much of my testing on a system simulator. Using a simulator config file,
> > I specify the system configuration such as number of nodes, sockets per node,
> > cpus per socket, memory per socket, address map, boot options, etc. This
> > makes it easy to quickly test a lot of strange (but real) configurations.
> > 
> > The configuration above that is failing is a 2-socket Nehalem blade that has no
> > memory on socket 0. All memory is located on socket 1.  The panic is caused by a
> > null dereference of NODE_DATA(0).
> > 
> > Still looking....
> It seems in function setup_node_bootmem:
> 
>         if (!end)
>                 return;
> 
> stops the initialization of node_data[nodeid]. Later on, the kernel panics when
> build_zonelists dereferences NODE_DATA(0).
> 
> Although a node is memoryless, it mostly still has small blocks of memory, so function
> acpi_scan_nodes marks them offline. However, if the node info comes from
> acpi_numa_processor_affinity_init, the node might have no memory at all, and acpi_scan_nodes
> doesn't mark it offline.
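The failure mode described in the quoted text can be sketched in user space as follows. This is only an illustration under stated assumptions: struct pglist_data_sketch, node_data_sketch[], setup_node_sketch() and count_initialized_nodes() are hypothetical names, and the real setup_node_bootmem() does far more than this.

```c
#include <stddef.h>

#define MAX_NODES_SKETCH 2

/* Hypothetical stand-in for the per-node data behind NODE_DATA(nid). */
struct pglist_data_sketch {
	int node_id;
};

static struct pglist_data_sketch node_storage[MAX_NODES_SKETCH];
static struct pglist_data_sketch *node_data_sketch[MAX_NODES_SKETCH];

/* Mirrors the quoted early return: a node with end == 0 is never set up. */
static void setup_node_sketch(int nodeid, unsigned long start, unsigned long end)
{
	(void)start;
	if (!end)
		return;
	node_storage[nodeid].node_id = nodeid;
	node_data_sketch[nodeid] = &node_storage[nodeid];
}

/*
 * A later walker must skip nodes whose data was never set up;
 * dereferencing node_data_sketch[0] here would be the NULL access.
 */
static int count_initialized_nodes(void)
{
	int n = 0;

	for (int i = 0; i < MAX_NODES_SKETCH; i++)
		if (node_data_sketch[i] != NULL)
			n++;
	return n;
}
```

With node 0 given no memory (end == 0) and node 1 given 0 - 0x10000000, only node 1 ends up with per-node data, which matches the "Bootmem setup node 1" line in the log below.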

How _should_ a node without memory be treated??

For example, consider a Nehalem board with:
	- 2 sockets
	- all memory is located on socket 1 (socket 0 has no memory)

Our BIOS currently builds the SRAT with:
	- cpus in socket 0 in proximity domain 0
	- cpus in socket 1 in proximity domain 1
	- all memory is in proximity domain 1

Should this be a valid configuration? This is not a corner case that is
unlikely to occur. We actually have these types of configurations.


> 
> The logic is confusing with patch dc09855191809. Could you revert it to retest?
> 

I reverted dc09855191809.  AFAICT, the results are identical.


PROM>> 
PROM>> Fake PROM Config: cpus 2 (0 disabled), nodes 2, sockets 2, MB 256
PROM>>    socket 0, lsocket 0, blade 0, nasid 0, mem 0, cpumap 0x1, disabled 0x0
PROM>>    socket 1, lsocket 0, blade 1, nasid 4, mem 256, cpumap 0x1, disabled 0x0
PROM>> 
PROM>> SGI UV X86_64 FakeProm Version 1.00. Built 14:04:20 May  1 2009
PROM>>     Hub            : UV
PROM>>     Blades         : 2
PROM>>     Sockets        : 2
PROM>>     Cpus           : 2
PROM>>       disabled     : 0
PROM>>     Nodes          : 2
PROM>>       M value      : 37
PROM>>       N value      : 7
PROM>>     NodeMode       : 0 (node)
PROM>>     APICMode       : 1 (x2apic-UV)
PROM>>     efi_enabled    : 1
PROM>>     nasid0_present : 1
PROM>>     noded0_split   : 0
PROM>>     Total Memory   : 268435456 (256 MB)
PROM>>     Superpages     : 0 per-blade
PROM>>     Bootline       : root=/dev/hda2 init=/bin/bash console=ttyS0,38400n8 fprom lpj=10000 nohpet loglevel=8 iommu=off dma32_size=4096

PROM>> Memory Map:
PROM>>    blade 0, nasid 0: 0x0 - 0x6000: socket 1, pxm 1, RAM
PROM>>    blade 0, nasid 0: 0x6000 - 0xb0000: socket 1, pxm 1, CODE
PROM>>    blade 0, nasid 0: 0xb0000 - 0x200000 (2MB): socket 1, pxm 1, DATA
PROM>>    blade 1, nasid 4: 0x200000 (2MB) - 0x10000000 (256MB): socket 1, pxm 1, RAM
PROM>>    blade 0, nasid 0: 0x80000000 (2GB) - 0x90000000 (2GB + 256MB): socket 1, pxm 1, MMIO
PROM>>    blade 0, nasid 0: 0xf0000000 (3GB + 768MB) - 0xfc000000 (3GB + 960MB): socket 1, pxm 1, MMIO
PROM>>    blade 0, nasid 0: 0xfed1c000 (3GB + 1005MB + 114688) - 0xfed20000 (3GB + 1005MB + 131072): socket 1, pxm 1, MMIO
PROM>>    blade 0, nasid 0: 0xfff60000 (3GB + 1023MB + 393216) - 0xfff6c000 (3GB + 1023MB + 442368): socket 1, pxm 1, MMIO
PROM>>    blade 0, nasid 0: 0xfe000000000 (15TB + 896GB) - 0xfe018000000 (15TB + 896GB + 384MB): socket 0, pxm 255, MMR
PROM>> MMR BASE 0xfe000000000
PROM>> GRU BASE 0xdf800000000
PROM>> M-value: 37 (128GB)
PROM>> E820 Map: 0x00000000000c12d0
PROM>>      0: 0x0 - 0x6000: type 1
PROM>>      1: 0x6000 - 0x200000 (2MB): type 2
PROM>>      2: 0x200000 (2MB) - 0x10000000 (256MB): type 1
PROM>>      3: 0x80000000 (2GB) - 0x90000000 (2GB + 256MB): type 2
PROM>>      4: 0xf0000000 (3GB + 768MB) - 0xfc000000 (3GB + 960MB): type 2
PROM>>      5: 0xfed1c000 (3GB + 1005MB + 114688) - 0xfed20000 (3GB + 1005MB + 131072): type 2
PROM>>      6: 0xfff60000 (3GB + 1023MB + 393216) - 0xfff6c000 (3GB + 1023MB + 442368): type 2
PROM>>      7: 0xfe000000000 (15TB + 896GB) - 0xfe018000000 (15TB + 896GB + 384MB): type 2
PROM>> Build ACPI tables
PROM>>   RSDP at 0x00000000000e0200
PROM>>   XSDT at 0x00000000000e0240
PROM>>   DSDT at 0x00000000000e02a0
PROM>>   MADT at 0x00000000000e02e0 (0xa0)
PROM>>     sapic: cpu 0, socket 0, lcpu 0, proc_id 0x0, id 0x00, eid 0x00, apicid 0x0000, 
PROM>>     sapic: cpu 1, socket 1, lcpu 0, proc_id 0x1, id 0x00, eid 0x80, apicid 0x0080, 
PROM>>     io_apic: id 8, base 0, entries 24, prq 0, arb 0
PROM>>     io_apic: id 9, base 24, entries 24, prq 1, arb 9
PROM>>     lapic_nmi: acpi_id 0, flags 0x5, lint 1
PROM>>     lapic_nmi: acpi_id 1, flags 0x5, lint 1
PROM>>     int_src_ovr: bus 0, bus_irq 0, global_irq 2, flags 5
PROM>>     int_src_ovr: bus 0, bus_irq 9, global_irq 9, flags 13
PROM>>   SRAT at 0x00000000000e0380
PROM>>     Memory:
PROM>>       blade 0, soc 1: paddr 0x0 - 0xfff6c000 (3GB + 1023MB + 442368), pxm 1
PROM>>     Processor at 00000000000e03d8:
PROM>>       soc 0, lcpu 0: sapicid 0x0000, pxm 0
PROM>>       soc 1, lcpu 0: sapicid 0x0080, pxm 1
PROM>>   SLIT at 0x00000000000e05e0, dim 2
PROM>>       10  21
PROM>>       21  10
PROM>>   FADT at 0x00000000000e06a0
PROM>>   FACS at 0x00000000000e07a0
PROM>>   DMAR at 0x00000000000e0860
PROM>> Memmap (EFI):
PROM>>   0000000000000000 - 0000000000006000:       24 kb, RAM   (0x0 - 0x6000)
PROM>>   0000000000006000 - 00000000000b0000:      680 kb, CODE  (0x6000 - 0xb0000)
PROM>>   00000000000b0000 - 0000000000200000:     1344 kb, DATA  (0xb0000 - 0x200000 (2MB))
PROM>>   0000000000200000 - 0000000010000000:      254 MB, RAM   (0x200000 (2MB) - 0x10000000 (256MB))
PROM>>   0000000080000000 - 0000000090000000:      256 MB, MMIO  (0x80000000 (2GB) - 0x90000000 (2GB + 256MB))
PROM>>   00000000f0000000 - 00000000fc000000:      192 MB, MMIO  (0xf0000000 (3GB + 768MB) - 0xfc000000 (3GB + 960MB))
PROM>>   00000000fed1c000 - 00000000fed20000:       16 kb, MMIO  (0xfed1c000 (3GB + 1005MB + 114688) - 0xfed20000 (3GB + 1005MB + 131072))
PROM>>   00000000fff60000 - 00000000fff6c000:       48 kb, MMIO  (0xfff60000 (3GB + 1023MB + 393216) - 0xfff6c000 (3GB + 1023MB + 442368))
PROM>>   00000fe000000000 - 00000fe018000000:      384 MB, MMR   (0xfe000000000 (15TB + 896GB) - 0xfe018000000 (15TB + 896GB + 384MB))
PROM>> Total memory: 0x10000000 (268435456) bytes, 256 MB, 0 GB
PROM>> Set x2apic APICID: pcpu 0, val 0x0 -> 0x0
PROM>> init_local_mmrs: cpu 0, apicid 0x0000, nasid 0x0
PROM>> BAU GB: nasid 0, paddr 0x1e0000
PROM>> Set x2apic APICID: pcpu 1, val 0x0 -> 0x80
PROM>> init_local_mmrs: cpu 1, apicid 0x0080, nasid 0x4
PROM>> BAU GB: nasid 4, paddr 0x1e4000
<6>Initializing cgroup subsys cpuset
<6>Initializing cgroup subsys cpu
<5>Linux version 2.6.30-rc4-next-20090505-medusa (steiner@alcatraz.americas.sgi.com) (gcc version 4.2.4) #41 SMP Wed May 6 08:21:07 CDT 2009
<6>Command line: root=/dev/hda2 init=/bin/bash console=ttyS0,38400n8 fprom lpj=10000 nohpet loglevel=8 iommu=off dma32_size=4096
<6>KERNEL supported cpus:
<6>  Intel GenuineIntel
<6>  AMD AuthenticAMD
<6>  Centaur CentaurHauls
<6>BIOS-provided physical RAM map:
<6> BIOS-e820: 0000000000000000 - 0000000000006000 (usable)
<6> BIOS-e820: 0000000000006000 - 0000000000200000 (reserved)
<6> BIOS-e820: 0000000000200000 - 0000000010000000 (usable)
<6> BIOS-e820: 0000000080000000 - 0000000090000000 (reserved)
<6> BIOS-e820: 00000000f0000000 - 00000000fc000000 (reserved)
<6> BIOS-e820: 00000000fed1c000 - 00000000fed20000 (reserved)
<6> BIOS-e820: 00000000fff60000 - 00000000fff6c000 (reserved)
<6> BIOS-e820: 00000fe000000000 - 00000fe018000000 (reserved)
<6>EFI v1.00 by SGI 
<6> ACPI 2.0=0xe0200  UVsystab=0xe08c0 
<6>EFI: mem00: type=7, attr=0x8, range=[0x0000000000000000-0x0000000000006000) (0MB)
<6>EFI: mem01: type=5, attr=0x8000000000001000, range=[0x0000000000006000-0x00000000000b0000) (0MB)
<6>EFI: mem02: type=6, attr=0x8000000000000008, range=[0x00000000000b0000-0x0000000000200000) (1MB)
<6>EFI: mem03: type=7, attr=0x8, range=[0x0000000000200000-0x0000000010000000) (254MB)
<6>EFI: mem04: type=6, attr=0x8000000000000001, range=[0x0000000080000000-0x0000000090000000) (256MB)
<6>EFI: mem05: type=6, attr=0x8000000000000001, range=[0x00000000f0000000-0x00000000fc000000) (192MB)
<6>EFI: mem06: type=6, attr=0x8000000000000001, range=[0x00000000fed1c000-0x00000000fed20000) (0MB)
<6>EFI: mem07: type=6, attr=0x8000000000000001, range=[0x00000000fff60000-0x00000000fff6c000) (0MB)
<6>EFI: mem08: type=11, attr=0x8000000000000001, range=[0x00000fe000000000-0x00000fe018000000) (384MB)
<6>DMI not present or invalid.
<6>last_pfn = 0x10000 max_arch_pfn = 0x100000000
<7>MTRR default type: write-back
<7>MTRR fixed ranges enabled:
<7>  00000-FFFFF write-back
<7>MTRR variable ranges enabled:
<7>  0 base 0   F0000000 mask FFF F0000000 uncachable
<7>  1 base E0  00000000 mask FF0 00000000 uncachable
<7>  2 base F0  00000000 mask FF0 00000000 uncachable
<7>  3 base F00 00000000 mask FF0000000000 uncachable
<7>  4 disabled
<7>  5 disabled
<7>  6 disabled
<7>  7 disabled
<6>x86 PAT enabled: cpu 0, old 0x606060606060606, new 0x7010600070106
<6>x2apic enabled by BIOS, switching to x2apic ops
<6>init_memory_mapping: 0000000000000000-0000000010000000
<7> 0000000000 - 0010000000 page 2M
<7>kernel direct mapping tables up to 10000000 @ 936000-938000
<4>ACPI: RSDP 00000000000e0200 00024 (v02       )
<4>ACPI: XSDT 00000000000e0240 00054 (v01    SGI      UVX 00010001 FPRM 00000001)
<4>ACPI: APIC 00000000000e02e0 00086 (v01    SGI      UVX 00010001 FPRM 00000001)
<4>ACPI: SRAT 00000000000e0380 00078 (v01    SGI      UVX 00010001 FPRM 00000001)
<4>ACPI: SLIT 00000000000e05e0 00030 (v01    SGI      UVX 00010001 FPRM 00000001)
<4>ACPI: MCFG 00000000000e0640 0004C (v01    SGI      UVX 00010001 FPRM 00000001)
<4>ACPI: FACP 00000000000e06a0 000F4 (v03    SGI      UVX 00030001 FPRM 00000001)
<4>ACPI: DSDT 00000000000e02a0 00030 (v01    SGI      UVX 00010001 FPRM 00000001)
<4>ACPI: FACS 00000000000e07a0 00040
<4>ACPI: DMAR 00000000000e0860 0004C (v01    SGI      UVX 00010001 FPRM 00000001)
<7>ACPI: Local APIC address 0xfee00000
<6>Setting APIC routing to cluster x2apic.
<6>SRAT: PXM 0 -> APIC 0 -> Node 0
<6>SRAT: PXM 1 -> APIC 128 -> Node 1
<6>SRAT: Node 1 PXM 1 0-fff6c000
<7>NUMA: Using 63 for the hash shift.
<6>Bootmem setup node 1 0000000000000000-0000000010000000
<6>  NODE_DATA [0000000000935a80 - 0000000000969a7f]
<6>  bootmap [000000000096a000 -  000000000096bfff] pages 2
<6>(7 early reservations) ==> bootmem [0000000000 - 0010000000]
<6>  #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
<6>  #1 [0000006000 - 0000008000]       TRAMPOLINE ==> [0000006000 - 0000008000]
<6>  #2 [0000200000 - 0000935a5c]    TEXT DATA BSS ==> [0000200000 - 0000935a5c]
<6>  #3 [000009f000 - 00000e0900]    BIOS reserved ==> [000009f000 - 00000e0900]
<6>  #4 [00000e0a68 - 0000100000]    BIOS reserved ==> [00000e0a68 - 0000100000]
<6>  #5 [00000e0900 - 00000e0a68]       EFI memmap ==> [00000e0900 - 00000e0a68]
<6>  #6 [0000001000 - 0000001030]        ACPI SLIT ==> [0000001000 - 0000001030]
<7> [ffffe20000000000-ffffe200003fffff] PMD -> [ffff880001200000-ffff8800015fffff] on node 1
<4>Zone PFN ranges:
<4>  DMA      0x00000000 -> 0x00001000
<4>  DMA32    0x00001000 -> 0x00100000
<4>  Normal   0x00100000 -> 0x00100000
<4>Movable zone start PFN for each node
<4>early_node_map[2] active PFN ranges
<4>    1: 0x00000000 -> 0x00000006
<4>    1: 0x00000200 -> 0x00010000
<7>On node 1 totalpages: 65030
<7>  DMA zone: 56 pages used for memmap
<7>  DMA zone: 1944 pages reserved
<7>  DMA zone: 1590 pages, LIFO batch:0
<7>  DMA32 zone: 840 pages used for memmap
<7>  DMA32 zone: 60600 pages, LIFO batch:15
<6>ACPI: PM-Timer IO Port: 0x1008
<7>ACPI: Local APIC address 0xfee00000
<6>Setting APIC routing to cluster x2apic.
<6>ACPI: LSAPIC (acpi_id[0x00] lsapic_id[0x00] lsapic_eid[0x00] enabled)
<6>ACPI: LSAPIC (acpi_id[0x01] lsapic_id[0x00] lsapic_eid[0x80] enabled)
<6>ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
<6>ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
<6>ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
<6>IOAPIC[0]: apic_id 8, version 0, address 0xfec00000, GSI 0-23
<6>ACPI: IOAPIC (id[0x09] address[0xfec80000] gsi_base[24])
<6>IOAPIC[1]: apic_id 9, version 0, address 0xfec80000, GSI 24-24
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
<7>ACPI: IRQ0 used by override.
<7>ACPI: IRQ2 used by override.
<7>ACPI: IRQ9 used by override.
<6>Using ACPI (MADT) for SMP configuration information
<6>SMP: Allowing 2 CPUs, 0 hotplug CPUs
<7>nr_irqs_gsi: 25
<6>PM: Registered nosave memory: 0000000000006000 - 0000000000200000
<6>Allocating PCI resources starting at 18000000 (gap: 10000000:70000000)
<6>NR_CPUS:4096 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:2
<6>PERCPU: Embedded 26 pages at ffff880001005000, static data 76384 bytes
<4>Pid: 0, comm: swapper Not tainted 2.6.30-rc4-next-20090505-medusa #41
<4>Call Trace:
<4> [<ffffffff806b919e>] early_idt_handler+0x5e/0x71
<4> [<ffffffff802920e1>] ? build_zonelists_node+0x2f/0x70
<4> [<ffffffff80232241>] ? __node_distance+0x59/0x70
<4> [<ffffffff80293322>] __build_all_zonelists+0x1ae/0x55a
<4> [<ffffffff80293915>] build_all_zonelists+0x1b5/0x263
<4> [<ffffffff806b9b6e>] start_kernel+0x17a/0x3c5
<4> [<ffffffff806b9140>] ? early_idt_handler+0x0/0x71
<4> [<ffffffff806b92a7>] x86_64_start_reservations+0xae/0xb2
<4> [<ffffffff806b93fd>] x86_64_start_kernel+0x152/0x161
<4>RIP build_zonelists_node+0x2f/0x70

Thread overview: 12+ messages
2009-05-05  3:15 [PATCH] Fix early panic issue on machines with memless node Zhang, Yanmin
2009-05-05  3:32 ` David Rientjes
2009-05-05  5:55   ` Zhang, Yanmin
2009-05-05 16:36   ` Jack Steiner
2009-05-05 19:50     ` [patch] srat: do not register nodes beyond e820 map David Rientjes
2009-05-06  8:58       ` [tip:x86/urgent] x86, " tip-bot for David Rientjes
2009-05-05 19:52     ` [PATCH] Fix early panic issue on machines with memless node David Rientjes
2009-05-05 20:27       ` Jack Steiner
2009-05-05 20:41         ` David Rientjes
2009-05-06  5:19         ` Zhang, Yanmin
2009-05-06 14:38           ` Jack Steiner
2009-05-06  8:50       ` Ingo Molnar