linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86/numa: kernel stack corruption fix
@ 2015-04-01  4:53 Dave Young
  2015-04-01  5:11 ` Dave Young
  2015-04-02 19:36 ` Yasuaki Ishimatsu
  0 siblings, 2 replies; 19+ messages in thread
From: Dave Young @ 2015-04-01  4:53 UTC (permalink / raw)
  To: x86, linux-kernel; +Cc: tglx, dyoung, bhe, mingo, hpa, akpm

I got the kernel panic below during a kdump test on a ThinkPad T420 laptop:

[    0.000000] No NUMA configuration found                                      
[    0.000000] Faking a node at [mem 0x0000000000000000-0x0000000037ba4fff]     
[    0.000000] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81d21910
[    0.000000]                                                                  
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44           
[    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/05/2013
[    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67ce8 ffffffff817c2a26
[    0.000000]  0000000000000000 ffffffff81a61c90 ffffffff81b67d68 ffffffff817bc8d2
[    0.000000]  0000000000000010 ffffffff81b67d78 ffffffff81b67d18 c70296ddd809e4f6
[    0.000000] Call Trace:                                                      
[    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
[    0.000000]  [<ffffffff817bc8d2>] panic+0xd0/0x204                           
[    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
[    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
[    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
[    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
[    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
[    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
[    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
[    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
[    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
[    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
[    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
[    0.000000] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81d21910
[    0.000000]
PANIC: early exception 0d rip 10:ffffffff8105d2a6 error 7eb cr2 ffff8800371dd000
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44
[    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/05/2013
[    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67c60 ffffffff817c2a26
[    0.000000]  0000000000000096 ffffffff81a61c90 ffffffff81b67d68 fffffff000000084 0000000000000a0d 0000000000000a00
[    0.000000] Call Trace:                                                      
[    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
[    0.000000]  [<ffffffff81d051b0>] early_idt_handler+0x90/0xb7                
[    0.000000]  [<ffffffff8105d2a6>] ? native_irq_enable+0x6/0x10               
[    0.000000]  [<ffffffff817bc9c5>] ? panic+0x1c3/0x204                        
[    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
[    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
[    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
[    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
[    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
[    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
[    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
[    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
[    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
[    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
[    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
[    0.000000] RIP 0x46                                                         

This is caused by writing past the end of the NUMA node mask bitmap.

numa_clear_kernel_node_hotplug() sets the node id of every reserved region in
a nodemask bitmap. It iterates over all reserved regions and assumes that each
region has a valid nid. That is not true, because there is an exception for
the graphics memory quirks; see trim_snb_memory() in arch/x86/kernel/setup.c.

The bug is easy to reproduce in a kdump kernel, because the kdump kernel uses
pre-reserved memory instead of the whole memory, while kexec also passes the
other reserved memory ranges to the second kernel. In my test:
kdump kernel ram 0x2d000000 - 0x37bfffff
One of the reserved regions: 0x40000000 - 0x40100000

The reserved region above includes 0x40004000, a page excluded in
trim_snb_memory(). The nid of this memblock reserved region is never set, so
it keeps the default value MAX_NUMNODES. Later, node_set() sets bit
MAX_NUMNODES in the nodemask bitmap, which corrupts the stack.

Fix this by adding a check: do not call node_set() when nid is MAX_NUMNODES.

Signed-off-by: Dave Young <dyoung@redhat.com>
---
 arch/x86/mm/numa.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/mm/numa.c
+++ linux/arch/x86/mm/numa.c
@@ -484,7 +484,8 @@ static void __init numa_clear_kernel_nod
 
 	/* Mark all kernel nodes. */
 	for_each_memblock(reserved, r)
-		node_set(r->nid, numa_kernel_nodes);
+		if (r->nid != MAX_NUMNODES)
+			node_set(r->nid, numa_kernel_nodes);
 
 	/* Clear MEMBLOCK_HOTPLUG flag for memory in kernel nodes. */
 	for (i = 0; i < numa_meminfo.nr_blks; i++) {
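
The overflow described in the changelog can be illustrated with a small
user-space sketch. It is not kernel code: MAX_NUMNODES is given an assumed
value of 64, struct nodemask is a simplified stand-in for nodemask_t, and
word_index(), nid_overflows_mask() and node_set_checked() are hypothetical
helpers. The guard here rejects any out-of-range nid, which is stricter than
the equality test in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed value for illustration; the kernel derives it from its config. */
#define MAX_NUMNODES 64

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Simplified stand-in for the kernel's nodemask_t. */
struct nodemask {
	unsigned long bits[BITS_TO_LONGS(MAX_NUMNODES)];
};

/* Index of the array word that holds bit 'nid'. */
static size_t word_index(int nid)
{
	return (size_t)nid / BITS_PER_LONG;
}

/* True if setting bit 'nid' would touch memory outside mask->bits[]. */
static bool nid_overflows_mask(int nid)
{
	return nid < 0 || word_index(nid) >= BITS_TO_LONGS(MAX_NUMNODES);
}

/*
 * Guarded bit set: skip a region whose nid was never assigned.
 * Returns true if the bit was set, false if the nid was rejected.
 */
static bool node_set_checked(int nid, struct nodemask *mask)
{
	assert(mask != NULL);
	if (nid_overflows_mask(nid))
		return false;
	mask->bits[word_index(nid)] |= 1UL << ((size_t)nid % BITS_PER_LONG);
	return true;
}
```

With these definitions, nid_overflows_mask(MAX_NUMNODES) is true: bit
MAX_NUMNODES belongs to the word just past the end of bits[], which for a
mask on the stack may be where the stack-protector canary sits.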


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-01  4:53 [PATCH] x86/numa: kernel stack corruption fix Dave Young
@ 2015-04-01  5:11 ` Dave Young
  2015-04-01  7:27   ` Xishi Qiu
  2015-04-02  1:51   ` Xishi Qiu
  2015-04-02 19:36 ` Yasuaki Ishimatsu
  1 sibling, 2 replies; 19+ messages in thread
From: Dave Young @ 2015-04-01  5:11 UTC (permalink / raw)
  To: x86, linux-kernel; +Cc: tglx, bhe, mingo, hpa, akpm, qiuxishi

Ccing Xishi Qiu, who wrote the numa_clear_kernel_node_hotplug code.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-01  5:11 ` Dave Young
@ 2015-04-01  7:27   ` Xishi Qiu
  2015-04-01  7:41     ` Dave Young
  2015-04-02  1:51   ` Xishi Qiu
  1 sibling, 1 reply; 19+ messages in thread
From: Xishi Qiu @ 2015-04-01  7:27 UTC (permalink / raw)
  To: Dave Young; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm

On 2015/4/1 13:11, Dave Young wrote:

> Ccing Xishi Qiu who wrote the clear_kernel_node_hotplug code.
> 
> On 04/01/15 at 12:53pm, Dave Young wrote:
>> This is caused by writing past the end of the NUMA node mask bitmap.
>>
>> numa_clear_kernel_node_hotplug() sets the node id of every reserved region in
>> a nodemask bitmap. It iterates over all reserved regions and assumes that each
>> region has a valid nid. That is not true, because there is an exception for
>> the graphics memory quirks; see trim_snb_memory() in arch/x86/kernel/setup.c.
>>
>> The bug is easy to reproduce in a kdump kernel, because the kdump kernel uses
>> pre-reserved memory instead of the whole memory, while kexec also passes the
>> other reserved memory ranges to the second kernel. In my test:
>> kdump kernel ram 0x2d000000 - 0x37bfffff
>> One of the reserved regions: 0x40000000 - 0x40100000
>>
>> The reserved region above includes 0x40004000, a page excluded in
>> trim_snb_memory(). The nid of this memblock reserved region is never set, so
>> it keeps the default value MAX_NUMNODES. Later, node_set() sets bit
>> MAX_NUMNODES in the nodemask bitmap, which corrupts the stack.
>>

Hi Dave,

Does this mean that the region 0x40000000 - 0x40100000 is reserved first and
the kdump kernel is booted afterwards, so this region is not included in
"numa_meminfo", and the nid of memblock.reserved (0x40004000) is still
MAX_NUMNODES from trim_snb_memory()?

numa_clear_kernel_node_hotplug
{
	...
	for (i = 0; i < numa_meminfo.nr_blks; i++) {
		struct numa_memblk *mb = &numa_meminfo.blk[i];

		memblock_set_node(mb->start, mb->end - mb->start,
				  &memblock.reserved, mb->nid);  // this will not reset 0x40004000's node, right?
	}
	...
}
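
The range reasoning behind this question can be sketched in user space with
the addresses quoted in the thread. This is only an illustration under stated
assumptions: ranges are half-open [start, end), 0x40100000 is treated as an
exclusive end, the node block is taken from the "Faking a node at" line of the
log, and struct range, ranges_overlap() and range_contains() are hypothetical
names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open physical range [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Node block from the log: "Faking a node at [mem 0x0-0x37ba4fff]". */
static const struct range fake_node = { 0x0ULL, 0x37ba5000ULL };

/* Reserved region from the test; the end is assumed to be exclusive. */
static const struct range snb_reserved = { 0x40000000ULL, 0x40100000ULL };

/* True if the two ranges share at least one byte. */
static bool ranges_overlap(struct range a, struct range b)
{
	return a.start < b.end && b.start < a.end;
}

/* True if 'addr' lies inside 'r'. */
static bool range_contains(struct range r, uint64_t addr)
{
	return addr >= r.start && addr < r.end;
}

/* Sanity checks for the addresses quoted in the thread. */
static void check_thread_example(void)
{
	assert(!ranges_overlap(fake_node, snb_reserved));
	assert(range_contains(snb_reserved, 0x40004000ULL));
}
```

Because the two ranges do not overlap, a node assignment that is limited to
the node block would not reach the reserved region that contains 0x40004000.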

Thanks
Xishi Qiu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-01  7:27   ` Xishi Qiu
@ 2015-04-01  7:41     ` Dave Young
  2015-04-01  8:21       ` Xishi Qiu
  2015-04-02 19:15       ` Yasuaki Ishimatsu
  0 siblings, 2 replies; 19+ messages in thread
From: Dave Young @ 2015-04-01  7:41 UTC (permalink / raw)
  To: Xishi Qiu; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm

On 04/01/15 at 03:27pm, Xishi Qiu wrote:
> Hi Dave,
> 
> Does this mean that the region 0x40000000 - 0x40100000 is reserved first and
> the kdump kernel is booted afterwards, so this region is not included in
> "numa_meminfo", and the nid of memblock.reserved (0x40004000) is still
> MAX_NUMNODES from trim_snb_memory()?

Right. By the way, I booted the kdump kernel with numa=off to save memory.

I suspect it can also be reproduced with mem=XYZ on a normal kernel.

> 
> numa_clear_kernel_node_hotplug
> {
> 	...
> 	for (i = 0; i < numa_meminfo.nr_blks; i++) {
> 		struct numa_memblk *mb = &numa_meminfo.blk[i];
> 
> 		memblock_set_node(mb->start, mb->end - mb->start,
> 				  &memblock.reserved, mb->nid);  // this will not reset 0x40004000's node, right?
> 	}
> 	...
> }
> 
> Thanks
> Xishi Qiu
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-01  7:41     ` Dave Young
@ 2015-04-01  8:21       ` Xishi Qiu
  2015-04-01  8:34         ` Xishi Qiu
  2015-04-02 19:15       ` Yasuaki Ishimatsu
  1 sibling, 1 reply; 19+ messages in thread
From: Xishi Qiu @ 2015-04-01  8:21 UTC (permalink / raw)
  To: Dave Young; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm, Tang Chen

On 2015/4/1 15:41, Dave Young wrote:

> On 04/01/15 at 03:27pm, Xishi Qiu wrote:
>> On 2015/4/1 13:11, Dave Young wrote:
>>
>>> Ccing Xishi Qiu who wrote the clear_kernel_node_hotplug code.
>>>
>>> On 04/01/15 at 12:53pm, Dave Young wrote:
>>>> I got below kernel panic during kdump test on Thinkpad T420 laptop:
>>>>
>>>> This is caused by writing over end of numa mask bitmap.
>>>>
>>>> numa_clear_kernel_node try to set node id in a mask bitmap, it iterating all
>>>> reserved region and assume every regions have valid nid. It is not true because
>>>> There's an exception for graphic memory quirks. see function trim_snb_memory
>>>> in arch/x86/kernel/setup.c
>>>>
>>>> It is easily to reproduce the bug in kdump kernel because kdump kernel use
>>>> prereserved memory instead of whole memory, but kexec pass other reserved memory
>>>> ranges to 2nd kernel as well. like below in my test:
>>>> kdump kernel ram 0x2d000000 - 0x37bfffff
>>>> One of the reserved regions: 0x40000000 - 0x40100000
>>>>
>>>> The above reserved region includes 0x40004000, a page excluded in
>>>> trim_snb_memory. For this memblock reserved region the nid is not set it is
>>>> still default value MAX_NUMNODES. later node_set callback will set bit
>>>> MAX_NUMNODES in nodemask bitmap thus stack corruption happen. 
>>>>
>>
>> Hi Dave,
>>
>> Is it means, first reserved region 0x40000000 - 0x40100000, then boot the kdump
>> kernel, so this region is not include in "numa_meminfo", and memblock.reserved
>> (0x40004000) is still MAX_NUMNODES from trim_snb_memory().
> 
> Right, btw, I booted kdump kernel with numa=off for saving memory.
> 
> I suspect it will also be reproduced with mem=XYZ with normal kernel.
> 

Cc Tang Chen; numa_clear_kernel_node_hotplug() was originally written by him.

Hi Dave,
I tested this and found that the kdump kernel's "numa_meminfo" is the same as the
first kernel's. I did not set "numa=off" in the kdump kernel; maybe that is what
makes "numa_meminfo" differ in your case.

Thanks,
Xishi Qiu

>>
>> numa_clear_kernel_node_hotplug
>> {
>> 	...
>> 	for (i = 0; i < numa_meminfo.nr_blks; i++) {
>> 		struct numa_memblk *mb = &numa_meminfo.blk[i];
>>
>> 		memblock_set_node(mb->start, mb->end - mb->start,
>> 				  &memblock.reserved, mb->nid);  // this will not reset 0x40004000's node, right?
>> 	}
>> 	...
>> }
>>
>> Thanks
>> Xishi Qiu
>>





* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-01  8:21       ` Xishi Qiu
@ 2015-04-01  8:34         ` Xishi Qiu
  2015-04-01  9:17           ` Dave Young
  0 siblings, 1 reply; 19+ messages in thread
From: Xishi Qiu @ 2015-04-01  8:34 UTC (permalink / raw)
  To: Dave Young; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm, Tang Chen

On 2015/4/1 16:21, Xishi Qiu wrote:

> On 2015/4/1 15:41, Dave Young wrote:
> 
>> On 04/01/15 at 03:27pm, Xishi Qiu wrote:
>>> On 2015/4/1 13:11, Dave Young wrote:
>>>
>>>> Ccing Xishi Qiu who wrote the clear_kernel_node_hotplug code.
>>>>
>>>> On 04/01/15 at 12:53pm, Dave Young wrote:
>>>>> I got below kernel panic during kdump test on Thinkpad T420 laptop:
>>>>>
>>>>> This is caused by writing over end of numa mask bitmap.
>>>>>
>>>>> numa_clear_kernel_node try to set node id in a mask bitmap, it iterating all
>>>>> reserved region and assume every regions have valid nid. It is not true because
>>>>> There's an exception for graphic memory quirks. see function trim_snb_memory
>>>>> in arch/x86/kernel/setup.c
>>>>>
>>>>> It is easily to reproduce the bug in kdump kernel because kdump kernel use
>>>>> prereserved memory instead of whole memory, but kexec pass other reserved memory
>>>>> ranges to 2nd kernel as well. like below in my test:
>>>>> kdump kernel ram 0x2d000000 - 0x37bfffff
>>>>> One of the reserved regions: 0x40000000 - 0x40100000
>>>>>
>>>>> The above reserved region includes 0x40004000, a page excluded in
>>>>> trim_snb_memory. For this memblock reserved region the nid is not set it is
>>>>> still default value MAX_NUMNODES. later node_set callback will set bit
>>>>> MAX_NUMNODES in nodemask bitmap thus stack corruption happen. 
>>>>>
>>>
>>> Hi Dave,
>>>
>>> Is it means, first reserved region 0x40000000 - 0x40100000, then boot the kdump
>>> kernel, so this region is not include in "numa_meminfo", and memblock.reserved
>>> (0x40004000) is still MAX_NUMNODES from trim_snb_memory().
>>
>> Right, btw, I booted kdump kernel with numa=off for saving memory.
>>
>> I suspect it will also be reproduced with mem=XYZ with normal kernel.
>>
> 
> cc Tang Chen, numa_clear_kernel_node_hotplug() is original written by him.
> 
> Hi Dave,
> I tested the problem, and find the kdump's "numa_meminfo" is the same as the first
> kernel. I did not set "numa=off" in kdump kernel, maybe this will lead to the
> difference of "numa_meminfo"
>

Hi Dave,

I found the reason: with "numa=off", dummy_numa_init() calls
numa_add_memblk(0, 0, PFN_PHYS(max_pfn)), and this is what makes "numa_meminfo"
differ.

However, we should still fix the bug for the "numa=off" case.
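
For reference, a small userspace sketch of why nid == MAX_NUMNODES is a problem
and what the proposed check does. This is an illustration under assumptions, not
the kernel's nodemask implementation: struct nodemask_sketch and the helper
names are hypothetical, and MAX_NUMNODES is fixed at 64 here.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NUMNODES 64	/* assumed value for this sketch only */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS ((MAX_NUMNODES + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Stand-in for nodemask_t: valid bits are 0..MAX_NUMNODES-1. */
struct nodemask_sketch {
	unsigned long bits[MASK_LONGS];
};

/* Index of the word an unchecked bit-set would touch for @nid. */
static size_t word_for_nid(int nid)
{
	return (size_t)nid / BITS_PER_LONG;
}

/* True if setting bit @nid would write past the end of the bitmap. */
static bool nid_overflows_mask(int nid)
{
	return nid < 0 || word_for_nid(nid) >= MASK_LONGS;
}

/*
 * Mirrors the patched loop: regions whose nid is still MAX_NUMNODES are
 * skipped. Returns how many regions were skipped.
 */
static int mark_kernel_nodes(struct nodemask_sketch *mask,
			     const int *region_nids, size_t nr_regions)
{
	int skipped = 0;

	memset(mask, 0, sizeof(*mask));
	for (size_t i = 0; i < nr_regions; i++) {
		int nid = region_nids[i];

		/* Sketch precondition: nid is a valid node or the default. */
		assert(nid >= 0 && nid <= MAX_NUMNODES);
		if (nid == MAX_NUMNODES) {
			skipped++;
			continue;
		}
		mask->bits[word_for_nid(nid)] |=
			1UL << ((size_t)nid % BITS_PER_LONG);
	}
	return skipped;
}
```

In this sketch nid_overflows_mask(MAX_NUMNODES) is true, so without the check
the write would land in whatever sits after the on-stack mask. That would fit
the stack-protector report.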


Thanks,
Xishi Qiu




* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-01  8:34         ` Xishi Qiu
@ 2015-04-01  9:17           ` Dave Young
  2015-04-01  9:33             ` Xishi Qiu
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Young @ 2015-04-01  9:17 UTC (permalink / raw)
  To: Xishi Qiu; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm, Tang Chen

On 04/01/15 at 04:34pm, Xishi Qiu wrote:
> On 2015/4/1 16:21, Xishi Qiu wrote:
> 
> > On 2015/4/1 15:41, Dave Young wrote:
> > 
> >> On 04/01/15 at 03:27pm, Xishi Qiu wrote:
> >>> On 2015/4/1 13:11, Dave Young wrote:
> >>>
> >>>> Ccing Xishi Qiu who wrote the clear_kernel_node_hotplug code.
> >>>>
> >>>> On 04/01/15 at 12:53pm, Dave Young wrote:
> >>>>> I got below kernel panic during kdump test on Thinkpad T420 laptop:
> >>>>>
> >>>>> This is caused by writing over end of numa mask bitmap.
> >>>>>
> >>>>> numa_clear_kernel_node try to set node id in a mask bitmap, it iterating all
> >>>>> reserved region and assume every regions have valid nid. It is not true because
> >>>>> There's an exception for graphic memory quirks. see function trim_snb_memory
> >>>>> in arch/x86/kernel/setup.c
> >>>>>
> >>>>> It is easily to reproduce the bug in kdump kernel because kdump kernel use
> >>>>> prereserved memory instead of whole memory, but kexec pass other reserved memory
> >>>>> ranges to 2nd kernel as well. like below in my test:
> >>>>> kdump kernel ram 0x2d000000 - 0x37bfffff
> >>>>> One of the reserved regions: 0x40000000 - 0x40100000
> >>>>>
> >>>>> The above reserved region includes 0x40004000, a page excluded in
> >>>>> trim_snb_memory. For this memblock reserved region the nid is not set it is
> >>>>> still default value MAX_NUMNODES. later node_set callback will set bit
> >>>>> MAX_NUMNODES in nodemask bitmap thus stack corruption happen. 
> >>>>>
> >>>
> >>> Hi Dave,
> >>>
> >>> Is it means, first reserved region 0x40000000 - 0x40100000, then boot the kdump
> >>> kernel, so this region is not include in "numa_meminfo", and memblock.reserved
> >>> (0x40004000) is still MAX_NUMNODES from trim_snb_memory().
> >>
> >> Right, btw, I booted kdump kernel with numa=off for saving memory.
> >>
> >> I suspect it will also be reproduced with mem=XYZ with normal kernel.
> >>
> > 
> > cc Tang Chen, numa_clear_kernel_node_hotplug() is original written by him.
> > 
> > Hi Dave,
> > I tested the problem, and find the kdump's "numa_meminfo" is the same as the first
> > kernel. I did not set "numa=off" in kdump kernel, maybe this will lead to the
> > difference of "numa_meminfo"
> >
> 
> Hi Dave,
> 
> I find the reason, it's "dummy_numa_init() -> numa_add_memblk(0, 0, PFN_PHYS(max_pfn));",
> this lead to the difference of "numa_meminfo" when set "numa=off".
> 
> However we should fix the bug when set "numa=off".

So do you mean max_pfn should include the non-system-RAM region at the end?

In a layout like the one below, I believe max_pfn is currently set to the end of
system RAM:
[what ever] [ system ram ] [ bios reserved region ]

Thanks
Dave


* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-01  9:17           ` Dave Young
@ 2015-04-01  9:33             ` Xishi Qiu
  0 siblings, 0 replies; 19+ messages in thread
From: Xishi Qiu @ 2015-04-01  9:33 UTC (permalink / raw)
  To: Dave Young; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm, Tang Chen

On 2015/4/1 17:17, Dave Young wrote:

> On 04/01/15 at 04:34pm, Xishi Qiu wrote:
>> On 2015/4/1 16:21, Xishi Qiu wrote:
>>
>>> On 2015/4/1 15:41, Dave Young wrote:
>>>
>>>> On 04/01/15 at 03:27pm, Xishi Qiu wrote:
>>>>> On 2015/4/1 13:11, Dave Young wrote:
>>>>>
>>>>>> Ccing Xishi Qiu who wrote the clear_kernel_node_hotplug code.
>>>>>>
>>>>>> On 04/01/15 at 12:53pm, Dave Young wrote:
>>>>>>> I got below kernel panic during kdump test on Thinkpad T420 laptop:
>>>>>>>
>>>>>>> This is caused by writing past the end of the NUMA node mask bitmap.
>>>>>>>
>>>>>>> numa_clear_kernel_node_hotplug() tries to set node ids in a mask bitmap. It
>>>>>>> iterates over all reserved regions and assumes every region has a valid nid.
>>>>>>> That is not true, because there is an exception for the graphics memory quirks;
>>>>>>> see trim_snb_memory() in arch/x86/kernel/setup.c.
>>>>>>>
>>>>>>> The bug is easy to reproduce in a kdump kernel, because the kdump kernel uses
>>>>>>> pre-reserved memory instead of the whole memory, while kexec also passes the
>>>>>>> other reserved memory ranges to the second kernel. For example, in my test:
>>>>>>> kdump kernel ram 0x2d000000 - 0x37bfffff
>>>>>>> One of the reserved regions: 0x40000000 - 0x40100000
>>>>>>>
>>>>>>> The above reserved region includes 0x40004000, a page excluded in
>>>>>>> trim_snb_memory(). For this memblock reserved region the nid is not set, so it
>>>>>>> still has the default value MAX_NUMNODES. Later, node_set() sets bit
>>>>>>> MAX_NUMNODES in the nodemask bitmap, and the stack gets corrupted.
>>>>>>>
>>>>>
>>>>> Hi Dave,
>>>>>
>>>>> Does it mean that the region 0x40000000 - 0x40100000 is reserved first and the
>>>>> kdump kernel is booted afterwards, so this region is not included in
>>>>> "numa_meminfo", and the memblock.reserved entry (0x40004000) from
>>>>> trim_snb_memory() is still MAX_NUMNODES?
>>>>
>>>> Right. By the way, I booted the kdump kernel with numa=off to save memory.
>>>>
>>>> I suspect it can also be reproduced with mem=XYZ on a normal kernel.
>>>>
>>>
>>> Cc Tang Chen; numa_clear_kernel_node_hotplug() was originally written by him.
>>>
>>> Hi Dave,
>>> I tested the problem and found that the kdump kernel's "numa_meminfo" is the
>>> same as the first kernel's. I did not set "numa=off" in the kdump kernel; maybe
>>> that explains the difference in "numa_meminfo".
>>>
>>
>> Hi Dave,
>>
>> I found the reason: it is "dummy_numa_init() -> numa_add_memblk(0, 0, PFN_PHYS(max_pfn));",
>> which leads to the difference in "numa_meminfo" when "numa=off" is set.
>>
>> However, we should still fix the bug for the "numa=off" case.
> 
> So do you mean max_pfn should include the non-system-RAM region at the end?
> 

No, I mean that when "numa=off" is set, numa_meminfo is built from max_pfn, and
that makes the difference. max_pfn comes from e820, not from the SRAT tables.
Without "numa=off", the SRAT tables fill numa_meminfo.
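
As a rough userspace illustration of the point above (not kernel code), assuming
4 KiB pages: with numa=off the only numa_meminfo block is [0, PFN_PHYS(max_pfn)),
so a reserved page above max_pfn is never covered by it. The helper names
pfn_phys() and dummy_block_covers() are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Illustrative stand-in for PFN_PHYS(): page frame number -> physical address. */
static uint64_t pfn_phys(uint64_t pfn)
{
	/* Guard against shifting bits out of the 64-bit value. */
	assert(pfn <= (UINT64_MAX >> PAGE_SHIFT));
	return pfn << PAGE_SHIFT;
}

/*
 * Model of the single block added for numa=off, [0, PFN_PHYS(max_pfn)).
 * Returns true if addr falls inside that block.
 */
static bool dummy_block_covers(uint64_t max_pfn, uint64_t addr)
{
	return addr < pfn_phys(max_pfn);
}
```

With the faked node from the log ending at 0x37ba4fff, max_pfn would be 0x37ba5,
and dummy_block_covers(0x37ba5, 0x40004000) is false, so that reserved page is
outside the only block numa_meminfo knows about.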

Thanks,
Xishi Qiu

> In a case like the one below, I believe max_pfn is currently set to the end of system RAM:
> [what ever] [ system ram ] [ bios reserved region ]
> 
> Thanks
> Dave
> 
> .
> 




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-01  5:11 ` Dave Young
  2015-04-01  7:27   ` Xishi Qiu
@ 2015-04-02  1:51   ` Xishi Qiu
  2015-04-02  3:24     ` Dave Young
  1 sibling, 1 reply; 19+ messages in thread
From: Xishi Qiu @ 2015-04-02  1:51 UTC (permalink / raw)
  To: Dave Young; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm, Tang Chen

On 2015/4/1 13:11, Dave Young wrote:

> Ccing Xishi Qiu who wrote the clear_kernel_node_hotplug code.
> 
> On 04/01/15 at 12:53pm, Dave Young wrote:
>> I got the kernel panic below during a kdump test on a ThinkPad T420 laptop:
>>
>> [    0.000000] No NUMA configuration found                                      
>> [    0.000000] Faking a node at [mem 0x0000000000000000-0x0000000037ba4fff]     
>> [    0.000000] Kernel panic - not syncing: stack-protector: Kernel stack is cor 
>> upted in: ffffffff81d21910                                                     r
>> [    0.000000]                                                                  
>> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44           
>> [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
>> 5/2013                                                                         0
>> [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67ce8 ffffffff817c 
>> a26                                                                            2
>> [    0.000000]  0000000000000000 ffffffff81a61c90 ffffffff81b67d68 ffffffff817b 
>> 8d2                                                                            c
>> [    0.000000]  0000000000000010 ffffffff81b67d78 ffffffff81b67d18 c70296ddd809 
>> 4f6                                                                            e
>> [    0.000000] Call Trace:                                                      
>> [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
>> [    0.000000]  [<ffffffff817bc8d2>] panic+0xd0/0x204                           
>> [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
>> [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
>> [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
>> [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
>> [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
>> [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
>> [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
>> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
>> [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
>> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
>> [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
>> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
>> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
>> [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
>> [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
>> [    0.000000] ---[ end Kernel panic - not syncing: stack-protector: Kernel sta 
>> k is corrupted in: ffffffff81d21910                                            c
>> [    0.000000]                                                                  
>> PANIC: early exception 0d rip 10:ffffffff8105d2a6 error 7eb cr2 ffff8800371dd00 
>> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44          0
>> [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
>> 5/2013                                                                         0
>> [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67c60 ffffffff817c 
>> a26                                                                            2
>> [    0.000000]  0000000000000096 ffffffff81a61c90 ffffffff81b67d68 fffffff00000 
>> 084 0000000000000a0d 0000000000000a00                                          0
>> [    0.000000] Call Trace:                                                      
>> [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
>> [    0.000000]  [<ffffffff81d051b0>] early_idt_handler+0x90/0xb7                
>> [    0.000000]  [<ffffffff8105d2a6>] ? native_irq_enable+0x6/0x10               
>> [    0.000000]  [<ffffffff817bc9c5>] ? panic+0x1c3/0x204                        
>> [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
>> [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
>> [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
>> [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
>> [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
>> [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
>> [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
>> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
>> [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
>> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
>> [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
>> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
>> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
>> [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
>> [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
>> [    0.000000] RIP 0x46                                                         
>>
>> This is caused by writing past the end of the NUMA node mask bitmap.
>>
>> numa_clear_kernel_node_hotplug() tries to set node ids in a mask bitmap. It
>> iterates over all reserved regions and assumes every region has a valid nid.
>> That is not true, because there is an exception for the graphics memory quirks;
>> see trim_snb_memory() in arch/x86/kernel/setup.c.
>>
>> The bug is easy to reproduce in a kdump kernel, because the kdump kernel uses
>> pre-reserved memory instead of the whole memory, while kexec also passes the
>> other reserved memory ranges to the second kernel. For example, in my test:
>> kdump kernel ram 0x2d000000 - 0x37bfffff
>> One of the reserved regions: 0x40000000 - 0x40100000
>>
>> The above reserved region includes 0x40004000, a page excluded in
>> trim_snb_memory(). For this memblock reserved region the nid is not set, so it
>> still has the default value MAX_NUMNODES. Later, node_set() sets bit
>> MAX_NUMNODES in the nodemask bitmap, and the stack gets corrupted.
>>
>> Fix this by adding a check: do not call node_set() when nid is MAX_NUMNODES.
>>
>> Signed-off-by: Dave Young <dyoung@redhat.com>
>> ---
>>  arch/x86/mm/numa.c |    3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> --- linux.orig/arch/x86/mm/numa.c
>> +++ linux/arch/x86/mm/numa.c
>> @@ -484,7 +484,8 @@ static void __init numa_clear_kernel_nod
>>  
>>  	/* Mark all kernel nodes. */
>>  	for_each_memblock(reserved, r)
>> -		node_set(r->nid, numa_kernel_nodes);

Hi Dave,

How about adding a comment here? If numa=off is set, numa_meminfo may not include
all of the memblock.reserved memory, e.g. the pages reserved by trim_snb_memory().

Thanks,
Xishi Qiu

>> +		if (r->nid != MAX_NUMNODES)
>> +			node_set(r->nid, numa_kernel_nodes);
>>  
>>  	/* Clear MEMBLOCK_HOTPLUG flag for memory in kernel nodes. */
>>  	for (i = 0; i < numa_meminfo.nr_blks; i++) {
>>
> 
> .
> 




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-02  1:51   ` Xishi Qiu
@ 2015-04-02  3:24     ` Dave Young
  0 siblings, 0 replies; 19+ messages in thread
From: Dave Young @ 2015-04-02  3:24 UTC (permalink / raw)
  To: Xishi Qiu; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm, Tang Chen

Hi, Xishi

[snip]
> >>  arch/x86/mm/numa.c |    3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> --- linux.orig/arch/x86/mm/numa.c
> >> +++ linux/arch/x86/mm/numa.c
> >> @@ -484,7 +484,8 @@ static void __init numa_clear_kernel_nod
> >>  
> >>  	/* Mark all kernel nodes. */
> >>  	for_each_memblock(reserved, r)
> >> -		node_set(r->nid, numa_kernel_nodes);
> 
> Hi Dave,
> 
> How about adding a comment here? If numa=off is set, numa_meminfo may not include
> all of the memblock.reserved memory, e.g. the pages reserved by trim_snb_memory().

Sure, I can do that. I will send an update if there are no other comments.

> >> +		if (r->nid != MAX_NUMNODES)
> >> +			node_set(r->nid, numa_kernel_nodes);
> >>  
> >>  	/* Clear MEMBLOCK_HOTPLUG flag for memory in kernel nodes. */
> >>  	for (i = 0; i < numa_meminfo.nr_blks; i++) {
> >>
> > 
> 

Thanks
Dave
> 
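
For illustration only, here is a small userspace model of the guarded loop body
with the kind of comment being asked for. It is a sketch, not the kernel code:
struct nodemask_model, node_set_checked() and node_isset_model() are invented
names, MAX_NUMNODES is set to an assumed value of 64, and the guard is a range
check rather than the patch's exact "!= MAX_NUMNODES" comparison.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_NUMNODES 64	/* assumption: stand-in for the kernel constant */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS ((MAX_NUMNODES + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Simplified model of nodemask_t: valid bits are 0 .. MAX_NUMNODES - 1. */
struct nodemask_model {
	unsigned long bits[MASK_LONGS];
};

/*
 * Set nid in the mask only if it is a real node id.  Reserved regions
 * that numa_meminfo does not cover (e.g. pages reserved by
 * trim_snb_memory() when booting with numa=off) still carry
 * MAX_NUMNODES, and setting that bit would write past the bitmap.
 * Returns true if the bit was set.
 */
static bool node_set_checked(int nid, struct nodemask_model *mask)
{
	assert(mask);
	if (nid < 0 || nid >= MAX_NUMNODES)
		return false;
	mask->bits[nid / BITS_PER_LONG] |= 1UL << (nid % BITS_PER_LONG);
	return true;
}

/* Test whether nid is set; out-of-range ids are never set. */
static bool node_isset_model(int nid, const struct nodemask_model *mask)
{
	assert(mask);
	if (nid < 0 || nid >= MAX_NUMNODES)
		return false;
	return (mask->bits[nid / BITS_PER_LONG] >> (nid % BITS_PER_LONG)) & 1UL;
}
```

In this model, node_set_checked(MAX_NUMNODES, &mask) returns false and leaves
the mask untouched, which is the behaviour the added check is meant to give.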

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-01  7:41     ` Dave Young
  2015-04-01  8:21       ` Xishi Qiu
@ 2015-04-02 19:15       ` Yasuaki Ishimatsu
  2015-04-03  7:03         ` Dave Young
  1 sibling, 1 reply; 19+ messages in thread
From: Yasuaki Ishimatsu @ 2015-04-02 19:15 UTC (permalink / raw)
  To: Dave Young; +Cc: Xishi Qiu, x86, linux-kernel, tglx, bhe, mingo, hpa, akpm


On Wed, 1 Apr 2015 15:41:20 +0800
Dave Young <dyoung@redhat.com> wrote:

> On 04/01/15 at 03:27pm, Xishi Qiu wrote:
> > On 2015/4/1 13:11, Dave Young wrote:
> > 
> > > Ccing Xishi Qiu who wrote the clear_kernel_node_hotplug code.
> > > 
> > > On 04/01/15 at 12:53pm, Dave Young wrote:
> > >> I got the kernel panic below during a kdump test on a ThinkPad T420 laptop:
> > >>
> > >> [    0.000000] No NUMA configuration found                                      
> > >> [    0.000000] Faking a node at [mem 0x0000000000000000-0x0000000037ba4fff]     
> > >> [    0.000000] Kernel panic - not syncing: stack-protector: Kernel stack is cor 
> > >> upted in: ffffffff81d21910                                                     r
> > >> [    0.000000]                                                                  
> > >> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44           
> > >> [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> > >> 5/2013                                                                         0
> > >> [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67ce8 ffffffff817c 
> > >> a26                                                                            2
> > >> [    0.000000]  0000000000000000 ffffffff81a61c90 ffffffff81b67d68 ffffffff817b 
> > >> 8d2                                                                            c
> > >> [    0.000000]  0000000000000010 ffffffff81b67d78 ffffffff81b67d18 c70296ddd809 
> > >> 4f6                                                                            e
> > >> [    0.000000] Call Trace:                                                      
> > >> [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > >> [    0.000000]  [<ffffffff817bc8d2>] panic+0xd0/0x204                           
> > >> [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > >> [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > >> [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > >> [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > >> [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > >> [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > >> [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > >> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > >> [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > >> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > >> [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > >> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > >> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > >> [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > >> [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > >> [    0.000000] ---[ end Kernel panic - not syncing: stack-protector: Kernel sta 
> > >> k is corrupted in: ffffffff81d21910                                            c
> > >> [    0.000000]                                                                  
> > >> PANIC: early exception 0d rip 10:ffffffff8105d2a6 error 7eb cr2 ffff8800371dd00 
> > >> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44          0
> > >> [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> > >> 5/2013                                                                         0
> > >> [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67c60 ffffffff817c 
> > >> a26                                                                            2
> > >> [    0.000000]  0000000000000096 ffffffff81a61c90 ffffffff81b67d68 fffffff00000 
> > >> 084 0000000000000a0d 0000000000000a00                                          0
> > >> [    0.000000] Call Trace:                                                      
> > >> [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > >> [    0.000000]  [<ffffffff81d051b0>] early_idt_handler+0x90/0xb7                
> > >> [    0.000000]  [<ffffffff8105d2a6>] ? native_irq_enable+0x6/0x10               
> > >> [    0.000000]  [<ffffffff817bc9c5>] ? panic+0x1c3/0x204                        
> > >> [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > >> [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > >> [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > >> [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > >> [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > >> [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > >> [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > >> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > >> [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > >> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > >> [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > >> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > >> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > >> [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > >> [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > >> [    0.000000] RIP 0x46                                                         
> > >>
> > >> This is caused by writing past the end of the NUMA node mask bitmap.
> > >>
> > >> numa_clear_kernel_node_hotplug() tries to set node ids in a mask bitmap. It
> > >> iterates over all reserved regions and assumes every region has a valid nid.
> > >> That is not true, because there is an exception for the graphics memory quirks;
> > >> see trim_snb_memory() in arch/x86/kernel/setup.c.
> > >>
> > >> The bug is easy to reproduce in a kdump kernel, because the kdump kernel uses
> > >> pre-reserved memory instead of the whole memory, while kexec also passes the
> > >> other reserved memory ranges to the second kernel. For example, in my test:
> > >> kdump kernel ram 0x2d000000 - 0x37bfffff
> > >> One of the reserved regions: 0x40000000 - 0x40100000
> > >>
> > >> The above reserved region includes 0x40004000, a page excluded in
> > >> trim_snb_memory(). For this memblock reserved region the nid is not set, so it
> > >> still has the default value MAX_NUMNODES. Later, node_set() sets bit
> > >> MAX_NUMNODES in the nodemask bitmap, and the stack gets corrupted.
> > >>
> > 
> > Hi Dave,
> > 
> > Does it mean that the region 0x40000000 - 0x40100000 is reserved first and the
> > kdump kernel is booted afterwards, so this region is not included in
> > "numa_meminfo", and the memblock.reserved entry (0x40004000) from
> > trim_snb_memory() is still MAX_NUMNODES?
> 
> Right. By the way, I booted the kdump kernel with numa=off to save memory.
> 
> I suspect it can also be reproduced with mem=XYZ on a normal kernel.

Does the issue occur on your system with mem=0x40000000?

I think the issue occurs when a reserved memory range is not included in the
system RAM reported by the e820 or SRAT table. On your system, 0x40004000 is
reserved by trim_snb_memory(). If you use mem=0x40000000, system RAM is limited
to below 0x40000000, so the issue should occur.
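
To make that reasoning concrete, here is a simplified userspace model of the
memblock_set_node() loop in numa_clear_kernel_node_hotplug(). It is only a
sketch under stated assumptions: struct range_model and assign_reserved_nids()
are invented names, MAX_NUMNODES is an assumed value, and a reserved region only
gets a nid when one RAM block fully contains it (the kernel splits regions at
block boundaries, which is left out here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NUMNODES 64	/* assumption: stand-in for the kernel constant */

struct range_model {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
	int nid;	/* MAX_NUMNODES means "not assigned" */
};

/*
 * For every reserved region, copy the nid of the first RAM block that
 * fully contains it.  Regions outside all RAM blocks keep whatever nid
 * they had, i.e. MAX_NUMNODES if they were never assigned.
 */
static void assign_reserved_nids(const struct range_model *ram, size_t nr_ram,
				 struct range_model *reserved, size_t nr_reserved)
{
	for (size_t i = 0; i < nr_reserved; i++) {
		assert(reserved[i].start <= reserved[i].end);
		for (size_t j = 0; j < nr_ram; j++) {
			if (reserved[i].start >= ram[j].start &&
			    reserved[i].end <= ram[j].end) {
				reserved[i].nid = ram[j].nid;
				break;
			}
		}
	}
}
```

With a single RAM block [0, 0x40000000) for mem=0x40000000, a reserved page at
[0x40004000, 0x40005000) is not contained in it and stays at MAX_NUMNODES in
this model.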

Thanks,
Yasuaki Ishimatsu

> 
> > 
> > numa_clear_kernel_node_hotplug
> > {
> > 	...
> > 	for (i = 0; i < numa_meminfo.nr_blks; i++) {
> > 		struct numa_memblk *mb = &numa_meminfo.blk[i];
> > 
> > 		memblock_set_node(mb->start, mb->end - mb->start,
> > 				  &memblock.reserved, mb->nid);  // this will not reset 0x40004000's node, right?
> > 	}
> > 	...
> > }
> > 
> > Thanks
> > Xishi Qiu
> > 
> > >> Fixing this by adding a check, do not call node_set in case nid is MAX_NUMNODES.
> > >>
> > >> Signed-off-by: Dave Young <dyoung@redhat.com>
> > >> ---
> > >>  arch/x86/mm/numa.c |    3 ++-
> > >>  1 file changed, 2 insertions(+), 1 deletion(-)
> > >>
> > >> --- linux.orig/arch/x86/mm/numa.c
> > >> +++ linux/arch/x86/mm/numa.c
> > >> @@ -484,7 +484,8 @@ static void __init numa_clear_kernel_nod
> > >>  
> > >>  	/* Mark all kernel nodes. */
> > >>  	for_each_memblock(reserved, r)
> > >> -		node_set(r->nid, numa_kernel_nodes);
> > >> +		if (r->nid != MAX_NUMNODES)
> > >> +			node_set(r->nid, numa_kernel_nodes);
> > >>  
> > >>  	/* Clear MEMBLOCK_HOTPLUG flag for memory in kernel nodes. */
> > >>  	for (i = 0; i < numa_meminfo.nr_blks; i++) {
> > >>
> > > 
> > > .
> > > 
> > 
> > 
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-01  4:53 [PATCH] x86/numa: kernel stack corruption fix Dave Young
  2015-04-01  5:11 ` Dave Young
@ 2015-04-02 19:36 ` Yasuaki Ishimatsu
  2015-04-03  7:15   ` Dave Young
  1 sibling, 1 reply; 19+ messages in thread
From: Yasuaki Ishimatsu @ 2015-04-02 19:36 UTC (permalink / raw)
  To: Dave Young; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm


On Wed, 1 Apr 2015 12:53:46 +0800
Dave Young <dyoung@redhat.com> wrote:

> I got the kernel panic below during a kdump test on a ThinkPad T420 laptop:
> 
> [    0.000000] No NUMA configuration found                                      
> [    0.000000] Faking a node at [mem 0x0000000000000000-0x0000000037ba4fff]     
> [    0.000000] Kernel panic - not syncing: stack-protector: Kernel stack is cor 
> upted in: ffffffff81d21910                                                     r
> [    0.000000]                                                                  
> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44           
> [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> 5/2013                                                                         0
> [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67ce8 ffffffff817c 
> a26                                                                            2
> [    0.000000]  0000000000000000 ffffffff81a61c90 ffffffff81b67d68 ffffffff817b 
> 8d2                                                                            c
> [    0.000000]  0000000000000010 ffffffff81b67d78 ffffffff81b67d18 c70296ddd809 
> 4f6                                                                            e
> [    0.000000] Call Trace:                                                      
> [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> [    0.000000]  [<ffffffff817bc8d2>] panic+0xd0/0x204                           
> [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> [    0.000000] ---[ end Kernel panic - not syncing: stack-protector: Kernel sta 
> k is corrupted in: ffffffff81d21910                                            c
> [    0.000000]                                                                  
> PANIC: early exception 0d rip 10:ffffffff8105d2a6 error 7eb cr2 ffff8800371dd00 
> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44          0
> [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> 5/2013                                                                         0
> [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67c60 ffffffff817c 
> a26                                                                            2
> [    0.000000]  0000000000000096 ffffffff81a61c90 ffffffff81b67d68 fffffff00000 
> 084 0000000000000a0d 0000000000000a00                                          0
> [    0.000000] Call Trace:                                                      
> [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> [    0.000000]  [<ffffffff81d051b0>] early_idt_handler+0x90/0xb7                
> [    0.000000]  [<ffffffff8105d2a6>] ? native_irq_enable+0x6/0x10               
> [    0.000000]  [<ffffffff817bc9c5>] ? panic+0x1c3/0x204                        
> [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> [    0.000000] RIP 0x46                                                         
> 
> This is caused by writing past the end of the NUMA node mask bitmap.
> 
> numa_clear_kernel_node_hotplug() tries to set node ids in a mask bitmap. It
> iterates over all reserved regions and assumes every region has a valid nid.
> That is not true, because there is an exception for the graphics memory quirks;
> see trim_snb_memory() in arch/x86/kernel/setup.c.
> 
> The bug is easy to reproduce in a kdump kernel, because the kdump kernel uses
> pre-reserved memory instead of the whole memory, while kexec also passes the
> other reserved memory ranges to the second kernel. For example, in my test:
> kdump kernel ram 0x2d000000 - 0x37bfffff
> One of the reserved regions: 0x40000000 - 0x40100000
> 
> The above reserved region includes 0x40004000, a page excluded in
> trim_snb_memory(). For this memblock reserved region the nid is not set, so it
> still has the default value MAX_NUMNODES. Later, node_set() sets bit
> MAX_NUMNODES in the nodemask bitmap, and the stack gets corrupted.
> 
> Fix this by adding a check: do not call node_set() when nid is MAX_NUMNODES.
> 
> Signed-off-by: Dave Young <dyoung@redhat.com>
> ---

Looks good to me.

Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

Thanks,
Yasuaki Ishimatsu

>  arch/x86/mm/numa.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> --- linux.orig/arch/x86/mm/numa.c
> +++ linux/arch/x86/mm/numa.c
> @@ -484,7 +484,8 @@ static void __init numa_clear_kernel_nod
>  
>  	/* Mark all kernel nodes. */
>  	for_each_memblock(reserved, r)
> -		node_set(r->nid, numa_kernel_nodes);
> +		if (r->nid != MAX_NUMNODES)
> +			node_set(r->nid, numa_kernel_nodes);
>  
>  	/* Clear MEMBLOCK_HOTPLUG flag for memory in kernel nodes. */
>  	for (i = 0; i < numa_meminfo.nr_blks; i++) {
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-02 19:15       ` Yasuaki Ishimatsu
@ 2015-04-03  7:03         ` Dave Young
  0 siblings, 0 replies; 19+ messages in thread
From: Dave Young @ 2015-04-03  7:03 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: Xishi Qiu, x86, linux-kernel, tglx, bhe, mingo, hpa, akpm

> > > >>
> > > >> The above reserved region includes 0x40004000, a page excluded in
> > > >> trim_snb_memory(). For this memblock reserved region the nid is not set, so it
> > > >> still has the default value MAX_NUMNODES. Later, node_set() sets bit
> > > >> MAX_NUMNODES in the nodemask bitmap, and the stack gets corrupted.
> > > >>
> > > 
> > > Hi Dave,
> > > 
> > > Does it mean that the region 0x40000000 - 0x40100000 is reserved first and the
> > > kdump kernel is booted afterwards, so this region is not included in
> > > "numa_meminfo", and the memblock.reserved entry (0x40004000) from
> > > trim_snb_memory() is still MAX_NUMNODES?
> > 
> > Right. By the way, I booted the kdump kernel with numa=off to save memory.
> > 
> > I suspect it can also be reproduced with mem=XYZ on a normal kernel.
> 
> Does the issue occur on your system with mem=0x40000000?
> 
> I think the issue occurs when a reserved memory range is not included in the
> system RAM reported by the e820 or SRAT table. On your system, 0x40004000 is
> reserved by trim_snb_memory(). If you use mem=0x40000000, system RAM is limited
> to below 0x40000000, so the issue should occur.

It did occur with mem=800M during my previous test. I think it will occur with
mem=0x40000000 as well, though I did not test that value.

Thanks
Dave

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-02 19:36 ` Yasuaki Ishimatsu
@ 2015-04-03  7:15   ` Dave Young
  2015-04-03  7:17     ` Ingo Molnar
  2015-04-06 14:26     ` Yasuaki Ishimatsu
  0 siblings, 2 replies; 19+ messages in thread
From: Dave Young @ 2015-04-03  7:15 UTC (permalink / raw)
  To: Yasuaki Ishimatsu; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm

Hi,

On 04/02/15 at 12:36pm, Yasuaki Ishimatsu wrote:
> 
> On Wed, 1 Apr 2015 12:53:46 +0800
> Dave Young <dyoung@redhat.com> wrote:
> 
> > I got the kernel panic below during a kdump test on a ThinkPad T420 laptop:
> > 
> > [    0.000000] No NUMA configuration found                                      
> > [    0.000000] Faking a node at [mem 0x0000000000000000-0x0000000037ba4fff]     
> > [    0.000000] Kernel panic - not syncing: stack-protector: Kernel stack is cor 
> > upted in: ffffffff81d21910                                                     r
> > [    0.000000]                                                                  
> > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44           
> > [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> > 5/2013                                                                         0
> > [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67ce8 ffffffff817c 
> > a26                                                                            2
> > [    0.000000]  0000000000000000 ffffffff81a61c90 ffffffff81b67d68 ffffffff817b 
> > 8d2                                                                            c
> > [    0.000000]  0000000000000010 ffffffff81b67d78 ffffffff81b67d18 c70296ddd809 
> > 4f6                                                                            e
> > [    0.000000] Call Trace:                                                      
> > [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > [    0.000000]  [<ffffffff817bc8d2>] panic+0xd0/0x204                           
> > [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > [    0.000000] ---[ end Kernel panic - not syncing: stack-protector: Kernel sta 
> > k is corrupted in: ffffffff81d21910                                            c
> > [    0.000000]                                                                  
> > PANIC: early exception 0d rip 10:ffffffff8105d2a6 error 7eb cr2 ffff8800371dd00 
> > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44          0
> > [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> > 5/2013                                                                         0
> > [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67c60 ffffffff817c 
> > a26                                                                            2
> > [    0.000000]  0000000000000096 ffffffff81a61c90 ffffffff81b67d68 fffffff00000 
> > 084 0000000000000a0d 0000000000000a00                                          0
> > [    0.000000] Call Trace:                                                      
> > [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > [    0.000000]  [<ffffffff81d051b0>] early_idt_handler+0x90/0xb7                
> > [    0.000000]  [<ffffffff8105d2a6>] ? native_irq_enable+0x6/0x10               
> > [    0.000000]  [<ffffffff817bc9c5>] ? panic+0x1c3/0x204                        
> > [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > [    0.000000] RIP 0x46                                                         
> > 
> > This is caused by writing over end of numa mask bitmap.
> > 
> > numa_clear_kernel_node try to set node id in a mask bitmap, it iterating all
> > reserved region and assume every regions have valid nid. It is not true because
> > There's an exception for graphic memory quirks. see function trim_snb_memory
> > in arch/x86/kernel/setup.c
> > 
> > It is easily to reproduce the bug in kdump kernel because kdump kernel use
> > prereserved memory instead of whole memory, but kexec pass other reserved memory
> > ranges to 2nd kernel as well. like below in my test:
> > kdump kernel ram 0x2d000000 - 0x37bfffff
> > One of the reserved regions: 0x40000000 - 0x40100000
> > 
> > The above reserved region includes 0x40004000, a page excluded in
> > trim_snb_memory. For this memblock reserved region the nid is not set it is
> > still default value MAX_NUMNODES. later node_set callback will set bit
> > MAX_NUMNODES in nodemask bitmap thus stack corruption happen. 
> > 
> > Fixing this by adding a check, do not call node_set in case nid is MAX_NUMNODES.
> > 
> > Signed-off-by: Dave Young <dyoung@redhat.com>
> > ---
> 
> Looks good to me.
> 
> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

Per Xishi Qiu's suggestion, I would like to add the comment below to the code
and repost. May I carry your Reviewed-by line?

/* When booting with numa=off and using only part of system RAM,
 * i.e. mem=nn[kMG] or in a kdump kernel, numa_meminfo may not include all of
 * the memblock.reserved memory, because trim_snb_memory() reserves specific
 * pages for Sandy Bridge graphics. */
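
For readers following the thread, here is a minimal userspace sketch of the
check the patch describes: skip node_set when a reserved region's nid is still
MAX_NUMNODES. It is not the kernel code. struct nodemask, struct region,
mask_set() and collect_reserved_nodes() are hypothetical stand-ins, and the
value 64 for MAX_NUMNODES is an assumption (the real value depends on the
kernel configuration).

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the real MAX_NUMNODES depends on the kernel configuration. */
#define MAX_NUMNODES 64
#define BITS_PER_LONG (8 * sizeof(unsigned long))

/* Hypothetical stand-in for the kernel's nodemask_t. */
struct nodemask {
	unsigned long bits[(MAX_NUMNODES + BITS_PER_LONG - 1) / BITS_PER_LONG];
};

/* Hypothetical stand-in for one memblock.reserved region. */
struct region {
	unsigned long long base;
	unsigned long long size;
	int nid;	/* MAX_NUMNODES means "no node assigned" */
};

/* Set bit nid. Valid bits are 0 .. MAX_NUMNODES - 1 only. */
static void mask_set(struct nodemask *mask, int nid)
{
	assert(nid >= 0 && nid < MAX_NUMNODES);
	mask->bits[nid / BITS_PER_LONG] |= 1UL << (nid % BITS_PER_LONG);
}

/*
 * Record every node that holds reserved memory. Regions whose nid is
 * still MAX_NUMNODES are skipped, because setting that bit would write
 * past the end of the bitmap. Returns the number of skipped regions.
 */
static int collect_reserved_nodes(const struct region *r, size_t n,
				  struct nodemask *mask)
{
	int skipped = 0;

	for (size_t i = 0; i < n; i++) {
		if (r[i].nid == MAX_NUMNODES) {
			skipped++;
			continue;
		}
		mask_set(mask, r[i].nid);
	}
	return skipped;
}
```

With one region on node 0 and one region left at MAX_NUMNODES, the function
sets only bit 0 and reports one skipped region. Without the nid check, the
second region would hit the assertion in mask_set(), which models the
out-of-bounds write.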

Thanks
Dave

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-03  7:15   ` Dave Young
@ 2015-04-03  7:17     ` Ingo Molnar
  2015-04-03  7:23       ` Dave Young
  2015-04-06 14:26     ` Yasuaki Ishimatsu
  1 sibling, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2015-04-03  7:17 UTC (permalink / raw)
  To: Dave Young
  Cc: Yasuaki Ishimatsu, x86, linux-kernel, tglx, bhe, mingo, hpa, akpm


* Dave Young <dyoung@redhat.com> wrote:

> Hi,
> 
> On 04/02/15 at 12:36pm, Yasuaki Ishimatsu wrote:
> > 
> > On Wed, 1 Apr 2015 12:53:46 +0800
> > Dave Young <dyoung@redhat.com> wrote:
> > 
> > > I got below kernel panic during kdump test on Thinkpad T420 laptop:
> > > 
> > > [    0.000000] No NUMA configuration found                                      
> > > [    0.000000] Faking a node at [mem 0x0000000000000000-0x0000000037ba4fff]     
> > > [    0.000000] Kernel panic - not syncing: stack-protector: Kernel stack is cor 
> > > upted in: ffffffff81d21910                                                     r
> > > [    0.000000]                                                                  
> > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44           
> > > [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> > > 5/2013                                                                         0
> > > [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67ce8 ffffffff817c 
> > > a26                                                                            2
> > > [    0.000000]  0000000000000000 ffffffff81a61c90 ffffffff81b67d68 ffffffff817b 
> > > 8d2                                                                            c
> > > [    0.000000]  0000000000000010 ffffffff81b67d78 ffffffff81b67d18 c70296ddd809 
> > > 4f6                                                                            e
> > > [    0.000000] Call Trace:                                                      
> > > [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > > [    0.000000]  [<ffffffff817bc8d2>] panic+0xd0/0x204                           
> > > [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > > [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > > [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > > [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > > [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > > [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > > [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > > [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > > [    0.000000] ---[ end Kernel panic - not syncing: stack-protector: Kernel sta 
> > > k is corrupted in: ffffffff81d21910                                            c
> > > [    0.000000]                                                                  
> > > PANIC: early exception 0d rip 10:ffffffff8105d2a6 error 7eb cr2 ffff8800371dd00 
> > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44          0
> > > [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> > > 5/2013                                                                         0
> > > [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67c60 ffffffff817c 
> > > a26                                                                            2
> > > [    0.000000]  0000000000000096 ffffffff81a61c90 ffffffff81b67d68 fffffff00000 
> > > 084 0000000000000a0d 0000000000000a00                                          0
> > > [    0.000000] Call Trace:                                                      
> > > [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > > [    0.000000]  [<ffffffff81d051b0>] early_idt_handler+0x90/0xb7                
> > > [    0.000000]  [<ffffffff8105d2a6>] ? native_irq_enable+0x6/0x10               
> > > [    0.000000]  [<ffffffff817bc9c5>] ? panic+0x1c3/0x204                        
> > > [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > > [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > > [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > > [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > > [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > > [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > > [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > > [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > > [    0.000000] RIP 0x46                                                         
> > > 
> > > This is caused by writing over end of numa mask bitmap.
> > > 
> > > numa_clear_kernel_node try to set node id in a mask bitmap, it iterating all
> > > reserved region and assume every regions have valid nid. It is not true because
> > > There's an exception for graphic memory quirks. see function trim_snb_memory
> > > in arch/x86/kernel/setup.c
> > > 
> > > It is easily to reproduce the bug in kdump kernel because kdump kernel use
> > > prereserved memory instead of whole memory, but kexec pass other reserved memory
> > > ranges to 2nd kernel as well. like below in my test:
> > > kdump kernel ram 0x2d000000 - 0x37bfffff
> > > One of the reserved regions: 0x40000000 - 0x40100000
> > > 
> > > The above reserved region includes 0x40004000, a page excluded in
> > > trim_snb_memory. For this memblock reserved region the nid is not set it is
> > > still default value MAX_NUMNODES. later node_set callback will set bit
> > > MAX_NUMNODES in nodemask bitmap thus stack corruption happen. 
> > > 
> > > Fixing this by adding a check, do not call node_set in case nid is MAX_NUMNODES.
> > > 
> > > Signed-off-by: Dave Young <dyoung@redhat.com>
> > > ---
> > 
> > Looks good to me.
> > 
> > Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> 
> Per suggestion from Xishi Qiu, I would like to add below comment in code and repost
> Could I carry with your Reviewed-by line? 
> 
> /* In case booting with numa=off and only using part of system ram
>  * ie. mem=nn[kMG] or in kdump kernel, numa_meminfo may not include all the
>  * memblock.reserved memory because trim_snb_memory() reserves specific pages 
>  * for Sandy Bridge graphics. */

Nit: please use the customary (multi-line) comment style:

  /*
   * Comment .....
   * ...... goes here.
   */

specified in Documentation/CodingStyle.

Thanks,

        Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-03  7:17     ` Ingo Molnar
@ 2015-04-03  7:23       ` Dave Young
  0 siblings, 0 replies; 19+ messages in thread
From: Dave Young @ 2015-04-03  7:23 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Yasuaki Ishimatsu, x86, linux-kernel, tglx, bhe, mingo, hpa, akpm

On 04/03/15 at 09:17am, Ingo Molnar wrote:
> 
> * Dave Young <dyoung@redhat.com> wrote:
> 
> > Hi,
> > 
> > On 04/02/15 at 12:36pm, Yasuaki Ishimatsu wrote:
> > > 
> > > On Wed, 1 Apr 2015 12:53:46 +0800
> > > Dave Young <dyoung@redhat.com> wrote:
> > > 
> > > > I got below kernel panic during kdump test on Thinkpad T420 laptop:
> > > > 
> > > > [    0.000000] No NUMA configuration found                                      
> > > > [    0.000000] Faking a node at [mem 0x0000000000000000-0x0000000037ba4fff]     
> > > > [    0.000000] Kernel panic - not syncing: stack-protector: Kernel stack is cor 
> > > > upted in: ffffffff81d21910                                                     r
> > > > [    0.000000]                                                                  
> > > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44           
> > > > [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> > > > 5/2013                                                                         0
> > > > [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67ce8 ffffffff817c 
> > > > a26                                                                            2
> > > > [    0.000000]  0000000000000000 ffffffff81a61c90 ffffffff81b67d68 ffffffff817b 
> > > > 8d2                                                                            c
> > > > [    0.000000]  0000000000000010 ffffffff81b67d78 ffffffff81b67d18 c70296ddd809 
> > > > 4f6                                                                            e
> > > > [    0.000000] Call Trace:                                                      
> > > > [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > > > [    0.000000]  [<ffffffff817bc8d2>] panic+0xd0/0x204                           
> > > > [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > > > [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > > > [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > > > [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > > > [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > > > [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > > > [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > > > [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > > > [    0.000000] ---[ end Kernel panic - not syncing: stack-protector: Kernel sta 
> > > > k is corrupted in: ffffffff81d21910                                            c
> > > > [    0.000000]                                                                  
> > > > PANIC: early exception 0d rip 10:ffffffff8105d2a6 error 7eb cr2 ffff8800371dd00 
> > > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44          0
> > > > [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> > > > 5/2013                                                                         0
> > > > [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67c60 ffffffff817c 
> > > > a26                                                                            2
> > > > [    0.000000]  0000000000000096 ffffffff81a61c90 ffffffff81b67d68 fffffff00000 
> > > > 084 0000000000000a0d 0000000000000a00                                          0
> > > > [    0.000000] Call Trace:                                                      
> > > > [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > > > [    0.000000]  [<ffffffff81d051b0>] early_idt_handler+0x90/0xb7                
> > > > [    0.000000]  [<ffffffff8105d2a6>] ? native_irq_enable+0x6/0x10               
> > > > [    0.000000]  [<ffffffff817bc9c5>] ? panic+0x1c3/0x204                        
> > > > [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > > > [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > > > [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > > > [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > > > [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > > > [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > > > [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > > > [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > > > [    0.000000] RIP 0x46                                                         
> > > > 
> > > > This is caused by writing over end of numa mask bitmap.
> > > > 
> > > > numa_clear_kernel_node try to set node id in a mask bitmap, it iterating all
> > > > reserved region and assume every regions have valid nid. It is not true because
> > > > There's an exception for graphic memory quirks. see function trim_snb_memory
> > > > in arch/x86/kernel/setup.c
> > > > 
> > > > It is easily to reproduce the bug in kdump kernel because kdump kernel use
> > > > prereserved memory instead of whole memory, but kexec pass other reserved memory
> > > > ranges to 2nd kernel as well. like below in my test:
> > > > kdump kernel ram 0x2d000000 - 0x37bfffff
> > > > One of the reserved regions: 0x40000000 - 0x40100000
> > > > 
> > > > The above reserved region includes 0x40004000, a page excluded in
> > > > trim_snb_memory. For this memblock reserved region the nid is not set it is
> > > > still default value MAX_NUMNODES. later node_set callback will set bit
> > > > MAX_NUMNODES in nodemask bitmap thus stack corruption happen. 
> > > > 
> > > > Fixing this by adding a check, do not call node_set in case nid is MAX_NUMNODES.
> > > > 
> > > > Signed-off-by: Dave Young <dyoung@redhat.com>
> > > > ---
> > > 
> > > Looks good to me.
> > > 
> > > Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> > 
> > Per suggestion from Xishi Qiu, I would like to add below comment in code and repost
> > Could I carry with your Reviewed-by line? 
> > 
> > /* In case booting with numa=off and only using part of system ram
> >  * ie. mem=nn[kMG] or in kdump kernel, numa_meminfo may not include all the
> >  * memblock.reserved memory because trim_snb_memory() reserves specific pages 
> >  * for Sandy Bridge graphics. */
> 
> Nit: please use the customary (multi-line) comment style:
> 
>   /*
>    * Comment .....
>    * ...... goes here.
>    */
> 
> specified in Documentation/CodingStyle.

Sure, will do.

Thanks for telling me.
Dave

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-03  7:15   ` Dave Young
  2015-04-03  7:17     ` Ingo Molnar
@ 2015-04-06 14:26     ` Yasuaki Ishimatsu
  2015-04-07  3:33       ` Dave Young
  1 sibling, 1 reply; 19+ messages in thread
From: Yasuaki Ishimatsu @ 2015-04-06 14:26 UTC (permalink / raw)
  To: Dave Young; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm

Hi,

On Fri, 3 Apr 2015 15:15:13 +0800
Dave Young <dyoung@redhat.com> wrote:

> Hi,
> 
> On 04/02/15 at 12:36pm, Yasuaki Ishimatsu wrote:
> > 
> > On Wed, 1 Apr 2015 12:53:46 +0800
> > Dave Young <dyoung@redhat.com> wrote:
> > 
> > > I got below kernel panic during kdump test on Thinkpad T420 laptop:
> > > 
> > > [    0.000000] No NUMA configuration found                                      
> > > [    0.000000] Faking a node at [mem 0x0000000000000000-0x0000000037ba4fff]     
> > > [    0.000000] Kernel panic - not syncing: stack-protector: Kernel stack is cor 
> > > upted in: ffffffff81d21910                                                     r
> > > [    0.000000]                                                                  
> > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44           
> > > [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> > > 5/2013                                                                         0
> > > [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67ce8 ffffffff817c 
> > > a26                                                                            2
> > > [    0.000000]  0000000000000000 ffffffff81a61c90 ffffffff81b67d68 ffffffff817b 
> > > 8d2                                                                            c
> > > [    0.000000]  0000000000000010 ffffffff81b67d78 ffffffff81b67d18 c70296ddd809 
> > > 4f6                                                                            e
> > > [    0.000000] Call Trace:                                                      
> > > [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > > [    0.000000]  [<ffffffff817bc8d2>] panic+0xd0/0x204                           
> > > [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > > [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > > [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > > [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > > [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > > [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > > [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > > [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > > [    0.000000] ---[ end Kernel panic - not syncing: stack-protector: Kernel sta 
> > > k is corrupted in: ffffffff81d21910                                            c
> > > [    0.000000]                                                                  
> > > PANIC: early exception 0d rip 10:ffffffff8105d2a6 error 7eb cr2 ffff8800371dd00 
> > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44          0
> > > [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/ 
> > > 5/2013                                                                         0
> > > [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67c60 ffffffff817c 
> > > a26                                                                            2
> > > [    0.000000]  0000000000000096 ffffffff81a61c90 ffffffff81b67d68 fffffff00000 
> > > 084 0000000000000a0d 0000000000000a00                                          0
> > > [    0.000000] Call Trace:                                                      
> > > [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > > [    0.000000]  [<ffffffff81d051b0>] early_idt_handler+0x90/0xb7                
> > > [    0.000000]  [<ffffffff8105d2a6>] ? native_irq_enable+0x6/0x10               
> > > [    0.000000]  [<ffffffff817bc9c5>] ? panic+0x1c3/0x204                        
> > > [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > > [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > > [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > > [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > > [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > > [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > > [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > > [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > > [    0.000000] RIP 0x46                                                         
> > > 
> > > This is caused by writing over end of numa mask bitmap.
> > > 
> > > numa_clear_kernel_node try to set node id in a mask bitmap, it iterating all
> > > reserved region and assume every regions have valid nid. It is not true because
> > > There's an exception for graphic memory quirks. see function trim_snb_memory
> > > in arch/x86/kernel/setup.c
> > > 
> > > It is easily to reproduce the bug in kdump kernel because kdump kernel use
> > > prereserved memory instead of whole memory, but kexec pass other reserved memory
> > > ranges to 2nd kernel as well. like below in my test:
> > > kdump kernel ram 0x2d000000 - 0x37bfffff
> > > One of the reserved regions: 0x40000000 - 0x40100000
> > > 
> > > The above reserved region includes 0x40004000, a page excluded in
> > > trim_snb_memory. For this memblock reserved region the nid is not set it is
> > > still default value MAX_NUMNODES. later node_set callback will set bit
> > > MAX_NUMNODES in nodemask bitmap thus stack corruption happen. 
> > > 
> > > Fixing this by adding a check, do not call node_set in case nid is MAX_NUMNODES.
> > > 
> > > Signed-off-by: Dave Young <dyoung@redhat.com>
> > > ---
> > 
> > Looks good to me.
> > 
> > Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> 

> Per suggestion from Xishi Qiu, I would like to add below comment in code and repost
> Could I carry with your Reviewed-by line? 
> 
> /* In case booting with numa=off and only using part of system ram
>  * ie. mem=nn[kMG] or in kdump kernel, numa_meminfo may not include all the
>  * memblock.reserved memory because trim_snb_memory() reserves specific pages 
>  * for Sandy Bridge graphics. */

I don't think the issue is related to numa=off. If we use numa=off,
all memory ranges reported by e820 are assigned to node 0.

The issue occurs when reserved memory ranges are not included
in the system RAM reported by e820.
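
The condition can be illustrated with a small userspace model, using the
addresses from the original report. This is only a sketch under simplifying
assumptions: struct range, assign_nids() and the NO_NODE sentinel (standing in
for MAX_NUMNODES) are hypothetical names, and a reserved range gets a nid here
only when it lies entirely inside one RAM range, which is simpler than what the
kernel does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel standing in for MAX_NUMNODES. */
#define NO_NODE 64

/* Half-open address range [start, end) with an owning node. */
struct range {
	unsigned long long start;
	unsigned long long end;
	int nid;
};

/*
 * Give each reserved range the nid of the RAM range that contains it.
 * A reserved range outside every RAM range keeps NO_NODE.
 */
static void assign_nids(const struct range *ram, size_t nram,
			struct range *res, size_t nres)
{
	for (size_t i = 0; i < nres; i++) {
		res[i].nid = NO_NODE;
		for (size_t j = 0; j < nram; j++) {
			if (res[i].start >= ram[j].start &&
			    res[i].end <= ram[j].end) {
				res[i].nid = ram[j].nid;
				break;
			}
		}
	}
}
```

With RAM limited to 0x2d000000 - 0x37bfffff on node 0, a reserved range inside
that window gets nid 0, while the reserved range 0x40000000 - 0x40100000 keeps
the sentinel, which is the case the patch has to guard against.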

Thanks,
Yasuaki Ishimatsu

> 
> Thanks
> Dave

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-06 14:26     ` Yasuaki Ishimatsu
@ 2015-04-07  3:33       ` Dave Young
  2015-04-07 14:15         ` Yasuaki Ishimatsu
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Young @ 2015-04-07  3:33 UTC (permalink / raw)
  To: Yasuaki Ishimatsu; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm

On 04/06/15 at 07:26am, Yasuaki Ishimatsu wrote:
> Hi,
> 
> On Fri, 3 Apr 2015 15:15:13 +0800
> Dave Young <dyoung@redhat.com> wrote:
> 
> > Hi,
> > 
> > On 04/02/15 at 12:36pm, Yasuaki Ishimatsu wrote:
> > > 
> > > On Wed, 1 Apr 2015 12:53:46 +0800
> > > Dave Young <dyoung@redhat.com> wrote:
> > > 
> > > > I got below kernel panic during kdump test on Thinkpad T420 laptop:
> > > > 
> > > > [    0.000000] No NUMA configuration found                                      
> > > > [    0.000000] Faking a node at [mem 0x0000000000000000-0x0000000037ba4fff]     
> > > > [    0.000000] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81d21910
> > > > [    0.000000]
> > > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44
> > > > [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/05/2013
> > > > [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67ce8 ffffffff817c2a26
> > > > [    0.000000]  0000000000000000 ffffffff81a61c90 ffffffff81b67d68 ffffffff817bc8d2
> > > > [    0.000000]  0000000000000010 ffffffff81b67d78 ffffffff81b67d18 c70296ddd809e4f6
> > > > [    0.000000] Call Trace:                                                      
> > > > [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > > > [    0.000000]  [<ffffffff817bc8d2>] panic+0xd0/0x204                           
> > > > [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > > > [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > > > [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > > > [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > > > [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > > > [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > > > [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > > > [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > > > [    0.000000] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81d21910
> > > > [    0.000000]
> > > > PANIC: early exception 0d rip 10:ffffffff8105d2a6 error 7eb cr2 ffff8800371dd000
> > > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.0.0-rc6+ #44
> > > > [    0.000000] Hardware name: LENOVO 4236NUC/4236NUC, BIOS 83ET76WW (1.46 ) 07/05/2013
> > > > [    0.000000]  0000000000000000 c70296ddd809e4f6 ffffffff81b67c60 ffffffff817c2a26
> > > > [    0.000000]  0000000000000096 ffffffff81a61c90 ffffffff81b67d68 fffffff000000084 0000000000000a0d 0000000000000a00
> > > > [    0.000000] Call Trace:                                                      
> > > > [    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57                       
> > > > [    0.000000]  [<ffffffff81d051b0>] early_idt_handler+0x90/0xb7                
> > > > [    0.000000]  [<ffffffff8105d2a6>] ? native_irq_enable+0x6/0x10               
> > > > [    0.000000]  [<ffffffff817bc9c5>] ? panic+0x1c3/0x204                        
> > > > [    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 
> > > > [    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20                 
> > > > [    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2   
> > > > [    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520                      
> > > > [    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d                    
> > > > [    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb                       
> > > > [    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82                     
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b                         
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6                    
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120           
> > > > [    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c        
> > > > [    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184            
> > > > [    0.000000] RIP 0x46                                                         
> > > > 
> > > > This is caused by writing past the end of the NUMA nodemask bitmap.
> > > > 
> > > > numa_clear_kernel_node_hotplug() sets node ids in a nodemask bitmap. It
> > > > iterates over all reserved regions and assumes every region has a valid
> > > > nid. That is not true, because there is an exception for the graphics
> > > > memory quirk; see trim_snb_memory() in arch/x86/kernel/setup.c.
> > > > 
> > > > The bug is easy to reproduce in a kdump kernel, because the kdump kernel
> > > > uses pre-reserved memory instead of the whole memory, but kexec passes
> > > > the other reserved memory ranges to the 2nd kernel as well. In my test:
> > > > kdump kernel ram 0x2d000000 - 0x37bfffff
> > > > One of the reserved regions: 0x40000000 - 0x40100000
> > > > 
> > > > The above reserved region includes 0x40004000, a page excluded in
> > > > trim_snb_memory(). For this memblock reserved region the nid is not set,
> > > > so it still has the default value MAX_NUMNODES. Later, node_set() sets bit
> > > > MAX_NUMNODES in the nodemask bitmap, which corrupts the stack.
> > > > 
> > > > Fix this by adding a check: do not call node_set() when nid is MAX_NUMNODES.
> > > > 
> > > > Signed-off-by: Dave Young <dyoung@redhat.com>
> > > > ---
> > > 
> > > Looks good to me.
> > > 
> > > Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> > 
> 
> > Per Xishi Qiu's suggestion, I would like to add the comment below to the
> > code and repost. May I carry your Reviewed-by line?
> > 
> > /* In case booting with numa=off and only using part of system ram
> >  * ie. mem=nn[kMG] or in kdump kernel, numa_meminfo may not include all the
> >  * memblock.reserved memory because trim_snb_memory() reserves specific pages 
> >  * for Sandy Bridge graphics. */
> 
> I don't think the issue is related to numa=off. With numa=off, all
> memory ranges reported by e820 are assigned to node 0.
> 
> The issue occurs when reserved memory ranges are not included in the
> system RAM reported by e820.

In theory it may also occur without numa=off, whenever numa_meminfo does
not include some reserved regions, but so far I can only reproduce it
with numa=off.

But ok, I can use the comment below instead:

/*
 * In case booting with mem=nn[kMG] or in kdump kernel, numa_meminfo may not
 * include all the memblock.reserved memory ranges because trim_snb_memory()
 * reserves specific pages for Sandy Bridge graphics.
 */
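
For illustration only, the check described in the patch could be modelled
in userspace roughly as below. This is a sketch under assumptions, not the
posted diff: nodemask_t here is a small hand-rolled bitmap, and struct
reserved_region, node_isset() and collect_reserved_nodes() are hypothetical
names. The assert in node_set() makes an out-of-range nid fail loudly
instead of silently writing past the bitmap, which is what happened on the
stack in the panic above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Stand-in value; the kernel derives MAX_NUMNODES from its configuration. */
#define MAX_NUMNODES	64
#define MASK_WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS	((MAX_NUMNODES + MASK_WORD_BITS - 1) / MASK_WORD_BITS)

/* Hand-rolled bitmap with room for bits 0 .. MAX_NUMNODES - 1 only. */
typedef struct { unsigned long bits[MASK_WORDS]; } nodemask_t;

/* Hypothetical stand-in for a memblock reserved region. */
struct reserved_region { int nid; };

static void node_set(int nid, nodemask_t *mask)
{
	/* Bit MAX_NUMNODES would land past the end of bits[]. */
	assert(nid >= 0 && nid < MAX_NUMNODES);
	mask->bits[(size_t)nid / MASK_WORD_BITS] |=
		1UL << ((size_t)nid % MASK_WORD_BITS);
}

static int node_isset(int nid, const nodemask_t *mask)
{
	assert(nid >= 0 && nid < MAX_NUMNODES);
	return (mask->bits[(size_t)nid / MASK_WORD_BITS] >>
		((size_t)nid % MASK_WORD_BITS)) & 1UL;
}

static void collect_reserved_nodes(const struct reserved_region *regions,
				   size_t nr, nodemask_t *mask)
{
	memset(mask, 0, sizeof(*mask));

	for (size_t i = 0; i < nr; i++) {
		/*
		 * Regions that numa_meminfo does not cover still carry
		 * the default nid MAX_NUMNODES; skip them.
		 */
		if (regions[i].nid != MAX_NUMNODES)
			node_set(regions[i].nid, mask);
	}
}
```

Calling collect_reserved_nodes() on regions with nids { 0, MAX_NUMNODES, 1 }
sets only bits 0 and 1; dropping the check would trip the assert in
node_set() on the second region.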

Thanks
Dave


* Re: [PATCH] x86/numa: kernel stack corruption fix
  2015-04-07  3:33       ` Dave Young
@ 2015-04-07 14:15         ` Yasuaki Ishimatsu
  0 siblings, 0 replies; 19+ messages in thread
From: Yasuaki Ishimatsu @ 2015-04-07 14:15 UTC (permalink / raw)
  To: Dave Young; +Cc: x86, linux-kernel, tglx, bhe, mingo, hpa, akpm

Hi Dave,

On Tue, 7 Apr 2015 11:33:55 +0800
Dave Young <dyoung@redhat.com> wrote:

> In theory it may also occur without numa=off, whenever numa_meminfo does
> not include some reserved regions, but so far I can only reproduce it
> with numa=off.
> 
> But ok, I can use the comment below instead:
> 

> /*
>  * In case booting with mem=nn[kMG] or in kdump kernel, numa_meminfo may not
>  * include all the memblock.reserved memory ranges because trim_snb_memory()
>  * reserves specific pages for Sandy Bridge graphics.
>  */

Looks good to me.

Feel free to add my Reviewed-by.

Thanks,
Yasuaki Ishimatsu

> 
> Thanks
> Dave


Thread overview: 19+ messages
2015-04-01  4:53 [PATCH] x86/numa: kernel stack corruption fix Dave Young
2015-04-01  5:11 ` Dave Young
2015-04-01  7:27   ` Xishi Qiu
2015-04-01  7:41     ` Dave Young
2015-04-01  8:21       ` Xishi Qiu
2015-04-01  8:34         ` Xishi Qiu
2015-04-01  9:17           ` Dave Young
2015-04-01  9:33             ` Xishi Qiu
2015-04-02 19:15       ` Yasuaki Ishimatsu
2015-04-03  7:03         ` Dave Young
2015-04-02  1:51   ` Xishi Qiu
2015-04-02  3:24     ` Dave Young
2015-04-02 19:36 ` Yasuaki Ishimatsu
2015-04-03  7:15   ` Dave Young
2015-04-03  7:17     ` Ingo Molnar
2015-04-03  7:23       ` Dave Young
2015-04-06 14:26     ` Yasuaki Ishimatsu
2015-04-07  3:33       ` Dave Young
2015-04-07 14:15         ` Yasuaki Ishimatsu
