linux-kernel.vger.kernel.org archive mirror
* [PATCH]mm/ia64: fix a node distance bug
@ 2012-08-20  6:21 wujianguo
From: wujianguo @ 2012-08-20  6:21 UTC (permalink / raw)
  To: tony.luck, fenghua.yu
  Cc: linux-ia64, linux-kernel, jiang.liu, guohanjun, qiuxishi, liuj97

From: Jianguo Wu <wujianguo@huawei.com>

Hi all,
	While testing memory hot-plug, we found that node distances are wrong after
a node is offlined on an IA64 platform. For example, on a system with 4 nodes:
node distances:
node   0   1   2   3
  0:  10  21  21  32
  1:  21  10  32  21
  2:  21  32  10  21
  3:  32  21  21  10

linux-drf:/sys/devices/system/node/node0 # cat distance
10  21  21  32
linux-drf:/sys/devices/system/node/node1 # cat distance
21  10  32  21

After offlining node2:
linux-drf:/sys/devices/system/node/node0 # cat distance
10 21 32
linux-drf:/sys/devices/system/node/node1 # cat distance
32 21 32	--------->expected value is: 21  10  21

On ia64, we have the following definition:
extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
#define node_distance(from,to) (numa_slit[(from) * num_online_nodes() + (to)])

Node distances are set up as follows:
acpi_numa_arch_fixup()
{
	...
	memset(numa_slit, -1, sizeof(numa_slit));
	for (i = 0; i < slit_table->locality_count; i++) {
		if (!pxm_bit_test(i))
			continue;
		node_from = pxm_to_node(i);
		for (j = 0; j < slit_table->locality_count; j++) {
			if (!pxm_bit_test(j))
				continue;
			node_to = pxm_to_node(j);
			node_distance(node_from, node_to) =
			    slit_table->entry[i * slit_table->locality_count + j];
		}
	}
	...
}
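	The value returned by num_online_nodes() is not constant: during system boot
it is 4, but after node2 is offlined it is 3. The table is therefore filled with a
row stride of 4 and later read with a row stride of 3, so we read the wrong node
distance values. For node1, the reads for nodes 0, 1 and 3 hit indices 3, 4 and 6,
which hold 32, 21 and 32.
This patch fixes the bug by using the constant MAX_NUMNODES as the row stride.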
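
To make the index arithmetic concrete, here is a small user-space sketch that
mimics the table with the 4-node SLIT from this report. It is not kernel code:
MAX_NUMNODES is set to 8 purely for illustration, and fill() and lookup() are
hypothetical helpers standing in for the write in acpi_numa_arch_fixup() and
the read through node_distance().

```c
#include <assert.h>
#include <string.h>

#define MAX_NUMNODES 8	/* illustrative only; the real value comes from the kernel config */
#define NR_EXAMPLE_NODES 4

/* Flat distance table, sized like the kernel's numa_slit[]. */
static unsigned char numa_slit[MAX_NUMNODES * MAX_NUMNODES];

/* The example SLIT from this report. */
static const unsigned char slit[NR_EXAMPLE_NODES][NR_EXAMPLE_NODES] = {
	{10, 21, 21, 32},
	{21, 10, 32, 21},
	{21, 32, 10, 21},
	{32, 21, 21, 10},
};

/* Hypothetical helper: fill the flat table using the given row stride. */
static void fill(int stride)
{
	int i, j;

	assert(stride >= NR_EXAMPLE_NODES && stride <= MAX_NUMNODES);
	memset(numa_slit, 0xff, sizeof(numa_slit));
	for (i = 0; i < NR_EXAMPLE_NODES; i++)
		for (j = 0; j < NR_EXAMPLE_NODES; j++)
			numa_slit[i * stride + j] = slit[i][j];
}

/* Hypothetical helper: read one distance using the given row stride. */
static unsigned char lookup(int from, int to, int stride)
{
	return numa_slit[from * stride + to];
}
```

Calling fill(4) and then reading node1's row with lookup(1, to, 3) for the
remaining nodes 0, 1 and 3 touches indices 3, 4 and 6 and returns 32, 21, 32,
which is the bad sysfs output above. Using MAX_NUMNODES as the stride for both
fill() and lookup() returns 21, 10, 21.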

Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
---
 arch/ia64/include/asm/numa.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/ia64/include/asm/numa.h b/arch/ia64/include/asm/numa.h
index 6a8a27c..2e27ef1 100644
--- a/arch/ia64/include/asm/numa.h
+++ b/arch/ia64/include/asm/numa.h
@@ -59,7 +59,7 @@ extern struct node_cpuid_s node_cpuid[NR_CPUS];
  */

 extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
-#define node_distance(from,to) (numa_slit[(from) * num_online_nodes() + (to)])
+#define node_distance(from,to) (numa_slit[(from) * MAX_NUMNODES + (to)])

 extern int paddr_to_nid(unsigned long paddr);

-- 
1.7.6.1




* Re: [PATCH]mm/ia64: fix a node distance bug
  2012-08-20  6:21 [PATCH]mm/ia64: fix a node distance bug wujianguo
@ 2012-08-20  7:06 ` Wen Congyang
From: Wen Congyang @ 2012-08-20  7:06 UTC (permalink / raw)
  To: wujianguo
  Cc: tony.luck, fenghua.yu, linux-ia64, linux-kernel, jiang.liu,
	guohanjun, qiuxishi, liuj97

At 08/20/2012 02:21 PM, wujianguo Wrote:
>  extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
> -#define node_distance(from,to) (numa_slit[(from) * num_online_nodes() + (to)])
> +#define node_distance(from,to) (numa_slit[(from) * MAX_NUMNODES + (to)])

Hmm, MAX_NUMNODES is too large. I think num_possible_nodes() is better.

Thanks
Wen Congyang



* Re: [PATCH]mm/ia64: fix a node distance bug
  2012-08-20  7:06 ` Wen Congyang
@ 2012-08-20 13:54   ` Jianguo Wu
From: Jianguo Wu @ 2012-08-20 13:54 UTC (permalink / raw)
  To: Wen Congyang
  Cc: tony.luck, fenghua.yu, linux-ia64, linux-kernel, jiang.liu,
	guohanjun, qiuxishi, liuj97

On 2012/8/20 15:06, Wen Congyang wrote:
> At 08/20/2012 02:21 PM, wujianguo Wrote:
>>  extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
>> -#define node_distance(from,to) (numa_slit[(from) * num_online_nodes() + (to)])
>> +#define node_distance(from,to) (numa_slit[(from) * MAX_NUMNODES + (to)])
> 
> Hmm, MAX_NUMNODES is too large. I think num_possible_nodes() is better.
> 
> Thanks
> Wen Congyang
> 

Hi Congyang,
	Thanks for your comments.
	numa_slit[MAX_NUMNODES * MAX_NUMNODES] is a static array, so I think it makes
no difference whether we use MAX_NUMNODES or num_possible_nodes().
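
The point about the static array can be sketched in user space as follows.
This is only an illustration under assumed values: MAX_NUMNODES is set to 8
here, and table_bytes() and highest_index_used() are hypothetical helpers, not
kernel functions. The storage is reserved at compile time whatever stride is
used; the stride only decides which cells inside it are touched.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUMNODES 8	/* illustrative only; the real value comes from the kernel config */

/* Same shape as the kernel declaration: size is fixed at compile time. */
static unsigned char numa_slit[MAX_NUMNODES * MAX_NUMNODES];

/* Bytes reserved for the table; independent of the stride used to index it. */
static size_t table_bytes(void)
{
	return sizeof(numa_slit);
}

/* Largest index touched when 'nodes' nodes (ids 0..nodes-1) use row stride 'stride'. */
static size_t highest_index_used(int nodes, int stride)
{
	assert(nodes >= 1 && stride >= nodes && stride <= MAX_NUMNODES);
	return (size_t)(nodes - 1) * stride + (nodes - 1);
}
```

With 4 nodes, a stride of 4 touches indices up to 15 and a stride of 8 touches
indices up to 27. Both fit in the same 64-byte array, so a smaller stride packs
the used cells closer together but does not shrink the table.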

