linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RESEND] memory hotplug: fix a double register section info bug
@ 2012-09-14  3:43 qiuxishi
  2012-09-14  9:52 ` Mel Gorman
  2012-09-14 11:00 ` Yasuaki Ishimatsu
  0 siblings, 2 replies; 7+ messages in thread
From: qiuxishi @ 2012-09-14  3:43 UTC (permalink / raw)
  To: akpm, mgorman, tony.luck, Jiang Liu
  Cc: qiuxishi, bessel.wang, wujianguo, paul.gortmaker,
	kamezawa.hiroyu, kosaki.motohiro, rientjes, Minchan Kim,
	linux-mm, linux-kernel, Wen Congyang

There may be a bug when registering section info. For example, on
my Itanium platform, the pfn range of node0 includes the other nodes,
so other nodes' section info will be double registered, and memmap's
page count will equal to 3.

node0: start_pfn=0x100,    spanned_pfn=0x20fb00, present_pfn=0x7f8a3, => 0x000100-0x20fc00
node1: start_pfn=0x80000,  spanned_pfn=0x80000,  present_pfn=0x80000, => 0x080000-0x100000
node2: start_pfn=0x100000, spanned_pfn=0x80000,  present_pfn=0x80000, => 0x100000-0x180000
node3: start_pfn=0x180000, spanned_pfn=0x80000,  present_pfn=0x80000, => 0x180000-0x200000

free_all_bootmem_node()
	register_page_bootmem_info_node()
		register_page_bootmem_info_section()

When hot remove memory, we can't free the memmap's page because
page_count() is 2 after put_page_bootmem().

sparse_remove_one_section()
	free_section_usemap()
		free_map_bootmem()
			put_page_bootmem()

Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 mm/memory_hotplug.c |   10 ++++------
 1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 2adbcac..cf493c7 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -126,9 +126,6 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 	struct mem_section *ms;
 	struct page *page, *memmap;

-	if (!pfn_valid(start_pfn))
-		return;
-
 	section_nr = pfn_to_section_nr(start_pfn);
 	ms = __nr_to_section(section_nr);

@@ -187,9 +184,10 @@ void register_page_bootmem_info_node(struct pglist_data *pgdat)
 	end_pfn = pfn + pgdat->node_spanned_pages;

 	/* register_section info */
-	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION)
-		register_page_bootmem_info_section(pfn);
-
+	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+		if (pfn_valid(pfn) && (pfn_to_nid(pfn) == node))
+			register_page_bootmem_info_section(pfn);
+	}
 }
 #endif /* !CONFIG_SPARSEMEM_VMEMMAP */

-- 
1.7.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH RESEND] memory hotplug: fix a double register section info bug
  2012-09-14  3:43 [PATCH RESEND] memory hotplug: fix a double register section info bug qiuxishi
@ 2012-09-14  9:52 ` Mel Gorman
  2012-09-14 16:24   ` Luck, Tony
  2012-09-14 11:00 ` Yasuaki Ishimatsu
  1 sibling, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2012-09-14  9:52 UTC (permalink / raw)
  To: qiuxishi
  Cc: akpm, tony.luck, Jiang Liu, qiuxishi, bessel.wang, wujianguo,
	paul.gortmaker, kamezawa.hiroyu, kosaki.motohiro, rientjes,
	Minchan Kim, linux-mm, linux-kernel, Wen Congyang

On Fri, Sep 14, 2012 at 11:43:27AM +0800, qiuxishi wrote:
> There may be a bug when registering section info. For example, on
> my Itanium platform, the pfn range of node0 includes the other nodes,
> so other nodes' section info will be double registered, and memmap's
> page count will equal to 3.
> 
> node0: start_pfn=0x100,    spanned_pfn=0x20fb00, present_pfn=0x7f8a3, => 0x000100-0x20fc00
> node1: start_pfn=0x80000,  spanned_pfn=0x80000,  present_pfn=0x80000, => 0x080000-0x100000
> node2: start_pfn=0x100000, spanned_pfn=0x80000,  present_pfn=0x80000, => 0x100000-0x180000
> node3: start_pfn=0x180000, spanned_pfn=0x80000,  present_pfn=0x80000, => 0x180000-0x200000
> 

This is an unusual configuration but it's not unheard of. PPC64 in rare
(and usually broken) configurations can have one node span another. Tony
should know if such a configuration is normally allowed on Itanium or if
this should be considered a platform bug. Tony?

> free_all_bootmem_node()
> 	register_page_bootmem_info_node()
> 		register_page_bootmem_info_section()
> 
> When hot remove memory, we can't free the memmap's page because
> page_count() is 2 after put_page_bootmem().
> 
> sparse_remove_one_section()
> 	free_section_usemap()
> 		free_map_bootmem()
> 			put_page_bootmem()
> 
> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
> ---
>  mm/memory_hotplug.c |   10 ++++------
>  1 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 2adbcac..cf493c7 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -126,9 +126,6 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
>  	struct mem_section *ms;
>  	struct page *page, *memmap;
> 
> -	if (!pfn_valid(start_pfn))
> -		return;
> -
>  	section_nr = pfn_to_section_nr(start_pfn);
>  	ms = __nr_to_section(section_nr);
> 
> @@ -187,9 +184,10 @@ void register_page_bootmem_info_node(struct pglist_data *pgdat)
>  	end_pfn = pfn + pgdat->node_spanned_pages;
> 
>  	/* register_section info */
> -	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION)
> -		register_page_bootmem_info_section(pfn);
> -
> +	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> +		if (pfn_valid(pfn) && (pfn_to_nid(pfn) == node))
> +			register_page_bootmem_info_section(pfn);
> +	}

Functionally what the patch does is check if the PFN is both valid *and*
belongs to the expected node to catch a situation where nodes overlap. As
there are no other callers of register_page_bootmem_info_section() this
patch seems reasonable to me so

Acked-by: Mel Gorman <mgorman@suse.de>

I think it would also be ok to consider this a -stable candidate.

>  }
>  #endif /* !CONFIG_SPARSEMEM_VMEMMAP */
> 
> -- 
> 1.7.1

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RESEND] memory hotplug: fix a double register section info bug
  2012-09-14  3:43 [PATCH RESEND] memory hotplug: fix a double register section info bug qiuxishi
  2012-09-14  9:52 ` Mel Gorman
@ 2012-09-14 11:00 ` Yasuaki Ishimatsu
  2012-09-14 20:14   ` Andrew Morton
  1 sibling, 1 reply; 7+ messages in thread
From: Yasuaki Ishimatsu @ 2012-09-14 11:00 UTC (permalink / raw)
  To: qiuxishi
  Cc: akpm, mgorman, tony.luck, Jiang Liu, qiuxishi, bessel.wang,
	wujianguo, paul.gortmaker, kamezawa.hiroyu, kosaki.motohiro,
	rientjes, Minchan Kim, linux-mm, linux-kernel, Wen Congyang

HiXishi,

2012/09/14 12:43, qiuxishi wrote:
> There may be a bug when registering section info. For example, on
> my Itanium platform, the pfn range of node0 includes the other nodes,
> so other nodes' section info will be double registered, and memmap's
> page count will equal to 3.
>
> node0: start_pfn=0x100,    spanned_pfn=0x20fb00, present_pfn=0x7f8a3, => 0x000100-0x20fc00
> node1: start_pfn=0x80000,  spanned_pfn=0x80000,  present_pfn=0x80000, => 0x080000-0x100000
> node2: start_pfn=0x100000, spanned_pfn=0x80000,  present_pfn=0x80000, => 0x100000-0x180000
> node3: start_pfn=0x180000, spanned_pfn=0x80000,  present_pfn=0x80000, => 0x180000-0x200000
>
> free_all_bootmem_node()
> 	register_page_bootmem_info_node()
> 		register_page_bootmem_info_section()
>
> When hot remove memory, we can't free the memmap's page because
> page_count() is 2 after put_page_bootmem().
>
> sparse_remove_one_section()
> 	free_section_usemap()
> 		free_map_bootmem()
> 			put_page_bootmem()
>
> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
> ---
>   mm/memory_hotplug.c |   10 ++++------
>   1 files changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 2adbcac..cf493c7 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -126,9 +126,6 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
>   	struct mem_section *ms;
>   	struct page *page, *memmap;
>
> -	if (!pfn_valid(start_pfn))
> -		return;
> -
>   	section_nr = pfn_to_section_nr(start_pfn);
>   	ms = __nr_to_section(section_nr);
>
> @@ -187,9 +184,10 @@ void register_page_bootmem_info_node(struct pglist_data *pgdat)
>   	end_pfn = pfn + pgdat->node_spanned_pages;
>
>   	/* register_section info */
> -	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION)
> -		register_page_bootmem_info_section(pfn);
> -
> +	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> +		if (pfn_valid(pfn) && (pfn_to_nid(pfn) == node))

I cannot judge whether your configuration is correct or not.
Thus if it is correct, I want a comment of why the node check is
needed. In usual configuration, a node does not span the other one.
So it is natural that "pfn_to_nid(pfn) is same as "pgdat->node_id".
Thus we may remove the node check in the future.

Thanks,
Yasuaki Ishimatsu

> +			register_page_bootmem_info_section(pfn);
> +	}
>   }
>   #endif /* !CONFIG_SPARSEMEM_VMEMMAP */
>



^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH RESEND] memory hotplug: fix a double register section info bug
  2012-09-14  9:52 ` Mel Gorman
@ 2012-09-14 16:24   ` Luck, Tony
  2012-09-17 14:21     ` Mel Gorman
  0 siblings, 1 reply; 7+ messages in thread
From: Luck, Tony @ 2012-09-14 16:24 UTC (permalink / raw)
  To: Mel Gorman, qiuxishi
  Cc: akpm, Jiang Liu, qiuxishi, bessel.wang, wujianguo,
	paul.gortmaker, kamezawa.hiroyu, kosaki.motohiro, rientjes,
	Minchan Kim, linux-mm, linux-kernel, Wen Congyang

> This is an unusual configuration but it's not unheard of. PPC64 in rare
> (and usually broken) configurations can have one node span another. Tony
> should know if such a configuration is normally allowed on Itanium or if
> this should be considered a platform bug. Tony?

We definitely have platforms where the physical memory on node 0
that we skipped to leave physical address space for PCI mem mapped
devices gets tagged back at the very top of memory, after other nodes.

E.g. A 2-node system with 8G on each might look like this:

0-2G RAM on node 0
2G-4G  PCI map space
4G-8G RAM on node 0
8G-16GRAM on node 1
16G-18G RAM on node 0

Is this the situation that we are talking about? Or something different?

-Tony

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RESEND] memory hotplug: fix a double register section info bug
  2012-09-14 11:00 ` Yasuaki Ishimatsu
@ 2012-09-14 20:14   ` Andrew Morton
  2012-09-18  0:11     ` Yasuaki Ishimatsu
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2012-09-14 20:14 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: qiuxishi, mgorman, tony.luck, Jiang Liu, qiuxishi, bessel.wang,
	wujianguo, paul.gortmaker, kamezawa.hiroyu, kosaki.motohiro,
	rientjes, Minchan Kim, linux-mm, linux-kernel, Wen Congyang

On Fri, 14 Sep 2012 20:00:09 +0900
Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> wrote:

> > @@ -187,9 +184,10 @@ void register_page_bootmem_info_node(struct pglist_data *pgdat)
> >   	end_pfn = pfn + pgdat->node_spanned_pages;
> >
> >   	/* register_section info */
> > -	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION)
> > -		register_page_bootmem_info_section(pfn);
> > -
> > +	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> > +		if (pfn_valid(pfn) && (pfn_to_nid(pfn) == node))
> 
> I cannot judge whether your configuration is correct or not.
> Thus if it is correct, I want a comment of why the node check is
> needed. In usual configuration, a node does not span the other one.
> So it is natural that "pfn_to_nid(pfn) is same as "pgdat->node_id".
> Thus we may remove the node check in the future.

yup.  How does this look?

--- a/mm/memory_hotplug.c~memory-hotplug-fix-a-double-register-section-info-bug-fix
+++ a/mm/memory_hotplug.c
@@ -185,6 +185,12 @@ void register_page_bootmem_info_node(str
 
 	/* register_section info */
 	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+		/*
+		 * Some platforms can assign the same pfn to multiple nodes - on
+		 * node0 as well as nodeN.  To avoid registering a pfn against
+		 * multiple nodes we check that this pfn does not already
+		 * reside in some other node.
+		 */
 		if (pfn_valid(pfn) && (pfn_to_nid(pfn) == node))
 			register_page_bootmem_info_section(pfn);
 	}
_


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RESEND] memory hotplug: fix a double register section info bug
  2012-09-14 16:24   ` Luck, Tony
@ 2012-09-17 14:21     ` Mel Gorman
  0 siblings, 0 replies; 7+ messages in thread
From: Mel Gorman @ 2012-09-17 14:21 UTC (permalink / raw)
  To: Luck, Tony
  Cc: qiuxishi, akpm, Jiang Liu, qiuxishi, bessel.wang, wujianguo,
	paul.gortmaker, kamezawa.hiroyu, kosaki.motohiro, rientjes,
	Minchan Kim, linux-mm, linux-kernel, Wen Congyang

On Fri, Sep 14, 2012 at 04:24:32PM +0000, Luck, Tony wrote:
> > This is an unusual configuration but it's not unheard of. PPC64 in rare
> > (and usually broken) configurations can have one node span another. Tony
> > should know if such a configuration is normally allowed on Itanium or if
> > this should be considered a platform bug. Tony?
> 
> We definitely have platforms where the physical memory on node 0
> that we skipped to leave physical address space for PCI mem mapped
> devices gets tagged back at the very top of memory, after other nodes.
> 
> E.g. A 2-node system with 8G on each might look like this:
> 
> 0-2G RAM on node 0
> 2G-4G  PCI map space
> 4G-8G RAM on node 0
> 8G-16GRAM on node 1
> 16G-18G RAM on node 0
> 
> Is this the situation that we are talking about? Or something different?
> 

This is the type of situation we are talking about. The spanned range of
node 0 includes node 1. The patch needs another revision with a comment
explaining the situation included but otherwise the patch should be
fine.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RESEND] memory hotplug: fix a double register section info bug
  2012-09-14 20:14   ` Andrew Morton
@ 2012-09-18  0:11     ` Yasuaki Ishimatsu
  0 siblings, 0 replies; 7+ messages in thread
From: Yasuaki Ishimatsu @ 2012-09-18  0:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: qiuxishi, mgorman, tony.luck, Jiang Liu, qiuxishi, bessel.wang,
	wujianguo, paul.gortmaker, kamezawa.hiroyu, kosaki.motohiro,
	rientjes, Minchan Kim, linux-mm, linux-kernel, Wen Congyang

2012/09/15 5:14, Andrew Morton wrote:
> On Fri, 14 Sep 2012 20:00:09 +0900
> Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> wrote:
>
>>> @@ -187,9 +184,10 @@ void register_page_bootmem_info_node(struct pglist_data *pgdat)
>>>    	end_pfn = pfn + pgdat->node_spanned_pages;
>>>
>>>    	/* register_section info */
>>> -	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION)
>>> -		register_page_bootmem_info_section(pfn);
>>> -
>>> +	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>>> +		if (pfn_valid(pfn) && (pfn_to_nid(pfn) == node))
>>
>> I cannot judge whether your configuration is correct or not.
>> Thus if it is correct, I want a comment of why the node check is
>> needed. In usual configuration, a node does not span the other one.
>> So it is natural that "pfn_to_nid(pfn) is same as "pgdat->node_id".
>> Thus we may remove the node check in the future.
>
> yup.  How does this look?

Looks good to me.

Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

>
> --- a/mm/memory_hotplug.c~memory-hotplug-fix-a-double-register-section-info-bug-fix
> +++ a/mm/memory_hotplug.c
> @@ -185,6 +185,12 @@ void register_page_bootmem_info_node(str
>
>   	/* register_section info */
>   	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> +		/*
> +		 * Some platforms can assign the same pfn to multiple nodes - on
> +		 * node0 as well as nodeN.  To avoid registering a pfn against
> +		 * multiple nodes we check that this pfn does not already
> +		 * reside in some other node.
> +		 */
>   		if (pfn_valid(pfn) && (pfn_to_nid(pfn) == node))
>   			register_page_bootmem_info_section(pfn);
>   	}
> _
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-09-18  0:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-14  3:43 [PATCH RESEND] memory hotplug: fix a double register section info bug qiuxishi
2012-09-14  9:52 ` Mel Gorman
2012-09-14 16:24   ` Luck, Tony
2012-09-17 14:21     ` Mel Gorman
2012-09-14 11:00 ` Yasuaki Ishimatsu
2012-09-14 20:14   ` Andrew Morton
2012-09-18  0:11     ` Yasuaki Ishimatsu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).