All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc
@ 2012-06-27 16:36 Gavin Shan
  2012-06-27 16:36 ` [PATCH v2 2/3] mm/sparse: fix possible memory leak Gavin Shan
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Gavin Shan @ 2012-06-27 16:36 UTC (permalink / raw)
  To: linux-mm; +Cc: mhocko, dave, rientjes, hannes, akpm, Gavin Shan

With CONFIG_SPARSEMEM_EXTREME, the two level of memory section
descriptors are allocated from slab or bootmem. When allocating
from slab, let slab/bootmem allocator to clear the memory chunk.
We needn't clear that explicitly.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
---
 mm/sparse.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/mm/sparse.c b/mm/sparse.c
index 6a4bf91..781fa04 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -65,14 +65,12 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
 
 	if (slab_is_available()) {
 		if (node_state(nid, N_HIGH_MEMORY))
-			section = kmalloc_node(array_size, GFP_KERNEL, nid);
+			section = kzalloc_node(array_size, GFP_KERNEL, nid);
 		else
-			section = kmalloc(array_size, GFP_KERNEL);
-	} else
+			section = kzalloc(array_size, GFP_KERNEL);
+	} else {
 		section = alloc_bootmem_node(NODE_DATA(nid), array_size);
-
-	if (section)
-		memset(section, 0, array_size);
+	}
 
 	return section;
 }
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/3] mm/sparse: fix possible memory leak
  2012-06-27 16:36 [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc Gavin Shan
@ 2012-06-27 16:36 ` Gavin Shan
  2012-06-27 17:01   ` Dave Hansen
                     ` (2 more replies)
  2012-06-27 16:36 ` [PATCH v2 3/3] mm/sparse: more check on mem_section number Gavin Shan
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 19+ messages in thread
From: Gavin Shan @ 2012-06-27 16:36 UTC (permalink / raw)
  To: linux-mm; +Cc: mhocko, dave, rientjes, hannes, akpm, Gavin Shan

With CONFIG_SPARSEMEM_EXTREME, the root memory section descriptors
are allocated by slab or bootmem allocator. Also, the descriptors
might have been allocated and initialized during the hotplug path.
However, the memory chunk allocated in current implementation wouldn't
be put into the available pool if that has been allocated. The situation
will lead to memory leak.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
---
 mm/sparse.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/mm/sparse.c b/mm/sparse.c
index 781fa04..a803599 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -75,6 +75,22 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
 	return section;
 }
 
+static void noinline __init_refok sparse_index_free(struct mem_section *section,
+						    int nid)
+{
+	unsigned long size = SECTIONS_PER_ROOT *
+			     sizeof(struct mem_section);
+
+	if (!section)
+		return;
+
+	if (slab_is_available())
+		kfree(section);
+	else
+		free_bootmem_node(NODE_DATA(nid),
+			virt_to_phys(section), size);
+}
+
 static int __meminit sparse_index_init(unsigned long section_nr, int nid)
 {
 	static DEFINE_SPINLOCK(index_init_lock);
@@ -102,6 +118,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
 	mem_section[root] = section;
 out:
 	spin_unlock(&index_init_lock);
+	if (ret)
+		sparse_index_free(section, nid);
+
 	return ret;
 }
 #else /* !SPARSEMEM_EXTREME */
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/3] mm/sparse: more check on mem_section number
  2012-06-27 16:36 [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc Gavin Shan
  2012-06-27 16:36 ` [PATCH v2 2/3] mm/sparse: fix possible memory leak Gavin Shan
@ 2012-06-27 16:36 ` Gavin Shan
  2012-06-27 22:06   ` David Rientjes
  2012-06-27 22:10 ` [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc David Rientjes
  2012-06-28 12:52 ` Michal Hocko
  3 siblings, 1 reply; 19+ messages in thread
From: Gavin Shan @ 2012-06-27 16:36 UTC (permalink / raw)
  To: linux-mm; +Cc: mhocko, dave, rientjes, hannes, akpm, Gavin Shan

Function __section_nr() was implemented to retrieve the corresponding
memory section number according to its descriptor. It's possible that
the specified memory section descriptor isn't existing in the global
array. So here to add more check on that and report error for wrong
case.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Reviewed-by: Dave Hansen <dave@linux.vnet.ibm.com>
---
 mm/sparse.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/sparse.c b/mm/sparse.c
index a803599..8b8250e 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -149,6 +149,8 @@ int __section_nr(struct mem_section* ms)
 		     break;
 	}
 
+	VM_BUG_ON(root_nr >= NR_SECTION_ROOTS);
+
 	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
 }
 
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] mm/sparse: fix possible memory leak
  2012-06-27 16:36 ` [PATCH v2 2/3] mm/sparse: fix possible memory leak Gavin Shan
@ 2012-06-27 17:01   ` Dave Hansen
  2012-06-28  6:03     ` Gavin Shan
  2012-06-27 22:07   ` David Rientjes
  2012-06-28 12:57   ` Michal Hocko
  2 siblings, 1 reply; 19+ messages in thread
From: Dave Hansen @ 2012-06-27 17:01 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, mhocko, rientjes, hannes, akpm

On 06/27/2012 09:36 AM, Gavin Shan wrote:
> With CONFIG_SPARSEMEM_EXTREME, the root memory section descriptors
> are allocated by slab or bootmem allocator. Also, the descriptors
> might have been allocated and initialized during the hotplug path.
> However, the memory chunk allocated in current implementation wouldn't
> be put into the available pool if that has been allocated. The situation
> will lead to memory leak.

I've read this changelog about ten times and I'm still not really clear
what the bug is here.

--

sparse_index_init() is designed to be safe if two copies of it race.  It
uses "index_init_lock" to ensure that, even in the case of a race, only
one CPU will manage to do:

	mem_section[root] = section;

However, in the case where two copies of sparse_index_init() _do_ race,
the one that loses the race will leak the "section" that
sparse_index_alloc() allocated for it.  This patch fixes that leak.

--

Technically, I'm not sure that we can race during the time when we'd be
using bootmem.  I think we do all those initializations single-threaded
at the moment, and we'd finish them before we turn the slab on.  So,
technically, we probably don't need the bootmem stuff in
sparse_index_free().  But, I guess it doesn't hurt, and it's fine for
completeness.

Gavin, have you actually tested this in some way?  It looks OK to me,
but I worry that you've just added a block of code that's exceedingly
unlikely to get run.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/3] mm/sparse: more check on mem_section number
  2012-06-27 16:36 ` [PATCH v2 3/3] mm/sparse: more check on mem_section number Gavin Shan
@ 2012-06-27 22:06   ` David Rientjes
  2012-06-27 22:14     ` Dave Hansen
  2012-06-28  6:18     ` Gavin Shan
  0 siblings, 2 replies; 19+ messages in thread
From: David Rientjes @ 2012-06-27 22:06 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, mhocko, dave, hannes, akpm

On Thu, 28 Jun 2012, Gavin Shan wrote:

> diff --git a/mm/sparse.c b/mm/sparse.c
> index a803599..8b8250e 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -149,6 +149,8 @@ int __section_nr(struct mem_section* ms)
>  		     break;
>  	}
>  
> +	VM_BUG_ON(root_nr >= NR_SECTION_ROOTS);
> +

VM_BUG_ON(root_nr == NR_SECTION_ROOTS);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] mm/sparse: fix possible memory leak
  2012-06-27 16:36 ` [PATCH v2 2/3] mm/sparse: fix possible memory leak Gavin Shan
  2012-06-27 17:01   ` Dave Hansen
@ 2012-06-27 22:07   ` David Rientjes
  2012-06-28  6:16     ` Gavin Shan
  2012-06-28 12:57   ` Michal Hocko
  2 siblings, 1 reply; 19+ messages in thread
From: David Rientjes @ 2012-06-27 22:07 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, mhocko, dave, hannes, akpm

On Thu, 28 Jun 2012, Gavin Shan wrote:

> diff --git a/mm/sparse.c b/mm/sparse.c
> index 781fa04..a803599 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -75,6 +75,22 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>  	return section;
>  }
>  
> +static void noinline __init_refok sparse_index_free(struct mem_section *section,
> +						    int nid)

noinline is unecessary, this is only referenced from sparse_index_init() 
and it's perfectly legimitate to inline.  Also, this should be __meminit 
and not __init.

> +{
> +	unsigned long size = SECTIONS_PER_ROOT *
> +			     sizeof(struct mem_section);
> +
> +	if (!section)
> +		return;
> +
> +	if (slab_is_available())
> +		kfree(section);
> +	else
> +		free_bootmem_node(NODE_DATA(nid),
> +			virt_to_phys(section), size);

Did you check what happens here if !node_state(nid, N_HIGH_MEMORY)?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc
  2012-06-27 16:36 [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc Gavin Shan
  2012-06-27 16:36 ` [PATCH v2 2/3] mm/sparse: fix possible memory leak Gavin Shan
  2012-06-27 16:36 ` [PATCH v2 3/3] mm/sparse: more check on mem_section number Gavin Shan
@ 2012-06-27 22:10 ` David Rientjes
  2012-06-28 12:52 ` Michal Hocko
  3 siblings, 0 replies; 19+ messages in thread
From: David Rientjes @ 2012-06-27 22:10 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, mhocko, dave, hannes, akpm

On Thu, 28 Jun 2012, Gavin Shan wrote:

> With CONFIG_SPARSEMEM_EXTREME, the two level of memory section
> descriptors are allocated from slab or bootmem. When allocating
> from slab, let slab/bootmem allocator to clear the memory chunk.
> We needn't clear that explicitly.
> 
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> Reviewed-by: Michal Hocko <mhocko@suse.cz>

Acked-by: David Rientjes <rientjes@google.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/3] mm/sparse: more check on mem_section number
  2012-06-27 22:06   ` David Rientjes
@ 2012-06-27 22:14     ` Dave Hansen
  2012-06-28  6:18     ` Gavin Shan
  1 sibling, 0 replies; 19+ messages in thread
From: Dave Hansen @ 2012-06-27 22:14 UTC (permalink / raw)
  To: David Rientjes; +Cc: Gavin Shan, linux-mm, mhocko, hannes, akpm

On 06/27/2012 03:06 PM, David Rientjes wrote:
>> > +	VM_BUG_ON(root_nr >= NR_SECTION_ROOTS);
>> > +
> VM_BUG_ON(root_nr == NR_SECTION_ROOTS);

Whoops, when I suggested >=, I wasn't reading the context.  I thought
root_nr was an argument, not a for() loop variable.  This isn't exactly
_broken_, but it makes no sense the way the code is now.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] mm/sparse: fix possible memory leak
  2012-06-27 17:01   ` Dave Hansen
@ 2012-06-28  6:03     ` Gavin Shan
  2012-06-28 14:54       ` Dave Hansen
  0 siblings, 1 reply; 19+ messages in thread
From: Gavin Shan @ 2012-06-28  6:03 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Gavin Shan, linux-mm, mhocko, rientjes, hannes, akpm

>> With CONFIG_SPARSEMEM_EXTREME, the root memory section descriptors
>> are allocated by slab or bootmem allocator. Also, the descriptors
>> might have been allocated and initialized during the hotplug path.
>> However, the memory chunk allocated in current implementation wouldn't
>> be put into the available pool if that has been allocated. The situation
>> will lead to memory leak.
>
>I've read this changelog about ten times and I'm still not really clear
>what the bug is here.
>

yep, I need improve my written English definitely :-)

>--
>
>sparse_index_init() is designed to be safe if two copies of it race.  It
>uses "index_init_lock" to ensure that, even in the case of a race, only
>one CPU will manage to do:
>
>	mem_section[root] = section;
>
>However, in the case where two copies of sparse_index_init() _do_ race,
>the one that loses the race will leak the "section" that
>sparse_index_alloc() allocated for it.  This patch fixes that leak.
>
>--

Thank you very much, Dave. Let me merge your changelog into next version.

>
>Technically, I'm not sure that we can race during the time when we'd be
>using bootmem.  I think we do all those initializations single-threaded
>at the moment, and we'd finish them before we turn the slab on.  So,
>technically, we probably don't need the bootmem stuff in
>sparse_index_free().  But, I guess it doesn't hurt, and it's fine for
>completeness.
>
>Gavin, have you actually tested this in some way?  It looks OK to me,
>but I worry that you've just added a block of code that's exceedingly
>unlikely to get run.

I didn't test this and I just catch the point while reading the source
code. By the way, I would like to know the popular utilities used for
memory testing. If you can share some information regarding that, that
would be great.

	- memory related benchmark testing utility.
	- some documents on Linux memory testing.

Thanks,
Gavin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] mm/sparse: fix possible memory leak
  2012-06-27 22:07   ` David Rientjes
@ 2012-06-28  6:16     ` Gavin Shan
  2012-06-28 21:34       ` David Rientjes
  0 siblings, 1 reply; 19+ messages in thread
From: Gavin Shan @ 2012-06-28  6:16 UTC (permalink / raw)
  To: David Rientjes; +Cc: Gavin Shan, linux-mm, mhocko, dave, hannes, akpm

>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index 781fa04..a803599 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -75,6 +75,22 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>>  	return section;
>>  }
>>  
>> +static void noinline __init_refok sparse_index_free(struct mem_section *section,
>> +						    int nid)
>
>noinline is unecessary, this is only referenced from sparse_index_init() 
>and it's perfectly legimitate to inline.  Also, this should be __meminit 
>and not __init.
>

Thanks for your comments. I'll change it into "inline __meminit" in next version :-)

>> +{
>> +	unsigned long size = SECTIONS_PER_ROOT *
>> +			     sizeof(struct mem_section);
>> +
>> +	if (!section)
>> +		return;
>> +
>> +	if (slab_is_available())
>> +		kfree(section);
>> +	else
>> +		free_bootmem_node(NODE_DATA(nid),
>> +			virt_to_phys(section), size);
>
>Did you check what happens here if !node_state(nid, N_HIGH_MEMORY)?
>

I'm sorry that I'm not catching your point. Please explain for more
if necessary.

I'm not sure you're talking about "kfree(section);" since the memory
chunk is allocated either by kzalloc_node() or kzalloc().

Thanks,
Gavin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/3] mm/sparse: more check on mem_section number
  2012-06-27 22:06   ` David Rientjes
  2012-06-27 22:14     ` Dave Hansen
@ 2012-06-28  6:18     ` Gavin Shan
  1 sibling, 0 replies; 19+ messages in thread
From: Gavin Shan @ 2012-06-28  6:18 UTC (permalink / raw)
  To: David Rientjes; +Cc: Gavin Shan, linux-mm, mhocko, dave, hannes, akpm

On Wed, Jun 27, 2012 at 03:06:52PM -0700, David Rientjes wrote:
>On Thu, 28 Jun 2012, Gavin Shan wrote:
>
>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index a803599..8b8250e 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -149,6 +149,8 @@ int __section_nr(struct mem_section* ms)
>>  		     break;
>>  	}
>>  
>> +	VM_BUG_ON(root_nr >= NR_SECTION_ROOTS);
>> +
>
>VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
>

Thanks, David. I will change it according to your comments.

Thanks,
Gavin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc
  2012-06-27 16:36 [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc Gavin Shan
                   ` (2 preceding siblings ...)
  2012-06-27 22:10 ` [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc David Rientjes
@ 2012-06-28 12:52 ` Michal Hocko
  2012-06-29  1:20   ` Gavin Shan
  3 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2012-06-28 12:52 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, dave, rientjes, hannes, akpm

On Thu 28-06-12 00:36:06, Gavin Shan wrote:
> With CONFIG_SPARSEMEM_EXTREME, the two level of memory section
> descriptors are allocated from slab or bootmem. When allocating
> from slab, let slab/bootmem allocator to clear the memory chunk.
> We needn't clear that explicitly.
> 
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> Reviewed-by: Michal Hocko <mhocko@suse.cz>

Well, I don't remember to give my r-b but now you have it official
(please do not do that in future)
Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  mm/sparse.c |   10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 6a4bf91..781fa04 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -65,14 +65,12 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>  
>  	if (slab_is_available()) {
>  		if (node_state(nid, N_HIGH_MEMORY))
> -			section = kmalloc_node(array_size, GFP_KERNEL, nid);
> +			section = kzalloc_node(array_size, GFP_KERNEL, nid);
>  		else
> -			section = kmalloc(array_size, GFP_KERNEL);
> -	} else
> +			section = kzalloc(array_size, GFP_KERNEL);
> +	} else {
>  		section = alloc_bootmem_node(NODE_DATA(nid), array_size);
> -
> -	if (section)
> -		memset(section, 0, array_size);
> +	}
>  
>  	return section;
>  }
> -- 
> 1.7.9.5
> 

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] mm/sparse: fix possible memory leak
  2012-06-27 16:36 ` [PATCH v2 2/3] mm/sparse: fix possible memory leak Gavin Shan
  2012-06-27 17:01   ` Dave Hansen
  2012-06-27 22:07   ` David Rientjes
@ 2012-06-28 12:57   ` Michal Hocko
  2012-06-29  1:22     ` Gavin Shan
  2 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2012-06-28 12:57 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, dave, rientjes, hannes, akpm

On Thu 28-06-12 00:36:07, Gavin Shan wrote:
> With CONFIG_SPARSEMEM_EXTREME, the root memory section descriptors
> are allocated by slab or bootmem allocator. Also, the descriptors
> might have been allocated and initialized during the hotplug path.
> However, the memory chunk allocated in current implementation wouldn't
> be put into the available pool if that has been allocated. The situation
> will lead to memory leak.
> 
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> Reviewed-by: Michal Hocko <mhocko@suse.cz>

And again!
To quote my answers to this patch in previous run:
"
I am not saying the bug is not real. It is just that the changelog
doesn's say how the bug is hit, who is affected and when it has been
introduced. These is essential for stable.
"

Does this sound like Reviewed-by? Hell no!

This changelog btw. doesn't say this either!

> ---
>  mm/sparse.c |   19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 781fa04..a803599 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -75,6 +75,22 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>  	return section;
>  }
>  
> +static void noinline __init_refok sparse_index_free(struct mem_section *section,
> +						    int nid)
> +{
> +	unsigned long size = SECTIONS_PER_ROOT *
> +			     sizeof(struct mem_section);
> +
> +	if (!section)
> +		return;
> +
> +	if (slab_is_available())
> +		kfree(section);
> +	else
> +		free_bootmem_node(NODE_DATA(nid),
> +			virt_to_phys(section), size);
> +}
> +
>  static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>  {
>  	static DEFINE_SPINLOCK(index_init_lock);
> @@ -102,6 +118,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>  	mem_section[root] = section;
>  out:
>  	spin_unlock(&index_init_lock);
> +	if (ret)
> +		sparse_index_free(section, nid);
> +
>  	return ret;
>  }
>  #else /* !SPARSEMEM_EXTREME */
> -- 
> 1.7.9.5
> 

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] mm/sparse: fix possible memory leak
  2012-06-28  6:03     ` Gavin Shan
@ 2012-06-28 14:54       ` Dave Hansen
  2012-07-01 13:26         ` Gavin Shan
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Hansen @ 2012-06-28 14:54 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, mhocko, rientjes, hannes, akpm

On 06/27/2012 11:03 PM, Gavin Shan wrote:
>> >Gavin, have you actually tested this in some way?  It looks OK to me,
>> >but I worry that you've just added a block of code that's exceedingly
>> >unlikely to get run.
> I didn't test this and I just catch the point while reading the source
> code. By the way, I would like to know the popular utilities used for
> memory testing. If you can share some information regarding that, that
> would be great.
> 
> 	- memory related benchmark testing utility.
> 	- some documents on Linux memory testing.

This patch is intended to fix a memory leak in the case of a race.  Can
you _actually_ make it race to ensure that things work properly?  If
not, can you add something like a sleep() to _force_ it to race?

Or, have you simply run your code a couple of times like this, both for
the bootmem and slab cases:

	int nid = 0;
	for (i=0; i < something; i++) {
		section = sparse_index_alloc(nid);
		sparse_index_free(section, nid);
	}

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] mm/sparse: fix possible memory leak
  2012-06-28  6:16     ` Gavin Shan
@ 2012-06-28 21:34       ` David Rientjes
  2012-07-01 13:41         ` Gavin Shan
  0 siblings, 1 reply; 19+ messages in thread
From: David Rientjes @ 2012-06-28 21:34 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, mhocko, dave, hannes, akpm

On Thu, 28 Jun 2012, Gavin Shan wrote:

> >> +{
> >> +	unsigned long size = SECTIONS_PER_ROOT *
> >> +			     sizeof(struct mem_section);
> >> +
> >> +	if (!section)
> >> +		return;
> >> +
> >> +	if (slab_is_available())
> >> +		kfree(section);
> >> +	else
> >> +		free_bootmem_node(NODE_DATA(nid),
> >> +			virt_to_phys(section), size);
> >
> >Did you check what happens here if !node_state(nid, N_HIGH_MEMORY)?
> >
> 
> I'm sorry that I'm not catching your point. Please explain for more
> if necessary.
> 

I'm asking specifically about the free_bootmem_node(NODE_DATA(nid), ...).

If this section was allocated in sparse_index_alloc() before 
slab_is_available() with alloc_bootmem_node() and nid is not in 
N_HIGH_MEMORY, will alloc_bootmem_node() fallback to any node or return 
NULL?

If it falls back to any node, is it safe to try to free that section by 
passing NODE_DATA(nid) here when it wasn't allocated on that nid?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc
  2012-06-28 12:52 ` Michal Hocko
@ 2012-06-29  1:20   ` Gavin Shan
  0 siblings, 0 replies; 19+ messages in thread
From: Gavin Shan @ 2012-06-29  1:20 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, dave, rientjes, hannes, akpm

>> With CONFIG_SPARSEMEM_EXTREME, the two level of memory section
>> descriptors are allocated from slab or bootmem. When allocating
>> from slab, let slab/bootmem allocator to clear the memory chunk.
>> We needn't clear that explicitly.
>> 
>> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
>> Reviewed-by: Michal Hocko <mhocko@suse.cz>
>
>Well, I don't remember to give my r-b but now you have it official
>(please do not do that in future)

Ok. I thought anybody gave comments should be put into r-b list, which
is wrong. I won't do it and thanks for your comments :-)

>Reviewed-by: Michal Hocko <mhocko@suse.cz>

Thanks, Michal.

Gavin

>
>> ---
>>  mm/sparse.c |   10 ++++------
>>  1 file changed, 4 insertions(+), 6 deletions(-)
>> 
>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index 6a4bf91..781fa04 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -65,14 +65,12 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>>  
>>  	if (slab_is_available()) {
>>  		if (node_state(nid, N_HIGH_MEMORY))
>> -			section = kmalloc_node(array_size, GFP_KERNEL, nid);
>> +			section = kzalloc_node(array_size, GFP_KERNEL, nid);
>>  		else
>> -			section = kmalloc(array_size, GFP_KERNEL);
>> -	} else
>> +			section = kzalloc(array_size, GFP_KERNEL);
>> +	} else {
>>  		section = alloc_bootmem_node(NODE_DATA(nid), array_size);
>> -
>> -	if (section)
>> -		memset(section, 0, array_size);
>> +	}
>>  
>>  	return section;
>>  }
>> -- 
>> 1.7.9.5
>> 
>
>-- 
>Michal Hocko
>SUSE Labs
>SUSE LINUX s.r.o.
>Lihovarska 1060/12
>190 00 Praha 9    
>Czech Republic
>
>--
>To unsubscribe, send a message with 'unsubscribe linux-mm' in
>the body to majordomo@kvack.org.  For more info on Linux MM,
>see: http://www.linux-mm.org/ .
>Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] mm/sparse: fix possible memory leak
  2012-06-28 12:57   ` Michal Hocko
@ 2012-06-29  1:22     ` Gavin Shan
  0 siblings, 0 replies; 19+ messages in thread
From: Gavin Shan @ 2012-06-29  1:22 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, dave, rientjes, hannes, akpm

On Thu, Jun 28, 2012 at 02:57:42PM +0200, Michal Hocko wrote:
>On Thu 28-06-12 00:36:07, Gavin Shan wrote:
>> With CONFIG_SPARSEMEM_EXTREME, the root memory section descriptors
>> are allocated by slab or bootmem allocator. Also, the descriptors
>> might have been allocated and initialized during the hotplug path.
>> However, the memory chunk allocated in current implementation wouldn't
>> be put into the available pool if that has been allocated. The situation
>> will lead to memory leak.
>> 
>> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
>> Reviewed-by: Michal Hocko <mhocko@suse.cz>
>
>And again!
>To quote my answers to this patch in previous run:
>"
>I am not saying the bug is not real. It is just that the changelog
>doesn's say how the bug is hit, who is affected and when it has been
>introduced. These is essential for stable.
>"
>
>Does this sound like Reviewed-by? Hell no!
>

Ok. I won't do this again wrongly.

>This changelog btw. doesn't say this either!
>

Here's the changelog that Dave suggested. I just copy & paste it.

--

sparse_index_init() is designed to be safe if two copies of it race.  It
uses "index_init_lock" to ensure that, even in the case of a race, only
one CPU will manage to do:

        mem_section[root] = section;

However, in the case where two copies of sparse_index_init() _do_ race,
the one that loses the race will leak the "section" that
sparse_index_alloc() allocated for it.  This patch fixes that leak.

--

Thanks,
Gavin

>> ---
>>  mm/sparse.c |   19 +++++++++++++++++++
>>  1 file changed, 19 insertions(+)
>> 
>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index 781fa04..a803599 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -75,6 +75,22 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>>  	return section;
>>  }
>>  
>> +static void noinline __init_refok sparse_index_free(struct mem_section *section,
>> +						    int nid)
>> +{
>> +	unsigned long size = SECTIONS_PER_ROOT *
>> +			     sizeof(struct mem_section);
>> +
>> +	if (!section)
>> +		return;
>> +
>> +	if (slab_is_available())
>> +		kfree(section);
>> +	else
>> +		free_bootmem_node(NODE_DATA(nid),
>> +			virt_to_phys(section), size);
>> +}
>> +
>>  static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>  {
>>  	static DEFINE_SPINLOCK(index_init_lock);
>> @@ -102,6 +118,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>  	mem_section[root] = section;
>>  out:
>>  	spin_unlock(&index_init_lock);
>> +	if (ret)
>> +		sparse_index_free(section, nid);
>> +
>>  	return ret;
>>  }
>>  #else /* !SPARSEMEM_EXTREME */
>> -- 
>> 1.7.9.5
>> 
>
>-- 
>Michal Hocko
>SUSE Labs
>SUSE LINUX s.r.o.
>Lihovarska 1060/12
>190 00 Praha 9    
>Czech Republic
>
>--
>To unsubscribe, send a message with 'unsubscribe linux-mm' in
>the body to majordomo@kvack.org.  For more info on Linux MM,
>see: http://www.linux-mm.org/ .
>Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] mm/sparse: fix possible memory leak
  2012-06-28 14:54       ` Dave Hansen
@ 2012-07-01 13:26         ` Gavin Shan
  0 siblings, 0 replies; 19+ messages in thread
From: Gavin Shan @ 2012-07-01 13:26 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Gavin Shan, linux-mm, mhocko, rientjes, hannes, akpm

On Thu, Jun 28, 2012 at 07:54:02AM -0700, Dave Hansen wrote:
>On 06/27/2012 11:03 PM, Gavin Shan wrote:
>>> >Gavin, have you actually tested this in some way?  It looks OK to me,
>>> >but I worry that you've just added a block of code that's exceedingly
>>> >unlikely to get run.
>> I didn't test this and I just catch the point while reading the source
>> code. By the way, I would like to know the popular utilities used for
>> memory testing. If you can share some information regarding that, that
>> would be great.
>> 
>> 	- memory related benchmark testing utility.
>> 	- some documents on Linux memory testing.
>
>This patch is intended to fix a memory leak in the case of a race.  Can
>you _actually_ make it race to ensure that things work properly?  If
>not, can you add something like a sleep() to _force_ it to race?
>

Thank you very much, Dave :-)

>Or, have you simply run your code a couple of times like this, both for
>the bootmem and slab cases:
>
>	int nid = 0;
>	for (i=0; i < something; i++) {
>		section = sparse_index_alloc(nid);
>		sparse_index_free(section, nid);
>	}
>

I ran following function for bootmem/slab case and everything looks fine. Please
let me know if you have any more concerns :-)

void sparse_test(void)
{
        int nid;
        int i;
        struct mem_section *section;

        pr_info("*************************************\n");
        if (slab_is_available()) {
                pr_info("* Sparse Testing on slab\n");
        } else {
                pr_info("* Sparse Testing on bootmem\n");
        }
        pr_info("*************************************\n");

	/* Currently, we have 2 nodes in the system */
        for (nid = 0; nid < 2; nid++) {
                for (i = 0; i < 100; i++) {
                        pr_info(" Testing sequence ... %d for nid %d\n", i, nid);
                        section = sparse_index_alloc(nid);
                        sparse_index_free(section, nid);
                }
        }
}

Thanks,
Gavin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] mm/sparse: fix possible memory leak
  2012-06-28 21:34       ` David Rientjes
@ 2012-07-01 13:41         ` Gavin Shan
  0 siblings, 0 replies; 19+ messages in thread
From: Gavin Shan @ 2012-07-01 13:41 UTC (permalink / raw)
  To: David Rientjes; +Cc: Gavin Shan, linux-mm, mhocko, dave, hannes, akpm

On Thu, Jun 28, 2012 at 02:34:29PM -0700, David Rientjes wrote:
>On Thu, 28 Jun 2012, Gavin Shan wrote:
>
>> >> +{
>> >> +	unsigned long size = SECTIONS_PER_ROOT *
>> >> +			     sizeof(struct mem_section);
>> >> +
>> >> +	if (!section)
>> >> +		return;
>> >> +
>> >> +	if (slab_is_available())
>> >> +		kfree(section);
>> >> +	else
>> >> +		free_bootmem_node(NODE_DATA(nid),
>> >> +			virt_to_phys(section), size);
>> >
>> >Did you check what happens here if !node_state(nid, N_HIGH_MEMORY)?
>> >
>> 
>> I'm sorry that I'm not catching your point. Please explain for more
>> if necessary.
>> 
>
>I'm asking specifically about the free_bootmem_node(NODE_DATA(nid), ...).
>

Thanks for pointing it out, David.

>If this section was allocated in sparse_index_alloc() before 
>slab_is_available() with alloc_bootmem_node() and nid is not in 
>N_HIGH_MEMORY, will alloc_bootmem_node() fallback to any node or return 
>NULL?
>

Yes, you're right that bootmem allocator will try other nodes if the
specified node can't accomodate the memory allocation. So it's not
safe to free memory by free_bootmem_node().

>If it falls back to any node, is it safe to try to free that section by 
>passing NODE_DATA(nid) here when it wasn't allocated on that nid?
>

I think free_bootmem() should be used here :-)

Thanks,
Gavin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2012-07-01 13:41 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-27 16:36 [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc Gavin Shan
2012-06-27 16:36 ` [PATCH v2 2/3] mm/sparse: fix possible memory leak Gavin Shan
2012-06-27 17:01   ` Dave Hansen
2012-06-28  6:03     ` Gavin Shan
2012-06-28 14:54       ` Dave Hansen
2012-07-01 13:26         ` Gavin Shan
2012-06-27 22:07   ` David Rientjes
2012-06-28  6:16     ` Gavin Shan
2012-06-28 21:34       ` David Rientjes
2012-07-01 13:41         ` Gavin Shan
2012-06-28 12:57   ` Michal Hocko
2012-06-29  1:22     ` Gavin Shan
2012-06-27 16:36 ` [PATCH v2 3/3] mm/sparse: more check on mem_section number Gavin Shan
2012-06-27 22:06   ` David Rientjes
2012-06-27 22:14     ` Dave Hansen
2012-06-28  6:18     ` Gavin Shan
2012-06-27 22:10 ` [PATCH v2 1/3] mm/sparse: optimize sparse_index_alloc David Rientjes
2012-06-28 12:52 ` Michal Hocko
2012-06-29  1:20   ` Gavin Shan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.