* [PATCH v3 1/3] mm/sparse: optimize sparse_index_alloc
@ 2012-07-02  9:28 Gavin Shan
  2012-07-02  9:28 ` [PATCH v3 2/3] mm/sparse: fix possible memory leak Gavin Shan
  2012-07-02  9:28 ` [PATCH v3 3/3] mm/sparse: more check on mem_section number Gavin Shan
  0 siblings, 2 replies; 13+ messages in thread
From: Gavin Shan @ 2012-07-02  9:28 UTC (permalink / raw)
  To: linux-mm; +Cc: dave, mhocko, rientjes, akpm, Gavin Shan

With CONFIG_SPARSEMEM_EXTREME, the second level of memory section
descriptors is allocated from slab or bootmem. Let the slab allocator
(kzalloc) or the bootmem allocator clear the memory chunk for us, so we
needn't clear it explicitly.
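
For reference, the equivalence being relied on looks roughly like this
(illustrative sketch only, not code from mm/sparse.c):

	/* kzalloc(size, flags) behaves like: */
	ptr = kmalloc(size, flags);
	if (ptr)
		memset(ptr, 0, size);

	/*
	 * alloc_bootmem_node() already returns zeroed memory, so the
	 * explicit memset() after either allocation path is redundant.
	 */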

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
---
 mm/sparse.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/mm/sparse.c b/mm/sparse.c
index 6a4bf91..781fa04 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -65,14 +65,12 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
 
 	if (slab_is_available()) {
 		if (node_state(nid, N_HIGH_MEMORY))
-			section = kmalloc_node(array_size, GFP_KERNEL, nid);
+			section = kzalloc_node(array_size, GFP_KERNEL, nid);
 		else
-			section = kmalloc(array_size, GFP_KERNEL);
-	} else
+			section = kzalloc(array_size, GFP_KERNEL);
+	} else {
 		section = alloc_bootmem_node(NODE_DATA(nid), array_size);
-
-	if (section)
-		memset(section, 0, array_size);
+	}
 
 	return section;
 }
-- 
1.7.9.5


* [PATCH v3 2/3] mm/sparse: fix possible memory leak
  2012-07-02  9:28 [PATCH v3 1/3] mm/sparse: optimize sparse_index_alloc Gavin Shan
@ 2012-07-02  9:28 ` Gavin Shan
  2012-07-02  9:43   ` Michal Hocko
  2012-07-02 11:04   ` David Rientjes
  2012-07-02  9:28 ` [PATCH v3 3/3] mm/sparse: more check on mem_section number Gavin Shan
  1 sibling, 2 replies; 13+ messages in thread
From: Gavin Shan @ 2012-07-02  9:28 UTC (permalink / raw)
  To: linux-mm; +Cc: dave, mhocko, rientjes, akpm, Gavin Shan

sparse_index_init() is designed to be safe if two copies of it race.  It
uses "index_init_lock" to ensure that, even in the case of a race, only
one CPU will manage to do:

	mem_section[root] = section;

However, in the case where two copies of sparse_index_init() _do_ race,
the one that loses the race will leak the "section" that
sparse_index_alloc() allocated for it.  This patch fixes that leak.
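
To illustrate the leak, here is a simplified sketch of the current code
path when two CPUs race (the CPU labels are only for illustration):

	/*
	 * CPU A and CPU B both call sparse_index_init() for sections
	 * that fall into the same root before mem_section[root] is set.
	 */
	section = sparse_index_alloc(nid);	/* both CPUs allocate */
	spin_lock(&index_init_lock);
	if (mem_section[root]) {	/* the loser sees the winner's entry */
		ret = -EEXIST;
		goto out;		/* ...and returns without freeing "section" */
	}
	mem_section[root] = section;	/* the winner installs its allocation */
out:
	spin_unlock(&index_init_lock);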

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 mm/sparse.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/mm/sparse.c b/mm/sparse.c
index 781fa04..a6984d9 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
 	return section;
 }
 
+static inline void __meminit sparse_index_free(struct mem_section *section)
+{
+	unsigned long size = SECTIONS_PER_ROOT *
+			     sizeof(struct mem_section);
+
+	if (!section)
+		return;
+
+	if (slab_is_available())
+		kfree(section);
+	else
+		free_bootmem(virt_to_phys(section), size);
+}
+
 static int __meminit sparse_index_init(unsigned long section_nr, int nid)
 {
 	static DEFINE_SPINLOCK(index_init_lock);
@@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
 	mem_section[root] = section;
 out:
 	spin_unlock(&index_init_lock);
+	if (ret)
+		sparse_index_free(section);
+
 	return ret;
 }
 #else /* !SPARSEMEM_EXTREME */
-- 
1.7.9.5


* [PATCH v3 3/3] mm/sparse: more check on mem_section number
  2012-07-02  9:28 [PATCH v3 1/3] mm/sparse: optimize sparse_index_alloc Gavin Shan
  2012-07-02  9:28 ` [PATCH v3 2/3] mm/sparse: fix possible memory leak Gavin Shan
@ 2012-07-02  9:28 ` Gavin Shan
  2012-07-02 11:05   ` David Rientjes
  1 sibling, 1 reply; 13+ messages in thread
From: Gavin Shan @ 2012-07-02  9:28 UTC (permalink / raw)
  To: linux-mm; +Cc: dave, mhocko, rientjes, akpm, Gavin Shan

__section_nr() retrieves the memory section number that corresponds to a
given section descriptor. It's possible that the specified descriptor
doesn't exist in the global mem_section[] array, so add a check for that
and flag the bad case with VM_BUG_ON().
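
For context, the lookup loop is roughly the following (simplified sketch
of the existing function with the new check shown in place):

	int __section_nr(struct mem_section *ms)
	{
		unsigned long root_nr;
		struct mem_section *root;

		/* Walk every root until one contains the descriptor. */
		for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
			root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
			if (!root)
				continue;
			if ((ms >= root) && (ms < (root + SECTIONS_PER_ROOT)))
				break;
		}

		/* New: the descriptor was not found in any root. */
		VM_BUG_ON(root_nr == NR_SECTION_ROOTS);

		return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
	}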

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 mm/sparse.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/sparse.c b/mm/sparse.c
index a6984d9..f2525fd 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -147,6 +147,8 @@ int __section_nr(struct mem_section* ms)
 		     break;
 	}
 
+	VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
+
 	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
 }
 
-- 
1.7.9.5


* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak
  2012-07-02  9:28 ` [PATCH v3 2/3] mm/sparse: fix possible memory leak Gavin Shan
@ 2012-07-02  9:43   ` Michal Hocko
  2012-07-02 13:40     ` Gavin Shan
  2012-07-02 11:04   ` David Rientjes
  1 sibling, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2012-07-02  9:43 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, dave, rientjes, akpm

On Mon 02-07-12 17:28:56, Gavin Shan wrote:
> sparse_index_init() is designed to be safe if two copies of it race.  It
> uses "index_init_lock" to ensure that, even in the case of a race, only
> one CPU will manage to do:
> 
> 	mem_section[root] = section;
> 
> However, in the case where two copies of sparse_index_init() _do_ race,
> the one that loses the race will leak the "section" that
> sparse_index_alloc() allocated for it.  This patch fixes that leak.

I would still like to hear how we can possibly race in this code path.
I've thought that memory onlining is done from a single CPU.

> 
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> ---
>  mm/sparse.c |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 781fa04..a6984d9 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>  	return section;
>  }
>  
> +static inline void __meminit sparse_index_free(struct mem_section *section)
> +{
> +	unsigned long size = SECTIONS_PER_ROOT *
> +			     sizeof(struct mem_section);
> +
> +	if (!section)
> +		return;
> +
> +	if (slab_is_available())
> +		kfree(section);
> +	else
> +		free_bootmem(virt_to_phys(section), size);
> +}
> +
>  static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>  {
>  	static DEFINE_SPINLOCK(index_init_lock);
> @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>  	mem_section[root] = section;
>  out:
>  	spin_unlock(&index_init_lock);
> +	if (ret)
> +		sparse_index_free(section);
> +
>  	return ret;
>  }
>  #else /* !SPARSEMEM_EXTREME */
> -- 
> 1.7.9.5
> 

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic


* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak
  2012-07-02  9:28 ` [PATCH v3 2/3] mm/sparse: fix possible memory leak Gavin Shan
  2012-07-02  9:43   ` Michal Hocko
@ 2012-07-02 11:04   ` David Rientjes
  2012-07-02 13:28     ` Gavin Shan
  1 sibling, 1 reply; 13+ messages in thread
From: David Rientjes @ 2012-07-02 11:04 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, dave, mhocko, akpm

On Mon, 2 Jul 2012, Gavin Shan wrote:

> diff --git a/mm/sparse.c b/mm/sparse.c
> index 781fa04..a6984d9 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>  	return section;
>  }
>  
> +static inline void __meminit sparse_index_free(struct mem_section *section)
> +{
> +	unsigned long size = SECTIONS_PER_ROOT *
> +			     sizeof(struct mem_section);
> +
> +	if (!section)
> +		return;
> +
> +	if (slab_is_available())
> +		kfree(section);
> +	else
> +		free_bootmem(virt_to_phys(section), size);

Eek, does that work?

> +}
> +
>  static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>  {
>  	static DEFINE_SPINLOCK(index_init_lock);
> @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>  	mem_section[root] = section;
>  out:
>  	spin_unlock(&index_init_lock);
> +	if (ret)
> +		sparse_index_free(section);
> +
>  	return ret;
>  }
>  #else /* !SPARSEMEM_EXTREME */


* Re: [PATCH v3 3/3] mm/sparse: more check on mem_section number
  2012-07-02  9:28 ` [PATCH v3 3/3] mm/sparse: more check on mem_section number Gavin Shan
@ 2012-07-02 11:05   ` David Rientjes
  0 siblings, 0 replies; 13+ messages in thread
From: David Rientjes @ 2012-07-02 11:05 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, dave, mhocko, akpm

On Mon, 2 Jul 2012, Gavin Shan wrote:

> Function __section_nr() was implemented to retrieve the corresponding
> memory section number according to its descriptor. It's possible that
> the specified memory section descriptor isn't existing in the global
> array. So here to add more check on that and report error for wrong
> case.
> 
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>

Acked-by: David Rientjes <rientjes@google.com>


* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak
  2012-07-02 11:04   ` David Rientjes
@ 2012-07-02 13:28     ` Gavin Shan
  2012-07-02 21:19       ` David Rientjes
  0 siblings, 1 reply; 13+ messages in thread
From: Gavin Shan @ 2012-07-02 13:28 UTC (permalink / raw)
  To: David Rientjes; +Cc: Gavin Shan, linux-mm, dave, mhocko, akpm

>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index 781fa04..a6984d9 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>>  	return section;
>>  }
>>  
>> +static inline void __meminit sparse_index_free(struct mem_section *section)
>> +{
>> +	unsigned long size = SECTIONS_PER_ROOT *
>> +			     sizeof(struct mem_section);
>> +
>> +	if (!section)
>> +		return;
>> +
>> +	if (slab_is_available())
>> +		kfree(section);
>> +	else
>> +		free_bootmem(virt_to_phys(section), size);
>
>Eek, does that work?
>

David, I think it's working fine. If my understanding is wrong, please
correct me. Thanks a lot :-)

The "section" allocated from the bootmem allocator might take following
function call path. In the function alloc_bootmem_core(), all online nodes
will be checked for the memory allocation. So we could have memory allocated
from different node other than the specified one to alloc_bootmem_node()

alloc_bootmem_node(nid, size)
__alloc_bootmem_node()
___alloc_bootmem_node_nopanic()
alloc_bootmem_core()

On the other hand, free_bootmem() checks which node the memory block
belongs to and then frees it back to that node. That looks reasonable.

Thanks,
Gavin 

>> +}
>> +
>>  static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>  {
>>  	static DEFINE_SPINLOCK(index_init_lock);
>> @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>  	mem_section[root] = section;
>>  out:
>>  	spin_unlock(&index_init_lock);
>> +	if (ret)
>> +		sparse_index_free(section);
>> +
>>  	return ret;
>>  }
>>  #else /* !SPARSEMEM_EXTREME */
>


* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak
  2012-07-02  9:43   ` Michal Hocko
@ 2012-07-02 13:40     ` Gavin Shan
  2012-07-02 15:46       ` Michal Hocko
  0 siblings, 1 reply; 13+ messages in thread
From: Gavin Shan @ 2012-07-02 13:40 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, dave, rientjes, akpm

On Mon, Jul 02, 2012 at 11:43:31AM +0200, Michal Hocko wrote:
>On Mon 02-07-12 17:28:56, Gavin Shan wrote:
>> sparse_index_init() is designed to be safe if two copies of it race.  It
>> uses "index_init_lock" to ensure that, even in the case of a race, only
>> one CPU will manage to do:
>> 
>> 	mem_section[root] = section;
>> 
>> However, in the case where two copies of sparse_index_init() _do_ race,
>> the one that loses the race will leak the "section" that
>> sparse_index_alloc() allocated for it.  This patch fixes that leak.
>
>I would still like to hear how we can possibly race in this code path.
>I've thought that memory onlining is done from a single CPU.
>

Hi Michal, how about using the following changelog? :-)

-----

sparse_index_init() is designed to be safe if two copies of it race.  It
uses "index_init_lock" to ensure that, even in the case of a race, only
one CPU will manage to do:

mem_section[root] = section;

However, in the case where two copies of sparse_index_init() _do_ race,
which probably happens when multiple memory sections that fall into the
same mem_section[] root entry are onlined simultaneously from different
CPUs, the one that loses the race will leak the "section" that
sparse_index_alloc() allocated for it. This patch fixes that leak.

-----

Thanks,
Gavin

>> 
>> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
>> ---
>>  mm/sparse.c |   17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>> 
>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index 781fa04..a6984d9 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>>  	return section;
>>  }
>>  
>> +static inline void __meminit sparse_index_free(struct mem_section *section)
>> +{
>> +	unsigned long size = SECTIONS_PER_ROOT *
>> +			     sizeof(struct mem_section);
>> +
>> +	if (!section)
>> +		return;
>> +
>> +	if (slab_is_available())
>> +		kfree(section);
>> +	else
>> +		free_bootmem(virt_to_phys(section), size);
>> +}
>> +
>>  static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>  {
>>  	static DEFINE_SPINLOCK(index_init_lock);
>> @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>  	mem_section[root] = section;
>>  out:
>>  	spin_unlock(&index_init_lock);
>> +	if (ret)
>> +		sparse_index_free(section);
>> +
>>  	return ret;
>>  }
>>  #else /* !SPARSEMEM_EXTREME */
>> -- 
>> 1.7.9.5
>> 


* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak
  2012-07-02 13:40     ` Gavin Shan
@ 2012-07-02 15:46       ` Michal Hocko
  2012-07-03  3:38         ` Gavin Shan
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2012-07-02 15:46 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, dave, rientjes, akpm

On Mon 02-07-12 21:40:53, Gavin Shan wrote:
> On Mon, Jul 02, 2012 at 11:43:31AM +0200, Michal Hocko wrote:
> >On Mon 02-07-12 17:28:56, Gavin Shan wrote:
> >> sparse_index_init() is designed to be safe if two copies of it race.  It
> >> uses "index_init_lock" to ensure that, even in the case of a race, only
> >> one CPU will manage to do:
> >> 
> >> 	mem_section[root] = section;
> >> 
> >> However, in the case where two copies of sparse_index_init() _do_ race,
> >> the one that loses the race will leak the "section" that
> >> sparse_index_alloc() allocated for it.  This patch fixes that leak.
> >
> >I would still like to hear how we can possibly race in this code path.
> >I've thought that memory onlining is done from a single CPU.
> >
> 
> Hi Michael, how about to use the following changelog? :-)
> 
> -----
> 
> sparse_index_init() is designed to be safe if two copies of it race.  It
> uses "index_init_lock" to ensure that, even in the case of a race, only
> one CPU will manage to do:
> 
> mem_section[root] = section;
> 
> However, in the case where two copies of sparse_index_init() _do_ race,
> which is probablly caused by making online for multiple memory sections
> that depend on same entry of array mem_section[] simultaneously from
> different CPUs. 

And you really think that this clarifies things? You have just
tweaked the comment to sound more obscure.

OK, so you have pushed me into the code...
If you had looked into the hotplug call chain up to add_memory, you would
have seen that the whole arch_add_memory -> __add_pages -> ... ->
sparse_index_init path is called with lock_memory_hotplug held, so the
hotplug cannot run from multiple CPUs.

I do not see any other users apart from boot-time
sparse_memory_present_with_active_regions and add_memory, so I think the
lock is just a leftover from the old days.

So please make sure you are fixing a real issue rather than adding code
which simply never gets executed.

And no, obscuring the changelog doesn't help anybody.

> The one that loses the race will leak the "section" that
> sparse_index_alloc() allocated for it. This patch fixes that leak.

> 
> -----
> 
> Thanks,
> Gavin
> 
> >> 
> >> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> >> ---
> >>  mm/sparse.c |   17 +++++++++++++++++
> >>  1 file changed, 17 insertions(+)
> >> 
> >> diff --git a/mm/sparse.c b/mm/sparse.c
> >> index 781fa04..a6984d9 100644
> >> --- a/mm/sparse.c
> >> +++ b/mm/sparse.c
> >> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
> >>  	return section;
> >>  }
> >>  
> >> +static inline void __meminit sparse_index_free(struct mem_section *section)
> >> +{
> >> +	unsigned long size = SECTIONS_PER_ROOT *
> >> +			     sizeof(struct mem_section);
> >> +
> >> +	if (!section)
> >> +		return;
> >> +
> >> +	if (slab_is_available())
> >> +		kfree(section);
> >> +	else
> >> +		free_bootmem(virt_to_phys(section), size);
> >> +}
> >> +
> >>  static int __meminit sparse_index_init(unsigned long section_nr, int nid)
> >>  {
> >>  	static DEFINE_SPINLOCK(index_init_lock);
> >> @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
> >>  	mem_section[root] = section;
> >>  out:
> >>  	spin_unlock(&index_init_lock);
> >> +	if (ret)
> >> +		sparse_index_free(section);
> >> +
> >>  	return ret;
> >>  }
> >>  #else /* !SPARSEMEM_EXTREME */
> >> -- 
> >> 1.7.9.5
> >> 

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic


* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak
  2012-07-02 13:28     ` Gavin Shan
@ 2012-07-02 21:19       ` David Rientjes
  2012-07-03  1:19         ` Gavin Shan
  0 siblings, 1 reply; 13+ messages in thread
From: David Rientjes @ 2012-07-02 21:19 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, dave, mhocko, akpm

On Mon, 2 Jul 2012, Gavin Shan wrote:

> >> diff --git a/mm/sparse.c b/mm/sparse.c
> >> index 781fa04..a6984d9 100644
> >> --- a/mm/sparse.c
> >> +++ b/mm/sparse.c
> >> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
> >>  	return section;
> >>  }
> >>  
> >> +static inline void __meminit sparse_index_free(struct mem_section *section)
> >> +{
> >> +	unsigned long size = SECTIONS_PER_ROOT *
> >> +			     sizeof(struct mem_section);
> >> +
> >> +	if (!section)
> >> +		return;
> >> +
> >> +	if (slab_is_available())
> >> +		kfree(section);
> >> +	else
> >> +		free_bootmem(virt_to_phys(section), size);
> >
> >Eek, does that work?
> >
> 
> David, I think it's working fine. If my understanding is wrong, please
> correct me. Thanks a lot :-)
> 

I'm thinking it should be free_bootmem(__pa(section), size);
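
For reference, the helper with that change would look something like the
sketch below. On most architectures __pa() and virt_to_phys() agree for
directly mapped kernel memory, but __pa() is the form normally used in
mm code:

	static inline void __meminit sparse_index_free(struct mem_section *section)
	{
		unsigned long size = SECTIONS_PER_ROOT *
				     sizeof(struct mem_section);

		if (!section)
			return;

		if (slab_is_available())
			kfree(section);
		else
			/* __pa() rather than virt_to_phys() */
			free_bootmem(__pa(section), size);
	}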


* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak
  2012-07-02 21:19       ` David Rientjes
@ 2012-07-03  1:19         ` Gavin Shan
  0 siblings, 0 replies; 13+ messages in thread
From: Gavin Shan @ 2012-07-03  1:19 UTC (permalink / raw)
  To: David Rientjes; +Cc: Gavin Shan, linux-mm, dave, mhocko, akpm

On Mon, Jul 02, 2012 at 02:19:39PM -0700, David Rientjes wrote:
>On Mon, 2 Jul 2012, Gavin Shan wrote:
>
>> >> diff --git a/mm/sparse.c b/mm/sparse.c
>> >> index 781fa04..a6984d9 100644
>> >> --- a/mm/sparse.c
>> >> +++ b/mm/sparse.c
>> >> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>> >>  	return section;
>> >>  }
>> >>  
>> >> +static inline void __meminit sparse_index_free(struct mem_section *section)
>> >> +{
>> >> +	unsigned long size = SECTIONS_PER_ROOT *
>> >> +			     sizeof(struct mem_section);
>> >> +
>> >> +	if (!section)
>> >> +		return;
>> >> +
>> >> +	if (slab_is_available())
>> >> +		kfree(section);
>> >> +	else
>> >> +		free_bootmem(virt_to_phys(section), size);
>> >
>> >Eek, does that work?
>> >
>> 
>> David, I think it's working fine. If my understanding is wrong, please
>> correct me. Thanks a lot :-)
>> 
>
>I'm thinking it should be free_bootmem(__pa(section), size);
>

Thanks for pointing it out, David.

Thanks,
Gavin


* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak
  2012-07-02 15:46       ` Michal Hocko
@ 2012-07-03  3:38         ` Gavin Shan
  2012-07-03 12:51           ` Michal Hocko
  0 siblings, 1 reply; 13+ messages in thread
From: Gavin Shan @ 2012-07-03  3:38 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, dave, rientjes, akpm

On Mon, Jul 02, 2012 at 05:46:28PM +0200, Michal Hocko wrote:
>On Mon 02-07-12 21:40:53, Gavin Shan wrote:
>> On Mon, Jul 02, 2012 at 11:43:31AM +0200, Michal Hocko wrote:
>> >On Mon 02-07-12 17:28:56, Gavin Shan wrote:
>> >> sparse_index_init() is designed to be safe if two copies of it race.  It
>> >> uses "index_init_lock" to ensure that, even in the case of a race, only
>> >> one CPU will manage to do:
>> >> 
>> >> 	mem_section[root] = section;
>> >> 
>> >> However, in the case where two copies of sparse_index_init() _do_ race,
>> >> the one that loses the race will leak the "section" that
>> >> sparse_index_alloc() allocated for it.  This patch fixes that leak.
>> >
>> >I would still like to hear how we can possibly race in this code path.
>> >I've thought that memory onlining is done from a single CPU.
>> >
>> 
>> Hi Michael, how about to use the following changelog? :-)
>> 
>> -----
>> 
>> sparse_index_init() is designed to be safe if two copies of it race.  It
>> uses "index_init_lock" to ensure that, even in the case of a race, only
>> one CPU will manage to do:
>> 
>> mem_section[root] = section;
>> 
>> However, in the case where two copies of sparse_index_init() _do_ race,
>> which is probablly caused by making online for multiple memory sections
>> that depend on same entry of array mem_section[] simultaneously from
>> different CPUs. 
>
>And you really think that this clarified the things? You have just
>tweaked the comment to sound more obscure.
>
>OK, so you have pushed me into the code...
>If you had looked into the hotplug callchain up to add_memory you would
>have seen that the whole arch_add_memory -> __add_pages -> ... ->
>sparse_index_init is called with lock_memory_hotplug held so the hotplug
>cannot run from the multiple CPUs.
>
>I do not see any other users apart from  boot time
>sparse_memory_present_with_active_regions and add_memory so I think that
>the lock is just a heritage from old days.
>

I just had a quick go-through of the source code as you suggested, and I
think you're right, Michal. So please drop this :-)

With CONFIG_ARCH_MEMORY_PROBE enabled on Power machines, the following
functions are included in the hotplug path.

memory_probe_store
add_memory
	lock_memory_hotplug	/* protect the whole hotplug path */
arch_add_memory
__add_pages
__add_section
sparse_add_one_section
sparse_index_init
sparse_index_alloc

The mutex "mem_hotplug_mutex" will be hold by lock_memory_hotplug() to protect
the whole hotplug path. However, I'm wandering if we can remove the "index_init_lock"
of function sparse_index_init() since that sounds duplicate lock.

static int __meminit sparse_index_init(unsigned long section_nr, int nid)
{
        static DEFINE_SPINLOCK(index_init_lock);
        unsigned long root = SECTION_NR_TO_ROOT(section_nr);
        struct mem_section *section;
        int ret = 0;

        if (mem_section[root])
                return -EEXIST;

        section = sparse_index_alloc(nid);
        if (!section)
                return -ENOMEM;
        /*
         * This lock keeps two different sections from
         * reallocating for the same index
         */
        spin_lock(&index_init_lock);

        if (mem_section[root]) {
                ret = -EEXIST;
                goto out;
        }

        mem_section[root] = section;
out:
        spin_unlock(&index_init_lock);
        if (ret)
                sparse_index_free(section);

        return ret;
}
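
If the spinlock really is redundant under mem_hotplug_mutex, the function
might reduce to something like the following sketch (untested, shown only
to illustrate the idea, not an actual patch):

static int __meminit sparse_index_init(unsigned long section_nr, int nid)
{
        unsigned long root = SECTION_NR_TO_ROOT(section_nr);
        struct mem_section *section;

        /* Serialized by mem_hotplug_mutex (and single-threaded at boot) */
        if (mem_section[root])
                return -EEXIST;

        section = sparse_index_alloc(nid);
        if (!section)
                return -ENOMEM;

        mem_section[root] = section;

        return 0;
}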

Thanks,
Gavin




>So please make sure you are fixing a real issue rather than add another
>code which simply never gets executed.
>
>And no obscuring the changelog doesn't help anybody.
>
>> The one that loses the race will leak the "section" that
>> sparse_index_alloc() allocated for it. This patch fixes that leak.
>
>> 
>> -----
>> 
>> Thanks,
>> Gavin
>> 
>> >> 
>> >> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
>> >> ---
>> >>  mm/sparse.c |   17 +++++++++++++++++
>> >>  1 file changed, 17 insertions(+)
>> >> 
>> >> diff --git a/mm/sparse.c b/mm/sparse.c
>> >> index 781fa04..a6984d9 100644
>> >> --- a/mm/sparse.c
>> >> +++ b/mm/sparse.c
>> >> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>> >>  	return section;
>> >>  }
>> >>  
>> >> +static inline void __meminit sparse_index_free(struct mem_section *section)
>> >> +{
>> >> +	unsigned long size = SECTIONS_PER_ROOT *
>> >> +			     sizeof(struct mem_section);
>> >> +
>> >> +	if (!section)
>> >> +		return;
>> >> +
>> >> +	if (slab_is_available())
>> >> +		kfree(section);
>> >> +	else
>> >> +		free_bootmem(virt_to_phys(section), size);
>> >> +}
>> >> +
>> >>  static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>> >>  {
>> >>  	static DEFINE_SPINLOCK(index_init_lock);
>> >> @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>> >>  	mem_section[root] = section;
>> >>  out:
>> >>  	spin_unlock(&index_init_lock);
>> >> +	if (ret)
>> >> +		sparse_index_free(section);
>> >> +
>> >>  	return ret;
>> >>  }
>> >>  #else /* !SPARSEMEM_EXTREME */
>> >> -- 
>> >> 1.7.9.5
>> >> 


* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak
  2012-07-03  3:38         ` Gavin Shan
@ 2012-07-03 12:51           ` Michal Hocko
  0 siblings, 0 replies; 13+ messages in thread
From: Michal Hocko @ 2012-07-03 12:51 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, dave, rientjes, akpm

On Tue 03-07-12 11:38:23, Gavin Shan wrote:
> On Mon, Jul 02, 2012 at 05:46:28PM +0200, Michal Hocko wrote:
> >On Mon 02-07-12 21:40:53, Gavin Shan wrote:
> >> On Mon, Jul 02, 2012 at 11:43:31AM +0200, Michal Hocko wrote:
> >> >On Mon 02-07-12 17:28:56, Gavin Shan wrote:
> >> >> sparse_index_init() is designed to be safe if two copies of it race.  It
> >> >> uses "index_init_lock" to ensure that, even in the case of a race, only
> >> >> one CPU will manage to do:
> >> >> 
> >> >> 	mem_section[root] = section;
> >> >> 
> >> >> However, in the case where two copies of sparse_index_init() _do_ race,
> >> >> the one that loses the race will leak the "section" that
> >> >> sparse_index_alloc() allocated for it.  This patch fixes that leak.
> >> >
> >> >I would still like to hear how we can possibly race in this code path.
> >> >I've thought that memory onlining is done from a single CPU.
> >> >
> >> 
> >> Hi Michael, how about to use the following changelog? :-)
> >> 
> >> -----
> >> 
> >> sparse_index_init() is designed to be safe if two copies of it race.  It
> >> uses "index_init_lock" to ensure that, even in the case of a race, only
> >> one CPU will manage to do:
> >> 
> >> mem_section[root] = section;
> >> 
> >> However, in the case where two copies of sparse_index_init() _do_ race,
> >> which is probablly caused by making online for multiple memory sections
> >> that depend on same entry of array mem_section[] simultaneously from
> >> different CPUs. 
> >
> >And you really think that this clarified the things? You have just
> >tweaked the comment to sound more obscure.
> >
> >OK, so you have pushed me into the code...
> >If you had looked into the hotplug callchain up to add_memory you would
> >have seen that the whole arch_add_memory -> __add_pages -> ... ->
> >sparse_index_init is called with lock_memory_hotplug held so the hotplug
> >cannot run from the multiple CPUs.
> >
> >I do not see any other users apart from  boot time
> >sparse_memory_present_with_active_regions and add_memory so I think that
> >the lock is just a heritage from old days.
> >
> 
> I just had quick go-through on the source code as you suggested and I
> think you're right, Michal. So please drop this :-)
> 
> With CONFIG_ARCH_MEMORY_PROBE enabled on Power machines, following
> functions would be included in hotplug path.

I am not sure why you are mentioning the Power arch here; add_memory,
which does the locking, is arch-independent.

> 
> memory_probe_store
> add_memory
> 	lock_memory_hotplug	/* protect the whole hotplug path */
> arch_add_memory
> __add_pages
> __add_section
> sparse_add_one_section
> sparse_index_init
> sparse_index_alloc
> 
> The mutex "mem_hotplug_mutex" will be hold by lock_memory_hotplug() to protect
> the whole hotplug path. 

> However, I'm wandering if we can remove the "index_init_lock" of
> function sparse_index_init() since that sounds duplicate lock.

Heh, that's what I have been asking from the very beginning... I do not
see any purpose for the lock, but I might be missing something. So make
sure you really understand the locking of this code if you are going to
send a patch to remove the lock.
-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic


Thread overview: 13+ messages
2012-07-02  9:28 [PATCH v3 1/3] mm/sparse: optimize sparse_index_alloc Gavin Shan
2012-07-02  9:28 ` [PATCH v3 2/3] mm/sparse: fix possible memory leak Gavin Shan
2012-07-02  9:43   ` Michal Hocko
2012-07-02 13:40     ` Gavin Shan
2012-07-02 15:46       ` Michal Hocko
2012-07-03  3:38         ` Gavin Shan
2012-07-03 12:51           ` Michal Hocko
2012-07-02 11:04   ` David Rientjes
2012-07-02 13:28     ` Gavin Shan
2012-07-02 21:19       ` David Rientjes
2012-07-03  1:19         ` Gavin Shan
2012-07-02  9:28 ` [PATCH v3 3/3] mm/sparse: more check on mem_section number Gavin Shan
2012-07-02 11:05   ` David Rientjes
