All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/5] mm/sparse: check size of struct mm_section
@ 2012-06-23 15:52 Gavin Shan
  2012-06-23 15:52 ` [PATCH 2/5] mm/sparse: optimize sparse_index_alloc Gavin Shan
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Gavin Shan @ 2012-06-23 15:52 UTC (permalink / raw)
  To: linux-mm; +Cc: rientjes, hannes, akpm, Gavin Shan

Platforms like PPC might need two level mem_section for SPARSEMEM
with enabled CONFIG_SPARSEMEM_EXTREME. On the other hand, the
memory section descriptor might be allocated from bootmem allocator
with PAGE_SIZE alignment. In order to fully utilize the memory chunk
allocated from bootmem allocator, it'd better to assure memory
sector descriptor won't run across the boundary (PAGE_SIZE).

The patch introduces the check on size of "struct mm_section" to
assure that.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 mm/sparse.c |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/sparse.c b/mm/sparse.c
index 6a4bf91..afd0998 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -63,6 +63,15 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
 	unsigned long array_size = SECTIONS_PER_ROOT *
 				   sizeof(struct mem_section);
 
+	/*
+	 * The root memory section descriptor might be allocated
+	 * from bootmem, which has minimal memory chunk requirement
+	 * of page. In order to fully utilize the memory, the sparse
+	 * memory section descriptor shouldn't run across the boundary
+	 * that bootmem allocator has.
+	 */
+	BUILD_BUG_ON(PAGE_SIZE % sizeof(struct mem_section));
+
 	if (slab_is_available()) {
 		if (node_state(nid, N_HIGH_MEMORY))
 			section = kmalloc_node(array_size, GFP_KERNEL, nid);
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/5] mm/sparse: optimize sparse_index_alloc
  2012-06-23 15:52 [PATCH 1/5] mm/sparse: check size of struct mm_section Gavin Shan
@ 2012-06-23 15:52 ` Gavin Shan
  2012-06-25 15:30   ` Michal Hocko
  2012-06-23 15:52 ` [PATCH 3/5] mm/sparse: fix possible memory leak Gavin Shan
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Gavin Shan @ 2012-06-23 15:52 UTC (permalink / raw)
  To: linux-mm; +Cc: rientjes, hannes, akpm, Gavin Shan

With CONFIG_SPARSEMEM_EXTREME, the two level of memory section
descriptors are allocated from slab or bootmem. When allocating
from slab, let slab allocator to clear the memory chunk. However,
the memory chunk from bootmem allocator, we have to clear that
explicitly.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 mm/sparse.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/sparse.c b/mm/sparse.c
index afd0998..ce50c8b 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -74,14 +74,14 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
 
 	if (slab_is_available()) {
 		if (node_state(nid, N_HIGH_MEMORY))
-			section = kmalloc_node(array_size, GFP_KERNEL, nid);
+			section = kzalloc_node(array_size, GFP_KERNEL, nid);
 		else
-			section = kmalloc(array_size, GFP_KERNEL);
-	} else
+			section = kzalloc(array_size, GFP_KERNEL);
+	} else {
 		section = alloc_bootmem_node(NODE_DATA(nid), array_size);
-
-	if (section)
-		memset(section, 0, array_size);
+		if (section)
+			memset(section, 0, array_size);
+	}
 
 	return section;
 }
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/5] mm/sparse: fix possible memory leak
  2012-06-23 15:52 [PATCH 1/5] mm/sparse: check size of struct mm_section Gavin Shan
  2012-06-23 15:52 ` [PATCH 2/5] mm/sparse: optimize sparse_index_alloc Gavin Shan
@ 2012-06-23 15:52 ` Gavin Shan
  2012-06-25 15:48   ` Michal Hocko
  2012-06-23 15:52 ` [PATCH 4/5] mm/sparse: more check on mem_section number Gavin Shan
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Gavin Shan @ 2012-06-23 15:52 UTC (permalink / raw)
  To: linux-mm; +Cc: rientjes, hannes, akpm, Gavin Shan

With CONFIG_SPARSEMEM_EXTREME, the root memory section descriptors
are allocated by slab or bootmem allocator. Also, the descriptors
might have been allocated and initialized by others. However, the
memory chunk allocated in current implementation wouldn't be put
into the available pool if others have allocated memory chunk for
that.

The patch introduces addtional function sparse_index_free() to
deallocate the memory chunk if the root memory section descriptor
has been initialized by others.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 mm/sparse.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/mm/sparse.c b/mm/sparse.c
index ce50c8b..bae8f2d 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -86,6 +86,22 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
 	return section;
 }
 
+static void noinline __init_refok sparse_index_free(struct mem_section *section,
+						    int nid)
+{
+	unsigned long size = SECTIONS_PER_ROOT *
+			     sizeof(struct mem_section);
+
+	if (!section)
+		return;
+
+	if (slab_is_available())
+		kfree(section);
+	else
+		free_bootmem_node(NODE_DATA(nid),
+			virt_to_phys(section), size);
+}
+
 static int __meminit sparse_index_init(unsigned long section_nr, int nid)
 {
 	static DEFINE_SPINLOCK(index_init_lock);
@@ -113,6 +129,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
 	mem_section[root] = section;
 out:
 	spin_unlock(&index_init_lock);
+	if (ret == -EEXIST)
+		sparse_index_free(section, nid);
+
 	return ret;
 }
 #else /* !SPARSEMEM_EXTREME */
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 4/5] mm/sparse: more check on mem_section number
  2012-06-23 15:52 [PATCH 1/5] mm/sparse: check size of struct mm_section Gavin Shan
  2012-06-23 15:52 ` [PATCH 2/5] mm/sparse: optimize sparse_index_alloc Gavin Shan
  2012-06-23 15:52 ` [PATCH 3/5] mm/sparse: fix possible memory leak Gavin Shan
@ 2012-06-23 15:52 ` Gavin Shan
  2012-06-26 16:15   ` Dave Hansen
  2012-06-23 15:52 ` [PATCH 5/5] mm/sparse: return 0 if root mem_section exists Gavin Shan
  2012-06-25 16:03 ` [PATCH 1/5] mm/sparse: check size of struct mm_section Michal Hocko
  4 siblings, 1 reply; 26+ messages in thread
From: Gavin Shan @ 2012-06-23 15:52 UTC (permalink / raw)
  To: linux-mm; +Cc: rientjes, hannes, akpm, Gavin Shan

Function __section_nr() was implemented to retrieve the corresponding
memory section number according to its descriptor. It's possible that
the specified memory section descriptor isn't existing in the global
array. So here to add more check on that and report error for wrong
case.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 mm/sparse.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/sparse.c b/mm/sparse.c
index bae8f2d..a8b99d3 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -160,6 +160,8 @@ int __section_nr(struct mem_section* ms)
 		     break;
 	}
 
+	VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
+
 	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
 }
 
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 5/5] mm/sparse: return 0 if root mem_section exists
  2012-06-23 15:52 [PATCH 1/5] mm/sparse: check size of struct mm_section Gavin Shan
                   ` (2 preceding siblings ...)
  2012-06-23 15:52 ` [PATCH 4/5] mm/sparse: more check on mem_section number Gavin Shan
@ 2012-06-23 15:52 ` Gavin Shan
  2012-06-24 19:18   ` KOSAKI Motohiro
  2012-06-25 15:39   ` Michal Hocko
  2012-06-25 16:03 ` [PATCH 1/5] mm/sparse: check size of struct mm_section Michal Hocko
  4 siblings, 2 replies; 26+ messages in thread
From: Gavin Shan @ 2012-06-23 15:52 UTC (permalink / raw)
  To: linux-mm; +Cc: rientjes, hannes, akpm, Gavin Shan

Function sparse_index_init() is used to setup memory section descriptors
dynamically. zero should be returned while mem_section[root] already has
been allocated.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 mm/sparse.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/sparse.c b/mm/sparse.c
index a8b99d3..e845a48 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -109,8 +109,12 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
 	struct mem_section *section;
 	int ret = 0;
 
+	/*
+	 * If the corresponding mem_section descriptor
+	 * has been created, we needn't bother
+	 */
 	if (mem_section[root])
-		return -EEXIST;
+		return ret;
 
 	section = sparse_index_alloc(nid);
 	if (!section)
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] mm/sparse: return 0 if root mem_section exists
  2012-06-23 15:52 ` [PATCH 5/5] mm/sparse: return 0 if root mem_section exists Gavin Shan
@ 2012-06-24 19:18   ` KOSAKI Motohiro
  2012-06-25  1:09     ` Gavin Shan
  2012-06-25 15:39   ` Michal Hocko
  1 sibling, 1 reply; 26+ messages in thread
From: KOSAKI Motohiro @ 2012-06-24 19:18 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, rientjes, hannes, akpm

On Sat, Jun 23, 2012 at 11:52 AM, Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
> Function sparse_index_init() is used to setup memory section descriptors
> dynamically. zero should be returned while mem_section[root] already has
> been allocated.

Why?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] mm/sparse: return 0 if root mem_section exists
  2012-06-24 19:18   ` KOSAKI Motohiro
@ 2012-06-25  1:09     ` Gavin Shan
  0 siblings, 0 replies; 26+ messages in thread
From: Gavin Shan @ 2012-06-25  1:09 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Gavin Shan, linux-mm, rientjes, hannes, akpm

>> Function sparse_index_init() is used to setup memory section descriptors
>> dynamically. zero should be returned while mem_section[root] already has
>> been allocated.
>
>Why?
>

When CONFIG_SPARSEMEM_EXTREME is enabled, the memory section descriptors are
allocated dynamically and stored into "struct mem_section *mem_section[NR_SECTION_ROOTS]".

It's possible for multiple sections (e.g. 0, 1) sharing "mem_section[0]". When setup
the descriptor for section 0, the mem_section descriptor for section 1 should have
been created as well. So we needn't do same thing (actually duplicate) for section 1.

And the function returns "-EEXIST" in sparse_index_init() for section 1, which indicates
errors. Actually, here we need "0".

Does it make sense?

Thanks,
Gavin

>--
>To unsubscribe, send a message with 'unsubscribe linux-mm' in
>the body to majordomo@kvack.org.  For more info on Linux MM,
>see: http://www.linux-mm.org/ .
>Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] mm/sparse: optimize sparse_index_alloc
  2012-06-23 15:52 ` [PATCH 2/5] mm/sparse: optimize sparse_index_alloc Gavin Shan
@ 2012-06-25 15:30   ` Michal Hocko
  2012-06-26  6:07     ` Gavin Shan
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2012-06-25 15:30 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, rientjes, hannes, akpm

On Sat 23-06-12 23:52:53, Gavin Shan wrote:
> With CONFIG_SPARSEMEM_EXTREME, the two level of memory section
> descriptors are allocated from slab or bootmem. When allocating
> from slab, let slab allocator to clear the memory chunk. However,
> the memory chunk from bootmem allocator, we have to clear that
> explicitly.

I am sorry but I do not see how this optimize the current code. What is
the difference between slab doing memset and doing it explicitly for all
cases?

> 
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> ---
>  mm/sparse.c |   12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index afd0998..ce50c8b 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -74,14 +74,14 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>  
>  	if (slab_is_available()) {
>  		if (node_state(nid, N_HIGH_MEMORY))
> -			section = kmalloc_node(array_size, GFP_KERNEL, nid);
> +			section = kzalloc_node(array_size, GFP_KERNEL, nid);
>  		else
> -			section = kmalloc(array_size, GFP_KERNEL);
> -	} else
> +			section = kzalloc(array_size, GFP_KERNEL);
> +	} else {
>  		section = alloc_bootmem_node(NODE_DATA(nid), array_size);
> -
> -	if (section)
> -		memset(section, 0, array_size);
> +		if (section)
> +			memset(section, 0, array_size);
> +	}
>  
>  	return section;
>  }
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] mm/sparse: return 0 if root mem_section exists
  2012-06-23 15:52 ` [PATCH 5/5] mm/sparse: return 0 if root mem_section exists Gavin Shan
  2012-06-24 19:18   ` KOSAKI Motohiro
@ 2012-06-25 15:39   ` Michal Hocko
  1 sibling, 0 replies; 26+ messages in thread
From: Michal Hocko @ 2012-06-25 15:39 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, rientjes, hannes, akpm

On Sat 23-06-12 23:52:56, Gavin Shan wrote:
> Function sparse_index_init() is used to setup memory section descriptors
> dynamically. zero should be returned while mem_section[root] already has
> been allocated.

Doesn't this break sparse_add_one_section which expects EEXIST?

> 
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> ---
>  mm/sparse.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index a8b99d3..e845a48 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -109,8 +109,12 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>  	struct mem_section *section;
>  	int ret = 0;
>  
> +	/*
> +	 * If the corresponding mem_section descriptor
> +	 * has been created, we needn't bother
> +	 */
>  	if (mem_section[root])
> -		return -EEXIST;
> +		return ret;
>  
>  	section = sparse_index_alloc(nid);
>  	if (!section)
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] mm/sparse: fix possible memory leak
  2012-06-23 15:52 ` [PATCH 3/5] mm/sparse: fix possible memory leak Gavin Shan
@ 2012-06-25 15:48   ` Michal Hocko
  2012-06-26  6:11     ` Gavin Shan
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2012-06-25 15:48 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, rientjes, hannes, akpm

On Sat 23-06-12 23:52:54, Gavin Shan wrote:
> With CONFIG_SPARSEMEM_EXTREME, the root memory section descriptors
> are allocated by slab or bootmem allocator. Also, the descriptors
> might have been allocated and initialized by others. However, the
> memory chunk allocated in current implementation wouldn't be put
> into the available pool if others have allocated memory chunk for
> that.

Who is others? I assume that we can race in hotplug because other than
that this is an early initialization code. How can others race?

> The patch introduces addtional function sparse_index_free() to
> deallocate the memory chunk if the root memory section descriptor
> has been initialized by others.

The fix itself looks correct but I do not see how this happens...

> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> ---
>  mm/sparse.c |   19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index ce50c8b..bae8f2d 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -86,6 +86,22 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>  	return section;
>  }
>  
> +static void noinline __init_refok sparse_index_free(struct mem_section *section,
> +						    int nid)
> +{
> +	unsigned long size = SECTIONS_PER_ROOT *
> +			     sizeof(struct mem_section);
> +
> +	if (!section)
> +		return;
> +
> +	if (slab_is_available())
> +		kfree(section);
> +	else
> +		free_bootmem_node(NODE_DATA(nid),
> +			virt_to_phys(section), size);
> +}
> +
>  static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>  {
>  	static DEFINE_SPINLOCK(index_init_lock);
> @@ -113,6 +129,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>  	mem_section[root] = section;
>  out:
>  	spin_unlock(&index_init_lock);
> +	if (ret == -EEXIST)
> +		sparse_index_free(section, nid);

Maybe a generic if (ret) would be more appropriate.

> +
>  	return ret;
>  }
>  #else /* !SPARSEMEM_EXTREME */
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] mm/sparse: check size of struct mm_section
  2012-06-23 15:52 [PATCH 1/5] mm/sparse: check size of struct mm_section Gavin Shan
                   ` (3 preceding siblings ...)
  2012-06-23 15:52 ` [PATCH 5/5] mm/sparse: return 0 if root mem_section exists Gavin Shan
@ 2012-06-25 16:03 ` Michal Hocko
  2012-06-25 16:35   ` Gavin Shan
  4 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2012-06-25 16:03 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, rientjes, hannes, akpm

On Sat 23-06-12 23:52:52, Gavin Shan wrote:
> Platforms like PPC might need two level mem_section for SPARSEMEM
> with enabled CONFIG_SPARSEMEM_EXTREME. On the other hand, the
> memory section descriptor might be allocated from bootmem allocator
> with PAGE_SIZE alignment. In order to fully utilize the memory chunk
> allocated from bootmem allocator, it'd better to assure memory
> sector descriptor won't run across the boundary (PAGE_SIZE).

Why? The memory is continuous, right?

> 
> The patch introduces the check on size of "struct mm_section" to
> assure that.
> 
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> ---
>  mm/sparse.c |    9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 6a4bf91..afd0998 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -63,6 +63,15 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>  	unsigned long array_size = SECTIONS_PER_ROOT *
>  				   sizeof(struct mem_section);
>  
> +	/*
> +	 * The root memory section descriptor might be allocated
> +	 * from bootmem, which has minimal memory chunk requirement
> +	 * of page. In order to fully utilize the memory, the sparse
> +	 * memory section descriptor shouldn't run across the boundary
> +	 * that bootmem allocator has.
> +	 */
> +	BUILD_BUG_ON(PAGE_SIZE % sizeof(struct mem_section));
> +
>  	if (slab_is_available()) {
>  		if (node_state(nid, N_HIGH_MEMORY))
>  			section = kmalloc_node(array_size, GFP_KERNEL, nid);
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] mm/sparse: check size of struct mm_section
  2012-06-25 16:03 ` [PATCH 1/5] mm/sparse: check size of struct mm_section Michal Hocko
@ 2012-06-25 16:35   ` Gavin Shan
  2012-06-26  7:39     ` Michal Hocko
  0 siblings, 1 reply; 26+ messages in thread
From: Gavin Shan @ 2012-06-25 16:35 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, rientjes, hannes, akpm

>> Platforms like PPC might need two level mem_section for SPARSEMEM
>> with enabled CONFIG_SPARSEMEM_EXTREME. On the other hand, the
>> memory section descriptor might be allocated from bootmem allocator
>> with PAGE_SIZE alignment. In order to fully utilize the memory chunk
>> allocated from bootmem allocator, it'd better to assure memory
>> sector descriptor won't run across the boundary (PAGE_SIZE).
>
>Why? The memory is continuous, right?
>

Yes, the memory is conginous and the capacity of specific entry
in mem_section[NR_SECTION_ROOTS] has been defined as follows:


#define SECTIONS_PER_ROOT       (PAGE_SIZE / sizeof (struct mem_section))

Also, the memory is prone to be allocated from bootmem by function
alloc_bootmem_node(), which has PAGE_SIZE alignment. So I think it's
reasonable to introduce the extra check here from my personal view :-)

Thanks,
Gavin

>> 
>> The patch introduces the check on size of "struct mm_section" to
>> assure that.
>> 
>> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
>> ---
>>  mm/sparse.c |    9 +++++++++
>>  1 file changed, 9 insertions(+)
>> 
>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index 6a4bf91..afd0998 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -63,6 +63,15 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>>  	unsigned long array_size = SECTIONS_PER_ROOT *
>>  				   sizeof(struct mem_section);
>>  
>> +	/*
>> +	 * The root memory section descriptor might be allocated
>> +	 * from bootmem, which has minimal memory chunk requirement
>> +	 * of page. In order to fully utilize the memory, the sparse
>> +	 * memory section descriptor shouldn't run across the boundary
>> +	 * that bootmem allocator has.
>> +	 */
>> +	BUILD_BUG_ON(PAGE_SIZE % sizeof(struct mem_section));
>> +
>>  	if (slab_is_available()) {
>>  		if (node_state(nid, N_HIGH_MEMORY))
>>  			section = kmalloc_node(array_size, GFP_KERNEL, nid);
>> -- 
>> 1.7.9.5
>> 
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>
>-- 
>Michal Hocko
>SUSE Labs
>SUSE LINUX s.r.o.
>Lihovarska 1060/12
>190 00 Praha 9    
>Czech Republic
>
>--
>To unsubscribe, send a message with 'unsubscribe linux-mm' in
>the body to majordomo@kvack.org.  For more info on Linux MM,
>see: http://www.linux-mm.org/ .
>Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] mm/sparse: optimize sparse_index_alloc
  2012-06-25 15:30   ` Michal Hocko
@ 2012-06-26  6:07     ` Gavin Shan
  2012-06-26  7:04       ` Michal Hocko
  0 siblings, 1 reply; 26+ messages in thread
From: Gavin Shan @ 2012-06-26  6:07 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, rientjes, hannes, akpm

>> With CONFIG_SPARSEMEM_EXTREME, the two level of memory section
>> descriptors are allocated from slab or bootmem. When allocating
>> from slab, let slab allocator to clear the memory chunk. However,
>> the memory chunk from bootmem allocator, we have to clear that
>> explicitly.
>
>I am sorry but I do not see how this optimize the current code. What is
>the difference between slab doing memset and doing it explicitly for all
>cases?
>

Yeah, I do agree it won't do much optimization here. However, I'm wandering
if I can remove the whole peice of code doing memset(setion, 0, array_size)
since it seems that alloc_bootmem_node() also clears the allocated memory
chunk :-)

Please correct me if I'm wrong about alloc_bootmem_node() :-)

Thanks,
Gavin

>> 
>> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
>> ---
>>  mm/sparse.c |   12 ++++++------
>>  1 file changed, 6 insertions(+), 6 deletions(-)
>> 
>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index afd0998..ce50c8b 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -74,14 +74,14 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>>  
>>  	if (slab_is_available()) {
>>  		if (node_state(nid, N_HIGH_MEMORY))
>> -			section = kmalloc_node(array_size, GFP_KERNEL, nid);
>> +			section = kzalloc_node(array_size, GFP_KERNEL, nid);
>>  		else
>> -			section = kmalloc(array_size, GFP_KERNEL);
>> -	} else
>> +			section = kzalloc(array_size, GFP_KERNEL);
>> +	} else {
>>  		section = alloc_bootmem_node(NODE_DATA(nid), array_size);
>> -
>> -	if (section)
>> -		memset(section, 0, array_size);
>> +		if (section)
>> +			memset(section, 0, array_size);
>> +	}
>>  
>>  	return section;
>>  }
>> -- 
>> 1.7.9.5
>> 
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>
>-- 
>Michal Hocko
>SUSE Labs
>SUSE LINUX s.r.o.
>Lihovarska 1060/12
>190 00 Praha 9    
>Czech Republic
>
>--
>To unsubscribe, send a message with 'unsubscribe linux-mm' in
>the body to majordomo@kvack.org.  For more info on Linux MM,
>see: http://www.linux-mm.org/ .
>Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] mm/sparse: fix possible memory leak
  2012-06-25 15:48   ` Michal Hocko
@ 2012-06-26  6:11     ` Gavin Shan
  2012-06-26  7:14       ` Michal Hocko
  0 siblings, 1 reply; 26+ messages in thread
From: Gavin Shan @ 2012-06-26  6:11 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, rientjes, hannes, akpm

>> With CONFIG_SPARSEMEM_EXTREME, the root memory section descriptors
>> are allocated by slab or bootmem allocator. Also, the descriptors
>> might have been allocated and initialized by others. However, the
>> memory chunk allocated in current implementation wouldn't be put
>> into the available pool if others have allocated memory chunk for
>> that.
>
>Who is others? I assume that we can race in hotplug because other than
>that this is an early initialization code. How can others race?
>

I'm sorry that I don't have the real bug against the issue. I just
catch it when reading the source code :-)

I do agree with you that the trace is possiblly introduced by the
hotplug case.

>> The patch introduces addtional function sparse_index_free() to
>> deallocate the memory chunk if the root memory section descriptor
>> has been initialized by others.
>
>The fix itself looks correct but I do not see how this happens...
>
>> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
>> ---
>>  mm/sparse.c |   19 +++++++++++++++++++
>>  1 file changed, 19 insertions(+)
>> 
>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index ce50c8b..bae8f2d 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -86,6 +86,22 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>>  	return section;
>>  }
>>  
>> +static void noinline __init_refok sparse_index_free(struct mem_section *section,
>> +						    int nid)
>> +{
>> +	unsigned long size = SECTIONS_PER_ROOT *
>> +			     sizeof(struct mem_section);
>> +
>> +	if (!section)
>> +		return;
>> +
>> +	if (slab_is_available())
>> +		kfree(section);
>> +	else
>> +		free_bootmem_node(NODE_DATA(nid),
>> +			virt_to_phys(section), size);
>> +}
>> +
>>  static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>  {
>>  	static DEFINE_SPINLOCK(index_init_lock);
>> @@ -113,6 +129,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>  	mem_section[root] = section;
>>  out:
>>  	spin_unlock(&index_init_lock);
>> +	if (ret == -EEXIST)
>> +		sparse_index_free(section, nid);
>
>Maybe a generic if (ret) would be more appropriate.
>

I will do it in next revision :-)

Thanks,
Gavin

>> +
>>  	return ret;
>>  }
>>  #else /* !SPARSEMEM_EXTREME */
>> -- 
>> 1.7.9.5
>> 
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>
>-- 
>Michal Hocko
>SUSE Labs
>SUSE LINUX s.r.o.
>Lihovarska 1060/12
>190 00 Praha 9    
>Czech Republic
>
>--
>To unsubscribe, send a message with 'unsubscribe linux-mm' in
>the body to majordomo@kvack.org.  For more info on Linux MM,
>see: http://www.linux-mm.org/ .
>Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] mm/sparse: optimize sparse_index_alloc
  2012-06-26  6:07     ` Gavin Shan
@ 2012-06-26  7:04       ` Michal Hocko
  2012-06-26  7:13         ` Gavin Shan
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2012-06-26  7:04 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, rientjes, hannes, akpm

On Tue 26-06-12 14:07:35, Gavin Shan wrote:
> >> With CONFIG_SPARSEMEM_EXTREME, the two level of memory section
> >> descriptors are allocated from slab or bootmem. When allocating
> >> from slab, let slab allocator to clear the memory chunk. However,
> >> the memory chunk from bootmem allocator, we have to clear that
> >> explicitly.
> >
> >I am sorry but I do not see how this optimize the current code. What is
> >the difference between slab doing memset and doing it explicitly for all
> >cases?
> >
> 
> Yeah, I do agree it won't do much optimization here. However, I'm wandering
> if I can remove the whole peice of code doing memset(setion, 0, array_size)
> since it seems that alloc_bootmem_node() also clears the allocated memory
> chunk :-)

Yes, alloc_bootem_node clears the memory (strange, I thought it doesn't
do that), so the memset is really not necessary after s/kmalloc/kzalloc/.

> 
> Please correct me if I'm wrong about alloc_bootmem_node() :-)
> 
> Thanks,
> Gavin
> 
> >> 
> >> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> >> ---
> >>  mm/sparse.c |   12 ++++++------
> >>  1 file changed, 6 insertions(+), 6 deletions(-)
> >> 
> >> diff --git a/mm/sparse.c b/mm/sparse.c
> >> index afd0998..ce50c8b 100644
> >> --- a/mm/sparse.c
> >> +++ b/mm/sparse.c
> >> @@ -74,14 +74,14 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
> >>  
> >>  	if (slab_is_available()) {
> >>  		if (node_state(nid, N_HIGH_MEMORY))
> >> -			section = kmalloc_node(array_size, GFP_KERNEL, nid);
> >> +			section = kzalloc_node(array_size, GFP_KERNEL, nid);
> >>  		else
> >> -			section = kmalloc(array_size, GFP_KERNEL);
> >> -	} else
> >> +			section = kzalloc(array_size, GFP_KERNEL);
> >> +	} else {
> >>  		section = alloc_bootmem_node(NODE_DATA(nid), array_size);
> >> -
> >> -	if (section)
> >> -		memset(section, 0, array_size);
> >> +		if (section)
> >> +			memset(section, 0, array_size);
> >> +	}
> >>  
> >>  	return section;
> >>  }
> >> -- 
> >> 1.7.9.5
> >> 
> >> --
> >> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> >> the body to majordomo@kvack.org.  For more info on Linux MM,
> >> see: http://www.linux-mm.org/ .
> >> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> >
> >-- 
> >Michal Hocko
> >SUSE Labs
> >SUSE LINUX s.r.o.
> >Lihovarska 1060/12
> >190 00 Praha 9    
> >Czech Republic
> >
> >--
> >To unsubscribe, send a message with 'unsubscribe linux-mm' in
> >the body to majordomo@kvack.org.  For more info on Linux MM,
> >see: http://www.linux-mm.org/ .
> >Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> >
> 

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] mm/sparse: optimize sparse_index_alloc
  2012-06-26  7:04       ` Michal Hocko
@ 2012-06-26  7:13         ` Gavin Shan
  0 siblings, 0 replies; 26+ messages in thread
From: Gavin Shan @ 2012-06-26  7:13 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, rientjes, hannes, akpm

On Tue, Jun 26, 2012 at 09:04:21AM +0200, Michal Hocko wrote:
>On Tue 26-06-12 14:07:35, Gavin Shan wrote:
>> >> With CONFIG_SPARSEMEM_EXTREME, the two level of memory section
>> >> descriptors are allocated from slab or bootmem. When allocating
>> >> from slab, let slab allocator to clear the memory chunk. However,
>> >> the memory chunk from bootmem allocator, we have to clear that
>> >> explicitly.
>> >
>> >I am sorry but I do not see how this optimize the current code. What is
>> >the difference between slab doing memset and doing it explicitly for all
>> >cases?
>> >
>> 
>> Yeah, I do agree it won't do much optimization here. However, I'm wandering
>> if I can remove the whole peice of code doing memset(setion, 0, array_size)
>> since it seems that alloc_bootmem_node() also clears the allocated memory
>> chunk :-)
>
>Yes, alloc_bootem_node clears the memory (strange, I thought it doesn't
>do that), so the memset is really not necessary after s/kmalloc/kzalloc/.
>

Thanks for the confirm, Michal. Let me remove it in next revision :-)

Thanks,
Gavin

>> 
>> Please correct me if I'm wrong about alloc_bootmem_node() :-)
>> 
>> Thanks,
>> Gavin
>> 
>> >> 
>> >> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
>> >> ---
>> >>  mm/sparse.c |   12 ++++++------
>> >>  1 file changed, 6 insertions(+), 6 deletions(-)
>> >> 
>> >> diff --git a/mm/sparse.c b/mm/sparse.c
>> >> index afd0998..ce50c8b 100644
>> >> --- a/mm/sparse.c
>> >> +++ b/mm/sparse.c
>> >> @@ -74,14 +74,14 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
>> >>  
>> >>  	if (slab_is_available()) {
>> >>  		if (node_state(nid, N_HIGH_MEMORY))
>> >> -			section = kmalloc_node(array_size, GFP_KERNEL, nid);
>> >> +			section = kzalloc_node(array_size, GFP_KERNEL, nid);
>> >>  		else
>> >> -			section = kmalloc(array_size, GFP_KERNEL);
>> >> -	} else
>> >> +			section = kzalloc(array_size, GFP_KERNEL);
>> >> +	} else {
>> >>  		section = alloc_bootmem_node(NODE_DATA(nid), array_size);
>> >> -
>> >> -	if (section)
>> >> -		memset(section, 0, array_size);
>> >> +		if (section)
>> >> +			memset(section, 0, array_size);
>> >> +	}
>> >>  
>> >>  	return section;
>> >>  }
>> >> -- 
>> >> 1.7.9.5
>> >> 
>> >> --
>> >> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> >> the body to majordomo@kvack.org.  For more info on Linux MM,
>> >> see: http://www.linux-mm.org/ .
>> >> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>> >
>> >-- 
>> >Michal Hocko
>> >SUSE Labs
>> >SUSE LINUX s.r.o.
>> >Lihovarska 1060/12
>> >190 00 Praha 9    
>> >Czech Republic
>> >
>> >--
>> >To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> >the body to majordomo@kvack.org.  For more info on Linux MM,
>> >see: http://www.linux-mm.org/ .
>> >Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>> >
>> 
>
>-- 
>Michal Hocko
>SUSE Labs
>SUSE LINUX s.r.o.
>Lihovarska 1060/12
>190 00 Praha 9    
>Czech Republic
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] mm/sparse: fix possible memory leak
  2012-06-26  6:11     ` Gavin Shan
@ 2012-06-26  7:14       ` Michal Hocko
  2012-06-26  7:17         ` Gavin Shan
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2012-06-26  7:14 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, rientjes, hannes, akpm

On Tue 26-06-12 14:11:47, Gavin Shan wrote:
> >> With CONFIG_SPARSEMEM_EXTREME, the root memory section descriptors
> >> are allocated by slab or bootmem allocator. Also, the descriptors
> >> might have been allocated and initialized by others. However, the
> >> memory chunk allocated in current implementation wouldn't be put
> >> into the available pool if others have allocated memory chunk for
> >> that.
> >
> >Who is others? I assume that we can race in hotplug because other than
> >that this is an early initialization code. How can others race?
> >
> 
> I'm sorry that I don't have the real bug against the issue. 

I am not saying the bug is not real. It is just that the changelog
doesn's say how the bug is hit, who is affected and when it has been
introduced. These is essential for stable.


-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] mm/sparse: fix possible memory leak
  2012-06-26  7:14       ` Michal Hocko
@ 2012-06-26  7:17         ` Gavin Shan
  0 siblings, 0 replies; 26+ messages in thread
From: Gavin Shan @ 2012-06-26  7:17 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, rientjes, hannes, akpm

>> >> With CONFIG_SPARSEMEM_EXTREME, the root memory section descriptors
>> >> are allocated by slab or bootmem allocator. Also, the descriptors
>> >> might have been allocated and initialized by others. However, the
>> >> memory chunk allocated in current implementation wouldn't be put
>> >> into the available pool if others have allocated memory chunk for
>> >> that.
>> >
>> >Who is others? I assume that we can race in hotplug because other than
>> >that this is an early initialization code. How can others race?
>> >
>> 
>> I'm sorry that I don't have the real bug against the issue. 
>
>I am not saying the bug is not real. It is just that the changelog
>doesn's say how the bug is hit, who is affected and when it has been
>introduced. These is essential for stable.
>

Thanks, Michal. Let me replace "others" with "hotplug" in next revision :-)

Thanks,
Gavin

>
>-- 
>Michal Hocko
>SUSE Labs
>SUSE LINUX s.r.o.
>Lihovarska 1060/12
>190 00 Praha 9    
>Czech Republic
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] mm/sparse: check size of struct mm_section
  2012-06-25 16:35   ` Gavin Shan
@ 2012-06-26  7:39     ` Michal Hocko
  2012-06-26  7:48       ` Gavin Shan
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2012-06-26  7:39 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, rientjes, hannes, akpm

On Tue 26-06-12 00:35:22, Gavin Shan wrote:
[...]
> >> In order to fully utilize the memory chunk allocated from bootmem
> >> allocator, it'd better to assure memory sector descriptor won't run
> >> across the boundary (PAGE_SIZE).

OK, I misread this part of the changelog changelog.

> >
> >Why? The memory is continuous, right?
> 
> Yes, the memory is conginous and the capacity of specific entry
> in mem_section[NR_SECTION_ROOTS] has been defined as follows:
> 
> 
> #define SECTIONS_PER_ROOT       (PAGE_SIZE / sizeof (struct mem_section))
> 
> Also, the memory is prone to be allocated from bootmem by function
> alloc_bootmem_node(), which has PAGE_SIZE alignment. So I think it's
> reasonable to introduce the extra check here from my personal view :-)

No it is not necessary because we will never cross the page boundary
because (SECTIONS_PER_ROOT uses an int division)
-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] mm/sparse: check size of struct mm_section
  2012-06-26  7:39     ` Michal Hocko
@ 2012-06-26  7:48       ` Gavin Shan
  2012-06-26  8:06         ` Michal Hocko
  0 siblings, 1 reply; 26+ messages in thread
From: Gavin Shan @ 2012-06-26  7:48 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, rientjes, hannes, akpm

>> >> In order to fully utilize the memory chunk allocated from bootmem
>> >> allocator, it'd better to assure memory sector descriptor won't run
>> >> across the boundary (PAGE_SIZE).
>
>OK, I misread this part of the changelog changelog.
>

I should have clarified that more clear :-)

>> >
>> >Why? The memory is continuous, right?
>> 
>> Yes, the memory is conginous and the capacity of specific entry
>> in mem_section[NR_SECTION_ROOTS] has been defined as follows:
>> 
>> 
>> #define SECTIONS_PER_ROOT       (PAGE_SIZE / sizeof (struct mem_section))
>> 
>> Also, the memory is prone to be allocated from bootmem by function
>> alloc_bootmem_node(), which has PAGE_SIZE alignment. So I think it's
>> reasonable to introduce the extra check here from my personal view :-)
>
>No it is not necessary because we will never cross the page boundary
>because (SECTIONS_PER_ROOT uses an int division)

Current situation is that we don't cross the page foundary, but somebody
else might change the data struct (struct mem_section) in future. It will
trigger warning at build time to alarm that the struct should fit with
page size.

Anyway, I will drop this in next revision if you want keep as of being.
Otherwise, I will include it in next revision :-)

Thanks for your time, Michal.

Thanks,
Gavin

>-- 
>Michal Hocko
>SUSE Labs
>SUSE LINUX s.r.o.
>Lihovarska 1060/12
>190 00 Praha 9    
>Czech Republic
>
>--
>To unsubscribe, send a message with 'unsubscribe linux-mm' in
>the body to majordomo@kvack.org.  For more info on Linux MM,
>see: http://www.linux-mm.org/ .
>Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] mm/sparse: check size of struct mm_section
  2012-06-26  7:48       ` Gavin Shan
@ 2012-06-26  8:06         ` Michal Hocko
  2012-06-26  8:24           ` Gavin Shan
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2012-06-26  8:06 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, rientjes, hannes, akpm

On Tue 26-06-12 15:48:54, Gavin Shan wrote:
> >> >> In order to fully utilize the memory chunk allocated from bootmem
> >> >> allocator, it'd better to assure memory sector descriptor won't run
> >> >> across the boundary (PAGE_SIZE).
> >
> >OK, I misread this part of the changelog changelog.
> >
> 
> I should have clarified that more clear :-)
> 
> >> >
> >> >Why? The memory is continuous, right?
> >> 
> >> Yes, the memory is conginous and the capacity of specific entry
> >> in mem_section[NR_SECTION_ROOTS] has been defined as follows:
> >> 
> >> 
> >> #define SECTIONS_PER_ROOT       (PAGE_SIZE / sizeof (struct mem_section))
> >> 
> >> Also, the memory is prone to be allocated from bootmem by function
> >> alloc_bootmem_node(), which has PAGE_SIZE alignment. So I think it's
> >> reasonable to introduce the extra check here from my personal view :-)
> >
> >No it is not necessary because we will never cross the page boundary
> >because (SECTIONS_PER_ROOT uses an int division)
> 
> Current situation is that we don't cross the page foundary, but somebody
> else might change the data struct (struct mem_section) in future. 

No, this is safe even if the structure size changes (unless it is bigger
than PAGE_SIZE).
-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] mm/sparse: check size of struct mm_section
  2012-06-26  8:06         ` Michal Hocko
@ 2012-06-26  8:24           ` Gavin Shan
  2012-06-26 17:57             ` Michal Hocko
  0 siblings, 1 reply; 26+ messages in thread
From: Gavin Shan @ 2012-06-26  8:24 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, rientjes, hannes, akpm

>> >> >> In order to fully utilize the memory chunk allocated from bootmem
>> >> >> allocator, it'd better to assure memory sector descriptor won't run
>> >> >> across the boundary (PAGE_SIZE).
>> >
>> >OK, I misread this part of the changelog changelog.
>> >
>> 
>> I should have clarified that more clear :-)
>> 
>> >> >
>> >> >Why? The memory is continuous, right?
>> >> 
>> >> Yes, the memory is conginous and the capacity of specific entry
>> >> in mem_section[NR_SECTION_ROOTS] has been defined as follows:
>> >> 
>> >> 
>> >> #define SECTIONS_PER_ROOT       (PAGE_SIZE / sizeof (struct mem_section))
>> >> 
>> >> Also, the memory is prone to be allocated from bootmem by function
>> >> alloc_bootmem_node(), which has PAGE_SIZE alignment. So I think it's
>> >> reasonable to introduce the extra check here from my personal view :-)
>> >
>> >No it is not necessary because we will never cross the page boundary
>> >because (SECTIONS_PER_ROOT uses an int division)
>> 
>> Current situation is that we don't cross the page foundary, but somebody
>> else might change the data struct (struct mem_section) in future. 
>
>No, this is safe even if the structure size changes (unless it is bigger
>than PAGE_SIZE).

Yeah, but it can't fully utilize the allocated memory chunk if the size of
the struct isn't aligned well.

Let me drop it in next revision :-)

Thanks,
Gavin


>-- 
>Michal Hocko
>SUSE Labs
>SUSE LINUX s.r.o.
>Lihovarska 1060/12
>190 00 Praha 9    
>Czech Republic
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/5] mm/sparse: more check on mem_section number
  2012-06-23 15:52 ` [PATCH 4/5] mm/sparse: more check on mem_section number Gavin Shan
@ 2012-06-26 16:15   ` Dave Hansen
  2012-06-27  0:29     ` Gavin Shan
  0 siblings, 1 reply; 26+ messages in thread
From: Dave Hansen @ 2012-06-26 16:15 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, rientjes, hannes, akpm

On 06/23/2012 08:52 AM, Gavin Shan wrote:
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -160,6 +160,8 @@ int __section_nr(struct mem_section* ms)
>  		     break;
>  	}
> 
> +	VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
> +
>  	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
>  }

If you're going to bother with a VM_BUG_ON(), I'd probably make it:

	VM_BUG_ON(root_nr >= NR_SECTION_ROOTS);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] mm/sparse: check size of struct mm_section
  2012-06-26  8:24           ` Gavin Shan
@ 2012-06-26 17:57             ` Michal Hocko
  2012-06-27  3:01               ` Gavin Shan
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2012-06-26 17:57 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linux-mm, rientjes, hannes, akpm

On Tue 26-06-12 16:24:39, Gavin Shan wrote:
> >> >> >> In order to fully utilize the memory chunk allocated from bootmem
> >> >> >> allocator, it'd better to assure memory sector descriptor won't run
> >> >> >> across the boundary (PAGE_SIZE).
> >> >
> >> >OK, I misread this part of the changelog changelog.
> >> >
> >> 
> >> I should have clarified that more clear :-)
> >> 
> >> >> >
> >> >> >Why? The memory is continuous, right?
> >> >> 
> >> >> Yes, the memory is conginous and the capacity of specific entry
> >> >> in mem_section[NR_SECTION_ROOTS] has been defined as follows:
> >> >> 
> >> >> 
> >> >> #define SECTIONS_PER_ROOT       (PAGE_SIZE / sizeof (struct mem_section))
> >> >> 
> >> >> Also, the memory is prone to be allocated from bootmem by function
> >> >> alloc_bootmem_node(), which has PAGE_SIZE alignment. So I think it's
> >> >> reasonable to introduce the extra check here from my personal view :-)
> >> >
> >> >No it is not necessary because we will never cross the page boundary
> >> >because (SECTIONS_PER_ROOT uses an int division)
> >> 
> >> Current situation is that we don't cross the page foundary, but somebody
> >> else might change the data struct (struct mem_section) in future. 
> >
> >No, this is safe even if the structure size changes (unless it is bigger
> >than PAGE_SIZE).
> 
> Yeah, but it can't fully utilize the allocated memory chunk if the size of
> the struct isn't aligned well.

And you think that this is justification is sufficient to fail
compilation? I don't think so...

> 
> Let me drop it in next revision :-)
> 
> Thanks,
> Gavin

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/5] mm/sparse: more check on mem_section number
  2012-06-26 16:15   ` Dave Hansen
@ 2012-06-27  0:29     ` Gavin Shan
  0 siblings, 0 replies; 26+ messages in thread
From: Gavin Shan @ 2012-06-27  0:29 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Gavin Shan, linux-mm, rientjes, hannes, akpm

>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -160,6 +160,8 @@ int __section_nr(struct mem_section* ms)
>>  		     break;
>>  	}
>> 
>> +	VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
>> +
>>  	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
>>  }
>
>If you're going to bother with a VM_BUG_ON(), I'd probably make it:
>
>	VM_BUG_ON(root_nr >= NR_SECTION_ROOTS);

Thanks, I'll change it according to your suggestion in next revision.

Thanks,
Gavin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] mm/sparse: check size of struct mm_section
  2012-06-26 17:57             ` Michal Hocko
@ 2012-06-27  3:01               ` Gavin Shan
  0 siblings, 0 replies; 26+ messages in thread
From: Gavin Shan @ 2012-06-27  3:01 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Gavin Shan, linux-mm, rientjes, hannes, akpm

>> >> >
>> >> 
>> >> I should have clarified that more clear :-)
>> >> 
>> >> >> >
>> >> >> >Why? The memory is continuous, right?
>> >> >> 
>> >> >> Yes, the memory is conginous and the capacity of specific entry
>> >> >> in mem_section[NR_SECTION_ROOTS] has been defined as follows:
>> >> >> 
>> >> >> 
>> >> >> #define SECTIONS_PER_ROOT       (PAGE_SIZE / sizeof (struct mem_section))
>> >> >> 
>> >> >> Also, the memory is prone to be allocated from bootmem by function
>> >> >> alloc_bootmem_node(), which has PAGE_SIZE alignment. So I think it's
>> >> >> reasonable to introduce the extra check here from my personal view :-)
>> >> >
>> >> >No it is not necessary because we will never cross the page boundary
>> >> >because (SECTIONS_PER_ROOT uses an int division)
>> >> 
>> >> Current situation is that we don't cross the page foundary, but somebody
>> >> else might change the data struct (struct mem_section) in future. 
>> >
>> >No, this is safe even if the structure size changes (unless it is bigger
>> >than PAGE_SIZE).
>> 
>> Yeah, but it can't fully utilize the allocated memory chunk if the size of
>> the struct isn't aligned well.
>
>And you think that this is justification is sufficient to fail
>compilation? I don't think so...
>

Yeah, it's not reasonable to breat the compilation. So I'm not sure if
linux already had one macro to do warning for the case?

Thanks,
Gavin

>> 
>> Let me drop it in next revision :-)
>> 
>> Thanks,
>> Gavin
>
>-- 
>Michal Hocko
>SUSE Labs
>SUSE LINUX s.r.o.
>Lihovarska 1060/12
>190 00 Praha 9    
>Czech Republic
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2012-06-27  3:02 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-23 15:52 [PATCH 1/5] mm/sparse: check size of struct mm_section Gavin Shan
2012-06-23 15:52 ` [PATCH 2/5] mm/sparse: optimize sparse_index_alloc Gavin Shan
2012-06-25 15:30   ` Michal Hocko
2012-06-26  6:07     ` Gavin Shan
2012-06-26  7:04       ` Michal Hocko
2012-06-26  7:13         ` Gavin Shan
2012-06-23 15:52 ` [PATCH 3/5] mm/sparse: fix possible memory leak Gavin Shan
2012-06-25 15:48   ` Michal Hocko
2012-06-26  6:11     ` Gavin Shan
2012-06-26  7:14       ` Michal Hocko
2012-06-26  7:17         ` Gavin Shan
2012-06-23 15:52 ` [PATCH 4/5] mm/sparse: more check on mem_section number Gavin Shan
2012-06-26 16:15   ` Dave Hansen
2012-06-27  0:29     ` Gavin Shan
2012-06-23 15:52 ` [PATCH 5/5] mm/sparse: return 0 if root mem_section exists Gavin Shan
2012-06-24 19:18   ` KOSAKI Motohiro
2012-06-25  1:09     ` Gavin Shan
2012-06-25 15:39   ` Michal Hocko
2012-06-25 16:03 ` [PATCH 1/5] mm/sparse: check size of struct mm_section Michal Hocko
2012-06-25 16:35   ` Gavin Shan
2012-06-26  7:39     ` Michal Hocko
2012-06-26  7:48       ` Gavin Shan
2012-06-26  8:06         ` Michal Hocko
2012-06-26  8:24           ` Gavin Shan
2012-06-26 17:57             ` Michal Hocko
2012-06-27  3:01               ` Gavin Shan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.