* [PATCH v3 1/3] mm/sparse: optimize sparse_index_alloc @ 2012-07-02 9:28 Gavin Shan 2012-07-02 9:28 ` [PATCH v3 2/3] mm/sparse: fix possible memory leak Gavin Shan 2012-07-02 9:28 ` [PATCH v3 3/3] mm/sparse: more check on mem_section number Gavin Shan 0 siblings, 2 replies; 13+ messages in thread From: Gavin Shan @ 2012-07-02 9:28 UTC (permalink / raw) To: linux-mm; +Cc: dave, mhocko, rientjes, akpm, Gavin Shan With CONFIG_SPARSEMEM_EXTREME, the two level of memory section descriptors are allocated from slab or bootmem. When allocating from slab, let slab/bootmem allocator to clear the memory chunk. We needn't clear that explicitly. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: David Rientjes <rientjes@google.com> --- mm/sparse.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/mm/sparse.c b/mm/sparse.c index 6a4bf91..781fa04 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -65,14 +65,12 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid) if (slab_is_available()) { if (node_state(nid, N_HIGH_MEMORY)) - section = kmalloc_node(array_size, GFP_KERNEL, nid); + section = kzalloc_node(array_size, GFP_KERNEL, nid); else - section = kmalloc(array_size, GFP_KERNEL); - } else + section = kzalloc(array_size, GFP_KERNEL); + } else { section = alloc_bootmem_node(NODE_DATA(nid), array_size); - - if (section) - memset(section, 0, array_size); + } return section; } -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 2/3] mm/sparse: fix possible memory leak 2012-07-02 9:28 [PATCH v3 1/3] mm/sparse: optimize sparse_index_alloc Gavin Shan @ 2012-07-02 9:28 ` Gavin Shan 2012-07-02 9:43 ` Michal Hocko 2012-07-02 11:04 ` David Rientjes 2012-07-02 9:28 ` [PATCH v3 3/3] mm/sparse: more check on mem_section number Gavin Shan 1 sibling, 2 replies; 13+ messages in thread From: Gavin Shan @ 2012-07-02 9:28 UTC (permalink / raw) To: linux-mm; +Cc: dave, mhocko, rientjes, akpm, Gavin Shan sparse_index_init() is designed to be safe if two copies of it race. It uses "index_init_lock" to ensure that, even in the case of a race, only one CPU will manage to do: mem_section[root] = section; However, in the case where two copies of sparse_index_init() _do_ race, the one that loses the race will leak the "section" that sparse_index_alloc() allocated for it. This patch fixes that leak. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> --- mm/sparse.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/mm/sparse.c b/mm/sparse.c index 781fa04..a6984d9 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid) return section; } +static inline void __meminit sparse_index_free(struct mem_section *section) +{ + unsigned long size = SECTIONS_PER_ROOT * + sizeof(struct mem_section); + + if (!section) + return; + + if (slab_is_available()) + kfree(section); + else + free_bootmem(virt_to_phys(section), size); +} + static int __meminit sparse_index_init(unsigned long section_nr, int nid) { static DEFINE_SPINLOCK(index_init_lock); @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid) mem_section[root] = section; out: spin_unlock(&index_init_lock); + if (ret) + sparse_index_free(section); + return ret; } #else /* !SPARSEMEM_EXTREME */ -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak 2012-07-02 9:28 ` [PATCH v3 2/3] mm/sparse: fix possible memory leak Gavin Shan @ 2012-07-02 9:43 ` Michal Hocko 2012-07-02 13:40 ` Gavin Shan 2012-07-02 11:04 ` David Rientjes 1 sibling, 1 reply; 13+ messages in thread From: Michal Hocko @ 2012-07-02 9:43 UTC (permalink / raw) To: Gavin Shan; +Cc: linux-mm, dave, rientjes, akpm On Mon 02-07-12 17:28:56, Gavin Shan wrote: > sparse_index_init() is designed to be safe if two copies of it race. It > uses "index_init_lock" to ensure that, even in the case of a race, only > one CPU will manage to do: > > mem_section[root] = section; > > However, in the case where two copies of sparse_index_init() _do_ race, > the one that loses the race will leak the "section" that > sparse_index_alloc() allocated for it. This patch fixes that leak. I would still like to hear how we can possibly race in this code path. I've thought that memory onlining is done from a single CPU. > > Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> > --- > mm/sparse.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/mm/sparse.c b/mm/sparse.c > index 781fa04..a6984d9 100644 > --- a/mm/sparse.c > +++ b/mm/sparse.c > @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid) > return section; > } > > +static inline void __meminit sparse_index_free(struct mem_section *section) > +{ > + unsigned long size = SECTIONS_PER_ROOT * > + sizeof(struct mem_section); > + > + if (!section) > + return; > + > + if (slab_is_available()) > + kfree(section); > + else > + free_bootmem(virt_to_phys(section), size); > +} > + > static int __meminit sparse_index_init(unsigned long section_nr, int nid) > { > static DEFINE_SPINLOCK(index_init_lock); > @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid) > mem_section[root] = section; > out: > spin_unlock(&index_init_lock); > + if (ret) > + sparse_index_free(section); > + > return ret; > } > #else /* !SPARSEMEM_EXTREME */ > -- > 1.7.9.5 > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak 2012-07-02 9:43 ` Michal Hocko @ 2012-07-02 13:40 ` Gavin Shan 2012-07-02 15:46 ` Michal Hocko 0 siblings, 1 reply; 13+ messages in thread From: Gavin Shan @ 2012-07-02 13:40 UTC (permalink / raw) To: Michal Hocko; +Cc: Gavin Shan, linux-mm, dave, rientjes, akpm On Mon, Jul 02, 2012 at 11:43:31AM +0200, Michal Hocko wrote: >On Mon 02-07-12 17:28:56, Gavin Shan wrote: >> sparse_index_init() is designed to be safe if two copies of it race. It >> uses "index_init_lock" to ensure that, even in the case of a race, only >> one CPU will manage to do: >> >> mem_section[root] = section; >> >> However, in the case where two copies of sparse_index_init() _do_ race, >> the one that loses the race will leak the "section" that >> sparse_index_alloc() allocated for it. This patch fixes that leak. > >I would still like to hear how we can possibly race in this code path. >I've thought that memory onlining is done from a single CPU. > Hi Michael, how about to use the following changelog? :-) ----- sparse_index_init() is designed to be safe if two copies of it race. It uses "index_init_lock" to ensure that, even in the case of a race, only one CPU will manage to do: mem_section[root] = section; However, in the case where two copies of sparse_index_init() _do_ race, which is probablly caused by making online for multiple memory sections that depend on same entry of array mem_section[] simultaneously from different CPUs. The one that loses the race will leak the "section" that sparse_index_alloc() allocated for it. This patch fixes that leak. ----- Thanks, Gavin >> >> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> >> --- >> mm/sparse.c | 17 +++++++++++++++++ >> 1 file changed, 17 insertions(+) >> >> diff --git a/mm/sparse.c b/mm/sparse.c >> index 781fa04..a6984d9 100644 >> --- a/mm/sparse.c >> +++ b/mm/sparse.c >> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid) >> return section; >> } >> >> +static inline void __meminit sparse_index_free(struct mem_section *section) >> +{ >> + unsigned long size = SECTIONS_PER_ROOT * >> + sizeof(struct mem_section); >> + >> + if (!section) >> + return; >> + >> + if (slab_is_available()) >> + kfree(section); >> + else >> + free_bootmem(virt_to_phys(section), size); >> +} >> + >> static int __meminit sparse_index_init(unsigned long section_nr, int nid) >> { >> static DEFINE_SPINLOCK(index_init_lock); >> @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid) >> mem_section[root] = section; >> out: >> spin_unlock(&index_init_lock); >> + if (ret) >> + sparse_index_free(section); >> + >> return ret; >> } >> #else /* !SPARSEMEM_EXTREME */ >> -- >> 1.7.9.5 >> >> -- >> To unsubscribe, send a message with 'unsubscribe linux-mm' in >> the body to majordomo@kvack.org. For more info on Linux MM, >> see: http://www.linux-mm.org/ . >> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> > >-- >Michal Hocko >SUSE Labs >SUSE LINUX s.r.o. >Lihovarska 1060/12 >190 00 Praha 9 >Czech Republic > >-- >To unsubscribe, send a message with 'unsubscribe linux-mm' in >the body to majordomo@kvack.org. For more info on Linux MM, >see: http://www.linux-mm.org/ . >Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak 2012-07-02 13:40 ` Gavin Shan @ 2012-07-02 15:46 ` Michal Hocko 2012-07-03 3:38 ` Gavin Shan 0 siblings, 1 reply; 13+ messages in thread From: Michal Hocko @ 2012-07-02 15:46 UTC (permalink / raw) To: Gavin Shan; +Cc: linux-mm, dave, rientjes, akpm On Mon 02-07-12 21:40:53, Gavin Shan wrote: > On Mon, Jul 02, 2012 at 11:43:31AM +0200, Michal Hocko wrote: > >On Mon 02-07-12 17:28:56, Gavin Shan wrote: > >> sparse_index_init() is designed to be safe if two copies of it race. It > >> uses "index_init_lock" to ensure that, even in the case of a race, only > >> one CPU will manage to do: > >> > >> mem_section[root] = section; > >> > >> However, in the case where two copies of sparse_index_init() _do_ race, > >> the one that loses the race will leak the "section" that > >> sparse_index_alloc() allocated for it. This patch fixes that leak. > > > >I would still like to hear how we can possibly race in this code path. > >I've thought that memory onlining is done from a single CPU. > > > > Hi Michael, how about to use the following changelog? :-) > > ----- > > sparse_index_init() is designed to be safe if two copies of it race. It > uses "index_init_lock" to ensure that, even in the case of a race, only > one CPU will manage to do: > > mem_section[root] = section; > > However, in the case where two copies of sparse_index_init() _do_ race, > which is probablly caused by making online for multiple memory sections > that depend on same entry of array mem_section[] simultaneously from > different CPUs. And you really think that this clarified the things? You have just tweaked the comment to sound more obscure. OK, so you have pushed me into the code... If you had looked into the hotplug callchain up to add_memory you would have seen that the whole arch_add_memory -> __add_pages -> ... -> sparse_index_init is called with lock_memory_hotplug held so the hotplug cannot run from the multiple CPUs. I do not see any other users apart from boot time sparse_memory_present_with_active_regions and add_memory so I think that the lock is just a heritage from old days. So please make sure you are fixing a real issue rather than add another code which simply never gets executed. And no obscuring the changelog doesn't help anybody. > The one that loses the race will leak the "section" that > sparse_index_alloc() allocated for it. This patch fixes that leak. > > ----- > > Thanks, > Gavin > > >> > >> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> > >> --- > >> mm/sparse.c | 17 +++++++++++++++++ > >> 1 file changed, 17 insertions(+) > >> > >> diff --git a/mm/sparse.c b/mm/sparse.c > >> index 781fa04..a6984d9 100644 > >> --- a/mm/sparse.c > >> +++ b/mm/sparse.c > >> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid) > >> return section; > >> } > >> > >> +static inline void __meminit sparse_index_free(struct mem_section *section) > >> +{ > >> + unsigned long size = SECTIONS_PER_ROOT * > >> + sizeof(struct mem_section); > >> + > >> + if (!section) > >> + return; > >> + > >> + if (slab_is_available()) > >> + kfree(section); > >> + else > >> + free_bootmem(virt_to_phys(section), size); > >> +} > >> + > >> static int __meminit sparse_index_init(unsigned long section_nr, int nid) > >> { > >> static DEFINE_SPINLOCK(index_init_lock); > >> @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid) > >> mem_section[root] = section; > >> out: > >> spin_unlock(&index_init_lock); > >> + if (ret) > >> + sparse_index_free(section); > >> + > >> return ret; > >> } > >> #else /* !SPARSEMEM_EXTREME */ > >> -- > >> 1.7.9.5 > >> > >> -- > >> To unsubscribe, send a message with 'unsubscribe linux-mm' in > >> the body to majordomo@kvack.org. For more info on Linux MM, > >> see: http://www.linux-mm.org/ . > >> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> > > > >-- > >Michal Hocko > >SUSE Labs > >SUSE LINUX s.r.o. > >Lihovarska 1060/12 > >190 00 Praha 9 > >Czech Republic > > > >-- > >To unsubscribe, send a message with 'unsubscribe linux-mm' in > >the body to majordomo@kvack.org. For more info on Linux MM, > >see: http://www.linux-mm.org/ . > >Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> > > > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak 2012-07-02 15:46 ` Michal Hocko @ 2012-07-03 3:38 ` Gavin Shan 2012-07-03 12:51 ` Michal Hocko 0 siblings, 1 reply; 13+ messages in thread From: Gavin Shan @ 2012-07-03 3:38 UTC (permalink / raw) To: Michal Hocko; +Cc: Gavin Shan, linux-mm, dave, rientjes, akpm On Mon, Jul 02, 2012 at 05:46:28PM +0200, Michal Hocko wrote: >On Mon 02-07-12 21:40:53, Gavin Shan wrote: >> On Mon, Jul 02, 2012 at 11:43:31AM +0200, Michal Hocko wrote: >> >On Mon 02-07-12 17:28:56, Gavin Shan wrote: >> >> sparse_index_init() is designed to be safe if two copies of it race. It >> >> uses "index_init_lock" to ensure that, even in the case of a race, only >> >> one CPU will manage to do: >> >> >> >> mem_section[root] = section; >> >> >> >> However, in the case where two copies of sparse_index_init() _do_ race, >> >> the one that loses the race will leak the "section" that >> >> sparse_index_alloc() allocated for it. This patch fixes that leak. >> > >> >I would still like to hear how we can possibly race in this code path. >> >I've thought that memory onlining is done from a single CPU. >> > >> >> Hi Michael, how about to use the following changelog? :-) >> >> ----- >> >> sparse_index_init() is designed to be safe if two copies of it race. It >> uses "index_init_lock" to ensure that, even in the case of a race, only >> one CPU will manage to do: >> >> mem_section[root] = section; >> >> However, in the case where two copies of sparse_index_init() _do_ race, >> which is probablly caused by making online for multiple memory sections >> that depend on same entry of array mem_section[] simultaneously from >> different CPUs. > >And you really think that this clarified the things? You have just >tweaked the comment to sound more obscure. > >OK, so you have pushed me into the code... >If you had looked into the hotplug callchain up to add_memory you would >have seen that the whole arch_add_memory -> __add_pages -> ... -> >sparse_index_init is called with lock_memory_hotplug held so the hotplug >cannot run from the multiple CPUs. > >I do not see any other users apart from boot time >sparse_memory_present_with_active_regions and add_memory so I think that >the lock is just a heritage from old days. > I just had quick go-through on the source code as you suggested and I think you're right, Michal. So please drop this :-) With CONFIG_ARCH_MEMORY_PROBE enabled on Power machines, following functions would be included in hotplug path. memory_probe_store add_memory lock_memory_hotplug /* protect the whole hotplug path */ arch_add_memory __add_pages __add_section sparse_add_one_section sparse_index_init sparse_index_alloc The mutex "mem_hotplug_mutex" will be hold by lock_memory_hotplug() to protect the whole hotplug path. However, I'm wandering if we can remove the "index_init_lock" of function sparse_index_init() since that sounds duplicate lock. static int __meminit sparse_index_init(unsigned long section_nr, int nid) { static DEFINE_SPINLOCK(index_init_lock); unsigned long root = SECTION_NR_TO_ROOT(section_nr); struct mem_section *section; int ret = 0; if (mem_section[root]) return -EEXIST; section = sparse_index_alloc(nid); if (!section) return -ENOMEM; /* * This lock keeps two different sections from * reallocating for the same index */ spin_lock(&index_init_lock); if (mem_section[root]) { ret = -EEXIST; goto out; } mem_section[root] = section; out: spin_unlock(&index_init_lock); if (ret) sparse_index_free(section); return ret; } Thanks, Gavin >So please make sure you are fixing a real issue rather than add another >code which simply never gets executed. > >And no obscuring the changelog doesn't help anybody. > >> The one that loses the race will leak the "section" that >> sparse_index_alloc() allocated for it. This patch fixes that leak. > >> >> ----- >> >> Thanks, >> Gavin >> >> >> >> >> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> >> >> --- >> >> mm/sparse.c | 17 +++++++++++++++++ >> >> 1 file changed, 17 insertions(+) >> >> >> >> diff --git a/mm/sparse.c b/mm/sparse.c >> >> index 781fa04..a6984d9 100644 >> >> --- a/mm/sparse.c >> >> +++ b/mm/sparse.c >> >> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid) >> >> return section; >> >> } >> >> >> >> +static inline void __meminit sparse_index_free(struct mem_section *section) >> >> +{ >> >> + unsigned long size = SECTIONS_PER_ROOT * >> >> + sizeof(struct mem_section); >> >> + >> >> + if (!section) >> >> + return; >> >> + >> >> + if (slab_is_available()) >> >> + kfree(section); >> >> + else >> >> + free_bootmem(virt_to_phys(section), size); >> >> +} >> >> + >> >> static int __meminit sparse_index_init(unsigned long section_nr, int nid) >> >> { >> >> static DEFINE_SPINLOCK(index_init_lock); >> >> @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid) >> >> mem_section[root] = section; >> >> out: >> >> spin_unlock(&index_init_lock); >> >> + if (ret) >> >> + sparse_index_free(section); >> >> + >> >> return ret; >> >> } >> >> #else /* !SPARSEMEM_EXTREME */ >> >> -- >> >> 1.7.9.5 >> >> >> >> -- >> >> To unsubscribe, send a message with 'unsubscribe linux-mm' in >> >> the body to majordomo@kvack.org. For more info on Linux MM, >> >> see: http://www.linux-mm.org/ . >> >> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> >> > >> >-- >> >Michal Hocko >> >SUSE Labs >> >SUSE LINUX s.r.o. >> >Lihovarska 1060/12 >> >190 00 Praha 9 >> >Czech Republic >> > >> >-- >> >To unsubscribe, send a message with 'unsubscribe linux-mm' in >> >the body to majordomo@kvack.org. For more info on Linux MM, >> >see: http://www.linux-mm.org/ . >> >Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> >> > >> >> -- >> To unsubscribe, send a message with 'unsubscribe linux-mm' in >> the body to majordomo@kvack.org. For more info on Linux MM, >> see: http://www.linux-mm.org/ . >> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> > >-- >Michal Hocko >SUSE Labs >SUSE LINUX s.r.o. >Lihovarska 1060/12 >190 00 Praha 9 >Czech Republic > >-- >To unsubscribe, send a message with 'unsubscribe linux-mm' in >the body to majordomo@kvack.org. For more info on Linux MM, >see: http://www.linux-mm.org/ . >Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak 2012-07-03 3:38 ` Gavin Shan @ 2012-07-03 12:51 ` Michal Hocko 0 siblings, 0 replies; 13+ messages in thread From: Michal Hocko @ 2012-07-03 12:51 UTC (permalink / raw) To: Gavin Shan; +Cc: linux-mm, dave, rientjes, akpm On Tue 03-07-12 11:38:23, Gavin Shan wrote: > On Mon, Jul 02, 2012 at 05:46:28PM +0200, Michal Hocko wrote: > >On Mon 02-07-12 21:40:53, Gavin Shan wrote: > >> On Mon, Jul 02, 2012 at 11:43:31AM +0200, Michal Hocko wrote: > >> >On Mon 02-07-12 17:28:56, Gavin Shan wrote: > >> >> sparse_index_init() is designed to be safe if two copies of it race. It > >> >> uses "index_init_lock" to ensure that, even in the case of a race, only > >> >> one CPU will manage to do: > >> >> > >> >> mem_section[root] = section; > >> >> > >> >> However, in the case where two copies of sparse_index_init() _do_ race, > >> >> the one that loses the race will leak the "section" that > >> >> sparse_index_alloc() allocated for it. This patch fixes that leak. > >> > > >> >I would still like to hear how we can possibly race in this code path. > >> >I've thought that memory onlining is done from a single CPU. > >> > > >> > >> Hi Michael, how about to use the following changelog? :-) > >> > >> ----- > >> > >> sparse_index_init() is designed to be safe if two copies of it race. It > >> uses "index_init_lock" to ensure that, even in the case of a race, only > >> one CPU will manage to do: > >> > >> mem_section[root] = section; > >> > >> However, in the case where two copies of sparse_index_init() _do_ race, > >> which is probablly caused by making online for multiple memory sections > >> that depend on same entry of array mem_section[] simultaneously from > >> different CPUs. > > > >And you really think that this clarified the things? You have just > >tweaked the comment to sound more obscure. > > > >OK, so you have pushed me into the code... > >If you had looked into the hotplug callchain up to add_memory you would > >have seen that the whole arch_add_memory -> __add_pages -> ... -> > >sparse_index_init is called with lock_memory_hotplug held so the hotplug > >cannot run from the multiple CPUs. > > > >I do not see any other users apart from boot time > >sparse_memory_present_with_active_regions and add_memory so I think that > >the lock is just a heritage from old days. > > > > I just had quick go-through on the source code as you suggested and I > think you're right, Michal. So please drop this :-) > > With CONFIG_ARCH_MEMORY_PROBE enabled on Power machines, following > functions would be included in hotplug path. I am not sure why you are mentioning Power arch here, add_memory which does the locking is arch independent. > > memory_probe_store > add_memory > lock_memory_hotplug /* protect the whole hotplug path */ > arch_add_memory > __add_pages > __add_section > sparse_add_one_section > sparse_index_init > sparse_index_alloc > > The mutex "mem_hotplug_mutex" will be hold by lock_memory_hotplug() to protect > the whole hotplug path. > However, I'm wandering if we can remove the "index_init_lock" of > function sparse_index_init() since that sounds duplicate lock. Heh, that's what I am asking from the very beginning... I do not see any purpose of the lock but I might be missing something. So make sure you really understand the locking of this code if you are going to send a patch to remove the lock. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak 2012-07-02 9:28 ` [PATCH v3 2/3] mm/sparse: fix possible memory leak Gavin Shan 2012-07-02 9:43 ` Michal Hocko @ 2012-07-02 11:04 ` David Rientjes 2012-07-02 13:28 ` Gavin Shan 1 sibling, 1 reply; 13+ messages in thread From: David Rientjes @ 2012-07-02 11:04 UTC (permalink / raw) To: Gavin Shan; +Cc: linux-mm, dave, mhocko, akpm On Mon, 2 Jul 2012, Gavin Shan wrote: > diff --git a/mm/sparse.c b/mm/sparse.c > index 781fa04..a6984d9 100644 > --- a/mm/sparse.c > +++ b/mm/sparse.c > @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid) > return section; > } > > +static inline void __meminit sparse_index_free(struct mem_section *section) > +{ > + unsigned long size = SECTIONS_PER_ROOT * > + sizeof(struct mem_section); > + > + if (!section) > + return; > + > + if (slab_is_available()) > + kfree(section); > + else > + free_bootmem(virt_to_phys(section), size); Eek, does that work? > +} > + > static int __meminit sparse_index_init(unsigned long section_nr, int nid) > { > static DEFINE_SPINLOCK(index_init_lock); > @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid) > mem_section[root] = section; > out: > spin_unlock(&index_init_lock); > + if (ret) > + sparse_index_free(section); > + > return ret; > } > #else /* !SPARSEMEM_EXTREME */ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak 2012-07-02 11:04 ` David Rientjes @ 2012-07-02 13:28 ` Gavin Shan 2012-07-02 21:19 ` David Rientjes 0 siblings, 1 reply; 13+ messages in thread From: Gavin Shan @ 2012-07-02 13:28 UTC (permalink / raw) To: David Rientjes; +Cc: Gavin Shan, linux-mm, dave, mhocko, akpm >> diff --git a/mm/sparse.c b/mm/sparse.c >> index 781fa04..a6984d9 100644 >> --- a/mm/sparse.c >> +++ b/mm/sparse.c >> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid) >> return section; >> } >> >> +static inline void __meminit sparse_index_free(struct mem_section *section) >> +{ >> + unsigned long size = SECTIONS_PER_ROOT * >> + sizeof(struct mem_section); >> + >> + if (!section) >> + return; >> + >> + if (slab_is_available()) >> + kfree(section); >> + else >> + free_bootmem(virt_to_phys(section), size); > >Eek, does that work? > David, I think it's working fine. If my understanding is wrong, please correct me. Thanks a lot :-) The "section" allocated from the bootmem allocator might take following function call path. In the function alloc_bootmem_core(), all online nodes will be checked for the memory allocation. So we could have memory allocated from different node other than the specified one to alloc_bootmem_node() alloc_bootmem_node(nid, size) __alloc_bootmem_node() ___alloc_bootmem_node_nopanic() alloc_bootmem_core() On the other hand, function free_bootmem() checks which node the memory block belongs to and then free it into that node. That looks reasonable. Thanks, Gavin >> +} >> + >> static int __meminit sparse_index_init(unsigned long section_nr, int nid) >> { >> static DEFINE_SPINLOCK(index_init_lock); >> @@ -102,6 +116,9 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid) >> mem_section[root] = section; >> out: >> spin_unlock(&index_init_lock); >> + if (ret) >> + sparse_index_free(section); >> + >> return ret; >> } >> #else /* !SPARSEMEM_EXTREME */ > >-- >To unsubscribe, send a message with 'unsubscribe linux-mm' in >the body to majordomo@kvack.org. For more info on Linux MM, >see: http://www.linux-mm.org/ . >Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak 2012-07-02 13:28 ` Gavin Shan @ 2012-07-02 21:19 ` David Rientjes 2012-07-03 1:19 ` Gavin Shan 0 siblings, 1 reply; 13+ messages in thread From: David Rientjes @ 2012-07-02 21:19 UTC (permalink / raw) To: Gavin Shan; +Cc: linux-mm, dave, mhocko, akpm On Mon, 2 Jul 2012, Gavin Shan wrote: > >> diff --git a/mm/sparse.c b/mm/sparse.c > >> index 781fa04..a6984d9 100644 > >> --- a/mm/sparse.c > >> +++ b/mm/sparse.c > >> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid) > >> return section; > >> } > >> > >> +static inline void __meminit sparse_index_free(struct mem_section *section) > >> +{ > >> + unsigned long size = SECTIONS_PER_ROOT * > >> + sizeof(struct mem_section); > >> + > >> + if (!section) > >> + return; > >> + > >> + if (slab_is_available()) > >> + kfree(section); > >> + else > >> + free_bootmem(virt_to_phys(section), size); > > > >Eek, does that work? > > > > David, I think it's working fine. If my understanding is wrong, please > correct me. Thanks a lot :-) > I'm thinking it should be free_bootmem(__pa(section), size); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 2/3] mm/sparse: fix possible memory leak 2012-07-02 21:19 ` David Rientjes @ 2012-07-03 1:19 ` Gavin Shan 0 siblings, 0 replies; 13+ messages in thread From: Gavin Shan @ 2012-07-03 1:19 UTC (permalink / raw) To: David Rientjes; +Cc: Gavin Shan, linux-mm, dave, mhocko, akpm On Mon, Jul 02, 2012 at 02:19:39PM -0700, David Rientjes wrote: >On Mon, 2 Jul 2012, Gavin Shan wrote: > >> >> diff --git a/mm/sparse.c b/mm/sparse.c >> >> index 781fa04..a6984d9 100644 >> >> --- a/mm/sparse.c >> >> +++ b/mm/sparse.c >> >> @@ -75,6 +75,20 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid) >> >> return section; >> >> } >> >> >> >> +static inline void __meminit sparse_index_free(struct mem_section *section) >> >> +{ >> >> + unsigned long size = SECTIONS_PER_ROOT * >> >> + sizeof(struct mem_section); >> >> + >> >> + if (!section) >> >> + return; >> >> + >> >> + if (slab_is_available()) >> >> + kfree(section); >> >> + else >> >> + free_bootmem(virt_to_phys(section), size); >> > >> >Eek, does that work? >> > >> >> David, I think it's working fine. If my understanding is wrong, please >> correct me. Thanks a lot :-) >> > >I'm thinking it should be free_bootmem(__pa(section), size); > Thanks for pointing it out, David. Thanks, Gavin -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v3 3/3] mm/sparse: more check on mem_section number 2012-07-02 9:28 [PATCH v3 1/3] mm/sparse: optimize sparse_index_alloc Gavin Shan 2012-07-02 9:28 ` [PATCH v3 2/3] mm/sparse: fix possible memory leak Gavin Shan @ 2012-07-02 9:28 ` Gavin Shan 2012-07-02 11:05 ` David Rientjes 1 sibling, 1 reply; 13+ messages in thread From: Gavin Shan @ 2012-07-02 9:28 UTC (permalink / raw) To: linux-mm; +Cc: dave, mhocko, rientjes, akpm, Gavin Shan Function __section_nr() was implemented to retrieve the corresponding memory section number according to its descriptor. It's possible that the specified memory section descriptor isn't existing in the global array. So here to add more check on that and report error for wrong case. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> --- mm/sparse.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/sparse.c b/mm/sparse.c index a6984d9..f2525fd 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -147,6 +147,8 @@ int __section_nr(struct mem_section* ms) break; } + VM_BUG_ON(root_nr == NR_SECTION_ROOTS); + return (root_nr * SECTIONS_PER_ROOT) + (ms - root); } -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v3 3/3] mm/sparse: more check on mem_section number 2012-07-02 9:28 ` [PATCH v3 3/3] mm/sparse: more check on mem_section number Gavin Shan @ 2012-07-02 11:05 ` David Rientjes 0 siblings, 0 replies; 13+ messages in thread From: David Rientjes @ 2012-07-02 11:05 UTC (permalink / raw) To: Gavin Shan; +Cc: linux-mm, dave, mhocko, akpm On Mon, 2 Jul 2012, Gavin Shan wrote: > Function __section_nr() was implemented to retrieve the corresponding > memory section number according to its descriptor. It's possible that > the specified memory section descriptor isn't existing in the global > array. So here to add more check on that and report error for wrong > case. > > Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Acked-by: David Rientjes <rientjes@google.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2012-07-03 12:51 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-07-02 9:28 [PATCH v3 1/3] mm/sparse: optimize sparse_index_alloc Gavin Shan 2012-07-02 9:28 ` [PATCH v3 2/3] mm/sparse: fix possible memory leak Gavin Shan 2012-07-02 9:43 ` Michal Hocko 2012-07-02 13:40 ` Gavin Shan 2012-07-02 15:46 ` Michal Hocko 2012-07-03 3:38 ` Gavin Shan 2012-07-03 12:51 ` Michal Hocko 2012-07-02 11:04 ` David Rientjes 2012-07-02 13:28 ` Gavin Shan 2012-07-02 21:19 ` David Rientjes 2012-07-03 1:19 ` Gavin Shan 2012-07-02 9:28 ` [PATCH v3 3/3] mm/sparse: more check on mem_section number Gavin Shan 2012-07-02 11:05 ` David Rientjes
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.