linux-kernel.vger.kernel.org archive mirror
* Re: percpu allocator
       [not found] <E1SNhwY-0007ui-V7.pp_84-mail-ru@f220.mail.ru>
@ 2012-04-27 14:17 ` Tejun Heo
  2012-04-27 15:42   ` [PATCH] percpu: pcpu_embed_first_chunk() should free unused parts after all allocs are complete Tejun Heo
  0 siblings, 1 reply; 2+ messages in thread
From: Tejun Heo @ 2012-04-27 14:17 UTC (permalink / raw)
  To: Pavel V. Panteleev; +Cc: linux-mm, linux-kernel, andi

Hello,

On Fri, Apr 27, 2012 at 01:58:26PM +0400, Pavel V. Panteleev wrote:
> I have the following problem with the pcpu embed allocator in kernel
> 2.6.33.1.  The pcpu_embed_first_chunk() function allocates only size_sum =
> (ai->static_size + ai->reserved_size + ai->dyn_size) per unit in
> each group.  So, for 4 groups (1 unit in each) and with memory only on
> the first node I get the following:
> 
> pcpu_embed_first_chunk(): ai->groups[0].base_offset=0x0
> pcpu_embed_first_chunk(): ai->groups[1].base_offset=0xa000
> pcpu_embed_first_chunk(): ai->groups[2].base_offset=0x14000
> pcpu_embed_first_chunk(): ai->groups[3].base_offset=0x1e000
> 
>  pcpu_embed_first_chunk(): ai->unit_size=0x10000
> 
> This means that each group actually uses only size_sum=0xa000 of
> memory.  So, with memory only on the first node, memory for the
> next group is allocated right next to the memory of the previous
> group, even though only size_sum=0xa000 is used while
> ai->unit_size=0x10000.
>
> After group_offsets and group_sizes are filled in by
> pcpu_setup_first_chunk(), (group_offsets[i] + group_sizes[i]) can be
> larger than group_offsets[i+1].  But pcpu_get_vm_areas() contains a
> check which asserts that such a situation is impossible:
> 
> BUG_ON(start2 >= start && start2 < end);
> 
> Maybe I should not use the embed allocator in such a situation?

Nice catch.  pcpu_embed_first_chunk() allocates a full unit and then
frees whatever is unused (for alignment, IIRC) before proceeding to the
next group.  What it should do is first allocate and prepare all
groups and then free whatever is unused.  I'll write up a patch.

Thanks.

-- 
tejun


* [PATCH] percpu: pcpu_embed_first_chunk() should free unused parts after all allocs are complete
  2012-04-27 14:17 ` percpu allocator Tejun Heo
@ 2012-04-27 15:42   ` Tejun Heo
  0 siblings, 0 replies; 2+ messages in thread
From: Tejun Heo @ 2012-04-27 15:42 UTC (permalink / raw)
  To: Pavel V. Panteleev, Christoph Lameter; +Cc: linux-mm, linux-kernel, andi

pcpu_embed_first_chunk() allocates memory for each node, copies percpu
data and frees unused portions of it before proceeding to the next
group.  This assumes that allocations for different nodes don't
overlap; however, depending on memory topology, the bootmem allocator
may end up allocating memory from a different node than the requested
one, which may overlap with the portion freed from one of the previous
percpu areas.  This leads to percpu groups for different nodes
overlapping, which is a serious bug.

This patch separates out copy & partial free from the allocation loop
such that all allocations are complete before partial frees happen.

This also fixes overlapping frees which could happen on the allocation
failure path - the out_free_areas path frees whole groups, but the
groups could already have had portions freed at that point.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Reported-by: "Pavel V. Panteleev" <pp_84@mail.ru>
LKML-Reference: <E1SNhwY-0007ui-V7.pp_84-mail-ru@f220.mail.ru>
---
Can you please verify this patch fixes the problem?

Thanks.

 mm/percpu.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/percpu.c b/mm/percpu.c
index f47af91..7975693 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1650,6 +1650,16 @@ int __init pcpu_embed_first_chunk(size_t reserved_size, size_t dyn_size,
 		areas[group] = ptr;
 
 		base = min(ptr, base);
+	}
+
+	/*
+	 * Copy data and free unused parts.  This should happen after all
+	 * allocations are complete; otherwise, we may end up with
+	 * overlapping groups.
+	 */
+	for (group = 0; group < ai->nr_groups; group++) {
+		struct pcpu_group_info *gi = &ai->groups[group];
+		void *ptr = areas[group];
 
 		for (i = 0; i < gi->nr_units; i++, ptr += ai->unit_size) {
 			if (gi->cpu_map[i] == NR_CPUS) {

