* [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one
@ 2009-09-22  7:40 ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-22  7:40 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

Hello, all.

This patchset converts ia64 to the dynamic percpu allocator and drops
the now unused old percpu allocator.  It contains the following four
patches.

 0001-vmalloc-rename-local-variables-vmalloc_start-and-vma.patch
 0002-ia64-allocate-percpu-area-for-cpu0-like-percpu-areas.patch
 0003-ia64-convert-to-dynamic-percpu-allocator.patch
 0004-percpu-kill-legacy-percpu-allocator.patch

0001 is misc prep to avoid a macro / local variable collision.  0002
makes ia64 allocate the percpu area for cpu0 in the same way it does
for other cpus.  0003 converts ia64 to the dynamic percpu allocator and
0004 drops the now unused legacy allocator.

The contig memory model was verified with the ski emulator.  The
discontig and sparse models were verified on a 4-way SGI Altix machine.
I've run the percpu stress test module on that machine for quite a
while.

Mike Travis, it would be great if you could test this on your machine.
I'd really like to see how it would behave on a machine with that many
NUMA nodes.

This patchset is available in the following git tree.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git convert-ia64
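
For example, to fetch and merge it directly:

  $ git pull git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git convert-ia64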

Hmmm... kernel.org seems slow to sync today.  If the branch isn't
mirrored, please pull from the master.

Thanks.

 arch/ia64/Kconfig              |    3 
 arch/ia64/kernel/setup.c       |   12 --
 arch/ia64/kernel/vmlinux.lds.S |   11 +-
 arch/ia64/mm/contig.c          |   87 ++++++++++++++++----
 arch/ia64/mm/discontig.c       |  120 +++++++++++++++++++++++++--
 include/linux/percpu.h         |   24 -----
 kernel/module.c                |  150 ----------------------------------
 mm/Makefile                    |    4 
 mm/allocpercpu.c               |  177 -----------------------------------------
 mm/percpu.c                    |    2 
 mm/vmalloc.c                   |   16 +--
 11 files changed, 193 insertions(+), 413 deletions(-)

--
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH 1/4] vmalloc: rename local variables vmalloc_start and vmalloc_end
  2009-09-22  7:40 ` Tejun Heo
@ 2009-09-22  7:40   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-22  7:40 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64

ia64 defines a global vmalloc_end and VMALLOC_END as an alias to it, so
using a local variable named vmalloc_end and initializing it from
VMALLOC_END results in a bogus self-initialization like the following.

  const unsigned long vmalloc_end = vmalloc_end & ~(align - 1);
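
The alias that causes this is, roughly (simplified, the exact ia64
header definitions may differ):

  extern unsigned long vmalloc_end;
  #define VMALLOC_END vmalloc_end

After macro expansion, name lookup in the initializer finds the local
variable being declared rather than the global, hence the bogus
self-initialization above.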

Rename local variables vmalloc_start and vmalloc_end to vm_start and
vm_end to avoid the collision.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
---
 mm/vmalloc.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 204b824..416e7fe 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1901,13 +1901,13 @@ static unsigned long pvm_determine_end(struct vmap_area **pnext,
 				       struct vmap_area **pprev,
 				       unsigned long align)
 {
-	const unsigned long vmalloc_end = VMALLOC_END & ~(align - 1);
+	const unsigned long vm_end = VMALLOC_END & ~(align - 1);
 	unsigned long addr;
 
 	if (*pnext)
-		addr = min((*pnext)->va_start & ~(align - 1), vmalloc_end);
+		addr = min((*pnext)->va_start & ~(align - 1), vm_end);
 	else
-		addr = vmalloc_end;
+		addr = vm_end;
 
 	while (*pprev && (*pprev)->va_end > addr) {
 		*pnext = *pprev;
@@ -1946,8 +1946,8 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
 				     const size_t *sizes, int nr_vms,
 				     size_t align, gfp_t gfp_mask)
 {
-	const unsigned long vmalloc_start = ALIGN(VMALLOC_START, align);
-	const unsigned long vmalloc_end = VMALLOC_END & ~(align - 1);
+	const unsigned long vm_start = ALIGN(VMALLOC_START, align);
+	const unsigned long vm_end = VMALLOC_END & ~(align - 1);
 	struct vmap_area **vas, *prev, *next;
 	struct vm_struct **vms;
 	int area, area2, last_area, term_area;
@@ -1983,7 +1983,7 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
 	}
 	last_end = offsets[last_area] + sizes[last_area];
 
-	if (vmalloc_end - vmalloc_start < last_end) {
+	if (vm_end - vm_start < last_end) {
 		WARN_ON(true);
 		return NULL;
 	}
@@ -2008,7 +2008,7 @@ retry:
 	end = start + sizes[area];
 
 	if (!pvm_find_next_prev(vmap_area_pcpu_hole, &next, &prev)) {
-		base = vmalloc_end - last_end;
+		base = vm_end - last_end;
 		goto found;
 	}
 	base = pvm_determine_end(&next, &prev, align) - end;
@@ -2021,7 +2021,7 @@ retry:
 		 * base might have underflowed, add last_end before
 		 * comparing.
 		 */
-		if (base + last_end < vmalloc_start + last_end) {
+		if (base + last_end < vm_start + last_end) {
 			spin_unlock(&vmap_area_lock);
 			if (!purged) {
 				purge_vmap_area_lazy();
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-22  7:40 ` Tejun Heo
@ 2009-09-22  7:40   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-22  7:40 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Tony Luck, Fenghua Yu, linux-ia64

cpu0 used a special percpu area reserved by the linker, __cpu0_per_cpu,
which is set up early in boot by head.S.  However, this doesn't
guarantee that the area will be on the same node as cpu0, and the
percpu area for cpu0 ends up very far away from the percpu areas for
other cpus, which causes problems for the congruent percpu allocator.

This patch makes percpu area initialization allocate the percpu area
for cpu0 like it does for any other cpu and copy it from __cpu0_per_cpu,
which now resides in the __init area.  This means that for cpu0 the
percpu area is first set up at __cpu0_per_cpu early by head.S and then
moved to an area in the linear mapping during memory initialization, so
it is not allowed to take a pointer to a percpu variable between head.S
and memory initialization.
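
To illustrate the restriction (hypothetical variable and function, not
part of this patch):

  DEFINE_PER_CPU(unsigned long, boot_stat);

  void __init too_early(void)
  {
  	/*
  	 * Between head.S and per_cpu_init()/per_cpu_node_setup() this
  	 * resolves into the temporary __cpu0_per_cpu copy in the
  	 * __init area and goes stale once cpu0's percpu area is moved
  	 * into the linear mapping, so pointers like this must not be
  	 * taken or cached in this window.
  	 */
  	unsigned long *p = &per_cpu(boot_stat, 0);
  }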

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
---
 arch/ia64/kernel/vmlinux.lds.S |   11 +++++----
 arch/ia64/mm/contig.c          |   47 +++++++++++++++++++++++++--------------
 arch/ia64/mm/discontig.c       |   35 ++++++++++++++++++++---------
 3 files changed, 60 insertions(+), 33 deletions(-)

diff --git a/arch/ia64/kernel/vmlinux.lds.S b/arch/ia64/kernel/vmlinux.lds.S
index 0a0c77b..1295ba3 100644
--- a/arch/ia64/kernel/vmlinux.lds.S
+++ b/arch/ia64/kernel/vmlinux.lds.S
@@ -166,6 +166,12 @@ SECTIONS
 	}
 #endif
 
+#ifdef	CONFIG_SMP
+  . = ALIGN(PERCPU_PAGE_SIZE);
+  __cpu0_per_cpu = .;
+  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
+#endif
+
   . = ALIGN(PAGE_SIZE);
   __init_end = .;
 
@@ -198,11 +204,6 @@ SECTIONS
   data : { } :data
   .data : AT(ADDR(.data) - LOAD_OFFSET)
 	{
-#ifdef	CONFIG_SMP
-  . = ALIGN(PERCPU_PAGE_SIZE);
-		__cpu0_per_cpu = .;
-  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
-#endif
 		INIT_TASK_DATA(PAGE_SIZE)
 		CACHELINE_ALIGNED_DATA(SMP_CACHE_BYTES)
 		READ_MOSTLY_DATA(SMP_CACHE_BYTES)
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 2f724d2..9493bbf 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -154,36 +154,49 @@ static void *cpu_data;
 void * __cpuinit
 per_cpu_init (void)
 {
-	int cpu;
-	static int first_time=1;
+	static bool first_time = true;
+	void *cpu0_data = __cpu0_per_cpu;
+	unsigned int cpu;
+
+	if (!first_time)
+		goto skip;
+	first_time = false;
 
 	/*
 	 * get_free_pages() cannot be used before cpu_init() done.  BSP
 	 * allocates "NR_CPUS" pages for all CPUs to avoid that AP calls
 	 * get_zeroed_page().
 	 */
-	if (first_time) {
-		void *cpu0_data = __cpu0_per_cpu;
+	for (cpu = 0; cpu < NR_CPUS; cpu++) {
+		void *src = cpu == 0 ? cpu0_data : __phys_per_cpu_start;
 
-		first_time=0;
+		memcpy(cpu_data, src, __per_cpu_end - __per_cpu_start);
+		__per_cpu_offset[cpu] = (char *)cpu_data - __per_cpu_start;
+		per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];
 
-		__per_cpu_offset[0] = (char *) cpu0_data - __per_cpu_start;
-		per_cpu(local_per_cpu_offset, 0) = __per_cpu_offset[0];
+		/*
+		 * percpu area for cpu0 is moved from the __init area
+		 * which is setup by head.S and used till this point.
+		 * Update ar.k3.  This move ensures that percpu
+		 * area for cpu0 is on the correct node and its
+		 * virtual address isn't insanely far from other
+		 * percpu areas which is important for congruent
+		 * percpu allocator.
+		 */
+		if (cpu == 0)
+			ia64_set_kr(IA64_KR_PER_CPU_DATA, __pa(cpu_data) -
+				    (unsigned long)__per_cpu_start);
 
-		for (cpu = 1; cpu < NR_CPUS; cpu++) {
-			memcpy(cpu_data, __phys_per_cpu_start, __per_cpu_end - __per_cpu_start);
-			__per_cpu_offset[cpu] = (char *) cpu_data - __per_cpu_start;
-			cpu_data += PERCPU_PAGE_SIZE;
-			per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];
-		}
+		cpu_data += PERCPU_PAGE_SIZE;
 	}
+skip:
 	return __per_cpu_start + __per_cpu_offset[smp_processor_id()];
 }
 
 static inline void
 alloc_per_cpu_data(void)
 {
-	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS-1,
+	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS,
 				   PERCPU_PAGE_SIZE, __pa(MAX_DMA_ADDRESS));
 }
 #else
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index d85ba98..35a61ec 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -143,17 +143,30 @@ static void *per_cpu_node_setup(void *cpu_data, int node)
 	int cpu;
 
 	for_each_possible_early_cpu(cpu) {
-		if (cpu == 0) {
-			void *cpu0_data = __cpu0_per_cpu;
-			__per_cpu_offset[cpu] = (char*)cpu0_data -
-				__per_cpu_start;
-		} else if (node == node_cpuid[cpu].nid) {
-			memcpy(__va(cpu_data), __phys_per_cpu_start,
-			       __per_cpu_end - __per_cpu_start);
-			__per_cpu_offset[cpu] = (char*)__va(cpu_data) -
-				__per_cpu_start;
-			cpu_data += PERCPU_PAGE_SIZE;
-		}
+		void *src = cpu == 0 ? __cpu0_per_cpu : __phys_per_cpu_start;
+
+		if (node != node_cpuid[cpu].nid)
+			continue;
+
+		memcpy(__va(cpu_data), src, __per_cpu_end - __per_cpu_start);
+		__per_cpu_offset[cpu] = (char *)__va(cpu_data) -
+			__per_cpu_start;
+
+		/*
+		 * percpu area for cpu0 is moved from the __init area
+		 * which is setup by head.S and used till this point.
+		 * Update ar.k3.  This move ensures that percpu
+		 * area for cpu0 is on the correct node and its
+		 * virtual address isn't insanely far from other
+		 * percpu areas which is important for congruent
+		 * percpu allocator.
+		 */
+		if (cpu == 0)
+			ia64_set_kr(IA64_KR_PER_CPU_DATA,
+				    (unsigned long)cpu_data -
+				    (unsigned long)__per_cpu_start);
+
+		cpu_data += PERCPU_PAGE_SIZE;
 	}
 #endif
 	return cpu_data;
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 3/4] ia64: convert to dynamic percpu allocator
  2009-09-22  7:40 ` Tejun Heo
@ 2009-09-22  7:40   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-22  7:40 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Tony Luck, Fenghua Yu, linux-ia64

Unlike other archs, ia64 reserves space for percpu areas during early
memory initialization.  These areas occupy a contiguous region indexed
by cpu number on the contiguous memory model, or are grouped by node on
the discontiguous memory model.

As allocation and initialization are done by the arch code, all that
setup_per_cpu_areas() needs to do is communicate the determined layout
to the percpu allocator.  This patch implements setup_per_cpu_areas()
for both the contig and discontig memory models and drops
HAVE_LEGACY_PER_CPU_AREA.

Please note that for the contig model, the allocation itself is
modified only to allocate for possible cpus instead of NR_CPUS.  As the
dynamic percpu allocator can handle non-direct mappings, there's no
reason to allocate memory for cpus which aren't possible.
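
As an illustration of the discontig grouping done below (a hypothetical
two-node, four-cpu layout, not taken from any real machine):

  /* node 0 holds cpus 0 and 2, node 1 holds cpus 1 and 3 */
  unsigned int cpu_map[] = { 0, 2, 1, 3 };	/* units grouped by node */

  /*
   * Two groups result, one per node:
   *   group 0: units 0-1 (cpus 0 and 2), base_offset from node 0's area
   *   group 1: units 2-3 (cpus 1 and 3), base_offset from node 1's area
   */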

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
---
 arch/ia64/Kconfig        |    3 --
 arch/ia64/kernel/setup.c |   12 ------
 arch/ia64/mm/contig.c    |   50 ++++++++++++++++++++++++---
 arch/ia64/mm/discontig.c |   85 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 130 insertions(+), 20 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 011a1cd..e624611 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -89,9 +89,6 @@ config GENERIC_TIME_VSYSCALL
 	bool
 	default y
 
-config HAVE_LEGACY_PER_CPU_AREA
-	def_bool y
-
 config HAVE_SETUP_PER_CPU_AREA
 	def_bool y
 
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 1de86c9..42f8a18 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -856,18 +856,6 @@ identify_cpu (struct cpuinfo_ia64 *c)
 }
 
 /*
- * In UP configuration, setup_per_cpu_areas() is defined in
- * include/linux/percpu.h
- */
-#ifdef CONFIG_SMP
-void __init
-setup_per_cpu_areas (void)
-{
-	/* start_kernel() requires this... */
-}
-#endif
-
-/*
  * Do the following calculations:
  *
  * 1. the max. cache line size.
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 9493bbf..c86ce4f 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -163,11 +163,11 @@ per_cpu_init (void)
 	first_time = false;
 
 	/*
-	 * get_free_pages() cannot be used before cpu_init() done.  BSP
-	 * allocates "NR_CPUS" pages for all CPUs to avoid that AP calls
-	 * get_zeroed_page().
+	 * get_free_pages() cannot be used before cpu_init() done.
+	 * BSP allocates PERCPU_PAGE_SIZE bytes for all possible CPUs
+	 * to avoid that AP calls get_zeroed_page().
 	 */
-	for (cpu = 0; cpu < NR_CPUS; cpu++) {
+	for_each_possible_cpu(cpu) {
 		void *src = cpu == 0 ? cpu0_data : __phys_per_cpu_start;
 
 		memcpy(cpu_data, src, __per_cpu_end - __per_cpu_start);
@@ -196,9 +196,49 @@ skip:
 static inline void
 alloc_per_cpu_data(void)
 {
-	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS,
+	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * num_possible_cpus(),
 				   PERCPU_PAGE_SIZE, __pa(MAX_DMA_ADDRESS));
 }
+
+/**
+ * setup_per_cpu_areas - setup percpu areas
+ *
+ * Arch code has already allocated and initialized percpu areas.  All
+ * this function has to do is to teach the determined layout to the
+ * dynamic percpu allocator, which happens to be more complex than
+ * creating whole new ones using helpers.
+ */
+void __init
+setup_per_cpu_areas(void)
+{
+	struct pcpu_alloc_info *ai;
+	struct pcpu_group_info *gi;
+	unsigned int cpu;
+	int rc;
+
+	ai = pcpu_alloc_alloc_info(1, num_possible_cpus());
+	if (!ai)
+		panic("failed to allocate pcpu_alloc_info");
+	gi = &ai->groups[0];
+
+	/* units are assigned consecutively to possible cpus */
+	for_each_possible_cpu(cpu)
+		gi->cpu_map[gi->nr_units++] = cpu;
+
+	/* set parameters */
+	ai->static_size		= __per_cpu_end - __per_cpu_start;
+	ai->reserved_size	= PERCPU_MODULE_RESERVE;
+	ai->dyn_size		= PERCPU_DYNAMIC_RESERVE;
+	ai->unit_size		= PERCPU_PAGE_SIZE;
+	ai->atom_size		= PAGE_SIZE;
+	ai->alloc_size		= PERCPU_PAGE_SIZE;
+
+	rc = pcpu_setup_first_chunk(ai, __per_cpu_start + __per_cpu_offset[0]);
+	if (rc)
+		panic("failed to setup percpu area (err=%d)", rc);
+
+	pcpu_free_alloc_info(ai);
+}
 #else
 #define alloc_per_cpu_data() do { } while (0)
 #endif /* CONFIG_SMP */
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index 35a61ec..69e9e91 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -172,6 +172,91 @@ static void *per_cpu_node_setup(void *cpu_data, int node)
 	return cpu_data;
 }
 
+#ifdef CONFIG_SMP
+/**
+ * setup_per_cpu_areas - setup percpu areas
+ *
+ * Arch code has already allocated and initialized percpu areas.  All
+ * this function has to do is to teach the determined layout to the
+ * dynamic percpu allocator, which happens to be more complex than
+ * creating whole new ones using helpers.
+ */
+void __init setup_per_cpu_areas(void)
+{
+	struct pcpu_alloc_info *ai;
+	struct pcpu_group_info *uninitialized_var(gi);
+	unsigned int *cpu_map;
+	void *base;
+	unsigned long base_offset;
+	unsigned int cpu;
+	ssize_t static_size, reserved_size, dyn_size;
+	int node, prev_node, unit, nr_units, rc;
+
+	ai = pcpu_alloc_alloc_info(MAX_NUMNODES, nr_cpu_ids);
+	if (!ai)
+		panic("failed to allocate pcpu_alloc_info");
+	cpu_map = ai->groups[0].cpu_map;
+
+	/* determine base */
+	base = (void *)ULONG_MAX;
+	for_each_possible_cpu(cpu)
+		base = min(base,
+			   (void *)(__per_cpu_offset[cpu] + __per_cpu_start));
+	base_offset = (void *)__per_cpu_start - base;
+
+	/* build cpu_map, units are grouped by node */
+	unit = 0;
+	for_each_node(node)
+		for_each_possible_cpu(cpu)
+			if (node == node_cpuid[cpu].nid)
+				cpu_map[unit++] = cpu;
+	nr_units = unit;
+
+	/* set basic parameters */
+	static_size = __per_cpu_end - __per_cpu_start;
+	reserved_size = PERCPU_MODULE_RESERVE;
+	dyn_size = PERCPU_PAGE_SIZE - static_size - reserved_size;
+	if (dyn_size < 0)
+		panic("percpu area overflow static=%zd reserved=%zd\n",
+		      static_size, reserved_size);
+
+	ai->static_size		= static_size;
+	ai->reserved_size	= reserved_size;
+	ai->dyn_size		= dyn_size;
+	ai->unit_size		= PERCPU_PAGE_SIZE;
+	ai->atom_size		= PAGE_SIZE;
+	ai->alloc_size		= PERCPU_PAGE_SIZE;
+
+	/*
+	 * CPUs are put into groups according to node.  Walk cpu_map
+	 * and create new groups at node boundaries.
+	 */
+	prev_node = -1;
+	ai->nr_groups = 0;
+	for (unit = 0; unit < nr_units; unit++) {
+		cpu = cpu_map[unit];
+		node = node_cpuid[cpu].nid;
+
+		if (node == prev_node) {
+			gi->nr_units++;
+			continue;
+		}
+		prev_node = node;
+
+		gi = &ai->groups[ai->nr_groups++];
+		gi->nr_units		= 1;
+		gi->base_offset		= __per_cpu_offset[cpu] + base_offset;
+		gi->cpu_map		= &cpu_map[unit];
+	}
+
+	rc = pcpu_setup_first_chunk(ai, base);
+	if (rc)
+		panic("failed to setup percpu area (err=%d)", rc);
+
+	pcpu_free_alloc_info(ai);
+}
+#endif
+
 /**
  * fill_pernode - initialize pernode data.
  * @node: the node id.
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 4/4] percpu: kill legacy percpu allocator
  2009-09-22  7:40 ` Tejun Heo
@ 2009-09-22  7:40   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-22  7:40 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Ingo Molnar, Rusty Russell, Christoph Lameter

With ia64 converted, there's no arch left which still uses the legacy
percpu allocator.  Kill it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Lameter <cl@linux-foundation.org>
---
 include/linux/percpu.h |   24 -------
 kernel/module.c        |  150 ----------------------------------------
 mm/Makefile            |    4 -
 mm/allocpercpu.c       |  177 ------------------------------------------------
 mm/percpu.c            |    2 -
 5 files changed, 0 insertions(+), 357 deletions(-)
 delete mode 100644 mm/allocpercpu.c

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 878836c..5baf5b8 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -34,8 +34,6 @@
 
 #ifdef CONFIG_SMP
 
-#ifndef CONFIG_HAVE_LEGACY_PER_CPU_AREA
-
 /* minimum unit size, also is the maximum supported allocation size */
 #define PCPU_MIN_UNIT_SIZE		PFN_ALIGN(64 << 10)
 
@@ -130,28 +128,6 @@ extern int __init pcpu_page_first_chunk(size_t reserved_size,
 #define per_cpu_ptr(ptr, cpu)	SHIFT_PERCPU_PTR((ptr), per_cpu_offset((cpu)))
 
 extern void *__alloc_reserved_percpu(size_t size, size_t align);
-
-#else /* CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
-struct percpu_data {
-	void *ptrs[1];
-};
-
-/* pointer disguising messes up the kmemleak objects tracking */
-#ifndef CONFIG_DEBUG_KMEMLEAK
-#define __percpu_disguise(pdata) (struct percpu_data *)~(unsigned long)(pdata)
-#else
-#define __percpu_disguise(pdata) (struct percpu_data *)(pdata)
-#endif
-
-#define per_cpu_ptr(ptr, cpu)						\
-({									\
-        struct percpu_data *__p = __percpu_disguise(ptr);		\
-        (__typeof__(ptr))__p->ptrs[(cpu)];				\
-})
-
-#endif /* CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
 extern void *__alloc_percpu(size_t size, size_t align);
 extern void free_percpu(void *__pdata);
 
diff --git a/kernel/module.c b/kernel/module.c
index b6ee424..bac3fe8 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -369,8 +369,6 @@ EXPORT_SYMBOL_GPL(find_module);
 
 #ifdef CONFIG_SMP
 
-#ifndef CONFIG_HAVE_LEGACY_PER_CPU_AREA
-
 static void *percpu_modalloc(unsigned long size, unsigned long align,
 			     const char *name)
 {
@@ -394,154 +392,6 @@ static void percpu_modfree(void *freeme)
 	free_percpu(freeme);
 }
 
-#else /* ... CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
-/* Number of blocks used and allocated. */
-static unsigned int pcpu_num_used, pcpu_num_allocated;
-/* Size of each block.  -ve means used. */
-static int *pcpu_size;
-
-static int split_block(unsigned int i, unsigned short size)
-{
-	/* Reallocation required? */
-	if (pcpu_num_used + 1 > pcpu_num_allocated) {
-		int *new;
-
-		new = krealloc(pcpu_size, sizeof(new[0])*pcpu_num_allocated*2,
-			       GFP_KERNEL);
-		if (!new)
-			return 0;
-
-		pcpu_num_allocated *= 2;
-		pcpu_size = new;
-	}
-
-	/* Insert a new subblock */
-	memmove(&pcpu_size[i+1], &pcpu_size[i],
-		sizeof(pcpu_size[0]) * (pcpu_num_used - i));
-	pcpu_num_used++;
-
-	pcpu_size[i+1] -= size;
-	pcpu_size[i] = size;
-	return 1;
-}
-
-static inline unsigned int block_size(int val)
-{
-	if (val < 0)
-		return -val;
-	return val;
-}
-
-static void *percpu_modalloc(unsigned long size, unsigned long align,
-			     const char *name)
-{
-	unsigned long extra;
-	unsigned int i;
-	void *ptr;
-	int cpu;
-
-	if (align > PAGE_SIZE) {
-		printk(KERN_WARNING "%s: per-cpu alignment %li > %li\n",
-		       name, align, PAGE_SIZE);
-		align = PAGE_SIZE;
-	}
-
-	ptr = __per_cpu_start;
-	for (i = 0; i < pcpu_num_used; ptr += block_size(pcpu_size[i]), i++) {
-		/* Extra for alignment requirement. */
-		extra = ALIGN((unsigned long)ptr, align) - (unsigned long)ptr;
-		BUG_ON(i == 0 && extra != 0);
-
-		if (pcpu_size[i] < 0 || pcpu_size[i] < extra + size)
-			continue;
-
-		/* Transfer extra to previous block. */
-		if (pcpu_size[i-1] < 0)
-			pcpu_size[i-1] -= extra;
-		else
-			pcpu_size[i-1] += extra;
-		pcpu_size[i] -= extra;
-		ptr += extra;
-
-		/* Split block if warranted */
-		if (pcpu_size[i] - size > sizeof(unsigned long))
-			if (!split_block(i, size))
-				return NULL;
-
-		/* add the per-cpu scanning areas */
-		for_each_possible_cpu(cpu)
-			kmemleak_alloc(ptr + per_cpu_offset(cpu), size, 0,
-				       GFP_KERNEL);
-
-		/* Mark allocated */
-		pcpu_size[i] = -pcpu_size[i];
-		return ptr;
-	}
-
-	printk(KERN_WARNING "Could not allocate %lu bytes percpu data\n",
-	       size);
-	return NULL;
-}
-
-static void percpu_modfree(void *freeme)
-{
-	unsigned int i;
-	void *ptr = __per_cpu_start + block_size(pcpu_size[0]);
-	int cpu;
-
-	/* First entry is core kernel percpu data. */
-	for (i = 1; i < pcpu_num_used; ptr += block_size(pcpu_size[i]), i++) {
-		if (ptr == freeme) {
-			pcpu_size[i] = -pcpu_size[i];
-			goto free;
-		}
-	}
-	BUG();
-
- free:
-	/* remove the per-cpu scanning areas */
-	for_each_possible_cpu(cpu)
-		kmemleak_free(freeme + per_cpu_offset(cpu));
-
-	/* Merge with previous? */
-	if (pcpu_size[i-1] >= 0) {
-		pcpu_size[i-1] += pcpu_size[i];
-		pcpu_num_used--;
-		memmove(&pcpu_size[i], &pcpu_size[i+1],
-			(pcpu_num_used - i) * sizeof(pcpu_size[0]));
-		i--;
-	}
-	/* Merge with next? */
-	if (i+1 < pcpu_num_used && pcpu_size[i+1] >= 0) {
-		pcpu_size[i] += pcpu_size[i+1];
-		pcpu_num_used--;
-		memmove(&pcpu_size[i+1], &pcpu_size[i+2],
-			(pcpu_num_used - (i+1)) * sizeof(pcpu_size[0]));
-	}
-}
-
-static int percpu_modinit(void)
-{
-	pcpu_num_used = 2;
-	pcpu_num_allocated = 2;
-	pcpu_size = kmalloc(sizeof(pcpu_size[0]) * pcpu_num_allocated,
-			    GFP_KERNEL);
-	/* Static in-kernel percpu data (used). */
-	pcpu_size[0] = -(__per_cpu_end-__per_cpu_start);
-	/* Free room. */
-	pcpu_size[1] = PERCPU_ENOUGH_ROOM + pcpu_size[0];
-	if (pcpu_size[1] < 0) {
-		printk(KERN_ERR "No per-cpu room for modules.\n");
-		pcpu_num_used = 1;
-	}
-
-	return 0;
-}
-__initcall(percpu_modinit);
-
-#endif /* CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
 static unsigned int find_pcpusec(Elf_Ehdr *hdr,
 				 Elf_Shdr *sechdrs,
 				 const char *secstrings)
diff --git a/mm/Makefile b/mm/Makefile
index ea4b18b..1195920 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -33,11 +33,7 @@ obj-$(CONFIG_FAILSLAB) += failslab.o
 obj-$(CONFIG_MEMORY_HOTPLUG) += memory_hotplug.o
 obj-$(CONFIG_FS_XIP) += filemap_xip.o
 obj-$(CONFIG_MIGRATION) += migrate.o
-ifndef CONFIG_HAVE_LEGACY_PER_CPU_AREA
 obj-$(CONFIG_SMP) += percpu.o
-else
-obj-$(CONFIG_SMP) += allocpercpu.o
-endif
 obj-$(CONFIG_QUICKLIST) += quicklist.o
 obj-$(CONFIG_CGROUP_MEM_RES_CTLR) += memcontrol.o page_cgroup.o
 obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
diff --git a/mm/allocpercpu.c b/mm/allocpercpu.c
deleted file mode 100644
index df34cea..0000000
--- a/mm/allocpercpu.c
+++ /dev/null
@@ -1,177 +0,0 @@
-/*
- * linux/mm/allocpercpu.c
- *
- * Separated from slab.c August 11, 2006 Christoph Lameter
- */
-#include <linux/mm.h>
-#include <linux/module.h>
-#include <linux/bootmem.h>
-#include <asm/sections.h>
-
-#ifndef cache_line_size
-#define cache_line_size()	L1_CACHE_BYTES
-#endif
-
-/**
- * percpu_depopulate - depopulate per-cpu data for given cpu
- * @__pdata: per-cpu data to depopulate
- * @cpu: depopulate per-cpu data for this cpu
- *
- * Depopulating per-cpu data for a cpu going offline would be a typical
- * use case. You need to register a cpu hotplug handler for that purpose.
- */
-static void percpu_depopulate(void *__pdata, int cpu)
-{
-	struct percpu_data *pdata = __percpu_disguise(__pdata);
-
-	kfree(pdata->ptrs[cpu]);
-	pdata->ptrs[cpu] = NULL;
-}
-
-/**
- * percpu_depopulate_mask - depopulate per-cpu data for some cpu's
- * @__pdata: per-cpu data to depopulate
- * @mask: depopulate per-cpu data for cpu's selected through mask bits
- */
-static void __percpu_depopulate_mask(void *__pdata, const cpumask_t *mask)
-{
-	int cpu;
-	for_each_cpu_mask_nr(cpu, *mask)
-		percpu_depopulate(__pdata, cpu);
-}
-
-#define percpu_depopulate_mask(__pdata, mask) \
-	__percpu_depopulate_mask((__pdata), &(mask))
-
-/**
- * percpu_populate - populate per-cpu data for given cpu
- * @__pdata: per-cpu data to populate further
- * @size: size of per-cpu object
- * @gfp: may sleep or not etc.
- * @cpu: populate per-data for this cpu
- *
- * Populating per-cpu data for a cpu coming online would be a typical
- * use case. You need to register a cpu hotplug handler for that purpose.
- * Per-cpu object is populated with zeroed buffer.
- */
-static void *percpu_populate(void *__pdata, size_t size, gfp_t gfp, int cpu)
-{
-	struct percpu_data *pdata = __percpu_disguise(__pdata);
-	int node = cpu_to_node(cpu);
-
-	/*
-	 * We should make sure each CPU gets private memory.
-	 */
-	size = roundup(size, cache_line_size());
-
-	BUG_ON(pdata->ptrs[cpu]);
-	if (node_online(node))
-		pdata->ptrs[cpu] = kmalloc_node(size, gfp|__GFP_ZERO, node);
-	else
-		pdata->ptrs[cpu] = kzalloc(size, gfp);
-	return pdata->ptrs[cpu];
-}
-
-/**
- * percpu_populate_mask - populate per-cpu data for more cpu's
- * @__pdata: per-cpu data to populate further
- * @size: size of per-cpu object
- * @gfp: may sleep or not etc.
- * @mask: populate per-cpu data for cpu's selected through mask bits
- *
- * Per-cpu objects are populated with zeroed buffers.
- */
-static int __percpu_populate_mask(void *__pdata, size_t size, gfp_t gfp,
-				  cpumask_t *mask)
-{
-	cpumask_t populated;
-	int cpu;
-
-	cpus_clear(populated);
-	for_each_cpu_mask_nr(cpu, *mask)
-		if (unlikely(!percpu_populate(__pdata, size, gfp, cpu))) {
-			__percpu_depopulate_mask(__pdata, &populated);
-			return -ENOMEM;
-		} else
-			cpu_set(cpu, populated);
-	return 0;
-}
-
-#define percpu_populate_mask(__pdata, size, gfp, mask) \
-	__percpu_populate_mask((__pdata), (size), (gfp), &(mask))
-
-/**
- * alloc_percpu - initial setup of per-cpu data
- * @size: size of per-cpu object
- * @align: alignment
- *
- * Allocate dynamic percpu area.  Percpu objects are populated with
- * zeroed buffers.
- */
-void *__alloc_percpu(size_t size, size_t align)
-{
-	/*
-	 * We allocate whole cache lines to avoid false sharing
-	 */
-	size_t sz = roundup(nr_cpu_ids * sizeof(void *), cache_line_size());
-	void *pdata = kzalloc(sz, GFP_KERNEL);
-	void *__pdata = __percpu_disguise(pdata);
-
-	/*
-	 * Can't easily make larger alignment work with kmalloc.  WARN
-	 * on it.  Larger alignment should only be used for module
-	 * percpu sections on SMP for which this path isn't used.
-	 */
-	WARN_ON_ONCE(align > SMP_CACHE_BYTES);
-
-	if (unlikely(!pdata))
-		return NULL;
-	if (likely(!__percpu_populate_mask(__pdata, size, GFP_KERNEL,
-					   &cpu_possible_map)))
-		return __pdata;
-	kfree(pdata);
-	return NULL;
-}
-EXPORT_SYMBOL_GPL(__alloc_percpu);
-
-/**
- * free_percpu - final cleanup of per-cpu data
- * @__pdata: object to clean up
- *
- * We simply clean up any per-cpu object left. No need for the client to
- * track and specify through a bis mask which per-cpu objects are to free.
- */
-void free_percpu(void *__pdata)
-{
-	if (unlikely(!__pdata))
-		return;
-	__percpu_depopulate_mask(__pdata, cpu_possible_mask);
-	kfree(__percpu_disguise(__pdata));
-}
-EXPORT_SYMBOL_GPL(free_percpu);
-
-/*
- * Generic percpu area setup.
- */
-#ifndef CONFIG_HAVE_SETUP_PER_CPU_AREA
-unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
-
-EXPORT_SYMBOL(__per_cpu_offset);
-
-void __init setup_per_cpu_areas(void)
-{
-	unsigned long size, i;
-	char *ptr;
-	unsigned long nr_possible_cpus = num_possible_cpus();
-
-	/* Copy section for each CPU (we discard the original) */
-	size = ALIGN(PERCPU_ENOUGH_ROOM, PAGE_SIZE);
-	ptr = alloc_bootmem_pages(size * nr_possible_cpus);
-
-	for_each_possible_cpu(i) {
-		__per_cpu_offset[i] = ptr - __per_cpu_start;
-		memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start);
-		ptr += size;
-	}
-}
-#endif /* CONFIG_HAVE_SETUP_PER_CPU_AREA */
diff --git a/mm/percpu.c b/mm/percpu.c
index 43d8cac..adbc5a4 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -46,8 +46,6 @@
  *
  * To use this allocator, arch code should do the followings.
  *
- * - drop CONFIG_HAVE_LEGACY_PER_CPU_AREA
- *
  * - define __addr_to_pcpu_ptr() and __pcpu_ptr_to_addr() to translate
  *   regular address to percpu pointer and back if they need to be
  *   different from the default
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

-	__percpu_populate_mask((__pdata), (size), (gfp), &(mask))
-
-/**
- * alloc_percpu - initial setup of per-cpu data
- * @size: size of per-cpu object
- * @align: alignment
- *
- * Allocate dynamic percpu area.  Percpu objects are populated with
- * zeroed buffers.
- */
-void *__alloc_percpu(size_t size, size_t align)
-{
-	/*
-	 * We allocate whole cache lines to avoid false sharing
-	 */
-	size_t sz = roundup(nr_cpu_ids * sizeof(void *), cache_line_size());
-	void *pdata = kzalloc(sz, GFP_KERNEL);
-	void *__pdata = __percpu_disguise(pdata);
-
-	/*
-	 * Can't easily make larger alignment work with kmalloc.  WARN
-	 * on it.  Larger alignment should only be used for module
-	 * percpu sections on SMP for which this path isn't used.
-	 */
-	WARN_ON_ONCE(align > SMP_CACHE_BYTES);
-
-	if (unlikely(!pdata))
-		return NULL;
-	if (likely(!__percpu_populate_mask(__pdata, size, GFP_KERNEL,
-					   &cpu_possible_map)))
-		return __pdata;
-	kfree(pdata);
-	return NULL;
-}
-EXPORT_SYMBOL_GPL(__alloc_percpu);
-
-/**
- * free_percpu - final cleanup of per-cpu data
- * @__pdata: object to clean up
- *
- * We simply clean up any per-cpu object left. No need for the client to
- * track and specify through a bis mask which per-cpu objects are to free.
- */
-void free_percpu(void *__pdata)
-{
-	if (unlikely(!__pdata))
-		return;
-	__percpu_depopulate_mask(__pdata, cpu_possible_mask);
-	kfree(__percpu_disguise(__pdata));
-}
-EXPORT_SYMBOL_GPL(free_percpu);
-
-/*
- * Generic percpu area setup.
- */
-#ifndef CONFIG_HAVE_SETUP_PER_CPU_AREA
-unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
-
-EXPORT_SYMBOL(__per_cpu_offset);
-
-void __init setup_per_cpu_areas(void)
-{
-	unsigned long size, i;
-	char *ptr;
-	unsigned long nr_possible_cpus = num_possible_cpus();
-
-	/* Copy section for each CPU (we discard the original) */
-	size = ALIGN(PERCPU_ENOUGH_ROOM, PAGE_SIZE);
-	ptr = alloc_bootmem_pages(size * nr_possible_cpus);
-
-	for_each_possible_cpu(i) {
-		__per_cpu_offset[i] = ptr - __per_cpu_start;
-		memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start);
-		ptr += size;
-	}
-}
-#endif /* CONFIG_HAVE_SETUP_PER_CPU_AREA */
diff --git a/mm/percpu.c b/mm/percpu.c
index 43d8cac..adbc5a4 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -46,8 +46,6 @@
  *
  * To use this allocator, arch code should do the followings.
  *
- * - drop CONFIG_HAVE_LEGACY_PER_CPU_AREA
- *
  * - define __addr_to_pcpu_ptr() and __pcpu_ptr_to_addr() to translate
  *   regular address to percpu pointer and back if they need to be
  *   different from the default
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one
  2009-09-22  7:40 ` Tejun Heo
@ 2009-09-22  8:16   ` Ingo Molnar
  -1 siblings, 0 replies; 72+ messages in thread
From: Ingo Molnar @ 2009-09-22  8:16 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel


* Tejun Heo <tj@kernel.org> wrote:

> Hello, all.
> 
> This patchset converts ia64 to dynamic percpu allocator and drop the
> now unused old percpu allocator.  This patchset contains the following
> four patches.
> 
>  0001-vmalloc-rename-local-variables-vmalloc_start-and-vma.patch
>  0002-ia64-allocate-percpu-area-for-cpu0-like-percpu-areas.patch
>  0003-ia64-convert-to-dynamic-percpu-allocator.patch
>  0004-percpu-kill-legacy-percpu-allocator.patch
> 
> 0001 is misc prep to avoid macro / local variable collision.  0002
> makes ia64 allocate percpu area for cpu0 in the same way it does for
> other cpus.  0003 converts ia64 to dynamic percpu allocator and 0004
> drops now unused legacy allocator.
> 
> Contig memory model was verified with ski emulator.  Discontig and
> sparse models were verified on a 4-way SGI altix machine.  I've run
> percpu stress test module for quite a while on the machine.
> 
> Mike Travis, it would be great if you can test this on your machine.
> I'd really like to see how it would behave on a machine with that many
> NUMA nodes.
> 
> This patchset is available in the following git tree.
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git convert-ia64
> 
> Hmmm... kernel.org seems slow to sync today.  If the branch isn't
> mirroreed, please pull from the master.
> 
> Thanks.
> 
>  arch/ia64/Kconfig              |    3 
>  arch/ia64/kernel/setup.c       |   12 --
>  arch/ia64/kernel/vmlinux.lds.S |   11 +-
>  arch/ia64/mm/contig.c          |   87 ++++++++++++++++----
>  arch/ia64/mm/discontig.c       |  120 +++++++++++++++++++++++++--
>  include/linux/percpu.h         |   24 -----
>  kernel/module.c                |  150 ----------------------------------
>  mm/Makefile                    |    4 
>  mm/allocpercpu.c               |  177 -----------------------------------------
>  mm/percpu.c                    |    2 
>  mm/vmalloc.c                   |   16 +--
>  11 files changed, 193 insertions(+), 413 deletions(-)

Kudos, really nice stuff!

	Ingo

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic
@ 2009-09-22  8:16   ` Ingo Molnar
  0 siblings, 0 replies; 72+ messages in thread
From: Ingo Molnar @ 2009-09-22  8:16 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel


* Tejun Heo <tj@kernel.org> wrote:

> Hello, all.
> 
> This patchset converts ia64 to dynamic percpu allocator and drop the
> now unused old percpu allocator.  This patchset contains the following
> four patches.
> 
>  0001-vmalloc-rename-local-variables-vmalloc_start-and-vma.patch
>  0002-ia64-allocate-percpu-area-for-cpu0-like-percpu-areas.patch
>  0003-ia64-convert-to-dynamic-percpu-allocator.patch
>  0004-percpu-kill-legacy-percpu-allocator.patch
> 
> 0001 is misc prep to avoid macro / local variable collision.  0002
> makes ia64 allocate percpu area for cpu0 in the same way it does for
> other cpus.  0003 converts ia64 to dynamic percpu allocator and 0004
> drops now unused legacy allocator.
> 
> Contig memory model was verified with ski emulator.  Discontig and
> sparse models were verified on a 4-way SGI altix machine.  I've run
> percpu stress test module for quite a while on the machine.
> 
> Mike Travis, it would be great if you can test this on your machine.
> I'd really like to see how it would behave on a machine with that many
> NUMA nodes.
> 
> This patchset is available in the following git tree.
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git convert-ia64
> 
> Hmmm... kernel.org seems slow to sync today.  If the branch isn't
> mirroreed, please pull from the master.
> 
> Thanks.
> 
>  arch/ia64/Kconfig              |    3 
>  arch/ia64/kernel/setup.c       |   12 --
>  arch/ia64/kernel/vmlinux.lds.S |   11 +-
>  arch/ia64/mm/contig.c          |   87 ++++++++++++++++----
>  arch/ia64/mm/discontig.c       |  120 +++++++++++++++++++++++++--
>  include/linux/percpu.h         |   24 -----
>  kernel/module.c                |  150 ----------------------------------
>  mm/Makefile                    |    4 
>  mm/allocpercpu.c               |  177 -----------------------------------------
>  mm/percpu.c                    |    2 
>  mm/vmalloc.c                   |   16 +--
>  11 files changed, 193 insertions(+), 413 deletions(-)

Kudos, really nice stuff!

	Ingo

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one
  2009-09-22  8:16   ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Ingo Molnar
@ 2009-09-22 20:49     ` Luck, Tony
  -1 siblings, 0 replies; 72+ messages in thread
From: Luck, Tony @ 2009-09-22 20:49 UTC (permalink / raw)
  To: Ingo Molnar, Tejun Heo
  Cc: Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar, Rusty Russell,
	Christoph Lameter, linux-kernel

> Contig memory model was verified with ski emulator.  Discontig and
> sparse models were verified on a 4-way SGI altix machine.  I've run
> percpu stress test module for quite a while on the machine.

Ski must have missed something.  I just tried to boot this on a
"tiger_defconfig" kernel[1] and it panic'd early in boot.  I'll need
to re-connect my serial console to get the useful part of the
panic message ... what's on the VGA console isn't very helpful :-(

-Tony

[1] uses contig.c

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic
@ 2009-09-22 20:49     ` Luck, Tony
  0 siblings, 0 replies; 72+ messages in thread
From: Luck, Tony @ 2009-09-22 20:49 UTC (permalink / raw)
  To: Ingo Molnar, Tejun Heo
  Cc: Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar, Rusty Russell,
	Christoph Lameter, linux-kernel

> Contig memory model was verified with ski emulator.  Discontig and
> sparse models were verified on a 4-way SGI altix machine.  I've run
> percpu stress test module for quite a while on the machine.

Ski must have missed something.  I just tried to boot this on a
"tiger_defconfig" kernel[1] and it panic'd early in boot.  I'll need
to re-connect my serial console to get the useful part of the
panic message ... what's on the VGA console isn't very helpful :-(

-Tony

[1] uses contig.c

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one
  2009-09-22 20:49     ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Luck, Tony
@ 2009-09-22 21:10       ` Luck, Tony
  -1 siblings, 0 replies; 72+ messages in thread
From: Luck, Tony @ 2009-09-22 21:10 UTC (permalink / raw)
  To: Luck, Tony, Ingo Molnar, Tejun Heo
  Cc: Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar, Rusty Russell,
	Christoph Lameter, linux-kernel

> Ski must have missed something.  I just tried to boot this on a
> "tiger_defconfig" kernel[1] and it panic'd early in boot.  I'll need
> to re-connect my serial console to get the useful part of the
> panic message ... what's on the VGA console isn't very helpful :-(

Ok.  Here is the tail of the console log.  The instruction at the
faulting address is a "ld8 r3=[r14]" and r14 is indeed 0x0.

... nothing apparently odd leading up to here ...
ACPI: Core revision 20090521
Boot processor id 0x0/0xc618
Unable to handle kernel NULL pointer dereference (address 0000000000000000)
migration/0[3]: Oops 8813272891392 [1]
Modules linked in:

Pid: 3, CPU 0, comm:          migration/0
psr : 00001010085a2018 ifs : 800000000000050e ip  : [<a00000010006a470>]    Not tainted (2.6.31-tiger-smp)
ip is at __wake_up_common+0xb0/0x120
unat: 0000000000000000 pfs : 000000000000030b rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr  : 0000000000002941
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0930ffff00090000 ssd : 0930ffff00090000
b0  : a00000010006c1a0 b6  : a000000100080f20 b7  : a00000010000bc20
f6  : 1003e0000000000000000 f7  : 1003e0000000000000002
f8  : 1003e00000000a0722fad f9  : 1003e000000000cf5bbc7
f10 : 1003e081f5d33de276e7b f11 : 1003e0000000000000000
r1  : a000000100da9de0 r2  : 00000000fffedfeb r3  : e0000001c0210230
r8  : 00000010085a6018 r9  : 0000000000000001 r10 : ffffffffffff7100
r11 : ffffffffffff7100 r12 : e0000001c021fe00 r13 : e0000001c0210000
r14 : 0000000000000000 r15 : ffffffffffffffe8 r16 : a000000100bc4708
r17 : a000000100bcbe30 r18 : 0000000000000000 r19 : e0000001c0210be4
r20 : 0000000000000001 r21 : 0000000000000001 r22 : 0000000000000000
r23 : e0000001c0210038 r24 : e0000001c0210000 r25 : e000000180007090
r26 : a000000100076000 r27 : 00000010085a6018 r28 : 00000000ffffffff
r29 : e0000001c021001c r30 : 0000000000000000 r31 : ffffffffffff7120

Call Trace:
 [<a000000100015a30>] show_stack+0x50/0xa0
                                sp=e0000001c021f9d0 bsp=e0000001c0210f00
 [<a0000001000162a0>] show_regs+0x820/0x860
                                sp=e0000001c021fba0 bsp=e0000001c0210ea8
 [<a00000010003abc0>] die+0x1a0/0x2c0
                                sp=e0000001c021fba0 bsp=e0000001c0210e68
 [<a0000001000645d0>] ia64_do_page_fault+0x8b0/0x9e0
                                sp=e0000001c021fba0 bsp=e0000001c0210e18
 [<a00000010000c420>] ia64_native_leave_kernel+0x0/0x270
                                sp=e0000001c021fc30 bsp=e0000001c0210e18
 [<a00000010006a470>] __wake_up_common+0xb0/0x120
                                sp=e0000001c021fe00 bsp=e0000001c0210da0
 [<a00000010006c1a0>] complete+0x60/0xa0
                                sp=e0000001c021fe00 bsp=e0000001c0210d70
 [<a0000001000815a0>] migration_thread+0x680/0x700
                                sp=e0000001c021fe00 bsp=e0000001c0210ca0
 [<a0000001000b9630>] kthread+0x110/0x140
                                sp=e0000001c021fe00 bsp=e0000001c0210c68
 [<a000000100013cf0>] kernel_thread_helper+0x30/0x60
                                sp=e0000001c021fe30 bsp=e0000001c0210c40
 [<a00000010000a0c0>] start_kernel_thread+0x20/0x40
                                sp=e0000001c021fe30 bsp=e0000001c0210c40


^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic
@ 2009-09-22 21:10       ` Luck, Tony
  0 siblings, 0 replies; 72+ messages in thread
From: Luck, Tony @ 2009-09-22 21:10 UTC (permalink / raw)
  To: Luck, Tony, Ingo Molnar, Tejun Heo
  Cc: Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar, Rusty Russell,
	Christoph Lameter, linux-kernel

> Ski must have missed something.  I just tried to boot this on a
> "tiger_defconfig" kernel[1] and it panic'd early in boot.  I'll need
> to re-connect my serial console to get the useful part of the
> panic message ... what's on the VGA console isn't very helpful :-(

Ok.  Here is the tail of the console log.  The instruction at the
faulting address is a "ld8 r3=[r14]" and r14 is indeed 0x0.

... nothing apparently odd leading up to here ...
ACPI: Core revision 20090521
Boot processor id 0x0/0xc618
Unable to handle kernel NULL pointer dereference (address 0000000000000000)
migration/0[3]: Oops 8813272891392 [1]
Modules linked in:

Pid: 3, CPU 0, comm:          migration/0
psr : 00001010085a2018 ifs : 800000000000050e ip  : [<a00000010006a470>]    Not tainted (2.6.31-tiger-smp)
ip is at __wake_up_common+0xb0/0x120
unat: 0000000000000000 pfs : 000000000000030b rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr  : 0000000000002941
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0930ffff00090000 ssd : 0930ffff00090000
b0  : a00000010006c1a0 b6  : a000000100080f20 b7  : a00000010000bc20
f6  : 1003e0000000000000000 f7  : 1003e0000000000000002
f8  : 1003e00000000a0722fad f9  : 1003e000000000cf5bbc7
f10 : 1003e081f5d33de276e7b f11 : 1003e0000000000000000
r1  : a000000100da9de0 r2  : 00000000fffedfeb r3  : e0000001c0210230
r8  : 00000010085a6018 r9  : 0000000000000001 r10 : ffffffffffff7100
r11 : ffffffffffff7100 r12 : e0000001c021fe00 r13 : e0000001c0210000
r14 : 0000000000000000 r15 : ffffffffffffffe8 r16 : a000000100bc4708
r17 : a000000100bcbe30 r18 : 0000000000000000 r19 : e0000001c0210be4
r20 : 0000000000000001 r21 : 0000000000000001 r22 : 0000000000000000
r23 : e0000001c0210038 r24 : e0000001c0210000 r25 : e000000180007090
r26 : a000000100076000 r27 : 00000010085a6018 r28 : 00000000ffffffff
r29 : e0000001c021001c r30 : 0000000000000000 r31 : ffffffffffff7120

Call Trace:
 [<a000000100015a30>] show_stack+0x50/0xa0
                                sp=e0000001c021f9d0 bsp=e0000001c0210f00
 [<a0000001000162a0>] show_regs+0x820/0x860
                                sp=e0000001c021fba0 bsp=e0000001c0210ea8
 [<a00000010003abc0>] die+0x1a0/0x2c0
                                sp=e0000001c021fba0 bsp=e0000001c0210e68
 [<a0000001000645d0>] ia64_do_page_fault+0x8b0/0x9e0
                                sp=e0000001c021fba0 bsp=e0000001c0210e18
 [<a00000010000c420>] ia64_native_leave_kernel+0x0/0x270
                                sp=e0000001c021fc30 bsp=e0000001c0210e18
 [<a00000010006a470>] __wake_up_common+0xb0/0x120
                                sp=e0000001c021fe00 bsp=e0000001c0210da0
 [<a00000010006c1a0>] complete+0x60/0xa0
                                sp=e0000001c021fe00 bsp=e0000001c0210d70
 [<a0000001000815a0>] migration_thread+0x680/0x700
                                sp=e0000001c021fe00 bsp=e0000001c0210ca0
 [<a0000001000b9630>] kthread+0x110/0x140
                                sp=e0000001c021fe00 bsp=e0000001c0210c68
 [<a000000100013cf0>] kernel_thread_helper+0x30/0x60
                                sp=e0000001c021fe30 bsp=e0000001c0210c40
 [<a00000010000a0c0>] start_kernel_thread+0x20/0x40
                                sp=e0000001c021fe30 bsp=e0000001c0210c40


^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one
  2009-09-22 21:10       ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Luck, Tony
@ 2009-09-22 21:24         ` Luck, Tony
  -1 siblings, 0 replies; 72+ messages in thread
From: Luck, Tony @ 2009-09-22 21:24 UTC (permalink / raw)
  To: Luck, Tony, Ingo Molnar, Tejun Heo
  Cc: Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar, Rusty Russell,
	Christoph Lameter, linux-kernel

I just noticed the "#for-next" in the Subject line for these
patches.  Do they depend on some stuff in linux-next that is
not in Linus' tree (pulled today HEAD=7fa07729e...)?  If so, then
ignore my results.

Kernel built from generic_defconfig does boot OK though, so I suspect
this is a discontig vs. contig problem.

-Tony

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic
@ 2009-09-22 21:24         ` Luck, Tony
  0 siblings, 0 replies; 72+ messages in thread
From: Luck, Tony @ 2009-09-22 21:24 UTC (permalink / raw)
  To: Luck, Tony, Ingo Molnar, Tejun Heo
  Cc: Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar, Rusty Russell,
	Christoph Lameter, linux-kernel

I just noticed the "#for-next" in the Subject line for these
patches.  Do they depend on some stuff in linux-next that is
not in Linus' tree (pulled today HEAD=7fa07729e...)?  If so, then
ignore my results.

Kernel built from generic_defconfig does boot OK though, so I suspect
this is a discontig vs. contig problem.

-Tony

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one
  2009-09-22 21:24         ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Luck, Tony
@ 2009-09-22 21:50           ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-22 21:50 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Ingo Molnar, Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

Hello,

Luck, Tony wrote:
> I just noticed the "#for-next" in the Subject line for these
> patches.  Do they depend on some stuff in linux-next that is
> not in Linus' tree (pulled today HEAD=7fa07729e...)?  If so, then
> ignore my results.

Nope, it should work on top of Linus's tree.

> Kernel built from generic_defconfig does boot OK though, so I suspect
> this is a discontig vs. contig problem.

Yeah, it's probably something broken with contig SMP configuration.
I've just found a machine with contig mem and multiple processors.
Will test on it later today.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu
@ 2009-09-22 21:50           ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-22 21:50 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Ingo Molnar, Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

Hello,

Luck, Tony wrote:
> I just noticed the "#for-next" in the Subject line for these
> patches.  Do they depend on some stuff in linux-next that is
> not in Linus' tree (pulled today HEAD=7fa07729e...)?  If so, then
> ignore my results.

Nope, it should work on top of Linus's tree.

> Kernel built from generic_defconfig does boot OK though, so I suspect
> this is a discontig vs. contig problem.

Yeah, it's probably something broken with contig SMP configuration.
I've just found a machine with contig mem and multiple processors.
Will test on it later today.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 1/4] vmalloc: rename local variables vmalloc_start and vmalloc_end
  2009-09-22  7:40   ` Tejun Heo
@ 2009-09-22 22:52     ` Christoph Lameter
  -1 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-22 22:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Tue, 22 Sep 2009, Tejun Heo wrote:

>   const unsigned long vmalloc_end = vmalloc_end & ~(align - 1);
>
> Rename local variables vmalloc_start and vmalloc_end to vm_start and
> vm_end to avoid the collision.

Could you keep vmalloc_end and vmalloc_start? vm_start and vm_end may lead
to misinterpretation as the start and end of the memory area covered by the
VM.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 1/4] vmalloc: rename local variables vmalloc_start and
@ 2009-09-22 22:52     ` Christoph Lameter
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-22 22:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Tue, 22 Sep 2009, Tejun Heo wrote:

>   const unsigned long vmalloc_end = vmalloc_end & ~(align - 1);
>
> Rename local variables vmalloc_start and vmalloc_end to vm_start and
> vm_end to avoid the collision.

Could you keep vmalloc_end and vmalloc_start? vm_start and vm_end may lead
to misinterpretation as the start and end of the memory area covered by the
VM.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-22  7:40   ` Tejun Heo
@ 2009-09-22 22:59     ` Christoph Lameter
  -1 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-22 22:59 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Tue, 22 Sep 2009, Tejun Heo wrote:

>
> +#ifdef	CONFIG_SMP
> +  . = ALIGN(PERCPU_PAGE_SIZE);
> +  __cpu0_per_cpu = .;

__per_cpu_start?

> +  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
> +#endif

This is a statically sized per-cpu area that is used by __get_cpu_var().
Data is accessed via a cpu-specific memory mapping. How does this work when
the area grows beyond PERCPU_PAGE_SIZE? As far as I can see, it seems
that __get_cpu_var() would then cause a memory fault?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu
@ 2009-09-22 22:59     ` Christoph Lameter
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-22 22:59 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Tue, 22 Sep 2009, Tejun Heo wrote:

>
> +#ifdef	CONFIG_SMP
> +  . = ALIGN(PERCPU_PAGE_SIZE);
> +  __cpu0_per_cpu = .;

__per_cpu_start?

> +  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
> +#endif

This is a statically sized per-cpu area that is used by __get_cpu_var().
Data is accessed via a cpu-specific memory mapping. How does this work when
the area grows beyond PERCPU_PAGE_SIZE? As far as I can see, it seems
that __get_cpu_var() would then cause a memory fault?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 1/4] vmalloc: rename local variables vmalloc_start and vmalloc_end
  2009-09-22 22:52     ` [PATCH 1/4] vmalloc: rename local variables vmalloc_start and Christoph Lameter
@ 2009-09-23  2:08       ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  2:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

Christoph Lameter wrote:
> On Tue, 22 Sep 2009, Tejun Heo wrote:
> 
>>   const unsigned long vmalloc_end = vmalloc_end & ~(align - 1);
>>
>> Rename local variables vmalloc_start and vmalloc_end to vm_start and
>> vm_end to avoid the collision.
> 
> Could you keep vmalloc_end and vmalloc_start? vm_start and vm_end may led
> to misinterpretations as start and end of the memory area covered by the
> VM.

Hmmm... yeah, the right thing to do would be to either let ia64 use
VMALLOC_END directly as the variable name or have it alias an unlikely
symbol like ____vmalloc_end; a macro which ends up expanding to a
seemingly normal symbol like vmalloc_end is just rude.  I'll change the
ia64 part.
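
As a quick illustration of the hazard (a sketch of my own, not code from
the tree): with the old alias, any function that happens to declare its
own vmalloc_end silently shadows the global that the macro was meant to
reference.

/* Illustrative only -- the shadowing that dropping the alias avoids. */
#define VMALLOC_END	vmalloc_end		/* old ia64 definition */

unsigned long vmalloc_end;			/* real, boot-adjusted bound */

/* hypothetical helper, purely for illustration */
static unsigned long round_down_end(unsigned long align)
{
	unsigned long vmalloc_end = 0;		/* innocent local variable */

	/* Expands to the local vmalloc_end above, not the global one. */
	return VMALLOC_END & ~(align - 1);
}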

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 1/4] vmalloc: rename local variables vmalloc_start and
@ 2009-09-23  2:08       ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  2:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

Christoph Lameter wrote:
> On Tue, 22 Sep 2009, Tejun Heo wrote:
> 
>>   const unsigned long vmalloc_end = vmalloc_end & ~(align - 1);
>>
>> Rename local variables vmalloc_start and vmalloc_end to vm_start and
>> vm_end to avoid the collision.
> 
> Could you keep vmalloc_end and vmalloc_start? vm_start and vm_end may led
> to misinterpretations as start and end of the memory area covered by the
> VM.

Hmmm... yeah, the right thing to do would be to either let ia64 use
VMALLOC_END directly as the variable name or have it alias an unlikely
symbol like ____vmalloc_end; a macro which ends up expanding to a
seemingly normal symbol like vmalloc_end is just rude.  I'll change the
ia64 part.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-22 22:59     ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu Christoph Lameter
@ 2009-09-23  2:11       ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  2:11 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

Christoph Lameter wrote:
> On Tue, 22 Sep 2009, Tejun Heo wrote:
> 
>> +#ifdef	CONFIG_SMP
>> +  . = ALIGN(PERCPU_PAGE_SIZE);
>> +  __cpu0_per_cpu = .;
> 
> __per_cpu_start?
> 
>> +  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
>> +#endif
> 
> This is a statically sized per cpu area that is used by __get_cpu_var()
> Data is access via a cpu specific memory mapping. How does this work when
> the area grows beyond  PERCPU_PAGE_SIZE? As far as I can see: It seems
> that __get_cpu_var would then cause a memory fault?

On ia64, the first chunk is fixed at PERCPU_PAGE_SIZE.  It's something
hardwired into the page fault logic and the linker script.  Build will
fail if the static + reserved area goes over PERCPU_PAGE_SIZE and in
that case ia64 will need to update the special case page fault logic
and increase PERCPU_PAGE_SIZE.  The area reserved above is interim
per-cpu area for cpu0 which is used between head.S and proper percpu
area setup and will be ditched once initialization is complete.
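
To put the constraint in rough code form (an illustrative sketch only --
the real enforcement lives in the linker script and percpu setup code,
and PERCPU_MODULE_RESERVE below just stands in for whatever reserved
size the arch actually uses):

extern char __per_cpu_start[], __per_cpu_end[];

/*
 * Hypothetical check, for illustration: the statically linked percpu
 * data plus the reserved area must fit in the single PERCPU_PAGE_SIZE
 * page that the special-cased fault logic and linker script know about.
 */
static void __init check_first_chunk_fits(void)
{
	unsigned long static_size = __per_cpu_end - __per_cpu_start;

	if (static_size + PERCPU_MODULE_RESERVE > PERCPU_PAGE_SIZE)
		panic("static + reserved percpu data (%lu) exceeds "
		      "PERCPU_PAGE_SIZE", static_size + PERCPU_MODULE_RESERVE);
}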

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas
@ 2009-09-23  2:11       ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  2:11 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

Christoph Lameter wrote:
> On Tue, 22 Sep 2009, Tejun Heo wrote:
> 
>> +#ifdef	CONFIG_SMP
>> +  . = ALIGN(PERCPU_PAGE_SIZE);
>> +  __cpu0_per_cpu = .;
> 
> __per_cpu_start?
> 
>> +  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
>> +#endif
> 
> This is a statically sized per cpu area that is used by __get_cpu_var()
> Data is access via a cpu specific memory mapping. How does this work when
> the area grows beyond  PERCPU_PAGE_SIZE? As far as I can see: It seems
> that __get_cpu_var would then cause a memory fault?

On ia64, the first chunk is fixed at PERCPU_PAGE_SIZE.  It's something
hardwired into the page fault logic and the linker script.  Build will
fail if the static + reserved area goes over PERCPU_PAGE_SIZE and in
that case ia64 will need to update the special case page fault logic
and increase PERCPU_PAGE_SIZE.  The area reserved above is interim
per-cpu area for cpu0 which is used between head.S and proper percpu
area setup and will be ditched once initialization is complete.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2
  2009-09-22  7:40 ` Tejun Heo
@ 2009-09-23  5:06 ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

Hello, all.

This is the second take of convert-ia64-to-dynamic-percpu patchset.
Changes from the last take[L] are

* 0001 now updates ia64 so that it no longer defines VMALLOC_END as a
  macro aliasing vmalloc_end, instead of disallowing vmalloc_end as a
  local variable name, as suggested by Christoph.

* 0002 added to initialize cpu maps early.  This is necessary to get
  contig memory model working.

* 0004 updated so that dyn_size is calculated correctly for contig
  model.

This patchset contains the following five patches.

 0001-ia64-don-t-alias-VMALLOC_END-to-vmalloc_end.patch
 0002-ia64-initialize-cpu-maps-early.patch
 0003-ia64-allocate-percpu-area-for-cpu0-like-percpu-areas.patch
 0004-ia64-convert-to-dynamic-percpu-allocator.patch
 0005-percpu-kill-legacy-percpu-allocator.patch

0001 is misc prep to avoid macro / local variable collision.  0002
makes ia64 arch code initialize cpu possible and present maps before
memory initialization.  0003 makes ia64 allocate percpu area for cpu0
in the same way it does for other cpus.  0004 converts ia64 to dynamic
percpu allocator and 0005 drops now unused legacy allocator.

Contig memory model was tested on a 16p Tiger4 machine.  Discontig and
sparse tested on 4-way SGI altix.  ski seems to be happy with contig
up/smp.

This patchset is available in the following git tree.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git convert-ia64

The new commit ID is dcc91f19c6662b24f1f4e5878d773244f1079724 and it's
on top of today's Linus 7fa07729e439a6184bd824746d06a49cca553f15.

Thanks.

 arch/ia64/Kconfig              |    3 
 arch/ia64/kernel/setup.c       |   12 --
 arch/ia64/kernel/vmlinux.lds.S |   11 +-
 arch/ia64/mm/contig.c          |   87 ++++++++++++++++----
 arch/ia64/mm/discontig.c       |  120 +++++++++++++++++++++++++--
 include/linux/percpu.h         |   24 -----
 kernel/module.c                |  150 ----------------------------------
 mm/Makefile                    |    4 
 mm/allocpercpu.c               |  177 -----------------------------------------
 mm/percpu.c                    |    2 
 mm/vmalloc.c                   |   16 +--
 11 files changed, 193 insertions(+), 413 deletions(-)

--
tejun

[L] http://thread.gmane.org/gmane.linux.ports.ia64/20812

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2
@ 2009-09-23  5:06 ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

Hello, all.

This is the second take of convert-ia64-to-dynamic-percpu patchset.
Changes from the last take[L] are

* 0001 now updates ia64 so that it no longer defines VMALLOC_END as a
  macro aliasing vmalloc_end, instead of disallowing vmalloc_end as a
  local variable name, as suggested by Christoph.

* 0002 added to initialize cpu maps early.  This is necessary to get
  contig memory model working.

* 0004 updated so that dyn_size is calculated correctly for contig
  model.

This patchset contains the following five patches.

 0001-ia64-don-t-alias-VMALLOC_END-to-vmalloc_end.patch
 0002-ia64-initialize-cpu-maps-early.patch
 0003-ia64-allocate-percpu-area-for-cpu0-like-percpu-areas.patch
 0004-ia64-convert-to-dynamic-percpu-allocator.patch
 0005-percpu-kill-legacy-percpu-allocator.patch

0001 is misc prep to avoid macro / local variable collision.  0002
makes ia64 arch code initialize cpu possible and present maps before
memory initialization.  0003 makes ia64 allocate percpu area for cpu0
in the same way it does for other cpus.  0004 converts ia64 to dynamic
percpu allocator and 0005 drops now unused legacy allocator.

Contig memory model was tested on a 16p Tiger4 machine.  Discontig and
sparse tested on 4-way SGI altix.  ski seems to be happy with contig
up/smp.

This patchset is available in the following git tree.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git convert-ia64

The new commit ID is dcc91f19c6662b24f1f4e5878d773244f1079724 and it's
on top of today's Linus 7fa07729e439a6184bd824746d06a49cca553f15.

Thanks.

 arch/ia64/Kconfig              |    3 
 arch/ia64/kernel/setup.c       |   12 --
 arch/ia64/kernel/vmlinux.lds.S |   11 +-
 arch/ia64/mm/contig.c          |   87 ++++++++++++++++----
 arch/ia64/mm/discontig.c       |  120 +++++++++++++++++++++++++--
 include/linux/percpu.h         |   24 -----
 kernel/module.c                |  150 ----------------------------------
 mm/Makefile                    |    4 
 mm/allocpercpu.c               |  177 -----------------------------------------
 mm/percpu.c                    |    2 
 mm/vmalloc.c                   |   16 +--
 11 files changed, 193 insertions(+), 413 deletions(-)

--
tejun

[L] http://thread.gmane.org/gmane.linux.ports.ia64/20812

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH 1/5] ia64: don't alias VMALLOC_END to vmalloc_end
  2009-09-23  5:06 ` Tejun Heo
@ 2009-09-23  5:06   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Tony Luck, Fenghua Yu, linux-ia64, Christoph Lameter

If CONFIG_VIRTUAL_MEM_MAP is enabled, ia64 defines macro VMALLOC_END
as unsigned long variable vmalloc_end which is adjusted to prepare
room for vmemmap.  This becomes problematic if a local variable
vmalloc_end is defined in some function (not very unlikely) and
VMALLOC_END is used in the function - the function thinks it's
referencing the global VMALLOC_END value but would be referencing its
own local vmalloc_end variable.

There's no reason VMALLOC_END should be a macro.  Just define it as an
unsigned long variable if CONFIG_VIRTUAL_MEM_MAP is set to avoid nasty
surprises.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
---
 arch/ia64/include/asm/meminit.h |    2 +-
 arch/ia64/include/asm/pgtable.h |    3 +--
 arch/ia64/mm/contig.c           |    4 ++--
 arch/ia64/mm/discontig.c        |    4 ++--
 arch/ia64/mm/init.c             |    4 ++--
 5 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/arch/ia64/include/asm/meminit.h b/arch/ia64/include/asm/meminit.h
index 688a812..61c7b17 100644
--- a/arch/ia64/include/asm/meminit.h
+++ b/arch/ia64/include/asm/meminit.h
@@ -61,7 +61,7 @@ extern int register_active_ranges(u64 start, u64 len, int nid);
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
 # define LARGE_GAP	0x40000000 /* Use virtual mem map if hole is > than this */
-  extern unsigned long vmalloc_end;
+  extern unsigned long VMALLOC_END;
   extern struct page *vmem_map;
   extern int find_largest_hole(u64 start, u64 end, void *arg);
   extern int create_mem_map_page_table(u64 start, u64 end, void *arg);
diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h
index 8840a69..69bf138 100644
--- a/arch/ia64/include/asm/pgtable.h
+++ b/arch/ia64/include/asm/pgtable.h
@@ -228,8 +228,7 @@ ia64_phys_addr_valid (unsigned long addr)
 #define VMALLOC_START		(RGN_BASE(RGN_GATE) + 0x200000000UL)
 #ifdef CONFIG_VIRTUAL_MEM_MAP
 # define VMALLOC_END_INIT	(RGN_BASE(RGN_GATE) + (1UL << (4*PAGE_SHIFT - 9)))
-# define VMALLOC_END		vmalloc_end
-  extern unsigned long vmalloc_end;
+extern unsigned long VMALLOC_END;
 #else
 #if defined(CONFIG_SPARSEMEM) && defined(CONFIG_SPARSEMEM_VMEMMAP)
 /* SPARSEMEM_VMEMMAP uses half of vmalloc... */
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 2f724d2..1341437 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -270,8 +270,8 @@ paging_init (void)
 
 		map_size = PAGE_ALIGN(ALIGN(max_low_pfn, MAX_ORDER_NR_PAGES) *
 			sizeof(struct page));
-		vmalloc_end -= map_size;
-		vmem_map = (struct page *) vmalloc_end;
+		VMALLOC_END -= map_size;
+		vmem_map = (struct page *) VMALLOC_END;
 		efi_memmap_walk(create_mem_map_page_table, NULL);
 
 		/*
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index d85ba98..9f24b3c 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -666,9 +666,9 @@ void __init paging_init(void)
 	sparse_init();
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
-	vmalloc_end -= PAGE_ALIGN(ALIGN(max_low_pfn, MAX_ORDER_NR_PAGES) *
+	VMALLOC_END -= PAGE_ALIGN(ALIGN(max_low_pfn, MAX_ORDER_NR_PAGES) *
 		sizeof(struct page));
-	vmem_map = (struct page *) vmalloc_end;
+	vmem_map = (struct page *) VMALLOC_END;
 	efi_memmap_walk(create_mem_map_page_table, NULL);
 	printk("Virtual mem_map starts at 0x%p\n", vmem_map);
 #endif
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 1d28624..f301071 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -44,8 +44,8 @@ extern void ia64_tlb_init (void);
 unsigned long MAX_DMA_ADDRESS = PAGE_OFFSET + 0x100000000UL;
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
-unsigned long vmalloc_end = VMALLOC_END_INIT;
-EXPORT_SYMBOL(vmalloc_end);
+unsigned long VMALLOC_END = VMALLOC_END_INIT;
+EXPORT_SYMBOL(VMALLOC_END);
 struct page *vmem_map;
 EXPORT_SYMBOL(vmem_map);
 #endif
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 1/5] ia64: don't alias VMALLOC_END to vmalloc_end
@ 2009-09-23  5:06   ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Tony Luck, Fenghua Yu, linux-ia64, Christoph Lameter

If CONFIG_VIRTUAL_MEM_MAP is enabled, ia64 defines macro VMALLOC_END
as unsigned long variable vmalloc_end which is adjusted to prepare
room for vmemmap.  This becomes problematic if a local variable
vmalloc_end is defined in some function (not very unlikely) and
VMALLOC_END is used in the function - the function thinks it's
referencing the global VMALLOC_END value but would be referencing its
own local vmalloc_end variable.

There's no reason VMALLOC_END should be a macro.  Just define it as an
unsigned long variable if CONFIG_VIRTUAL_MEM_MAP is set to avoid nasty
surprises.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
---
 arch/ia64/include/asm/meminit.h |    2 +-
 arch/ia64/include/asm/pgtable.h |    3 +--
 arch/ia64/mm/contig.c           |    4 ++--
 arch/ia64/mm/discontig.c        |    4 ++--
 arch/ia64/mm/init.c             |    4 ++--
 5 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/arch/ia64/include/asm/meminit.h b/arch/ia64/include/asm/meminit.h
index 688a812..61c7b17 100644
--- a/arch/ia64/include/asm/meminit.h
+++ b/arch/ia64/include/asm/meminit.h
@@ -61,7 +61,7 @@ extern int register_active_ranges(u64 start, u64 len, int nid);
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
 # define LARGE_GAP	0x40000000 /* Use virtual mem map if hole is > than this */
-  extern unsigned long vmalloc_end;
+  extern unsigned long VMALLOC_END;
   extern struct page *vmem_map;
   extern int find_largest_hole(u64 start, u64 end, void *arg);
   extern int create_mem_map_page_table(u64 start, u64 end, void *arg);
diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h
index 8840a69..69bf138 100644
--- a/arch/ia64/include/asm/pgtable.h
+++ b/arch/ia64/include/asm/pgtable.h
@@ -228,8 +228,7 @@ ia64_phys_addr_valid (unsigned long addr)
 #define VMALLOC_START		(RGN_BASE(RGN_GATE) + 0x200000000UL)
 #ifdef CONFIG_VIRTUAL_MEM_MAP
 # define VMALLOC_END_INIT	(RGN_BASE(RGN_GATE) + (1UL << (4*PAGE_SHIFT - 9)))
-# define VMALLOC_END		vmalloc_end
-  extern unsigned long vmalloc_end;
+extern unsigned long VMALLOC_END;
 #else
 #if defined(CONFIG_SPARSEMEM) && defined(CONFIG_SPARSEMEM_VMEMMAP)
 /* SPARSEMEM_VMEMMAP uses half of vmalloc... */
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 2f724d2..1341437 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -270,8 +270,8 @@ paging_init (void)
 
 		map_size = PAGE_ALIGN(ALIGN(max_low_pfn, MAX_ORDER_NR_PAGES) *
 			sizeof(struct page));
-		vmalloc_end -= map_size;
-		vmem_map = (struct page *) vmalloc_end;
+		VMALLOC_END -= map_size;
+		vmem_map = (struct page *) VMALLOC_END;
 		efi_memmap_walk(create_mem_map_page_table, NULL);
 
 		/*
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index d85ba98..9f24b3c 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -666,9 +666,9 @@ void __init paging_init(void)
 	sparse_init();
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
-	vmalloc_end -= PAGE_ALIGN(ALIGN(max_low_pfn, MAX_ORDER_NR_PAGES) *
+	VMALLOC_END -= PAGE_ALIGN(ALIGN(max_low_pfn, MAX_ORDER_NR_PAGES) *
 		sizeof(struct page));
-	vmem_map = (struct page *) vmalloc_end;
+	vmem_map = (struct page *) VMALLOC_END;
 	efi_memmap_walk(create_mem_map_page_table, NULL);
 	printk("Virtual mem_map starts at 0x%p\n", vmem_map);
 #endif
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 1d28624..f301071 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -44,8 +44,8 @@ extern void ia64_tlb_init (void);
 unsigned long MAX_DMA_ADDRESS = PAGE_OFFSET + 0x100000000UL;
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
-unsigned long vmalloc_end = VMALLOC_END_INIT;
-EXPORT_SYMBOL(vmalloc_end);
+unsigned long VMALLOC_END = VMALLOC_END_INIT;
+EXPORT_SYMBOL(VMALLOC_END);
 struct page *vmem_map;
 EXPORT_SYMBOL(vmem_map);
 #endif
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 2/5] ia64: initialize cpu maps early
  2009-09-23  5:06 ` Tejun Heo
@ 2009-09-23  5:06   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Tony Luck, Fenghua Yu, linux-ia64

All the information necessary to initialize the cpu possible and present
maps is available once early_acpi_boot_init() is complete.  Reorganize
setup_arch() and acpi init functions such that,

* CPU information is printed after LAPIC entries are parsed in
  early_acpi_boot_init().

* smp_build_cpu_map() is called by setup_arch() instead of acpi
  functions.

* smp_build_cpu_map() is called once all CPU related information is
  available before memory is initialized.

This is primarily to allow find_memory() to use cpu maps but is also a
general cleanup.  Please note that with this change, the somewhat
ad-hoc early_cpu_possible_map defined and used for NUMA configurations
is probably unnecessary.  Something to clean up another day.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
---
 arch/ia64/kernel/acpi.c  |   33 +++++++++++++++------------------
 arch/ia64/kernel/setup.c |   11 +++++------
 2 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
index baec6f0..40574ae 100644
--- a/arch/ia64/kernel/acpi.c
+++ b/arch/ia64/kernel/acpi.c
@@ -702,11 +702,23 @@ int __init early_acpi_boot_init(void)
 		printk(KERN_ERR PREFIX
 		       "Error parsing MADT - no LAPIC entries\n");
 
+#ifdef CONFIG_SMP
+	if (available_cpus == 0) {
+		printk(KERN_INFO "ACPI: Found 0 CPUS; assuming 1\n");
+		printk(KERN_INFO "CPU 0 (0x%04x)", hard_smp_processor_id());
+		smp_boot_data.cpu_phys_id[available_cpus] =
+		    hard_smp_processor_id();
+		available_cpus = 1;	/* We've got at least one of these, no? */
+	}
+	smp_boot_data.cpu_count = available_cpus;
+#endif
+	/* Make boot-up look pretty */
+	printk(KERN_INFO "%d CPUs available, %d CPUs total\n", available_cpus,
+	       total_cpus);
+
 	return 0;
 }
 
-
-
 int __init acpi_boot_init(void)
 {
 
@@ -769,18 +781,8 @@ int __init acpi_boot_init(void)
 	if (acpi_table_parse(ACPI_SIG_FADT, acpi_parse_fadt))
 		printk(KERN_ERR PREFIX "Can't find FADT\n");
 
+#ifdef CONFIG_ACPI_NUMA
 #ifdef CONFIG_SMP
-	if (available_cpus == 0) {
-		printk(KERN_INFO "ACPI: Found 0 CPUS; assuming 1\n");
-		printk(KERN_INFO "CPU 0 (0x%04x)", hard_smp_processor_id());
-		smp_boot_data.cpu_phys_id[available_cpus] =
-		    hard_smp_processor_id();
-		available_cpus = 1;	/* We've got at least one of these, no? */
-	}
-	smp_boot_data.cpu_count = available_cpus;
-
-	smp_build_cpu_map();
-# ifdef CONFIG_ACPI_NUMA
 	if (srat_num_cpus == 0) {
 		int cpu, i = 1;
 		for (cpu = 0; cpu < smp_boot_data.cpu_count; cpu++)
@@ -789,14 +791,9 @@ int __init acpi_boot_init(void)
 				node_cpuid[i++].phys_id =
 				    smp_boot_data.cpu_phys_id[cpu];
 	}
-# endif
 #endif
-#ifdef CONFIG_ACPI_NUMA
 	build_cpu_to_node_map();
 #endif
-	/* Make boot-up look pretty */
-	printk(KERN_INFO "%d CPUs available, %d CPUs total\n", available_cpus,
-	       total_cpus);
 	return 0;
 }
 
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 1de86c9..5d77c1e 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -566,19 +566,18 @@ setup_arch (char **cmdline_p)
 	early_acpi_boot_init();
 # ifdef CONFIG_ACPI_NUMA
 	acpi_numa_init();
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#  ifdef CONFIG_ACPI_HOTPLUG_CPU
 	prefill_possible_map();
-#endif
+#  endif
 	per_cpu_scan_finalize((cpus_weight(early_cpu_possible_map) == 0 ?
 		32 : cpus_weight(early_cpu_possible_map)),
 		additional_cpus > 0 ? additional_cpus : 0);
 # endif
-#else
-# ifdef CONFIG_SMP
-	smp_build_cpu_map();	/* happens, e.g., with the Ski simulator */
-# endif
 #endif /* CONFIG_APCI_BOOT */
 
+#ifdef CONFIG_SMP
+	smp_build_cpu_map();
+#endif
 	find_memory();
 
 	/* process SAL system table: */
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 2/5] ia64: initialize cpu maps early
@ 2009-09-23  5:06   ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Tony Luck, Fenghua Yu, linux-ia64

All the information necessary to initialize the cpu possible and present
maps is available once early_acpi_boot_init() is complete.  Reorganize
setup_arch() and acpi init functions such that,

* CPU information is printed after LAPIC entries are parsed in
  early_acpi_boot_init().

* smp_build_cpu_map() is called by setup_arch() instead of acpi
  functions.

* smp_build_cpu_map() is called once all CPU related information is
  available before memory is initialized.

This is primarily to allow find_memory() to use cpu maps but is also a
general cleanup.  Please note that with this change, the somewhat
ad-hoc early_cpu_possible_map defined and used for NUMA configurations
is probably unnecessary.  Something to clean up another day.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
---
 arch/ia64/kernel/acpi.c  |   33 +++++++++++++++------------------
 arch/ia64/kernel/setup.c |   11 +++++------
 2 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
index baec6f0..40574ae 100644
--- a/arch/ia64/kernel/acpi.c
+++ b/arch/ia64/kernel/acpi.c
@@ -702,11 +702,23 @@ int __init early_acpi_boot_init(void)
 		printk(KERN_ERR PREFIX
 		       "Error parsing MADT - no LAPIC entries\n");
 
+#ifdef CONFIG_SMP
+	if (available_cpus == 0) {
+		printk(KERN_INFO "ACPI: Found 0 CPUS; assuming 1\n");
+		printk(KERN_INFO "CPU 0 (0x%04x)", hard_smp_processor_id());
+		smp_boot_data.cpu_phys_id[available_cpus] =
+		    hard_smp_processor_id();
+		available_cpus = 1;	/* We've got at least one of these, no? */
+	}
+	smp_boot_data.cpu_count = available_cpus;
+#endif
+	/* Make boot-up look pretty */
+	printk(KERN_INFO "%d CPUs available, %d CPUs total\n", available_cpus,
+	       total_cpus);
+
 	return 0;
 }
 
-
-
 int __init acpi_boot_init(void)
 {
 
@@ -769,18 +781,8 @@ int __init acpi_boot_init(void)
 	if (acpi_table_parse(ACPI_SIG_FADT, acpi_parse_fadt))
 		printk(KERN_ERR PREFIX "Can't find FADT\n");
 
+#ifdef CONFIG_ACPI_NUMA
 #ifdef CONFIG_SMP
-	if (available_cpus == 0) {
-		printk(KERN_INFO "ACPI: Found 0 CPUS; assuming 1\n");
-		printk(KERN_INFO "CPU 0 (0x%04x)", hard_smp_processor_id());
-		smp_boot_data.cpu_phys_id[available_cpus] =
-		    hard_smp_processor_id();
-		available_cpus = 1;	/* We've got at least one of these, no? */
-	}
-	smp_boot_data.cpu_count = available_cpus;
-
-	smp_build_cpu_map();
-# ifdef CONFIG_ACPI_NUMA
 	if (srat_num_cpus == 0) {
 		int cpu, i = 1;
 		for (cpu = 0; cpu < smp_boot_data.cpu_count; cpu++)
@@ -789,14 +791,9 @@ int __init acpi_boot_init(void)
 				node_cpuid[i++].phys_id =
 				    smp_boot_data.cpu_phys_id[cpu];
 	}
-# endif
 #endif
-#ifdef CONFIG_ACPI_NUMA
 	build_cpu_to_node_map();
 #endif
-	/* Make boot-up look pretty */
-	printk(KERN_INFO "%d CPUs available, %d CPUs total\n", available_cpus,
-	       total_cpus);
 	return 0;
 }
 
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 1de86c9..5d77c1e 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -566,19 +566,18 @@ setup_arch (char **cmdline_p)
 	early_acpi_boot_init();
 # ifdef CONFIG_ACPI_NUMA
 	acpi_numa_init();
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#  ifdef CONFIG_ACPI_HOTPLUG_CPU
 	prefill_possible_map();
-#endif
+#  endif
 	per_cpu_scan_finalize((cpus_weight(early_cpu_possible_map) == 0 ?
 		32 : cpus_weight(early_cpu_possible_map)),
 		additional_cpus > 0 ? additional_cpus : 0);
 # endif
-#else
-# ifdef CONFIG_SMP
-	smp_build_cpu_map();	/* happens, e.g., with the Ski simulator */
-# endif
 #endif /* CONFIG_APCI_BOOT */
 
+#ifdef CONFIG_SMP
+	smp_build_cpu_map();
+#endif
 	find_memory();
 
 	/* process SAL system table: */
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 3/5] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-23  5:06 ` Tejun Heo
@ 2009-09-23  5:06   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Tony Luck, Fenghua Yu, linux-ia64

cpu0 used a special percpu area reserved by the linker, __cpu0_per_cpu,
which is set up early in boot by head.S.  However, this doesn't
guarantee that the area will be on the same node as cpu0, and the
percpu area for cpu0 ends up very far away from the percpu areas for
other cpus, which causes problems for the congruent percpu allocator.

This patch makes percpu area initialization allocate the percpu area for
cpu0 like it does for any other cpu and copy it from __cpu0_per_cpu, which
now resides in the __init area.  This means that for cpu0 the percpu area
is first set up at __cpu0_per_cpu early by head.S, then moved to an
area in the linear mapping during memory initialization, and it's not
allowed to take a pointer to percpu variables between head.S and
memory initialization.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
---
 arch/ia64/kernel/vmlinux.lds.S |   11 +++++----
 arch/ia64/mm/contig.c          |   47 +++++++++++++++++++++++++--------------
 arch/ia64/mm/discontig.c       |   35 ++++++++++++++++++++---------
 3 files changed, 60 insertions(+), 33 deletions(-)

diff --git a/arch/ia64/kernel/vmlinux.lds.S b/arch/ia64/kernel/vmlinux.lds.S
index 0a0c77b..1295ba3 100644
--- a/arch/ia64/kernel/vmlinux.lds.S
+++ b/arch/ia64/kernel/vmlinux.lds.S
@@ -166,6 +166,12 @@ SECTIONS
 	}
 #endif
 
+#ifdef	CONFIG_SMP
+  . = ALIGN(PERCPU_PAGE_SIZE);
+  __cpu0_per_cpu = .;
+  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
+#endif
+
   . = ALIGN(PAGE_SIZE);
   __init_end = .;
 
@@ -198,11 +204,6 @@ SECTIONS
   data : { } :data
   .data : AT(ADDR(.data) - LOAD_OFFSET)
 	{
-#ifdef	CONFIG_SMP
-  . = ALIGN(PERCPU_PAGE_SIZE);
-		__cpu0_per_cpu = .;
-  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
-#endif
 		INIT_TASK_DATA(PAGE_SIZE)
 		CACHELINE_ALIGNED_DATA(SMP_CACHE_BYTES)
 		READ_MOSTLY_DATA(SMP_CACHE_BYTES)
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 1341437..351da0a 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -154,36 +154,49 @@ static void *cpu_data;
 void * __cpuinit
 per_cpu_init (void)
 {
-	int cpu;
-	static int first_time=1;
+	static bool first_time = true;
+	void *cpu0_data = __cpu0_per_cpu;
+	unsigned int cpu;
+
+	if (!first_time)
+		goto skip;
+	first_time = false;
 
 	/*
 	 * get_free_pages() cannot be used before cpu_init() done.  BSP
 	 * allocates "NR_CPUS" pages for all CPUs to avoid that AP calls
 	 * get_zeroed_page().
 	 */
-	if (first_time) {
-		void *cpu0_data = __cpu0_per_cpu;
+	for (cpu = 0; cpu < NR_CPUS; cpu++) {
+		void *src = cpu == 0 ? cpu0_data : __phys_per_cpu_start;
 
-		first_time=0;
+		memcpy(cpu_data, src, __per_cpu_end - __per_cpu_start);
+		__per_cpu_offset[cpu] = (char *)cpu_data - __per_cpu_start;
+		per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];
 
-		__per_cpu_offset[0] = (char *) cpu0_data - __per_cpu_start;
-		per_cpu(local_per_cpu_offset, 0) = __per_cpu_offset[0];
+		/*
+		 * percpu area for cpu0 is moved from the __init area
+		 * which is setup by head.S and used till this point.
+		 * Update ar.k3.  This move ensures that the percpu
+		 * area for cpu0 is on the correct node and its
+		 * virtual address isn't insanely far from other
+		 * percpu areas which is important for congruent
+		 * percpu allocator.
+		 */
+		if (cpu == 0)
+			ia64_set_kr(IA64_KR_PER_CPU_DATA, __pa(cpu_data) -
+				    (unsigned long)__per_cpu_start);
 
-		for (cpu = 1; cpu < NR_CPUS; cpu++) {
-			memcpy(cpu_data, __phys_per_cpu_start, __per_cpu_end - __per_cpu_start);
-			__per_cpu_offset[cpu] = (char *) cpu_data - __per_cpu_start;
-			cpu_data += PERCPU_PAGE_SIZE;
-			per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];
-		}
+		cpu_data += PERCPU_PAGE_SIZE;
 	}
+skip:
 	return __per_cpu_start + __per_cpu_offset[smp_processor_id()];
 }
 
 static inline void
 alloc_per_cpu_data(void)
 {
-	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS-1,
+	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS,
 				   PERCPU_PAGE_SIZE, __pa(MAX_DMA_ADDRESS));
 }
 #else
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index 9f24b3c..200282b 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -143,17 +143,30 @@ static void *per_cpu_node_setup(void *cpu_data, int node)
 	int cpu;
 
 	for_each_possible_early_cpu(cpu) {
-		if (cpu == 0) {
-			void *cpu0_data = __cpu0_per_cpu;
-			__per_cpu_offset[cpu] = (char*)cpu0_data -
-				__per_cpu_start;
-		} else if (node == node_cpuid[cpu].nid) {
-			memcpy(__va(cpu_data), __phys_per_cpu_start,
-			       __per_cpu_end - __per_cpu_start);
-			__per_cpu_offset[cpu] = (char*)__va(cpu_data) -
-				__per_cpu_start;
-			cpu_data += PERCPU_PAGE_SIZE;
-		}
+		void *src = cpu == 0 ? __cpu0_per_cpu : __phys_per_cpu_start;
+
+		if (node != node_cpuid[cpu].nid)
+			continue;
+
+		memcpy(__va(cpu_data), src, __per_cpu_end - __per_cpu_start);
+		__per_cpu_offset[cpu] = (char *)__va(cpu_data) -
+			__per_cpu_start;
+
+		/*
+		 * percpu area for cpu0 is moved from the __init area
+		 * which is setup by head.S and used till this point.
+		 * Update ar.k3.  This move ensures that the percpu
+		 * area for cpu0 is on the correct node and its
+		 * virtual address isn't insanely far from other
+		 * percpu areas which is important for congruent
+		 * percpu allocator.
+		 */
+		if (cpu == 0)
+			ia64_set_kr(IA64_KR_PER_CPU_DATA,
+				    (unsigned long)cpu_data -
+				    (unsigned long)__per_cpu_start);
+
+		cpu_data += PERCPU_PAGE_SIZE;
 	}
 #endif
 	return cpu_data;
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 3/5] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
@ 2009-09-23  5:06   ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Tony Luck, Fenghua Yu, linux-ia64

cpu0 used a special percpu area reserved by the linker, __cpu0_per_cpu,
which is set up early in boot by head.S.  However, this doesn't
guarantee that the area will be on the same node as cpu0, and the
percpu area for cpu0 ends up very far away from the percpu areas for
other cpus, which causes problems for the congruent percpu allocator.

This patch makes percpu area initialization allocate the percpu area
for cpu0 like for any other cpu and copy it from __cpu0_per_cpu, which
now resides in the __init area.  This means that for cpu0 the percpu
area is first set up at __cpu0_per_cpu early by head.S and then moved
to an area in the linear mapping during memory initialization; taking
a pointer to a percpu variable between head.S and memory
initialization is therefore not allowed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
---
 arch/ia64/kernel/vmlinux.lds.S |   11 +++++----
 arch/ia64/mm/contig.c          |   47 +++++++++++++++++++++++++--------------
 arch/ia64/mm/discontig.c       |   35 ++++++++++++++++++++---------
 3 files changed, 60 insertions(+), 33 deletions(-)

diff --git a/arch/ia64/kernel/vmlinux.lds.S b/arch/ia64/kernel/vmlinux.lds.S
index 0a0c77b..1295ba3 100644
--- a/arch/ia64/kernel/vmlinux.lds.S
+++ b/arch/ia64/kernel/vmlinux.lds.S
@@ -166,6 +166,12 @@ SECTIONS
 	}
 #endif
 
+#ifdef	CONFIG_SMP
+  . = ALIGN(PERCPU_PAGE_SIZE);
+  __cpu0_per_cpu = .;
+  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
+#endif
+
   . = ALIGN(PAGE_SIZE);
   __init_end = .;
 
@@ -198,11 +204,6 @@ SECTIONS
   data : { } :data
   .data : AT(ADDR(.data) - LOAD_OFFSET)
 	{
-#ifdef	CONFIG_SMP
-  . = ALIGN(PERCPU_PAGE_SIZE);
-		__cpu0_per_cpu = .;
-  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
-#endif
 		INIT_TASK_DATA(PAGE_SIZE)
 		CACHELINE_ALIGNED_DATA(SMP_CACHE_BYTES)
 		READ_MOSTLY_DATA(SMP_CACHE_BYTES)
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 1341437..351da0a 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -154,36 +154,49 @@ static void *cpu_data;
 void * __cpuinit
 per_cpu_init (void)
 {
-	int cpu;
-	static int first_time=1;
+	static bool first_time = true;
+	void *cpu0_data = __cpu0_per_cpu;
+	unsigned int cpu;
+
+	if (!first_time)
+		goto skip;
+	first_time = false;
 
 	/*
 	 * get_free_pages() cannot be used before cpu_init() done.  BSP
 	 * allocates "NR_CPUS" pages for all CPUs to avoid that AP calls
 	 * get_zeroed_page().
 	 */
-	if (first_time) {
-		void *cpu0_data = __cpu0_per_cpu;
+	for (cpu = 0; cpu < NR_CPUS; cpu++) {
+		void *src = cpu == 0 ? cpu0_data : __phys_per_cpu_start;
 
-		first_time=0;
+		memcpy(cpu_data, src, __per_cpu_end - __per_cpu_start);
+		__per_cpu_offset[cpu] = (char *)cpu_data - __per_cpu_start;
+		per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];
 
-		__per_cpu_offset[0] = (char *) cpu0_data - __per_cpu_start;
-		per_cpu(local_per_cpu_offset, 0) = __per_cpu_offset[0];
+		/*
+		 * percpu area for cpu0 is moved from the __init area
+		 * which is setup by head.S and used till this point.
+		 * Update ar.k3.  This move ensures that the percpu
+		 * area for cpu0 is on the correct node and its
+		 * virtual address isn't insanely far from other
+		 * percpu areas which is important for congruent
+		 * percpu allocator.
+		 */
+		if (cpu == 0)
+			ia64_set_kr(IA64_KR_PER_CPU_DATA, __pa(cpu_data) -
+				    (unsigned long)__per_cpu_start);
 
-		for (cpu = 1; cpu < NR_CPUS; cpu++) {
-			memcpy(cpu_data, __phys_per_cpu_start, __per_cpu_end - __per_cpu_start);
-			__per_cpu_offset[cpu] = (char *) cpu_data - __per_cpu_start;
-			cpu_data += PERCPU_PAGE_SIZE;
-			per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];
-		}
+		cpu_data += PERCPU_PAGE_SIZE;
 	}
+skip:
 	return __per_cpu_start + __per_cpu_offset[smp_processor_id()];
 }
 
 static inline void
 alloc_per_cpu_data(void)
 {
-	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS-1,
+	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS,
 				   PERCPU_PAGE_SIZE, __pa(MAX_DMA_ADDRESS));
 }
 #else
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index 9f24b3c..200282b 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -143,17 +143,30 @@ static void *per_cpu_node_setup(void *cpu_data, int node)
 	int cpu;
 
 	for_each_possible_early_cpu(cpu) {
-		if (cpu == 0) {
-			void *cpu0_data = __cpu0_per_cpu;
-			__per_cpu_offset[cpu] = (char*)cpu0_data -
-				__per_cpu_start;
-		} else if (node == node_cpuid[cpu].nid) {
-			memcpy(__va(cpu_data), __phys_per_cpu_start,
-			       __per_cpu_end - __per_cpu_start);
-			__per_cpu_offset[cpu] = (char*)__va(cpu_data) -
-				__per_cpu_start;
-			cpu_data += PERCPU_PAGE_SIZE;
-		}
+		void *src = cpu == 0 ? __cpu0_per_cpu : __phys_per_cpu_start;
+
+		if (node != node_cpuid[cpu].nid)
+			continue;
+
+		memcpy(__va(cpu_data), src, __per_cpu_end - __per_cpu_start);
+		__per_cpu_offset[cpu] = (char *)__va(cpu_data) -
+			__per_cpu_start;
+
+		/*
+		 * percpu area for cpu0 is moved from the __init area
+		 * which is setup by head.S and used till this point.
+		 * Update ar.k3.  This move ensures that the percpu
+		 * area for cpu0 is on the correct node and its
+		 * virtual address isn't insanely far from other
+		 * percpu areas which is important for congruent
+		 * percpu allocator.
+		 */
+		if (cpu == 0)
+			ia64_set_kr(IA64_KR_PER_CPU_DATA,
+				    (unsigned long)cpu_data -
+				    (unsigned long)__per_cpu_start);
+
+		cpu_data += PERCPU_PAGE_SIZE;
 	}
 #endif
 	return cpu_data;
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 4/5] ia64: convert to dynamic percpu allocator
  2009-09-23  5:06 ` Tejun Heo
@ 2009-09-23  5:06   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Tony Luck, Fenghua Yu, linux-ia64

Unlike other archs, ia64 reserves space for percpu areas during early
memory initialization.  These areas occupy a contiguous region indexed
by cpu number on the contiguous memory model or are grouped by node on
the discontiguous memory model.

As allocation and initialization are done by the arch code, all that
setup_per_cpu_areas() needs to do is communicate the determined
layout to the percpu allocator.  This patch implements
setup_per_cpu_areas() for both contig and discontig memory models and
drops HAVE_LEGACY_PER_CPU_AREA.

Please note that for the contig model, the allocation itself is
modified only to allocate for possible cpus instead of NR_CPUS.  As
the dynamic percpu allocator can handle non-direct mappings, there's
no reason to allocate memory for cpus which aren't possible.
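
As a rough illustration of the saving (numbers picked only for the
example, assuming the usual 64k PERCPU_PAGE_SIZE): with NR_CPUS=4096
but only 8 possible cpus, the boot-time contig allocation shrinks from
4096 * 64k = 256M down to 8 * 64k = 512k.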

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
---
 arch/ia64/Kconfig        |    3 --
 arch/ia64/kernel/setup.c |   12 ------
 arch/ia64/mm/contig.c    |   58 ++++++++++++++++++++++++++++---
 arch/ia64/mm/discontig.c |   85 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 138 insertions(+), 20 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 011a1cd..e624611 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -89,9 +89,6 @@ config GENERIC_TIME_VSYSCALL
 	bool
 	default y
 
-config HAVE_LEGACY_PER_CPU_AREA
-	def_bool y
-
 config HAVE_SETUP_PER_CPU_AREA
 	def_bool y
 
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 5d77c1e..bc1ef4a 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -855,18 +855,6 @@ identify_cpu (struct cpuinfo_ia64 *c)
 }
 
 /*
- * In UP configuration, setup_per_cpu_areas() is defined in
- * include/linux/percpu.h
- */
-#ifdef CONFIG_SMP
-void __init
-setup_per_cpu_areas (void)
-{
-	/* start_kernel() requires this... */
-}
-#endif
-
-/*
  * Do the following calculations:
  *
  * 1. the max. cache line size.
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 351da0a..54bf540 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -163,11 +163,11 @@ per_cpu_init (void)
 	first_time = false;
 
 	/*
-	 * get_free_pages() cannot be used before cpu_init() done.  BSP
-	 * allocates "NR_CPUS" pages for all CPUs to avoid that AP calls
-	 * get_zeroed_page().
+	 * get_free_pages() cannot be used before cpu_init() done.
+	 * BSP allocates PERCPU_PAGE_SIZE bytes for all possible CPUs
+	 * to avoid that AP calls get_zeroed_page().
 	 */
-	for (cpu = 0; cpu < NR_CPUS; cpu++) {
+	for_each_possible_cpu(cpu) {
 		void *src = cpu == 0 ? cpu0_data : __phys_per_cpu_start;
 
 		memcpy(cpu_data, src, __per_cpu_end - __per_cpu_start);
@@ -196,9 +196,57 @@ skip:
 static inline void
 alloc_per_cpu_data(void)
 {
-	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS,
+	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * num_possible_cpus(),
 				   PERCPU_PAGE_SIZE, __pa(MAX_DMA_ADDRESS));
 }
+
+/**
+ * setup_per_cpu_areas - setup percpu areas
+ *
+ * Arch code has already allocated and initialized percpu areas.  All
+ * this function has to do is to teach the determined layout to the
+ * dynamic percpu allocator, which happens to be more complex than
+ * creating whole new ones using helpers.
+ */
+void __init
+setup_per_cpu_areas(void)
+{
+	struct pcpu_alloc_info *ai;
+	struct pcpu_group_info *gi;
+	unsigned int cpu;
+	ssize_t static_size, reserved_size, dyn_size;
+	int rc;
+
+	ai = pcpu_alloc_alloc_info(1, num_possible_cpus());
+	if (!ai)
+		panic("failed to allocate pcpu_alloc_info");
+	gi = &ai->groups[0];
+
+	/* units are assigned consecutively to possible cpus */
+	for_each_possible_cpu(cpu)
+		gi->cpu_map[gi->nr_units++] = cpu;
+
+	/* set parameters */
+	static_size = __per_cpu_end - __per_cpu_start;
+	reserved_size = PERCPU_MODULE_RESERVE;
+	dyn_size = PERCPU_PAGE_SIZE - static_size - reserved_size;
+	if (dyn_size < 0)
+		panic("percpu area overflow static=%zd reserved=%zd\n",
+		      static_size, reserved_size);
+
+	ai->static_size		= static_size;
+	ai->reserved_size	= reserved_size;
+	ai->dyn_size		= dyn_size;
+	ai->unit_size		= PERCPU_PAGE_SIZE;
+	ai->atom_size		= PAGE_SIZE;
+	ai->alloc_size		= PERCPU_PAGE_SIZE;
+
+	rc = pcpu_setup_first_chunk(ai, __per_cpu_start + __per_cpu_offset[0]);
+	if (rc)
+		panic("failed to setup percpu area (err=%d)", rc);
+
+	pcpu_free_alloc_info(ai);
+}
 #else
 #define alloc_per_cpu_data() do { } while (0)
 #endif /* CONFIG_SMP */
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index 200282b..40e4c1f 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -172,6 +172,91 @@ static void *per_cpu_node_setup(void *cpu_data, int node)
 	return cpu_data;
 }
 
+#ifdef CONFIG_SMP
+/**
+ * setup_per_cpu_areas - setup percpu areas
+ *
+ * Arch code has already allocated and initialized percpu areas.  All
+ * this function has to do is to teach the determined layout to the
+ * dynamic percpu allocator, which happens to be more complex than
+ * creating whole new ones using helpers.
+ */
+void __init setup_per_cpu_areas(void)
+{
+	struct pcpu_alloc_info *ai;
+	struct pcpu_group_info *uninitialized_var(gi);
+	unsigned int *cpu_map;
+	void *base;
+	unsigned long base_offset;
+	unsigned int cpu;
+	ssize_t static_size, reserved_size, dyn_size;
+	int node, prev_node, unit, nr_units, rc;
+
+	ai = pcpu_alloc_alloc_info(MAX_NUMNODES, nr_cpu_ids);
+	if (!ai)
+		panic("failed to allocate pcpu_alloc_info");
+	cpu_map = ai->groups[0].cpu_map;
+
+	/* determine base */
+	base = (void *)ULONG_MAX;
+	for_each_possible_cpu(cpu)
+		base = min(base,
+			   (void *)(__per_cpu_offset[cpu] + __per_cpu_start));
+	base_offset = (void *)__per_cpu_start - base;
+
+	/* build cpu_map, units are grouped by node */
+	unit = 0;
+	for_each_node(node)
+		for_each_possible_cpu(cpu)
+			if (node == node_cpuid[cpu].nid)
+				cpu_map[unit++] = cpu;
+	nr_units = unit;
+
+	/* set basic parameters */
+	static_size = __per_cpu_end - __per_cpu_start;
+	reserved_size = PERCPU_MODULE_RESERVE;
+	dyn_size = PERCPU_PAGE_SIZE - static_size - reserved_size;
+	if (dyn_size < 0)
+		panic("percpu area overflow static=%zd reserved=%zd\n",
+		      static_size, reserved_size);
+
+	ai->static_size		= static_size;
+	ai->reserved_size	= reserved_size;
+	ai->dyn_size		= dyn_size;
+	ai->unit_size		= PERCPU_PAGE_SIZE;
+	ai->atom_size		= PAGE_SIZE;
+	ai->alloc_size		= PERCPU_PAGE_SIZE;
+
+	/*
+	 * CPUs are put into groups according to node.  Walk cpu_map
+	 * and create new groups at node boundaries.
+	 */
+	prev_node = -1;
+	ai->nr_groups = 0;
+	for (unit = 0; unit < nr_units; unit++) {
+		cpu = cpu_map[unit];
+		node = node_cpuid[cpu].nid;
+
+		if (node == prev_node) {
+			gi->nr_units++;
+			continue;
+		}
+		prev_node = node;
+
+		gi = &ai->groups[ai->nr_groups++];
+		gi->nr_units		= 1;
+		gi->base_offset		= __per_cpu_offset[cpu] + base_offset;
+		gi->cpu_map		= &cpu_map[unit];
+	}
+
+	rc = pcpu_setup_first_chunk(ai, base);
+	if (rc)
+		panic("failed to setup percpu area (err=%d)", rc);
+
+	pcpu_free_alloc_info(ai);
+}
+#endif
+
 /**
  * fill_pernode - initialize pernode data.
  * @node: the node id.
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 4/5] ia64: convert to dynamic percpu allocator
@ 2009-09-23  5:06   ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Tony Luck, Fenghua Yu, linux-ia64

Unlike other archs, ia64 reserves space for percpu areas during early
memory initialization.  These areas occupy a contiguous region indexed
by cpu number on the contiguous memory model or are grouped by node on
the discontiguous memory model.

As allocation and initialization are done by the arch code, all that
setup_per_cpu_areas() needs to do is communicate the determined
layout to the percpu allocator.  This patch implements
setup_per_cpu_areas() for both contig and discontig memory models and
drops HAVE_LEGACY_PER_CPU_AREA.

Please note that for the contig model, the allocation itself is
modified only to allocate for possible cpus instead of NR_CPUS.  As
the dynamic percpu allocator can handle non-direct mappings, there's
no reason to allocate memory for cpus which aren't possible.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
---
 arch/ia64/Kconfig        |    3 --
 arch/ia64/kernel/setup.c |   12 ------
 arch/ia64/mm/contig.c    |   58 ++++++++++++++++++++++++++++---
 arch/ia64/mm/discontig.c |   85 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 138 insertions(+), 20 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 011a1cd..e624611 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -89,9 +89,6 @@ config GENERIC_TIME_VSYSCALL
 	bool
 	default y
 
-config HAVE_LEGACY_PER_CPU_AREA
-	def_bool y
-
 config HAVE_SETUP_PER_CPU_AREA
 	def_bool y
 
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 5d77c1e..bc1ef4a 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -855,18 +855,6 @@ identify_cpu (struct cpuinfo_ia64 *c)
 }
 
 /*
- * In UP configuration, setup_per_cpu_areas() is defined in
- * include/linux/percpu.h
- */
-#ifdef CONFIG_SMP
-void __init
-setup_per_cpu_areas (void)
-{
-	/* start_kernel() requires this... */
-}
-#endif
-
-/*
  * Do the following calculations:
  *
  * 1. the max. cache line size.
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 351da0a..54bf540 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -163,11 +163,11 @@ per_cpu_init (void)
 	first_time = false;
 
 	/*
-	 * get_free_pages() cannot be used before cpu_init() done.  BSP
-	 * allocates "NR_CPUS" pages for all CPUs to avoid that AP calls
-	 * get_zeroed_page().
+	 * get_free_pages() cannot be used before cpu_init() done.
+	 * BSP allocates PERCPU_PAGE_SIZE bytes for all possible CPUs
+	 * to avoid that AP calls get_zeroed_page().
 	 */
-	for (cpu = 0; cpu < NR_CPUS; cpu++) {
+	for_each_possible_cpu(cpu) {
 		void *src = cpu == 0 ? cpu0_data : __phys_per_cpu_start;
 
 		memcpy(cpu_data, src, __per_cpu_end - __per_cpu_start);
@@ -196,9 +196,57 @@ skip:
 static inline void
 alloc_per_cpu_data(void)
 {
-	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS,
+	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * num_possible_cpus(),
 				   PERCPU_PAGE_SIZE, __pa(MAX_DMA_ADDRESS));
 }
+
+/**
+ * setup_per_cpu_areas - setup percpu areas
+ *
+ * Arch code has already allocated and initialized percpu areas.  All
+ * this function has to do is to teach the determined layout to the
+ * dynamic percpu allocator, which happens to be more complex than
+ * creating whole new ones using helpers.
+ */
+void __init
+setup_per_cpu_areas(void)
+{
+	struct pcpu_alloc_info *ai;
+	struct pcpu_group_info *gi;
+	unsigned int cpu;
+	ssize_t static_size, reserved_size, dyn_size;
+	int rc;
+
+	ai = pcpu_alloc_alloc_info(1, num_possible_cpus());
+	if (!ai)
+		panic("failed to allocate pcpu_alloc_info");
+	gi = &ai->groups[0];
+
+	/* units are assigned consecutively to possible cpus */
+	for_each_possible_cpu(cpu)
+		gi->cpu_map[gi->nr_units++] = cpu;
+
+	/* set parameters */
+	static_size = __per_cpu_end - __per_cpu_start;
+	reserved_size = PERCPU_MODULE_RESERVE;
+	dyn_size = PERCPU_PAGE_SIZE - static_size - reserved_size;
+	if (dyn_size < 0)
+		panic("percpu area overflow static=%zd reserved=%zd\n",
+		      static_size, reserved_size);
+
+	ai->static_size		= static_size;
+	ai->reserved_size	= reserved_size;
+	ai->dyn_size		= dyn_size;
+	ai->unit_size		= PERCPU_PAGE_SIZE;
+	ai->atom_size		= PAGE_SIZE;
+	ai->alloc_size		= PERCPU_PAGE_SIZE;
+
+	rc = pcpu_setup_first_chunk(ai, __per_cpu_start + __per_cpu_offset[0]);
+	if (rc)
+		panic("failed to setup percpu area (err=%d)", rc);
+
+	pcpu_free_alloc_info(ai);
+}
 #else
 #define alloc_per_cpu_data() do { } while (0)
 #endif /* CONFIG_SMP */
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index 200282b..40e4c1f 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -172,6 +172,91 @@ static void *per_cpu_node_setup(void *cpu_data, int node)
 	return cpu_data;
 }
 
+#ifdef CONFIG_SMP
+/**
+ * setup_per_cpu_areas - setup percpu areas
+ *
+ * Arch code has already allocated and initialized percpu areas.  All
+ * this function has to do is to teach the determined layout to the
+ * dynamic percpu allocator, which happens to be more complex than
+ * creating whole new ones using helpers.
+ */
+void __init setup_per_cpu_areas(void)
+{
+	struct pcpu_alloc_info *ai;
+	struct pcpu_group_info *uninitialized_var(gi);
+	unsigned int *cpu_map;
+	void *base;
+	unsigned long base_offset;
+	unsigned int cpu;
+	ssize_t static_size, reserved_size, dyn_size;
+	int node, prev_node, unit, nr_units, rc;
+
+	ai = pcpu_alloc_alloc_info(MAX_NUMNODES, nr_cpu_ids);
+	if (!ai)
+		panic("failed to allocate pcpu_alloc_info");
+	cpu_map = ai->groups[0].cpu_map;
+
+	/* determine base */
+	base = (void *)ULONG_MAX;
+	for_each_possible_cpu(cpu)
+		base = min(base,
+			   (void *)(__per_cpu_offset[cpu] + __per_cpu_start));
+	base_offset = (void *)__per_cpu_start - base;
+
+	/* build cpu_map, units are grouped by node */
+	unit = 0;
+	for_each_node(node)
+		for_each_possible_cpu(cpu)
+			if (node == node_cpuid[cpu].nid)
+				cpu_map[unit++] = cpu;
+	nr_units = unit;
+
+	/* set basic parameters */
+	static_size = __per_cpu_end - __per_cpu_start;
+	reserved_size = PERCPU_MODULE_RESERVE;
+	dyn_size = PERCPU_PAGE_SIZE - static_size - reserved_size;
+	if (dyn_size < 0)
+		panic("percpu area overflow static=%zd reserved=%zd\n",
+		      static_size, reserved_size);
+
+	ai->static_size		= static_size;
+	ai->reserved_size	= reserved_size;
+	ai->dyn_size		= dyn_size;
+	ai->unit_size		= PERCPU_PAGE_SIZE;
+	ai->atom_size		= PAGE_SIZE;
+	ai->alloc_size		= PERCPU_PAGE_SIZE;
+
+	/*
+	 * CPUs are put into groups according to node.  Walk cpu_map
+	 * and create new groups at node boundaries.
+	 */
+	prev_node = -1;
+	ai->nr_groups = 0;
+	for (unit = 0; unit < nr_units; unit++) {
+		cpu = cpu_map[unit];
+		node = node_cpuid[cpu].nid;
+
+		if (node == prev_node) {
+			gi->nr_units++;
+			continue;
+		}
+		prev_node = node;
+
+		gi = &ai->groups[ai->nr_groups++];
+		gi->nr_units		= 1;
+		gi->base_offset		= __per_cpu_offset[cpu] + base_offset;
+		gi->cpu_map		= &cpu_map[unit];
+	}
+
+	rc = pcpu_setup_first_chunk(ai, base);
+	if (rc)
+		panic("failed to setup percpu area (err=%d)", rc);
+
+	pcpu_free_alloc_info(ai);
+}
+#endif
+
 /**
  * fill_pernode - initialize pernode data.
  * @node: the node id.
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 5/5] percpu: kill legacy percpu allocator
  2009-09-23  5:06 ` Tejun Heo
@ 2009-09-23  5:06   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Ingo Molnar, Rusty Russell, Christoph Lameter

With ia64 converted, there's no arch left which still uses legacy
percpu allocator.  Kill it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Lameter <cl@linux-foundation.org>
---
 include/linux/percpu.h |   24 -------
 kernel/module.c        |  150 ----------------------------------------
 mm/Makefile            |    4 -
 mm/allocpercpu.c       |  177 ------------------------------------------------
 mm/percpu.c            |    2 -
 5 files changed, 0 insertions(+), 357 deletions(-)
 delete mode 100644 mm/allocpercpu.c

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 878836c..5baf5b8 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -34,8 +34,6 @@
 
 #ifdef CONFIG_SMP
 
-#ifndef CONFIG_HAVE_LEGACY_PER_CPU_AREA
-
 /* minimum unit size, also is the maximum supported allocation size */
 #define PCPU_MIN_UNIT_SIZE		PFN_ALIGN(64 << 10)
 
@@ -130,28 +128,6 @@ extern int __init pcpu_page_first_chunk(size_t reserved_size,
 #define per_cpu_ptr(ptr, cpu)	SHIFT_PERCPU_PTR((ptr), per_cpu_offset((cpu)))
 
 extern void *__alloc_reserved_percpu(size_t size, size_t align);
-
-#else /* CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
-struct percpu_data {
-	void *ptrs[1];
-};
-
-/* pointer disguising messes up the kmemleak objects tracking */
-#ifndef CONFIG_DEBUG_KMEMLEAK
-#define __percpu_disguise(pdata) (struct percpu_data *)~(unsigned long)(pdata)
-#else
-#define __percpu_disguise(pdata) (struct percpu_data *)(pdata)
-#endif
-
-#define per_cpu_ptr(ptr, cpu)						\
-({									\
-        struct percpu_data *__p = __percpu_disguise(ptr);		\
-        (__typeof__(ptr))__p->ptrs[(cpu)];				\
-})
-
-#endif /* CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
 extern void *__alloc_percpu(size_t size, size_t align);
 extern void free_percpu(void *__pdata);
 
diff --git a/kernel/module.c b/kernel/module.c
index e6bc4b2..7fd81d8 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -370,8 +370,6 @@ EXPORT_SYMBOL_GPL(find_module);
 
 #ifdef CONFIG_SMP
 
-#ifndef CONFIG_HAVE_LEGACY_PER_CPU_AREA
-
 static void *percpu_modalloc(unsigned long size, unsigned long align,
 			     const char *name)
 {
@@ -395,154 +393,6 @@ static void percpu_modfree(void *freeme)
 	free_percpu(freeme);
 }
 
-#else /* ... CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
-/* Number of blocks used and allocated. */
-static unsigned int pcpu_num_used, pcpu_num_allocated;
-/* Size of each block.  -ve means used. */
-static int *pcpu_size;
-
-static int split_block(unsigned int i, unsigned short size)
-{
-	/* Reallocation required? */
-	if (pcpu_num_used + 1 > pcpu_num_allocated) {
-		int *new;
-
-		new = krealloc(pcpu_size, sizeof(new[0])*pcpu_num_allocated*2,
-			       GFP_KERNEL);
-		if (!new)
-			return 0;
-
-		pcpu_num_allocated *= 2;
-		pcpu_size = new;
-	}
-
-	/* Insert a new subblock */
-	memmove(&pcpu_size[i+1], &pcpu_size[i],
-		sizeof(pcpu_size[0]) * (pcpu_num_used - i));
-	pcpu_num_used++;
-
-	pcpu_size[i+1] -= size;
-	pcpu_size[i] = size;
-	return 1;
-}
-
-static inline unsigned int block_size(int val)
-{
-	if (val < 0)
-		return -val;
-	return val;
-}
-
-static void *percpu_modalloc(unsigned long size, unsigned long align,
-			     const char *name)
-{
-	unsigned long extra;
-	unsigned int i;
-	void *ptr;
-	int cpu;
-
-	if (align > PAGE_SIZE) {
-		printk(KERN_WARNING "%s: per-cpu alignment %li > %li\n",
-		       name, align, PAGE_SIZE);
-		align = PAGE_SIZE;
-	}
-
-	ptr = __per_cpu_start;
-	for (i = 0; i < pcpu_num_used; ptr += block_size(pcpu_size[i]), i++) {
-		/* Extra for alignment requirement. */
-		extra = ALIGN((unsigned long)ptr, align) - (unsigned long)ptr;
-		BUG_ON(i == 0 && extra != 0);
-
-		if (pcpu_size[i] < 0 || pcpu_size[i] < extra + size)
-			continue;
-
-		/* Transfer extra to previous block. */
-		if (pcpu_size[i-1] < 0)
-			pcpu_size[i-1] -= extra;
-		else
-			pcpu_size[i-1] += extra;
-		pcpu_size[i] -= extra;
-		ptr += extra;
-
-		/* Split block if warranted */
-		if (pcpu_size[i] - size > sizeof(unsigned long))
-			if (!split_block(i, size))
-				return NULL;
-
-		/* add the per-cpu scanning areas */
-		for_each_possible_cpu(cpu)
-			kmemleak_alloc(ptr + per_cpu_offset(cpu), size, 0,
-				       GFP_KERNEL);
-
-		/* Mark allocated */
-		pcpu_size[i] = -pcpu_size[i];
-		return ptr;
-	}
-
-	printk(KERN_WARNING "Could not allocate %lu bytes percpu data\n",
-	       size);
-	return NULL;
-}
-
-static void percpu_modfree(void *freeme)
-{
-	unsigned int i;
-	void *ptr = __per_cpu_start + block_size(pcpu_size[0]);
-	int cpu;
-
-	/* First entry is core kernel percpu data. */
-	for (i = 1; i < pcpu_num_used; ptr += block_size(pcpu_size[i]), i++) {
-		if (ptr == freeme) {
-			pcpu_size[i] = -pcpu_size[i];
-			goto free;
-		}
-	}
-	BUG();
-
- free:
-	/* remove the per-cpu scanning areas */
-	for_each_possible_cpu(cpu)
-		kmemleak_free(freeme + per_cpu_offset(cpu));
-
-	/* Merge with previous? */
-	if (pcpu_size[i-1] >= 0) {
-		pcpu_size[i-1] += pcpu_size[i];
-		pcpu_num_used--;
-		memmove(&pcpu_size[i], &pcpu_size[i+1],
-			(pcpu_num_used - i) * sizeof(pcpu_size[0]));
-		i--;
-	}
-	/* Merge with next? */
-	if (i+1 < pcpu_num_used && pcpu_size[i+1] >= 0) {
-		pcpu_size[i] += pcpu_size[i+1];
-		pcpu_num_used--;
-		memmove(&pcpu_size[i+1], &pcpu_size[i+2],
-			(pcpu_num_used - (i+1)) * sizeof(pcpu_size[0]));
-	}
-}
-
-static int percpu_modinit(void)
-{
-	pcpu_num_used = 2;
-	pcpu_num_allocated = 2;
-	pcpu_size = kmalloc(sizeof(pcpu_size[0]) * pcpu_num_allocated,
-			    GFP_KERNEL);
-	/* Static in-kernel percpu data (used). */
-	pcpu_size[0] = -(__per_cpu_end-__per_cpu_start);
-	/* Free room. */
-	pcpu_size[1] = PERCPU_ENOUGH_ROOM + pcpu_size[0];
-	if (pcpu_size[1] < 0) {
-		printk(KERN_ERR "No per-cpu room for modules.\n");
-		pcpu_num_used = 1;
-	}
-
-	return 0;
-}
-__initcall(percpu_modinit);
-
-#endif /* CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
 static unsigned int find_pcpusec(Elf_Ehdr *hdr,
 				 Elf_Shdr *sechdrs,
 				 const char *secstrings)
diff --git a/mm/Makefile b/mm/Makefile
index 728a9fd..3230eb5 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -34,11 +34,7 @@ obj-$(CONFIG_FAILSLAB) += failslab.o
 obj-$(CONFIG_MEMORY_HOTPLUG) += memory_hotplug.o
 obj-$(CONFIG_FS_XIP) += filemap_xip.o
 obj-$(CONFIG_MIGRATION) += migrate.o
-ifndef CONFIG_HAVE_LEGACY_PER_CPU_AREA
 obj-$(CONFIG_SMP) += percpu.o
-else
-obj-$(CONFIG_SMP) += allocpercpu.o
-endif
 obj-$(CONFIG_QUICKLIST) += quicklist.o
 obj-$(CONFIG_CGROUP_MEM_RES_CTLR) += memcontrol.o page_cgroup.o
 obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
diff --git a/mm/allocpercpu.c b/mm/allocpercpu.c
deleted file mode 100644
index df34cea..0000000
--- a/mm/allocpercpu.c
+++ /dev/null
@@ -1,177 +0,0 @@
-/*
- * linux/mm/allocpercpu.c
- *
- * Separated from slab.c August 11, 2006 Christoph Lameter
- */
-#include <linux/mm.h>
-#include <linux/module.h>
-#include <linux/bootmem.h>
-#include <asm/sections.h>
-
-#ifndef cache_line_size
-#define cache_line_size()	L1_CACHE_BYTES
-#endif
-
-/**
- * percpu_depopulate - depopulate per-cpu data for given cpu
- * @__pdata: per-cpu data to depopulate
- * @cpu: depopulate per-cpu data for this cpu
- *
- * Depopulating per-cpu data for a cpu going offline would be a typical
- * use case. You need to register a cpu hotplug handler for that purpose.
- */
-static void percpu_depopulate(void *__pdata, int cpu)
-{
-	struct percpu_data *pdata = __percpu_disguise(__pdata);
-
-	kfree(pdata->ptrs[cpu]);
-	pdata->ptrs[cpu] = NULL;
-}
-
-/**
- * percpu_depopulate_mask - depopulate per-cpu data for some cpu's
- * @__pdata: per-cpu data to depopulate
- * @mask: depopulate per-cpu data for cpu's selected through mask bits
- */
-static void __percpu_depopulate_mask(void *__pdata, const cpumask_t *mask)
-{
-	int cpu;
-	for_each_cpu_mask_nr(cpu, *mask)
-		percpu_depopulate(__pdata, cpu);
-}
-
-#define percpu_depopulate_mask(__pdata, mask) \
-	__percpu_depopulate_mask((__pdata), &(mask))
-
-/**
- * percpu_populate - populate per-cpu data for given cpu
- * @__pdata: per-cpu data to populate further
- * @size: size of per-cpu object
- * @gfp: may sleep or not etc.
- * @cpu: populate per-data for this cpu
- *
- * Populating per-cpu data for a cpu coming online would be a typical
- * use case. You need to register a cpu hotplug handler for that purpose.
- * Per-cpu object is populated with zeroed buffer.
- */
-static void *percpu_populate(void *__pdata, size_t size, gfp_t gfp, int cpu)
-{
-	struct percpu_data *pdata = __percpu_disguise(__pdata);
-	int node = cpu_to_node(cpu);
-
-	/*
-	 * We should make sure each CPU gets private memory.
-	 */
-	size = roundup(size, cache_line_size());
-
-	BUG_ON(pdata->ptrs[cpu]);
-	if (node_online(node))
-		pdata->ptrs[cpu] = kmalloc_node(size, gfp|__GFP_ZERO, node);
-	else
-		pdata->ptrs[cpu] = kzalloc(size, gfp);
-	return pdata->ptrs[cpu];
-}
-
-/**
- * percpu_populate_mask - populate per-cpu data for more cpu's
- * @__pdata: per-cpu data to populate further
- * @size: size of per-cpu object
- * @gfp: may sleep or not etc.
- * @mask: populate per-cpu data for cpu's selected through mask bits
- *
- * Per-cpu objects are populated with zeroed buffers.
- */
-static int __percpu_populate_mask(void *__pdata, size_t size, gfp_t gfp,
-				  cpumask_t *mask)
-{
-	cpumask_t populated;
-	int cpu;
-
-	cpus_clear(populated);
-	for_each_cpu_mask_nr(cpu, *mask)
-		if (unlikely(!percpu_populate(__pdata, size, gfp, cpu))) {
-			__percpu_depopulate_mask(__pdata, &populated);
-			return -ENOMEM;
-		} else
-			cpu_set(cpu, populated);
-	return 0;
-}
-
-#define percpu_populate_mask(__pdata, size, gfp, mask) \
-	__percpu_populate_mask((__pdata), (size), (gfp), &(mask))
-
-/**
- * alloc_percpu - initial setup of per-cpu data
- * @size: size of per-cpu object
- * @align: alignment
- *
- * Allocate dynamic percpu area.  Percpu objects are populated with
- * zeroed buffers.
- */
-void *__alloc_percpu(size_t size, size_t align)
-{
-	/*
-	 * We allocate whole cache lines to avoid false sharing
-	 */
-	size_t sz = roundup(nr_cpu_ids * sizeof(void *), cache_line_size());
-	void *pdata = kzalloc(sz, GFP_KERNEL);
-	void *__pdata = __percpu_disguise(pdata);
-
-	/*
-	 * Can't easily make larger alignment work with kmalloc.  WARN
-	 * on it.  Larger alignment should only be used for module
-	 * percpu sections on SMP for which this path isn't used.
-	 */
-	WARN_ON_ONCE(align > SMP_CACHE_BYTES);
-
-	if (unlikely(!pdata))
-		return NULL;
-	if (likely(!__percpu_populate_mask(__pdata, size, GFP_KERNEL,
-					   &cpu_possible_map)))
-		return __pdata;
-	kfree(pdata);
-	return NULL;
-}
-EXPORT_SYMBOL_GPL(__alloc_percpu);
-
-/**
- * free_percpu - final cleanup of per-cpu data
- * @__pdata: object to clean up
- *
- * We simply clean up any per-cpu object left. No need for the client to
- * track and specify through a bis mask which per-cpu objects are to free.
- */
-void free_percpu(void *__pdata)
-{
-	if (unlikely(!__pdata))
-		return;
-	__percpu_depopulate_mask(__pdata, cpu_possible_mask);
-	kfree(__percpu_disguise(__pdata));
-}
-EXPORT_SYMBOL_GPL(free_percpu);
-
-/*
- * Generic percpu area setup.
- */
-#ifndef CONFIG_HAVE_SETUP_PER_CPU_AREA
-unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
-
-EXPORT_SYMBOL(__per_cpu_offset);
-
-void __init setup_per_cpu_areas(void)
-{
-	unsigned long size, i;
-	char *ptr;
-	unsigned long nr_possible_cpus = num_possible_cpus();
-
-	/* Copy section for each CPU (we discard the original) */
-	size = ALIGN(PERCPU_ENOUGH_ROOM, PAGE_SIZE);
-	ptr = alloc_bootmem_pages(size * nr_possible_cpus);
-
-	for_each_possible_cpu(i) {
-		__per_cpu_offset[i] = ptr - __per_cpu_start;
-		memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start);
-		ptr += size;
-	}
-}
-#endif /* CONFIG_HAVE_SETUP_PER_CPU_AREA */
diff --git a/mm/percpu.c b/mm/percpu.c
index 43d8cac..adbc5a4 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -46,8 +46,6 @@
  *
  * To use this allocator, arch code should do the followings.
  *
- * - drop CONFIG_HAVE_LEGACY_PER_CPU_AREA
- *
  * - define __addr_to_pcpu_ptr() and __pcpu_ptr_to_addr() to translate
  *   regular address to percpu pointer and back if they need to be
  *   different from the default
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 5/5] percpu: kill legacy percpu allocator
@ 2009-09-23  5:06   ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23  5:06 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel
  Cc: Tejun Heo, Ingo Molnar, Rusty Russell, Christoph Lameter

With ia64 converted, there's no arch left which still uses legacy
percpu allocator.  Kill it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Lameter <cl@linux-foundation.org>
---
 include/linux/percpu.h |   24 -------
 kernel/module.c        |  150 ----------------------------------------
 mm/Makefile            |    4 -
 mm/allocpercpu.c       |  177 ------------------------------------------------
 mm/percpu.c            |    2 -
 5 files changed, 0 insertions(+), 357 deletions(-)
 delete mode 100644 mm/allocpercpu.c

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 878836c..5baf5b8 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -34,8 +34,6 @@
 
 #ifdef CONFIG_SMP
 
-#ifndef CONFIG_HAVE_LEGACY_PER_CPU_AREA
-
 /* minimum unit size, also is the maximum supported allocation size */
 #define PCPU_MIN_UNIT_SIZE		PFN_ALIGN(64 << 10)
 
@@ -130,28 +128,6 @@ extern int __init pcpu_page_first_chunk(size_t reserved_size,
 #define per_cpu_ptr(ptr, cpu)	SHIFT_PERCPU_PTR((ptr), per_cpu_offset((cpu)))
 
 extern void *__alloc_reserved_percpu(size_t size, size_t align);
-
-#else /* CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
-struct percpu_data {
-	void *ptrs[1];
-};
-
-/* pointer disguising messes up the kmemleak objects tracking */
-#ifndef CONFIG_DEBUG_KMEMLEAK
-#define __percpu_disguise(pdata) (struct percpu_data *)~(unsigned long)(pdata)
-#else
-#define __percpu_disguise(pdata) (struct percpu_data *)(pdata)
-#endif
-
-#define per_cpu_ptr(ptr, cpu)						\
-({									\
-        struct percpu_data *__p = __percpu_disguise(ptr);		\
-        (__typeof__(ptr))__p->ptrs[(cpu)];				\
-})
-
-#endif /* CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
 extern void *__alloc_percpu(size_t size, size_t align);
 extern void free_percpu(void *__pdata);
 
diff --git a/kernel/module.c b/kernel/module.c
index e6bc4b2..7fd81d8 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -370,8 +370,6 @@ EXPORT_SYMBOL_GPL(find_module);
 
 #ifdef CONFIG_SMP
 
-#ifndef CONFIG_HAVE_LEGACY_PER_CPU_AREA
-
 static void *percpu_modalloc(unsigned long size, unsigned long align,
 			     const char *name)
 {
@@ -395,154 +393,6 @@ static void percpu_modfree(void *freeme)
 	free_percpu(freeme);
 }
 
-#else /* ... CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
-/* Number of blocks used and allocated. */
-static unsigned int pcpu_num_used, pcpu_num_allocated;
-/* Size of each block.  -ve means used. */
-static int *pcpu_size;
-
-static int split_block(unsigned int i, unsigned short size)
-{
-	/* Reallocation required? */
-	if (pcpu_num_used + 1 > pcpu_num_allocated) {
-		int *new;
-
-		new = krealloc(pcpu_size, sizeof(new[0])*pcpu_num_allocated*2,
-			       GFP_KERNEL);
-		if (!new)
-			return 0;
-
-		pcpu_num_allocated *= 2;
-		pcpu_size = new;
-	}
-
-	/* Insert a new subblock */
-	memmove(&pcpu_size[i+1], &pcpu_size[i],
-		sizeof(pcpu_size[0]) * (pcpu_num_used - i));
-	pcpu_num_used++;
-
-	pcpu_size[i+1] -= size;
-	pcpu_size[i] = size;
-	return 1;
-}
-
-static inline unsigned int block_size(int val)
-{
-	if (val < 0)
-		return -val;
-	return val;
-}
-
-static void *percpu_modalloc(unsigned long size, unsigned long align,
-			     const char *name)
-{
-	unsigned long extra;
-	unsigned int i;
-	void *ptr;
-	int cpu;
-
-	if (align > PAGE_SIZE) {
-		printk(KERN_WARNING "%s: per-cpu alignment %li > %li\n",
-		       name, align, PAGE_SIZE);
-		align = PAGE_SIZE;
-	}
-
-	ptr = __per_cpu_start;
-	for (i = 0; i < pcpu_num_used; ptr += block_size(pcpu_size[i]), i++) {
-		/* Extra for alignment requirement. */
-		extra = ALIGN((unsigned long)ptr, align) - (unsigned long)ptr;
-		BUG_ON(i == 0 && extra != 0);
-
-		if (pcpu_size[i] < 0 || pcpu_size[i] < extra + size)
-			continue;
-
-		/* Transfer extra to previous block. */
-		if (pcpu_size[i-1] < 0)
-			pcpu_size[i-1] -= extra;
-		else
-			pcpu_size[i-1] += extra;
-		pcpu_size[i] -= extra;
-		ptr += extra;
-
-		/* Split block if warranted */
-		if (pcpu_size[i] - size > sizeof(unsigned long))
-			if (!split_block(i, size))
-				return NULL;
-
-		/* add the per-cpu scanning areas */
-		for_each_possible_cpu(cpu)
-			kmemleak_alloc(ptr + per_cpu_offset(cpu), size, 0,
-				       GFP_KERNEL);
-
-		/* Mark allocated */
-		pcpu_size[i] = -pcpu_size[i];
-		return ptr;
-	}
-
-	printk(KERN_WARNING "Could not allocate %lu bytes percpu data\n",
-	       size);
-	return NULL;
-}
-
-static void percpu_modfree(void *freeme)
-{
-	unsigned int i;
-	void *ptr = __per_cpu_start + block_size(pcpu_size[0]);
-	int cpu;
-
-	/* First entry is core kernel percpu data. */
-	for (i = 1; i < pcpu_num_used; ptr += block_size(pcpu_size[i]), i++) {
-		if (ptr == freeme) {
-			pcpu_size[i] = -pcpu_size[i];
-			goto free;
-		}
-	}
-	BUG();
-
- free:
-	/* remove the per-cpu scanning areas */
-	for_each_possible_cpu(cpu)
-		kmemleak_free(freeme + per_cpu_offset(cpu));
-
-	/* Merge with previous? */
-	if (pcpu_size[i-1] >= 0) {
-		pcpu_size[i-1] += pcpu_size[i];
-		pcpu_num_used--;
-		memmove(&pcpu_size[i], &pcpu_size[i+1],
-			(pcpu_num_used - i) * sizeof(pcpu_size[0]));
-		i--;
-	}
-	/* Merge with next? */
-	if (i+1 < pcpu_num_used && pcpu_size[i+1] >= 0) {
-		pcpu_size[i] += pcpu_size[i+1];
-		pcpu_num_used--;
-		memmove(&pcpu_size[i+1], &pcpu_size[i+2],
-			(pcpu_num_used - (i+1)) * sizeof(pcpu_size[0]));
-	}
-}
-
-static int percpu_modinit(void)
-{
-	pcpu_num_used = 2;
-	pcpu_num_allocated = 2;
-	pcpu_size = kmalloc(sizeof(pcpu_size[0]) * pcpu_num_allocated,
-			    GFP_KERNEL);
-	/* Static in-kernel percpu data (used). */
-	pcpu_size[0] = -(__per_cpu_end-__per_cpu_start);
-	/* Free room. */
-	pcpu_size[1] = PERCPU_ENOUGH_ROOM + pcpu_size[0];
-	if (pcpu_size[1] < 0) {
-		printk(KERN_ERR "No per-cpu room for modules.\n");
-		pcpu_num_used = 1;
-	}
-
-	return 0;
-}
-__initcall(percpu_modinit);
-
-#endif /* CONFIG_HAVE_LEGACY_PER_CPU_AREA */
-
 static unsigned int find_pcpusec(Elf_Ehdr *hdr,
 				 Elf_Shdr *sechdrs,
 				 const char *secstrings)
diff --git a/mm/Makefile b/mm/Makefile
index 728a9fd..3230eb5 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -34,11 +34,7 @@ obj-$(CONFIG_FAILSLAB) += failslab.o
 obj-$(CONFIG_MEMORY_HOTPLUG) += memory_hotplug.o
 obj-$(CONFIG_FS_XIP) += filemap_xip.o
 obj-$(CONFIG_MIGRATION) += migrate.o
-ifndef CONFIG_HAVE_LEGACY_PER_CPU_AREA
 obj-$(CONFIG_SMP) += percpu.o
-else
-obj-$(CONFIG_SMP) += allocpercpu.o
-endif
 obj-$(CONFIG_QUICKLIST) += quicklist.o
 obj-$(CONFIG_CGROUP_MEM_RES_CTLR) += memcontrol.o page_cgroup.o
 obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
diff --git a/mm/allocpercpu.c b/mm/allocpercpu.c
deleted file mode 100644
index df34cea..0000000
--- a/mm/allocpercpu.c
+++ /dev/null
@@ -1,177 +0,0 @@
-/*
- * linux/mm/allocpercpu.c
- *
- * Separated from slab.c August 11, 2006 Christoph Lameter
- */
-#include <linux/mm.h>
-#include <linux/module.h>
-#include <linux/bootmem.h>
-#include <asm/sections.h>
-
-#ifndef cache_line_size
-#define cache_line_size()	L1_CACHE_BYTES
-#endif
-
-/**
- * percpu_depopulate - depopulate per-cpu data for given cpu
- * @__pdata: per-cpu data to depopulate
- * @cpu: depopulate per-cpu data for this cpu
- *
- * Depopulating per-cpu data for a cpu going offline would be a typical
- * use case. You need to register a cpu hotplug handler for that purpose.
- */
-static void percpu_depopulate(void *__pdata, int cpu)
-{
-	struct percpu_data *pdata = __percpu_disguise(__pdata);
-
-	kfree(pdata->ptrs[cpu]);
-	pdata->ptrs[cpu] = NULL;
-}
-
-/**
- * percpu_depopulate_mask - depopulate per-cpu data for some cpu's
- * @__pdata: per-cpu data to depopulate
- * @mask: depopulate per-cpu data for cpu's selected through mask bits
- */
-static void __percpu_depopulate_mask(void *__pdata, const cpumask_t *mask)
-{
-	int cpu;
-	for_each_cpu_mask_nr(cpu, *mask)
-		percpu_depopulate(__pdata, cpu);
-}
-
-#define percpu_depopulate_mask(__pdata, mask) \
-	__percpu_depopulate_mask((__pdata), &(mask))
-
-/**
- * percpu_populate - populate per-cpu data for given cpu
- * @__pdata: per-cpu data to populate further
- * @size: size of per-cpu object
- * @gfp: may sleep or not etc.
- * @cpu: populate per-data for this cpu
- *
- * Populating per-cpu data for a cpu coming online would be a typical
- * use case. You need to register a cpu hotplug handler for that purpose.
- * Per-cpu object is populated with zeroed buffer.
- */
-static void *percpu_populate(void *__pdata, size_t size, gfp_t gfp, int cpu)
-{
-	struct percpu_data *pdata = __percpu_disguise(__pdata);
-	int node = cpu_to_node(cpu);
-
-	/*
-	 * We should make sure each CPU gets private memory.
-	 */
-	size = roundup(size, cache_line_size());
-
-	BUG_ON(pdata->ptrs[cpu]);
-	if (node_online(node))
-		pdata->ptrs[cpu] = kmalloc_node(size, gfp|__GFP_ZERO, node);
-	else
-		pdata->ptrs[cpu] = kzalloc(size, gfp);
-	return pdata->ptrs[cpu];
-}
-
-/**
- * percpu_populate_mask - populate per-cpu data for more cpu's
- * @__pdata: per-cpu data to populate further
- * @size: size of per-cpu object
- * @gfp: may sleep or not etc.
- * @mask: populate per-cpu data for cpu's selected through mask bits
- *
- * Per-cpu objects are populated with zeroed buffers.
- */
-static int __percpu_populate_mask(void *__pdata, size_t size, gfp_t gfp,
-				  cpumask_t *mask)
-{
-	cpumask_t populated;
-	int cpu;
-
-	cpus_clear(populated);
-	for_each_cpu_mask_nr(cpu, *mask)
-		if (unlikely(!percpu_populate(__pdata, size, gfp, cpu))) {
-			__percpu_depopulate_mask(__pdata, &populated);
-			return -ENOMEM;
-		} else
-			cpu_set(cpu, populated);
-	return 0;
-}
-
-#define percpu_populate_mask(__pdata, size, gfp, mask) \
-	__percpu_populate_mask((__pdata), (size), (gfp), &(mask))
-
-/**
- * alloc_percpu - initial setup of per-cpu data
- * @size: size of per-cpu object
- * @align: alignment
- *
- * Allocate dynamic percpu area.  Percpu objects are populated with
- * zeroed buffers.
- */
-void *__alloc_percpu(size_t size, size_t align)
-{
-	/*
-	 * We allocate whole cache lines to avoid false sharing
-	 */
-	size_t sz = roundup(nr_cpu_ids * sizeof(void *), cache_line_size());
-	void *pdata = kzalloc(sz, GFP_KERNEL);
-	void *__pdata = __percpu_disguise(pdata);
-
-	/*
-	 * Can't easily make larger alignment work with kmalloc.  WARN
-	 * on it.  Larger alignment should only be used for module
-	 * percpu sections on SMP for which this path isn't used.
-	 */
-	WARN_ON_ONCE(align > SMP_CACHE_BYTES);
-
-	if (unlikely(!pdata))
-		return NULL;
-	if (likely(!__percpu_populate_mask(__pdata, size, GFP_KERNEL,
-					   &cpu_possible_map)))
-		return __pdata;
-	kfree(pdata);
-	return NULL;
-}
-EXPORT_SYMBOL_GPL(__alloc_percpu);
-
-/**
- * free_percpu - final cleanup of per-cpu data
- * @__pdata: object to clean up
- *
- * We simply clean up any per-cpu object left. No need for the client to
- * track and specify through a bis mask which per-cpu objects are to free.
- */
-void free_percpu(void *__pdata)
-{
-	if (unlikely(!__pdata))
-		return;
-	__percpu_depopulate_mask(__pdata, cpu_possible_mask);
-	kfree(__percpu_disguise(__pdata));
-}
-EXPORT_SYMBOL_GPL(free_percpu);
-
-/*
- * Generic percpu area setup.
- */
-#ifndef CONFIG_HAVE_SETUP_PER_CPU_AREA
-unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
-
-EXPORT_SYMBOL(__per_cpu_offset);
-
-void __init setup_per_cpu_areas(void)
-{
-	unsigned long size, i;
-	char *ptr;
-	unsigned long nr_possible_cpus = num_possible_cpus();
-
-	/* Copy section for each CPU (we discard the original) */
-	size = ALIGN(PERCPU_ENOUGH_ROOM, PAGE_SIZE);
-	ptr = alloc_bootmem_pages(size * nr_possible_cpus);
-
-	for_each_possible_cpu(i) {
-		__per_cpu_offset[i] = ptr - __per_cpu_start;
-		memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start);
-		ptr += size;
-	}
-}
-#endif /* CONFIG_HAVE_SETUP_PER_CPU_AREA */
diff --git a/mm/percpu.c b/mm/percpu.c
index 43d8cac..adbc5a4 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -46,8 +46,6 @@
  *
  * To use this allocator, arch code should do the followings.
  *
- * - drop CONFIG_HAVE_LEGACY_PER_CPU_AREA
- *
  * - define __addr_to_pcpu_ptr() and __pcpu_ptr_to_addr() to translate
  *   regular address to percpu pointer and back if they need to be
  *   different from the default
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH 5/5] percpu: kill legacy percpu allocator
  2009-09-23  5:06   ` Tejun Heo
@ 2009-09-23 11:22     ` Rusty Russell
  -1 siblings, 0 replies; 72+ messages in thread
From: Rusty Russell @ 2009-09-23 11:10 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Christoph Lameter, linux-kernel

On Wed, 23 Sep 2009 02:36:22 pm Tejun Heo wrote:
> With ia64 converted, there's no arch left which still uses legacy
> percpu allocator.  Kill it.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Cc: Christoph Lameter <cl@linux-foundation.org>

Delightedly-acked-by: Rusty Russell <rusty@rustcorp.com.au>

Thanks!
Rusty.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 5/5] percpu: kill legacy percpu allocator
@ 2009-09-23 11:22     ` Rusty Russell
  0 siblings, 0 replies; 72+ messages in thread
From: Rusty Russell @ 2009-09-23 11:22 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Christoph Lameter, linux-kernel

On Wed, 23 Sep 2009 02:36:22 pm Tejun Heo wrote:
> With ia64 converted, there's no arch left which still uses legacy
> percpu allocator.  Kill it.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Cc: Christoph Lameter <cl@linux-foundation.org>

Delightedly-acked-by: Rusty Russell <rusty@rustcorp.com.au>

Thanks!
Rusty.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-23  2:11       ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas Tejun Heo
@ 2009-09-23 13:44         ` Christoph Lameter
  -1 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-23 13:44 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Wed, 23 Sep 2009, Tejun Heo wrote:

> On ia64, the first chunk is fixed at PERCPU_PAGE_SIZE.  It's something
> hardwired into the page fault logic and the linker script.  Build will
> fail if the static + reserved area goes over PERCPU_PAGE_SIZE and in
> that case ia64 will need to update the special case page fault logic
> and increase PERCPU_PAGE_SIZE.  The area reserved above is interim
> per-cpu area for cpu0 which is used between head.S and proper percpu
> area setup and will be ditched once initialization is complete.

You did not answer my question.

The local percpu variables are accessed via a static per cpu
virtual mapping. You cannot place per cpu variables outside of that
virtual address range of PERCPU_PAGE_SIZE.

What happens if the percpu allocator allocates more data than available in
the reserved area?


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu
@ 2009-09-23 13:44         ` Christoph Lameter
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-23 13:44 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Wed, 23 Sep 2009, Tejun Heo wrote:

> On ia64, the first chunk is fixed at PERCPU_PAGE_SIZE.  It's something
> hardwired into the page fault logic and the linker script.  Build will
> fail if the static + reserved area goes over PERCPU_PAGE_SIZE and in
> that case ia64 will need to update the special case page fault logic
> and increase PERCPU_PAGE_SIZE.  The area reserved above is interim
> per-cpu area for cpu0 which is used between head.S and proper percpu
> area setup and will be ditched once initialization is complete.

You did not answer my question.

The local percpu variables are accessed via a static per cpu
virtual mapping. You cannot place per cpu variables outside of that
virtual address range of PERCPU_PAGE_SIZE.

What happens if the percpu allocator allocates more data than available in
the reserved area?


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-23 13:44         ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu Christoph Lameter
@ 2009-09-23 14:01           ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23 14:01 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

Hello,

Christoph Lameter wrote:
> You did not answer my question.

Hmmm...

> The local percpu variables are accessed via a static per cpu
> virtual mapping. You cannot place per cpu variables outside of that
> virtual address range of PERCPU_PAGE_SIZE.
> 
> What happens if the percpu allocator allocates more data than available in
> the reserved area?

I still don't understand your question.  Static percpu variables are
always allocated from the first chunk inside that PERCPU_PAGE_SIZE
area.  Dynamic allocations can go outside of that but they don't need
any special handling.  What problems are you seeing?
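
For illustration, a minimal sketch of the two cases (hypothetical
names, not from the patches):

	/* static: lives in the first chunk, inside the PERCPU_PAGE_SIZE
	 * window covered by the fixed ar.k3 mapping */
	static DEFINE_PER_CPU(int, foo);

	/* dynamic: may come from a later chunk outside that window and is
	 * reached by offsetting the pointer with __per_cpu_offset[cpu] */
	int *bar = alloc_percpu(int);
	int cpu;

	for_each_possible_cpu(cpu)
		*per_cpu_ptr(bar, cpu) = 0;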

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas
@ 2009-09-23 14:01           ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23 14:01 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

Hello,

Christoph Lameter wrote:
> You did not answer my question.

Hmmm...

> The local percpu variables are accessed via a static per cpu
> virtual mapping. You cannot place per cpu variables outside of that
> virtual address range of PERCPU_PAGE_SIZE.
> 
> What happens if the percpu allocator allocates more data than available in
> the reserved area?

I still don't understand your question.  Static percpu variables are
always allocated from the first chunk inside that PERCPU_PAGE_SIZE
area.  Dynamic allocations can go outside of that but they don't need
any special handling.  What problems are you seeing?

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-23 14:01           ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas Tejun Heo
@ 2009-09-23 17:17             ` Christoph Lameter
  -1 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-23 17:17 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Wed, 23 Sep 2009, Tejun Heo wrote:

> any special handling.  What problems are you seeing?

per cpu variable access on IA64 does not use the percpu_offset for the
calculation of the current per cpu data area. It's using a virtual mapping.

How does the new percpu allocator support this? Does it use different
methods of access for static and dynamic percpu access?



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu
@ 2009-09-23 17:17             ` Christoph Lameter
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-23 17:17 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Wed, 23 Sep 2009, Tejun Heo wrote:

> any special handling.  What problems are you seeing?

per cpu variable access on IA64 does not use the percpu_offset for the
calculation of the current per cpu data area. It's using a virtual mapping.

How does the new percpu allocator support this? Does it use different
methods of access for static and dynamic percpu access?



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-23 17:17             ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu Christoph Lameter
@ 2009-09-23 22:03               ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23 22:03 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

Christoph Lameter wrote:
> On Wed, 23 Sep 2009, Tejun Heo wrote:
> 
>> any special handling.  What problems are you seeing?
> 
> per cpu variable access on IA64 does not use the percpu_offset for the
> calculation of the current per cpu data area. It's using a virtual mapping.
> 
> How does the new percpu allocator support this? Does it use different
> methods of access for static and dynamic percpu access?

That's only when the __ia64_per_cpu_var() macro is used in arch code,
which always references static percpu variables in the kernel image
that fall inside PERCPU_PAGE_SIZE.  For everything else, __my_cpu_offset
is defined as __ia64_per_cpu_var(local_per_cpu_offset) and regular
pointer offsetting is used.
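
To sketch the difference in plain C (a userspace model only; the names
below are made up and are not the kernel macros): one style reaches the
variable through a fixed mapping that always shows the local cpu's copy,
the other adds the cpu's offset to a base pointer.

#include <stdio.h>

#define NR_CPUS 2

struct pcpu_data { long counter; };

static struct pcpu_data real_area[NR_CPUS];	/* the actual per-cpu copies */
static struct pcpu_data *fixed_mapping;		/* "virtual" alias, local cpu only */

int main(void)
{
	int this_cpu = 1;			/* pretend we're running on cpu1 */

	/* style 1: fixed mapping, no offset arithmetic at the access site */
	fixed_mapping = &real_area[this_cpu];
	fixed_mapping->counter++;

	/* style 2: generic path, base pointer plus per-cpu offset */
	char *base = (char *)real_area;
	long offset = this_cpu * sizeof(struct pcpu_data);
	((struct pcpu_data *)(base + offset))->counter++;

	printf("cpu%d counter = %ld\n", this_cpu, real_area[this_cpu].counter);
	return 0;
}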

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas
@ 2009-09-23 22:03               ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-23 22:03 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

Christoph Lameter wrote:
> On Wed, 23 Sep 2009, Tejun Heo wrote:
> 
>> any special handling.  What problems are you seeing?
> 
> per cpu variable access on IA64 does not use the percpu_offset for the
> calculation of the current per cpu data area. It's using a virtual mapping.
> 
> How does the new percpu allocator support this? Does it use different
> methods of access for static and dynamic percpu access?

That's only when the __ia64_per_cpu_var() macro is used in arch code,
which always references static percpu variables in the kernel image
that fall inside PERCPU_PAGE_SIZE.  For everything else, __my_cpu_offset
is defined as __ia64_per_cpu_var(local_per_cpu_offset) and regular
pointer offsetting is used.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-23 22:03               ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas Tejun Heo
@ 2009-09-24  7:36                 ` Christoph Lameter
  -1 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-24  7:36 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Thu, 24 Sep 2009, Tejun Heo wrote:

> > How does the new percpu allocator support this? Does it use different
> > methods of access for static and dynamic percpu access?
>
> That's only when __ia64_per_cpu_var() macro is used in arch code which
> always references static percpu variable in the kernel image which
> falls inside PERCPU_PAGE_SIZE.  For everything else, __my_cpu_offset
> is defined as __ia64_per_cpu_var(local_per_cpu_offset) and regular
> pointer offsetting is used.

So this means that address arithmetic needs to be performed for each
percpu access. The virtual mapping would allow the calculation of the
address at link time. Calculation means that a single atomic instruction
for percpu access won't be possible for ia64.

I can toss my ia64 percpu optimization patches. No point anymore.

Tony: We could then also drop the virtual per cpu mapping. It's only useful
for arch specific code and an alternate method of reference exists.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu
@ 2009-09-24  7:36                 ` Christoph Lameter
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-24  7:36 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Thu, 24 Sep 2009, Tejun Heo wrote:

> > How does the new percpu allocator support this? Does it use different
> > methods of access for static and dynamic percpu access?
>
> That's only when __ia64_per_cpu_var() macro is used in arch code which
> always references static percpu variable in the kernel image which
> falls inside PERCPU_PAGE_SIZE.  For everything else, __my_cpu_offset
> is defined as __ia64_per_cpu_var(local_per_cpu_offset) and regular
> pointer offsetting is used.

So this means that address arithmetic needs to be performed for each
percpu access. The virtual mapping would allow the calculation of the
address at link time. Calculation means that a single atomic instruction
for percpu access won't be possible for ia64.

I can toss my ia64 percpu optimization patches. No point anymore.

Tony: We could then also drop the virtual per cpu mapping. It's only useful
for arch specific code and an alternate method of reference exists.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-24  7:36                 ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu Christoph Lameter
@ 2009-09-24  8:37                   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-24  8:37 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

Hello, Christoph.

Christoph Lameter wrote:
> On Thu, 24 Sep 2009, Tejun Heo wrote:
> 
>>> How does the new percpu allocator support this? Does it use different
>>> methods of access for static and dynamic percpu access?
>> That's only when __ia64_per_cpu_var() macro is used in arch code which
>> always references static percpu variable in the kernel image which
>> falls inside PERCPU_PAGE_SIZE.  For everything else, __my_cpu_offset
>> is defined as __ia64_per_cpu_var(local_per_cpu_offset) and regular
>> pointer offsetting is used.
> 
> So this means that address arithmetic needs to be performed for each
> percpu access. The virtual mapping would allow the calculation of the
> address at link time. Calculation means that a single atomic instruction
> for percpu access won't be possible for ia64.
> 
> I can toss my ia64 percpu optimization patches. No point anymore.
> 
> Tony: We could then also drop the virtual per cpu mapping. It's only useful
> for arch specific code and an alternate method of reference exists.

percpu implementation on ia64 has always been like that.  The problem
with the alternate mapping is that you can't take a pointer to it, as
it would mean a different thing depending on which processor you're on,
and the overall generic percpu implementation expects unique addresses
from percpu access macros.
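
A tiny userspace model of that "unique address" requirement (invented
names, not the kernel API): a pointer to cpu0's copy taken in one
context must still name cpu0's copy when it is dereferenced later,
possibly from another cpu, which an aliased always-local mapping cannot
provide.

#include <assert.h>
#include <stdio.h>

#define NR_CPUS 2

static long counters[NR_CPUS];

/* unique address: names one specific cpu's copy, valid from anywhere */
static long *per_cpu_ptr_model(int cpu)
{
	return &counters[cpu];
}

int main(void)
{
	long *cpu0_copy = per_cpu_ptr_model(0);	/* taken "while on cpu0" */

	/* later, "while on cpu1", the pointer must still mean cpu0's copy */
	*cpu0_copy += 5;
	assert(counters[0] == 5 && counters[1] == 0);

	printf("cpu0=%ld cpu1=%ld\n", counters[0], counters[1]);
	return 0;
}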

ia64 has been and still is the only arch which uses a virtual percpu
mapping.  The biggest benefit would be accesses to the
local_per_cpu_offset.  Whether it's beneficial enough to justify the
complexity, I frankly don't know.

Andrew once also suggested taking advantage of those overlapping
virtual mappings for local percpu accesses.  If the generic code
followed such a design, ia64's virtual mappings would definitely be
more useful, but that means we would need aliased mappings for percpu
areas and addresses would be different for local and remote accesses.
Also, getting it right on machines with virtually mapped caches would
be very painful.  Given that %gs/%fs offsetting is quite efficient on
x86, I don't think changing the generic mechanism is worthwhile.

So, it would be great if we can find a better way to offset addresses
on ia64.  If not, nothing improves or deteriorates performance-wise
with the new implementation.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas
@ 2009-09-24  8:37                   ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-24  8:37 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

Hello, Christoph.

Christoph Lameter wrote:
> On Thu, 24 Sep 2009, Tejun Heo wrote:
> 
>>> How does the new percpu allocator support this? Does it use different
>>> methods of access for static and dynamic percpu access?
>> That's only when __ia64_per_cpu_var() macro is used in arch code which
>> always references static percpu variable in the kernel image which
>> falls inside PERCPU_PAGE_SIZE.  For everything else, __my_cpu_offset
>> is defined as __ia64_per_cpu_var(local_per_cpu_offset) and regular
>> pointer offsetting is used.
> 
> So this means that address arithmetic needs to be performed for each
> percpu access. The virtual mapping would allow the calculation of the
> address at link time. Calculation means that a single atomic instruction
> for percpu access won't be possible for ia64.
> 
> I can toss my ia64 percpu optimization patches. No point anymore.
> 
> Tony: We could then also drop the virtual per cpu mapping. It's only useful
> for arch specific code and an alternate method of reference exists.

percpu implementation on ia64 has always been like that.  The problem
with the alternate mapping is that you can't take a pointer to it, as
it would mean a different thing depending on which processor you're on,
and the overall generic percpu implementation expects unique addresses
from percpu access macros.

ia64 has been and still is the only arch which uses a virtual percpu
mapping.  The biggest benefit would be accesses to the
local_per_cpu_offset.  Whether it's beneficial enough to justify the
complexity, I frankly don't know.

Andrew once also suggested taking advantage of those overlapping
virtual mappings for local percpu accesses.  If the generic code
followed such a design, ia64's virtual mappings would definitely be
more useful, but that means we would need aliased mappings for percpu
areas and addresses would be different for local and remote accesses.
Also, getting it right on machines with virtually mapped caches would
be very painful.  Given that %gs/%fs offsetting is quite efficient on
x86, I don't think changing the generic mechanism is worthwhile.

So, it would be great if we can find a better way to offset addresses
on ia64.  If not, nothing improves or deteriorates performance-wise
with the new implementation.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-24  8:37                   ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas Tejun Heo
@ 2009-09-28 15:12                     ` Christoph Lameter
  -1 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-28 15:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Thu, 24 Sep 2009, Tejun Heo wrote:

> percpu implementation on ia64 has always been like that.  The problem
> with the alternate mapping is that you can't take the pointer to it as
> it would mean a different thing depending on which processor you're on
> and the overall generic percpu implementation expects unique addresses
> from percpu access macros.

The cpu ops patchset uses per cpu addresses that are not relocated to a
certain processor. The relocation is implicit in these instructions and
must be implicit so these operations can be processor atomic.
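
A rough analogy outside the kernel (this is thread-local storage, not
the cpu ops themselves): with __thread on x86-64 the "which copy"
relocation is folded into a single %fs-relative instruction, whereas
explicit base-plus-offset access computes the address first and
modifies it afterwards.  Building with -O2 and reading the generated
assembly shows the difference.

#include <stdio.h>

static __thread long counter;	/* one copy per thread, like percpu */

void bump_implicit(void)
{
	/* typically a single fs-relative add, e.g. addq $1, %fs:counter@tpoff */
	counter++;
}

void bump_explicit(long *base, long offset)
{
	/* explicit arithmetic: the address is computed, then modified */
	*(long *)((char *)base + offset) += 1;
}

int main(void)
{
	long area[1] = { 0 };

	bump_implicit();
	bump_explicit(area, 0);
	printf("implicit=%ld explicit=%ld\n", counter, area[0]);
	return 0;
}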

> ia64 has been and still is the only arch which uses a virtual percpu
> mapping.  The one biggest benefit would be accesses to the
> local_per_cpu_offset.  Whether it's beneficial enough to justify the
> complexity, I frankly don't know.

It's not worth working on given the state of IA64. I talked to Tony at the
Plumbers conference. It may be beneficial to drop the virtual percpu
mapping entirely because that would increase the number of TLB entries
available.

> Andrew once also suggested taking advantage of those overlapping
> virtual mappings for local percpu accesses.  If the generic code
> followed such design, ia64's virtual mappings would definitely be more
> useful, but that means we would need aliased mappings for percpu areas
> and addresses will be different for local and remote accesses.  Also,
> getting it right on machines with virtually mapped caches would be
> very painful.  Given that %gs/%fs offsetting is quite efficient on
> x86, I don't think changing the generic mechanism is worthwhile.

There is no problem with using unrelocated percpu addresses as an
"address" for the cpu ops. The IA64 "virtual" addresses are a stand-in for
the segment registers on IA64.

> So, it would be great if we can find a better way to offset addresses
> on ia64.  If not, nothing improves or deteriorates performance-wise
> with the new implementation.

Dropping the use of the special mapping over time may be the easiest way
to go for IA64.  Percpu RMW ops like this_cpu_add are not possible on
IA64 since no lightweight primitives exist.  We could only avoid the
calculation of the per cpu variable's address.  That would allow
assignment and access to be atomic, but not the RMW instructions, so it
would not be a full per cpu ops implementation anyway.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu
@ 2009-09-28 15:12                     ` Christoph Lameter
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-28 15:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

On Thu, 24 Sep 2009, Tejun Heo wrote:

> percpu implementation on ia64 has always been like that.  The problem
> with the alternate mapping is that you can't take the pointer to it as
> it would mean a different thing depending on which processor you're on
> and the overall generic percpu implementation expects unique addresses
> from percpu access macros.

The cpu ops patchset uses per cpu addresses that are not relocated to a
certain processor. The relocation is implicit in these instructions and
must be implicit so these operations can be processor atomic.

> ia64 has been and still is the only arch which uses a virtual percpu
> mapping.  The one biggest benefit would be accesses to the
> local_per_cpu_offset.  Whether it's beneficial enough to justify the
> complexity, I frankly don't know.

It's not worth working on given the state of IA64. I talked to Tony at the
Plumbers conference. It may be beneficial to drop the virtual percpu
mapping entirely because that would increase the number of TLB entries
available.

> Andrew once also suggested taking advantage of those overlapping
> virtual mappings for local percpu accesses.  If the generic code
> followed such design, ia64's virtual mappings would definitely be more
> useful, but that means we would need aliased mappings for percpu areas
> and addresses will be different for local and remote accesses.  Also,
> getting it right on machines with virtually mapped caches would be
> very painful.  Given that %gs/%fs offsetting is quite efficient on
> x86, I don't think changing the generic mechanism is worthwhile.

There is no problem with using unrelocated percpu addresses as an
"address" for the cpu ops. The IA64 "virtual" addresses are a stand-in for
the segment registers on IA64.

> So, it would be great if we can find a better way to offset addresses
> on ia64.  If not, nothing improves or deteriorates performance-wise
> with the new implementation.

Dropping the use of the special mapping over time may be the easiest way
to go for IA64.  Percpu RMW ops like this_cpu_add are not possible on
IA64 since no lightweight primitives exist.  We could only avoid the
calculation of the per cpu variable's address.  That would allow
assignment and access to be atomic, but not the RMW instructions, so it
would not be a full per cpu ops implementation anyway.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2
  2009-09-23  5:06 ` Tejun Heo
@ 2009-09-29  0:25   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-29  0:25 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

Tejun Heo wrote:
> Hello, all.
> 
> This is the second take of convert-ia64-to-dynamic-percpu patchset.
> Changes from the last take[L] are
> 
> * 0001 now updates ia64 to not define VMALLOC_END as a macro to
>   vmalloc_end instead of disallowing vmalloc_end as a variable name as
>   suggested by Christoph.
> 
> * 0002 added to initialize cpu maps early.  This is necessary to get
>   contig memory model working.
> 
> * 0004 updated so that dyn_size is calculated correctly for contig
>   model.
> 
> This patchset contains the following five patches.
> 
>  0001-ia64-don-t-alias-VMALLOC_END-to-vmalloc_end.patch
>  0002-ia64-initialize-cpu-maps-early.patch
>  0003-ia64-allocate-percpu-area-for-cpu0-like-percpu-areas.patch
>  0004-ia64-convert-to-dynamic-percpu-allocator.patch
>  0005-percpu-kill-legacy-percpu-allocator.patch
> 
> 0001 is misc prep to avoid macro / local variable collision.  0002
> makes ia64 arch code initialize cpu possible and present maps before
> memory initialization.  0003 makes ia64 allocate percpu area for cpu0
> in the same way it does for other cpus.  0004 converts ia64 to dynamic
> percpu allocator and 0005 drops now unused legacy allocator.
> 
> Contig memory model was tested on a 16p Tiger4 machine.  Discontig and
> sparse tested on 4-way SGI altix.  ski seems to be happy with contig
> up/smp.
> 
> This patchset is available in the following git tree.
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git convert-ia64
> 
> The new commit ID is dcc91f19c6662b24f1f4e5878d773244f1079724 and it's
> on top of today's Linus 7fa07729e439a6184bd824746d06a49cca553f15.

Tony, can you please ack ia64 part?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu
@ 2009-09-29  0:25   ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-29  0:25 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

Tejun Heo wrote:
> Hello, all.
> 
> This is the second take of convert-ia64-to-dynamic-percpu patchset.
> Changes from the last take[L] are
> 
> * 0001 now updates ia64 to not define VMALLOC_END as a macro to
>   vmalloc_end instead of disallowing vmalloc_end as a variable name as
>   suggested by Christoph.
> 
> * 0002 added to initialize cpu maps early.  This is necessary to get
>   contig memory model working.
> 
> * 0004 updated so that dyn_size is calculated correctly for contig
>   model.
> 
> This patchset contains the following five patches.
> 
>  0001-ia64-don-t-alias-VMALLOC_END-to-vmalloc_end.patch
>  0002-ia64-initialize-cpu-maps-early.patch
>  0003-ia64-allocate-percpu-area-for-cpu0-like-percpu-areas.patch
>  0004-ia64-convert-to-dynamic-percpu-allocator.patch
>  0005-percpu-kill-legacy-percpu-allocator.patch
> 
> 0001 is misc prep to avoid macro / local variable collision.  0002
> makes ia64 arch code initialize cpu possible and present maps before
> memory initialization.  0003 makes ia64 allocate percpu area for cpu0
> in the same way it does for other cpus.  0004 converts ia64 to dynamic
> percpu allocator and 0005 drops now unused legacy allocator.
> 
> Contig memory model was tested on a 16p Tiger4 machine.  Discontig and
> sparse tested on 4-way SGI altix.  ski seems to be happy with contig
> up/smp.
> 
> This patchset is available in the following git tree.
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git convert-ia64
> 
> The new commit ID is dcc91f19c6662b24f1f4e5878d773244f1079724 and it's
> on top of today's Linus 7fa07729e439a6184bd824746d06a49cca553f15.

Tony, can you please ack ia64 part?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH REPOST 3/5] ia64: allocate percpu area for cpu0 like percpu areas for other cpus
  2009-09-23  5:06 ` Tejun Heo
@ 2009-09-30  1:24   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-30  1:24 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

cpu0 used a special percpu area reserved by the linker, __cpu0_per_cpu,
which is set up early in boot by head.S.  However, this doesn't
guarantee that the area will be on the same node as cpu0, and the
percpu area for cpu0 ends up very far away from the percpu areas for
other cpus, which causes problems for the congruent percpu allocator.

This patch makes percpu area initialization allocate the percpu area
for cpu0 like any other cpu and copy it from __cpu0_per_cpu, which now
resides in the __init area.  This means that for cpu0 the percpu area
is first set up at __cpu0_per_cpu early by head.S and then moved to an
area in the linear mapping during memory initialization, and it's not
allowed to take a pointer to percpu variables between head.S and memory
initialization.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
---
Tony Luck didn't receive this one and couldn't find it in archives
either.  Repost.  Unchanged from the original posting.

 arch/ia64/kernel/vmlinux.lds.S |   11 +++++----
 arch/ia64/mm/contig.c          |   47 +++++++++++++++++++++++++--------------
 arch/ia64/mm/discontig.c       |   35 ++++++++++++++++++++---------
 3 files changed, 60 insertions(+), 33 deletions(-)

diff --git a/arch/ia64/kernel/vmlinux.lds.S b/arch/ia64/kernel/vmlinux.lds.S
index 0a0c77b..1295ba3 100644
--- a/arch/ia64/kernel/vmlinux.lds.S
+++ b/arch/ia64/kernel/vmlinux.lds.S
@@ -166,6 +166,12 @@ SECTIONS
 	}
 #endif

+#ifdef	CONFIG_SMP
+  . = ALIGN(PERCPU_PAGE_SIZE);
+  __cpu0_per_cpu = .;
+  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
+#endif
+
   . = ALIGN(PAGE_SIZE);
   __init_end = .;

@@ -198,11 +204,6 @@ SECTIONS
   data : { } :data
   .data : AT(ADDR(.data) - LOAD_OFFSET)
 	{
-#ifdef	CONFIG_SMP
-  . = ALIGN(PERCPU_PAGE_SIZE);
-		__cpu0_per_cpu = .;
-  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
-#endif
 		INIT_TASK_DATA(PAGE_SIZE)
 		CACHELINE_ALIGNED_DATA(SMP_CACHE_BYTES)
 		READ_MOSTLY_DATA(SMP_CACHE_BYTES)
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 1341437..351da0a 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -154,36 +154,49 @@ static void *cpu_data;
 void * __cpuinit
 per_cpu_init (void)
 {
-	int cpu;
-	static int first_time=1;
+	static bool first_time = true;
+	void *cpu0_data = __cpu0_per_cpu;
+	unsigned int cpu;
+
+	if (!first_time)
+		goto skip;
+	first_time = false;

 	/*
 	 * get_free_pages() cannot be used before cpu_init() done.  BSP
 	 * allocates "NR_CPUS" pages for all CPUs to avoid that AP calls
 	 * get_zeroed_page().
 	 */
-	if (first_time) {
-		void *cpu0_data = __cpu0_per_cpu;
+	for (cpu = 0; cpu < NR_CPUS; cpu++) {
+		void *src = cpu == 0 ? cpu0_data : __phys_per_cpu_start;

-		first_time=0;
+		memcpy(cpu_data, src, __per_cpu_end - __per_cpu_start);
+		__per_cpu_offset[cpu] = (char *)cpu_data - __per_cpu_start;
+		per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];

-		__per_cpu_offset[0] = (char *) cpu0_data - __per_cpu_start;
-		per_cpu(local_per_cpu_offset, 0) = __per_cpu_offset[0];
+		/*
+		 * percpu area for cpu0 is moved from the __init area
+		 * which is setup by head.S and used till this point.
+		 * Update ar.k3.  This move ensures that percpu
+		 * area for cpu0 is on the correct node and its
+		 * virtual address isn't insanely far from other
+		 * percpu areas which is important for congruent
+		 * percpu allocator.
+		 */
+		if (cpu == 0)
+			ia64_set_kr(IA64_KR_PER_CPU_DATA, __pa(cpu_data) -
+				    (unsigned long)__per_cpu_start);

-		for (cpu = 1; cpu < NR_CPUS; cpu++) {
-			memcpy(cpu_data, __phys_per_cpu_start, __per_cpu_end - __per_cpu_start);
-			__per_cpu_offset[cpu] = (char *) cpu_data - __per_cpu_start;
-			cpu_data += PERCPU_PAGE_SIZE;
-			per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];
-		}
+		cpu_data += PERCPU_PAGE_SIZE;
 	}
+skip:
 	return __per_cpu_start + __per_cpu_offset[smp_processor_id()];
 }

 static inline void
 alloc_per_cpu_data(void)
 {
-	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS-1,
+	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS,
 				   PERCPU_PAGE_SIZE, __pa(MAX_DMA_ADDRESS));
 }
 #else
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index 9f24b3c..200282b 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -143,17 +143,30 @@ static void *per_cpu_node_setup(void *cpu_data, int node)
 	int cpu;

 	for_each_possible_early_cpu(cpu) {
-		if (cpu == 0) {
-			void *cpu0_data = __cpu0_per_cpu;
-			__per_cpu_offset[cpu] = (char*)cpu0_data -
-				__per_cpu_start;
-		} else if (node == node_cpuid[cpu].nid) {
-			memcpy(__va(cpu_data), __phys_per_cpu_start,
-			       __per_cpu_end - __per_cpu_start);
-			__per_cpu_offset[cpu] = (char*)__va(cpu_data) -
-				__per_cpu_start;
-			cpu_data += PERCPU_PAGE_SIZE;
-		}
+		void *src = cpu == 0 ? __cpu0_per_cpu : __phys_per_cpu_start;
+
+		if (node != node_cpuid[cpu].nid)
+			continue;
+
+		memcpy(__va(cpu_data), src, __per_cpu_end - __per_cpu_start);
+		__per_cpu_offset[cpu] = (char *)__va(cpu_data) -
+			__per_cpu_start;
+
+		/*
+		 * percpu area for cpu0 is moved from the __init area
+		 * which is setup by head.S and used till this point.
+		 * Update ar.k3.  This move ensures that percpu
+		 * area for cpu0 is on the correct node and its
+		 * virtual address isn't insanely far from other
+		 * percpu areas which is important for congruent
+		 * percpu allocator.
+		 */
+		if (cpu == 0)
+			ia64_set_kr(IA64_KR_PER_CPU_DATA,
+				    (unsigned long)cpu_data -
+				    (unsigned long)__per_cpu_start);
+
+		cpu_data += PERCPU_PAGE_SIZE;
 	}
 #endif
 	return cpu_data;
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH REPOST 3/5] ia64: allocate percpu area for cpu0 like percpu
@ 2009-09-30  1:24   ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-09-30  1:24 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

cpu0 used a special percpu area reserved by the linker, __cpu0_per_cpu,
which is set up early in boot by head.S.  However, this doesn't
guarantee that the area will be on the same node as cpu0, and the
percpu area for cpu0 ends up very far away from the percpu areas for
other cpus, which causes problems for the congruent percpu allocator.

This patch makes percpu area initialization allocate the percpu area
for cpu0 like any other cpu and copy it from __cpu0_per_cpu, which now
resides in the __init area.  This means that for cpu0 the percpu area
is first set up at __cpu0_per_cpu early by head.S and then moved to an
area in the linear mapping during memory initialization, and it's not
allowed to take a pointer to percpu variables between head.S and memory
initialization.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64 <linux-ia64@vger.kernel.org>
---
Tony Luck didn't receive this one and couldn't find it in archives
either.  Repost.  Unchanged from the original posting.

 arch/ia64/kernel/vmlinux.lds.S |   11 +++++----
 arch/ia64/mm/contig.c          |   47 +++++++++++++++++++++++++--------------
 arch/ia64/mm/discontig.c       |   35 ++++++++++++++++++++---------
 3 files changed, 60 insertions(+), 33 deletions(-)

diff --git a/arch/ia64/kernel/vmlinux.lds.S b/arch/ia64/kernel/vmlinux.lds.S
index 0a0c77b..1295ba3 100644
--- a/arch/ia64/kernel/vmlinux.lds.S
+++ b/arch/ia64/kernel/vmlinux.lds.S
@@ -166,6 +166,12 @@ SECTIONS
 	}
 #endif

+#ifdef	CONFIG_SMP
+  . = ALIGN(PERCPU_PAGE_SIZE);
+  __cpu0_per_cpu = .;
+  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
+#endif
+
   . = ALIGN(PAGE_SIZE);
   __init_end = .;

@@ -198,11 +204,6 @@ SECTIONS
   data : { } :data
   .data : AT(ADDR(.data) - LOAD_OFFSET)
 	{
-#ifdef	CONFIG_SMP
-  . = ALIGN(PERCPU_PAGE_SIZE);
-		__cpu0_per_cpu = .;
-  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
-#endif
 		INIT_TASK_DATA(PAGE_SIZE)
 		CACHELINE_ALIGNED_DATA(SMP_CACHE_BYTES)
 		READ_MOSTLY_DATA(SMP_CACHE_BYTES)
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 1341437..351da0a 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -154,36 +154,49 @@ static void *cpu_data;
 void * __cpuinit
 per_cpu_init (void)
 {
-	int cpu;
-	static int first_time=1;
+	static bool first_time = true;
+	void *cpu0_data = __cpu0_per_cpu;
+	unsigned int cpu;
+
+	if (!first_time)
+		goto skip;
+	first_time = false;

 	/*
 	 * get_free_pages() cannot be used before cpu_init() done.  BSP
 	 * allocates "NR_CPUS" pages for all CPUs to avoid that AP calls
 	 * get_zeroed_page().
 	 */
-	if (first_time) {
-		void *cpu0_data = __cpu0_per_cpu;
+	for (cpu = 0; cpu < NR_CPUS; cpu++) {
+		void *src = cpu == 0 ? cpu0_data : __phys_per_cpu_start;

-		first_time=0;
+		memcpy(cpu_data, src, __per_cpu_end - __per_cpu_start);
+		__per_cpu_offset[cpu] = (char *)cpu_data - __per_cpu_start;
+		per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];

-		__per_cpu_offset[0] = (char *) cpu0_data - __per_cpu_start;
-		per_cpu(local_per_cpu_offset, 0) = __per_cpu_offset[0];
+		/*
+		 * percpu area for cpu0 is moved from the __init area
+		 * which is setup by head.S and used till this point.
+		 * Update ar.k3.  This move ensures that percpu
+		 * area for cpu0 is on the correct node and its
+		 * virtual address isn't insanely far from other
+		 * percpu areas which is important for congruent
+		 * percpu allocator.
+		 */
+		if (cpu == 0)
+			ia64_set_kr(IA64_KR_PER_CPU_DATA, __pa(cpu_data) -
+				    (unsigned long)__per_cpu_start);

-		for (cpu = 1; cpu < NR_CPUS; cpu++) {
-			memcpy(cpu_data, __phys_per_cpu_start, __per_cpu_end - __per_cpu_start);
-			__per_cpu_offset[cpu] = (char *) cpu_data - __per_cpu_start;
-			cpu_data += PERCPU_PAGE_SIZE;
-			per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];
-		}
+		cpu_data += PERCPU_PAGE_SIZE;
 	}
+skip:
 	return __per_cpu_start + __per_cpu_offset[smp_processor_id()];
 }

 static inline void
 alloc_per_cpu_data(void)
 {
-	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS-1,
+	cpu_data = __alloc_bootmem(PERCPU_PAGE_SIZE * NR_CPUS,
 				   PERCPU_PAGE_SIZE, __pa(MAX_DMA_ADDRESS));
 }
 #else
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index 9f24b3c..200282b 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -143,17 +143,30 @@ static void *per_cpu_node_setup(void *cpu_data, int node)
 	int cpu;

 	for_each_possible_early_cpu(cpu) {
-		if (cpu == 0) {
-			void *cpu0_data = __cpu0_per_cpu;
-			__per_cpu_offset[cpu] = (char*)cpu0_data -
-				__per_cpu_start;
-		} else if (node == node_cpuid[cpu].nid) {
-			memcpy(__va(cpu_data), __phys_per_cpu_start,
-			       __per_cpu_end - __per_cpu_start);
-			__per_cpu_offset[cpu] = (char*)__va(cpu_data) -
-				__per_cpu_start;
-			cpu_data += PERCPU_PAGE_SIZE;
-		}
+		void *src = cpu == 0 ? __cpu0_per_cpu : __phys_per_cpu_start;
+
+		if (node != node_cpuid[cpu].nid)
+			continue;
+
+		memcpy(__va(cpu_data), src, __per_cpu_end - __per_cpu_start);
+		__per_cpu_offset[cpu] = (char *)__va(cpu_data) -
+			__per_cpu_start;
+
+		/*
+		 * percpu area for cpu0 is moved from the __init area
+		 * which is setup by head.S and used till this point.
+		 * Update ar.k3.  This move ensures that percpu
+		 * area for cpu0 is on the correct node and its
+		 * virtual address isn't insanely far from other
+		 * percpu areas which is important for congruent
+		 * percpu allocator.
+		 */
+		if (cpu == 0)
+			ia64_set_kr(IA64_KR_PER_CPU_DATA,
+				    (unsigned long)cpu_data -
+				    (unsigned long)__per_cpu_start);
+
+		cpu_data += PERCPU_PAGE_SIZE;
 	}
 #endif
 	return cpu_data;
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2
  2009-09-29  0:25   ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu Tejun Heo
@ 2009-09-30 20:32     ` Luck, Tony
  -1 siblings, 0 replies; 72+ messages in thread
From: Luck, Tony @ 2009-09-30 20:32 UTC (permalink / raw)
  To: Tejun Heo, Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

> Tony, can you please ack ia64 part?

Tejun,

Ok. The new patchset builds and boots for both the contig.c and discontig.c
versions of the kernel.

Acked-by: Tony Luck <tony.luck@intel.com>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic
@ 2009-09-30 20:32     ` Luck, Tony
  0 siblings, 0 replies; 72+ messages in thread
From: Luck, Tony @ 2009-09-30 20:32 UTC (permalink / raw)
  To: Tejun Heo, Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

> Tony, can you please ack ia64 part?

Tejun,

Ok. The new patchset builds and boots for both the contig.c and discontig.c
versions of the kernel.

Acked-by: Tony Luck <tony.luck@intel.com>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2
  2009-09-30 20:32     ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Luck, Tony
@ 2009-09-30 20:47       ` Christoph Lameter
  -1 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-30 20:47 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Tejun Heo, Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel


Great. I'll post a new rev of the per cpu ops patchset.

Tony: Could we use a global register for the per cpu address? That would
make IA64 work similar to sparc.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu
@ 2009-09-30 20:47       ` Christoph Lameter
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-30 20:47 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Tejun Heo, Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel


Great. I'll post a new rev of the per cpu ops patchset.

Tony: Could we use a global register for the per cpu address? That would
make IA64 work similar to sparc.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2
  2009-09-30 20:47       ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu Christoph Lameter
@ 2009-09-30 22:05         ` Luck, Tony
  -1 siblings, 0 replies; 72+ messages in thread
From: Luck, Tony @ 2009-09-30 22:05 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

> Tony: Could we use a global register for the per cpu address? That would
> make IA64 work similar to sparc.

Would that be a useful trade of resources for convenience?  We've already
hard-wired r13 for "current".  Grabbing another one would require fixing
it up (since user code will have clobbered it), and possibly re-working
any existing code that already uses whatever register you choose.

How would that compare with using [r13]?

We might have a krN free ... but kregs are a lot slower than real registers.

-Tony

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic
@ 2009-09-30 22:05         ` Luck, Tony
  0 siblings, 0 replies; 72+ messages in thread
From: Luck, Tony @ 2009-09-30 22:05 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, Nick Piggin, Yu, Fenghua, linux-ia64, Ingo Molnar,
	Rusty Russell, linux-kernel

> Tony: Could we use a global register for the per cpu address? That would
> make IA64 work similar to sparc.

Would that be a useful trade of resources for convenience?  We've already
hard-wired r13 for "current".  Grabbing another one would require fixing
it up (since user code will have clobbered it), and possibly re-working
any existing code that already uses whatever register you choose.

How would that compare with using [r13]?

We might have a krN free ... but kregs are a lot slower than real registers.

-Tony

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2
  2009-09-30 22:05         ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Luck, Tony
@ 2009-09-30 23:06           ` Peter Chubb
  -1 siblings, 0 replies; 72+ messages in thread
From: Peter Chubb @ 2009-09-30 23:06 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Christoph Lameter, Tejun Heo, Nick Piggin, Yu, Fenghua,
	linux-ia64, Ingo Molnar, Rusty Russell, linux-kernel

>>>>> "Tony" == Tony Luck <Luck> writes:

>> Tony: Could we use a global register for the per cpu address? That
>> would make IA64 work similar to sparc.

Tony> Would that be a useful trade of resources for convenience?
Tony> We've already hard-wired r13 for "current". Grabbing another one
Tony> would require fixing (since user code will have clobbered it).
Tony> Possibly re-working any existing code that already uses whatever
Tony> register you choose.

r3, r4 and r5 are currently unused by the kernel, and unused
by GCC and ICC.   Only hand-written assembler and weird compilers use
those registers (and my virtual-machine monitor :-().  If you wanted to
experiment, that'd be a starting place.  

I'm not sure of the advantage though -- TLB mapping is relatively
cheap, and we're no longer hard-wiring the translation register.

You'd have to do some careful benchmarking on a wide variety of
workloads and machines to get a definitive answer.


---
Dr Peter Chubb        www.nicta.com.au      peter DOT chubb AT nicta.com.au
http://www.ertos.nicta.com.au           ERTOS within National ICT Australia
From Imagination to Impact                       Imagining the (ICT) Future

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2
@ 2009-09-30 23:06           ` Peter Chubb
  0 siblings, 0 replies; 72+ messages in thread
From: Peter Chubb @ 2009-09-30 23:06 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Christoph Lameter, Tejun Heo, Nick Piggin, Yu, Fenghua,
	linux-ia64, Ingo Molnar, Rusty Russell, linux-kernel

>>>>> "Tony" == Tony Luck <Luck> writes:

>> Tony: Could we use a global register for the per cpu address? That
>> would make IA64 work similar to sparc.

Tony> Would that be a useful trade of resources for convenience?
Tony> We've already hard-wired r13 for "current". Grabbing another one
Tony> would require fixing (since user code will have clobbered it).
Tony> Possibly re-working any existing code that already uses whatever
Tony> register you choose.

r3, r4 and r5 are currently unused by the kernel, and unused
by GCC and ICC.   Only hand-written assembler and weird compilers use
those registers (and my virtual-machine monitor :-().  If you wanted to
experiment, that'd be a starting place.  

I'm not sure of the advantage though -- TLB mapping is relatively
cheap, and we're no longer hard-wiring the translation register.

You'd have to do some careful benchmarking on a wide variety of
workloads and machines to get a definitive answer.


---
Dr Peter Chubb        www.nicta.com.au      peter DOT chubb AT nicta.com.au
http://www.ertos.nicta.com.au           ERTOS within National ICT Australia
From Imagination to Impact                       Imagining the (ICT) Future

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2
  2009-09-30 23:06           ` Peter Chubb
@ 2009-09-30 23:49             ` Christoph Lameter
  -1 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-30 23:49 UTC (permalink / raw)
  To: Peter Chubb
  Cc: Luck, Tony, Tejun Heo, Nick Piggin, Yu, Fenghua, linux-ia64,
	Ingo Molnar, Rusty Russell, linux-kernel

On Thu, 1 Oct 2009, Peter Chubb wrote:

> r3, r4 and r5 are currently unused by the kernel, and unused
> by GCC and ICC.   Only hand-written assembler and weird compilers use
> those registers (and my virtual-machine monitor :-().  If you wanted to
> experiment, that'd be a starting place.
>
> I'm not sure of the advantage though -- TLB mapping is relatively
> cheap, and we're no longer hard-wiring the translation register.

Dynamic and static per cpu variables could use relative access to that
register. This would reduce code size and avoid the use of a TLB entry.
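
A minimal sketch of that idea with a GCC global register variable
(illustrative only; whether any ia64 register can really be reserved
for this, and which one, is exactly the open question here, so the r4
below is just an assumption for the sketch):

#include <stdio.h>

#if defined(__ia64__)
register unsigned long pcpu_base asm("r4");	/* hypothetical reservation */
#else
static unsigned long pcpu_base;			/* portable stand-in */
#endif

#define MY_CPU_LONG(off)	(*(long *)(pcpu_base + (off)))

static long area[16];		/* this "cpu"'s per-cpu area */

int main(void)
{
	pcpu_base = (unsigned long)area;	/* set once, e.g. at cpu bringup */
	MY_CPU_LONG(0) += 1;			/* base register + offset, no base load */
	printf("%ld\n", area[0]);
	return 0;
}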

> You'd have to do some careful benchmarking on a wide variety of
> workloads and machines to get a definitive answer.

I have some patches here that make heavy use of dynamic percpu allocations
in the allocators to optimize the alloc / free paths.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu
@ 2009-09-30 23:49             ` Christoph Lameter
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Lameter @ 2009-09-30 23:49 UTC (permalink / raw)
  To: Peter Chubb
  Cc: Luck, Tony, Tejun Heo, Nick Piggin, Yu, Fenghua, linux-ia64,
	Ingo Molnar, Rusty Russell, linux-kernel

On Thu, 1 Oct 2009, Peter Chubb wrote:

> r3, r4 and r5 are currently unused by the kernel, and unused
> by GCC and ICC.   Only hand-written assembler and weird compilers use
> those registers (and my virtual-machine monitor :-().  If you wanted to
> experiment, that'd be a starting place.
>
> I'm not sure of the advantage though -- TLB mapping is relatively
> cheap, and we're no longer hard-wiring the translation register.

Dynamic and static per cpu variables could use relative access to that
register. This would reduce code size and avoid the use of a TLB entry.

> You'd have to do some careful benchmarking on a wide variety of
> workloads and machines to get a definitive answer.

I have some patches here that make heavy use of dynamic percpu allocations
in the allocators to optimize the alloc / free paths.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2
  2009-09-23  5:06 ` Tejun Heo
@ 2009-10-02  5:11   ` Tejun Heo
  -1 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-10-02  5:11 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

Tejun Heo wrote:
> Hello, all.
> 
> This is the second take of convert-ia64-to-dynamic-percpu patchset.
> Changes from the last take[L] are

Patchset pushed out for linux-next with Rusty and Tony's acks added.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu
@ 2009-10-02  5:11   ` Tejun Heo
  0 siblings, 0 replies; 72+ messages in thread
From: Tejun Heo @ 2009-10-02  5:11 UTC (permalink / raw)
  To: Nick Piggin, Tony Luck, Fenghua Yu, linux-ia64, Ingo Molnar,
	Rusty Russell, Christoph Lameter, linux-kernel

Tejun Heo wrote:
> Hello, all.
> 
> This is the second take of convert-ia64-to-dynamic-percpu patchset.
> Changes from the last take[L] are

Patchset pushed out for linux-next with Rusty and Tony's acks added.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2009-10-02  5:11 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-23  5:06 [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2 Tejun Heo
2009-09-23  5:06 ` Tejun Heo
2009-09-23  5:06 ` [PATCH 1/5] ia64: don't alias VMALLOC_END to vmalloc_end Tejun Heo
2009-09-23  5:06   ` Tejun Heo
2009-09-23  5:06 ` [PATCH 2/5] ia64: initialize cpu maps early Tejun Heo
2009-09-23  5:06   ` Tejun Heo
2009-09-23  5:06 ` [PATCH 3/5] ia64: allocate percpu area for cpu0 like percpu areas for other cpus Tejun Heo
2009-09-23  5:06   ` Tejun Heo
2009-09-23  5:06 ` [PATCH 4/5] ia64: convert to dynamic percpu allocator Tejun Heo
2009-09-23  5:06   ` Tejun Heo
2009-09-23  5:06 ` [PATCH 5/5] percpu: kill legacy " Tejun Heo
2009-09-23  5:06   ` Tejun Heo
2009-09-23 11:10   ` Rusty Russell
2009-09-23 11:22     ` Rusty Russell
2009-09-29  0:25 ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2 Tejun Heo
2009-09-29  0:25   ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu Tejun Heo
2009-09-30 20:32   ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2 Luck, Tony
2009-09-30 20:32     ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Luck, Tony
2009-09-30 20:47     ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2 Christoph Lameter
2009-09-30 20:47       ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu Christoph Lameter
2009-09-30 22:05       ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2 Luck, Tony
2009-09-30 22:05         ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Luck, Tony
2009-09-30 23:06         ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2 Peter Chubb
2009-09-30 23:06           ` Peter Chubb
2009-09-30 23:49           ` Christoph Lameter
2009-09-30 23:49             ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu Christoph Lameter
2009-09-30  1:24 ` [PATCH REPOST 3/5] ia64: allocate percpu area for cpu0 like percpu areas for other cpus Tejun Heo
2009-09-30  1:24   ` [PATCH REPOST 3/5] ia64: allocate percpu area for cpu0 like percpu Tejun Heo
2009-10-02  5:11 ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one, take#2 Tejun Heo
2009-10-02  5:11   ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu Tejun Heo
  -- strict thread matches above, loose matches on Subject: below --
2009-09-22  7:40 [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one Tejun Heo
2009-09-22  7:40 ` Tejun Heo
2009-09-22  7:40 ` [PATCH 1/4] vmalloc: rename local variables vmalloc_start and vmalloc_end Tejun Heo
2009-09-22  7:40   ` Tejun Heo
2009-09-22 22:52   ` Christoph Lameter
2009-09-22 22:52     ` [PATCH 1/4] vmalloc: rename local variables vmalloc_start and Christoph Lameter
2009-09-23  2:08     ` [PATCH 1/4] vmalloc: rename local variables vmalloc_start and vmalloc_end Tejun Heo
2009-09-23  2:08       ` [PATCH 1/4] vmalloc: rename local variables vmalloc_start and Tejun Heo
2009-09-22  7:40 ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus Tejun Heo
2009-09-22  7:40   ` Tejun Heo
2009-09-22 22:59   ` Christoph Lameter
2009-09-22 22:59     ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu Christoph Lameter
2009-09-23  2:11     ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus Tejun Heo
2009-09-23  2:11       ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas Tejun Heo
2009-09-23 13:44       ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus Christoph Lameter
2009-09-23 13:44         ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu Christoph Lameter
2009-09-23 14:01         ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus Tejun Heo
2009-09-23 14:01           ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas Tejun Heo
2009-09-23 17:17           ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus Christoph Lameter
2009-09-23 17:17             ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu Christoph Lameter
2009-09-23 22:03             ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus Tejun Heo
2009-09-23 22:03               ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas Tejun Heo
2009-09-24  7:36               ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus Christoph Lameter
2009-09-24  7:36                 ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu Christoph Lameter
2009-09-24  8:37                 ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus Tejun Heo
2009-09-24  8:37                   ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas Tejun Heo
2009-09-28 15:12                   ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu areas for other cpus Christoph Lameter
2009-09-28 15:12                     ` [PATCH 2/4] ia64: allocate percpu area for cpu0 like percpu Christoph Lameter
2009-09-22  7:40 ` [PATCH 3/4] ia64: convert to dynamic percpu allocator Tejun Heo
2009-09-22  7:40   ` Tejun Heo
2009-09-22  7:40 ` [PATCH 4/4] percpu: kill legacy " Tejun Heo
2009-09-22  7:40   ` Tejun Heo
2009-09-22  8:16 ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one Ingo Molnar
2009-09-22  8:16   ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Ingo Molnar
2009-09-22 20:49   ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one Luck, Tony
2009-09-22 20:49     ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Luck, Tony
2009-09-22 21:10     ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one Luck, Tony
2009-09-22 21:10       ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Luck, Tony
2009-09-22 21:24       ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one Luck, Tony
2009-09-22 21:24         ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic Luck, Tony
2009-09-22 21:50         ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu and drop the old one Tejun Heo
2009-09-22 21:50           ` [PATCHSET percpu#for-next] percpu: convert ia64 to dynamic percpu Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.