* [PATCH 1/3] percpu: make pcpu_build_alloc_info() clear static buffers
From: Tejun Heo @ 2009-09-24 12:55 UTC
To: Linux Kernel, David Miller, Rusty Russell, Christoph Lameter, Ingo Molnar, H. Peter Anvin

pcpu_build_alloc_info() may be called multiple times when percpu is
falling back to a different first chunk allocator. Make it clear its
static buffers so that they don't contain values from previous runs.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
These three patches are scheduled for mainline and aim to work around
the cases where the distance between the two farthest units is too
large compared to the vmalloc area size. This happens on sparc64
because the vmalloc area is relatively small there and nodes can
easily be placed such that they are too far apart.

This patchset implements a page mapping first chunk allocator for
sparc64 and makes the embedding allocator fall back to it when the
vmalloc area doesn't seem large enough. This should make the percpu
allocator more robust on other archs which implement the page mapping
allocator (only x86 currently) and make diagnosing problems easier on
other archs.

Thanks.

 mm/percpu.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: work/mm/percpu.c
===================================================================
--- work.orig/mm/percpu.c
+++ work/mm/percpu.c
@@ -1347,6 +1347,10 @@ struct pcpu_alloc_info * __init pcpu_bui
 	struct pcpu_alloc_info *ai;
 	unsigned int *cpu_map;
 
+	/* this function may be called multiple times */
+	memset(group_map, 0, sizeof(group_map));
+	memset(group_cnt, 0, sizeof(group_cnt));
+
 	/*
 	 * Determine min_unit_size, alloc_size and max_upa such that
 	 * alloc_size is multiple of atom_size and is the smallest
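As a rough user-space illustration of why the buffers need clearing, the sketch below keeps two static scratch arrays and is called twice with different inputs. It is not the kernel function: NR_UNITS, build_groups() and group_count() are hypothetical names, and only the two memset() calls mirror the patch.

```c
#include <assert.h>
#include <string.h>

#define NR_UNITS 8	/* hypothetical stand-in for NR_CPUS */

/* Static scratch buffers, analogous to group_map/group_cnt in the patch. */
static int group_map[NR_UNITS];
static int group_cnt[NR_UNITS];

/*
 * Record which group each unit belongs to and return the number of
 * distinct groups.  unit_group[i] is the group id of unit i.
 */
int build_groups(const int *unit_group, int nr_units)
{
	int nr_groups = 0;
	int i;

	assert(nr_units >= 0 && nr_units <= NR_UNITS);

	/* this function may be called multiple times */
	memset(group_map, 0, sizeof(group_map));
	memset(group_cnt, 0, sizeof(group_cnt));

	for (i = 0; i < nr_units; i++) {
		int g = unit_group[i];

		assert(g >= 0 && g < NR_UNITS);
		group_map[i] = g;
		/* a group is counted the first time one of its units shows up */
		if (group_cnt[g]++ == 0)
			nr_groups++;
	}
	return nr_groups;
}

/* Number of units recorded for group g by the most recent call. */
int group_count(int g)
{
	return group_cnt[g];
}
```

Without the two memset() calls, a second call would still see the counts left behind by the first one, so group_count() could report units for a group that the second input never mentioned.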
* [PATCH 2/3] sparc64: implement page mapping percpu first chunk allocator
From: Tejun Heo @ 2009-09-24 12:56 UTC
To: Linux Kernel, David Miller, Rusty Russell, Christoph Lameter, Ingo Molnar, H. Peter Anvin

Implement page mapping percpu first chunk allocator as a fallback to
the embedding allocator. The next patch will make the embedding
allocator check distances between units to determine whether it fits
within the vmalloc area so that this fallback can be used on such
cases.

sparc64 currently has relatively small vmalloc area which makes it
impossible to create any dynamic chunks on certain configurations
leading to percpu allocation failures. This and the next patch should
allow those configurations to keep working until proper solution is
found.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
David, can you please ack this after reviewing?

Thanks.

 arch/sparc/Kconfig         |    3 ++
 arch/sparc/kernel/smp_64.c |   51 +++++++++++++++++++++++++++++++++++++--------
 2 files changed, 46 insertions(+), 8 deletions(-)

Index: work/arch/sparc/Kconfig
===================================================================
--- work.orig/arch/sparc/Kconfig
+++ work/arch/sparc/Kconfig
@@ -102,6 +102,9 @@ config HAVE_SETUP_PER_CPU_AREA
 config NEED_PER_CPU_EMBED_FIRST_CHUNK
 	def_bool y if SPARC64
 
+config NEED_PER_CPU_PAGE_FIRST_CHUNK
+	def_bool y if SPARC64
+
 config GENERIC_HARDIRQS_NO__DO_IRQ
 	bool
 	def_bool y if SPARC64
Index: work/arch/sparc/kernel/smp_64.c
===================================================================
--- work.orig/arch/sparc/kernel/smp_64.c
+++ work/arch/sparc/kernel/smp_64.c
@@ -1420,7 +1420,7 @@ static void __init pcpu_free_bootmem(voi
 	free_bootmem(__pa(ptr), size);
 }
 
-static int pcpu_cpu_distance(unsigned int from, unsigned int to)
+static int __init pcpu_cpu_distance(unsigned int from, unsigned int to)
 {
 	if (cpu_to_node(from) == cpu_to_node(to))
 		return LOCAL_DISTANCE;
@@ -1428,18 +1428,53 @@ static int pcpu_cpu_distance(unsigned in
 		return REMOTE_DISTANCE;
 }
 
+static void __init pcpu_populate_pte(unsigned long addr)
+{
+	pgd_t *pgd = pgd_offset_k(addr);
+	pud_t *pud;
+	pmd_t *pmd;
+
+	pud = pud_offset(pgd, addr);
+	if (pud_none(*pud)) {
+		pmd_t *new;
+
+		new = __alloc_bootmem(PAGE_SIZE, PAGE_SIZE, PAGE_SIZE);
+		pud_populate(&init_mm, pud, new);
+	}
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd)) {
+		pte_t *new;
+
+		new = __alloc_bootmem(PAGE_SIZE, PAGE_SIZE, PAGE_SIZE);
+		pmd_populate_kernel(&init_mm, pmd, new);
+	}
+}
+
 void __init setup_per_cpu_areas(void)
 {
 	unsigned long delta;
 	unsigned int cpu;
-	int rc;
+	int rc = -EINVAL;
 
-	rc = pcpu_embed_first_chunk(PERCPU_MODULE_RESERVE,
-				    PERCPU_DYNAMIC_RESERVE, 4 << 20,
-				    pcpu_cpu_distance, pcpu_alloc_bootmem,
-				    pcpu_free_bootmem);
-	if (rc)
-		panic("failed to initialize first chunk (%d)", rc);
+	if (pcpu_chosen_fc != PCPU_FC_PAGE) {
+		rc = pcpu_embed_first_chunk(PERCPU_MODULE_RESERVE,
+					    PERCPU_DYNAMIC_RESERVE, 4 << 20,
+					    pcpu_cpu_distance,
+					    pcpu_alloc_bootmem,
+					    pcpu_free_bootmem);
+		if (rc)
+			pr_warning("PERCPU: %s allocator failed (%d), "
+				   "falling back to page size\n",
+				   pcpu_fc_names[pcpu_chosen_fc], rc);
+	}
+	if (rc < 0)
+		rc = pcpu_page_first_chunk(PERCPU_MODULE_RESERVE,
+					   pcpu_alloc_bootmem,
+					   pcpu_free_bootmem,
+					   pcpu_populate_pte);
+	if (rc < 0)
+		panic("cannot initialize percpu area (err=%d)", rc);
 
 	delta = (unsigned long)pcpu_base_addr - (unsigned long)__per_cpu_start;
 	for_each_possible_cpu(cpu)
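The control flow that setup_per_cpu_areas() gains in this patch can be sketched outside the kernel roughly as follows. This is only an illustration of the try-then-fall-back pattern: enum fc_kind, setup_first_chunk() and the two stub allocators are hypothetical, and the sketch returns the error instead of calling panic().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's pcpu_chosen_fc values. */
enum fc_kind { FC_AUTO, FC_EMBED, FC_PAGE };

/* An allocator returns 0 on success or a negative errno on failure. */
typedef int (*first_chunk_fn)(void);

/*
 * Try the embedding allocator first unless the page allocator was
 * explicitly chosen, then fall back to the page allocator.  On success
 * *used tells which allocator set up the first chunk.
 */
int setup_first_chunk(enum fc_kind chosen, first_chunk_fn embed,
		      first_chunk_fn page, enum fc_kind *used)
{
	int rc = -EINVAL;

	assert(embed && page && used);

	if (chosen != FC_PAGE) {
		rc = embed();
		if (rc < 0)
			fprintf(stderr, "embed allocator failed (%d), "
				"falling back to page size\n", rc);
		else
			*used = FC_EMBED;
	}
	/* rc is still negative if embed was skipped or has failed */
	if (rc < 0) {
		rc = page();
		if (rc == 0)
			*used = FC_PAGE;
	}
	return rc;
}

/* Stub allocators for exercising the flow. */
int stub_alloc_ok(void)
{
	return 0;
}

int stub_alloc_fail(void)
{
	return -ENOMEM;
}
```

Initializing rc to -EINVAL is what lets a single `rc < 0` test cover both "embedding was skipped" and "embedding failed", which is the same trick the patch uses.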
* [PATCH 3/3] percpu: make embedding first chunk allocator check vmalloc space size
From: Tejun Heo @ 2009-09-24 12:57 UTC
To: Linux Kernel, David Miller, Rusty Russell, Christoph Lameter, Ingo Molnar, H. Peter Anvin

The embedding first chunk allocator maintains the distances between
units in the vmalloc area and thus needs the vmalloc space to be
larger than the maximum distance between units; otherwise, it wouldn't
be able to create any dynamic chunks. This patch makes the embedding
first chunk allocator check the vmalloc space size and, if the maximum
distance between units is larger than 75% of it, print a warning and,
if the page mapping allocator is available, fail initialization so
that the system falls back onto it.

This should work around percpu allocation failure problems on certain
sparc64 configurations where distances between NUMA nodes are larger
than the vmalloc area and make the percpu allocator more robust for
future configurations.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 mm/percpu.c |   20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

Index: work/mm/percpu.c
===================================================================
--- work.orig/mm/percpu.c
+++ work/mm/percpu.c
@@ -1786,7 +1786,7 @@ int __init pcpu_embed_first_chunk(size_t
 	void *base = (void *)ULONG_MAX;
 	void **areas = NULL;
 	struct pcpu_alloc_info *ai;
-	size_t size_sum, areas_size;
+	size_t size_sum, areas_size, max_distance;
 	int group, i, rc;
 
 	ai = pcpu_build_alloc_info(reserved_size, dyn_size, atom_size,
@@ -1836,8 +1836,24 @@ int __init pcpu_embed_first_chunk(size_t
 	}
 
 	/* base address is now known, determine group base offsets */
-	for (group = 0; group < ai->nr_groups; group++)
+	max_distance = 0;
+	for (group = 0; group < ai->nr_groups; group++) {
 		ai->groups[group].base_offset = areas[group] - base;
+		max_distance = max(max_distance, ai->groups[group].base_offset);
+	}
+	max_distance += ai->unit_size;
+
+	/* warn if maximum distance is further than 75% of vmalloc space */
+	if (max_distance > (VMALLOC_END - VMALLOC_START) * 3 / 4) {
+		pr_warning("PERCPU: max_distance=0x%zx too large for vmalloc "
+				"space 0x%lx\n",
+				max_distance, VMALLOC_END - VMALLOC_START);
+#ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
+		/* and fail if we have fallback */
+		rc = -EINVAL;
+		goto out_free;
+#endif
+	}
 
 	pr_info("PERCPU: Embedded %zu pages/cpu @%p s%zu r%zu d%zu u%zu\n",
 		PFN_DOWN(size_sum), base, ai->static_size, ai->reserved_size,
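The 75% check can be worked through with concrete numbers in a small user-space sketch. The helper names max_unit_distance() and exceeds_vmalloc_limit() are hypothetical; the arithmetic follows the patch (largest group base offset plus one unit size, compared against three quarters of the vmalloc span), using 64-bit integers so that the example does not depend on the host's word size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Largest group base offset plus one unit, i.e. the span that every
 * chunk has to cover when the group layout is kept intact.
 */
uint64_t max_unit_distance(const uint64_t *base_offsets, int nr_groups,
			   uint64_t unit_size)
{
	uint64_t max_distance = 0;
	int group;

	assert(nr_groups >= 0);

	for (group = 0; group < nr_groups; group++)
		if (base_offsets[group] > max_distance)
			max_distance = base_offsets[group];

	return max_distance + unit_size;
}

/* True if max_distance is further than 75% of the vmalloc space. */
bool exceeds_vmalloc_limit(uint64_t max_distance, uint64_t vmalloc_start,
			   uint64_t vmalloc_end)
{
	return max_distance > (vmalloc_end - vmalloc_start) * 3 / 4;
}
```

With the sparc64 layout quoted later in this thread (VMALLOC_START 0x100000000, VMALLOC_END 0x200000000) the span is 4 GiB, so the limit is 3 GiB: two groups whose base offsets differ by 3 GiB already exceed it once the unit size is added.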
* Re: [PATCH 2/3] sparc64: implement page mapping percpu first chunk allocator
From: David Miller @ 2009-09-24 22:51 UTC
To: tj; +Cc: linux-kernel, rusty, cl, mingo, hpa

From: Tejun Heo <tj@kernel.org>
Date: Thu, 24 Sep 2009 21:56:32 +0900

> Implement page mapping percpu first chunk allocator as a fallback to
> the embedding allocator. The next patch will make the embedding
> allocator check distances between units to determine whether it fits
> within the vmalloc area so that this fallback can be used on such
> cases.
>
> sparc64 currently has relatively small vmalloc area which makes it
> impossible to create any dynamic chunks on certain configurations
> leading to percpu allocation failures. This and the next patch should
> allow those configurations to keep working until proper solution is
> found.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
> David, can you please ack this after reviewing?

This looks fine to me:

Acked-by: David S. Miller <davem@davemloft.net>
* Re: [PATCH 2/3] sparc64: implement page mapping percpu first chunk allocator
From: David Miller @ 2009-09-28 21:36 UTC
To: tj; +Cc: linux-kernel, rusty, cl, mingo, hpa

From: Tejun Heo <tj@kernel.org>
Date: Thu, 24 Sep 2009 21:56:32 +0900

> Implement page mapping percpu first chunk allocator as a fallback to
> the embedding allocator. The next patch will make the embedding
> allocator check distances between units to determine whether it fits
> within the vmalloc area so that this fallback can be used on such
> cases.
>
> sparc64 currently has relatively small vmalloc area which makes it
> impossible to create any dynamic chunks on certain configurations
> leading to percpu allocation failures. This and the next patch should
> allow those configurations to keep working until proper solution is
> found.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
> David, can you please ack this after reviewing?

Tejun, I am testing out the following patch which will make these
patches of yours basically unnecessary:

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 0ff92fa..f3cb790 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -41,8 +41,8 @@
 #define LOW_OBP_ADDRESS _AC(0x00000000f0000000,UL)
 #define HI_OBP_ADDRESS _AC(0x0000000100000000,UL)
 #define VMALLOC_START _AC(0x0000000100000000,UL)
-#define VMALLOC_END _AC(0x0000000200000000,UL)
-#define VMEMMAP_BASE _AC(0x0000000200000000,UL)
+#define VMALLOC_END _AC(0x0000010000000000,UL)
+#define VMEMMAP_BASE _AC(0x0000010000000000,UL)
 
 #define vmemmap ((struct page *)VMEMMAP_BASE)
 
diff --git a/arch/sparc/kernel/ktlb.S b/arch/sparc/kernel/ktlb.S
index 3ea6e8c..1d36147 100644
--- a/arch/sparc/kernel/ktlb.S
+++ b/arch/sparc/kernel/ktlb.S
@@ -280,8 +280,8 @@ kvmap_dtlb_nonlinear:
 
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 	/* Do not use the TSB for vmemmap. */
-	mov (VMEMMAP_BASE >> 24), %g5
-	sllx %g5, 24, %g5
+	mov (VMEMMAP_BASE >> 40), %g5
+	sllx %g5, 40, %g5
 	cmp %g4,%g5
 	bgeu,pn %xcc, kvmap_vmemmap
 	nop
@@ -293,8 +293,8 @@ kvmap_dtlb_tsbmiss:
 	sethi %hi(MODULES_VADDR), %g5
 	cmp %g4, %g5
 	blu,pn %xcc, kvmap_dtlb_longpath
-	mov (VMALLOC_END >> 24), %g5
-	sllx %g5, 24, %g5
+	mov (VMALLOC_END >> 40), %g5
+	sllx %g5, 40, %g5
 	cmp %g4, %g5
 	bgeu,pn %xcc, kvmap_dtlb_longpath
 	nop
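The numbers in this patch can be checked with a few lines of C. The sketch below is an editorial reading of the diff, not something stated in the message: it assumes that the shift changes from 24 to 40 because the operand of SPARC `mov` must fit a 13-bit signed immediate, so the constant has to be rebuilt as a small value followed by `sllx`. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size of the vmalloc area in bytes. */
uint64_t vmalloc_size(uint64_t start, uint64_t end)
{
	return end - start;
}

/*
 * True if `value` can be rebuilt exactly as (value >> shift) << shift
 * with the shifted part small enough for a non-negative 13-bit signed
 * immediate (0..4095).
 */
bool fits_mov_sllx(uint64_t value, unsigned int shift)
{
	uint64_t imm;

	assert(shift < 64);

	imm = value >> shift;
	return (imm << shift) == value && imm <= 4095;
}
```

Under that assumption, the old VMALLOC_END of 0x200000000 shifted right by 24 gives 0x200, which fits, while the new value of 0x10000000000 shifted by 24 gives 0x10000, which does not; shifting by 40 gives 1. The vmalloc area itself grows from 4 GiB to 0xFF00000000 bytes (1020 GiB).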
* Re: [PATCH 2/3] sparc64: implement page mapping percpu first chunk allocator
From: Tejun Heo @ 2009-09-29 0:14 UTC
To: David Miller; +Cc: linux-kernel, rusty, cl, mingo, hpa

Hello, David.

David Miller wrote:
> From: Tejun Heo <tj@kernel.org>
> Date: Thu, 24 Sep 2009 21:56:32 +0900
>
>> Implement page mapping percpu first chunk allocator as a fallback to
>> the embedding allocator. The next patch will make the embedding
>> allocator check distances between units to determine whether it fits
>> within the vmalloc area so that this fallback can be used on such
>> cases.
>>
>> sparc64 currently has relatively small vmalloc area which makes it
>> impossible to create any dynamic chunks on certain configurations
>> leading to percpu allocation failures. This and the next patch should
>> allow those configurations to keep working until proper solution is
>> found.
>>
>> Signed-off-by: Tejun Heo <tj@kernel.org>
>> ---
>> David, can you please ack this after reviewing?
>
> Tejun, I am testing out the following patch which will make these
> patches of yours basically unnecessary:

Ah... great but unless you object, I think it would be better to push
it out anyway just to make things a bit more robust and ease tracking
and debugging when something goes wrong. The added code is small and
ditched once boot is complete. What do you think?

Thanks.

--
tejun
* Re: [PATCH 2/3] sparc64: implement page mapping percpu first chunk allocator
From: David Miller @ 2009-09-29 0:16 UTC
To: tj; +Cc: linux-kernel, rusty, cl, mingo, hpa

From: Tejun Heo <tj@kernel.org>
Date: Tue, 29 Sep 2009 09:14:58 +0900

> Ah... great but unless you object, I think it would be better to push
> it out anyway just to make things a bit more robust and ease tracking
> and debugging when something goes wrong. The added code is small and
> ditched once boot is complete. What do you think?

That's fine with me:

Acked-by: David S. Miller <davem@davemloft.net>