From: Andy Lutomirski <luto@kernel.org>
To: x86@kernel.org
Cc: Borislav Petkov <bp@alien8.de>, linux-kernel@vger.kernel.org,
	Brian Gerst <brgerst@gmail.com>, Jann Horn <jann@thejh.net>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Andy Lutomirski <luto@kernel.org>
Subject: [PATCH v2 8/8] fork: Cache two thread stacks per cpu if CONFIG_VMAP_STACK is set
Date: Thu, 15 Sep 2016 22:45:49 -0700
Message-ID: <94811d8e3994b2e962f88866290017d498eb069c.1474003868.git.luto@kernel.org>
In-Reply-To: <cover.1474003868.git.luto@kernel.org>

vmalloc is a bit slow, and pounding vmalloc/vfree will eventually
force a global TLB flush.

To reduce pressure on them, if CONFIG_VMAP_STACK, cache two thread
stacks per cpu.  This will let us quickly allocate a hopefully
cache-hot, TLB-hot stack under heavy forking workloads (shell script
style).

On my silly pthread_create benchmark, it saves about 2 µs per
pthread_create+join with CONFIG_VMAP_STACK=y.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 kernel/fork.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 53 insertions(+), 9 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 5dd0a516626d..2d44a9d05218 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -159,15 +159,41 @@ void __weak arch_release_thread_stack(unsigned long *stack)
  * kmemcache based allocator.
  */
 # if THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK)
+
+#ifdef CONFIG_VMAP_STACK
+/*
+ * vmalloc is a bit slow, and calling vfree enough times will force a TLB
+ * flush.  Try to minimize the number of calls by caching stacks.
+ */
+#define NR_CACHED_STACKS 2
+static DEFINE_PER_CPU(struct vm_struct *, cached_stacks[NR_CACHED_STACKS]);
+#endif
+
 static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
 #ifdef CONFIG_VMAP_STACK
-	void *stack = __vmalloc_node_range(THREAD_SIZE, THREAD_SIZE,
-					   VMALLOC_START, VMALLOC_END,
-					   THREADINFO_GFP | __GFP_HIGHMEM,
-					   PAGE_KERNEL,
-					   0, node,
-					   __builtin_return_address(0));
+	void *stack;
+	int i;
+
+	local_irq_disable();
+	for (i = 0; i < NR_CACHED_STACKS; i++) {
+		struct vm_struct *s = this_cpu_read(cached_stacks[i]);
+
+		if (!s)
+			continue;
+		this_cpu_write(cached_stacks[i], NULL);
+
+		tsk->stack_vm_area = s;
+		local_irq_enable();
+		return s->addr;
+	}
+	local_irq_enable();
+
+	stack = __vmalloc_node_range(THREAD_SIZE, THREAD_SIZE,
+				     VMALLOC_START, VMALLOC_END,
+				     THREADINFO_GFP | __GFP_HIGHMEM,
+				     PAGE_KERNEL,
+				     0, node, __builtin_return_address(0));
 
 	/*
 	 * We can't call find_vm_area() in interrupt context, and
@@ -187,10 +213,28 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 
 static inline void free_thread_stack(struct task_struct *tsk)
 {
-	if (task_stack_vm_area(tsk))
+#ifdef CONFIG_VMAP_STACK
+	if (task_stack_vm_area(tsk)) {
+		unsigned long flags;
+		int i;
+
+		local_irq_save(flags);
+		for (i = 0; i < NR_CACHED_STACKS; i++) {
+			if (this_cpu_read(cached_stacks[i]))
+				continue;
+
+			this_cpu_write(cached_stacks[i], tsk->stack_vm_area);
+			local_irq_restore(flags);
+			return;
+		}
+		local_irq_restore(flags);
+
 		vfree(tsk->stack);
-	else
-		__free_pages(virt_to_page(tsk->stack), THREAD_SIZE_ORDER);
+		return;
+	}
+#endif
+
+	__free_pages(virt_to_page(tsk->stack), THREAD_SIZE_ORDER);
 }
 # else
 static struct kmem_cache *thread_stack_cache;
-- 
2.7.4
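
For readers who want to poke at the idea outside the kernel, here is a
rough userspace sketch of the same two-slot cache.  It is not part of
the patch and makes several assumptions: a C11 _Thread_local array
stands in for the per-cpu cached_stacks[], malloc()/free() stand in for
__vmalloc_node_range()/vfree(), and there is no IRQ disabling because
nothing can interrupt the slot update here.  The names
demo_stack_alloc(), demo_stack_free(), DEMO_STACK_SIZE and slow_allocs
are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CACHED_STACKS 2            /* same slot count as the patch */
#define DEMO_STACK_SIZE  (16 * 1024)  /* hypothetical size, demo only */

/* Assumption: one cache per thread stands in for the per-cpu array. */
static _Thread_local void *cached_stacks[NR_CACHED_STACKS];

/* Counts trips to the slow allocator so the cache's effect is visible. */
static _Thread_local unsigned long slow_allocs;

/* Take the first cached stack if there is one, else allocate a new one. */
static void *demo_stack_alloc(void)
{
	int i;

	for (i = 0; i < NR_CACHED_STACKS; i++) {
		void *s = cached_stacks[i];

		if (!s)
			continue;
		cached_stacks[i] = NULL;	/* slot is now empty */
		return s;
	}

	/* Cache miss: fall back to the slow path (malloc stands in for vmalloc). */
	slow_allocs++;
	return malloc(DEMO_STACK_SIZE);
}

/* Park the stack in the first empty slot; really free it only if full. */
static void demo_stack_free(void *stack)
{
	int i;

	assert(stack != NULL);

	for (i = 0; i < NR_CACHED_STACKS; i++) {
		if (cached_stacks[i])
			continue;
		cached_stacks[i] = stack;
		return;
	}

	/* Both slots are occupied (free stands in for vfree). */
	free(stack);
}
```

Calling demo_stack_alloc() three times on a cold cache takes the slow
path three times.  After freeing all three, the first two are parked
and the third is really freed, so the next two allocations come out of
the cache without touching the allocator.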

Thread overview: 21+ messages
2016-09-16  5:45 [PATCH v2 0/8] thread_info cleanups and stack caching Andy Lutomirski
2016-09-16  5:45 ` [PATCH v2 1/8] x86/entry/64: Fix a minor comment rebase error Andy Lutomirski
2016-09-16  9:16 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2016-09-16  5:45 ` [PATCH v2 2/8] sched: Add try_get_task_stack() and put_task_stack() Andy Lutomirski
2016-09-16  9:16 ` [tip:x86/asm] sched/core: " tip-bot for Andy Lutomirski
2016-09-16  5:45 ` [PATCH v2 3/8] kthread: to_live_kthread() needs try_get_task_stack() Andy Lutomirski
2016-09-16  9:17 ` [tip:x86/asm] kthread: Pin the stack via try_get_task_stack()/put_task_stack() in to_live_kthread() function tip-bot for Oleg Nesterov
2016-09-16  5:45 ` [PATCH v2 4/8] x86/dumpstack: Pin the target stack when dumping it Andy Lutomirski
2016-09-16  9:17 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2016-09-16 11:55 ` Josh Poimboeuf
2016-09-16 12:28 ` Josh Poimboeuf
2016-09-16 12:57 ` Ingo Molnar
2016-09-16 13:05 ` [PATCH] x86/dumpstack: remove NULL task pointer convention Josh Poimboeuf
2016-09-16  5:45 ` [PATCH v2 5/8] x86/process: Pin the target stack in get_wchan() Andy Lutomirski
2016-09-16  9:18 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2016-09-16  5:45 ` [PATCH v2 6/8] lib/syscall: Pin the task stack in collect_syscall() Andy Lutomirski
2016-09-16  9:18 ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2016-09-16  5:45 ` [PATCH v2 7/8] sched: Free the stack early if CONFIG_THREAD_INFO_IN_TASK Andy Lutomirski
2016-09-16  9:19 ` [tip:x86/asm] sched/core: " tip-bot for Andy Lutomirski
2016-09-16  5:45 ` Andy Lutomirski [this message]
2016-09-16  9:19 ` [tip:x86/asm] fork: Optimize task creation by caching two thread stacks per CPU if CONFIG_VMAP_STACK=y tip-bot for Andy Lutomirski