From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751664AbcFYXaj (ORCPT ); Sat, 25 Jun 2016 19:30:39 -0400 Received: from mail-vk0-f45.google.com ([209.85.213.45]:34501 "EHLO mail-vk0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751347AbcFYXaf (ORCPT ); Sat, 25 Jun 2016 19:30:35 -0400 MIME-Version: 1.0 In-Reply-To: References: <20160623185340.GO30154@twins.programming.kicks-ass.net> <20160624202530.unmidps4kpebo2na@treble> <20160624205116.4hbnurrnan4afq2t@treble> From: Andy Lutomirski Date: Sat, 25 Jun 2016 16:30:14 -0700 Message-ID: Subject: Re: [PATCH v3 00/13] Virtually mapped stacks with guard pages (x86, core) To: Linus Torvalds Cc: Josh Poimboeuf , Brian Gerst , Peter Zijlstra , Oleg Nesterov , Andy Lutomirski , "the arch/x86 maintainers" , Linux Kernel Mailing List , "linux-arch@vger.kernel.org" , Borislav Petkov , Nadav Amit , Kees Cook , "kernel-hardening@lists.openwall.com" , Jann Horn , Heiko Carstens Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jun 25, 2016 at 4:19 PM, Andy Lutomirski wrote: > I rebased my series onto your tree and then rebased this thing onto my > series, tweaked it some, and split it up a bit. My version works a > bit differently (thread_info has a single element if the new option is > set) but is otherwise more or less your code. It seems to work. > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=01ac3242f314405c1bac0f820ec3575a850509d3 > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=87194cac139aebecec89a68eff719ab44f0469a2 > > On top of *that*, I taught the kernel to free stacks in release_task > and to cache stacks if vmalloced. That still blows up: when > cryptomgr_test calls do_exit during boot, do_exit calls exit_notify, > which observes that the task state is TASK_DEAD and thus calls > release_task on itself and goes boom. > > Linus, Oleg, help? How am I supposed to quickly free the stack if the > task goes through this code path? In fact, why does this work on > current kernels? After all, can't schedule() indicates an RCU grace > period? In principle, what prevents delayed_put_task_struct from > deleting the running stack before scheduling? > > My kludged up patch that only early-releases the stack if release_task > is called from a different task is here: > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=2327ca2ab3d634d120ea185e737fef2e4e9cf012 Maybe I'm misunderstanding the role of release_task. It looks like there's this path in the scheduler I can borrow: if (unlikely(prev_state == TASK_DEAD)) { With a kludge in place to free the stack in there and release_task and __put_task_struct, whichever is first, I get a nice speedup. Benchmarks coming later on. Can I rely on that code path always being called? --Andy From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andy Lutomirski Subject: Re: [PATCH v3 00/13] Virtually mapped stacks with guard pages (x86, core) Date: Sat, 25 Jun 2016 16:30:14 -0700 Message-ID: References: <20160623185340.GO30154@twins.programming.kicks-ass.net> <20160624202530.unmidps4kpebo2na@treble> <20160624205116.4hbnurrnan4afq2t@treble> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Return-path: Received: from mail-vk0-f45.google.com ([209.85.213.45]:34965 "EHLO mail-vk0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751286AbcFYXaf (ORCPT ); Sat, 25 Jun 2016 19:30:35 -0400 Received: by mail-vk0-f45.google.com with SMTP id j2so192640532vkg.2 for ; Sat, 25 Jun 2016 16:30:35 -0700 (PDT) In-Reply-To: Sender: linux-arch-owner@vger.kernel.org List-ID: To: Linus Torvalds Cc: Josh Poimboeuf , Brian Gerst , Peter Zijlstra , Oleg Nesterov , Andy Lutomirski , the arch/x86 maintainers , Linux Kernel Mailing List , "linux-arch@vger.kernel.org" , Borislav Petkov , Nadav Amit , Kees Cook , "kernel-hardening@lists.openwall.com" , Jann Horn , Heiko Carstens On Sat, Jun 25, 2016 at 4:19 PM, Andy Lutomirski wrote: > I rebased my series onto your tree and then rebased this thing onto my > series, tweaked it some, and split it up a bit. My version works a > bit differently (thread_info has a single element if the new option is > set) but is otherwise more or less your code. It seems to work. > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=01ac3242f314405c1bac0f820ec3575a850509d3 > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=87194cac139aebecec89a68eff719ab44f0469a2 > > On top of *that*, I taught the kernel to free stacks in release_task > and to cache stacks if vmalloced. That still blows up: when > cryptomgr_test calls do_exit during boot, do_exit calls exit_notify, > which observes that the task state is TASK_DEAD and thus calls > release_task on itself and goes boom. > > Linus, Oleg, help? How am I supposed to quickly free the stack if the > task goes through this code path? In fact, why does this work on > current kernels? After all, can't schedule() indicates an RCU grace > period? In principle, what prevents delayed_put_task_struct from > deleting the running stack before scheduling? > > My kludged up patch that only early-releases the stack if release_task > is called from a different task is here: > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=2327ca2ab3d634d120ea185e737fef2e4e9cf012 Maybe I'm misunderstanding the role of release_task. It looks like there's this path in the scheduler I can borrow: if (unlikely(prev_state == TASK_DEAD)) { With a kludge in place to free the stack in there and release_task and __put_task_struct, whichever is first, I get a nice speedup. Benchmarks coming later on. Can I rely on that code path always being called? --Andy From mboxrd@z Thu Jan 1 00:00:00 1970 Reply-To: kernel-hardening@lists.openwall.com MIME-Version: 1.0 In-Reply-To: References: <20160623185340.GO30154@twins.programming.kicks-ass.net> <20160624202530.unmidps4kpebo2na@treble> <20160624205116.4hbnurrnan4afq2t@treble> From: Andy Lutomirski Date: Sat, 25 Jun 2016 16:30:14 -0700 Message-ID: Content-Type: text/plain; charset=UTF-8 Subject: [kernel-hardening] Re: [PATCH v3 00/13] Virtually mapped stacks with guard pages (x86, core) To: Linus Torvalds Cc: Josh Poimboeuf , Brian Gerst , Peter Zijlstra , Oleg Nesterov , Andy Lutomirski , the arch/x86 maintainers , Linux Kernel Mailing List , "linux-arch@vger.kernel.org" , Borislav Petkov , Nadav Amit , Kees Cook , "kernel-hardening@lists.openwall.com" , Jann Horn , Heiko Carstens List-ID: On Sat, Jun 25, 2016 at 4:19 PM, Andy Lutomirski wrote: > I rebased my series onto your tree and then rebased this thing onto my > series, tweaked it some, and split it up a bit. My version works a > bit differently (thread_info has a single element if the new option is > set) but is otherwise more or less your code. It seems to work. > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=01ac3242f314405c1bac0f820ec3575a850509d3 > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=87194cac139aebecec89a68eff719ab44f0469a2 > > On top of *that*, I taught the kernel to free stacks in release_task > and to cache stacks if vmalloced. That still blows up: when > cryptomgr_test calls do_exit during boot, do_exit calls exit_notify, > which observes that the task state is TASK_DEAD and thus calls > release_task on itself and goes boom. > > Linus, Oleg, help? How am I supposed to quickly free the stack if the > task goes through this code path? In fact, why does this work on > current kernels? After all, can't schedule() indicates an RCU grace > period? In principle, what prevents delayed_put_task_struct from > deleting the running stack before scheduling? > > My kludged up patch that only early-releases the stack if release_task > is called from a different task is here: > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=2327ca2ab3d634d120ea185e737fef2e4e9cf012 Maybe I'm misunderstanding the role of release_task. It looks like there's this path in the scheduler I can borrow: if (unlikely(prev_state == TASK_DEAD)) { With a kludge in place to free the stack in there and release_task and __put_task_struct, whichever is first, I get a nice speedup. Benchmarks coming later on. Can I rely on that code path always being called? --Andy