From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751631AbcFYXZ5 (ORCPT ); Sat, 25 Jun 2016 19:25:57 -0400 Received: from mail-vk0-f45.google.com ([209.85.213.45]:34968 "EHLO mail-vk0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751298AbcFYXZz (ORCPT ); Sat, 25 Jun 2016 19:25:55 -0400 MIME-Version: 1.0 In-Reply-To: References: <20160623185340.GO30154@twins.programming.kicks-ass.net> <20160624202530.unmidps4kpebo2na@treble> <20160624205116.4hbnurrnan4afq2t@treble> From: Andy Lutomirski Date: Sat, 25 Jun 2016 16:19:30 -0700 Message-ID: Subject: Re: [PATCH v3 00/13] Virtually mapped stacks with guard pages (x86, core) To: Linus Torvalds Cc: Josh Poimboeuf , Brian Gerst , Peter Zijlstra , Oleg Nesterov , Andy Lutomirski , "the arch/x86 maintainers" , Linux Kernel Mailing List , "linux-arch@vger.kernel.org" , Borislav Petkov , Nadav Amit , Kees Cook , "kernel-hardening@lists.openwall.com" , Jann Horn , Heiko Carstens Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jun 24, 2016 at 7:41 PM, Linus Torvalds wrote: > On Fri, Jun 24, 2016 at 2:34 PM, Andy Lutomirski wrote: >>> >>> So let me get the pure semantic patches done, and then for 4.8 when we >>> do the things that actually change real meaning we'll have a sane >>> base. Ok? >> >> Works for me. I'll see whether my vmap patches still apply and, if >> needed, rebase and send a v5. > > Ok, I'e pushed out the cleanups (and all the pulls that always come in > on Friday afternoon - gaah, I shouldn't have tried doing this on a > Friday). > > I'm attaching the current left-over patch that actually changes > things. It's obviously a composite, and includes your "remove > stack_smp_processor_id()" thing etc, so it's not meant to be used > as-is, but it does seem to work. > > Interestingly, it seems pretty clean too, removing more lines than it > adds (despite the fact that it adds a new config option), and > generally making things prettier rather than the reverse. > > That's always a good sign. I rebased my series onto your tree and then rebased this thing onto my series, tweaked it some, and split it up a bit. My version works a bit differently (thread_info has a single element if the new option is set) but is otherwise more or less your code. It seems to work. https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=01ac3242f314405c1bac0f820ec3575a850509d3 https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=87194cac139aebecec89a68eff719ab44f0469a2 On top of *that*, I taught the kernel to free stacks in release_task and to cache stacks if vmalloced. That still blows up: when cryptomgr_test calls do_exit during boot, do_exit calls exit_notify, which observes that the task state is TASK_DEAD and thus calls release_task on itself and goes boom. Linus, Oleg, help? How am I supposed to quickly free the stack if the task goes through this code path? In fact, why does this work on current kernels? After all, can't schedule() indicates an RCU grace period? In principle, what prevents delayed_put_task_struct from deleting the running stack before scheduling? My kludged up patch that only early-releases the stack if release_task is called from a different task is here: https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=2327ca2ab3d634d120ea185e737fef2e4e9cf012 Ick. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andy Lutomirski Subject: Re: [PATCH v3 00/13] Virtually mapped stacks with guard pages (x86, core) Date: Sat, 25 Jun 2016 16:19:30 -0700 Message-ID: References: <20160623185340.GO30154@twins.programming.kicks-ass.net> <20160624202530.unmidps4kpebo2na@treble> <20160624205116.4hbnurrnan4afq2t@treble> Reply-To: kernel-hardening@lists.openwall.com Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Return-path: List-Post: List-Help: List-Unsubscribe: List-Subscribe: In-Reply-To: To: Linus Torvalds Cc: Josh Poimboeuf , Brian Gerst , Peter Zijlstra , Oleg Nesterov , Andy Lutomirski , the arch/x86 maintainers , Linux Kernel Mailing List , "linux-arch@vger.kernel.org" , Borislav Petkov , Nadav Amit , Kees Cook , "kernel-hardening@lists.openwall.com" , Jann Horn , Heiko Carstens List-Id: linux-arch.vger.kernel.org On Fri, Jun 24, 2016 at 7:41 PM, Linus Torvalds wrote: > On Fri, Jun 24, 2016 at 2:34 PM, Andy Lutomirski wrote: >>> >>> So let me get the pure semantic patches done, and then for 4.8 when we >>> do the things that actually change real meaning we'll have a sane >>> base. Ok? >> >> Works for me. I'll see whether my vmap patches still apply and, if >> needed, rebase and send a v5. > > Ok, I'e pushed out the cleanups (and all the pulls that always come in > on Friday afternoon - gaah, I shouldn't have tried doing this on a > Friday). > > I'm attaching the current left-over patch that actually changes > things. It's obviously a composite, and includes your "remove > stack_smp_processor_id()" thing etc, so it's not meant to be used > as-is, but it does seem to work. > > Interestingly, it seems pretty clean too, removing more lines than it > adds (despite the fact that it adds a new config option), and > generally making things prettier rather than the reverse. > > That's always a good sign. I rebased my series onto your tree and then rebased this thing onto my series, tweaked it some, and split it up a bit. My version works a bit differently (thread_info has a single element if the new option is set) but is otherwise more or less your code. It seems to work. https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=01ac3242f314405c1bac0f820ec3575a850509d3 https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=87194cac139aebecec89a68eff719ab44f0469a2 On top of *that*, I taught the kernel to free stacks in release_task and to cache stacks if vmalloced. That still blows up: when cryptomgr_test calls do_exit during boot, do_exit calls exit_notify, which observes that the task state is TASK_DEAD and thus calls release_task on itself and goes boom. Linus, Oleg, help? How am I supposed to quickly free the stack if the task goes through this code path? In fact, why does this work on current kernels? After all, can't schedule() indicates an RCU grace period? In principle, what prevents delayed_put_task_struct from deleting the running stack before scheduling? My kludged up patch that only early-releases the stack if release_task is called from a different task is here: https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=2327ca2ab3d634d120ea185e737fef2e4e9cf012 Ick. From mboxrd@z Thu Jan 1 00:00:00 1970 Reply-To: kernel-hardening@lists.openwall.com MIME-Version: 1.0 In-Reply-To: References: <20160623185340.GO30154@twins.programming.kicks-ass.net> <20160624202530.unmidps4kpebo2na@treble> <20160624205116.4hbnurrnan4afq2t@treble> From: Andy Lutomirski Date: Sat, 25 Jun 2016 16:19:30 -0700 Message-ID: Content-Type: text/plain; charset=UTF-8 Subject: [kernel-hardening] Re: [PATCH v3 00/13] Virtually mapped stacks with guard pages (x86, core) To: Linus Torvalds Cc: Josh Poimboeuf , Brian Gerst , Peter Zijlstra , Oleg Nesterov , Andy Lutomirski , the arch/x86 maintainers , Linux Kernel Mailing List , "linux-arch@vger.kernel.org" , Borislav Petkov , Nadav Amit , Kees Cook , "kernel-hardening@lists.openwall.com" , Jann Horn , Heiko Carstens List-ID: On Fri, Jun 24, 2016 at 7:41 PM, Linus Torvalds wrote: > On Fri, Jun 24, 2016 at 2:34 PM, Andy Lutomirski wrote: >>> >>> So let me get the pure semantic patches done, and then for 4.8 when we >>> do the things that actually change real meaning we'll have a sane >>> base. Ok? >> >> Works for me. I'll see whether my vmap patches still apply and, if >> needed, rebase and send a v5. > > Ok, I'e pushed out the cleanups (and all the pulls that always come in > on Friday afternoon - gaah, I shouldn't have tried doing this on a > Friday). > > I'm attaching the current left-over patch that actually changes > things. It's obviously a composite, and includes your "remove > stack_smp_processor_id()" thing etc, so it's not meant to be used > as-is, but it does seem to work. > > Interestingly, it seems pretty clean too, removing more lines than it > adds (despite the fact that it adds a new config option), and > generally making things prettier rather than the reverse. > > That's always a good sign. I rebased my series onto your tree and then rebased this thing onto my series, tweaked it some, and split it up a bit. My version works a bit differently (thread_info has a single element if the new option is set) but is otherwise more or less your code. It seems to work. https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=01ac3242f314405c1bac0f820ec3575a850509d3 https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=87194cac139aebecec89a68eff719ab44f0469a2 On top of *that*, I taught the kernel to free stacks in release_task and to cache stacks if vmalloced. That still blows up: when cryptomgr_test calls do_exit during boot, do_exit calls exit_notify, which observes that the task state is TASK_DEAD and thus calls release_task on itself and goes boom. Linus, Oleg, help? How am I supposed to quickly free the stack if the task goes through this code path? In fact, why does this work on current kernels? After all, can't schedule() indicates an RCU grace period? In principle, what prevents delayed_put_task_struct from deleting the running stack before scheduling? My kludged up patch that only early-releases the stack if release_task is called from a different task is here: https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=2327ca2ab3d634d120ea185e737fef2e4e9cf012 Ick.