From: Andy Lutomirski
Date: Thu, 14 Jul 2016 09:51:29 -0700
Subject: Re: [PATCH v5 14/32] x86/mm/64: Enable vmapped stacks
To: Ingo Molnar
Cc: Andy Lutomirski, X86 ML, "linux-kernel@vger.kernel.org", linux-arch, Borislav Petkov, Nadav Amit, Kees Cook, Brian Gerst, "kernel-hardening@lists.openwall.com", Linus Torvalds, Josh Poimboeuf, Jann Horn, Heiko Carstens
In-Reply-To: <20160714083411.GA15437@gmail.com>
References: <8d36dd9b2430b61db64333af7b911d0bca7d5d2f.1468270393.git.luto@kernel.org> <20160713075314.GA32700@gmail.com> <20160714083411.GA15437@gmail.com>

On Thu, Jul 14, 2016 at 1:34 AM, Ingo Molnar wrote:
>
> * Andy Lutomirski wrote:
>
>> On Wed, Jul 13, 2016 at 12:53 AM, Ingo Molnar wrote:
>> >
>> > * Andy Lutomirski wrote:
>> >
>> >> This allows x86_64 kernels to enable vmapped stacks. There are a
>> >> couple of interesting bits.
>> >
>> >> --- a/arch/x86/Kconfig
>> >> +++ b/arch/x86/Kconfig
>> >> @@ -92,6 +92,7 @@ config X86
>> >>          select HAVE_ARCH_TRACEHOOK
>> >>          select HAVE_ARCH_TRANSPARENT_HUGEPAGE
>> >>          select HAVE_EBPF_JIT if X86_64
>> >> +        select HAVE_ARCH_VMAP_STACK if X86_64
>> >
>> > So what is the performance impact?
>>
>> Seems to be a very slight speedup (0.5 µs or so) on my silly benchmark
>> (pthread_create, pthread_join in a loop). [...]
>
> Music to my ears - although TBH there's probably two opposing forces: advantages
> from the cache versus (possibly very minor, if measurable at all) disadvantages
> from the 4K granularity.

True. It's also plausible that there will be different lock contention
issues on very large systems with vmalloc vs. using the page allocator.

And there's one other issue: the patchset will considerably increase the
frequency with which vmap gets called, which will make the giant hole in
Chris Metcalf's isolation series more apparent. But that's not really a
regression -- the problem is pre-existing, and I've pointed it out a few
times before. Arguably it's even a benefit -- someone will have to *fix*
it now :)

> So I'd prefer the following approach: to apply it to a v4.8-rc1 base in ~2 weeks
> and keep it default-y for much of the next development cycle. If no serious
> problems are found in those ~2 months then send it to Linus in that fashion. We
> can still turn it off by default (or re-spin the whole approach) if it turns out
> to be too risky.
>
> Exposing it as default-n for even a small amount of time will massively reduce the
> testing we'll get, as most people will just use the N setting (often without
> noticing).
>
> Plus this also gives net-next and other preparatory patches applied directly to
> maintainer trees time to trickle upstream.

Works for me. If you think it makes sense, I can split out a bunch of the
x86 and mm preparatory patches and, if it reorders cleanly and still
works, the THREAD_INFO_IN_TASK change. Those could plausibly go into 4.7.
Would you like me to see how that goes? (I'll verify that it builds, that
the 0day bot likes it, and that the final result of the patchset is
exactly identical to what I already sent.)

--Andy