From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Garnier
Subject: Re: x86: PIE support and option to extend KASLR randomization
Date: Tue, 15 Aug 2017 07:58:47 -0700
References: <20170810172615.51965-1-thgarnie@google.com> <20170811124127.kkb5pnkljz4umxuj@gmail.com> <20170815075609.mmzbfwritjzvrpsn@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Cc: Ingo Molnar , Herbert Xu , "David S . Miller" , Thomas Gleixner , Ingo Molnar , "H . Peter Anvin" , Peter Zijlstra , Josh Poimboeuf , Arnd Bergmann , Matthias Kaehlcke , Boris Ostrovsky , Juergen Gross , Paolo Bonzini , Radim Krčmář , Joerg Roedel , Tom Lendacky , Andy Lutomirski , Borislav Petkov , Brian Gerst , "Kirill A . Shutemov" , "Rafael J . Wysocki" , Len Brown , Pavel Machek , Tejun H
To: Daniel Micay
List-Id: linux-crypto.vger.kernel.org

On Tue, Aug 15, 2017 at 7:47 AM, Daniel Micay wrote:
> On 15 August 2017 at 10:20, Thomas Garnier wrote:
>> On Tue, Aug 15, 2017 at 12:56 AM, Ingo Molnar wrote:
>>>
>>> * Thomas Garnier wrote:
>>>
>>>> > Do these changes get us closer to being able to build the kernel as truly
>>>> > position independent, i.e. to place it anywhere in the valid x86-64 address
>>>> > space? Or any other advantages?
>>>>
>>>> Yes, PIE allows us to put the kernel anywhere in memory. It will allow us to
>>>> have a full randomized address space where position and order of sections are
>>>> completely random. There is still some work to get there but being able to build
>>>> a PIE kernel is a significant step.
>>>
>>> So I _really_ dislike the whole PIE approach, because of the huge slowdown:
>>>
>>> +config RANDOMIZE_BASE_LARGE
>>> +	bool "Increase the randomization range of the kernel image"
>>> +	depends on X86_64 && RANDOMIZE_BASE
>>> +	select X86_PIE
>>> +	select X86_MODULE_PLTS if MODULES
>>> +	default n
>>> +	---help---
>>> +	  Build the kernel as a Position Independent Executable (PIE) and
>>> +	  increase the available randomization range from 1GB to 3GB.
>>> +
>>> +	  This option impacts performance on kernel CPU intensive workloads up
>>> +	  to 10% due to PIE generated code. Impact on user-mode processes and
>>> +	  typical usage would be significantly less (0.50% when you build the
>>> +	  kernel).
>>> +
>>> +	  The kernel and modules will generate slightly more assembly (1 to 2%
>>> +	  increase on the .text sections). The vmlinux binary will be
>>> +	  significantly smaller due to less relocations.
>>>
>>> To put 10% kernel overhead into perspective: enabling this option wipes out about
>>> 5-10 years worth of painstaking optimizations we've done to keep the kernel fast
>>> ... (!!)
>>
>> Note that 10% is the high-bound of a CPU intensive workload.
>
> The cost can be reduced by using -fno-plt these days but some work
> might be required to make that work with the kernel.
>
> Where does that 10% estimate in the kernel config docs come from? I'd
> be surprised if it really cost that much on x86_64. That's a realistic
> cost for i386 with modern GCC (it used to be worse) but I'd expect
> x86_64 to be closer to 2% even for CPU intensive workloads. It should
> be very close to zero with -fno-plt.

I got 8 to 10% on hackbench. Other benchmarks were 4% or lower.
I will also look at a more recent compiler and -fno-plt.
-- 
Thomas
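
For a sense of what the 1GB to 3GB change in the quoted help text means, here is a rough back-of-the-envelope sketch. It is not taken from the patch series: it assumes the kernel base is placed on a 2MB-aligned slot and ignores the size of the image itself, so the numbers are an upper bound. The helper names are made up for illustration.

```python
import math

GIB = 1 << 30
MIB = 1 << 20

# Assumption for illustration: the kernel base is chosen on a 2 MiB boundary.
ASSUMED_ALIGN = 2 * MIB


def slots(range_bytes: int, align_bytes: int = ASSUMED_ALIGN) -> int:
    """Number of aligned base addresses in the range (image size ignored)."""
    return range_bytes // align_bytes


def entropy_bits(n_slots: int) -> float:
    """Bits of entropy if every slot is equally likely."""
    return math.log2(n_slots)


if __name__ == "__main__":
    for label, size in (("1GB", 1 * GIB), ("3GB", 3 * GIB)):
        n = slots(size)
        print(f"{label}: {n} slots, {entropy_bits(n):.2f} bits")
```

Under these assumptions, tripling the range adds log2(3), or roughly 1.6 bits, of base-address entropy, which is the gain being weighed against the hackbench numbers above.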