From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1753515AbbCPI4j (ORCPT ); Mon, 16 Mar 2015 04:56:39 -0400
Received: from mail-wi0-f170.google.com ([209.85.212.170]:34924 "EHLO
    mail-wi0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1753211AbbCPI4i (ORCPT ); Mon, 16 Mar 2015 04:56:38 -0400
Date: Mon, 16 Mar 2015 09:56:33 +0100
From: Ingo Molnar
To: Andy Lutomirski
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Borislav Petkov,
    Oleg Nesterov, Denys Vlasenko
Subject: Re: [PATCH 1/3] x86: Create and use a TOP_OF_KERNEL_STACK_PADDING macro
Message-ID: <20150316085632.GA19903@gmail.com>
References: <02bf2f54b8dcb76a62a142b6dfe07d4ef7fc582e.1426009661.git.luto@amacapital.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <02bf2f54b8dcb76a62a142b6dfe07d4ef7fc582e.1426009661.git.luto@amacapital.net>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Andy Lutomirski wrote:

> x86_32, unlike x86_64, pads the top of the kernel stack. Document
> this padding and give it a name.
>
> This should make no change whatsoever to the compiled kernel image.
> It also doesn't fix any of the current bugs in this area.
>
> Signed-off-by: Andy Lutomirski
> ---
>  arch/x86/include/asm/processor.h   |  3 ++-
>  arch/x86/include/asm/thread_info.h | 30 ++++++++++++++++++++++++++++++
>  2 files changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 48a61c1c626e..88d9aa745898 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -849,7 +849,8 @@ extern unsigned long thread_saved_pc(struct task_struct *tsk);
>  #define task_pt_regs(task) \
>  ({ \
>         struct pt_regs *__regs__; \
> -       __regs__ = (struct pt_regs *)(KSTK_TOP(task_stack_page(task))-8); \
> +       __regs__ = (struct pt_regs *)(KSTK_TOP(task_stack_page(task)) - \
> +                                     TOP_OF_KERNEL_STACK_PADDING); \
>         __regs__ - 1; \
>  })
>
> diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
> index 7740edd56fed..74fd74ca50d3 100644
> --- a/arch/x86/include/asm/thread_info.h
> +++ b/arch/x86/include/asm/thread_info.h
> @@ -49,6 +49,36 @@ struct thread_info {
>  #define init_thread_info (init_thread_union.thread_info)
>  #define init_stack (init_thread_union.stack)
>
> +#ifdef CONFIG_X86_32
> +
> +/*
> + * TOP_OF_KERNEL_STACK_PADDING is a number of unused bytes that we
> + * reserve at the top of the kernel stack. We do it because of a nasty
> + * 32-bit corner case. On x86_32, the hardware stack frame is
> + * variable-length. Except for vm86 mode, struct pt_regs assumes a
> + * maximum-length frame. If we enter from CPL 0, the top 8 bytes of
> + * pt_regs don't actually exist. Ordinarily this doesn't matter, but it
> + * does in at least one case:
> + *
> + * If we take an NMI early enough in sysenter, the we can end up with

s/the/then

I fixed this up in the commit.

> + * pt_regs that extends above sp0. On the way out, in the espfix code,
> + * we can read the saved SS value, but that value will be above sp0.
> + * Without this offset, that can result in a page fault. (We are
> + * careful that, in this case, the value we read doesn't matter.)
> + *
> + * In vm86 mode, the hardware frame is much longer still, but we neither
> + * access the extra members from NMI context, nor do we write such a
> + * frame at sp0 at all.
> + */

Thanks,

        Ingo
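
Editorial illustration (not part of the original message): the arithmetic in the quoted comment can be sketched in user-space C. This is a rough model, not kernel code. Every name prefixed with model_ is hypothetical. The 8192-byte stack size, the 17-slot register layout, and the reading that sp0 sits one padding below the top of the stack are assumptions made for illustration. The padding value 8 is inferred from the literal "-8" that the patch replaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_THREAD_SIZE 8192u /* assumed 32-bit kernel stack size */

/* Simplified stand-in for the 32-bit struct pt_regs: 17 four-byte slots. */
struct model_pt_regs {
    uint32_t bx, cx, dx, si, di, bp, ax;
    uint32_t ds, es, fs, gs, orig_ax;
    uint32_t ip, cs, flags; /* pushed by hardware on every entry */
    uint32_t sp, ss;        /* pushed only on a privilege-level change */
};

static_assert(sizeof(struct model_pt_regs) == 68, "17 slots of 4 bytes");

/* Assumed sp0: top of the stack minus the padding. */
static uint32_t model_sp0(uint32_t stack_page, uint32_t padding)
{
    return stack_page + MODEL_THREAD_SIZE - padding;
}

/* Mirrors the patched task_pt_regs(): step down by the padding,
 * then by one whole pt_regs. */
static uint32_t model_task_pt_regs(uint32_t stack_page, uint32_t padding)
{
    return model_sp0(stack_page, padding)
           - (uint32_t)sizeof(struct model_pt_regs);
}

/* Base of pt_regs when an interrupt arrives at CPL 0 with esp == sp0:
 * hardware pushes only flags, cs and ip, so the ip slot is 12 bytes
 * below sp0 and the sp/ss slots would lie above sp0. */
static uint32_t model_regs_cpl0_at_sp0(uint32_t stack_page, uint32_t padding)
{
    uint32_t ip_slot = model_sp0(stack_page, padding) - 3u * 4u;
    return ip_slot - (uint32_t)offsetof(struct model_pt_regs, ip);
}

/* Nonzero if the saved-SS slot of the frame at regs still lies inside
 * the stack allocation. */
static int model_ss_slot_in_stack(uint32_t regs, uint32_t stack_page)
{
    uint32_t ss_end = regs + (uint32_t)offsetof(struct model_pt_regs, ss) + 4u;
    return ss_end <= stack_page + MODEL_THREAD_SIZE;
}
```

In this model, with a padding of 8 the phantom SS slot ends exactly at the top of the stack, so reading it stays inside the allocation. With a padding of 0 it ends 8 bytes past the top, which corresponds to the page-fault case the comment describes.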