From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754340AbbCPMJR (ORCPT ); Mon, 16 Mar 2015 08:09:17 -0400 Received: from terminus.zytor.com ([198.137.202.10]:36013 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754295AbbCPMJP (ORCPT ); Mon, 16 Mar 2015 08:09:15 -0400 Date: Mon, 16 Mar 2015 05:08:50 -0700 From: tip-bot for Andy Lutomirski Message-ID: Cc: luto@amacapital.net, hpa@zytor.com, dvlasenk@redhat.com, mingo@kernel.org, torvalds@linux-foundation.org, bp@alien8.de, linux-kernel@vger.kernel.org, tglx@linutronix.de, oleg@redhat.com Reply-To: tglx@linutronix.de, oleg@redhat.com, mingo@kernel.org, torvalds@linux-foundation.org, bp@alien8.de, linux-kernel@vger.kernel.org, luto@amacapital.net, dvlasenk@redhat.com, hpa@zytor.com In-Reply-To: <02bf2f54b8dcb76a62a142b6dfe07d4ef7fc582e.1426009661.git.luto@amacapital.net> References: <02bf2f54b8dcb76a62a142b6dfe07d4ef7fc582e.1426009661.git.luto@amacapital.net> To: linux-tip-commits@vger.kernel.org Subject: [tip:x86/asm] x86/asm/entry: Create and use a ' TOP_OF_KERNEL_STACK_PADDING' macro Git-Commit-ID: 3bfef48e27863ebbd47d68a3f4a081d2cb5286aa X-Mailer: tip-git-log-daemon Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Commit-ID: 3bfef48e27863ebbd47d68a3f4a081d2cb5286aa Gitweb: http://git.kernel.org/tip/3bfef48e27863ebbd47d68a3f4a081d2cb5286aa Author: Andy Lutomirski AuthorDate: Tue, 10 Mar 2015 11:05:58 -0700 Committer: Ingo Molnar CommitDate: Mon, 16 Mar 2015 11:05:30 +0100 x86/asm/entry: Create and use a 'TOP_OF_KERNEL_STACK_PADDING' macro x86_32, unlike x86_64, pads the top of the kernel stack, because the hardware stack frame formats are variable in size. Document this padding and give it a name. This should make no change whatsoever to the compiled kernel image. It also doesn't fix any of the current bugs in this area. Signed-off-by: Andy Lutomirski Acked-by: Denys Vlasenko Cc: Borislav Petkov Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Oleg Nesterov Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/02bf2f54b8dcb76a62a142b6dfe07d4ef7fc582e.1426009661.git.luto@amacapital.net [ Fixed small details, such as a missed magic constant in entry_32.S pointed out by Denys Vlasenko. ] Signed-off-by: Ingo Molnar --- arch/x86/include/asm/processor.h | 3 ++- arch/x86/include/asm/thread_info.h | 27 +++++++++++++++++++++++++++ arch/x86/kernel/entry_32.S | 2 +- 3 files changed, 30 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index c77605d..554da61 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -848,7 +848,8 @@ extern unsigned long thread_saved_pc(struct task_struct *tsk); #define task_pt_regs(task) \ ({ \ struct pt_regs *__regs__; \ - __regs__ = (struct pt_regs *)(KSTK_TOP(task_stack_page(task))-8); \ + __regs__ = (struct pt_regs *)(KSTK_TOP(task_stack_page(task)) - \ + TOP_OF_KERNEL_STACK_PADDING); \ __regs__ - 1; \ }) diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 7740edd..ba115eb 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -13,6 +13,33 @@ #include /* + * TOP_OF_KERNEL_STACK_PADDING is a number of unused bytes that we + * reserve at the top of the kernel stack. We do it because of a nasty + * 32-bit corner case. On x86_32, the hardware stack frame is + * variable-length. Except for vm86 mode, struct pt_regs assumes a + * maximum-length frame. If we enter from CPL 0, the top 8 bytes of + * pt_regs don't actually exist. Ordinarily this doesn't matter, but it + * does in at least one case: + * + * If we take an NMI early enough in SYSENTER, then we can end up with + * pt_regs that extends above sp0. On the way out, in the espfix code, + * we can read the saved SS value, but that value will be above sp0. + * Without this offset, that can result in a page fault. (We are + * careful that, in this case, the value we read doesn't matter.) + * + * In vm86 mode, the hardware frame is much longer still, but we neither + * access the extra members from NMI context, nor do we write such a + * frame at sp0 at all. + * + * x86_64 has a fixed-length stack frame. + */ +#ifdef CONFIG_X86_32 +# define TOP_OF_KERNEL_STACK_PADDING 8 +#else +# define TOP_OF_KERNEL_STACK_PADDING 0 +#endif + +/* * low level task data that entry.S needs immediate access to * - this struct should fit entirely inside of one cache line * - this struct shares the supervisor stack pages diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S index e33ba51..4c8cc34 100644 --- a/arch/x86/kernel/entry_32.S +++ b/arch/x86/kernel/entry_32.S @@ -398,7 +398,7 @@ sysenter_past_esp: * A tiny bit of offset fixup is necessary - 4*4 means the 4 words * pushed above; +8 corresponds to copy_thread's esp0 setting. */ - pushl_cfi ((TI_sysenter_return)-THREAD_SIZE+8+4*4)(%esp) + pushl_cfi ((TI_sysenter_return)-THREAD_SIZE+TOP_OF_KERNEL_STACK_PADDING+4*4)(%esp) CFI_REL_OFFSET eip, 0 pushl_cfi %eax