LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>, "H . Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Juergen Gross <jgross@suse.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Borislav Petkov <bp@alien8.de>, Jiri Kosina <jkosina@suse.cz>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Brian Gerst <brgerst@gmail.com>,
	David Laight <David.Laight@aculab.com>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	Eduardo Valentin <eduval@amazon.com>,
	Greg KH <gregkh@linuxfoundation.org>,
	Will Deacon <will.deacon@arm.com>,
	aliguori@amazon.com, daniel.gruss@iaik.tugraz.at,
	hughd@google.com, keescook@google.com,
	Andrea Arcangeli <aarcange@redhat.com>,
	Waiman Long <llong@redhat.com>, Pavel Machek <pavel@ucw.cz>,
	"David H . Gutteridge" <dhgutteridge@sympatico.ca>,
	jroedel@suse.de
Subject: Re: [PATCH 07/39] x86/entry/32: Enter the kernel via trampoline stack
Date: Thu, 12 Jul 2018 14:09:45 -0700
Message-ID: <A66D58A6-3DC6-4CF3-B2A5-433C6E974060@amacapital.net> (raw)
In-Reply-To: <1531308586-29340-8-git-send-email-joro@8bytes.org>



> On Jul 11, 2018, at 4:29 AM, Joerg Roedel <joro@8bytes.org> wrote:
> 
> From: Joerg Roedel <jroedel@suse.de>
> 
> Use the entry-stack as a trampoline to enter the kernel. The
> entry-stack is already in the cpu_entry_area and will be
> mapped to userspace when PTI is enabled.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
> arch/x86/entry/entry_32.S        | 136 +++++++++++++++++++++++++++++++--------
> arch/x86/include/asm/switch_to.h |   6 +-
> arch/x86/kernel/asm-offsets.c    |   1 +
> arch/x86/kernel/cpu/common.c     |   5 +-
> arch/x86/kernel/process.c        |   2 -
> arch/x86/kernel/process_32.c     |  10 +--
> 6 files changed, 121 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index 61303fa..528db7d 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -154,25 +154,36 @@
> 
> #endif /* CONFIG_X86_32_LAZY_GS */
> 
> -.macro SAVE_ALL pt_regs_ax=%eax
> +.macro SAVE_ALL pt_regs_ax=%eax switch_stacks=0
>    cld
> +    /* Push segment registers and %eax */
>    PUSH_GS
>    pushl    %fs
>    pushl    %es
>    pushl    %ds
>    pushl    \pt_regs_ax
> +
> +    /* Load kernel segments */
> +    movl    $(__USER_DS), %eax

If \pt_regs_ax != %eax, then this will behave oddly. Maybe it’s okay. But I don’t see why this change was needed at all.

> +    movl    %eax, %ds
> +    movl    %eax, %es
> +    movl    $(__KERNEL_PERCPU), %eax
> +    movl    %eax, %fs
> +    SET_KERNEL_GS %eax
> +
> +    /* Push integer registers and complete PT_REGS */
>    pushl    %ebp
>    pushl    %edi
>    pushl    %esi
>    pushl    %edx
>    pushl    %ecx
>    pushl    %ebx
> -    movl    $(__USER_DS), %edx
> -    movl    %edx, %ds
> -    movl    %edx, %es
> -    movl    $(__KERNEL_PERCPU), %edx
> -    movl    %edx, %fs
> -    SET_KERNEL_GS %edx
> +
> +    /* Switch to kernel stack if necessary */
> +.if \switch_stacks > 0
> +    SWITCH_TO_KERNEL_STACK
> +.endif
> +
> .endm
> 
> /*
> @@ -269,6 +280,72 @@
> .Lend_\@:
> #endif /* CONFIG_X86_ESPFIX32 */
> .endm
> +
> +
> +/*
> + * Called with pt_regs fully populated and kernel segments loaded,
> + * so we can access PER_CPU and use the integer registers.
> + *
> + * We need to be very careful here with the %esp switch, because an NMI
> + * can happen everywhere. If the NMI handler finds itself on the
> + * entry-stack, it will overwrite the task-stack and everything we
> + * copied there. So allocate the stack-frame on the task-stack and
> + * switch to it before we do any copying.

Ick, right. Same with machine check, though. You could alternatively fix it by running NMIs on an irq stack if the irq count is zero.  How confident are you that you got #MC right?

> + */
> +.macro SWITCH_TO_KERNEL_STACK
> +
> +    ALTERNATIVE     "", "jmp .Lend_\@", X86_FEATURE_XENPV
> +
> +    /* Are we on the entry stack? Bail out if not! */
> +    movl    PER_CPU_VAR(cpu_entry_area), %edi
> +    addl    $CPU_ENTRY_AREA_entry_stack, %edi
> +    cmpl    %esp, %edi
> +    jae    .Lend_\@

That’s an alarming assumption about the address space layout. How about an xor and an and instead of cmpl?  As it stands, if the address layout ever changes, the failure may be rather subtle.

Anyway, wouldn’t it be easier to solve this by just not switching stacks on entries from kernel mode and making the entry stack bigger?  Stick an assertion in the scheduling code that we’re not on an entry stack, perhaps.


  reply index

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-11 11:29 [PATCH 00/39 v7] PTI support for x86-32 Joerg Roedel
2018-07-11 11:29 ` [PATCH 01/39] x86/asm-offsets: Move TSS_sp0 and TSS_sp1 to asm-offsets.c Joerg Roedel
2018-07-12 20:44   ` Andy Lutomirski
2018-07-11 11:29 ` [PATCH 02/39] x86/entry/32: Rename TSS_sysenter_sp0 to TSS_entry_stack Joerg Roedel
2018-07-12 20:44   ` Andy Lutomirski
2018-07-11 11:29 ` [PATCH 03/39] x86/entry/32: Load task stack from x86_tss.sp1 in SYSENTER handler Joerg Roedel
2018-07-12 20:49   ` Andy Lutomirski
2018-07-13  9:48     ` Joerg Roedel
2018-07-13 17:19       ` Andy Lutomirski
2018-07-13 23:17         ` Andy Lutomirski
2018-07-17  7:05           ` Joerg Roedel
2018-07-17 20:04             ` Andy Lutomirski
2018-07-11 11:29 ` [PATCH 04/39] x86/entry/32: Put ESPFIX code into a macro Joerg Roedel
2018-07-11 11:29 ` [PATCH 05/39] x86/entry/32: Unshare NMI return path Joerg Roedel
2018-07-12 20:53   ` Andy Lutomirski
2018-07-13 10:05     ` Joerg Roedel
2018-07-13 17:26       ` Andy Lutomirski
2018-07-11 11:29 ` [PATCH 06/39] x86/entry/32: Split off return-to-kernel path Joerg Roedel
2018-07-11 11:29 ` [PATCH 07/39] x86/entry/32: Enter the kernel via trampoline stack Joerg Roedel
2018-07-12 21:09   ` Andy Lutomirski [this message]
2018-07-13 10:56     ` Joerg Roedel
2018-07-13 17:21       ` Andy Lutomirski
2018-07-17  7:07         ` Joerg Roedel
2018-07-11 11:29 ` [PATCH 08/39] x86/entry/32: Leave " Joerg Roedel
2018-07-11 11:29 ` [PATCH 09/39] x86/entry/32: Introduce SAVE_ALL_NMI and RESTORE_ALL_NMI Joerg Roedel
2018-07-11 11:29 ` [PATCH 10/39] x86/entry/32: Handle Entry from Kernel-Mode on Entry-Stack Joerg Roedel
2018-07-13 23:31   ` Andy Lutomirski
2018-07-14  5:21     ` Joerg Roedel
2018-07-14  6:26       ` Andy Lutomirski
2018-07-14  8:01         ` Joerg Roedel
2018-07-14 14:36           ` Andy Lutomirski
2018-07-17  7:15             ` Joerg Roedel
2018-07-17 20:06               ` Andy Lutomirski
2018-07-18 11:59                 ` Joerg Roedel
2018-07-11 11:29 ` [PATCH 11/39] x86/entry/32: Simplify debug entry point Joerg Roedel
2018-07-11 11:29 ` [PATCH 12/39] x86/32: Use tss.sp1 as cpu_current_top_of_stack Joerg Roedel
2018-07-11 11:29 ` [PATCH 13/39] x86/entry/32: Add PTI cr3 switch to non-NMI entry/exit points Joerg Roedel
2018-07-11 11:29 ` [PATCH 14/39] x86/entry/32: Add PTI cr3 switches to NMI handler code Joerg Roedel
2018-07-11 11:29 ` [PATCH 15/39] x86/pgtable: Rename pti_set_user_pgd to pti_set_user_pgtbl Joerg Roedel
2018-07-11 11:29 ` [PATCH 16/39] x86/pgtable/pae: Unshare kernel PMDs when PTI is enabled Joerg Roedel
2018-07-11 11:29 ` [PATCH 17/39] x86/pgtable/32: Allocate 8k page-tables " Joerg Roedel
2018-07-11 11:29 ` [PATCH 18/39] x86/pgtable: Move pgdp kernel/user conversion functions to pgtable.h Joerg Roedel
2018-07-11 11:29 ` [PATCH 19/39] x86/pgtable: Move pti_set_user_pgtbl() " Joerg Roedel
2018-07-11 11:29 ` [PATCH 20/39] x86/pgtable: Move two more functions from pgtable_64.h " Joerg Roedel
2018-07-11 11:29 ` [PATCH 21/39] x86/mm/pae: Populate valid user PGD entries Joerg Roedel
2018-07-11 11:29 ` [PATCH 22/39] x86/mm/pae: Populate the user page-table with user pgd's Joerg Roedel
2018-07-11 11:29 ` [PATCH 23/39] x86/mm/legacy: " Joerg Roedel
2018-07-11 11:29 ` [PATCH 24/39] x86/mm/pti: Add an overflow check to pti_clone_pmds() Joerg Roedel
2018-07-11 11:29 ` [PATCH 25/39] x86/mm/pti: Define X86_CR3_PTI_PCID_USER_BIT on x86_32 Joerg Roedel
2018-07-11 11:29 ` [PATCH 26/39] x86/mm/pti: Clone CPU_ENTRY_AREA on PMD level " Joerg Roedel
2018-07-11 11:29 ` [PATCH 27/39] x86/mm/pti: Make pti_clone_kernel_text() compile on 32 bit Joerg Roedel
2018-07-11 11:29 ` [PATCH 28/39] x86/mm/pti: Keep permissions when cloning kernel text in pti_clone_kernel_text() Joerg Roedel
2018-07-13 23:25   ` Andy Lutomirski
2018-07-11 11:29 ` [PATCH 29/39] x86/mm/pti: Introduce pti_finalize() Joerg Roedel
2018-07-11 11:29 ` [PATCH 30/39] x86/mm/pti: Clone entry-text again in pti_finalize() Joerg Roedel
2018-07-13 23:21   ` Andy Lutomirski
2018-07-14  5:04     ` Joerg Roedel
2018-07-11 11:29 ` [PATCH 31/39] x86/mm/dump_pagetables: Define INIT_PGD Joerg Roedel
2018-07-11 11:29 ` [PATCH 32/39] x86/pgtable/pae: Use separate kernel PMDs for user page-table Joerg Roedel
2018-07-11 11:29 ` [PATCH 33/39] x86/ldt: Reserve address-space range on 32 bit for the LDT Joerg Roedel
2018-07-11 11:29 ` [PATCH 34/39] x86/ldt: Define LDT_END_ADDR Joerg Roedel
2018-07-13 17:29   ` Andy Lutomirski
2018-07-11 11:29 ` [PATCH 35/39] x86/ldt: Split out sanity check in map_ldt_struct() Joerg Roedel
2018-07-13 23:18   ` Andy Lutomirski
2018-07-11 11:29 ` [PATCH 36/39] x86/ldt: Enable LDT user-mapping for PAE Joerg Roedel
2018-07-11 11:29 ` [PATCH 37/39] x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32 Joerg Roedel
2018-07-11 11:29 ` [PATCH 38/39] x86/mm/pti: Add Warning when booting on a PCID capable CPU Joerg Roedel
2018-07-13 18:59   ` Andy Lutomirski
2018-07-14  5:08     ` Joerg Roedel
2018-07-11 11:29 ` [PATCH 39/39] x86/entry/32: Add debug code to check entry/exit cr3 Joerg Roedel
2018-07-13 17:28   ` Andy Lutomirski
2018-07-14  5:09     ` Joerg Roedel
2018-07-11 16:28 ` [PATCH 00/39 v7] PTI support for x86-32 Linus Torvalds
2018-07-11 17:28   ` Jiri Kosina
2018-07-11 19:57     ` Thomas Backlund
2018-07-12 13:59       ` Boris Ostrovsky
2018-07-11 21:07   ` Pavel Machek
2018-07-16  7:51 ` Pavel Machek
2018-07-17  2:07 ` David H. Gutteridge
2018-07-17  6:16   ` Joerg Roedel
2018-07-18  9:40 [PATCH 00/39 v8] " Joerg Roedel
2018-07-18  9:40 ` [PATCH 07/39] x86/entry/32: Enter the kernel via trampoline stack Joerg Roedel
2018-07-18 18:09   ` Brian Gerst
2018-07-19 20:52     ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=A66D58A6-3DC6-4CF3-B2A5-433C6E974060@amacapital.net \
    --to=luto@amacapital.net \
    --cc=David.Laight@aculab.com \
    --cc=aarcange@redhat.com \
    --cc=aliguori@amazon.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=brgerst@gmail.com \
    --cc=daniel.gruss@iaik.tugraz.at \
    --cc=dave.hansen@intel.com \
    --cc=dhgutteridge@sympatico.ca \
    --cc=dvlasenk@redhat.com \
    --cc=eduval@amazon.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hpa@zytor.com \
    --cc=hughd@google.com \
    --cc=jgross@suse.com \
    --cc=jkosina@suse.cz \
    --cc=joro@8bytes.org \
    --cc=jpoimboe@redhat.com \
    --cc=jroedel@suse.de \
    --cc=keescook@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=llong@redhat.com \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=pavel@ucw.cz \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=will.deacon@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git
	git clone --mirror https://lore.kernel.org/lkml/10 lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git