From: Andy Lutomirski
Date: Sat, 11 Nov 2017 18:59:07 -0800
Subject: Re: [RFC 0/7] Prep code for better stack switching
To: Borislav Petkov
Cc: Andy Lutomirski, X86 ML, "linux-kernel@vger.kernel.org", Brian Gerst, Dave Hansen, Linus Torvalds
In-Reply-To: <20171111105821.kxjjuc7peiqoxfuc@pd.tnic>
References: <20171111105821.kxjjuc7peiqoxfuc@pd.tnic>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Nov 11, 2017 at 2:58 AM, Borislav Petkov wrote:
> On Fri, Nov 10, 2017 at 08:05:19PM -0800, Andy Lutomirski wrote:
>> This isn't quite done (the TSS remap patch is busted on 32-bit, but
>> that's a straightforward fix), but it should be ready for at least a
>> conceptual review.
>>
>> The idea here is to prepare us to have all kernel data needed for
>> user mode execution and early entry located in the fixmap. To do
>> this, I hijack the GDT remap mechanism and make it more general. I
>> add a struct cpu_entry_area. This struct is never instantiated
>> directly. Instead, it represents the layout of a per-cpu portion of
>> the fixmap. That portion contains the GDT, the TSS (including IO
>> bitmap), and the entry stack (for now just a part of the TSS
>> region). It should also end up containing the PEBS and BTS buffers.
>>
>> If this works, then the idea would be to add a magic *executable* page
>> to cpu_entry_area. That page would contain a stub like this:
>>
>> ENTRY(entry_SYSCALL_64_trampoline)
>>     UNWIND_HINT_EMPTY
>>     movq %rsp, 0x1000+entry_SYSCALL_64_trampoline-1f(%rip)
>> 1:
>>     movq 0x1008+entry_SYSCALL_64_trampoline-1f(%rip), %rsp
>> 1:
>>     pushq %rdi
>>     pushq %rsi
>
>>     movq 0x1000+entry_SYSCALL_64_trampoline-1f(%rip), %rsi
>> 1:
>>     movq $entry_SYSCALL_64, %rdi
>>     jmp *%rdi
>
> So I'm wondering: r12-r15 are callee-preserved so why can't you
> scratch into those on entry and leave rsi and rdi pristine so that
> entry_SYSCALL_64 can get to work directly?

I'm not sure I understand your suggestion. SYSCALL has always
preserved all regs except rcx, r11, flags, rax, and, depending on what
signals are involved, the argument registers. r12-r15 are definitely
preserved, and existing userspace relies on that.

Anyway, I'm halfway through actually implementing this, and it looks a
wee bit different, but not much different.
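
The register-preservation point in the reply can be observed from userspace. What follows is a minimal sketch, not part of the patch series: it assumes x86-64 Linux and a GCC- or Clang-compatible compiler, and the function name r12_r15_survive_syscall is hypothetical. It loads known patterns into r12-r15, issues a raw SYSCALL (getpid, chosen only because it takes no arguments), and checks that the patterns are still there afterwards. rcx and r11 are listed as clobbers because the SYSCALL instruction itself overwrites them.

```c
#include <assert.h>

#if defined(__x86_64__) && defined(__linux__)
#include <sys/syscall.h>   /* SYS_getpid */
#endif

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if r12-r15 kept their values across a raw SYSCALL,
 * 0 if any of them changed, and -1 when not built for x86-64 Linux.
 */
static int r12_r15_survive_syscall(void)
{
#if defined(__x86_64__) && defined(__linux__)
    /* Pin four locals to the registers under discussion. */
    register unsigned long r12 __asm__("r12") = 0x1212121212121212UL;
    register unsigned long r13 __asm__("r13") = 0x1313131313131313UL;
    register unsigned long r14 __asm__("r14") = 0x1414141414141414UL;
    register unsigned long r15 __asm__("r15") = 0x1515151515151515UL;
    long rax = SYS_getpid;  /* syscall number in, return value out */

    assert(sizeof(unsigned long) == 8);

    /*
     * rax carries the syscall number and the result. rcx and r11 are
     * overwritten by the SYSCALL instruction, so they are declared
     * as clobbers. r12-r15 are read-write operands so the compiler
     * keeps them live in those exact registers across the instruction.
     */
    __asm__ volatile("syscall"
                     : "+a"(rax), "+r"(r12), "+r"(r13), "+r"(r14), "+r"(r15)
                     :
                     : "rcx", "r11", "memory");

    return r12 == 0x1212121212121212UL &&
           r13 == 0x1313131313131313UL &&
           r14 == 0x1414141414141414UL &&
           r15 == 0x1515151515151515UL;
#else
    return -1;  /* not applicable on this target */
#endif
}
```

If the entry path scratched into r12-r15 without restoring them, a check like this would return 0, which is the userspace-visible breakage the reply is warning about.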