From: Andy Lutomirski
Date: Fri, 27 Mar 2015 14:37:58 -0700
Subject: Re: [PATCH] x86/asm/entry/64: better check for canonical address
To: Ingo Molnar
Cc: Brian Gerst, Denys Vlasenko, Borislav Petkov,
 "the arch/x86 maintainers", Linux Kernel Mailing List
In-Reply-To: <20150327113125.GA14778@gmail.com>
References: <1427373731-13056-1-git-send-email-dvlasenk@redhat.com> <20150327113125.GA14778@gmail.com>

On Fri, Mar 27, 2015 at 4:31 AM, Ingo Molnar wrote:
>
> * Brian Gerst wrote:
>
>> On Thu, Mar 26, 2015 at 8:42 AM, Denys Vlasenko wrote:
>> > This change makes the check exact (no more false positives
>> > on kernel addresses).
>> >
>> > It isn't really important to be fully correct here -
>> > almost all addresses we'll ever see will be userspace ones,
>> > but OTOH it looks cheap enough: the new code uses two more
>> > ALU ops but preserves %rcx, so we don't need to reload it
>> > from pt_regs->cx again.
>> > At the disassembly level, the changes are:
>> >
>> > cmp %rcx,0x80(%rsp)  ->  mov 0x80(%rsp),%r11; cmp %rcx,%r11
>> > shr $0x2f,%rcx       ->  shl $0x10,%rcx; sar $0x10,%rcx; cmp %rcx,%r11
>> > mov 0x58(%rsp),%rcx  ->  (eliminated)
>> >
>> > Signed-off-by: Denys Vlasenko
>> > CC: Borislav Petkov
>> > CC: x86@kernel.org
>> > CC: linux-kernel@vger.kernel.org
>> > ---
>> >
>> > Andy, I'm undecided myself on the merits of doing this.
>> > If you like it, feel free to take it in your tree.
>> > I trimmed the CC list to not bother too many people with this
>> > trivial and quite possibly "useless churn"-class change.
>> >
>> >  arch/x86/kernel/entry_64.S | 23 ++++++++++++-----------
>> >  1 file changed, 12 insertions(+), 11 deletions(-)
>> >
>> > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
>> > index bf9afad..a36d04d 100644
>> > --- a/arch/x86/kernel/entry_64.S
>> > +++ b/arch/x86/kernel/entry_64.S
>> > @@ -688,26 +688,27 @@ retint_swapgs:		/* return to user-space */
>> >  	 * a completely clean 64-bit userspace context.
>> >  	 */
>> >  	movq RCX(%rsp),%rcx
>> > -	cmpq %rcx,RIP(%rsp)		/* RCX == RIP */
>> > +	movq RIP(%rsp),%r11
>> > +	cmpq %rcx,%r11			/* RCX == RIP */
>> >  	jne opportunistic_sysret_failed
>> >
>> >  	/*
>> >  	 * On Intel CPUs, sysret with non-canonical RCX/RIP will #GP
>> >  	 * in kernel space.  This essentially lets the user take over
>> > -	 * the kernel, since userspace controls RSP.  It's not worth
>> > -	 * testing for canonicalness exactly -- this check detects any
>> > -	 * of the 17 high bits set, which is true for non-canonical
>> > -	 * or kernel addresses.  (This will pessimize vsyscall=native.
>> > -	 * Big deal.)
>> > +	 * the kernel, since userspace controls RSP.
>> >  	 *
>> > -	 * If virtual addresses ever become wider, this will need
>> > +	 * If the width of the "canonical tail" ever becomes variable, this will need
>> >  	 * to be updated to remain correct on both old and new CPUs.
>> >  	 */
>> > +	.ifne __VIRTUAL_MASK_SHIFT - 47
>> > +	.error "virtual address width changed -- sysret checks need update"
>> > +	.endif
>> > -	shr $__VIRTUAL_MASK_SHIFT, %rcx
>> > -	jnz opportunistic_sysret_failed
>> > +	/* Change top 16 bits to be a sign-extension of the rest */
>> > +	shl	$(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
>> > +	sar	$(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
>> > +	/* If this changed %rcx, it was not canonical */
>> > +	cmpq	%rcx, %r11
>> > +	jne opportunistic_sysret_failed
>> >
>> >  	cmpq $__USER_CS,CS(%rsp)	/* CS must match SYSRET */
>> >  	jne opportunistic_sysret_failed
>>
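(For readers following along: the shl/sar pair above is an ordinary
sign-extension test.  A minimal C sketch of the equivalent check,
illustrative only and assuming 48-bit virtual addresses, i.e.
__VIRTUAL_MASK_SHIFT == 47:

    #include <stdbool.h>
    #include <stdint.h>

    static bool is_canonical_48(uint64_t addr)
    {
        /* Shift bits 63:48 out the top, then arithmetic-shift
         * copies of bit 47 back in -- mirroring the shl/sar pair. */
        uint64_t sext = (uint64_t)((int64_t)(addr << 16) >> 16);

        /* Canonical iff sign extension changes nothing. */
        return sext == addr;
    }

So 0x00007fffffffffff and 0xffff800000000000 pass while
0x0000800000000000 fails: exactly the "did the shift pair change
%rcx" test in the patch, with no false positives on kernel
addresses.)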
>> Would it be possible to skip this check entirely on AMD
>> processors?  It's my understanding that AMD correctly issues the
>> #GP from CPL3, causing a stack switch.
>
> This needs a testcase, I suspect.

IMO one decent way to write the test case would be to extend the
sigreturn test I just submitted.  For each n, do raise(SIGUSR1), then
change RCX and RIP to 2^n.  Return and catch the SIGSEGV, then restore
the original RIP.  Repeat with 2^n replaced by 2^n-1 and ~(2^n-1).
(A rough sketch of that loop is at the end of this mail.)

The only real trick is that we need to make sure that there's no
actual executable code at any of these addresses.

--Andy

>
>> Looking at the AMD docs, sysret doesn't even check for a canonical
>> address.  The #GP is probably from the instruction fetch at the
>> non-canonical address instead of from sysret itself.
>
> I suspect it's similar to what would happen if we tried a RET to a
> non-canonical address: the fetch fails and the JMP gets the #GP?
>
> In that sense it's the fault of the return instruction.
>
> Thanks,
>
> 	Ingo

--
Andy Lutomirski
AMA Capital Management, LLC
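(Appendix, for concreteness: a rough sketch of the probe loop
described above.  This is a hypothetical illustration, not the actual
selftest; it assumes x86_64 Linux with glibc's ucontext, and it
glosses over guaranteeing that the probed addresses are unmapped --
e.g. 0x400000 = 2^22 is a non-PIE binary's text address, so build
with -pie or validate the probe list first.)

#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

static volatile uint64_t test_addr;          /* address under test */
static volatile greg_t saved_rip, saved_rcx; /* real context to return to */

static void sigusr1(int sig, siginfo_t *si, void *ctx_void)
{
	ucontext_t *ctx = ctx_void;

	/* Save where we really were, then aim the saved RCX/RIP at the
	 * address under test; sigreturn will try to resume there,
	 * exercising the kernel's opportunistic-sysret path. */
	saved_rip = ctx->uc_mcontext.gregs[REG_RIP];
	saved_rcx = ctx->uc_mcontext.gregs[REG_RCX];
	ctx->uc_mcontext.gregs[REG_RIP] = (greg_t)test_addr;
	ctx->uc_mcontext.gregs[REG_RCX] = (greg_t)test_addr;
}

static void sigsegv(int sig, siginfo_t *si, void *ctx_void)
{
	ucontext_t *ctx = ctx_void;

	/* The expected fault at the bogus address: restore the saved
	 * context so execution continues where raise() left off. */
	ctx->uc_mcontext.gregs[REG_RIP] = saved_rip;
	ctx->uc_mcontext.gregs[REG_RCX] = saved_rcx;
}

int main(void)
{
	struct sigaction sa = { .sa_flags = SA_SIGINFO };

	sigemptyset(&sa.sa_mask);
	sa.sa_sigaction = sigusr1;
	if (sigaction(SIGUSR1, &sa, NULL) != 0)
		exit(1);
	sa.sa_sigaction = sigsegv;
	if (sigaction(SIGSEGV, &sa, NULL) != 0)
		exit(1);

	for (int n = 0; n < 64; n++) {
		uint64_t probes[] = {
			1ULL << n,            /* 2^n      */
			(1ULL << n) - 1,      /* 2^n-1    */
			~((1ULL << n) - 1),   /* ~(2^n-1) */
		};

		for (int i = 0; i < 3; i++) {
			test_addr = probes[i];
			raise(SIGUSR1);       /* probe one address */
		}
	}

	printf("survived all probes\n");
	return 0;
}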