From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 27 Mar 2015 07:28:41 -0400
Subject: Re: [PATCH] x86/asm/entry/64: better check for canonical address
From: Brian Gerst
To: Ingo Molnar
Cc: Denys Vlasenko, Andy Lutomirski, Borislav Petkov,
	"the arch/x86 maintainers", Linux Kernel Mailing List
In-Reply-To: <20150327111738.GA8749@gmail.com>
References: <1427373731-13056-1-git-send-email-dvlasenk@redhat.com>
	<20150327081141.GA9526@gmail.com>
	<551534B1.6090908@redhat.com>
	<20150327111738.GA8749@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 27, 2015 at 7:17 AM, Ingo Molnar wrote:
>
> * Denys Vlasenko wrote:
>
>> On 03/27/2015 09:11 AM, Ingo Molnar wrote:
>> >
>> > * Denys Vlasenko wrote:
>> >
>> >> This change makes the check exact (no more false positives
>> >> on kernel addresses).
>> >>
>> >> It isn't really important to be fully correct here -
>> >> almost all addresses we'll ever see will be userspace ones,
>> >> but OTOH it looks to be cheap enough:
>> >> the new code uses two more ALU ops but preserves %rcx,
>> >> allowing to not reload it from pt_regs->cx again.
>> >> On disassembly level, the changes are:
>> >>
>> >> cmp %rcx,0x80(%rsp)  ->  mov 0x80(%rsp),%r11; cmp %rcx,%r11
>> >> shr $0x2f,%rcx       ->  shl $0x10,%rcx; sar $0x10,%rcx; cmp %rcx,%r11
>> >> mov 0x58(%rsp),%rcx  ->  (eliminated)
>> >>
>> >> Signed-off-by: Denys Vlasenko
>> >> CC: Borislav Petkov
>> >> CC: x86@kernel.org
>> >> CC: linux-kernel@vger.kernel.org
>> >> ---
>> >>
>> >> Andy, I'd undecided myself on the merits of doing this.
>> >> If you like it, feel free to take it in your tree.
>> >> I trimmed CC list to not bother too many people with this trivial
>> >> and quite possibly "useless churn"-class change.
>> >>
>> >>  arch/x86/kernel/entry_64.S | 23 ++++++++++++-----------
>> >>  1 file changed, 12 insertions(+), 11 deletions(-)
>> >>
>> >> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
>> >> index bf9afad..a36d04d 100644
>> >> --- a/arch/x86/kernel/entry_64.S
>> >> +++ b/arch/x86/kernel/entry_64.S
>> >> @@ -688,26 +688,27 @@ retint_swapgs: /* return to user-space */
>> >>   * a completely clean 64-bit userspace context.
>> >>   */
>> >>  	movq RCX(%rsp),%rcx
>> >> -	cmpq %rcx,RIP(%rsp)		/* RCX == RIP */
>> >> +	movq RIP(%rsp),%r11
>> >> +	cmpq %rcx,%r11			/* RCX == RIP */
>> >>  	jne opportunistic_sysret_failed
>> >
>> > Btw., in the normal syscall entry path, RIP(%rsp) == RCX(%rsp),
>> > because we set up pt_regs like that - and at this point RIP/RCX is
>> > guaranteed to be canonical, right?
>> >
>> > So if there's a mismatch generated, it's the kernel's doing.
>>
>> This is an optimization on the IRET exit code path.
>>
>> We go here if we know that pt_regs can be modified by e.g. ptrace.
>>
>> I think we also go here even on interrupt return.
>
> Yeah, missed that, this would kill any flag based approach.
>
>> (Granted, chances that RCX was the same as RIP at the moment of
>> interrupt are slim, but we still would check that and (ab)use SYSRET
>> if it looks like it'll work).
>
> Btw., there's a neat trick we could do: in the HLT, MWAIT and
> ACPI-idle code we could attempt to set up RCX to match RIP, to trigger
> this optimization in the common 'irq interrupted the idle task' case?

sysret only returns to CPL3.

--
Brian Gerst