Date: Fri, 27 Mar 2015 11:45:05 +0100
From: Denys Vlasenko
To: Ingo Molnar
CC: Andy Lutomirski, Borislav Petkov, x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86/asm/entry/64: better check for canonical address
Message-ID: <551534B1.6090908@redhat.com>
References: <1427373731-13056-1-git-send-email-dvlasenk@redhat.com> <20150327081141.GA9526@gmail.com>
In-Reply-To: <20150327081141.GA9526@gmail.com>

On 03/27/2015 09:11 AM, Ingo Molnar wrote:
>
> * Denys Vlasenko wrote:
>
>> This change makes the check exact (no more false positives
>> on kernel addresses).
>>
>> It isn't really important to be fully correct here -
>> almost all addresses we'll ever see will be userspace ones,
>> but OTOH it looks to be cheap enough:
>> the new code uses two more ALU ops but preserves %rcx,
>> allowing us not to reload it from pt_regs->cx again.
>> At the disassembly level, the changes are:
>>
>> cmp %rcx,0x80(%rsp)  ->  mov 0x80(%rsp),%r11; cmp %rcx,%r11
>> shr $0x2f,%rcx       ->  shl $0x10,%rcx; sar $0x10,%rcx; cmp %rcx,%r11
>> mov 0x58(%rsp),%rcx  ->  (eliminated)
>>
>> Signed-off-by: Denys Vlasenko
>> CC: Borislav Petkov
>> CC: x86@kernel.org
>> CC: linux-kernel@vger.kernel.org
>> ---
>>
>> Andy, I'm undecided myself on the merits of doing this.
>> If you like it, feel free to take it in your tree.
>> I trimmed the CC list so as not to bother too many people with this
>> trivial and quite possibly "useless churn"-class change.
>>
>>  arch/x86/kernel/entry_64.S | 23 ++++++++++++-----------
>>  1 file changed, 12 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
>> index bf9afad..a36d04d 100644
>> --- a/arch/x86/kernel/entry_64.S
>> +++ b/arch/x86/kernel/entry_64.S
>> @@ -688,26 +688,27 @@ retint_swapgs:	/* return to user-space */
>>  	 * a completely clean 64-bit userspace context.
>>  	 */
>>  	movq RCX(%rsp),%rcx
>> -	cmpq %rcx,RIP(%rsp)		/* RCX == RIP */
>> +	movq RIP(%rsp),%r11
>> +	cmpq %rcx,%r11			/* RCX == RIP */
>>  	jne opportunistic_sysret_failed
>
> Btw., in the normal syscall entry path, RIP(%rsp) == RCX(%rsp),
> because we set up pt_regs like that - and at this point RIP/RCX is
> guaranteed to be canonical, right?
>
> So if there's a mismatch generated, it's the kernel's doing.

This is an optimization on the IRET exit code path. We go here if we
know that pt_regs may have been modified by, e.g., ptrace.

I think we also go here even on interrupt return. (Granted, the chances
that RCX was the same as RIP at the moment of the interrupt are slim,
but we would still check for that and (ab)use SYSRET if it looks like
it'll work.)
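To spell out what the new shl/sar sequence computes, here is a rough
standalone C equivalent (the helper name and test harness are
illustrative only, not kernel code; it assumes 48-bit virtual
addresses, where an address is canonical iff sign-extending it from
bit 47 reproduces the original value):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Equivalent of "shl $0x10,%rcx; sar $0x10,%rcx": shift left by 16,
 * then arithmetic-shift right by 16, sign-extending from bit 47.
 * The address is canonical iff the round trip leaves it unchanged.
 */
static bool is_canonical(uint64_t addr)
{
	return (uint64_t)((int64_t)(addr << 16) >> 16) == addr;
}

int main(void)
{
	printf("%d\n", is_canonical(0x00007fffffffffffULL)); /* 1: top of user space */
	printf("%d\n", is_canonical(0x0000800000000000ULL)); /* 0: non-canonical hole */
	printf("%d\n", is_canonical(0xffff800000000000ULL)); /* 1: kernel address    */
	return 0;
}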
> Why don't we detect those cases where a new return address is created
> (ptrace, exec, etc.), check for canonicalness and add a TIF flag for
> it (and add it to the work mask) and execute the IRET from the slow
> path?
>
> We already have a work-mask branch.
>
> That would allow the removal of all these checks and canonization
> from the fast return path! We could go straight to the SYSRET...

The point is, this is not a fast return path. It's a "let's try to use
the fast SYSRET instead of IRET" path.

> The frequency of exec() and ptrace() is 2-3 orders of magnitude lower
> than the frequency of system calls, so this would be well worth it.

On untraced system calls, we don't come here. We go to SYSRET without
these checks.
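For context, the decision this path makes is roughly the following C
sketch (the struct, field, and function names are mine; the real code
in entry_64.S is assembly, operates on the pt_regs frame directly, and
checks a few additional conditions):

#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_RF (1ULL << 16)

/* Minimal stand-in for the pt_regs fields that matter here. */
struct saved_regs {
	uint64_t ip;     /* RIP(%rsp)    */
	uint64_t cx;     /* RCX(%rsp)    */
	uint64_t r11;    /* R11(%rsp)    */
	uint64_t flags;  /* EFLAGS(%rsp) */
};

/* Same 48-bit sign-extension test as in the earlier sketch. */
static bool is_canonical(uint64_t addr)
{
	return (uint64_t)((int64_t)(addr << 16) >> 16) == addr;
}

/*
 * Rough shape of the opportunistic-SYSRET decision: if this returns
 * false, we take the slow but always-correct IRET path instead.
 */
bool try_opportunistic_sysret(const struct saved_regs *r)
{
	if (r->cx != r->ip)             /* SYSRET reloads RIP from RCX */
		return false;
	if (!is_canonical(r->ip))       /* non-canonical RIP would #GP on SYSRET */
		return false;
	if (r->r11 != r->flags)         /* SYSRET reloads RFLAGS from R11 */
		return false;
	if (r->flags & X86_EFLAGS_RF)   /* SYSRET can't restore RF; must use IRET */
		return false;
	return true;
}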