Date: Tue, 10 Mar 2015 14:21:48 +0100
From: Ingo Molnar
To: Denys Vlasenko
Cc: Andy Lutomirski, Linus Torvalds, Steven Rostedt, Borislav Petkov,
    "H. Peter Anvin", Oleg Nesterov, Frederic Weisbecker,
    Alexei Starovoitov, Will Drewry, Kees Cook, x86@kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath
Message-ID: <20150310132147.GB26185@gmail.com>
References: <1425926364-9526-1-git-send-email-dvlasenk@redhat.com>
    <1425926364-9526-4-git-send-email-dvlasenk@redhat.com>
    <20150310125151.GB21686@gmail.com>
    <54FEEF0D.5080505@redhat.com>
In-Reply-To: <54FEEF0D.5080505@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

* Denys Vlasenko wrote:

> > So there are now +2 instructions (5 instead of 3) in the
> > system_call path, but there are -2 instructions in the SYSRETQ
> > path,
>
> Unfortunately, no. [...]

So I assumed that it was an equivalent transformation, given that none
of the changelogs spelled out the increase in overhead ...

> [...]
> There is only this change in SYSRETQ path, which simply
> changes where we get RSP from:
>
> @@ -293,7 +289,7 @@ ret_from_sys_call:
>      CFI_REGISTER    rip,rcx
>      movq    EFLAGS(%rsp),%r11
>      /*CFI_REGISTER  rflags,r11*/
> -    movq    PER_CPU_VAR(old_rsp), %rsp
> +    movq    RSP(%rsp),%rsp
>      /*
>       * 64bit SYSRET restores rip from rcx,
>       * rflags from r11 (but RF and VM bits are forced to 0),
>
> Most likely, no change in execution speed here.
> At best, it is one cycle faster somewhere in address generation unit
> because for PER_CPU_VAR() address evaluation, GS base is nonzero.
>
> Since this patch does add two extra MOVs,
> I did benchmark these patches. They add exactly one cycle
> to system call code path on my Sandy Bridge CPU.

Hm, but that's the wrong direction, we should try to make it faster,
and to clean it up - but making it slower without really good reasons
isn't good.

Is 'usersp' really that much of a complication?

Thanks,

    Ingo