From: Denys Vlasenko
Date: Tue, 10 Mar 2015 15:09:27 +0100
Subject: Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath
To: Andy Lutomirski
Cc: Ingo Molnar, Denys Vlasenko, Linus Torvalds, Steven Rostedt,
 Borislav Petkov, "H. Peter Anvin", Oleg Nesterov, Frederic Weisbecker,
 Alexei Starovoitov, Will Drewry, Kees Cook, X86 ML,
 "linux-kernel@vger.kernel.org"
References: <1425926364-9526-1-git-send-email-dvlasenk@redhat.com>
 <1425926364-9526-4-git-send-email-dvlasenk@redhat.com>
 <20150310125151.GB21686@gmail.com>
 <54FEEF0D.5080505@redhat.com>
 <20150310132147.GB26185@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 10, 2015 at 3:02 PM, Andy Lutomirski wrote:
> On Tue, Mar 10, 2015 at 7:00 AM, Denys Vlasenko wrote:
>> On Tue, Mar 10, 2015 at 2:26 PM, Andy Lutomirski wrote:
>>> usersp is IMO tolerable.  The nasty thing is the FIXUP_TOP_OF_STACK /
>>> RESTORE_TOP_OF_STACK garbage, and this patch is the main step toward
>>> killing that off completely.  I've still never convinced myself that
>>> there aren't ptrace-related info leaks in there.
>>>
>>> Denys, did you ever benchmark what happens if we use push instead of
>>> mov?  I bet that we get that cycle back and more, not to mention much
>>> less icache usage.
>>
>> Yes, I did.
>> Push conversion seems to perform the same as current, MOV-based code.
>>
>> The expected win there is that we lose two huge 12-byte insns
>> which store __USER_CS and __USER_DS in the iret frame.
>>
>> MOVQ imm,ofs(%rsp) has a very unfortunate encoding in x86:
>> - needs REX prefix
>> - no sign-extending imm8 form exists for it
>> - ofs in our case can't fit into 8 bits
>> - (%rsp) requires SIB byte
>>
>> In my tests, each such instruction adds one cycle.
>>
>> Compare this to PUSH imm8, which is 2 bytes only.
>
> Does that mean that using push on top of this patch gets us our cycle back?

Maybe. I can't be sure about it.

In general I see a jitter of 1-2, sometimes 3 cycles even when I make
changes which merely change code size (e.g. replacing insns with
equivalent ones).

This may be caused by jump targets getting aligned differently wrt
cacheline boundaries. If the second/third/fourth insn after the current
one is not fetched because it did not fit into the cacheline, then some
insn decoders don't get anything to chew on.
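
For reference, the size difference quoted above can be checked by
hand-assembling both forms. This is only a minimal sketch in C, not code
from the patch: the helper names and the 0x88 stack offset are made up
for illustration, and the immediates 0x33 and 0x2b are assumed to match
__USER_CS and __USER_DS on x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order, as x86 encodes it. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Hypothetical helper: encode "movq $imm, disp(%rsp)" using the disp32
 * form, which is needed once the offset no longer fits a signed byte.
 * Returns the instruction length in bytes.
 */
size_t encode_movq_imm_to_rsp_disp32(uint8_t *buf, size_t cap,
                                     int32_t imm, int32_t disp)
{
    assert(cap >= 12);
    buf[0] = 0x48;  /* REX.W prefix: 64-bit operand size */
    buf[1] = 0xC7;  /* MOV r/m64, imm32 (opcode extension /0) */
    buf[2] = 0x84;  /* ModRM: mod=10 (disp32), reg=000, rm=100 (SIB) */
    buf[3] = 0x24;  /* SIB: no index, base=%rsp */
    put_le32(buf + 4, (uint32_t)disp);
    put_le32(buf + 8, (uint32_t)imm);  /* no imm8 form, always 4 bytes */
    return 12;
}

/*
 * Hypothetical helper: encode "pushq $imm" using the imm8 form.
 * Returns the instruction length in bytes.
 */
size_t encode_push_imm8(uint8_t *buf, size_t cap, int8_t imm)
{
    assert(cap >= 2);
    buf[0] = 0x6A;          /* PUSH imm8, sign-extended to 64 bits */
    buf[1] = (uint8_t)imm;
    return 2;
}
```

With these, encode_movq_imm_to_rsp_disp32(buf, sizeof buf, 0x33, 0x88)
yields 48 c7 84 24 88 00 00 00 33 00 00 00 (12 bytes), while
encode_push_imm8(buf, sizeof buf, 0x33) yields 6a 33 (2 bytes).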