From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1030202AbbDVRKW (ORCPT );
        Wed, 22 Apr 2015 13:10:22 -0400
Received: from relay4-d.mail.gandi.net ([217.70.183.196]:33912 "EHLO
        relay4-d.mail.gandi.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S966252AbbDVRKQ (ORCPT );
        Wed, 22 Apr 2015 13:10:16 -0400
X-Originating-IP: 98.158.13.35
Date: Wed, 22 Apr 2015 10:10:05 -0700
From: Josh Triplett
To: Andy Lutomirski
Cc: Denys Vlasenko, Ingo Molnar, Linus Torvalds, Steven Rostedt,
        Borislav Petkov, "H. Peter Anvin", Oleg Nesterov,
        Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
        X86 ML, "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH 2/2] x86/asm/entry/32: Remove unnecessary optimization in stub32_clone
Message-ID: <20150422171005.GA1020@jtriplet-mobl1>
References: <1429720808-7173-1-git-send-email-dvlasenk@redhat.com>
        <1429720808-7173-2-git-send-email-dvlasenk@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 22, 2015 at 09:54:24AM -0700, Andy Lutomirski wrote:
> On Wed, Apr 22, 2015 at 9:40 AM, Denys Vlasenko wrote:
> > Really swap arguments #4 and #5 in stub32_clone instead of "optimizing"
> > it into a move.
> >
> > Yes, tls_val is currently unused. Yes, on some CPUs XCHG is a little bit
> > more expensive than MOV. But a cycle or two on an expensive syscall like
> > clone() is way below noise floor, and obfuscation of logic introduced
> > by this optimization is simply not worth it.
>
> Ditto re: Josh's patch.

I do think my two-patch HAVE_COPY_THREAD_TLS series should go in fixing
this, but I'd like to see the final version of Denys' comment added on
top of it (with an update to the type and name of the tls argument to
match the changes to sys_clone). Denys, would you consider submitting a
patch adding your comment on top of the two-patch series I just sent?

Thanks,
Josh Triplett

> --Andy
>
> >
> > Signed-off-by: Denys Vlasenko
> > CC: Linus Torvalds
> > CC: Steven Rostedt
> > CC: Ingo Molnar
> > CC: Borislav Petkov
> > CC: "H. Peter Anvin"
> > CC: Andy Lutomirski
> > CC: Oleg Nesterov
> > CC: Frederic Weisbecker
> > CC: Alexei Starovoitov
> > CC: Will Drewry
> > CC: Kees Cook
> > CC: x86@kernel.org
> > CC: linux-kernel@vger.kernel.org
> > ---
> >  arch/x86/ia32/ia32entry.S | 6 ++----
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
> > index 8e72256..0c302d0 100644
> > --- a/arch/x86/ia32/ia32entry.S
> > +++ b/arch/x86/ia32/ia32entry.S
> > @@ -567,11 +567,9 @@ GLOBAL(stub32_clone)
> >          * 32-bit clone API is clone(..., int tls_val, int *child_tidptr).
> >          * 64-bit clone API is clone(..., int *child_tidptr, int tls_val).
> >          * Native 64-bit kernel's sys_clone() implements the latter.
> > -        * We need to swap args here. But since tls_val is in fact ignored
> > -        * by sys_clone(), we can get away with an assignment
> > -        * (arg4 = arg5) instead of a full swap:
> > +        * We need to swap args here:
> >          */
> > -       mov     %r8, %rcx
> > +       xchg    %r8, %rcx
> >         jmp     ia32_ptregs_common
> >
> >         ALIGN
> > --
> > 1.8.1.4
> >
> >
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC
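
For readers following the quoted diff, here is a rough C model of what the
two instructions do to the argument registers. It is only an illustrative
sketch: struct stub_regs, remap_mov() and remap_xchg() are hypothetical
names invented for this example and are not kernel code. It assumes, as the
"(arg4 = arg5)" comment being removed implies, that %rcx carries arg #4 and
%r8 carries arg #5 when the stub runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two registers touched by stub32_clone. */
struct stub_regs {
	unsigned long rcx;	/* arg #4: tls_val in the 32-bit API      */
	unsigned long r8;	/* arg #5: child_tidptr in the 32-bit API */
};

/* Models "mov %r8, %rcx": arg4 = arg5, and the old arg4 is lost. */
static void remap_mov(struct stub_regs *r)
{
	assert(r != NULL);
	r->rcx = r->r8;
}

/* Models "xchg %r8, %rcx": a full swap, so both values survive. */
static void remap_xchg(struct stub_regs *r)
{
	unsigned long tmp;

	assert(r != NULL);
	tmp = r->rcx;
	r->rcx = r->r8;
	r->r8 = tmp;
}
```

With rcx = tls_val and r8 = child_tidptr on entry, remap_mov() leaves
child_tidptr in both slots, which only works while the fifth argument is
ignored. remap_xchg() leaves child_tidptr in arg #4 and tls_val in arg #5,
which matches the 64-bit argument order quoted in the comment.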