From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S966505AbbDVQmB (ORCPT ); Wed, 22 Apr 2015 12:42:01 -0400 Received: from mx1.redhat.com ([209.132.183.28]:58936 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S966681AbbDVQks (ORCPT ); Wed, 22 Apr 2015 12:40:48 -0400 From: Denys Vlasenko To: Ingo Molnar Cc: Denys Vlasenko , Linus Torvalds , Steven Rostedt , Borislav Petkov , "H. Peter Anvin" , Andy Lutomirski , Oleg Nesterov , Frederic Weisbecker , Alexei Starovoitov , Will Drewry , Kees Cook , x86@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 2/2] x86/asm/entry/32: Remove unnecessary optimization in stub32_clone Date: Wed, 22 Apr 2015 18:40:08 +0200 Message-Id: <1429720808-7173-2-git-send-email-dvlasenk@redhat.com> In-Reply-To: <1429720808-7173-1-git-send-email-dvlasenk@redhat.com> References: <1429720808-7173-1-git-send-email-dvlasenk@redhat.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Really swap arguments #4 and #5 in stub32_clone instead of "optimizing" it into a move. Yes, tls_val is currently unused. Yes, on some CPUs XCHG is a little bit more expensive than MOV. But a cycle or two on an expensive syscall like clone() is way below noise floor, and obfuscation of logic introduced by this optimization is simply not worth it. Signed-off-by: Denys Vlasenko CC: Linus Torvalds CC: Steven Rostedt CC: Ingo Molnar CC: Borislav Petkov CC: "H. Peter Anvin" CC: Andy Lutomirski CC: Oleg Nesterov CC: Frederic Weisbecker CC: Alexei Starovoitov CC: Will Drewry CC: Kees Cook CC: x86@kernel.org CC: linux-kernel@vger.kernel.org --- arch/x86/ia32/ia32entry.S | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S index 8e72256..0c302d0 100644 --- a/arch/x86/ia32/ia32entry.S +++ b/arch/x86/ia32/ia32entry.S @@ -567,11 +567,9 @@ GLOBAL(stub32_clone) * 32-bit clone API is clone(..., int tls_val, int *child_tidptr). * 64-bit clone API is clone(..., int *child_tidptr, int tls_val). * Native 64-bit kernel's sys_clone() implements the latter. - * We need to swap args here. But since tls_val is in fact ignored - * by sys_clone(), we can get away with an assignment - * (arg4 = arg5) instead of a full swap: + * We need to swap args here: */ - mov %r8, %rcx + xchg %r8, %rcx jmp ia32_ptregs_common ALIGN -- 1.8.1.4