From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752377AbbCaWVo (ORCPT ); Tue, 31 Mar 2015 18:21:44 -0400
Received: from mail-wg0-f54.google.com ([74.125.82.54]:34385 "EHLO
        mail-wg0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751638AbbCaWVn (ORCPT );
        Tue, 31 Mar 2015 18:21:43 -0400
MIME-Version: 1.0
In-Reply-To: <1427821211-25099-7-git-send-email-dvlasenk@redhat.com>
References: <1427821211-25099-1-git-send-email-dvlasenk@redhat.com>
        <1427821211-25099-7-git-send-email-dvlasenk@redhat.com>
Date: Tue, 31 Mar 2015 18:21:42 -0400
Message-ID:
Subject: Re: [PATCH 7/9] x86/asm/entry/32: tidy up some instructions
From: Brian Gerst
To: Denys Vlasenko
Cc: Ingo Molnar, Linus Torvalds, Steven Rostedt, Borislav Petkov,
        "H. Peter Anvin", Andy Lutomirski, Oleg Nesterov,
        Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
        "the arch/x86 maintainers", Linux Kernel Mailing List
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 31, 2015 at 1:00 PM, Denys Vlasenko wrote:
> After TESTs, use logically correct JZ mnemonic instead of JE
> (this doesn't change code).
>
> Tidy up CMPW insns:
>
> Modern CPUs are not good with 16-bit operations.
> The instructions with 16-bit immediates are especially bad,
> on many CPUs they cause length changing prefix stall
> in the decoders, costing ~6 cycles to recover.
>
> Replace CMPWs with CMPLs.
> Of these, for form with 8-bit sign-extended immediates
> it is a win because they are smaller now
> (no 0x66 prefix anymore);
> ones with 16-bit immediates are faster.
>
> @@ -708,7 +708,7 @@ END(sysenter_badsys)
>  #ifdef CONFIG_X86_ESPFIX32
>         movl %ss, %eax
>         /* see if on espfix stack */
> -       cmpw $__ESPFIX_SS, %ax
> +       cmpl $__ESPFIX_SS, %eax
>         jne 27f
>         movl $__KERNEL_DS, %eax
>         movl %eax, %ds

This is incorrect.  32-bit reads from a segment register are not
zero-extended.  The upper 16 bits are implementation-defined.  Most
processors will clear them but it's not guaranteed.

--
Brian Gerst
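
To illustrate the objection above, here is a minimal C sketch of the two
comparisons.  It only models the arithmetic and is not kernel code.
EXAMPLE_ESPFIX_SS and both function names are hypothetical stand-ins for
this example, and the "garbage" upper bits in the usage note are made up
to represent a processor that does not clear them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical selector value standing in for __ESPFIX_SS. */
#define EXAMPLE_ESPFIX_SS 0x00d0u

/* Models "cmpw $sel, %ax": only the low 16 bits of %eax are compared. */
static bool selector_matches_16(uint32_t eax)
{
    return (uint16_t)eax == (uint16_t)EXAMPLE_ESPFIX_SS;
}

/* Models "cmpl $sel, %eax": all 32 bits of %eax are compared. */
static bool selector_matches_32(uint32_t eax)
{
    return eax == EXAMPLE_ESPFIX_SS;
}
```

With %eax = 0x000000d0 both checks agree.  With %eax = 0xabcd00d0 (same
selector, nonzero upper half) the 16-bit check still matches, but the
32-bit check does not, so the "jne 27f" after the cmpl would be taken and
the espfix path skipped.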