Subject: Re: [PATCH] x86/asm: pessimize the pre-initialization case in static_cpu_has()
To: Borislav Petkov
Cc: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Linux Kernel Mailing List
References: <20210908171716.3340120-1-hpa@zytor.com>
From: "H. Peter Anvin"
Message-ID: <1a73e0c2-8efe-fee9-5141-f7e9a95c748d@zytor.com>
Date: Thu, 9 Sep 2021 14:28:42 -0700

On 9/9/21 10:01 AM, Borislav Petkov wrote:
> On Wed, Sep 08, 2021 at 10:17:16AM -0700, H. Peter Anvin (Intel) wrote:
>
>> Subject: Re: [PATCH] x86/asm: pessimize the pre-initialization case in static_cpu_has()
>
> "pessimize" huh? :)
>
> Why not simply
>
> "Do not waste registers in the pre-initialization case..."
>

Because it is shorter and thus can fit more content.

> ?
>
>> gcc will sometimes manifest the address of boot_cpu_data in a register
>> as part of constant propagation. When multiple static_cpu_has() are
>> used this may foul the mainline code with a register load which will
>> only be used on the fallback path, which is unused after
>> initialization.
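
To make the addressing difference concrete, here is a minimal,
self-contained sketch (not the kernel's actual static_cpu_has()
implementation; the struct and function names are invented, and a
non-PIC, kernel-style build is assumed).  An "m" operand leaves gcc
free to materialize the address of the global in a register first,
while an "i" operand printed with the %P modifier forces a bare symbol
reference into the instruction itself:

/*
 * Sketch only: boot_cpu_data_sketch, has_bit_mem() and has_bit_imm()
 * are invented names for illustration.
 */
struct cpuinfo_sketch {
	unsigned char capability[4];
};

struct cpuinfo_sketch boot_cpu_data_sketch;	/* stand-in for boot_cpu_data */

static inline int has_bit_mem(void)
{
	unsigned char ok;

	/* "m" operand: gcc may load the address into a register first */
	asm("testb $0x8, %[cap]\n\t"
	    "setnz %[ok]"
	    : [ok] "=r" (ok)
	    : [cap] "m" (boot_cpu_data_sketch.capability[0]));
	return ok;
}

static inline int has_bit_imm(void)
{
	unsigned char ok;

	/* "i" + %P: the symbol address is emitted directly in the instruction */
	asm("testb $0x8, %P[cap]\n\t"
	    "setnz %[ok]"
	    : [ok] "=r" (ok)
	    : [cap] "i" (&boot_cpu_data_sketch.capability[0]));
	return ok;
}

With the "i"/%P form there is no register operand for gcc to hoist, so
repeated uses cannot leave a stray address load in the mainline code.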
>
> So a before-after thing looks like this here:
>
> before:
>
> ffffffff89696517 <.altinstr_aux>:
> ffffffff89696517:   f6 05 cb 09 cb ff 80    testb  $0x80,-0x34f635(%rip)    # ffffffff89346ee9
> ffffffff8969651e:   0f 85 fc 3e fb ff       jne    ffffffff8964a420
> ffffffff89696524:   e9 ee 3e fb ff          jmp    ffffffff8964a417
> ffffffff89696529:   f6 45 6a 08             testb  $0x8,0x6a(%rbp)
> ffffffff8969652d:   0f 85 45 b9 97 f7       jne    ffffffff81011e78
> ffffffff89696533:   e9 95 b9 97 f7          jmp    ffffffff81011ecd
> ffffffff89696538:   41 f6 44 24 6a 08       testb  $0x8,0x6a(%r12)
> ffffffff8969653e:   0f 85 d3 bc 97 f7       jne    ffffffff81012217
> ffffffff89696544:   e9 d9 bc 97 f7          jmp    ffffffff81012222
> ffffffff89696549:   41 f6 44 24 6a 08       testb  $0x8,0x6a(%r12)
>
> after:
>
> ffffffff89696517 <.altinstr_aux>:
> ffffffff89696517:   f6 04 25 e9 6e 34 89    testb  $0x80,0xffffffff89346ee9
> ffffffff8969651e:   80
> ffffffff8969651f:   0f 85 fb 3e fb ff       jne    ffffffff8964a420
> ffffffff89696525:   e9 ed 3e fb ff          jmp    ffffffff8964a417
> ffffffff8969652a:   f6 04 25 ea 6e 34 89    testb  $0x8,0xffffffff89346eea
> ffffffff89696531:   08
> ffffffff89696532:   0f 85 37 b9 97 f7       jne    ffffffff81011e6f
> ffffffff89696538:   e9 89 b9 97 f7          jmp    ffffffff81011ec6
> ffffffff8969653d:   f6 04 25 ea 6e 34 89    testb  $0x8,0xffffffff89346eea
> ffffffff89696544:   08
> ffffffff89696545:   0f 85 b5 bc 97 f7       jne    ffffffff81012200
> ffffffff8969654b:   e9 bb bc 97 f7          jmp    ffffffff8101220b
> ffffffff89696550:   f6 04 25 ea 6e 34 89    testb  $0x8,0xffffffff89346eea
>
> so you're basically forcing an immediate thing.
>
> And you wanna get rid of the (%) relative addressing and force it
> to be rip-relative.
>
>> Explicitly force gcc to use immediate (rip-relative) addressing for
>
> Right, the rip-relative addressing doesn't happen here:
>

Indeed it doesn't (egg on my face), nor, as it turns out, is there
currently a way to do so (just adding (%%rip) breaks i386, and there is
no equivalent to %{pP} which adds the suffix).

Let me fix both; will have a patchset shortly.

	-hpa
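
For illustration of the i386 problem mentioned above, a rough sketch of
making the suffix conditional on the 64-bit build (ASM_RIP_SUFFIX and
cap_byte_sketch are invented names for this sketch, not what the
follow-up patchset actually added); i386 has no RIP-relative
addressing, so the "(%rip)" suffix can only be emitted on x86-64:

/* Hypothetical sketch only, not the kernel's eventual macro. */
#ifdef __x86_64__
# define ASM_RIP_SUFFIX	"(%%rip)"
#else
# define ASM_RIP_SUFFIX	""
#endif

extern unsigned char cap_byte_sketch;	/* stand-in for the capability byte */

static inline int has_bit_rip(void)
{
	unsigned char ok;

	/* %P prints the bare symbol; the suffix makes it RIP-relative on 64-bit only */
	asm("testb $0x8, %P[cap]" ASM_RIP_SUFFIX "\n\t"
	    "setnz %[ok]"
	    : [ok] "=r" (ok)
	    : [cap] "i" (&cap_byte_sketch));
	return ok;
}

On a 64-bit build this assembles to "testb $0x8, cap_byte_sketch(%rip)";
on i386 it falls back to the absolute form.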