From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.2 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE, SPF_PASS,UNPARSEABLE_RELAY,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E189C433EF for ; Wed, 8 Sep 2021 07:12:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 367C361175 for ; Wed, 8 Sep 2021 07:12:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349144AbhIHHN1 (ORCPT ); Wed, 8 Sep 2021 03:13:27 -0400 Received: from out30-42.freemail.mail.aliyun.com ([115.124.30.42]:51129 "EHLO out30-42.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349259AbhIHHNS (ORCPT ); Wed, 8 Sep 2021 03:13:18 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R201e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04394;MF=laijs@linux.alibaba.com;NM=1;PH=DS;RN=14;SR=0;TI=SMTPD_---0UnfT516_1631085126; Received: from C02XQCBJJG5H.local(mailfrom:laijs@linux.alibaba.com fp:SMTPD_---0UnfT516_1631085126) by smtp.aliyun-inc.com(127.0.0.1); Wed, 08 Sep 2021 15:12:07 +0800 Subject: Re: [PATCH 25/24] x86/traps: Rewrite native_load_gs_index in C code To: "H. Peter Anvin" , Lai Jiangshan , linux-kernel@vger.kernel.org Cc: Andy Lutomirski , Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, Dave Hansen , Peter Zijlstra , Dave Jiang , Ben Widawsky , Dan Williams , Arvind Sankar References: <20210831175025.27570-1-jiangshanlai@gmail.com> <20210902105052.2842-1-jiangshanlai@gmail.com> <9fdb04b1-dbb8-069d-f5ef-d4e8c0f2a83e@zytor.com> <638f3b2b-aff9-72e5-3a5d-fff5ef6b88fc@zytor.com> From: Lai Jiangshan Message-ID: Date: Wed, 8 Sep 2021 15:12:06 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: <638f3b2b-aff9-72e5-3a5d-fff5ef6b88fc@zytor.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2021/9/8 13:00, H. Peter Anvin wrote: > On 9/7/21 9:42 PM, H. Peter Anvin wrote: >> >> >> On 9/7/21 6:38 PM, H. Peter Anvin wrote: >>> On 9/2/21 3:50 AM, Lai Jiangshan wrote: >>>> From: Lai Jiangshan >>>> >>>> There is no constrain/limition to force native_load_gs_index() to be in >>>> ASM code. >>> >>> Hi, >>> >>> First of all, let me say I really like your patchset, and I will try to >>> review it in detail ASAP (most of the initial read pass looks very sane >>> to me. Hello Thank you for your review. >>> >>> However, I would like to object in part this specific patch. It adds a >>> fair bit of extra code to the exception path, and adds jumps between >>> files which makes the code much harder to read. I tried putting all code into a single C function. But I didn't know how to use a C-label in _ASM_EXTABLE and then I split it. Your code is much better. Thanks Lai >>> >>> You end up doing one swapgs in assembly and one in C, which would seem >>> to be a very good indication that really isn't an improvement. >>> >>> Note that this entire sequence is scheduled to be obsoleted by a single >>> atomic hardware instruction, LKGS, which will replace ALL of >>> native_load_gs_index(); it will no longer be necessary even to disable >>> interrupts as there is no non-atomic state. In that sense, doing this as >>> an out-of-line C function (with some inline assembly) is great, because >>> it makes it far easier to use LKGS as an alternative; the only (small) >>> disadvantage is that it ends up clobbering a lot of registers >>> unnecessarily (in assembly it can be implemented clobbering only two >>> registers; one if one uses pushf/popf to save the interrupt flag.) >>> >> >> OK, here is a version which actually compiles: >> > > ... slightly shorter and minimally better compiled code ... > > noinstr void native_load_gs_index(unsigned int selector) > { >     unsigned long flags; > >     local_irq_save(flags); >     native_swapgs(); > do_mov_gs: >     asm_volatile_goto("1: mov %[seg],%%gs\n" >               "2:\n" >               _ASM_EXTABLE(1b,%l[bad_seg]) >               : : [seg] "r" (selector) : : bad_seg); >     alternative("", "mfence", X86_BUG_SWAPGS_FENCE); >     native_swapgs(); >     local_irq_restore(flags); >     return; > > bad_seg: >     /* The exception dispatch will have restored kernel GS */ >     native_swapgs(); >     alternative_input("", "mov %[seg],%%gs", >               X86_BUG_NULL_SEG, [seg] "r" (__USER_DS)); >     selector = 0; >     goto do_mov_gs; > }