From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E13CC433E4 for ; Fri, 26 Mar 2021 15:16:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 085F861A44 for ; Fri, 26 Mar 2021 15:16:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230449AbhCZPPz (ORCPT ); Fri, 26 Mar 2021 11:15:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33274 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230483AbhCZPO6 (ORCPT ); Fri, 26 Mar 2021 11:14:58 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 18DC8C0613B2 for ; Fri, 26 Mar 2021 08:14:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=FxXbn2v6mpkKjPN6AxBStAi8amFCVrOrJj8oCrJp7K8=; b=gZShJllJ0FN5npqzOh0UcfGB8N GBw0GKLnyE86gNqUfNy5JZAw9FFsg/jr62FiwSf1vBTjsxfButvlawDdebahjELL1DMfXpIa5nFpT 2ZVw0WZq10/s1ZCxTEkS+2is4pBxyBAA5N49/yE2fcdBep+x+mo3loo4OMMJyiINA4eGh+WwtzHLZ Q5MNsSu88ljq5L9iEgMh/itpIdk15TUbtQWRarNBio/vwsZp1sKFx+gmW13X3N4TFABYyyIu7gFLe NkZp8cm0FGAb1VCAiR/9MvToo3AU3nJmefg3HJWiVieQE2/YyLH0+Dp/4/ZOa/O5r6cpUVUy/dAc3 GZFYk7Vg==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94 #2 (Red Hat Linux)) id 1lPo9X-00F0oM-Rv; Fri, 26 Mar 2021 15:14:01 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 78EB8306099; Fri, 26 Mar 2021 16:13:34 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 374462BAC2E5F; Fri, 26 Mar 2021 16:13:34 +0100 (CET) Message-ID: <20210326151259.442992235@infradead.org> User-Agent: quilt/0.66 Date: Fri, 26 Mar 2021 16:12:01 +0100 From: Peter Zijlstra To: x86@kernel.org, jpoimboe@redhat.com, jgross@suse.com, mbenes@suse.com Cc: linux-kernel@vger.kernel.org, peterz@infradead.org Subject: [PATCH v3 02/16] x86/alternatives: Optimize optimize_nops() References: <20210326151159.128534163@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Currently optimize_nops() scans to see if the alternative starts with NOPs. However, the emit pattern is: 141: \oldinstr 142: .skip (len-(142b-141b)), 0x90 That is, when oldinstr is short, we pad the tail with NOPs. This case never gets optimized. Rewrite optimize_nops() to replace any string of NOPs inside the alternative to larger NOPs. Also run it irrespective of patching, replacing NOPs in both the original and replaced code. A direct consequence is that padlen becomes superfluous, so remove it. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/alternative.h | 14 ++------ arch/x86/kernel/alternative.c | 45 +++++++++++++++----------- tools/objtool/arch/x86/include/arch/special.h | 2 - 3 files changed, 33 insertions(+), 28 deletions(-) --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -65,7 +65,6 @@ struct alt_instr { u16 cpuid; /* cpuid bit set for replacement */ u8 instrlen; /* length of original instruction */ u8 replacementlen; /* length of new instruction */ - u8 padlen; /* length of build-time padding */ } __packed; /* @@ -104,7 +103,6 @@ static inline int alternatives_text_rese #define alt_end_marker "663" #define alt_slen "662b-661b" -#define alt_pad_len alt_end_marker"b-662b" #define alt_total_slen alt_end_marker"b-661b" #define alt_rlen(num) e_replacement(num)"f-"b_replacement(num)"f" @@ -151,8 +149,7 @@ static inline int alternatives_text_rese " .long " b_replacement(num)"f - .\n" /* new instruction */ \ " .word " __stringify(feature) "\n" /* feature bit */ \ " .byte " alt_total_slen "\n" /* source len */ \ - " .byte " alt_rlen(num) "\n" /* replacement len */ \ - " .byte " alt_pad_len "\n" /* pad len */ + " .byte " alt_rlen(num) "\n" /* replacement len */ #define ALTINSTR_REPLACEMENT(newinstr, num) /* replacement */ \ "# ALT: replacement " #num "\n" \ @@ -315,13 +312,12 @@ static inline int alternatives_text_rese * enough information for the alternatives patching code to patch an * instruction. See apply_alternatives(). */ -.macro altinstruction_entry orig alt feature orig_len alt_len pad_len +.macro altinstruction_entry orig alt feature orig_len alt_len .long \orig - . .long \alt - . .word \feature .byte \orig_len .byte \alt_len - .byte \pad_len .endm /* @@ -338,7 +334,7 @@ static inline int alternatives_text_rese 142: .pushsection .altinstructions,"a" - altinstruction_entry 140b,143f,\feature,142b-140b,144f-143f,142b-141b + altinstruction_entry 140b,143f,\feature,142b-140b,144f-143f .popsection .pushsection .altinstr_replacement,"ax" @@ -375,8 +371,8 @@ static inline int alternatives_text_rese 142: .pushsection .altinstructions,"a" - altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f,142b-141b - altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f,142b-141b + altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f + altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f .popsection .pushsection .altinstr_replacement,"ax" --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -189,19 +189,31 @@ recompute_jump(struct alt_instr *a, u8 * static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr) { unsigned long flags; - int i; + struct insn insn; + int nop, i = 0; - for (i = 0; i < a->padlen; i++) { - if (instr[i] != 0x90) + for (;;) { + if (insn_decode_kernel(&insn, &instr[i])) return; + + if (insn.length == 1 && insn.opcode.bytes[0] == 0x90) + break; + + if ((i += insn.length) >= a->instrlen) + return; + } + + for (nop = i; i < a->instrlen; i++) { + if (WARN_ONCE(instr[i] != 0x90, "Not a NOP at 0x%px\n", &instr[i])) + break; } local_irq_save(flags); - add_nops(instr + (a->instrlen - a->padlen), a->padlen); + add_nops(instr + nop, i - nop); local_irq_restore(flags); DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ", - instr, a->instrlen - a->padlen, a->padlen); + instr, nop, i - nop); } /* @@ -247,19 +259,15 @@ void __init_or_module noinline apply_alt * - feature not present but ALTINSTR_FLAG_INV is set to mean, * patch if feature is *NOT* present. */ - if (!boot_cpu_has(feature) == !(a->cpuid & ALTINSTR_FLAG_INV)) { - if (a->padlen > 1) - optimize_nops(a, instr); + if (!boot_cpu_has(feature) == !(a->cpuid & ALTINSTR_FLAG_INV)) + goto next; - continue; - } - - DPRINTK("feat: %s%d*32+%d, old: (%pS (%px) len: %d), repl: (%px, len: %d), pad: %d", + DPRINTK("feat: %s%d*32+%d, old: (%pS (%px) len: %d), repl: (%px, len: %d)", (a->cpuid & ALTINSTR_FLAG_INV) ? "!" : "", feature >> 5, feature & 0x1f, instr, instr, a->instrlen, - replacement, a->replacementlen, a->padlen); + replacement, a->replacementlen); DUMP_BYTES(instr, a->instrlen, "%px: old_insn: ", instr); DUMP_BYTES(replacement, a->replacementlen, "%px: rpl_insn: ", replacement); @@ -283,14 +291,15 @@ void __init_or_module noinline apply_alt if (a->replacementlen && is_jmp(replacement[0])) recompute_jump(a, instr, replacement, insn_buff); - if (a->instrlen > a->replacementlen) { - add_nops(insn_buff + a->replacementlen, - a->instrlen - a->replacementlen); - insn_buff_sz += a->instrlen - a->replacementlen; - } + for (; insn_buff_sz < a->instrlen; insn_buff_sz++) + insn_buff[insn_buff_sz] = 0x90; + DUMP_BYTES(insn_buff, insn_buff_sz, "%px: final_insn: ", instr); text_poke_early(instr, insn_buff, insn_buff_sz); + +next: + optimize_nops(a, instr); } } --- a/tools/objtool/arch/x86/include/arch/special.h +++ b/tools/objtool/arch/x86/include/arch/special.h @@ -10,7 +10,7 @@ #define JUMP_ORIG_OFFSET 0 #define JUMP_NEW_OFFSET 4 -#define ALT_ENTRY_SIZE 13 +#define ALT_ENTRY_SIZE 12 #define ALT_ORIG_OFFSET 0 #define ALT_NEW_OFFSET 4 #define ALT_FEATURE_OFFSET 8