From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.0 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_ADSP_CUSTOM_MED,DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C9664C433DF for ; Wed, 24 Jun 2020 20:35:19 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 77BD92080C for ; Wed, 24 Jun 2020 20:35:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="MSO++Q0v"; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="COoeXSQl" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 77BD92080C Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:To:From:Subject:References:Mime-Version:Message-Id: In-Reply-To:Date:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=8aAj4HKM+H6S5PCF0k3fFOUhq4S7e9ghyb4+Jrmgf5Q=; b=MSO++Q0v9RhqV9LvrlxRSFUYh FUFCRT4CHRktGrm1xh69N5rIUkOlr6Ht8vRssiflWASJ0euhnk7/iT4ttBeobX7rjiI+m9PAjD21T POWRRH1Cx/9lxfGNpI1/t+lebC62x9lLMLGnMS9hPlgx4/qT1T2wvYH1oBg05J0Jlb88iUQg0A+Aa w3AaJbrw22nl4v20M/fXTRu2I3CqbD0nXuFFbiKZRzrYlDTfQ04/KjHq7navLPLOwTos6OgVSfWkm pZDk4bE+vBb0l63GvPr09q+RFF7+TE231natZm5zaRcPR4dl8a1wPoJO0KIk1osaEBr9DIgcFAwwz 4i8PxJidQ==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1joC4s-00072w-1x; Wed, 24 Jun 2020 20:33:02 +0000 Received: from mail-yb1-xb49.google.com ([2607:f8b0:4864:20::b49]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1joC4o-000710-H9 for linux-arm-kernel@lists.infradead.org; Wed, 24 Jun 2020 20:33:00 +0000 Received: by mail-yb1-xb49.google.com with SMTP id z7so3555256ybz.1 for ; Wed, 24 Jun 2020 13:32:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=/wrr17V+FsCXcFbmRN2vHICWLIelteQsueKzbrVs9hw=; b=COoeXSQlCw4QRSMoNZarFzFNTbDcr6maZAyBTdDeJ3xsLc6rncF7v9wWsAunQPtuzH 7vNEgFVFpuwR2ETEwflij42oCPn4XuNbO8xP+xeO/JpQ4cbNn2mzatO27UjmEPI7Tlmm 5TKoTnAzd8KGxzyXY/s7E79IbefQXu1e7hrmLjW2x43kK7mH1XyN8wifsRReBhhprunp uhXKFTHc+VpBQyltyThaPTu+CQS4/Nc+Iffym5jl0Vq7YLusiAqpFo333GBF/6UHQ6HU tKuOaCWx4957S+p/qAE9zWc4CIXFC4tU55bAISk7rrINEcRJNCeupH5Gi/K/bZ9GZlKw OtfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=/wrr17V+FsCXcFbmRN2vHICWLIelteQsueKzbrVs9hw=; b=UjmoSqXEJ/dXum3xiM+AP5G2drio8SIGyns6u+Q2HLxW42JyHnfXSJYu7D1bpX5feO bHAPib0ysDUMGQshDZe0xJykbwmMVtN4semwAFoe2JNq7Jy/US48vDMSPpjCeh5LhuHD jndvgtgcCdlXiLZgeSjjblijjjGb4nOEiPHcSQ2OE4wPvJblb7dTYXT0k5tflhRQ7EpB hr8ELSdJnI2xsNNXnKEWYjyVJu/UA8r1f3N7tespJsMZ8MW/p7QI4BmWG6qxqsyGzPXN OW5/ORUdeUjt6GHqs71+torpS3ZlHEz/T2uxEfHZMQsgNKA5Oywoqbg7V+HFW5eLp8YL tXOQ== X-Gm-Message-State: AOAM531aEZnWa0KKa5VBC7iX7WRBIDtGgdhLyqW0CH2xlfXKTr+/Dr/X vhS6Kr6HTj3CSfZJhMl908RlqSz9uKsH6o9bFA8= X-Google-Smtp-Source: ABdhPJzNUx+zDn4WfpUBRej52AydQCDedaVXaTDX3VpJk3mmTtqNJ/BdB8+Tfc74cfqhKkMv2ddRCbECop7L3VUHAek= X-Received: by 2002:a5b:307:: with SMTP id j7mr44548339ybp.292.1593030775736; Wed, 24 Jun 2020 13:32:55 -0700 (PDT) Date: Wed, 24 Jun 2020 13:31:40 -0700 In-Reply-To: <20200624203200.78870-1-samitolvanen@google.com> Message-Id: <20200624203200.78870-3-samitolvanen@google.com> Mime-Version: 1.0 References: <20200624203200.78870-1-samitolvanen@google.com> X-Mailer: git-send-email 2.27.0.212.ge8ba1cc988-goog Subject: [PATCH 02/22] kbuild: add support for Clang LTO From: Sami Tolvanen To: Masahiro Yamada , Will Deacon X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arch@vger.kernel.org, x86@kernel.org, Kees Cook , "Paul E. McKenney" , kernel-hardening@lists.openwall.com, Greg Kroah-Hartman , linux-kbuild@vger.kernel.org, Nick Desaulniers , linux-kernel@vger.kernel.org, clang-built-linux@googlegroups.com, Sami Tolvanen , linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org This change adds build system support for Clang's Link Time Optimization (LTO). With -flto, instead of ELF object files, Clang produces LLVM bitcode, which is compiled into native code at link time, allowing the final binary to be optimized globally. For more details, see: https://llvm.org/docs/LinkTimeOptimization.html The Kconfig option CONFIG_LTO_CLANG is implemented as a choice, which defaults to LTO being disabled. To use LTO, the architecture must select ARCH_SUPPORTS_LTO_CLANG and support: - compiling with Clang, - compiling inline assembly with Clang's integrated assembler, - and linking with LLD. While using full LTO results in the best runtime performance, the compilation is not scalable in time or memory. CONFIG_THINLTO enables ThinLTO, which allows parallel optimization and faster incremental builds. ThinLTO is used by default if the architecture also selects ARCH_SUPPORTS_THINLTO: https://clang.llvm.org/docs/ThinLTO.html To enable LTO, LLVM tools must be used to handle bitcode files. The easiest way is to pass the LLVM=1 option to make: $ make LLVM=1 defconfig $ scripts/config -e LTO_CLANG $ make LLVM=1 Alternatively, at least the following LLVM tools must be used: CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm To prepare for LTO support with other compilers, common parts are gated behind the CONFIG_LTO option, and LTO can be disabled for specific files by filtering out CC_FLAGS_LTO. Note that support for DYNAMIC_FTRACE and MODVERSIONS are added in follow-up patches. Signed-off-by: Sami Tolvanen --- Makefile | 16 ++++++++ arch/Kconfig | 66 +++++++++++++++++++++++++++++++ include/asm-generic/vmlinux.lds.h | 11 ++++-- scripts/Makefile.build | 9 ++++- scripts/Makefile.modfinal | 9 ++++- scripts/Makefile.modpost | 24 ++++++++++- scripts/link-vmlinux.sh | 32 +++++++++++---- 7 files changed, 151 insertions(+), 16 deletions(-) diff --git a/Makefile b/Makefile index ac2c61c37a73..0c7fe6fb2143 100644 --- a/Makefile +++ b/Makefile @@ -886,6 +886,22 @@ KBUILD_CFLAGS += $(CC_FLAGS_SCS) export CC_FLAGS_SCS endif +ifdef CONFIG_LTO_CLANG +ifdef CONFIG_THINLTO +CC_FLAGS_LTO_CLANG := -flto=thin $(call cc-option, -fsplit-lto-unit) +KBUILD_LDFLAGS += --thinlto-cache-dir=.thinlto-cache +else +CC_FLAGS_LTO_CLANG := -flto +endif +CC_FLAGS_LTO_CLANG += -fvisibility=default +endif + +ifdef CONFIG_LTO +CC_FLAGS_LTO := $(CC_FLAGS_LTO_CLANG) +KBUILD_CFLAGS += $(CC_FLAGS_LTO) +export CC_FLAGS_LTO +endif + # arch Makefile may override CC so keep this after arch Makefile is included NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include) diff --git a/arch/Kconfig b/arch/Kconfig index 8cc35dc556c7..e00b122293f8 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -552,6 +552,72 @@ config SHADOW_CALL_STACK reading and writing arbitrary memory may be able to locate them and hijack control flow by modifying the stacks. +config LTO + bool + +config ARCH_SUPPORTS_LTO_CLANG + bool + help + An architecture should select this option if it supports: + - compiling with Clang, + - compiling inline assembly with Clang's integrated assembler, + - and linking with LLD. + +config ARCH_SUPPORTS_THINLTO + bool + help + An architecture should select this option if it supports Clang's + ThinLTO. + +config THINLTO + bool "Clang ThinLTO" + depends on LTO_CLANG && ARCH_SUPPORTS_THINLTO + default y + help + This option enables Clang's ThinLTO, which allows for parallel + optimization and faster incremental compiles. More information + can be found from Clang's documentation: + + https://clang.llvm.org/docs/ThinLTO.html + +choice + prompt "Link Time Optimization (LTO)" + default LTO_NONE + help + This option enables Link Time Optimization (LTO), which allows the + compiler to optimize binaries globally. + + If unsure, select LTO_NONE. + +config LTO_NONE + bool "None" + +config LTO_CLANG + bool "Clang's Link Time Optimization (EXPERIMENTAL)" + depends on CC_IS_CLANG && CLANG_VERSION >= 110000 && LD_IS_LLD + depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm) + depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm) + depends on ARCH_SUPPORTS_LTO_CLANG + depends on !FTRACE_MCOUNT_RECORD + depends on !KASAN + depends on !MODVERSIONS + select LTO + help + This option enables Clang's Link Time Optimization (LTO), which + allows the compiler to optimize the kernel globally. If you enable + this option, the compiler generates LLVM bitcode instead of ELF + object files, and the actual compilation from bitcode happens at + the LTO link step, which may take several minutes depending on the + kernel configuration. More information can be found from LLVM's + documentation: + + https://llvm.org/docs/LinkTimeOptimization.html + + To select this option, you also need to use LLVM tools to handle + the bitcode by passing LLVM=1 to make. + +endchoice + config HAVE_ARCH_WITHIN_STACK_FRAMES bool help diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index db600ef218d7..78079000c05a 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -89,15 +89,18 @@ * .data. We don't want to pull in .data..other sections, which Linux * has defined. Same for text and bss. * + * With LTO_CLANG, the linker also splits sections by default, so we need + * these macros to combine the sections during the final link. + * * RODATA_MAIN is not used because existing code already defines .rodata.x * sections to be brought in with rodata. */ -#ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION +#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG) #define TEXT_MAIN .text .text.[0-9a-zA-Z_]* -#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..LPBX* +#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..L* .data..compoundliteral* #define SDATA_MAIN .sdata .sdata.[0-9a-zA-Z_]* -#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]* -#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]* +#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]* .rodata..L* +#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]* .bss..compoundliteral* #define SBSS_MAIN .sbss .sbss.[0-9a-zA-Z_]* #else #define TEXT_MAIN .text diff --git a/scripts/Makefile.build b/scripts/Makefile.build index 2e8810b7e5ed..f307e708a1b7 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -108,7 +108,7 @@ endif # --------------------------------------------------------------------------- quiet_cmd_cc_s_c = CC $(quiet_modtag) $@ - cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS), $(c_flags)) $(DISABLE_LTO) -fverbose-asm -S -o $@ $< + cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS) $(CC_FLAGS_LTO), $(c_flags)) -fverbose-asm -S -o $@ $< $(obj)/%.s: $(src)/%.c FORCE $(call if_changed_dep,cc_s_c) @@ -424,8 +424,15 @@ $(obj)/lib.a: $(lib-y) FORCE # Do not replace $(filter %.o,^) with $(real-prereqs). When a single object # module is turned into a multi object module, $^ will contain header file # dependencies recorded in the .*.cmd file. +ifdef CONFIG_LTO_CLANG +quiet_cmd_link_multi-m = AR [M] $@ +cmd_link_multi-m = \ + rm -f $@; \ + $(AR) rcsTP$(KBUILD_ARFLAGS) $@ $(filter %.o,$^) +else quiet_cmd_link_multi-m = LD [M] $@ cmd_link_multi-m = $(LD) $(ld_flags) -r -o $@ $(filter %.o,$^) +endif $(multi-used-m): FORCE $(call if_changed,link_multi-m) diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal index 411c1e600e7d..1005b147abd0 100644 --- a/scripts/Makefile.modfinal +++ b/scripts/Makefile.modfinal @@ -6,6 +6,7 @@ PHONY := __modfinal __modfinal: +include $(objtree)/include/config/auto.conf include $(srctree)/scripts/Kbuild.include # for c_flags @@ -29,6 +30,12 @@ quiet_cmd_cc_o_c = CC [M] $@ ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink) +ifdef CONFIG_LTO_CLANG +# With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to +# avoid a second slow LTO link +prelink-ext := .lto +endif + quiet_cmd_ld_ko_o = LD [M] $@ cmd_ld_ko_o = \ $(LD) -r $(KBUILD_LDFLAGS) \ @@ -37,7 +44,7 @@ quiet_cmd_ld_ko_o = LD [M] $@ -o $@ $(filter %.o, $^); \ $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true) -$(modules): %.ko: %.o %.mod.o $(KBUILD_LDS_MODULE) FORCE +$(modules): %.ko: %$(prelink-ext).o %.mod.o $(KBUILD_LDS_MODULE) FORCE +$(call if_changed,ld_ko_o) targets += $(modules) $(modules:.ko=.mod.o) diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost index 3651cbf6ad49..9ced8aecd579 100644 --- a/scripts/Makefile.modpost +++ b/scripts/Makefile.modpost @@ -102,12 +102,32 @@ $(input-symdump): @echo >&2 'WARNING: Symbol version dump "$@" is missing.' @echo >&2 ' Modules may not have dependencies or modversions.' +ifdef CONFIG_LTO_CLANG +# With CONFIG_LTO_CLANG, .o files might be LLVM bitcode, so we need to run +# LTO to compile them into native code before running modpost +prelink-ext = .lto + +quiet_cmd_cc_lto_link_modules = LTO [M] $@ +cmd_cc_lto_link_modules = \ + $(LD) $(ld_flags) -r -o $@ \ + --whole-archive $(filter-out FORCE,$^) + +%.lto.o: %.o FORCE + $(call if_changed,cc_lto_link_modules) + +PHONY += FORCE +FORCE: + +endif + +modules := $(sort $(shell cat $(MODORDER))) + # Read out modules.order to pass in modpost. # Otherwise, allmodconfig would fail with "Argument list too long". quiet_cmd_modpost = MODPOST $@ - cmd_modpost = sed 's/ko$$/o/' $< | $(MODPOST) -T - + cmd_modpost = sed 's/\.ko$$/$(prelink-ext)\.o/' $< | $(MODPOST) -T - -$(output-symdump): $(MODORDER) $(input-symdump) FORCE +$(output-symdump): $(MODORDER) $(input-symdump) $(modules:.ko=$(prelink-ext).o) FORCE $(call if_changed,modpost) targets += $(output-symdump) diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index 92dd745906f4..a681b3b6722e 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -52,6 +52,14 @@ modpost_link() ${KBUILD_VMLINUX_LIBS} \ --end-group" + if [ -n "${CONFIG_LTO_CLANG}" ]; then + # This might take a while, so indicate that we're doing + # an LTO link + info LTO ${1} + else + info LD ${1} + fi + ${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${objects} } @@ -99,13 +107,22 @@ vmlinux_link() fi if [ "${SRCARCH}" != "um" ]; then - objects="--whole-archive \ - ${KBUILD_VMLINUX_OBJS} \ - --no-whole-archive \ - --start-group \ - ${KBUILD_VMLINUX_LIBS} \ - --end-group \ - ${@}" + if [ -n "${CONFIG_LTO_CLANG}" ]; then + # Use vmlinux.o instead of performing the slow LTO + # link again. + objects="--whole-archive \ + vmlinux.o \ + --no-whole-archive \ + ${@}" + else + objects="--whole-archive \ + ${KBUILD_VMLINUX_OBJS} \ + --no-whole-archive \ + --start-group \ + ${KBUILD_VMLINUX_LIBS} \ + --end-group \ + ${@}" + fi ${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux} \ ${strip_debug#-Wl,} \ @@ -270,7 +287,6 @@ fi; ${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init need-builtin=1 #link vmlinux.o -info LD vmlinux.o modpost_link vmlinux.o objtool_link vmlinux.o -- 2.27.0.212.ge8ba1cc988-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel