* [PATCH 00/22] add support for Clang LTO
@ 2020-06-24 20:31 Sami Tolvanen
  2020-06-24 20:31 ` [PATCH 01/22] objtool: use sh_info to find the base for .rela sections Sami Tolvanen
                   ` (25 more replies)
  0 siblings, 26 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw)
  To: Masahiro Yamada, Will Deacon
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

This patch series adds support for building x86_64 and arm64 kernels
with Clang's Link Time Optimization (LTO).

In addition to performance, the primary motivation for LTO is to allow
Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's
Pixel devices have shipped with LTO+CFI kernels since 2018.

Most of the patches are build system changes for handling LLVM bitcode,
which Clang produces with LTO instead of ELF object files, postponing
ELF processing until a later stage, and ensuring initcall ordering.

Note that the first objtool patch in the series is already in linux-next,
but as it's needed with LTO, I'm including it here as well to make
testing easier.

Sami Tolvanen (22):
  objtool: use sh_info to find the base for .rela sections
  kbuild: add support for Clang LTO
  kbuild: lto: fix module versioning
  kbuild: lto: fix recordmcount
  kbuild: lto: postpone objtool
  kbuild: lto: limit inlining
  kbuild: lto: merge module sections
  kbuild: lto: remove duplicate dependencies from .mod files
  init: lto: ensure initcall ordering
  init: lto: fix PREL32 relocations
  pci: lto: fix PREL32 relocations
  modpost: lto: strip .lto from module names
  scripts/mod: disable LTO for empty.c
  efi/libstub: disable LTO
  drivers/misc/lkdtm: disable LTO for rodata.o
  arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY
  arm64: vdso: disable LTO
  arm64: allow LTO_CLANG and THINLTO to be selected
  x86, vdso: disable LTO only for vDSO
  x86, ftrace: disable recordmcount for ftrace_make_nop
  x86, relocs: Ignore L4_PAGE_OFFSET relocations
  x86, build: allow LTO_CLANG and THINLTO to be selected

 .gitignore                            |   1 +
 Makefile                              |  27 ++-
 arch/Kconfig                          |  65 +++++++
 arch/arm64/Kconfig                    |   2 +
 arch/arm64/Makefile                   |   1 +
 arch/arm64/kernel/vdso/Makefile       |   4 +-
 arch/x86/Kconfig                      |   2 +
 arch/x86/Makefile                     |   5 +
 arch/x86/entry/vdso/Makefile          |   5 +-
 arch/x86/kernel/ftrace.c              |   1 +
 arch/x86/tools/relocs.c               |   1 +
 drivers/firmware/efi/libstub/Makefile |   2 +
 drivers/misc/lkdtm/Makefile           |   1 +
 include/asm-generic/vmlinux.lds.h     |  12 +-
 include/linux/compiler-clang.h        |   4 +
 include/linux/compiler.h              |   2 +-
 include/linux/compiler_types.h        |   4 +
 include/linux/init.h                  |  78 +++++++-
 include/linux/pci.h                   |  15 +-
 kernel/trace/ftrace.c                 |   1 +
 lib/Kconfig.debug                     |   2 +-
 scripts/Makefile.build                |  55 +++++-
 scripts/Makefile.lib                  |   6 +-
 scripts/Makefile.modfinal             |  40 +++-
 scripts/Makefile.modpost              |  26 ++-
 scripts/generate_initcall_order.pl    | 270 ++++++++++++++++++++++++++
 scripts/link-vmlinux.sh               | 100 +++++++++-
 scripts/mod/Makefile                  |   1 +
 scripts/mod/modpost.c                 |  16 +-
 scripts/mod/modpost.h                 |   9 +
 scripts/mod/sumversion.c              |   6 +-
 scripts/module-lto.lds                |  26 +++
 scripts/recordmcount.c                |   3 +-
 tools/objtool/elf.c                   |   2 +-
 34 files changed, 737 insertions(+), 58 deletions(-)
 create mode 100755 scripts/generate_initcall_order.pl
 create mode 100644 scripts/module-lto.lds

base-commit: 26e122e97a3d0390ebec389347f64f3730fdf48f
-- 
2.27.0.212.ge8ba1cc988-goog

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 212+ messages in thread

* [PATCH 01/22] objtool: use sh_info to find the base for .rela sections
  2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen
@ 2020-06-24 20:31 ` Sami Tolvanen
  2020-06-24 20:31 ` [PATCH 02/22] kbuild: add support for Clang LTO Sami Tolvanen
                   ` (24 subsequent siblings)
  25 siblings, 0 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw)
  To: Masahiro Yamada, Will Deacon
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, Josh Poimboeuf, Sami Tolvanen, linux-pci,
	linux-arm-kernel

ELF doesn't require .rela section names to match the base section. Use
the section index in sh_info to find the section instead of looking it
up by name.

LLD, for example, generates a .rela section that doesn't match the base
section name when we merge sections in a linker script for a binary
compiled with -ffunction-sections.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 tools/objtool/elf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 84225679f96d..c1ba92abaa03 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -502,7 +502,7 @@ static int read_relas(struct elf *elf)
 		if (sec->sh.sh_type != SHT_RELA)
 			continue;
 
-		sec->base = find_section_by_name(elf, sec->name + 5);
+		sec->base = find_section_by_index(elf, sec->sh.sh_info);
 		if (!sec->base) {
 			WARN("can't find base section for rela section %s",
 			     sec->name);
-- 
2.27.0.212.ge8ba1cc988-goog

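The difference between the two lookups in the patch above can be shown with a small shell sketch. This is only an illustration, not objtool's code: the section table, the section names in it, and the helpers `base_by_name` and `base_by_index` are all hypothetical. The one ELF fact it relies on is that, for a SHT_RELA section, sh_info holds the index of the section the relocations apply to.

```shell
#!/bin/sh
# Hypothetical section table, one section per line:
#   index  name  type  sh_info
# The names are invented to mimic a .rela section whose name no longer
# matches its base section after sections were merged.
sections='1 .text PROGBITS 0
2 .rela.text.unlikely RELA 1'

# Old lookup: strip the ".rela" prefix (the "sec->name + 5" in the diff)
# and search for a section with the remaining name.
base_by_name() {
	want=${1#.rela}
	printf '%s\n' "$sections" | awk -v n="$want" '$2 == n { print $2 }'
}

# New lookup: read sh_info of the .rela section and use it as an index.
base_by_index() {
	printf '%s\n' "$sections" |
		awk -v r="$1" '
			{ name[$1] = $2 }          # remember every section by index
			$2 == r { info = $4 }      # sh_info of the requested section
			END { if (info in name) print name[info] }'
}
```

With this table, `base_by_name .rela.text.unlikely` prints nothing because no section is called `.text.unlikely`, while `base_by_index .rela.text.unlikely` prints `.text`.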
* [PATCH 02/22] kbuild: add support for Clang LTO
  2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen
  2020-06-24 20:31 ` [PATCH 01/22] objtool: use sh_info to find the base for .rela sections Sami Tolvanen
@ 2020-06-24 20:31 ` Sami Tolvanen
  2020-06-24 20:53   ` Nick Desaulniers
  2020-06-25 2:26   ` Nathan Chancellor
  2020-06-24 20:31 ` [PATCH 03/22] kbuild: lto: fix module versioning Sami Tolvanen
                   ` (23 subsequent siblings)
  25 siblings, 2 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw)
  To: Masahiro Yamada, Will Deacon
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

This change adds build system support for Clang's Link Time
Optimization (LTO). With -flto, instead of ELF object files, Clang
produces LLVM bitcode, which is compiled into native code at link
time, allowing the final binary to be optimized globally. For more
details, see:

  https://llvm.org/docs/LinkTimeOptimization.html

The Kconfig option CONFIG_LTO_CLANG is implemented as a choice,
which defaults to LTO being disabled. To use LTO, the architecture
must select ARCH_SUPPORTS_LTO_CLANG and support:

  - compiling with Clang,
  - compiling inline assembly with Clang's integrated assembler,
  - and linking with LLD.

While using full LTO results in the best runtime performance, the
compilation is not scalable in time or memory. CONFIG_THINLTO
enables ThinLTO, which allows parallel optimization and faster
incremental builds. ThinLTO is used by default if the architecture
also selects ARCH_SUPPORTS_THINLTO:

  https://clang.llvm.org/docs/ThinLTO.html

To enable LTO, LLVM tools must be used to handle bitcode files. The
easiest way is to pass the LLVM=1 option to make:

  $ make LLVM=1 defconfig
  $ scripts/config -e LTO_CLANG
  $ make LLVM=1

Alternatively, at least the following LLVM tools must be used:

  CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm

To prepare for LTO support with other compilers, common parts are
gated behind the CONFIG_LTO option, and LTO can be disabled for
specific files by filtering out CC_FLAGS_LTO.

Note that support for DYNAMIC_FTRACE and MODVERSIONS is added in
follow-up patches.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 Makefile                          | 16 ++++++++
 arch/Kconfig                      | 66 +++++++++++++++++++++++++++++++
 include/asm-generic/vmlinux.lds.h | 11 ++++--
 scripts/Makefile.build            |  9 ++++-
 scripts/Makefile.modfinal         |  9 ++++-
 scripts/Makefile.modpost          | 24 ++++++++++-
 scripts/link-vmlinux.sh           | 32 +++++++++++----
 7 files changed, 151 insertions(+), 16 deletions(-)

diff --git a/Makefile b/Makefile
index ac2c61c37a73..0c7fe6fb2143 100644
--- a/Makefile
+++ b/Makefile
@@ -886,6 +886,22 @@ KBUILD_CFLAGS += $(CC_FLAGS_SCS)
 export CC_FLAGS_SCS
 endif
 
+ifdef CONFIG_LTO_CLANG
+ifdef CONFIG_THINLTO
+CC_FLAGS_LTO_CLANG := -flto=thin $(call cc-option, -fsplit-lto-unit)
+KBUILD_LDFLAGS += --thinlto-cache-dir=.thinlto-cache
+else
+CC_FLAGS_LTO_CLANG := -flto
+endif
+CC_FLAGS_LTO_CLANG += -fvisibility=default
+endif
+
+ifdef CONFIG_LTO
+CC_FLAGS_LTO := $(CC_FLAGS_LTO_CLANG)
+KBUILD_CFLAGS += $(CC_FLAGS_LTO)
+export CC_FLAGS_LTO
+endif
+
 # arch Makefile may override CC so keep this after arch Makefile is included
 NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
 
diff --git a/arch/Kconfig b/arch/Kconfig
index 8cc35dc556c7..e00b122293f8 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -552,6 +552,72 @@ config SHADOW_CALL_STACK
 	  reading and writing arbitrary memory may be able to locate them
 	  and hijack control flow by modifying the stacks.
 
+config LTO
+	bool
+
+config ARCH_SUPPORTS_LTO_CLANG
+	bool
+	help
+	  An architecture should select this option if it supports:
+	  - compiling with Clang,
+	  - compiling inline assembly with Clang's integrated assembler,
+	  - and linking with LLD.
+
+config ARCH_SUPPORTS_THINLTO
+	bool
+	help
+	  An architecture should select this option if it supports Clang's
+	  ThinLTO.
+
+config THINLTO
+	bool "Clang ThinLTO"
+	depends on LTO_CLANG && ARCH_SUPPORTS_THINLTO
+	default y
+	help
+	  This option enables Clang's ThinLTO, which allows for parallel
+	  optimization and faster incremental compiles. More information
+	  can be found from Clang's documentation:
+
+	    https://clang.llvm.org/docs/ThinLTO.html
+
+choice
+	prompt "Link Time Optimization (LTO)"
+	default LTO_NONE
+	help
+	  This option enables Link Time Optimization (LTO), which allows the
+	  compiler to optimize binaries globally.
+
+	  If unsure, select LTO_NONE.
+
+config LTO_NONE
+	bool "None"
+
+config LTO_CLANG
+	bool "Clang's Link Time Optimization (EXPERIMENTAL)"
+	depends on CC_IS_CLANG && CLANG_VERSION >= 110000 && LD_IS_LLD
+	depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm)
+	depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm)
+	depends on ARCH_SUPPORTS_LTO_CLANG
+	depends on !FTRACE_MCOUNT_RECORD
+	depends on !KASAN
+	depends on !MODVERSIONS
+	select LTO
+	help
+	  This option enables Clang's Link Time Optimization (LTO), which
+	  allows the compiler to optimize the kernel globally. If you enable
+	  this option, the compiler generates LLVM bitcode instead of ELF
+	  object files, and the actual compilation from bitcode happens at
+	  the LTO link step, which may take several minutes depending on the
+	  kernel configuration. More information can be found from LLVM's
+	  documentation:
+
+	    https://llvm.org/docs/LinkTimeOptimization.html
+
+	  To select this option, you also need to use LLVM tools to handle
+	  the bitcode by passing LLVM=1 to make.
+
+endchoice
+
 config HAVE_ARCH_WITHIN_STACK_FRAMES
 	bool
 	help
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index db600ef218d7..78079000c05a 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -89,15 +89,18 @@
  * .data. We don't want to pull in .data..other sections, which Linux
  * has defined. Same for text and bss.
  *
+ * With LTO_CLANG, the linker also splits sections by default, so we need
+ * these macros to combine the sections during the final link.
+ *
  * RODATA_MAIN is not used because existing code already defines .rodata.x
  * sections to be brought in with rodata.
  */
-#ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
+#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG)
 #define TEXT_MAIN .text .text.[0-9a-zA-Z_]*
-#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..LPBX*
+#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..L* .data..compoundliteral*
 #define SDATA_MAIN .sdata .sdata.[0-9a-zA-Z_]*
-#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]*
-#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]*
+#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]* .rodata..L*
+#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]* .bss..compoundliteral*
 #define SBSS_MAIN .sbss .sbss.[0-9a-zA-Z_]*
 #else
 #define TEXT_MAIN .text
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 2e8810b7e5ed..f307e708a1b7 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -108,7 +108,7 @@ endif
 # ---------------------------------------------------------------------------
 
 quiet_cmd_cc_s_c = CC $(quiet_modtag) $@
-      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS), $(c_flags)) $(DISABLE_LTO) -fverbose-asm -S -o $@ $<
+      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS) $(CC_FLAGS_LTO), $(c_flags)) -fverbose-asm -S -o $@ $<
 
 $(obj)/%.s: $(src)/%.c FORCE
 	$(call if_changed_dep,cc_s_c)
@@ -424,8 +424,15 @@ $(obj)/lib.a: $(lib-y) FORCE
 # Do not replace $(filter %.o,^) with $(real-prereqs). When a single object
 # module is turned into a multi object module, $^ will contain header file
 # dependencies recorded in the .*.cmd file.
+ifdef CONFIG_LTO_CLANG
+quiet_cmd_link_multi-m = AR [M] $@
+cmd_link_multi-m =	\
+	rm -f $@;	\
+	$(AR) rcsTP$(KBUILD_ARFLAGS) $@ $(filter %.o,$^)
+else
 quiet_cmd_link_multi-m = LD [M] $@
       cmd_link_multi-m = $(LD) $(ld_flags) -r -o $@ $(filter %.o,$^)
+endif
 
 $(multi-used-m): FORCE
 	$(call if_changed,link_multi-m)
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index 411c1e600e7d..1005b147abd0 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -6,6 +6,7 @@
 PHONY := __modfinal
 __modfinal:
 
+include $(objtree)/include/config/auto.conf
 include $(srctree)/scripts/Kbuild.include
 
 # for c_flags
@@ -29,6 +30,12 @@ quiet_cmd_cc_o_c = CC [M] $@
 
 ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
 
+ifdef CONFIG_LTO_CLANG
+# With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to
+# avoid a second slow LTO link
+prelink-ext := .lto
+endif
+
 quiet_cmd_ld_ko_o = LD [M] $@
       cmd_ld_ko_o =	\
 	$(LD) -r $(KBUILD_LDFLAGS)	\
@@ -37,7 +44,7 @@ quiet_cmd_ld_ko_o = LD [M] $@
 		-o $@ $(filter %.o, $^);	\
 	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
-$(modules): %.ko: %.o %.mod.o $(KBUILD_LDS_MODULE) FORCE
+$(modules): %.ko: %$(prelink-ext).o %.mod.o $(KBUILD_LDS_MODULE) FORCE
 	+$(call if_changed,ld_ko_o)
 
 targets += $(modules) $(modules:.ko=.mod.o)
diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
index 3651cbf6ad49..9ced8aecd579 100644
--- a/scripts/Makefile.modpost
+++ b/scripts/Makefile.modpost
@@ -102,12 +102,32 @@ $(input-symdump):
 	@echo >&2 'WARNING: Symbol version dump "$@" is missing.'
 	@echo >&2 ' Modules may not have dependencies or modversions.'
 
+ifdef CONFIG_LTO_CLANG
+# With CONFIG_LTO_CLANG, .o files might be LLVM bitcode, so we need to run
+# LTO to compile them into native code before running modpost
+prelink-ext = .lto
+
+quiet_cmd_cc_lto_link_modules = LTO [M] $@
+cmd_cc_lto_link_modules =	\
+	$(LD) $(ld_flags) -r -o $@	\
+		--whole-archive $(filter-out FORCE,$^)
+
+%.lto.o: %.o FORCE
+	$(call if_changed,cc_lto_link_modules)
+
+PHONY += FORCE
+FORCE:
+
+endif
+
+modules := $(sort $(shell cat $(MODORDER)))
+
 # Read out modules.order to pass in modpost.
 # Otherwise, allmodconfig would fail with "Argument list too long".
 quiet_cmd_modpost = MODPOST $@
-      cmd_modpost = sed 's/ko$$/o/' $< | $(MODPOST) -T -
+      cmd_modpost = sed 's/\.ko$$/$(prelink-ext)\.o/' $< | $(MODPOST) -T -
 
-$(output-symdump): $(MODORDER) $(input-symdump) FORCE
+$(output-symdump): $(MODORDER) $(input-symdump) $(modules:.ko=$(prelink-ext).o) FORCE
 	$(call if_changed,modpost)
 
 targets += $(output-symdump)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 92dd745906f4..a681b3b6722e 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -52,6 +52,14 @@ modpost_link()
 		${KBUILD_VMLINUX_LIBS}	\
 		--end-group"
 
+	if [ -n "${CONFIG_LTO_CLANG}" ]; then
+		# This might take a while, so indicate that we're doing
+		# an LTO link
+		info LTO ${1}
+	else
+		info LD ${1}
+	fi
+
 	${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${objects}
 }
 
@@ -99,13 +107,22 @@ vmlinux_link()
 	fi
 
 	if [ "${SRCARCH}" != "um" ]; then
-		objects="--whole-archive	\
-			${KBUILD_VMLINUX_OBJS}	\
-			--no-whole-archive	\
-			--start-group	\
-			${KBUILD_VMLINUX_LIBS}	\
-			--end-group	\
-			${@}"
+		if [ -n "${CONFIG_LTO_CLANG}" ]; then
+			# Use vmlinux.o instead of performing the slow LTO
+			# link again.
+			objects="--whole-archive	\
+				vmlinux.o	\
+				--no-whole-archive	\
+				${@}"
+		else
+			objects="--whole-archive	\
+				${KBUILD_VMLINUX_OBJS}	\
+				--no-whole-archive	\
+				--start-group	\
+				${KBUILD_VMLINUX_LIBS}	\
+				--end-group	\
+				${@}"
+		fi
 
 		${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux}	\
 			${strip_debug#-Wl,}	\
@@ -270,7 +287,6 @@ fi;
 ${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init need-builtin=1
 
 #link vmlinux.o
-info LD vmlinux.o
 modpost_link vmlinux.o
 objtool_link vmlinux.o
 
-- 
2.27.0.212.ge8ba1cc988-goog

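The modpost input rewrite in the scripts/Makefile.modpost hunk above can be tried outside of make. The following is a minimal sketch rather than Kbuild code: the helper name `modpost_inputs` is hypothetical, and the suffix that `prelink-ext` holds in the patch is passed as an argument instead of a make variable.

```shell
#!/bin/sh
# Map modules.order entries (one .ko path per line on stdin) to the
# object files modpost should read.
# $1 is the prelink suffix: ".lto" with CONFIG_LTO_CLANG, empty otherwise.
# After make expands "$$" and $(prelink-ext), the sed script in the patch
# has the same shape: s/\.ko$/<suffix>.o/
modpost_inputs() {
	sed "s/\\.ko\$/$1.o/"
}

# Example entries; these paths are made up for illustration.
printf 'kernel/foo.ko\nfs/bar.ko\n' | modpost_inputs .lto
```

With the `.lto` suffix, modpost is pointed at the natively compiled `%.lto.o` files produced by the new `cc_lto_link_modules` rule instead of the bitcode `%.o` files. Anchoring the pattern on the literal dot matters once a suffix is inserted, since the replacement has to supply its own dot.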
* Re: [PATCH 02/22] kbuild: add support for Clang LTO
  2020-06-24 20:31 ` [PATCH 02/22] kbuild: add support for Clang LTO Sami Tolvanen
@ 2020-06-24 20:53   ` Nick Desaulniers
  2020-06-24 21:29     ` Sami Tolvanen
  2020-06-25 2:26   ` Nathan Chancellor
  1 sibling, 1 reply; 212+ messages in thread
From: Nick Desaulniers @ 2020-06-24 20:53 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman,
	Masahiro Yamada, Linux Kbuild mailing list, LKML,
	clang-built-linux, linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 1:32 PM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> This change adds build system support for Clang's Link Time
> Optimization (LTO). With -flto, instead of ELF object files, Clang
> produces LLVM bitcode, which is compiled into native code at link
> time, allowing the final binary to be optimized globally. For more
> details, see:
>
>   https://llvm.org/docs/LinkTimeOptimization.html
>
> The Kconfig option CONFIG_LTO_CLANG is implemented as a choice,
> which defaults to LTO being disabled. To use LTO, the architecture
> must select ARCH_SUPPORTS_LTO_CLANG and support:
>
>   - compiling with Clang,
>   - compiling inline assembly with Clang's integrated assembler,
>   - and linking with LLD.
>
> While using full LTO results in the best runtime performance, the
> compilation is not scalable in time or memory. CONFIG_THINLTO
> enables ThinLTO, which allows parallel optimization and faster
> incremental builds. ThinLTO is used by default if the architecture
> also selects ARCH_SUPPORTS_THINLTO:
>
>   https://clang.llvm.org/docs/ThinLTO.html
>
> To enable LTO, LLVM tools must be used to handle bitcode files. The
> easiest way is to pass the LLVM=1 option to make:
>
>   $ make LLVM=1 defconfig
>   $ scripts/config -e LTO_CLANG
>   $ make LLVM=1
>
> Alternatively, at least the following LLVM tools must be used:
>
>   CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm
>
> To prepare for LTO support with other compilers, common parts are
> gated behind the CONFIG_LTO option, and LTO can be disabled for
> specific files by filtering out CC_FLAGS_LTO.
>
> Note that support for DYNAMIC_FTRACE and MODVERSIONS are added in
> follow-up patches.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  Makefile                          | 16 ++++++++
>  arch/Kconfig                      | 66 +++++++++++++++++++++++++++++++
>  include/asm-generic/vmlinux.lds.h | 11 ++++--
>  scripts/Makefile.build            |  9 ++++-
>  scripts/Makefile.modfinal         |  9 ++++-
>  scripts/Makefile.modpost          | 24 ++++++++++-
>  scripts/link-vmlinux.sh           | 32 +++++++++++----
>  7 files changed, 151 insertions(+), 16 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index ac2c61c37a73..0c7fe6fb2143 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -886,6 +886,22 @@ KBUILD_CFLAGS += $(CC_FLAGS_SCS)
>  export CC_FLAGS_SCS
>  endif
>
> +ifdef CONFIG_LTO_CLANG
> +ifdef CONFIG_THINLTO
> +CC_FLAGS_LTO_CLANG := -flto=thin $(call cc-option, -fsplit-lto-unit)

The kconfig change gates this on clang-11; do we still need the
cc-option check here, or can we hardcode the use of -fsplit-lto-unit?
Playing with the flag in godbolt, it looks like clang-8 had support
for this flag.

> +KBUILD_LDFLAGS += --thinlto-cache-dir=.thinlto-cache

It might be nice to have `make distclean` or even `make clean` scrub
the .thinlto-cache? Also, I verified that the `.gitignore` rule for
`.*` properly ignores this dir.

> +else
> +CC_FLAGS_LTO_CLANG := -flto
> +endif
> +CC_FLAGS_LTO_CLANG += -fvisibility=default
> +endif
> +
> +ifdef CONFIG_LTO
> +CC_FLAGS_LTO := $(CC_FLAGS_LTO_CLANG)
> +KBUILD_CFLAGS += $(CC_FLAGS_LTO)
> +export CC_FLAGS_LTO
> +endif
> +
>  # arch Makefile may override CC so keep this after arch Makefile is included
>  NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 8cc35dc556c7..e00b122293f8 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -552,6 +552,72 @@ config SHADOW_CALL_STACK
>  	  reading and writing arbitrary memory may be able to locate them
>  	  and hijack control flow by modifying the stacks.
>
> +config LTO
> +	bool
> +
> +config ARCH_SUPPORTS_LTO_CLANG
> +	bool
> +	help
> +	  An architecture should select this option if it supports:
> +	  - compiling with Clang,
> +	  - compiling inline assembly with Clang's integrated assembler,
> +	  - and linking with LLD.
> +
> +config ARCH_SUPPORTS_THINLTO
> +	bool
> +	help
> +	  An architecture should select this option if it supports Clang's
> +	  ThinLTO.
> +
> +config THINLTO
> +	bool "Clang ThinLTO"
> +	depends on LTO_CLANG && ARCH_SUPPORTS_THINLTO
> +	default y
> +	help
> +	  This option enables Clang's ThinLTO, which allows for parallel
> +	  optimization and faster incremental compiles. More information
> +	  can be found from Clang's documentation:
> +
> +	    https://clang.llvm.org/docs/ThinLTO.html
> +
> +choice
> +	prompt "Link Time Optimization (LTO)"
> +	default LTO_NONE
> +	help
> +	  This option enables Link Time Optimization (LTO), which allows the
> +	  compiler to optimize binaries globally.
> +
> +	  If unsure, select LTO_NONE.
> +
> +config LTO_NONE
> +	bool "None"
> +
> +config LTO_CLANG
> +	bool "Clang's Link Time Optimization (EXPERIMENTAL)"
> +	depends on CC_IS_CLANG && CLANG_VERSION >= 110000 && LD_IS_LLD
> +	depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm)
> +	depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm)
> +	depends on ARCH_SUPPORTS_LTO_CLANG
> +	depends on !FTRACE_MCOUNT_RECORD
> +	depends on !KASAN
> +	depends on !MODVERSIONS
> +	select LTO
> +	help
> +	  This option enables Clang's Link Time Optimization (LTO), which
> +	  allows the compiler to optimize the kernel globally. If you enable
> +	  this option, the compiler generates LLVM bitcode instead of ELF
> +	  object files, and the actual compilation from bitcode happens at
> +	  the LTO link step, which may take several minutes depending on the
> +	  kernel configuration. More information can be found from LLVM's
> +	  documentation:
> +
> +	    https://llvm.org/docs/LinkTimeOptimization.html
> +
> +	  To select this option, you also need to use LLVM tools to handle
> +	  the bitcode by passing LLVM=1 to make.
> +
> +endchoice
> +
>  config HAVE_ARCH_WITHIN_STACK_FRAMES
>  	bool
>  	help
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index db600ef218d7..78079000c05a 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -89,15 +89,18 @@
>   * .data. We don't want to pull in .data..other sections, which Linux
>   * has defined. Same for text and bss.
>   *
> + * With LTO_CLANG, the linker also splits sections by default, so we need
> + * these macros to combine the sections during the final link.
> + *
>   * RODATA_MAIN is not used because existing code already defines .rodata.x
>   * sections to be brought in with rodata.
>   */
> -#ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
> +#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG)
>  #define TEXT_MAIN .text .text.[0-9a-zA-Z_]*
> -#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..LPBX*
> +#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..L* .data..compoundliteral*
>  #define SDATA_MAIN .sdata .sdata.[0-9a-zA-Z_]*
> -#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]*
> -#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]*
> +#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]* .rodata..L*
> +#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]* .bss..compoundliteral*
>  #define SBSS_MAIN .sbss .sbss.[0-9a-zA-Z_]*
>  #else
>  #define TEXT_MAIN .text
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index 2e8810b7e5ed..f307e708a1b7 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -108,7 +108,7 @@ endif
>  # ---------------------------------------------------------------------------
>
>  quiet_cmd_cc_s_c = CC $(quiet_modtag) $@
> -      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS), $(c_flags)) $(DISABLE_LTO) -fverbose-asm -S -o $@ $<
> +      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS) $(CC_FLAGS_LTO), $(c_flags)) -fverbose-asm -S -o $@ $<
>
>  $(obj)/%.s: $(src)/%.c FORCE
>  	$(call if_changed_dep,cc_s_c)
> @@ -424,8 +424,15 @@ $(obj)/lib.a: $(lib-y) FORCE
>  # Do not replace $(filter %.o,^) with $(real-prereqs). When a single object
>  # module is turned into a multi object module, $^ will contain header file
>  # dependencies recorded in the .*.cmd file.
> +ifdef CONFIG_LTO_CLANG
> +quiet_cmd_link_multi-m = AR [M] $@
> +cmd_link_multi-m =	\
> +	rm -f $@;	\
> +	$(AR) rcsTP$(KBUILD_ARFLAGS) $@ $(filter %.o,$^)
> +else
>  quiet_cmd_link_multi-m = LD [M] $@
>        cmd_link_multi-m = $(LD) $(ld_flags) -r -o $@ $(filter %.o,$^)
> +endif
>
>  $(multi-used-m): FORCE
>  	$(call if_changed,link_multi-m)
> diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
> index 411c1e600e7d..1005b147abd0 100644
> --- a/scripts/Makefile.modfinal
> +++ b/scripts/Makefile.modfinal
> @@ -6,6 +6,7 @@
>  PHONY := __modfinal
>  __modfinal:
>
> +include $(objtree)/include/config/auto.conf
>  include $(srctree)/scripts/Kbuild.include
>
>  # for c_flags
> @@ -29,6 +30,12 @@ quiet_cmd_cc_o_c = CC [M] $@
>
>  ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
>
> +ifdef CONFIG_LTO_CLANG
> +# With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to
> +# avoid a second slow LTO link
> +prelink-ext := .lto
> +endif
> +
>  quiet_cmd_ld_ko_o = LD [M] $@
>        cmd_ld_ko_o =	\
>  	$(LD) -r $(KBUILD_LDFLAGS)	\
> @@ -37,7 +44,7 @@ quiet_cmd_ld_ko_o = LD [M] $@
>  		-o $@ $(filter %.o, $^);	\
>  	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
>
> -$(modules): %.ko: %.o %.mod.o $(KBUILD_LDS_MODULE) FORCE
> +$(modules): %.ko: %$(prelink-ext).o %.mod.o $(KBUILD_LDS_MODULE) FORCE
>  	+$(call if_changed,ld_ko_o)
>
>  targets += $(modules) $(modules:.ko=.mod.o)
> diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
> index 3651cbf6ad49..9ced8aecd579 100644
> --- a/scripts/Makefile.modpost
> +++ b/scripts/Makefile.modpost
> @@ -102,12 +102,32 @@ $(input-symdump):
>  	@echo >&2 'WARNING: Symbol version dump "$@" is missing.'
>  	@echo >&2 ' Modules may not have dependencies or modversions.'
>
> +ifdef CONFIG_LTO_CLANG
> +# With CONFIG_LTO_CLANG, .o files might be LLVM bitcode, so we need to run
> +# LTO to compile them into native code before running modpost
> +prelink-ext = .lto
> +
> +quiet_cmd_cc_lto_link_modules = LTO [M] $@
> +cmd_cc_lto_link_modules =	\
> +	$(LD) $(ld_flags) -r -o $@	\
> +		--whole-archive $(filter-out FORCE,$^)
> +
> +%.lto.o: %.o FORCE
> +	$(call if_changed,cc_lto_link_modules)
> +
> +PHONY += FORCE
> +FORCE:
> +
> +endif
> +
> +modules := $(sort $(shell cat $(MODORDER)))
> +
>  # Read out modules.order to pass in modpost.
>  # Otherwise, allmodconfig would fail with "Argument list too long".
>  quiet_cmd_modpost = MODPOST $@
> -      cmd_modpost = sed 's/ko$$/o/' $< | $(MODPOST) -T -
> +      cmd_modpost = sed 's/\.ko$$/$(prelink-ext)\.o/' $< | $(MODPOST) -T -
>
> -$(output-symdump): $(MODORDER) $(input-symdump) FORCE
> +$(output-symdump): $(MODORDER) $(input-symdump) $(modules:.ko=$(prelink-ext).o) FORCE
>  	$(call if_changed,modpost)
>
>  targets += $(output-symdump)
> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> index 92dd745906f4..a681b3b6722e 100755
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@ -52,6 +52,14 @@ modpost_link()
>  		${KBUILD_VMLINUX_LIBS}	\
>  		--end-group"
>
> +	if [ -n "${CONFIG_LTO_CLANG}" ]; then
> +		# This might take a while, so indicate that we're doing
> +		# an LTO link
> +		info LTO ${1}
> +	else
> +		info LD ${1}
> +	fi
> +
>  	${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${objects}
>  }
>
> @@ -99,13 +107,22 @@ vmlinux_link()
>  	fi
>
>  	if [ "${SRCARCH}" != "um" ]; then
> -		objects="--whole-archive	\
> -			${KBUILD_VMLINUX_OBJS}	\
> -			--no-whole-archive	\
> -			--start-group	\
> -			${KBUILD_VMLINUX_LIBS}	\
> -			--end-group	\
> -			${@}"
> +		if [ -n "${CONFIG_LTO_CLANG}" ]; then
> +			# Use vmlinux.o instead of performing the slow LTO
> +			# link again.
> +			objects="--whole-archive	\
> +				vmlinux.o	\
> +				--no-whole-archive	\
> +				${@}"
> +		else
> +			objects="--whole-archive	\
> +				${KBUILD_VMLINUX_OBJS}	\
> +				--no-whole-archive	\
> +				--start-group	\
> +				${KBUILD_VMLINUX_LIBS}	\
> +				--end-group	\
> +				${@}"
> +		fi
>
>  		${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux}	\
>  			${strip_debug#-Wl,}	\
> @@ -270,7 +287,6 @@ fi;
>  ${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init need-builtin=1
>
>  #link vmlinux.o
> -info LD vmlinux.o
>  modpost_link vmlinux.o
>  objtool_link vmlinux.o
>
> --
> 2.27.0.212.ge8ba1cc988-goog
>

-- 
Thanks,
~Nick Desaulniers

* Re: [PATCH 02/22] kbuild: add support for Clang LTO
  2020-06-24 20:53   ` Nick Desaulniers
@ 2020-06-24 21:29     ` Sami Tolvanen
  0 siblings, 0 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 21:29 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman,
	Masahiro Yamada, Linux Kbuild mailing list, LKML,
	clang-built-linux, linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 01:53:52PM -0700, Nick Desaulniers wrote:
> On Wed, Jun 24, 2020 at 1:32 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > diff --git a/Makefile b/Makefile
> > index ac2c61c37a73..0c7fe6fb2143 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -886,6 +886,22 @@ KBUILD_CFLAGS += $(CC_FLAGS_SCS)
> >  export CC_FLAGS_SCS
> >  endif
> >
> > +ifdef CONFIG_LTO_CLANG
> > +ifdef CONFIG_THINLTO
> > +CC_FLAGS_LTO_CLANG := -flto=thin $(call cc-option, -fsplit-lto-unit)
>
> The kconfig change gates this on clang-11; do we still need the
> cc-option check here, or can we hardcode the use of -fsplit-lto-unit?
> Playing with the flag in godbolt, it looks like clang-8 had support
> for this flag.

True, we don't need cc-option here anymore. I'll remove it, thanks.

> > +KBUILD_LDFLAGS += --thinlto-cache-dir=.thinlto-cache
>
> It might be nice to have `make distclean` or even `make clean` scrub
> the .thinlto-cache? Also, I verified that the `.gitignore` rule for
> `.*` properly ignores this dir.

Sure, distclean sounds appropriate to me.

Sami

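For readers unfamiliar with the cc-option question discussed above, the idea of such a probe can be sketched in plain shell. This is a rough approximation and not Kbuild's implementation: the function name `cc_option` is hypothetical, and defaulting `CC` to `clang` is an assumption. The probe compiles an empty input with the candidate flag and echoes the flag only if the compiler accepts it.

```shell
#!/bin/sh
# Echo the flag in $1 if the compiler named by $CC accepts it, otherwise
# echo nothing. -Werror turns "unknown argument" warnings into failures.
# CC defaults to clang here as an assumption of this sketch.
cc_option() {
	if "${CC:-clang}" -Werror "$1" -c -x c /dev/null -o /dev/null 2>/dev/null; then
		printf '%s\n' "$1"
	fi
}

# Usage (needs a compiler on PATH):
#   flags="-flto=thin $(cc_option -fsplit-lto-unit)"
```

Since the Kconfig entry in the patch already requires CLANG_VERSION >= 110000, and Nick reports that the flag has been accepted since clang-8, every compiler that can enable LTO_CLANG passes the probe. That is why the flag can simply be hardcoded, as agreed in the reply.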
* Re: [PATCH 02/22] kbuild: add support for Clang LTO 2020-06-24 20:31 ` [PATCH 02/22] kbuild: add support for Clang LTO Sami Tolvanen 2020-06-24 20:53 ` Nick Desaulniers @ 2020-06-25 2:26 ` Nathan Chancellor 2020-06-25 16:13 ` Sami Tolvanen 1 sibling, 1 reply; 212+ messages in thread From: Nathan Chancellor @ 2020-06-25 2:26 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel Hi Sami, On Wed, Jun 24, 2020 at 01:31:40PM -0700, 'Sami Tolvanen' via Clang Built Linux wrote: > This change adds build system support for Clang's Link Time > Optimization (LTO). With -flto, instead of ELF object files, Clang > produces LLVM bitcode, which is compiled into native code at link > time, allowing the final binary to be optimized globally. For more > details, see: > > https://llvm.org/docs/LinkTimeOptimization.html > > The Kconfig option CONFIG_LTO_CLANG is implemented as a choice, > which defaults to LTO being disabled. To use LTO, the architecture > must select ARCH_SUPPORTS_LTO_CLANG and support: > > - compiling with Clang, > - compiling inline assembly with Clang's integrated assembler, > - and linking with LLD. > > While using full LTO results in the best runtime performance, the > compilation is not scalable in time or memory. CONFIG_THINLTO > enables ThinLTO, which allows parallel optimization and faster > incremental builds. ThinLTO is used by default if the architecture > also selects ARCH_SUPPORTS_THINLTO: > > https://clang.llvm.org/docs/ThinLTO.html > > To enable LTO, LLVM tools must be used to handle bitcode files. 
The > easiest way is to pass the LLVM=1 option to make: > > $ make LLVM=1 defconfig > $ scripts/config -e LTO_CLANG > $ make LLVM=1 > > Alternatively, at least the following LLVM tools must be used: > > CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm > > To prepare for LTO support with other compilers, common parts are > gated behind the CONFIG_LTO option, and LTO can be disabled for > specific files by filtering out CC_FLAGS_LTO. > > Note that support for DYNAMIC_FTRACE and MODVERSIONS are added in > follow-up patches. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > --- > Makefile | 16 ++++++++ > arch/Kconfig | 66 +++++++++++++++++++++++++++++++ > include/asm-generic/vmlinux.lds.h | 11 ++++-- > scripts/Makefile.build | 9 ++++- > scripts/Makefile.modfinal | 9 ++++- > scripts/Makefile.modpost | 24 ++++++++++- > scripts/link-vmlinux.sh | 32 +++++++++++---- > 7 files changed, 151 insertions(+), 16 deletions(-) > > diff --git a/Makefile b/Makefile > index ac2c61c37a73..0c7fe6fb2143 100644 > --- a/Makefile > +++ b/Makefile > @@ -886,6 +886,22 @@ KBUILD_CFLAGS += $(CC_FLAGS_SCS) > export CC_FLAGS_SCS > endif > > +ifdef CONFIG_LTO_CLANG > +ifdef CONFIG_THINLTO > +CC_FLAGS_LTO_CLANG := -flto=thin $(call cc-option, -fsplit-lto-unit) > +KBUILD_LDFLAGS += --thinlto-cache-dir=.thinlto-cache > +else > +CC_FLAGS_LTO_CLANG := -flto > +endif > +CC_FLAGS_LTO_CLANG += -fvisibility=default > +endif > + > +ifdef CONFIG_LTO > +CC_FLAGS_LTO := $(CC_FLAGS_LTO_CLANG) > +KBUILD_CFLAGS += $(CC_FLAGS_LTO) > +export CC_FLAGS_LTO > +endif > + > # arch Makefile may override CC so keep this after arch Makefile is included > NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include) > > diff --git a/arch/Kconfig b/arch/Kconfig > index 8cc35dc556c7..e00b122293f8 100644 > --- a/arch/Kconfig > +++ b/arch/Kconfig > @@ -552,6 +552,72 @@ config SHADOW_CALL_STACK > reading and writing arbitrary memory may be able to locate them > and hijack control flow by modifying the stacks. 
> > +config LTO > + bool > + > +config ARCH_SUPPORTS_LTO_CLANG > + bool > + help > + An architecture should select this option if it supports: > + - compiling with Clang, > + - compiling inline assembly with Clang's integrated assembler, > + - and linking with LLD. > + > +config ARCH_SUPPORTS_THINLTO > + bool > + help > + An architecture should select this option if it supports Clang's > + ThinLTO. > + > +config THINLTO > + bool "Clang ThinLTO" > + depends on LTO_CLANG && ARCH_SUPPORTS_THINLTO > + default y > + help > + This option enables Clang's ThinLTO, which allows for parallel > + optimization and faster incremental compiles. More information > + can be found from Clang's documentation: > + > + https://clang.llvm.org/docs/ThinLTO.html > + > +choice > + prompt "Link Time Optimization (LTO)" > + default LTO_NONE > + help > + This option enables Link Time Optimization (LTO), which allows the > + compiler to optimize binaries globally. > + > + If unsure, select LTO_NONE. > + > +config LTO_NONE > + bool "None" > + > +config LTO_CLANG > + bool "Clang's Link Time Optimization (EXPERIMENTAL)" > + depends on CC_IS_CLANG && CLANG_VERSION >= 110000 && LD_IS_LLD I am curious, what is the reason for gating this at clang 11.0.0? Presumably this? https://github.com/ClangBuiltLinux/linux/issues/510 It might be nice to notate this so that we do not have to wonder :) Cheers, Nathan _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 02/22] kbuild: add support for Clang LTO 2020-06-25 2:26 ` Nathan Chancellor @ 2020-06-25 16:13 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-06-25 16:13 UTC (permalink / raw) To: Nathan Chancellor Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Wed, Jun 24, 2020 at 07:26:47PM -0700, Nathan Chancellor wrote: > Hi Sami, > > On Wed, Jun 24, 2020 at 01:31:40PM -0700, 'Sami Tolvanen' via Clang Built Linux wrote: > > This change adds build system support for Clang's Link Time > > Optimization (LTO). With -flto, instead of ELF object files, Clang > > produces LLVM bitcode, which is compiled into native code at link > > time, allowing the final binary to be optimized globally. For more > > details, see: > > > > https://llvm.org/docs/LinkTimeOptimization.html > > > > The Kconfig option CONFIG_LTO_CLANG is implemented as a choice, > > which defaults to LTO being disabled. To use LTO, the architecture > > must select ARCH_SUPPORTS_LTO_CLANG and support: > > > > - compiling with Clang, > > - compiling inline assembly with Clang's integrated assembler, > > - and linking with LLD. > > > > While using full LTO results in the best runtime performance, the > > compilation is not scalable in time or memory. CONFIG_THINLTO > > enables ThinLTO, which allows parallel optimization and faster > > incremental builds. ThinLTO is used by default if the architecture > > also selects ARCH_SUPPORTS_THINLTO: > > > > https://clang.llvm.org/docs/ThinLTO.html > > > > To enable LTO, LLVM tools must be used to handle bitcode files. 
The > > easiest way is to pass the LLVM=1 option to make: > > > > $ make LLVM=1 defconfig > > $ scripts/config -e LTO_CLANG > > $ make LLVM=1 > > > > Alternatively, at least the following LLVM tools must be used: > > > > CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm > > > > To prepare for LTO support with other compilers, common parts are > > gated behind the CONFIG_LTO option, and LTO can be disabled for > > specific files by filtering out CC_FLAGS_LTO. > > > > Note that support for DYNAMIC_FTRACE and MODVERSIONS are added in > > follow-up patches. > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > --- > > Makefile | 16 ++++++++ > > arch/Kconfig | 66 +++++++++++++++++++++++++++++++ > > include/asm-generic/vmlinux.lds.h | 11 ++++-- > > scripts/Makefile.build | 9 ++++- > > scripts/Makefile.modfinal | 9 ++++- > > scripts/Makefile.modpost | 24 ++++++++++- > > scripts/link-vmlinux.sh | 32 +++++++++++---- > > 7 files changed, 151 insertions(+), 16 deletions(-) > > > > diff --git a/Makefile b/Makefile > > index ac2c61c37a73..0c7fe6fb2143 100644 > > --- a/Makefile > > +++ b/Makefile > > @@ -886,6 +886,22 @@ KBUILD_CFLAGS += $(CC_FLAGS_SCS) > > export CC_FLAGS_SCS > > endif > > > > +ifdef CONFIG_LTO_CLANG > > +ifdef CONFIG_THINLTO > > +CC_FLAGS_LTO_CLANG := -flto=thin $(call cc-option, -fsplit-lto-unit) > > +KBUILD_LDFLAGS += --thinlto-cache-dir=.thinlto-cache > > +else > > +CC_FLAGS_LTO_CLANG := -flto > > +endif > > +CC_FLAGS_LTO_CLANG += -fvisibility=default > > +endif > > + > > +ifdef CONFIG_LTO > > +CC_FLAGS_LTO := $(CC_FLAGS_LTO_CLANG) > > +KBUILD_CFLAGS += $(CC_FLAGS_LTO) > > +export CC_FLAGS_LTO > > +endif > > + > > # arch Makefile may override CC so keep this after arch Makefile is included > > NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include) > > > > diff --git a/arch/Kconfig b/arch/Kconfig > > index 8cc35dc556c7..e00b122293f8 100644 > > --- a/arch/Kconfig > > +++ b/arch/Kconfig > > @@ -552,6 +552,72 @@ config 
SHADOW_CALL_STACK > > reading and writing arbitrary memory may be able to locate them > > and hijack control flow by modifying the stacks. > > > > +config LTO > > + bool > > + > > +config ARCH_SUPPORTS_LTO_CLANG > > + bool > > + help > > + An architecture should select this option if it supports: > > + - compiling with Clang, > > + - compiling inline assembly with Clang's integrated assembler, > > + - and linking with LLD. > > + > > +config ARCH_SUPPORTS_THINLTO > > + bool > > + help > > + An architecture should select this option if it supports Clang's > > + ThinLTO. > > + > > +config THINLTO > > + bool "Clang ThinLTO" > > + depends on LTO_CLANG && ARCH_SUPPORTS_THINLTO > > + default y > > + help > > + This option enables Clang's ThinLTO, which allows for parallel > > + optimization and faster incremental compiles. More information > > + can be found from Clang's documentation: > > + > > + https://clang.llvm.org/docs/ThinLTO.html > > + > > +choice > > + prompt "Link Time Optimization (LTO)" > > + default LTO_NONE > > + help > > + This option enables Link Time Optimization (LTO), which allows the > > + compiler to optimize binaries globally. > > + > > + If unsure, select LTO_NONE. > > + > > +config LTO_NONE > > + bool "None" > > + > > +config LTO_CLANG > > + bool "Clang's Link Time Optimization (EXPERIMENTAL)" > > + depends on CC_IS_CLANG && CLANG_VERSION >= 110000 && LD_IS_LLD > > I am curious, what is the reason for gating this at clang 11.0.0? > > Presumably this? https://github.com/ClangBuiltLinux/linux/issues/510 > > It might be nice to notate this so that we do not have to wonder :) Yes, that's the reason. I'll add a note about it. Thanks! Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
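Editorial note on the version gate discussed above: the `CLANG_VERSION >= 110000` dependency compares against a packed integer. The following is a minimal sketch, assuming the usual Kconfig encoding of major * 10000 + minor * 100 + patchlevel; the helper names `clang_version_code` and `clang_new_enough_for_lto` are hypothetical and not part of the patch.

```c
/*
 * Hypothetical helper: packs a compiler version the way the
 * CLANG_VERSION Kconfig symbol is assumed to be encoded,
 * e.g. 11.0.0 becomes 110000.
 */
int clang_version_code(int major, int minor, int patchlevel)
{
	return major * 10000 + minor * 100 + patchlevel;
}

/* Mirrors the dependency in the patch: CLANG_VERSION >= 110000. */
int clang_new_enough_for_lto(int major, int minor, int patchlevel)
{
	return clang_version_code(major, minor, patchlevel) >= 110000;
}
```

Under this encoding, any 10.x release falls below the threshold, which matches the intent of gating LTO_CLANG on Clang 11.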
* [PATCH 03/22] kbuild: lto: fix module versioning 2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen 2020-06-24 20:31 ` [PATCH 01/22] objtool: use sh_info to find the base for .rela sections Sami Tolvanen 2020-06-24 20:31 ` [PATCH 02/22] kbuild: add support for Clang LTO Sami Tolvanen @ 2020-06-24 20:31 ` Sami Tolvanen 2020-06-24 20:31 ` [PATCH 04/22] kbuild: lto: fix recordmcount Sami Tolvanen ` (22 subsequent siblings) 25 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With CONFIG_MODVERSIONS, version information is linked into each compilation unit that exports symbols. With LTO, we cannot use this method as all C code is compiled into LLVM bitcode instead. This change collects symbol versions into .symversions files and merges them in link-vmlinux.sh where they are all linked into vmlinux.o at the same time. 
Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- .gitignore | 1 + Makefile | 3 ++- arch/Kconfig | 1 - scripts/Makefile.build | 33 +++++++++++++++++++++++++++++++-- scripts/Makefile.modpost | 2 ++ scripts/link-vmlinux.sh | 25 ++++++++++++++++++++++++- 6 files changed, 60 insertions(+), 5 deletions(-) diff --git a/.gitignore b/.gitignore index 87b9dd8a163b..51b02c2f2826 100644 --- a/.gitignore +++ b/.gitignore @@ -41,6 +41,7 @@ *.so.dbg *.su *.symtypes +*.symversions *.tab.[ch] *.tar *.xz diff --git a/Makefile b/Makefile index 0c7fe6fb2143..161ad0d1f77f 100644 --- a/Makefile +++ b/Makefile @@ -1793,7 +1793,8 @@ clean: $(clean-dirs) -o -name '.tmp_*.o.*' \ -o -name '*.c.[012]*.*' \ -o -name '*.ll' \ - -o -name '*.gcno' \) -type f -print | xargs rm -f + -o -name '*.gcno' \ + -o -name '*.*.symversions' \) -type f -print | xargs rm -f # Generate tags for editors # --------------------------------------------------------------------------- diff --git a/arch/Kconfig b/arch/Kconfig index e00b122293f8..87488fe1e6b8 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -600,7 +600,6 @@ config LTO_CLANG depends on ARCH_SUPPORTS_LTO_CLANG depends on !FTRACE_MCOUNT_RECORD depends on !KASAN - depends on !MODVERSIONS select LTO help This option enables Clang's Link Time Optimization (LTO), which diff --git a/scripts/Makefile.build b/scripts/Makefile.build index f307e708a1b7..5c0bbb6ddfcf 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -163,6 +163,15 @@ ifdef CONFIG_MODVERSIONS # the actual value of the checksum generated by genksyms # o remove .tmp_<file>.o to <file>.o +ifdef CONFIG_LTO_CLANG +# Generate .o.symversions files for each .o with exported symbols, and link these +# to the kernel and/or modules at the end. 
+cmd_modversions_c = \ + if $(NM) $@ 2>/dev/null | grep -q __ksymtab; then \ + $(call cmd_gensymtypes_c,$(KBUILD_SYMTYPES),$(@:.o=.symtypes)) \ + > $@.symversions; \ + fi; +else cmd_modversions_c = \ if $(OBJDUMP) -h $@ | grep -q __ksymtab; then \ $(call cmd_gensymtypes_c,$(KBUILD_SYMTYPES),$(@:.o=.symtypes)) \ @@ -174,6 +183,7 @@ cmd_modversions_c = \ rm -f $(@D)/.tmp_$(@F:.o=.ver); \ fi endif +endif ifdef CONFIG_FTRACE_MCOUNT_RECORD ifndef CC_USING_RECORD_MCOUNT @@ -389,6 +399,18 @@ $(obj)/%.asn1.c $(obj)/%.asn1.h: $(src)/%.asn1 $(objtree)/scripts/asn1_compiler $(subdir-builtin): $(obj)/%/built-in.a: $(obj)/% ; $(subdir-modorder): $(obj)/%/modules.order: $(obj)/% ; +# combine symversions for later processing +quiet_cmd_update_lto_symversions = SYMVER $@ +ifeq ($(CONFIG_LTO_CLANG) $(CONFIG_MODVERSIONS),y y) + cmd_update_lto_symversions = \ + rm -f $@.symversions \ + $(foreach n, $(filter-out FORCE,$^), \ + $(if $(wildcard $(n).symversions), \ + ; cat $(n).symversions >> $@.symversions)) +else + cmd_update_lto_symversions = echo >/dev/null +endif + # # Rule to compile a set of .o files into one .a file (without symbol table) # @@ -396,8 +418,11 @@ $(subdir-modorder): $(obj)/%/modules.order: $(obj)/% ; quiet_cmd_ar_builtin = AR $@ cmd_ar_builtin = rm -f $@; $(AR) cDPrST $@ $(real-prereqs) +quiet_cmd_ar_and_symver = AR $@ + cmd_ar_and_symver = $(cmd_update_lto_symversions); $(cmd_ar_builtin) + $(obj)/built-in.a: $(real-obj-y) FORCE - $(call if_changed,ar_builtin) + $(call if_changed,ar_and_symver) # # Rule to create modules.order file @@ -417,8 +442,11 @@ $(obj)/modules.order: $(obj-m) FORCE # # Rule to compile a set of .o files into one .a file (with symbol table) # +quiet_cmd_ar_lib = AR $@ + cmd_ar_lib = $(cmd_update_lto_symversions); $(cmd_ar) + $(obj)/lib.a: $(lib-y) FORCE - $(call if_changed,ar) + $(call if_changed,ar_lib) # NOTE: # Do not replace $(filter %.o,^) with $(real-prereqs). 
When a single object @@ -427,6 +455,7 @@ $(obj)/lib.a: $(lib-y) FORCE ifdef CONFIG_LTO_CLANG quiet_cmd_link_multi-m = AR [M] $@ cmd_link_multi-m = \ + $(cmd_update_lto_symversions); \ rm -f $@; \ $(AR) rcsTP$(KBUILD_ARFLAGS) $@ $(filter %.o,$^) else diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost index 9ced8aecd579..42dbdc2bbf73 100644 --- a/scripts/Makefile.modpost +++ b/scripts/Makefile.modpost @@ -110,6 +110,8 @@ prelink-ext = .lto quiet_cmd_cc_lto_link_modules = LTO [M] $@ cmd_cc_lto_link_modules = \ $(LD) $(ld_flags) -r -o $@ \ + $(shell [ -s $(@:.lto.o=.o.symversions) ] && \ + echo -T $(@:.lto.o=.o.symversions)) \ --whole-archive $(filter-out FORCE,$^) %.lto.o: %.o FORCE diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index a681b3b6722e..69a6d7254e28 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -39,11 +39,28 @@ info() fi } +# If CONFIG_LTO_CLANG is selected, collect generated symbol versions into +# .tmp_symversions.lds +gen_symversions() +{ + info GEN .tmp_symversions.lds + rm -f .tmp_symversions.lds + + for a in ${KBUILD_VMLINUX_OBJS} ${KBUILD_VMLINUX_LIBS}; do + for o in $(${AR} t $a 2>/dev/null); do + if [ -f ${o}.symversions ]; then + cat ${o}.symversions >> .tmp_symversions.lds + fi + done + done +} + # Link of vmlinux.o used for section mismatch analysis # ${1} output file modpost_link() { local objects + local lds="" objects="--whole-archive \ ${KBUILD_VMLINUX_OBJS} \ @@ -53,6 +70,11 @@ modpost_link() --end-group" if [ -n "${CONFIG_LTO_CLANG}" ]; then + if [ -n "${CONFIG_MODVERSIONS}" ]; then + gen_symversions + lds="${lds} -T .tmp_symversions.lds" + fi + # This might take a while, so indicate that we're doing # an LTO link info LTO ${1} @@ -60,7 +82,7 @@ modpost_link() info LD ${1} fi - ${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${objects} + ${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${lds} ${objects} } objtool_link() @@ -238,6 +260,7 @@ cleanup() { rm -f .btf.* rm -f .tmp_System.map + rm -f 
.tmp_symversions.lds rm -f .tmp_vmlinux* rm -f System.map rm -f vmlinux -- 2.27.0.212.ge8ba1cc988-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* [PATCH 04/22] kbuild: lto: fix recordmcount 2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen ` (2 preceding siblings ...) 2020-06-24 20:31 ` [PATCH 03/22] kbuild: lto: fix module versioning Sami Tolvanen @ 2020-06-24 20:31 ` Sami Tolvanen 2020-06-24 21:27 ` Peter Zijlstra 2020-06-24 20:31 ` [PATCH 05/22] kbuild: lto: postpone objtool Sami Tolvanen ` (21 subsequent siblings) 25 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With LTO, LLVM bitcode won't be compiled into native code until modpost_link. This change postpones calls to recordmcount until after this step. In order to exclude specific functions from inspection, we add a new code section .text..nomcount, which we tell recordmcount to ignore, and a __nomcount attribute for moving functions to this section. 
Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- Makefile | 2 +- arch/Kconfig | 2 +- include/asm-generic/vmlinux.lds.h | 1 + include/linux/compiler-clang.h | 4 ++++ include/linux/compiler_types.h | 4 ++++ kernel/trace/ftrace.c | 1 + scripts/Makefile.build | 9 +++++++++ scripts/Makefile.modfinal | 18 ++++++++++++++++-- scripts/link-vmlinux.sh | 29 +++++++++++++++++++++++++++++ scripts/recordmcount.c | 3 ++- 10 files changed, 68 insertions(+), 5 deletions(-) diff --git a/Makefile b/Makefile index 161ad0d1f77f..3a7e5e5c17b9 100644 --- a/Makefile +++ b/Makefile @@ -861,7 +861,7 @@ KBUILD_AFLAGS += $(CC_FLAGS_USING) ifdef CONFIG_DYNAMIC_FTRACE ifdef CONFIG_HAVE_C_RECORDMCOUNT BUILD_C_RECORDMCOUNT := y - export BUILD_C_RECORDMCOUNT + export BUILD_C_RECORDMCOUNT RECORDMCOUNT_WARN endif endif endif diff --git a/arch/Kconfig b/arch/Kconfig index 87488fe1e6b8..85b2044b927d 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -598,7 +598,7 @@ config LTO_CLANG depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm) depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm) depends on ARCH_SUPPORTS_LTO_CLANG - depends on !FTRACE_MCOUNT_RECORD + depends on !FTRACE_MCOUNT_RECORD || HAVE_C_RECORDMCOUNT depends on !KASAN select LTO help diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 78079000c05a..a1c902b808d0 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -565,6 +565,7 @@ *(.text.hot TEXT_MAIN .text.fixup .text.unlikely) \ NOINSTR_TEXT \ *(.text..refcount) \ + *(.text..nomcount) \ *(.ref.text) \ MEM_KEEP(init.text*) \ MEM_KEEP(exit.text*) \ diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h index ee37256ec8bd..fd78475c0642 100644 --- a/include/linux/compiler-clang.h +++ b/include/linux/compiler-clang.h @@ -55,3 +55,7 @@ #if __has_feature(shadow_call_stack) # define __noscs __attribute__((__no_sanitize__("shadow-call-stack"))) #endif + +#if 
defined(CONFIG_LTO_CLANG) && defined(CONFIG_FTRACE_MCOUNT_RECORD) +#define __nomcount __attribute__((__section__(".text..nomcount"))) +#endif diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h index e368384445b6..1470c9703a25 100644 --- a/include/linux/compiler_types.h +++ b/include/linux/compiler_types.h @@ -233,6 +233,10 @@ struct ftrace_likely_data { # define __noscs #endif +#ifndef __nomcount +# define __nomcount +#endif + #ifndef asm_volatile_goto #define asm_volatile_goto(x...) asm goto(x) #endif diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 1903b80db6eb..8e3ddb8123d9 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -6062,6 +6062,7 @@ static int ftrace_cmp_ips(const void *a, const void *b) return 0; } +__nomcount static int ftrace_process_locs(struct module *mod, unsigned long *start, unsigned long *end) diff --git a/scripts/Makefile.build b/scripts/Makefile.build index 5c0bbb6ddfcf..64e99f4baa5b 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -187,6 +187,9 @@ endif ifdef CONFIG_FTRACE_MCOUNT_RECORD ifndef CC_USING_RECORD_MCOUNT +ifndef CC_USING_PATCHABLE_FUNCTION_ENTRY +# With LTO, we postpone recordmcount until we compile a native binary +ifndef CONFIG_LTO_CLANG # compiler will not generate __mcount_loc use recordmcount or recordmcount.pl ifdef BUILD_C_RECORDMCOUNT ifeq ("$(origin RECORDMCOUNT_WARN)", "command line") @@ -200,6 +203,8 @@ sub_cmd_record_mcount = \ if [ $(@) != "scripts/mod/empty.o" ]; then \ $(objtree)/scripts/recordmcount $(RECORDMCOUNT_FLAGS) "$(@)"; \ fi; +endif # CONFIG_LTO_CLANG + recordmcount_source := $(srctree)/scripts/recordmcount.c \ $(srctree)/scripts/recordmcount.h else @@ -209,11 +214,15 @@ sub_cmd_record_mcount = perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \ "$(OBJDUMP)" "$(OBJCOPY)" "$(CC) $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS)" \ "$(LD) $(KBUILD_LDFLAGS)" "$(NM)" "$(RM)" "$(MV)" \ "$(if $(part-of-module),1,0)" "$(@)"; + recordmcount_source := 
$(srctree)/scripts/recordmcount.pl endif # BUILD_C_RECORDMCOUNT +ifndef CONFIG_LTO_CLANG cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)), \ $(sub_cmd_record_mcount)) +endif # CONFIG_LTO_CLANG endif # CC_USING_RECORD_MCOUNT +endif # CC_USING_PATCHABLE_FUNCTION_ENTRY endif # CONFIG_FTRACE_MCOUNT_RECORD ifdef CONFIG_STACK_VALIDATION diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal index 1005b147abd0..d168f0cfe67c 100644 --- a/scripts/Makefile.modfinal +++ b/scripts/Makefile.modfinal @@ -34,10 +34,24 @@ ifdef CONFIG_LTO_CLANG # With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to # avoid a second slow LTO link prelink-ext := .lto -endif + +# ELF processing was skipped earlier because we didn't have native code, +# so let's now process the prelinked binary before we link the module. + +ifdef CONFIG_FTRACE_MCOUNT_RECORD +ifndef CC_USING_RECORD_MCOUNT +ifndef CC_USING_PATCHABLE_FUNCTION_ENTRY +cmd_ld_ko_o += $(objtree)/scripts/recordmcount $(RECORDMCOUNT_FLAGS) \ + $(@:.ko=$(prelink-ext).o); + +endif # CC_USING_PATCHABLE_FUNCTION_ENTRY +endif # CC_USING_RECORD_MCOUNT +endif # CONFIG_FTRACE_MCOUNT_RECORD + +endif # CONFIG_LTO_CLANG quiet_cmd_ld_ko_o = LD [M] $@ - cmd_ld_ko_o = \ + cmd_ld_ko_o += \ $(LD) -r $(KBUILD_LDFLAGS) \ $(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE) \ $(addprefix -T , $(KBUILD_LDS_MODULE)) \ diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index 69a6d7254e28..c72f5d0238f1 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -108,6 +108,29 @@ objtool_link() fi } +# If CONFIG_LTO_CLANG is selected, we postpone running recordmcount until +# we have compiled LLVM IR to an object file. 
+recordmcount() +{ + if [ "${CONFIG_LTO_CLANG} ${CONFIG_FTRACE_MCOUNT_RECORD}" != "y y" ]; then + return + fi + + if [ -n "${CC_USING_RECORD_MCOUNT}" ]; then + return + fi + if [ -n "${CC_USING_PATCHABLE_FUNCTION_ENTRY}" ]; then + return + fi + + local flags="" + + [ -n "${RECORDMCOUNT_WARN}" ] && flags="-w" + + info MCOUNT $* + ${objtree}/scripts/recordmcount ${flags} $* +} + # Link of vmlinux # ${1} - output file # ${2}, ${3}, ... - optional extra .o files @@ -316,6 +339,12 @@ objtool_link vmlinux.o # modpost vmlinux.o to check for section mismatches ${MAKE} -f "${srctree}/scripts/Makefile.modpost" MODPOST_VMLINUX=1 +if [ -n "${CONFIG_LTO_CLANG}" ]; then + # If we postponed ELF processing steps due to LTO, process + # vmlinux.o instead. + recordmcount vmlinux.o +fi + info MODINFO modules.builtin.modinfo ${OBJCOPY} -j .modinfo -O binary vmlinux.o modules.builtin.modinfo info GEN modules.builtin diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c index 7225107a9aaf..9e9f10b4d649 100644 --- a/scripts/recordmcount.c +++ b/scripts/recordmcount.c @@ -404,7 +404,8 @@ static uint32_t (*w2)(uint16_t); /* Names of the sections that could contain calls to mcount. */ static int is_mcounted_section_name(char const *const txtname) { - return strncmp(".text", txtname, 5) == 0 || + return (strncmp(".text", txtname, 5) == 0 && + strcmp(".text..nomcount", txtname) != 0) || strcmp(".init.text", txtname) == 0 || strcmp(".ref.text", txtname) == 0 || strcmp(".sched.text", txtname) == 0 || -- 2.27.0.212.ge8ba1cc988-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
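Editorial note on the recordmcount.c hunk above: because the existing check matches any section name that starts with ".text", the new ".text..nomcount" section has to be excluded explicitly. The sketch below reproduces only the comparisons visible in the diff context; the real function checks further section names that the hunk cuts off, and the name `is_mcounted_section_name_sketch` is hypothetical.

```c
#include <string.h>

/*
 * Sketch of the patched filter, limited to the comparisons shown in
 * the diff. Returns nonzero if the section may contain mcount calls.
 */
int is_mcounted_section_name_sketch(const char *txtname)
{
	return (strncmp(".text", txtname, 5) == 0 &&
		strcmp(".text..nomcount", txtname) != 0) ||
		strcmp(".init.text", txtname) == 0 ||
		strcmp(".ref.text", txtname) == 0 ||
		strcmp(".sched.text", txtname) == 0;
}
```

The exclusion is an exact string match, so only a section named precisely ".text..nomcount" is skipped; other ".text"-prefixed sections such as ".text.hot" are still scanned.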
* Re: [PATCH 04/22] kbuild: lto: fix recordmcount 2020-06-24 20:31 ` [PATCH 04/22] kbuild: lto: fix recordmcount Sami Tolvanen @ 2020-06-24 21:27 ` Peter Zijlstra 2020-06-24 21:45 ` Sami Tolvanen 0 siblings, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-06-24 21:27 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Wed, Jun 24, 2020 at 01:31:42PM -0700, Sami Tolvanen wrote: > With LTO, LLVM bitcode won't be compiled into native code until > modpost_link. This change postpones calls to recordmcount until after > this step. > > In order to exclude specific functions from inspection, we add a new > code section .text..nomcount, which we tell recordmcount to ignore, and > a __nomcount attribute for moving functions to this section. I'm confused, you only add this to functions in ftrace itself, which is compiled with: KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS)) and so should not have mcount/fentry sites anyway. So what's the point of ignoring them further? This Changelog does not explain.
* Re: [PATCH 04/22] kbuild: lto: fix recordmcount 2020-06-24 21:27 ` Peter Zijlstra @ 2020-06-24 21:45 ` Sami Tolvanen 2020-06-25 7:45 ` Peter Zijlstra 0 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-06-24 21:45 UTC (permalink / raw) To: Peter Zijlstra, Steven Rostedt Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Wed, Jun 24, 2020 at 11:27:37PM +0200, Peter Zijlstra wrote: > On Wed, Jun 24, 2020 at 01:31:42PM -0700, Sami Tolvanen wrote: > > With LTO, LLVM bitcode won't be compiled into native code until > > modpost_link. This change postpones calls to recordmcount until after > > this step. > > > > In order to exclude specific functions from inspection, we add a new > > code section .text..nomcount, which we tell recordmcount to ignore, and > > a __nomcount attribute for moving functions to this section. > > I'm confused, you only add this to functions in ftrace itself, which is > compiled with: > > KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS)) > > and so should not have mcount/fentry sites anyway. So what's the point > of ignoring them further? > > This Changelog does not explain. Normally, recordmcount ignores each ftrace.o file, but since we are running it on vmlinux.o, we need another way to stop it from looking at references to mcount/fentry that are not calls. Here's a comment from recordmcount.c: /* * The file kernel/trace/ftrace.o references the mcount * function but does not call it. Since ftrace.o should * not be traced anyway, we just skip it. */ But I agree, the commit message could use more details. Also +Steven for thoughts about this approach.
Sami
* Re: [PATCH 04/22] kbuild: lto: fix recordmcount 2020-06-24 21:45 ` Sami Tolvanen @ 2020-06-25 7:45 ` Peter Zijlstra 2020-06-25 16:15 ` Sami Tolvanen 0 siblings, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-06-25 7:45 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Wed, Jun 24, 2020 at 02:45:30PM -0700, Sami Tolvanen wrote: > On Wed, Jun 24, 2020 at 11:27:37PM +0200, Peter Zijlstra wrote: > > On Wed, Jun 24, 2020 at 01:31:42PM -0700, Sami Tolvanen wrote: > > > With LTO, LLVM bitcode won't be compiled into native code until > > > modpost_link. This change postpones calls to recordmcount until after > > > this step. > > > > > > In order to exclude specific functions from inspection, we add a new > > > code section .text..nomcount, which we tell recordmcount to ignore, and > > > a __nomcount attribute for moving functions to this section. > > > > I'm confused, you only add this to functions in ftrace itself, which is > > compiled with: > > > > KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS)) > > > > and so should not have mcount/fentry sites anyway. So what's the point > > of ignoring them further? > > > > This Changelog does not explain. > > Normally, recordmcount ignores each ftrace.o file, but since we are > running it on vmlinux.o, we need another way to stop it from looking > at references to mcount/fentry that are not calls. Here's a comment > from recordmcount.c: > > /* > * The file kernel/trace/ftrace.o references the mcount > * function but does not call it. Since ftrace.o should > * not be traced anyway, we just skip it. > */ > > But I agree, the commit message could use more defails. Also +Steven > for thoughts about this approach. 
Ah, is this because recordmcount isn't smart enough to know the difference between "CALL $mcount" and any other RELA that has mcount? At least for x86_64 I can do a really quick take for a recordmcount pass in objtool, but I suppose you also need this for ARM64?
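The distinction raised here can be illustrated with a small sketch. This is not code from recordmcount or objtool: reloc_is_call_site() is a hypothetical helper, and it assumes the x86_64 encoding in which a near CALL is opcode 0xe8 followed by a 32-bit displacement, so a relocation that belongs to a call applies one byte after the opcode. A byte check like this is only a heuristic, since 0xe8 can also be the last byte of some other instruction; a tool that decodes instructions, as objtool does, can answer the question reliably.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* x86_64 near CALL: opcode 0xe8 followed by a 32-bit displacement. */
#define X86_CALL_OPCODE 0xe8

/*
 * Hypothetical helper (an assumption for illustration, not a real
 * recordmcount/objtool function): given the bytes of a text section and
 * the offset at which a relocation against mcount/__fentry__ applies,
 * report whether the relocation patches the displacement of a CALL.
 */
static bool reloc_is_call_site(const uint8_t *text, size_t text_len,
			       size_t reloc_offset)
{
	/* The displacement starts one byte after the opcode. */
	if (reloc_offset == 0 || reloc_offset + 4 > text_len)
		return false;

	return text[reloc_offset - 1] == X86_CALL_OPCODE;
}
```

A relocation inside, say, a LEA that merely takes the address of mcount fails the check, which is the case ftrace.o runs into.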
* Re: [PATCH 04/22] kbuild: lto: fix recordmcount 2020-06-25 7:45 ` Peter Zijlstra @ 2020-06-25 16:15 ` Sami Tolvanen 2020-06-25 20:02 ` [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool Peter Zijlstra 0 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-06-25 16:15 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Jun 25, 2020 at 09:45:30AM +0200, Peter Zijlstra wrote: > On Wed, Jun 24, 2020 at 02:45:30PM -0700, Sami Tolvanen wrote: > > On Wed, Jun 24, 2020 at 11:27:37PM +0200, Peter Zijlstra wrote: > > > On Wed, Jun 24, 2020 at 01:31:42PM -0700, Sami Tolvanen wrote: > > > > With LTO, LLVM bitcode won't be compiled into native code until > > > > modpost_link. This change postpones calls to recordmcount until after > > > > this step. > > > > > > > > In order to exclude specific functions from inspection, we add a new > > > > code section .text..nomcount, which we tell recordmcount to ignore, and > > > > a __nomcount attribute for moving functions to this section. > > > > > > I'm confused, you only add this to functions in ftrace itself, which is > > > compiled with: > > > > > > KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS)) > > > > > > and so should not have mcount/fentry sites anyway. So what's the point > > > of ignoring them further? > > > > > > This Changelog does not explain. > > > > Normally, recordmcount ignores each ftrace.o file, but since we are > > running it on vmlinux.o, we need another way to stop it from looking > > at references to mcount/fentry that are not calls. Here's a comment > > from recordmcount.c: > > > > /* > > * The file kernel/trace/ftrace.o references the mcount > > * function but does not call it. Since ftrace.o should > > * not be traced anyway, we just skip it. 
> > */ > > > > But I agree, the commit message could use more details. Also +Steven > > for thoughts about this approach. > > Ah, is this because recordmcount isn't smart enough to know the > difference between "CALL $mcount" and any other RELA that has mcount? Yes. > At least for x86_64 I can do a really quick take for a recordmcount pass > in objtool, but I suppose you also need this for ARM64? Sure, sounds good. arm64 uses -fpatchable-function-entry with clang, so we don't need recordmcount there. Sami
* [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool 2020-06-25 16:15 ` Sami Tolvanen @ 2020-06-25 20:02 ` Peter Zijlstra 2020-06-25 20:54 ` Nick Desaulniers 2020-06-25 22:40 ` Sami Tolvanen 0 siblings, 2 replies; 212+ messages in thread From: Peter Zijlstra @ 2020-06-25 20:02 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Josh Poimboeuf, linux-pci, Will Deacon, linux-arm-kernel, mhelsley On Thu, Jun 25, 2020 at 09:15:03AM -0700, Sami Tolvanen wrote: > On Thu, Jun 25, 2020 at 09:45:30AM +0200, Peter Zijlstra wrote: > > At least for x86_64 I can do a really quick take for a recordmcount pass > > in objtool, but I suppose you also need this for ARM64 ? > > Sure, sounds good. arm64 uses -fpatchable-function-entry with clang, so we > don't need recordmcount there. This is on top of my local pile: git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git master which notably includes the static_call series. Not boot tested, but it generates the required sections and they look more or less as expected, ymmv. 
--- arch/x86/Kconfig | 1 - scripts/Makefile.build | 3 ++ scripts/link-vmlinux.sh | 2 +- tools/objtool/builtin-check.c | 9 ++--- tools/objtool/builtin.h | 2 +- tools/objtool/check.c | 81 +++++++++++++++++++++++++++++++++++++++++++ tools/objtool/check.h | 1 + tools/objtool/objtool.h | 1 + 8 files changed, 91 insertions(+), 9 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index a291823f3f26..189575c12434 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -174,7 +174,6 @@ config X86 select HAVE_EXIT_THREAD select HAVE_FAST_GUP select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE - select HAVE_FTRACE_MCOUNT_RECORD select HAVE_FUNCTION_GRAPH_TRACER select HAVE_FUNCTION_TRACER select HAVE_GCC_PLUGINS diff --git a/scripts/Makefile.build b/scripts/Makefile.build index 2e8810b7e5ed..c774befc57da 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -227,6 +227,9 @@ endif ifdef CONFIG_X86_SMAP objtool_args += --uaccess endif +ifdef CONFIG_DYNAMIC_FTRACE + objtool_args += --mcount +endif # 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index 92dd745906f4..00c6e4f28a1a 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -60,7 +60,7 @@ objtool_link() local objtoolopt; if [ -n "${CONFIG_VMLINUX_VALIDATION}" ]; then - objtoolopt="check" + objtoolopt="check --vmlinux" if [ -z "${CONFIG_FRAME_POINTER}" ]; then objtoolopt="${objtoolopt} --no-fp" fi diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c index 4896c5a89702..a6c3a3fba67d 100644 --- a/tools/objtool/builtin-check.c +++ b/tools/objtool/builtin-check.c @@ -18,7 +18,7 @@ #include "builtin.h" #include "objtool.h" -bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, fpu; +bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, 
validate_dup, vmlinux, fpu, mcount; static const char * const check_usage[] = { "objtool check [<options>] file.o", @@ -36,12 +36,13 @@ const struct option check_options[] = { OPT_BOOLEAN('d', "duplicate", &validate_dup, "duplicate validation for vmlinux.o"), OPT_BOOLEAN('l', "vmlinux", &vmlinux, "vmlinux.o validation"), OPT_BOOLEAN('F', "fpu", &fpu, "validate FPU context"), + OPT_BOOLEAN('M', "mcount", &mcount, "generate __mcount_loc"), OPT_END(), }; int cmd_check(int argc, const char **argv) { - const char *objname, *s; + const char *objname; argc = parse_options(argc, argv, check_options, check_usage, 0); @@ -50,9 +51,5 @@ int cmd_check(int argc, const char **argv) objname = argv[0]; - s = strstr(objname, "vmlinux.o"); - if (s && !s[9]) - vmlinux = true; - return check(objname, false); } diff --git a/tools/objtool/builtin.h b/tools/objtool/builtin.h index 7158e09d4cc9..b51d883ec245 100644 --- a/tools/objtool/builtin.h +++ b/tools/objtool/builtin.h @@ -8,7 +8,7 @@ #include <subcmd/parse-options.h> extern const struct option check_options[]; -extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, fpu; +extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, fpu, mcount; extern int cmd_check(int argc, const char **argv); extern int cmd_orc(int argc, const char **argv); diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 6647a8d1545b..ee99566bdae9 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -533,6 +533,65 @@ static int create_static_call_sections(struct objtool_file *file) return 0; } +static int create_mcount_loc_sections(struct objtool_file *file) +{ + struct section *sec, *reloc_sec; + struct reloc *reloc; + unsigned long *loc; + struct instruction *insn; + int idx; + + sec = find_section_by_name(file->elf, "__mcount_loc"); + if (sec) { + INIT_LIST_HEAD(&file->mcount_loc_list); + WARN("file already has __mcount_loc section, 
skipping"); + return 0; + } + + if (list_empty(&file->mcount_loc_list)) + return 0; + + idx = 0; + list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) + idx++; + + sec = elf_create_section(file->elf, "__mcount_loc", 0, sizeof(unsigned long), idx); + if (!sec) + return -1; + + reloc_sec = elf_create_reloc_section(file->elf, sec, SHT_RELA); + if (!reloc_sec) + return -1; + + idx = 0; + list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) { + + loc = (unsigned long *)sec->data->d_buf + idx; + memset(loc, 0, sizeof(unsigned long)); + + reloc = malloc(sizeof(*reloc)); + if (!reloc) { + perror("malloc"); + return -1; + } + memset(reloc, 0, sizeof(*reloc)); + + reloc->sym = insn->sec->sym; + reloc->addend = insn->offset; + reloc->type = R_X86_64_64; + reloc->offset = idx * sizeof(unsigned long); + reloc->sec = reloc_sec; + elf_add_reloc(file->elf, reloc); + + idx++; + } + + if (elf_rebuild_reloc_section(file->elf, reloc_sec)) + return -1; + + return 0; +} + /* * Warnings shouldn't be reported for ignored functions. */ @@ -892,6 +951,22 @@ static int add_call_destinations(struct objtool_file *file) insn->type = INSN_NOP; } + if (mcount && !strcmp(insn->call_dest->name, "__fentry__")) { + if (reloc) { + reloc->type = R_NONE; + elf_write_reloc(file->elf, reloc); + } + + elf_write_insn(file->elf, insn->sec, + insn->offset, insn->len, + arch_nop_insn(insn->len)); + + insn->type = INSN_NOP; + + list_add_tail(&insn->mcount_loc_node, + &file->mcount_loc_list); + } + /* * Whatever stack impact regular CALLs have, should be undone * by the RETURN of the called function. 
@@ -3004,6 +3079,7 @@ int check(const char *_objname, bool orc) INIT_LIST_HEAD(&file.insn_list); hash_init(file.insn_hash); INIT_LIST_HEAD(&file.static_call_list); + INIT_LIST_HEAD(&file.mcount_loc_list); file.c_file = !vmlinux && find_section_by_name(file.elf, ".comment"); file.ignore_unreachables = no_unreachable; file.hints = false; @@ -3056,6 +3132,11 @@ int check(const char *_objname, bool orc) goto out; warnings += ret; + ret = create_mcount_loc_sections(&file); + if (ret < 0) + goto out; + warnings += ret; + if (orc) { ret = create_orc(&file); if (ret < 0) diff --git a/tools/objtool/check.h b/tools/objtool/check.h index cd95fca0d237..01f11b5da5dd 100644 --- a/tools/objtool/check.h +++ b/tools/objtool/check.h @@ -24,6 +24,7 @@ struct instruction { struct list_head list; struct hlist_node hash; struct list_head static_call_node; + struct list_head mcount_loc_node; struct section *sec; unsigned long offset; unsigned int len; diff --git a/tools/objtool/objtool.h b/tools/objtool/objtool.h index 9a7cd0b88bd8..f604b22d22cc 100644 --- a/tools/objtool/objtool.h +++ b/tools/objtool/objtool.h @@ -17,6 +17,7 @@ struct objtool_file { struct list_head insn_list; DECLARE_HASHTABLE(insn_hash, 20); struct list_head static_call_list; + struct list_head mcount_loc_list; bool ignore_unreachables, c_file, hints, rodata; }; _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
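As a reading aid for create_mcount_loc_sections() in the patch above, the layout it produces can be sketched in plain C. This is a simplified model under stated assumptions, not objtool code: struct loc_reloc and build_mcount_relocs() are hypothetical names, and the ELF section handling is left out entirely. Each call site gets one pointer-sized slot in __mcount_loc, zero-filled, plus one relocation whose addend is the call site's offset within its text section.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified relocation record (not objtool's struct reloc). */
struct loc_reloc {
	size_t offset;	/* where in __mcount_loc the address is written */
	size_t addend;	/* offset of the call site within its text section */
};

/*
 * Build one relocation per call site. Entry i of the table lives at
 * i * sizeof(unsigned long); the linker later fills it with the text
 * section's address plus the addend.
 *
 * Returns NULL when there are no sites or when allocation fails.
 * The caller frees the result.
 */
static struct loc_reloc *build_mcount_relocs(const size_t *site_offsets,
					     size_t nr_sites)
{
	struct loc_reloc *relocs;
	size_t i;

	if (!nr_sites)
		return NULL;

	/* calloc() zeroes the records, so no separate memset() is needed. */
	relocs = calloc(nr_sites, sizeof(*relocs));
	if (!relocs)
		return NULL;

	for (i = 0; i < nr_sites; i++) {
		relocs[i].offset = i * sizeof(unsigned long);
		relocs[i].addend = site_offsets[i];
	}

	return relocs;
}
```

The patch counts the list first because the section has to be sized before any entry is written; the sketch gets the count as a parameter instead.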
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool 2020-06-25 20:02 ` [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool Peter Zijlstra @ 2020-06-25 20:54 ` Nick Desaulniers 2020-06-25 22:40 ` Sami Tolvanen 1 sibling, 0 replies; 212+ messages in thread From: Nick Desaulniers @ 2020-06-25 20:54 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, Steven Rostedt, clang-built-linux, Josh Poimboeuf, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM, mhelsley On Thu, Jun 25, 2020 at 1:02 PM Peter Zijlstra <peterz@infradead.org> wrote: > > On Thu, Jun 25, 2020 at 09:15:03AM -0700, Sami Tolvanen wrote: > > On Thu, Jun 25, 2020 at 09:45:30AM +0200, Peter Zijlstra wrote: > > > > At least for x86_64 I can do a really quick take for a recordmcount pass > > > in objtool, but I suppose you also need this for ARM64 ? > > > > Sure, sounds good. arm64 uses -fpatchable-function-entry with clang, so we > > don't need recordmcount there. > > This is on top of my local pile: > > git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git master > > which notably includes the static_call series. > > Not boot tested, but it generates the required sections and they look > more or less as expected, ymmv. 
> > --- > arch/x86/Kconfig | 1 - > scripts/Makefile.build | 3 ++ > scripts/link-vmlinux.sh | 2 +- > tools/objtool/builtin-check.c | 9 ++--- > tools/objtool/builtin.h | 2 +- > tools/objtool/check.c | 81 +++++++++++++++++++++++++++++++++++++++++++ > tools/objtool/check.h | 1 + > tools/objtool/objtool.h | 1 + > 8 files changed, 91 insertions(+), 9 deletions(-) > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > index a291823f3f26..189575c12434 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -174,7 +174,6 @@ config X86 > select HAVE_EXIT_THREAD > select HAVE_FAST_GUP > select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE > - select HAVE_FTRACE_MCOUNT_RECORD > select HAVE_FUNCTION_GRAPH_TRACER > select HAVE_FUNCTION_TRACER > select HAVE_GCC_PLUGINS > diff --git a/scripts/Makefile.build b/scripts/Makefile.build > index 2e8810b7e5ed..c774befc57da 100644 > --- a/scripts/Makefile.build > +++ b/scripts/Makefile.build > @@ -227,6 +227,9 @@ endif > ifdef CONFIG_X86_SMAP > objtool_args += --uaccess > endif > +ifdef CONFIG_DYNAMIC_FTRACE > + objtool_args += --mcount > +endif > > # 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory > # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh > index 92dd745906f4..00c6e4f28a1a 100755 > --- a/scripts/link-vmlinux.sh > +++ b/scripts/link-vmlinux.sh > @@ -60,7 +60,7 @@ objtool_link() > local objtoolopt; > > if [ -n "${CONFIG_VMLINUX_VALIDATION}" ]; then > - objtoolopt="check" > + objtoolopt="check --vmlinux" > if [ -z "${CONFIG_FRAME_POINTER}" ]; then > objtoolopt="${objtoolopt} --no-fp" > fi > diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c > index 4896c5a89702..a6c3a3fba67d 100644 > --- a/tools/objtool/builtin-check.c > +++ b/tools/objtool/builtin-check.c > @@ -18,7 +18,7 @@ > #include "builtin.h" > #include "objtool.h" > > -bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, 
stats, validate_dup, vmlinux, fpu; > +bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, fpu, mcount; > > static const char * const check_usage[] = { > "objtool check [<options>] file.o", > @@ -36,12 +36,13 @@ const struct option check_options[] = { > OPT_BOOLEAN('d', "duplicate", &validate_dup, "duplicate validation for vmlinux.o"), > OPT_BOOLEAN('l', "vmlinux", &vmlinux, "vmlinux.o validation"), > OPT_BOOLEAN('F', "fpu", &fpu, "validate FPU context"), > + OPT_BOOLEAN('M', "mcount", &mcount, "generate __mcount_loc"), > OPT_END(), > }; > > int cmd_check(int argc, const char **argv) > { > - const char *objname, *s; > + const char *objname; > > argc = parse_options(argc, argv, check_options, check_usage, 0); > > @@ -50,9 +51,5 @@ int cmd_check(int argc, const char **argv) > > objname = argv[0]; > > - s = strstr(objname, "vmlinux.o"); > - if (s && !s[9]) > - vmlinux = true; > - > return check(objname, false); > } > diff --git a/tools/objtool/builtin.h b/tools/objtool/builtin.h > index 7158e09d4cc9..b51d883ec245 100644 > --- a/tools/objtool/builtin.h > +++ b/tools/objtool/builtin.h > @@ -8,7 +8,7 @@ > #include <subcmd/parse-options.h> > > extern const struct option check_options[]; > -extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, fpu; > +extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, fpu, mcount; > > extern int cmd_check(int argc, const char **argv); > extern int cmd_orc(int argc, const char **argv); > diff --git a/tools/objtool/check.c b/tools/objtool/check.c > index 6647a8d1545b..ee99566bdae9 100644 > --- a/tools/objtool/check.c > +++ b/tools/objtool/check.c > @@ -533,6 +533,65 @@ static int create_static_call_sections(struct objtool_file *file) > return 0; > } > > +static int create_mcount_loc_sections(struct objtool_file *file) > +{ > + struct section *sec, *reloc_sec; > + struct reloc *reloc; > + unsigned 
long *loc; > + struct instruction *insn; > + int idx; > + > + sec = find_section_by_name(file->elf, "__mcount_loc"); > + if (sec) { > + INIT_LIST_HEAD(&file->mcount_loc_list); > + WARN("file already has __mcount_loc section, skipping"); > + return 0; > + } > + > + if (list_empty(&file->mcount_loc_list)) > + return 0; > + > + idx = 0; > + list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) > + idx++; > + > + sec = elf_create_section(file->elf, "__mcount_loc", 0, sizeof(unsigned long), idx); > + if (!sec) > + return -1; > + > + reloc_sec = elf_create_reloc_section(file->elf, sec, SHT_RELA); > + if (!reloc_sec) > + return -1; > + > + idx = 0; > + list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) { > + > + loc = (unsigned long *)sec->data->d_buf + idx; > + memset(loc, 0, sizeof(unsigned long)); > + > + reloc = malloc(sizeof(*reloc)); > + if (!reloc) { > + perror("malloc"); > + return -1; > + } > + memset(reloc, 0, sizeof(*reloc)); calloc(1, sizeof(*reloc))? > + > + reloc->sym = insn->sec->sym; > + reloc->addend = insn->offset; > + reloc->type = R_X86_64_64; > + reloc->offset = idx * sizeof(unsigned long); > + reloc->sec = reloc_sec; > + elf_add_reloc(file->elf, reloc); > + > + idx++; > + } > + > + if (elf_rebuild_reloc_section(file->elf, reloc_sec)) > + return -1; > + > + return 0; > +} > + > /* > * Warnings shouldn't be reported for ignored functions. 
> */ > @@ -892,6 +951,22 @@ static int add_call_destinations(struct objtool_file *file) > insn->type = INSN_NOP; > } > > + if (mcount && !strcmp(insn->call_dest->name, "__fentry__")) { > + if (reloc) { > + reloc->type = R_NONE; > + elf_write_reloc(file->elf, reloc); > + } > + > + elf_write_insn(file->elf, insn->sec, > + insn->offset, insn->len, > + arch_nop_insn(insn->len)); > + > + insn->type = INSN_NOP; > + > + list_add_tail(&insn->mcount_loc_node, > + &file->mcount_loc_list); > + } > + > /* > * Whatever stack impact regular CALLs have, should be undone > * by the RETURN of the called function. > @@ -3004,6 +3079,7 @@ int check(const char *_objname, bool orc) > INIT_LIST_HEAD(&file.insn_list); > hash_init(file.insn_hash); > INIT_LIST_HEAD(&file.static_call_list); > + INIT_LIST_HEAD(&file.mcount_loc_list); > file.c_file = !vmlinux && find_section_by_name(file.elf, ".comment"); > file.ignore_unreachables = no_unreachable; > file.hints = false; > @@ -3056,6 +3132,11 @@ int check(const char *_objname, bool orc) > goto out; > warnings += ret; > > + ret = create_mcount_loc_sections(&file); > + if (ret < 0) > + goto out; > + warnings += ret; > + > if (orc) { > ret = create_orc(&file); > if (ret < 0) > diff --git a/tools/objtool/check.h b/tools/objtool/check.h > index cd95fca0d237..01f11b5da5dd 100644 > --- a/tools/objtool/check.h > +++ b/tools/objtool/check.h > @@ -24,6 +24,7 @@ struct instruction { > struct list_head list; > struct hlist_node hash; > struct list_head static_call_node; > + struct list_head mcount_loc_node; > struct section *sec; > unsigned long offset; > unsigned int len; > diff --git a/tools/objtool/objtool.h b/tools/objtool/objtool.h > index 9a7cd0b88bd8..f604b22d22cc 100644 > --- a/tools/objtool/objtool.h > +++ b/tools/objtool/objtool.h > @@ -17,6 +17,7 @@ struct objtool_file { > struct list_head insn_list; > DECLARE_HASHTABLE(insn_hash, 20); > struct list_head static_call_list; > + struct list_head mcount_loc_list; > bool ignore_unreachables, 
c_file, hints, rodata; > }; > -- Thanks, ~Nick Desaulniers _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool 2020-06-25 20:02 ` [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool Peter Zijlstra 2020-06-25 20:54 ` Nick Desaulniers @ 2020-06-25 22:40 ` Sami Tolvanen 2020-06-26 11:29 ` Peter Zijlstra 1 sibling, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-06-25 22:40 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Josh Poimboeuf, linux-pci, Will Deacon, linux-arm-kernel, mhelsley On Thu, Jun 25, 2020 at 10:02:35PM +0200, Peter Zijlstra wrote: > On Thu, Jun 25, 2020 at 09:15:03AM -0700, Sami Tolvanen wrote: > > On Thu, Jun 25, 2020 at 09:45:30AM +0200, Peter Zijlstra wrote: > > > > At least for x86_64 I can do a really quick take for a recordmcount pass > > > in objtool, but I suppose you also need this for ARM64 ? > > > > Sure, sounds good. arm64 uses -fpatchable-function-entry with clang, so we > > don't need recordmcount there. > > This is on top of my local pile: > > git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git master > > which notably includes the static_call series. > > Not boot tested, but it generates the required sections and they look > more or less as expected, ymmv. 
> > --- > arch/x86/Kconfig | 1 - > scripts/Makefile.build | 3 ++ > scripts/link-vmlinux.sh | 2 +- > tools/objtool/builtin-check.c | 9 ++--- > tools/objtool/builtin.h | 2 +- > tools/objtool/check.c | 81 +++++++++++++++++++++++++++++++++++++++++++ > tools/objtool/check.h | 1 + > tools/objtool/objtool.h | 1 + > 8 files changed, 91 insertions(+), 9 deletions(-) > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > index a291823f3f26..189575c12434 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -174,7 +174,6 @@ config X86 > select HAVE_EXIT_THREAD > select HAVE_FAST_GUP > select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE > - select HAVE_FTRACE_MCOUNT_RECORD > select HAVE_FUNCTION_GRAPH_TRACER > select HAVE_FUNCTION_TRACER > select HAVE_GCC_PLUGINS This breaks DYNAMIC_FTRACE according to kernel/trace/ftrace.c: #ifndef CONFIG_FTRACE_MCOUNT_RECORD # error Dynamic ftrace depends on MCOUNT_RECORD #endif And the build errors after that seem to confirm this. It looks like we might need another flag to skip recordmcount. Anyway, since objtool is run before recordmcount, I just left this unchanged for testing and ignored the recordmcount warnings about __mcount_loc already existing. Something is a bit off still though, I see this at boot: ------------[ ftrace bug ]------------ ftrace failed to modify [<ffffffff81000660>] __tracepoint_iter_initcall_level+0x0/0x40 actual: 0f:1f:44:00:00 Initializing ftrace call sites ftrace record flags: 0 (0) expected tramp: ffffffff81056500 ------------[ cut here ]------------ Otherwise, this looks pretty good. Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool 2020-06-25 22:40 ` Sami Tolvanen @ 2020-06-26 11:29 ` Peter Zijlstra 2020-06-26 11:42 ` Peter Zijlstra ` (2 more replies) 0 siblings, 3 replies; 212+ messages in thread From: Peter Zijlstra @ 2020-06-26 11:29 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Josh Poimboeuf, linux-pci, Will Deacon, linux-arm-kernel, mhelsley On Thu, Jun 25, 2020 at 03:40:42PM -0700, Sami Tolvanen wrote: > > Not boot tested, but it generates the required sections and they look > > more or less as expected, ymmv. > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > > index a291823f3f26..189575c12434 100644 > > --- a/arch/x86/Kconfig > > +++ b/arch/x86/Kconfig > > @@ -174,7 +174,6 @@ config X86 > > select HAVE_EXIT_THREAD > > select HAVE_FAST_GUP > > select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE > > - select HAVE_FTRACE_MCOUNT_RECORD > > select HAVE_FUNCTION_GRAPH_TRACER > > select HAVE_FUNCTION_TRACER > > select HAVE_GCC_PLUGINS > > This breaks DYNAMIC_FTRACE according to kernel/trace/ftrace.c: > > #ifndef CONFIG_FTRACE_MCOUNT_RECORD > # error Dynamic ftrace depends on MCOUNT_RECORD > #endif > > And the build errors after that seem to confirm this. It looks like we might > need another flag to skip recordmcount. Hurm, Steve, how you want to do that? > Anyway, since objtool is run before recordmcount, I just left this unchanged > for testing and ignored the recordmcount warnings about __mcount_loc already > existing. 
Something is a bit off still though, I see this at boot: > > ------------[ ftrace bug ]------------ > ftrace failed to modify > [<ffffffff81000660>] __tracepoint_iter_initcall_level+0x0/0x40 > actual: 0f:1f:44:00:00 > Initializing ftrace call sites > ftrace record flags: 0 > (0) > expected tramp: ffffffff81056500 > ------------[ cut here ]------------ > > Otherwise, this looks pretty good. Ha! it is trying to convert the "CALL __fentry__" into a NOP and not finding the CALL -- because objtool already made it a NOP... Weird, I thought recordmcount would also write NOPs, it certainly has code for that. I suppose we can use CC_USING_NOP_MCOUNT to avoid those, but I'd rather Steve explain this before I wreck things further.
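The failure described above can be reproduced in miniature. The sketch below is a user-space model, not kernel code: make_call() and verify_code() are hypothetical stand-ins for the kernel's call-construction and verify steps, and the two addresses are borrowed from the report purely for illustration. The "actual" bytes 0f:1f:44:00:00 are the 5-byte NOP, so comparing them against an expected "CALL rel32" cannot succeed.

```c
#include <stdint.h>
#include <string.h>

#define INSN_SIZE 5

/* 5-byte NOP, matching the "actual:" bytes in the report above. */
static const uint8_t nop5[INSN_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/*
 * Build "CALL rel32" as it would appear at ip when targeting addr.
 * The displacement is relative to the end of the instruction.
 * Assumes a little-endian host so the bytes match the x86 encoding.
 */
static void make_call(uint8_t out[INSN_SIZE], uint64_t ip, uint64_t addr)
{
	int32_t rel = (int32_t)(addr - (ip + INSN_SIZE));

	out[0] = 0xe8;
	memcpy(&out[1], &rel, sizeof(rel));
}

/* Simplified stand-in for the verify step: 0 on match, -1 otherwise. */
static int verify_code(const uint8_t *actual, const uint8_t *expected)
{
	return memcmp(actual, expected, INSN_SIZE) ? -1 : 0;
}
```

With objtool having already written the NOP, verify_code(nop5, expected_call) returns -1, which corresponds to the "ftrace failed to modify" report.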
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool 2020-06-26 11:29 ` Peter Zijlstra @ 2020-06-26 11:42 ` Peter Zijlstra 2020-07-17 17:28 ` Sami Tolvanen 2020-07-22 17:55 ` Steven Rostedt 2 siblings, 0 replies; 212+ messages in thread From: Peter Zijlstra @ 2020-06-26 11:42 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Josh Poimboeuf, linux-pci, Will Deacon, linux-arm-kernel, mhelsley On Fri, Jun 26, 2020 at 01:29:31PM +0200, Peter Zijlstra wrote: > On Thu, Jun 25, 2020 at 03:40:42PM -0700, Sami Tolvanen wrote: > > Anyway, since objtool is run before recordmcount, I just left this unchanged > > for testing and ignored the recordmcount warnings about __mcount_loc already > > existing. Something is a bit off still though, I see this at boot: > > > > ------------[ ftrace bug ]------------ > > ftrace failed to modify > > [<ffffffff81000660>] __tracepoint_iter_initcall_level+0x0/0x40 > > actual: 0f:1f:44:00:00 > > Initializing ftrace call sites > > ftrace record flags: 0 > > (0) > > expected tramp: ffffffff81056500 > > ------------[ cut here ]------------ > > > > Otherwise, this looks pretty good. > > Ha! it is trying to convert the "CALL __fentry__" into a NOP and not > finding the CALL -- because objtool already made it a NOP... > > Weird, I thought recordmcount would also write NOPs, it certainly has > code for that. I suppose we can use CC_USING_NOP_MCOUNT to avoid those, > but I'd rather Steve explain this before I wreck things further. Something like so would ignore whatever text is there and rewrite it with ideal_nop. 
--- diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index c84d28e90a58..98a6a93d7615 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -109,9 +109,11 @@ static int __ref ftrace_modify_code_direct(unsigned long ip, const char *old_code, const char *new_code) { - int ret = ftrace_verify_code(ip, old_code); - if (ret) - return ret; + if (old_code) { + int ret = ftrace_verify_code(ip, old_code); + if (ret) + return ret; + } /* replace the text with the new text */ if (ftrace_poke_late) @@ -124,9 +126,8 @@ ftrace_modify_code_direct(unsigned long ip, const char *old_code, int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, unsigned long addr) { unsigned long ip = rec->ip; - const char *new, *old; + const char *new; - old = ftrace_call_replace(ip, addr); new = ftrace_nop_replace(); /* @@ -138,7 +139,7 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, unsigned long ad * just modify the code directly. */ if (addr == MCOUNT_ADDR) - return ftrace_modify_code_direct(ip, old, new); + return ftrace_modify_code_direct(ip, NULL, new); /* * x86 overrides ftrace_replace_code -- this function will never be used _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
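The effect of passing NULL for old_code can be sketched in user space. modify_code() below is a hypothetical stand-in for ftrace_modify_code_direct(), assumed here to operate on an ordinary buffer instead of poking kernel text; it shows only the control flow the patch introduces: verify when an expected old instruction is given, otherwise overwrite unconditionally.

```c
#include <stdint.h>
#include <string.h>

#define INSN_SIZE 5

/*
 * Hypothetical model of the patched logic (not the kernel function):
 * if old_code is non-NULL, the current bytes must match it before the
 * write; if old_code is NULL, the check is skipped and new_code is
 * written over whatever is there.
 *
 * Returns 0 on success, -1 if verification fails.
 */
static int modify_code(uint8_t *text, const uint8_t *old_code,
		       const uint8_t *new_code)
{
	if (old_code && memcmp(text, old_code, INSN_SIZE))
		return -1;

	/* replace the text with the new text */
	memcpy(text, new_code, INSN_SIZE);
	return 0;
}
```

The trade-off is that the NULL path no longer detects an unexpected instruction at the site, which is presumably why the change is offered as "something like so" and not as a finished fix.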
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool 2020-06-26 11:29 ` Peter Zijlstra 2020-06-26 11:42 ` Peter Zijlstra @ 2020-07-17 17:28 ` Sami Tolvanen 2020-07-17 17:36 ` Steven Rostedt 2020-07-22 17:55 ` Steven Rostedt 2 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-07-17 17:28 UTC (permalink / raw) To: Steven Rostedt, Peter Zijlstra Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, LKML, clang-built-linux, Josh Poimboeuf, linux-pci, Will Deacon, linux-arm-kernel, Matt Helsley On Fri, Jun 26, 2020 at 4:29 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Thu, Jun 25, 2020 at 03:40:42PM -0700, Sami Tolvanen wrote: > > > > Not boot tested, but it generates the required sections and they look > > > more or less as expected, ymmv. > > > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > > > index a291823f3f26..189575c12434 100644 > > > --- a/arch/x86/Kconfig > > > +++ b/arch/x86/Kconfig > > > @@ -174,7 +174,6 @@ config X86 > > > select HAVE_EXIT_THREAD > > > select HAVE_FAST_GUP > > > select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE > > > - select HAVE_FTRACE_MCOUNT_RECORD > > > select HAVE_FUNCTION_GRAPH_TRACER > > > select HAVE_FUNCTION_TRACER > > > select HAVE_GCC_PLUGINS > > > > This breaks DYNAMIC_FTRACE according to kernel/trace/ftrace.c: > > > > #ifndef CONFIG_FTRACE_MCOUNT_RECORD > > # error Dynamic ftrace depends on MCOUNT_RECORD > > #endif > > > > And the build errors after that seem to confirm this. It looks like we might > > need another flag to skip recordmcount. > > Hurm, Steve, how you want to do that? Steven, did you have any thoughts about this? Moving recordmcount to an objtool pass that knows about call sites feels like a much cleaner solution than annotating kernel code to avoid unwanted relocations. 
Sami
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Steven Rostedt @ 2020-07-17 17:36 UTC
To: Sami Tolvanen
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening,
  Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild,
  Nick Desaulniers, LKML, clang-built-linux, Josh Poimboeuf, linux-pci,
  Will Deacon, linux-arm-kernel, Matt Helsley

On Fri, 17 Jul 2020 10:28:13 -0700
Sami Tolvanen <samitolvanen@google.com> wrote:

> On Fri, Jun 26, 2020 at 4:29 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Jun 25, 2020 at 03:40:42PM -0700, Sami Tolvanen wrote:
> >
> > > > Not boot tested, but it generates the required sections and they look
> > > > more or less as expected, ymmv.
> >
> > > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > > > index a291823f3f26..189575c12434 100644
> > > > --- a/arch/x86/Kconfig
> > > > +++ b/arch/x86/Kconfig
> > > > @@ -174,7 +174,6 @@ config X86
> > > >  	select HAVE_EXIT_THREAD
> > > >  	select HAVE_FAST_GUP
> > > >  	select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE
> > > > -	select HAVE_FTRACE_MCOUNT_RECORD
> > > >  	select HAVE_FUNCTION_GRAPH_TRACER
> > > >  	select HAVE_FUNCTION_TRACER
> > > >  	select HAVE_GCC_PLUGINS
> > >
> > > This breaks DYNAMIC_FTRACE according to kernel/trace/ftrace.c:
> > >
> > > #ifndef CONFIG_FTRACE_MCOUNT_RECORD
> > > # error Dynamic ftrace depends on MCOUNT_RECORD
> > > #endif
> > >
> > > And the build errors after that seem to confirm this. It looks like we might
> > > need another flag to skip recordmcount.
> >
> > Hurm, Steve, how you want to do that?
>
> Steven, did you have any thoughts about this? Moving recordmcount to
> an objtool pass that knows about call sites feels like a much cleaner
> solution than annotating kernel code to avoid unwanted relocations.
>

Bah, I started to reply to this then went to look for details, got
distracted, forgot about it, my laptop crashed (due to a zoom call),
and I lost the email I was writing (haven't looked in the drafts
folder, but my idea about this has changed since anyway).

So the problem is that we process mcount references in other areas and
that confuses the ftrace modification portion?

Someone just submitted a patch for arm64 for this:

https://lore.kernel.org/r/20200717143338.19302-1-gregory.herrero@oracle.com

Is that what you want?

-- Steve
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Sami Tolvanen @ 2020-07-17 17:47 UTC
To: Steven Rostedt
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening,
  Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild,
  Nick Desaulniers, LKML, clang-built-linux, Josh Poimboeuf, linux-pci,
  Will Deacon, linux-arm-kernel, Matt Helsley

On Fri, Jul 17, 2020 at 10:36 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 17 Jul 2020 10:28:13 -0700
> Sami Tolvanen <samitolvanen@google.com> wrote:
>
> > On Fri, Jun 26, 2020 at 4:29 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Thu, Jun 25, 2020 at 03:40:42PM -0700, Sami Tolvanen wrote:
> > >
> > > > > Not boot tested, but it generates the required sections and they look
> > > > > more or less as expected, ymmv.
> > >
> > > > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > > > > index a291823f3f26..189575c12434 100644
> > > > > --- a/arch/x86/Kconfig
> > > > > +++ b/arch/x86/Kconfig
> > > > > @@ -174,7 +174,6 @@ config X86
> > > > >  	select HAVE_EXIT_THREAD
> > > > >  	select HAVE_FAST_GUP
> > > > >  	select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE
> > > > > -	select HAVE_FTRACE_MCOUNT_RECORD
> > > > >  	select HAVE_FUNCTION_GRAPH_TRACER
> > > > >  	select HAVE_FUNCTION_TRACER
> > > > >  	select HAVE_GCC_PLUGINS
> > > >
> > > > This breaks DYNAMIC_FTRACE according to kernel/trace/ftrace.c:
> > > >
> > > > #ifndef CONFIG_FTRACE_MCOUNT_RECORD
> > > > # error Dynamic ftrace depends on MCOUNT_RECORD
> > > > #endif
> > > >
> > > > And the build errors after that seem to confirm this. It looks like we might
> > > > need another flag to skip recordmcount.
> > >
> > > Hurm, Steve, how you want to do that?
> >
> > Steven, did you have any thoughts about this? Moving recordmcount to
> > an objtool pass that knows about call sites feels like a much cleaner
> > solution than annotating kernel code to avoid unwanted relocations.
> >
>
> Bah, I started to reply to this then went to look for details, got
> distracted, forgot about it, my laptop crashed (due to a zoom call),
> and I lost the email I was writing (haven't looked in the drafts
> folder, but my idea about this has changed since anyway).
>
> So the problem is that we process mcount references in other areas and
> that confuses the ftrace modification portion?

Correct.

> Someone just submitted a patch for arm64 for this:
>
> https://lore.kernel.org/r/20200717143338.19302-1-gregory.herrero@oracle.com
>
> Is that what you want?

That looks like the same issue, but we need to fix this on x86 instead.

Sami
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Steven Rostedt @ 2020-07-17 18:05 UTC
To: Sami Tolvanen
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening,
  Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild,
  Nick Desaulniers, LKML, clang-built-linux, Josh Poimboeuf, linux-pci,
  Will Deacon, linux-arm-kernel, Matt Helsley

On Fri, 17 Jul 2020 10:47:51 -0700
Sami Tolvanen <samitolvanen@google.com> wrote:

> > Someone just submitted a patch for arm64 for this:
> >
> > https://lore.kernel.org/r/20200717143338.19302-1-gregory.herrero@oracle.com
> >
> > Is that what you want?
>
> That looks like the same issue, but we need to fix this on x86 instead.

Does x86 have a way to differentiate between the two that record mcount
can check?

-- Steve
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Sami Tolvanen @ 2020-07-20 16:52 UTC
To: Steven Rostedt
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening,
  Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild,
  Nick Desaulniers, LKML, clang-built-linux, Josh Poimboeuf, linux-pci,
  Will Deacon, linux-arm-kernel, Matt Helsley

On Fri, Jul 17, 2020 at 11:05 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 17 Jul 2020 10:47:51 -0700
> Sami Tolvanen <samitolvanen@google.com> wrote:
>
> > > Someone just submitted a patch for arm64 for this:
> > >
> > > https://lore.kernel.org/r/20200717143338.19302-1-gregory.herrero@oracle.com
> > >
> > > Is that what you want?
> >
> > That looks like the same issue, but we need to fix this on x86 instead.
>
> Does x86 have a way to differentiate between the two that record mcount
> can check?

I'm not sure if looking at the relocation alone is sufficient on x86;
we might also have to decode the instruction, which is what objtool
does. Did you have any thoughts on Peter's patch, or my initial
suggestion, which adds a __nomcount attribute to affected functions?

Sami
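As an illustration of the point above about relocations versus instruction decoding, here is a minimal sketch. It is not what objtool or recordmcount actually do. On x86-64 a near `CALL rel32` is the opcode byte 0xE8 followed by a 4-byte displacement, so a relocation that belongs to a call site has 0xE8 immediately before the relocated field, while a reference that merely takes the symbol's address does not. The function name `reloc_is_call_site` and the flat byte buffer are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: 'text' is the raw contents of a code section and
 * 'reloc_off' is the offset of a 4-byte relocated field inside it.
 * Returns true if the byte just before the field is the CALL rel32
 * opcode (0xE8).
 *
 * Simplification: a 0xE8 byte can also be the tail of some other
 * instruction, which is why a real decoder is more reliable than this
 * one-byte check.
 */
bool reloc_is_call_site(const uint8_t *text, size_t text_len, size_t reloc_off)
{
	assert(text != NULL);

	/* Need one opcode byte before the field and four bytes for it. */
	if (reloc_off < 1 || reloc_off > text_len || text_len - reloc_off < 4)
		return false;

	return text[reloc_off - 1] == 0xE8;
}
```

A reference such as `lea symbol(%rip), %rax` (bytes `48 8d 05` plus a displacement) would be rejected by this check, which is the kind of non-call reference the thread is worried about.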
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Steven Rostedt @ 2020-07-22 17:58 UTC
To: Sami Tolvanen
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening,
  Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild,
  Nick Desaulniers, LKML, clang-built-linux, Josh Poimboeuf, linux-pci,
  Will Deacon, linux-arm-kernel

On Mon, 20 Jul 2020 09:52:37 -0700
Sami Tolvanen <samitolvanen@google.com> wrote:

> > Does x86 have a way to differentiate between the two that record mcount
> > can check?
>
> I'm not sure if looking at the relocation alone is sufficient on x86,
> we might also have to decode the instruction, which is what objtool
> does. Did you have any thoughts on Peter's patch, or my initial
> suggestion, which adds a __nomcount attribute to affected functions?

There's a lot of code in this thread. Can you give me the message-id of
Peter's patch in question?

Thanks,

-- Steve
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Sami Tolvanen @ 2020-07-22 18:07 UTC
To: Steven Rostedt
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening,
  Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild,
  Nick Desaulniers, LKML, clang-built-linux, Josh Poimboeuf, linux-pci,
  Will Deacon, linux-arm-kernel

On Wed, Jul 22, 2020 at 10:58 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Mon, 20 Jul 2020 09:52:37 -0700
> Sami Tolvanen <samitolvanen@google.com> wrote:
>
> > > Does x86 have a way to differentiate between the two that record mcount
> > > can check?
> >
> > I'm not sure if looking at the relocation alone is sufficient on x86,
> > we might also have to decode the instruction, which is what objtool
> > does. Did you have any thoughts on Peter's patch, or my initial
> > suggestion, which adds a __nomcount attribute to affected functions?
>
> There's a lot of code in this thread. Can you give me the message-id of
> Peter's patch in question.

Sure, I was referring to the objtool patch in this message:

https://lore.kernel.org/lkml/20200625200235.GQ4781@hirez.programming.kicks-ass.net/

Sami
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Steven Rostedt @ 2020-07-22 17:55 UTC
To: Peter Zijlstra
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
  Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
  linux-kernel, clang-built-linux, Josh Poimboeuf, Sami Tolvanen,
  linux-pci, Will Deacon, linux-arm-kernel, mhelsley

On Fri, 26 Jun 2020 13:29:31 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, Jun 25, 2020 at 03:40:42PM -0700, Sami Tolvanen wrote:
>
> > > Not boot tested, but it generates the required sections and they look
> > > more or less as expected, ymmv.
>
> > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > > index a291823f3f26..189575c12434 100644
> > > --- a/arch/x86/Kconfig
> > > +++ b/arch/x86/Kconfig
> > > @@ -174,7 +174,6 @@ config X86
> > >  	select HAVE_EXIT_THREAD
> > >  	select HAVE_FAST_GUP
> > >  	select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE
> > > -	select HAVE_FTRACE_MCOUNT_RECORD
> > >  	select HAVE_FUNCTION_GRAPH_TRACER
> > >  	select HAVE_FUNCTION_TRACER
> > >  	select HAVE_GCC_PLUGINS
> >
> > This breaks DYNAMIC_FTRACE according to kernel/trace/ftrace.c:
> >
> > #ifndef CONFIG_FTRACE_MCOUNT_RECORD
> > # error Dynamic ftrace depends on MCOUNT_RECORD
> > #endif
> >
> > And the build errors after that seem to confirm this. It looks like we might
> > need another flag to skip recordmcount.
>
> Hurm, Steve, how you want to do that?

That check was added when we removed that dangerous daemon that did the
updates, to make sure the daemon didn't come back. We can probably just
get rid of it.

>
> > Anyway, since objtool is run before recordmcount, I just left this unchanged
> > for testing and ignored the recordmcount warnings about __mcount_loc already
> > existing. Something is a bit off still though, I see this at boot:
> >
> > ------------[ ftrace bug ]------------
> > ftrace failed to modify
> > [<ffffffff81000660>] __tracepoint_iter_initcall_level+0x0/0x40
> > actual: 0f:1f:44:00:00
> > Initializing ftrace call sites
> > ftrace record flags: 0
> > (0)
> > expected tramp: ffffffff81056500
> > ------------[ cut here ]------------
> >
> > Otherwise, this looks pretty good.
>
> Ha! it is trying to convert the "CALL __fentry__" into a NOP and not
> finding the CALL -- because objtool already made it a NOP...
>
> Weird, I thought recordmcount would also write NOPs, it certainly has
> code for that. I suppose we can use CC_USING_NOP_MCOUNT to avoid those,
> but I'd rather Steve explain this before I wreck things further.

The reason for not having recordmcount insert all the nops is that x86
has more than one optimal nop, and which one is optimal is determined
by the machine it runs on, not at compile time. So we figured we'd
just update it then.

We can change it to be a nop on boot, and just modify it if it's not
the optimal nop already.

That said, Andi Kleen added an option to gcc called -mnop-mcount which
will have gcc both create the mcount section and convert the calls
into nops. When doing so, it defines CC_USING_NOP_MCOUNT which will
tell ftrace to expect the calls to already be converted.

-- Steve
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Peter Zijlstra @ 2020-07-22 18:41 UTC
To: Steven Rostedt
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
  Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
  linux-kernel, clang-built-linux, Josh Poimboeuf, Sami Tolvanen,
  linux-pci, Will Deacon, linux-arm-kernel, mhelsley

On Wed, Jul 22, 2020 at 01:55:42PM -0400, Steven Rostedt wrote:

> > Ha! it is trying to convert the "CALL __fentry__" into a NOP and not
> > finding the CALL -- because objtool already made it a NOP...
> >
> > Weird, I thought recordmcount would also write NOPs, it certainly has
> > code for that. I suppose we can use CC_USING_NOP_MCOUNT to avoid those,
> > but I'd rather Steve explain this before I wreck things further.
>
> The reason for not having recordmcount insert all the nops, is because
> x86 has more than one optimal nop which is determined by the machine it
> runs on, and not at compile time. So we figured just updated it then.
>
> We can change it to be a nop on boot, and just modify it if it's not
> the optimal nop already.

Right, I thought that's what we'd be doing already, anyway:

> That said, Andi Kleen added an option to gcc called -mnop-mcount which
> will have gcc do both create the mcount section and convert the calls
> into nops. When doing so, it defines CC_USING_NOP_MCOUNT which will
> tell ftrace to expect the calls to already be converted.

That seems like the much easier solution, then we can forget about
recordmcount / objtool entirely for this.
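The boot-time NOP handling discussed in the two messages above can be modelled in user space. This is a rough sketch under stated assumptions, not ftrace's real code: `patch_site_at_boot`, the `site_result` names and the `expect_nop` flag are all made up for illustration. Only the byte patterns are real encodings: 0xE8 starts a 5-byte `CALL rel32`, and `0f 1f 44 00 00` is the 5-byte NOP that shows up as "actual" in the ftrace bug report quoted in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SITE_LEN 5

/* Two valid 5-byte x86 NOP encodings; which one is preferred depends on the CPU. */
static const uint8_t nop5_nopl[SITE_LEN]     = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
static const uint8_t nop5_prefixed[SITE_LEN] = { 0x66, 0x66, 0x66, 0x66, 0x90 };

enum site_result { SITE_PATCHED, SITE_ALREADY_IDEAL, SITE_UNEXPECTED };

/*
 * Hypothetical model of the boot-time pass over one recorded call site.
 *
 * expect_nop == 0: the site must still hold a CALL (opcode 0xE8).
 * expect_nop != 0: the build already wrote a NOP there (the
 *                  CC_USING_NOP_MCOUNT case), so a known NOP is expected.
 *
 * Anything else is reported as unexpected, which is the situation behind
 * the "ftrace failed to modify" message.
 */
enum site_result patch_site_at_boot(uint8_t site[SITE_LEN],
				    const uint8_t ideal[SITE_LEN],
				    int expect_nop)
{
	assert(site != NULL && ideal != NULL);

	if (expect_nop) {
		if (memcmp(site, ideal, SITE_LEN) == 0)
			return SITE_ALREADY_IDEAL;
		if (memcmp(site, nop5_nopl, SITE_LEN) != 0 &&
		    memcmp(site, nop5_prefixed, SITE_LEN) != 0)
			return SITE_UNEXPECTED;
	} else if (site[0] != 0xE8) {
		return SITE_UNEXPECTED;
	}

	/* Replace the CALL (or the non-ideal NOP) with the ideal NOP. */
	memcpy(site, ideal, SITE_LEN);
	return SITE_PATCHED;
}
```

With `expect_nop` set to 0, a site that a build tool has already turned into a NOP is rejected, which mirrors the failure Sami saw; with it set to 1, the same site is accepted and only rewritten if it is not already the preferred NOP.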
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Steven Rostedt @ 2020-07-22 19:09 UTC
To: Peter Zijlstra
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
  Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
  linux-kernel, clang-built-linux, Josh Poimboeuf, Sami Tolvanen,
  linux-pci, Will Deacon, linux-arm-kernel

On Wed, 22 Jul 2020 20:41:37 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> > That said, Andi Kleen added an option to gcc called -mnop-mcount which
> > will have gcc do both create the mcount section and convert the calls
> > into nops. When doing so, it defines CC_USING_NOP_MCOUNT which will
> > tell ftrace to expect the calls to already be converted.
>
> That seems like the much easier solution, then we can forget about
> recordmcount / objtool entirely for this.

Of course that was only for some gcc compilers, and I'm not sure if
clang can do this.

Or do you just see all compilers doing this in the future, and not
worrying about record-mcount at all, and bothering with objtool?

-- Steve
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Sami Tolvanen @ 2020-07-22 20:03 UTC
To: Steven Rostedt
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening,
  Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild,
  Nick Desaulniers, LKML, clang-built-linux, Josh Poimboeuf, linux-pci,
  Will Deacon, linux-arm-kernel

On Wed, Jul 22, 2020 at 12:09 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed, 22 Jul 2020 20:41:37 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
>
> > > That said, Andi Kleen added an option to gcc called -mnop-mcount which
> > > will have gcc do both create the mcount section and convert the calls
> > > into nops. When doing so, it defines CC_USING_NOP_MCOUNT which will
> > > tell ftrace to expect the calls to already be converted.
> >
> > That seems like the much easier solution, then we can forget about
> > recordmcount / objtool entirely for this.
>
> Of course that was only for some gcc compilers, and I'm not sure if
> clang can do this.
>
> Or do you just see all compilers doing this in the future, and not
> worrying about record-mcount at all, and bothering with objtool?

Clang appears to only support -mrecord-mcount and -mnop-mcount for
s390, so we still need recordmcount / objtool for x86.

Sami
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Peter Zijlstra @ 2020-07-22 23:56 UTC
To: Steven Rostedt
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
  Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
  linux-kernel, clang-built-linux, Josh Poimboeuf, Sami Tolvanen,
  linux-pci, Will Deacon, linux-arm-kernel

On Wed, Jul 22, 2020 at 03:09:43PM -0400, Steven Rostedt wrote:
> On Wed, 22 Jul 2020 20:41:37 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
>
> > > That said, Andi Kleen added an option to gcc called -mnop-mcount which
> > > will have gcc do both create the mcount section and convert the calls
> > > into nops. When doing so, it defines CC_USING_NOP_MCOUNT which will
> > > tell ftrace to expect the calls to already be converted.
> >
> > That seems like the much easier solution, then we can forget about
> > recordmcount / objtool entirely for this.
>
> Of course that was only for some gcc compilers, and I'm not sure if
> clang can do this.
>
> Or do you just see all compilers doing this in the future, and not
> worrying about record-mcount at all, and bothering with objtool?

I got the GCC version wrong :/ Both -mnop-mcount and -mrecord-mcount
landed in GCC-5, whereas our minimum GCC is now at 4.9.

Anyway, what do you prefer? I suppose I can make objtool do whatever we
need; that patch is trivial. Simply recording the sites and not
rewriting them should be simple enough.
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Steven Rostedt @ 2020-07-23 0:06 UTC
To: Peter Zijlstra
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
  Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
  linux-kernel, clang-built-linux, Josh Poimboeuf, Sami Tolvanen,
  linux-pci, Will Deacon, linux-arm-kernel

On Thu, 23 Jul 2020 01:56:20 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> Anyway, what do you prefer, I suppose I can make objtool whatever we
> need, that patch is trivial. Simply recording the sites and not
> rewriting them should be simple enough.

Either way. If objtool turns it into nops, just make it so that
-DCC_USING_NOP_MCOUNT is set, and the kernel will be unaware.

Or if you just add the locations, then that would work too.

-- Steve
* Re: [RFC][PATCH] objtool,x86_64: Replace recordmcount with objtool
From: Sami Tolvanen @ 2020-08-06 22:09 UTC
To: Steven Rostedt
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
  Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild,
  Nick Desaulniers, linux-kernel, clang-built-linux, Josh Poimboeuf,
  linux-pci, Will Deacon, linux-arm-kernel

On Wed, Jul 22, 2020 at 08:06:08PM -0400, Steven Rostedt wrote:
> On Thu, 23 Jul 2020 01:56:20 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
>
> > Anyway, what do you prefer, I suppose I can make objtool whatever we
> > need, that patch is trivial. Simply recording the sites and not
> > rewriting them should be simple enough.
>
> Either way. If objtool turns it into nops, just make it where we can
> enable -DCC_USING_NOP_MCOUNT set, and the kernel will be unaware.
>
> Or if you just add the locations, then that would work too.

I took Peter's earlier patch, rebased it on top of the current mainline
tree for easier testing, and tweaked the makefiles to only use objtool
--mcount when CONFIG_STACK_VALIDATION is enabled and the compiler
supports -mfentry. This works for me with both gcc and clang. Thoughts?

Sami

---
 Makefile                      | 38 ++++++++++++----
 arch/x86/Kconfig              |  1 +
 kernel/trace/Kconfig          |  5 +++
 scripts/Makefile.build        |  9 ++--
 tools/objtool/builtin-check.c |  3 +-
 tools/objtool/builtin.h       |  2 +-
 tools/objtool/check.c         | 83 +++++++++++++++++++++++++++++++++++
 tools/objtool/check.h         |  1 +
 tools/objtool/objtool.h       |  1 +
 9 files changed, 129 insertions(+), 14 deletions(-)

diff --git a/Makefile b/Makefile
index 5cfc3481207f..2d23b6b6c4c9 100644
--- a/Makefile
+++ b/Makefile
@@ -864,17 +864,34 @@ ifdef CONFIG_HAVE_FENTRY
   ifeq ($(call cc-option-yn, -mfentry),y)
     CC_FLAGS_FTRACE += -mfentry
     CC_FLAGS_USING += -DCC_USING_FENTRY
+    export CC_USING_FENTRY := 1
   endif
 endif
 export CC_FLAGS_FTRACE
-KBUILD_CFLAGS += $(CC_FLAGS_FTRACE) $(CC_FLAGS_USING)
-KBUILD_AFLAGS += $(CC_FLAGS_USING)
 ifdef CONFIG_DYNAMIC_FTRACE
-	ifdef CONFIG_HAVE_C_RECORDMCOUNT
-		BUILD_C_RECORDMCOUNT := y
-		export BUILD_C_RECORDMCOUNT
-	endif
+  ifndef CC_USING_RECORD_MCOUNT
+  ifndef CC_USING_PATCHABLE_FUNCTION_ENTRY
+    # use objtool or recordmcount to generate mcount tables
+    ifdef CONFIG_HAVE_OBJTOOL_MCOUNT
+      ifdef CC_USING_FENTRY
+        USE_OBJTOOL_MCOUNT := y
+        CC_FLAGS_USING += -DCC_USING_NOP_MCOUNT
+        export USE_OBJTOOL_MCOUNT
+      endif
+    endif
+    ifndef USE_OBJTOOL_MCOUNT
+      USE_RECORDMCOUNT := y
+      export USE_RECORDMCOUNT
+      ifdef CONFIG_HAVE_C_RECORDMCOUNT
+        BUILD_C_RECORDMCOUNT := y
+        export BUILD_C_RECORDMCOUNT
+      endif
+    endif
+  endif
+  endif
 endif
+KBUILD_CFLAGS += $(CC_FLAGS_FTRACE) $(CC_FLAGS_USING)
+KBUILD_AFLAGS += $(CC_FLAGS_USING)
 endif

 # We trigger additional mismatches with less inlining
@@ -1211,11 +1228,16 @@ uapi-asm-generic:
 PHONY += prepare-objtool prepare-resolve_btfids
 prepare-objtool: $(objtool_target)
 ifeq ($(SKIP_STACK_VALIDATION),1)
+objtool-lib-prompt := "please install libelf-dev, libelf-devel or elfutils-libelf-devel"
+ifdef USE_OBJTOOL_MCOUNT
+	@echo "error: Cannot generate __mcount_loc for CONFIG_DYNAMIC_FTRACE=y, $(objtool-lib-prompt)" >&2
+	@false
+endif
 ifdef CONFIG_UNWINDER_ORC
-	@echo "error: Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel" >&2
+	@echo "error: Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, $(objtool-lib-prompt)" >&2
 	@false
 else
-	@echo "warning: Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel" >&2
+	@echo "warning: Cannot use CONFIG_STACK_VALIDATION=y, $(objtool-lib-prompt)" >&2
 endif
 endif

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9a2849527dd7..149c94a44cf0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -163,6 +163,7 @@ config X86
 	select HAVE_CMPXCHG_LOCAL
 	select HAVE_CONTEXT_TRACKING if X86_64
 	select HAVE_C_RECORDMCOUNT
+	select HAVE_OBJTOOL_MCOUNT if STACK_VALIDATION
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index a4020c0b4508..b510af5b216c 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -56,6 +56,11 @@ config HAVE_C_RECORDMCOUNT
 	help
 	  C version of recordmcount available?

+config HAVE_OBJTOOL_MCOUNT
+	bool
+	help
+	  Arch supports objtool --mcount
+
 config TRACER_MAX_TRACE
 	bool

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 2e8810b7e5ed..f66f8c0ef294 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -175,8 +175,7 @@ cmd_modversions_c = \
 	fi
 endif

-ifdef CONFIG_FTRACE_MCOUNT_RECORD
-ifndef CC_USING_RECORD_MCOUNT
+ifdef USE_RECORDMCOUNT
 # compiler will not generate __mcount_loc use recordmcount or recordmcount.pl
 ifdef BUILD_C_RECORDMCOUNT
 ifeq ("$(origin RECORDMCOUNT_WARN)", "command line")
@@ -203,8 +202,7 @@ recordmcount_source := $(srctree)/scripts/recordmcount.pl
 endif # BUILD_C_RECORDMCOUNT
 cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)), \
 	$(sub_cmd_record_mcount))
-endif # CC_USING_RECORD_MCOUNT
-endif # CONFIG_FTRACE_MCOUNT_RECORD
+endif # USE_RECORDMCOUNT

 ifdef CONFIG_STACK_VALIDATION
 ifneq ($(SKIP_STACK_VALIDATION),1)
@@ -227,6 +225,9 @@ endif
 ifdef CONFIG_X86_SMAP
   objtool_args += --uaccess
 endif
+ifdef USE_OBJTOOL_MCOUNT
+  objtool_args += --mcount
+endif

 # 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory
 # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file
diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 7a44174967b5..71595cf4946d 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -18,7 +18,7 @@
 #include "builtin.h"
 #include "objtool.h"

-bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux;
+bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount;

 static const char * const check_usage[] = {
 	"objtool check [<options>] file.o",
@@ -35,6 +35,7 @@ const struct option check_options[] = {
 	OPT_BOOLEAN('s', "stats", &stats, "print statistics"),
 	OPT_BOOLEAN('d', "duplicate", &validate_dup, "duplicate validation for vmlinux.o"),
 	OPT_BOOLEAN('l', "vmlinux", &vmlinux, "vmlinux.o validation"),
+	OPT_BOOLEAN('M', "mcount", &mcount, "generate __mcount_loc"),
 	OPT_END(),
 };

diff --git a/tools/objtool/builtin.h b/tools/objtool/builtin.h
index 85c979caa367..94565a72b701 100644
--- a/tools/objtool/builtin.h
+++ b/tools/objtool/builtin.h
@@ -8,7 +8,7 @@
 #include <subcmd/parse-options.h>

 extern const struct option check_options[];
-extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux;
+extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount;

 extern int cmd_check(int argc, const char **argv);
 extern int cmd_orc(int argc, const char **argv);
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index e034a8f24f46..6e0b478dc065 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -433,6 +433,65 @@ static int add_dead_ends(struct objtool_file *file)
 	return 0;
 }

+static int create_mcount_loc_sections(struct objtool_file *file)
+{
+	struct section *sec, *reloc_sec;
+	struct reloc *reloc;
+	unsigned long *loc;
+	struct instruction *insn;
+	int idx;
+
+	sec = find_section_by_name(file->elf, "__mcount_loc");
+	if (sec) {
+		INIT_LIST_HEAD(&file->mcount_loc_list);
+		WARN("file already has __mcount_loc section, skipping");
+		return 0;
+	}
+
+	if (list_empty(&file->mcount_loc_list))
+		return 0;
+
+	idx = 0;
+	list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node)
+		idx++;
+
+	sec = elf_create_section(file->elf, "__mcount_loc", sizeof(unsigned long), idx);
+	if (!sec)
+		return -1;
+
+	reloc_sec = elf_create_reloc_section(file->elf, sec, SHT_RELA);
+	if (!reloc_sec)
+		return -1;
+
+	idx = 0;
+	list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) {
+
+		loc = (unsigned long *)sec->data->d_buf + idx;
+		memset(loc, 0, sizeof(unsigned long));
+
+		reloc = malloc(sizeof(*reloc));
+		if (!reloc) {
+			perror("malloc");
+			return -1;
+		}
+		memset(reloc, 0, sizeof(*reloc));
+
+		reloc->sym = insn->sec->sym;
+		reloc->addend = insn->offset;
+		reloc->type = R_X86_64_64;
+		reloc->offset = idx * sizeof(unsigned long);
+		reloc->sec = reloc_sec;
+		elf_add_reloc(file->elf, reloc);
+
+		idx++;
+	}
+
+	if (elf_rebuild_reloc_section(file->elf, reloc_sec))
+		return -1;
+
+	return 0;
+}
+
 /*
  * Warnings shouldn't be reported for ignored functions.
  */
@@ -784,6 +843,22 @@ static int add_call_destinations(struct objtool_file *file)
 			insn->type = INSN_NOP;
 		}

+		if (mcount && !strcmp(insn->call_dest->name, "__fentry__")) {
+			if (reloc) {
+				reloc->type = R_NONE;
+				elf_write_reloc(file->elf, reloc);
+			}
+
+			elf_write_insn(file->elf, insn->sec,
+				       insn->offset, insn->len,
+				       arch_nop_insn(insn->len));
+
+			insn->type = INSN_NOP;
+
+			list_add_tail(&insn->mcount_loc_node,
+				      &file->mcount_loc_list);
+		}
+
 		/*
 		 * Whatever stack impact regular CALLs have, should be undone
 		 * by the RETURN of the called function.
@@ -2791,6 +2866,7 @@ int check(const char *_objname, bool orc)

 	INIT_LIST_HEAD(&file.insn_list);
 	hash_init(file.insn_hash);
+	INIT_LIST_HEAD(&file.mcount_loc_list);
 	file.c_file = !vmlinux && find_section_by_name(file.elf, ".comment");
 	file.ignore_unreachables = no_unreachable;
 	file.hints = false;
@@ -2838,6 +2914,13 @@ int check(const char *_objname, bool orc)
 		warnings += ret;
 	}

+	if (mcount) {
+		ret = create_mcount_loc_sections(&file);
+		if (ret < 0)
+			goto out;
+		warnings += ret;
+	}
+
 	if (orc) {
 		ret = create_orc(&file);
 		if (ret < 0)
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index 061aa96e15d3..b62afd3d970b 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -22,6 +22,7 @@ struct insn_state {
 struct instruction {
 	struct list_head list;
 	struct hlist_node hash;
+	struct list_head mcount_loc_node;
 	struct section *sec;
 	unsigned long offset;
 	unsigned int len;
diff --git a/tools/objtool/objtool.h b/tools/objtool/objtool.h
index 528028a66816..427806079540 100644
--- a/tools/objtool/objtool.h
+++ b/tools/objtool/objtool.h
@@ -16,6 +16,7 @@ struct objtool_file {
 	struct elf *elf;
 	struct list_head insn_list;
 	DECLARE_HASHTABLE(insn_hash, 20);
+	struct list_head mcount_loc_list;
 	bool ignore_unreachables, c_file, hints, rodata;
 };
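To make the data layout concrete, here is a simplified user-space model of what `create_mcount_loc_sections()` in the patch above produces: one pointer-sized slot per recorded call site, each paired with a relocation whose `offset` selects the slot and whose `addend` is the call site's offset within its section. The names `struct site`, `struct slot_reloc` and `build_mcount_relocs` are invented for this sketch and there is no ELF handling; it only mirrors the indexing arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a recorded "call __fentry__" instruction. */
struct site {
	int sec_index;          /* which section the instruction lives in */
	unsigned long offset;   /* offset of the instruction in that section */
};

/* Hypothetical stand-in for one RELA entry that targets __mcount_loc. */
struct slot_reloc {
	unsigned long offset;   /* byte offset of the slot inside __mcount_loc */
	int sym_sec_index;      /* section symbol the slot is relative to */
	unsigned long addend;   /* instruction offset added to that symbol */
};

/*
 * Build one relocation per call site, in list order.
 * Returns a calloc'd array of 'nr' entries (the caller frees it), or NULL
 * when there are no sites or the allocation fails.
 */
struct slot_reloc *build_mcount_relocs(const struct site *sites, size_t nr)
{
	struct slot_reloc *relocs;
	size_t idx;

	if (nr == 0)
		return NULL;
	assert(sites != NULL);

	relocs = calloc(nr, sizeof(*relocs));
	if (!relocs)
		return NULL;

	for (idx = 0; idx < nr; idx++) {
		/* Slot idx starts idx pointer-sized words into the table. */
		relocs[idx].offset = idx * sizeof(unsigned long);
		relocs[idx].sym_sec_index = sites[idx].sec_index;
		relocs[idx].addend = sites[idx].offset;
	}
	return relocs;
}
```

In the patch the slots themselves are zero-filled and the final addresses only appear once the linker resolves these relocations, which is why the sketch returns relocation records rather than addresses.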
* [PATCH 05/22] kbuild: lto: postpone objtool
  2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen
                   ` (3 preceding siblings ...)
  2020-06-24 20:31 ` [PATCH 04/22] kbuild: lto: fix recordmcount Sami Tolvanen
@ 2020-06-24 20:31 ` Sami Tolvanen
  2020-06-24 21:19   ` Peter Zijlstra
  2020-06-24 20:31 ` [PATCH 06/22] kbuild: lto: limit inlining Sami Tolvanen
                   ` (20 subsequent siblings)
  25 siblings, 1 reply; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC
To: Masahiro Yamada, Will Deacon
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

With LTO, LLVM bitcode won't be compiled into native code until
modpost_link, or modfinal for modules. This change postpones calls
to objtool until after these steps.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 include/linux/compiler.h  |  2 +-
 lib/Kconfig.debug         |  2 +-
 scripts/Makefile.build    |  2 ++
 scripts/Makefile.modfinal | 15 +++++++++++++++
 4 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 30827f82ad62..12b115152532 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -120,7 +120,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 /* Annotate a C jump table to allow objtool to follow the code flow */
 #define __annotate_jump_table __section(.rodata..c_jump_table)
 
-#ifdef CONFIG_DEBUG_ENTRY
+#if defined(CONFIG_DEBUG_ENTRY) || defined(CONFIG_LTO_CLANG)
 /* Begin/end of an instrumentation safe region */
 #define instrumentation_begin() ({ \
 	asm volatile("%c0:\n\t" \
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 9ad9210d70a1..9fdba71c135a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -399,7 +399,7 @@ config STACK_VALIDATION
 
 config VMLINUX_VALIDATION
 	bool
-	depends on STACK_VALIDATION && DEBUG_ENTRY && !PARAVIRT
+	depends on STACK_VALIDATION && (DEBUG_ENTRY || LTO_CLANG) && !PARAVIRT
 	default y
 
 config DEBUG_FORCE_WEAK_PER_CPU
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 64e99f4baa5b..82977350f5a6 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -226,6 +226,7 @@ endif # CC_USING_PATCHABLE_FUNCTION_ENTRY
 endif # CONFIG_FTRACE_MCOUNT_RECORD
 
 ifdef CONFIG_STACK_VALIDATION
+ifndef CONFIG_LTO_CLANG
 ifneq ($(SKIP_STACK_VALIDATION),1)
 
 __objtool_obj := $(objtree)/tools/objtool/objtool
@@ -258,6 +259,7 @@ objtool_obj = $(if $(patsubst y%,, \
 	$(__objtool_obj))
 
 endif # SKIP_STACK_VALIDATION
+endif # CONFIG_LTO_CLANG
 endif # CONFIG_STACK_VALIDATION
 
 # Rebuild all objects when objtool changes, or is enabled/disabled.
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index d168f0cfe67c..9f1df2f1fab5 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -48,6 +48,21 @@ endif # CC_USING_PATCHABLE_FUNCTION_ENTRY
 endif # CC_USING_RECORD_MCOUNT
 endif # CONFIG_FTRACE_MCOUNT_RECORD
 
+ifdef CONFIG_STACK_VALIDATION
+ifneq ($(SKIP_STACK_VALIDATION),1)
+cmd_ld_ko_o += \
+	$(objtree)/tools/objtool/objtool \
+		$(if $(CONFIG_UNWINDER_ORC),orc generate,check) \
+		--module \
+		$(if $(CONFIG_FRAME_POINTER),,--no-fp) \
+		$(if $(CONFIG_GCOV_KERNEL),--no-unreachable,) \
+		$(if $(CONFIG_RETPOLINE),--retpoline,) \
+		$(if $(CONFIG_X86_SMAP),--uaccess,) \
+		$(@:.ko=$(prelink-ext).o);
+
+endif # SKIP_STACK_VALIDATION
+endif # CONFIG_STACK_VALIDATION
+
 endif # CONFIG_LTO_CLANG
 
 quiet_cmd_ld_ko_o = LD [M]  $@
-- 
2.27.0.212.ge8ba1cc988-goog

* Re: [PATCH 05/22] kbuild: lto: postpone objtool
  2020-06-24 20:31 ` [PATCH 05/22] kbuild: lto: postpone objtool Sami Tolvanen
@ 2020-06-24 21:19   ` Peter Zijlstra
  2020-06-24 21:49     ` Sami Tolvanen
  0 siblings, 1 reply; 212+ messages in thread
From: Peter Zijlstra @ 2020-06-24 21:19 UTC
To: Sami Tolvanen
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
	linux-kernel, clang-built-linux, linux-pci, Will Deacon,
	linux-arm-kernel

On Wed, Jun 24, 2020 at 01:31:43PM -0700, Sami Tolvanen wrote:

> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 30827f82ad62..12b115152532 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -120,7 +120,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>  /* Annotate a C jump table to allow objtool to follow the code flow */
>  #define __annotate_jump_table __section(.rodata..c_jump_table)
>
> -#ifdef CONFIG_DEBUG_ENTRY
> +#if defined(CONFIG_DEBUG_ENTRY) || defined(CONFIG_LTO_CLANG)
>  /* Begin/end of an instrumentation safe region */
>  #define instrumentation_begin() ({ \
>  	asm volatile("%c0:\n\t" \

Why would you be doing noinstr validation for lto builds? That doesn't
make sense.

> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 9ad9210d70a1..9fdba71c135a 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -399,7 +399,7 @@ config STACK_VALIDATION
>
>  config VMLINUX_VALIDATION
>  	bool
> -	depends on STACK_VALIDATION && DEBUG_ENTRY && !PARAVIRT
> +	depends on STACK_VALIDATION && (DEBUG_ENTRY || LTO_CLANG) && !PARAVIRT
>  	default y
>

For that very same reason you shouldn't be excluding paravirt either.

> diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
> index d168f0cfe67c..9f1df2f1fab5 100644
> --- a/scripts/Makefile.modfinal
> +++ b/scripts/Makefile.modfinal
> @@ -48,6 +48,21 @@ endif # CC_USING_PATCHABLE_FUNCTION_ENTRY
>  endif # CC_USING_RECORD_MCOUNT
>  endif # CONFIG_FTRACE_MCOUNT_RECORD
>
> +ifdef CONFIG_STACK_VALIDATION
> +ifneq ($(SKIP_STACK_VALIDATION),1)
> +cmd_ld_ko_o += \
> +	$(objtree)/tools/objtool/objtool \
> +		$(if $(CONFIG_UNWINDER_ORC),orc generate,check) \
> +		--module \
> +		$(if $(CONFIG_FRAME_POINTER),,--no-fp) \
> +		$(if $(CONFIG_GCOV_KERNEL),--no-unreachable,) \
> +		$(if $(CONFIG_RETPOLINE),--retpoline,) \
> +		$(if $(CONFIG_X86_SMAP),--uaccess,) \
> +		$(@:.ko=$(prelink-ext).o);
> +
> +endif # SKIP_STACK_VALIDATION
> +endif # CONFIG_STACK_VALIDATION

What about the objtool invocation from link-vmlinux.sh ?

* Re: [PATCH 05/22] kbuild: lto: postpone objtool
  2020-06-24 21:19   ` Peter Zijlstra
@ 2020-06-24 21:49     ` Sami Tolvanen
  2020-06-25  7:47       ` Peter Zijlstra
  0 siblings, 1 reply; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 21:49 UTC
To: Peter Zijlstra
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
	linux-kernel, clang-built-linux, linux-pci, Will Deacon,
	linux-arm-kernel

On Wed, Jun 24, 2020 at 11:19:08PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 24, 2020 at 01:31:43PM -0700, Sami Tolvanen wrote:
> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > index 30827f82ad62..12b115152532 100644
> > --- a/include/linux/compiler.h
> > +++ b/include/linux/compiler.h
> > @@ -120,7 +120,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> >  /* Annotate a C jump table to allow objtool to follow the code flow */
> >  #define __annotate_jump_table __section(.rodata..c_jump_table)
> >
> > -#ifdef CONFIG_DEBUG_ENTRY
> > +#if defined(CONFIG_DEBUG_ENTRY) || defined(CONFIG_LTO_CLANG)
> >  /* Begin/end of an instrumentation safe region */
> >  #define instrumentation_begin() ({ \
> >  	asm volatile("%c0:\n\t" \
>
> Why would you be doing noinstr validation for lto builds? That doesn't
> make sense.

This is just to avoid a ton of noinstr warnings when we run objtool on
vmlinux.o, but I'm also fine with skipping noinstr validation with LTO.

> > +ifdef CONFIG_STACK_VALIDATION
> > +ifneq ($(SKIP_STACK_VALIDATION),1)
> > +cmd_ld_ko_o += \
> > +	$(objtree)/tools/objtool/objtool \
> > +		$(if $(CONFIG_UNWINDER_ORC),orc generate,check) \
> > +		--module \
> > +		$(if $(CONFIG_FRAME_POINTER),,--no-fp) \
> > +		$(if $(CONFIG_GCOV_KERNEL),--no-unreachable,) \
> > +		$(if $(CONFIG_RETPOLINE),--retpoline,) \
> > +		$(if $(CONFIG_X86_SMAP),--uaccess,) \
> > +		$(@:.ko=$(prelink-ext).o);
> > +
> > +endif # SKIP_STACK_VALIDATION
> > +endif # CONFIG_STACK_VALIDATION
>
> What about the objtool invocation from link-vmlinux.sh ?

What about it? The existing objtool_link invocation in link-vmlinux.sh
works fine for our purposes as well.

Sami

* Re: [PATCH 05/22] kbuild: lto: postpone objtool
  2020-06-24 21:49     ` Sami Tolvanen
@ 2020-06-25  7:47       ` Peter Zijlstra
  2020-06-25 16:22         ` Sami Tolvanen
  0 siblings, 1 reply; 212+ messages in thread
From: Peter Zijlstra @ 2020-06-25 7:47 UTC
To: Sami Tolvanen
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
	linux-kernel, clang-built-linux, linux-pci, Will Deacon,
	linux-arm-kernel

On Wed, Jun 24, 2020 at 02:49:25PM -0700, Sami Tolvanen wrote:
> On Wed, Jun 24, 2020 at 11:19:08PM +0200, Peter Zijlstra wrote:
> > On Wed, Jun 24, 2020 at 01:31:43PM -0700, Sami Tolvanen wrote:
> > > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > > index 30827f82ad62..12b115152532 100644
> > > --- a/include/linux/compiler.h
> > > +++ b/include/linux/compiler.h
> > > @@ -120,7 +120,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> > >  /* Annotate a C jump table to allow objtool to follow the code flow */
> > >  #define __annotate_jump_table __section(.rodata..c_jump_table)
> > >
> > > -#ifdef CONFIG_DEBUG_ENTRY
> > > +#if defined(CONFIG_DEBUG_ENTRY) || defined(CONFIG_LTO_CLANG)
> > >  /* Begin/end of an instrumentation safe region */
> > >  #define instrumentation_begin() ({ \
> > >  	asm volatile("%c0:\n\t" \
> >
> > Why would you be doing noinstr validation for lto builds? That doesn't
> > make sense.
>
> This is just to avoid a ton of noinstr warnings when we run objtool on
> vmlinux.o, but I'm also fine with skipping noinstr validation with LTO.

Right, then we need to make --no-vmlinux work properly when
!DEBUG_ENTRY, which I think might be buggered due to us overriding the
argument when the objname ends with "vmlinux.o".

> > > +ifdef CONFIG_STACK_VALIDATION
> > > +ifneq ($(SKIP_STACK_VALIDATION),1)
> > > +cmd_ld_ko_o += \
> > > +	$(objtree)/tools/objtool/objtool \
> > > +		$(if $(CONFIG_UNWINDER_ORC),orc generate,check) \
> > > +		--module \
> > > +		$(if $(CONFIG_FRAME_POINTER),,--no-fp) \
> > > +		$(if $(CONFIG_GCOV_KERNEL),--no-unreachable,) \
> > > +		$(if $(CONFIG_RETPOLINE),--retpoline,) \
> > > +		$(if $(CONFIG_X86_SMAP),--uaccess,) \
> > > +		$(@:.ko=$(prelink-ext).o);
> > > +
> > > +endif # SKIP_STACK_VALIDATION
> > > +endif # CONFIG_STACK_VALIDATION
> >
> > What about the objtool invocation from link-vmlinux.sh ?
>
> What about it? The existing objtool_link invocation in link-vmlinux.sh
> works fine for our purposes as well.

Well, I was wondering why you're adding yet another objtool invocation
while we already have one.

* Re: [PATCH 05/22] kbuild: lto: postpone objtool
  2020-06-25  7:47       ` Peter Zijlstra
@ 2020-06-25 16:22         ` Sami Tolvanen
  2020-06-25 18:33           ` Peter Zijlstra
  0 siblings, 1 reply; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-25 16:22 UTC
To: Peter Zijlstra
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
	linux-kernel, clang-built-linux, linux-pci, Will Deacon,
	linux-arm-kernel

On Thu, Jun 25, 2020 at 09:47:16AM +0200, Peter Zijlstra wrote:
> On Wed, Jun 24, 2020 at 02:49:25PM -0700, Sami Tolvanen wrote:
> > On Wed, Jun 24, 2020 at 11:19:08PM +0200, Peter Zijlstra wrote:
> > > On Wed, Jun 24, 2020 at 01:31:43PM -0700, Sami Tolvanen wrote:
> > > > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > > > index 30827f82ad62..12b115152532 100644
> > > > --- a/include/linux/compiler.h
> > > > +++ b/include/linux/compiler.h
> > > > @@ -120,7 +120,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> > > >  /* Annotate a C jump table to allow objtool to follow the code flow */
> > > >  #define __annotate_jump_table __section(.rodata..c_jump_table)
> > > >
> > > > -#ifdef CONFIG_DEBUG_ENTRY
> > > > +#if defined(CONFIG_DEBUG_ENTRY) || defined(CONFIG_LTO_CLANG)
> > > >  /* Begin/end of an instrumentation safe region */
> > > >  #define instrumentation_begin() ({ \
> > > >  	asm volatile("%c0:\n\t" \
> > >
> > > Why would you be doing noinstr validation for lto builds? That doesn't
> > > make sense.
> >
> > This is just to avoid a ton of noinstr warnings when we run objtool on
> > vmlinux.o, but I'm also fine with skipping noinstr validation with LTO.
>
> Right, then we need to make --no-vmlinux work properly when
> !DEBUG_ENTRY, which I think might be buggered due to us overriding the
> argument when the objname ends with "vmlinux.o".

Right. Can we just remove that and pass --vmlinux to objtool in
link-vmlinux.sh, or is the override necessary somewhere else?

> > > > +ifdef CONFIG_STACK_VALIDATION
> > > > +ifneq ($(SKIP_STACK_VALIDATION),1)
> > > > +cmd_ld_ko_o += \
> > > > +	$(objtree)/tools/objtool/objtool \
> > > > +		$(if $(CONFIG_UNWINDER_ORC),orc generate,check) \
> > > > +		--module \
> > > > +		$(if $(CONFIG_FRAME_POINTER),,--no-fp) \
> > > > +		$(if $(CONFIG_GCOV_KERNEL),--no-unreachable,) \
> > > > +		$(if $(CONFIG_RETPOLINE),--retpoline,) \
> > > > +		$(if $(CONFIG_X86_SMAP),--uaccess,) \
> > > > +		$(@:.ko=$(prelink-ext).o);
> > > > +
> > > > +endif # SKIP_STACK_VALIDATION
> > > > +endif # CONFIG_STACK_VALIDATION
> > >
> > > What about the objtool invocation from link-vmlinux.sh ?
> >
> > What about it? The existing objtool_link invocation in link-vmlinux.sh
> > works fine for our purposes as well.
>
> Well, I was wondering why you're adding yet another objtool invocation
> while we already have one.

Because we can't run objtool until we have compiled bitcode to native
code, so for modules, we need another invocation after everything has
been compiled.

Sami

* Re: [PATCH 05/22] kbuild: lto: postpone objtool
  2020-06-25 16:22         ` Sami Tolvanen
@ 2020-06-25 18:33           ` Peter Zijlstra
  2020-06-25 19:32             ` Sami Tolvanen
  0 siblings, 1 reply; 212+ messages in thread
From: Peter Zijlstra @ 2020-06-25 18:33 UTC
To: Sami Tolvanen
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
	linux-kernel, clang-built-linux, linux-pci, Will Deacon,
	linux-arm-kernel

On Thu, Jun 25, 2020 at 09:22:26AM -0700, Sami Tolvanen wrote:
> On Thu, Jun 25, 2020 at 09:47:16AM +0200, Peter Zijlstra wrote:

> > Right, then we need to make --no-vmlinux work properly when
> > !DEBUG_ENTRY, which I think might be buggered due to us overriding the
> > argument when the objname ends with "vmlinux.o".
>
> Right. Can we just remove that and pass --vmlinux to objtool in
> link-vmlinux.sh, or is the override necessary somewhere else?

Think we can remove it; it was just convenient when running manually.

> > > > > +ifdef CONFIG_STACK_VALIDATION
> > > > > +ifneq ($(SKIP_STACK_VALIDATION),1)
> > > > > +cmd_ld_ko_o += \
> > > > > +	$(objtree)/tools/objtool/objtool \
> > > > > +		$(if $(CONFIG_UNWINDER_ORC),orc generate,check) \
> > > > > +		--module \
> > > > > +		$(if $(CONFIG_FRAME_POINTER),,--no-fp) \
> > > > > +		$(if $(CONFIG_GCOV_KERNEL),--no-unreachable,) \
> > > > > +		$(if $(CONFIG_RETPOLINE),--retpoline,) \
> > > > > +		$(if $(CONFIG_X86_SMAP),--uaccess,) \
> > > > > +		$(@:.ko=$(prelink-ext).o);
> > > > > +
> > > > > +endif # SKIP_STACK_VALIDATION
> > > > > +endif # CONFIG_STACK_VALIDATION
> > > >
> > > > What about the objtool invocation from link-vmlinux.sh ?
> > >
> > > What about it? The existing objtool_link invocation in link-vmlinux.sh
> > > works fine for our purposes as well.
> >
> > Well, I was wondering why you're adding yet another objtool invocation
> > while we already have one.
>
> Because we can't run objtool until we have compiled bitcode to native
> code, so for modules, we need another invocation after everything has
> been compiled.

Well, that I understand, my question was why we need one in
scripts/link-vmlinux.sh and an additional one. I think we're just
talking past one another and agree we only need one.

* Re: [PATCH 05/22] kbuild: lto: postpone objtool
  2020-06-25 18:33           ` Peter Zijlstra
@ 2020-06-25 19:32             ` Sami Tolvanen
  0 siblings, 0 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-25 19:32 UTC
To: Peter Zijlstra
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
	linux-kernel, clang-built-linux, linux-pci, Will Deacon,
	linux-arm-kernel

On Thu, Jun 25, 2020 at 08:33:51PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 25, 2020 at 09:22:26AM -0700, Sami Tolvanen wrote:
> > On Thu, Jun 25, 2020 at 09:47:16AM +0200, Peter Zijlstra wrote:
>
> > > Right, then we need to make --no-vmlinux work properly when
> > > !DEBUG_ENTRY, which I think might be buggered due to us overriding the
> > > argument when the objname ends with "vmlinux.o".
> >
> > Right. Can we just remove that and pass --vmlinux to objtool in
> > link-vmlinux.sh, or is the override necessary somewhere else?
>
> Think we can remove it; it was just convenient when running manually.

Great, I'll change this in v2.

> > > > > > +ifdef CONFIG_STACK_VALIDATION
> > > > > > +ifneq ($(SKIP_STACK_VALIDATION),1)
> > > > > > +cmd_ld_ko_o += \
> > > > > > +	$(objtree)/tools/objtool/objtool \
> > > > > > +		$(if $(CONFIG_UNWINDER_ORC),orc generate,check) \
> > > > > > +		--module \
> > > > > > +		$(if $(CONFIG_FRAME_POINTER),,--no-fp) \
> > > > > > +		$(if $(CONFIG_GCOV_KERNEL),--no-unreachable,) \
> > > > > > +		$(if $(CONFIG_RETPOLINE),--retpoline,) \
> > > > > > +		$(if $(CONFIG_X86_SMAP),--uaccess,) \
> > > > > > +		$(@:.ko=$(prelink-ext).o);
> > > > > > +
> > > > > > +endif # SKIP_STACK_VALIDATION
> > > > > > +endif # CONFIG_STACK_VALIDATION
> > > > >
> > > > > What about the objtool invocation from link-vmlinux.sh ?
> > > >
> > > > What about it? The existing objtool_link invocation in link-vmlinux.sh
> > > > works fine for our purposes as well.
> > >
> > > Well, I was wondering why you're adding yet another objtool invocation
> > > while we already have one.
> >
> > Because we can't run objtool until we have compiled bitcode to native
> > code, so for modules, we need another invocation after everything has
> > been compiled.
>
> Well, that I understand, my question was why we need one in
> scripts/link-vmlinux.sh and an additional one. I think we're just
> talking past one another and agree we only need one.

We need just one for vmlinux.o, but this rule adds an objtool invocation
for kernel modules, which we also couldn't check earlier. We link all
the bitcode for modules into <module>.lto.o and run modpost and objtool
on that before building the final .ko.

Sami

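For readers following the exchange above, the argument list that the new Makefile.modfinal rule hands to objtool can be sketched in shell. This is only an illustration of the $(if ...) logic in the quoted hunk, not part of the patch: the function name objtool_args is made up, and PRELINK_EXT=.lto is an assumption taken from the <module>.lto.o naming mentioned in the thread.

```shell
#!/bin/sh
# Illustrative only: mirrors the $(if ...) expressions in the quoted
# Makefile.modfinal hunk. objtool_args and PRELINK_EXT are hypothetical
# names; ".lto" is assumed from the <module>.lto.o naming in the thread.
PRELINK_EXT=.lto

objtool_args() {
	# $1 is the module target, e.g. foo.ko
	if [ -n "${CONFIG_UNWINDER_ORC:-}" ]; then
		args="orc generate"
	else
		args="check"
	fi
	args="$args --module"
	if [ -z "${CONFIG_FRAME_POINTER:-}" ]; then args="$args --no-fp"; fi
	if [ -n "${CONFIG_GCOV_KERNEL:-}" ]; then args="$args --no-unreachable"; fi
	if [ -n "${CONFIG_RETPOLINE:-}" ]; then args="$args --retpoline"; fi
	if [ -n "${CONFIG_X86_SMAP:-}" ]; then args="$args --uaccess"; fi
	# Equivalent of $(@:.ko=$(prelink-ext).o): check the prelinked object,
	# not the final .ko
	echo "$args ${1%.ko}${PRELINK_EXT}.o"
}

CONFIG_UNWINDER_ORC=y
CONFIG_RETPOLINE=y
objtool_args foo.ko
# prints: orc generate --module --no-fp --retpoline foo.lto.o
```

The point of the sketch is the last argument: objtool is pointed at the natively compiled <module>.lto.o, which only exists after the LTO link step, which is why the per-object invocation in Makefile.build cannot be reused.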
* [PATCH 06/22] kbuild: lto: limit inlining
  2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen
                   ` (4 preceding siblings ...)
  2020-06-24 20:31 ` [PATCH 05/22] kbuild: lto: postpone objtool Sami Tolvanen
@ 2020-06-24 20:31 ` Sami Tolvanen
  2020-06-24 21:20   ` Peter Zijlstra
  2020-06-24 20:31 ` [PATCH 07/22] kbuild: lto: merge module sections Sami Tolvanen
                   ` (19 subsequent siblings)
  25 siblings, 1 reply; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC
To: Masahiro Yamada, Will Deacon
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, George Burgess IV, Sami Tolvanen, linux-pci,
	linux-arm-kernel

This change limits function inlining across translation unit
boundaries in order to reduce the binary size with LTO.

The -import-instr-limit flag defines a size limit, as the number
of LLVM IR instructions, for importing functions from other TUs.
The default value is 100, and decreasing it to 5 reduces the size
of a stripped arm64 defconfig vmlinux by 11%.

Suggested-by: George Burgess IV <gbiv@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 Makefile | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Makefile b/Makefile
index 3a7e5e5c17b9..ee66513a5b66 100644
--- a/Makefile
+++ b/Makefile
@@ -894,6 +894,10 @@ else
 CC_FLAGS_LTO_CLANG := -flto
 endif
 CC_FLAGS_LTO_CLANG += -fvisibility=default
+
+# Limit inlining across translation units to reduce binary size
+LD_FLAGS_LTO_CLANG := -mllvm -import-instr-limit=5
+KBUILD_LDFLAGS += $(LD_FLAGS_LTO_CLANG)
 endif
 
 ifdef CONFIG_LTO
-- 
2.27.0.212.ge8ba1cc988-goog

* Re: [PATCH 06/22] kbuild: lto: limit inlining
  2020-06-24 20:31 ` [PATCH 06/22] kbuild: lto: limit inlining Sami Tolvanen
@ 2020-06-24 21:20   ` Peter Zijlstra
  2020-06-24 23:37     ` Sami Tolvanen
  0 siblings, 1 reply; 212+ messages in thread
From: Peter Zijlstra @ 2020-06-24 21:20 UTC
To: Sami Tolvanen
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
	linux-kernel, clang-built-linux, George Burgess IV, linux-pci,
	Will Deacon, linux-arm-kernel

On Wed, Jun 24, 2020 at 01:31:44PM -0700, Sami Tolvanen wrote:
> This change limits function inlining across translation unit
> boundaries in order to reduce the binary size with LTO.
>
> The -import-instr-limit flag defines a size limit, as the number
> of LLVM IR instructions, for importing functions from other TUs.
> The default value is 100, and decreasing it to 5 reduces the size
> of a stripped arm64 defconfig vmlinux by 11%.

Is that also the right number for x86? What about the effect on
performance? What did 6 do? or 4?

> Suggested-by: George Burgess IV <gbiv@google.com>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  Makefile | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/Makefile b/Makefile
> index 3a7e5e5c17b9..ee66513a5b66 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -894,6 +894,10 @@ else
>  CC_FLAGS_LTO_CLANG := -flto
>  endif
>  CC_FLAGS_LTO_CLANG += -fvisibility=default
> +
> +# Limit inlining across translation units to reduce binary size
> +LD_FLAGS_LTO_CLANG := -mllvm -import-instr-limit=5
> +KBUILD_LDFLAGS += $(LD_FLAGS_LTO_CLANG)
>  endif
>
>  ifdef CONFIG_LTO
> --
> 2.27.0.212.ge8ba1cc988-goog
>

* Re: [PATCH 06/22] kbuild: lto: limit inlining
  2020-06-24 21:20   ` Peter Zijlstra
@ 2020-06-24 23:37     ` Sami Tolvanen
  0 siblings, 0 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 23:37 UTC
To: Peter Zijlstra
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers,
	linux-kernel, clang-built-linux, George Burgess IV, linux-pci,
	Will Deacon, linux-arm-kernel

On Wed, Jun 24, 2020 at 11:20:55PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 24, 2020 at 01:31:44PM -0700, Sami Tolvanen wrote:
> > This change limits function inlining across translation unit
> > boundaries in order to reduce the binary size with LTO.
> >
> > The -import-instr-limit flag defines a size limit, as the number
> > of LLVM IR instructions, for importing functions from other TUs.
> > The default value is 100, and decreasing it to 5 reduces the size
> > of a stripped arm64 defconfig vmlinux by 11%.
>
> Is that also the right number for x86? What about the effect on
> performance? What did 6 do? or 4?

This is the size limit we decided on for Android after testing on
arm64, but the number is obviously a compromise between code size
and performance. I'd be happy to benchmark this further once other
concerns have been resolved.

Sami

* [PATCH 07/22] kbuild: lto: merge module sections
  2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen
                   ` (5 preceding siblings ...)
  2020-06-24 20:31 ` [PATCH 06/22] kbuild: lto: limit inlining Sami Tolvanen
@ 2020-06-24 20:31 ` Sami Tolvanen
  2020-06-24 21:01   ` Nick Desaulniers
  2020-06-24 20:31 ` [PATCH 08/22] kbuild: lto: remove duplicate dependencies from .mod files Sami Tolvanen
                   ` (18 subsequent siblings)
  25 siblings, 1 reply; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC
To: Masahiro Yamada, Will Deacon
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

LLD always splits sections with LTO, which increases module sizes. This
change adds a linker script that merges the split sections in the final
module and discards the .eh_frame section that LLD may generate.

Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 Makefile               |  2 ++
 scripts/module-lto.lds | 26 ++++++++++++++++++++++++++
 2 files changed, 28 insertions(+)
 create mode 100644 scripts/module-lto.lds

diff --git a/Makefile b/Makefile
index ee66513a5b66..9ffec5fe1737 100644
--- a/Makefile
+++ b/Makefile
@@ -898,6 +898,8 @@ CC_FLAGS_LTO_CLANG += -fvisibility=default
 # Limit inlining across translation units to reduce binary size
 LD_FLAGS_LTO_CLANG := -mllvm -import-instr-limit=5
 KBUILD_LDFLAGS += $(LD_FLAGS_LTO_CLANG)
+
+KBUILD_LDS_MODULE += $(srctree)/scripts/module-lto.lds
 endif
 
 ifdef CONFIG_LTO
diff --git a/scripts/module-lto.lds b/scripts/module-lto.lds
new file mode 100644
index 000000000000..65884c652bf2
--- /dev/null
+++ b/scripts/module-lto.lds
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * With CONFIG_LTO_CLANG, LLD always enables -fdata-sections and
+ * -ffunction-sections, which increases the size of the final module.
+ * Merge the split sections in the final binary.
+ */
+SECTIONS {
+	__patchable_function_entries : { *(__patchable_function_entries) }
+
+	.bss : {
+		*(.bss .bss.[0-9a-zA-Z_]*)
+		*(.bss..L* .bss..compoundliteral*)
+	}
+
+	.data : {
+		*(.data .data.[0-9a-zA-Z_]*)
+		*(.data..L* .data..compoundliteral*)
+	}
+
+	.rodata : {
+		*(.rodata .rodata.[0-9a-zA-Z_]*)
+		*(.rodata..L* .rodata..compoundliteral*)
+	}
+
+	.text : { *(.text .text.[0-9a-zA-Z_]*) }
+}
-- 
2.27.0.212.ge8ba1cc988-goog

* Re: [PATCH 07/22] kbuild: lto: merge module sections
  2020-06-24 20:31 ` [PATCH 07/22] kbuild: lto: merge module sections Sami Tolvanen
@ 2020-06-24 21:01   ` Nick Desaulniers
  2020-06-24 21:31     ` Sami Tolvanen
  0 siblings, 1 reply; 212+ messages in thread
From: Nick Desaulniers @ 2020-06-24 21:01 UTC
To: Sami Tolvanen
Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman,
	Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux,
	linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 1:33 PM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> LLD always splits sections with LTO, which increases module sizes. This
> change adds a linker script that merges the split sections in the final
> module and discards the .eh_frame section that LLD may generate.

For discarding .eh_frame, Kees is currently fighting with a series
that I would really like to see land that enables warnings on orphan
section placement. I don't see any new flags to inhibit .eh_frame
generation, or discard it in the linker script, so I'd expect it to be
treated as an orphan section and kept. Was that missed, or should
that be removed from the commit message?

>
> Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  Makefile               |  2 ++
>  scripts/module-lto.lds | 26 ++++++++++++++++++++++++++
>  2 files changed, 28 insertions(+)
>  create mode 100644 scripts/module-lto.lds
>
> diff --git a/Makefile b/Makefile
> index ee66513a5b66..9ffec5fe1737 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -898,6 +898,8 @@ CC_FLAGS_LTO_CLANG += -fvisibility=default
>  # Limit inlining across translation units to reduce binary size
>  LD_FLAGS_LTO_CLANG := -mllvm -import-instr-limit=5
>  KBUILD_LDFLAGS += $(LD_FLAGS_LTO_CLANG)
> +
> +KBUILD_LDS_MODULE += $(srctree)/scripts/module-lto.lds
>  endif
>
>  ifdef CONFIG_LTO
> diff --git a/scripts/module-lto.lds b/scripts/module-lto.lds
> new file mode 100644
> index 000000000000..65884c652bf2
> --- /dev/null
> +++ b/scripts/module-lto.lds
> @@ -0,0 +1,26 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * With CONFIG_LTO_CLANG, LLD always enables -fdata-sections and
> + * -ffunction-sections, which increases the size of the final module.
> + * Merge the split sections in the final binary.
> + */
> +SECTIONS {
> +	__patchable_function_entries : { *(__patchable_function_entries) }
> +
> +	.bss : {
> +		*(.bss .bss.[0-9a-zA-Z_]*)
> +		*(.bss..L* .bss..compoundliteral*)
> +	}
> +
> +	.data : {
> +		*(.data .data.[0-9a-zA-Z_]*)
> +		*(.data..L* .data..compoundliteral*)
> +	}
> +
> +	.rodata : {
> +		*(.rodata .rodata.[0-9a-zA-Z_]*)
> +		*(.rodata..L* .rodata..compoundliteral*)
> +	}
> +
> +	.text : { *(.text .text.[0-9a-zA-Z_]*) }
> +}
> --
> 2.27.0.212.ge8ba1cc988-goog
>

-- 
Thanks,
~Nick Desaulniers

* Re: [PATCH 07/22] kbuild: lto: merge module sections
  2020-06-24 21:01   ` Nick Desaulniers
@ 2020-06-24 21:31     ` Sami Tolvanen
  0 siblings, 0 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 21:31 UTC
To: Nick Desaulniers
Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman,
	Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux,
	linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 02:01:59PM -0700, 'Nick Desaulniers' via Clang Built Linux wrote:
> On Wed, Jun 24, 2020 at 1:33 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > LLD always splits sections with LTO, which increases module sizes. This
> > change adds a linker script that merges the split sections in the final
> > module and discards the .eh_frame section that LLD may generate.
>
> For discarding .eh_frame, Kees is currently fighting with a series
> that I would really like to see land that enables warnings on orphan
> section placement. I don't see any new flags to inhibit .eh_frame
> generation, or discard it in the linker script, so I'd expect it to be
> treated as an orphan section and kept. Was that missed, or should
> that be removed from the commit message?

It should be removed from the commit message, thanks for pointing it
out.

Sami

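To make the point about .eh_frame concrete, the input-section patterns in scripts/module-lto.lds can be approximated with shell globs, since linker script wildcards follow similar rules. This is a rough sketch rather than a model of LLD: merged_section is a hypothetical helper, the section names in the loop are examples, and real orphan placement is decided by the linker, not by a fall-through case.

```shell
#!/bin/sh
# Illustrative only: approximates the input-section patterns from
# scripts/module-lto.lds with shell globs. merged_section is a
# hypothetical helper, not something the kernel build provides.
merged_section() {
	case "$1" in
	__patchable_function_entries)
		echo __patchable_function_entries ;;
	.bss|.bss.[0-9a-zA-Z_]*|.bss..L*|.bss..compoundliteral*)
		echo .bss ;;
	.data|.data.[0-9a-zA-Z_]*|.data..L*|.data..compoundliteral*)
		echo .data ;;
	.rodata|.rodata.[0-9a-zA-Z_]*|.rodata..L*|.rodata..compoundliteral*)
		echo .rodata ;;
	.text|.text.[0-9a-zA-Z_]*)
		echo .text ;;
	*)
		# Not matched by the script: left to the linker's orphan handling
		echo "$1" ;;
	esac
}

for s in .text.my_init .data..L.str .rodata.cst8 .data..percpu .eh_frame; do
	printf '%s -> %s\n' "$s" "$(merged_section "$s")"
done
```

In this approximation .eh_frame and .data..percpu fall through unchanged, which matches the observation above that the script as posted merges split sections but does not discard .eh_frame.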
* [PATCH 08/22] kbuild: lto: remove duplicate dependencies from .mod files 2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen ` (6 preceding siblings ...) 2020-06-24 20:31 ` [PATCH 07/22] kbuild: lto: merge module sections Sami Tolvanen @ 2020-06-24 20:31 ` Sami Tolvanen 2020-06-24 21:13 ` Nick Desaulniers 2020-06-24 20:31 ` [PATCH 09/22] init: lto: ensure initcall ordering Sami Tolvanen ` (17 subsequent siblings) 25 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With LTO, llvm-nm prints out symbols for each archive member separately, which results in a lot of duplicate dependencies in the .mod file when CONFIG_TRIM_UNUSED_KSYMS is enabled. When a module consists of several compilation units, the output can exceed the default xargs command size limit and split the dependency list across multiple lines, which results in used symbols getting trimmed. This change removes duplicate dependencies, which reduces the probability of this happening and makes .mod files smaller and easier to read.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- scripts/Makefile.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/Makefile.build b/scripts/Makefile.build index 82977350f5a6..82b465ce3ca0 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -291,7 +291,7 @@ endef # List module undefined symbols (or empty line if not enabled) ifdef CONFIG_TRIM_UNUSED_KSYMS -cmd_undef_syms = $(NM) $< | sed -n 's/^ *U //p' | xargs echo +cmd_undef_syms = $(NM) $< | sed -n 's/^ *U //p' | sort -u | xargs echo else cmd_undef_syms = echo endif -- 2.27.0.212.ge8ba1cc988-goog
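The effect of the added "sort -u" stage can be sketched outside the build system. This is a minimal illustration, not the kbuild rule itself: fake_nm is a hypothetical stand-in for "$(NM) $<" on a module archive, and the member and symbol names are made up.

```shell
# Hypothetical stand-in for "$(NM) $<" on an LTO module archive: each
# archive member lists its own undefined symbols, so names repeat.
fake_nm() {
    printf '%s\n' \
        'a.o:' \
        '                 U printk' \
        '0000000000000000 T a_init' \
        'b.o:' \
        '                 U kfree' \
        '                 U printk'
}

# Pipeline as it was before the change: duplicates reach xargs.
undef_syms_old() {
    fake_nm | sed -n 's/^ *U //p' | xargs echo
}

# Pipeline after the change: "sort -u" drops repeated symbol names.
undef_syms_new() {
    fake_nm | sed -n 's/^ *U //p' | sort -u | xargs echo
}

undef_syms_old
undef_syms_new
```

With fewer words on its input, xargs is less likely to split the list over several echo invocations, which is the failure mode the commit message describes. Deduplication lowers that risk but does not remove it for modules with very many distinct undefined symbols.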
* Re: [PATCH 08/22] kbuild: lto: remove duplicate dependencies from .mod files 2020-06-24 20:31 ` [PATCH 08/22] kbuild: lto: remove duplicate dependencies from .mod files Sami Tolvanen @ 2020-06-24 21:13 ` Nick Desaulniers 0 siblings, 0 replies; 212+ messages in thread From: Nick Desaulniers @ 2020-06-24 21:13 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux, linux-pci, Will Deacon, Linux ARM On Wed, Jun 24, 2020 at 1:33 PM Sami Tolvanen <samitolvanen@google.com> wrote: > > With LTO, llvm-nm prints out symbols for each archive member > separately, which results in a lot of duplicate dependencies in the > .mod file when CONFIG_TRIM_UNUSED_KSYMS is enabled. When a module > consists of several compilation units, the output can exceed the > default xargs command size limit and split the dependency list across > multiple lines, which results in used symbols getting trimmed. > > This change removes duplicate dependencies, which reduces the > probability of this happening and makes .mod files smaller and > easier to read.
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> > --- > scripts/Makefile.build | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/scripts/Makefile.build b/scripts/Makefile.build > index 82977350f5a6..82b465ce3ca0 100644 > --- a/scripts/Makefile.build > +++ b/scripts/Makefile.build > @@ -291,7 +291,7 @@ endef > > # List module undefined symbols (or empty line if not enabled) > ifdef CONFIG_TRIM_UNUSED_KSYMS > -cmd_undef_syms = $(NM) $< | sed -n 's/^ *U //p' | xargs echo > +cmd_undef_syms = $(NM) $< | sed -n 's/^ *U //p' | sort -u | xargs echo > else > cmd_undef_syms = echo > endif > -- > 2.27.0.212.ge8ba1cc988-goog > -- Thanks, ~Nick Desaulniers
* [PATCH 09/22] init: lto: ensure initcall ordering 2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen ` (7 preceding siblings ...) 2020-06-24 20:31 ` [PATCH 08/22] kbuild: lto: remove duplicate dependencies from .mod files Sami Tolvanen @ 2020-06-24 20:31 ` Sami Tolvanen 2020-06-25 0:58 ` kernel test robot 2020-06-25 4:19 ` kernel test robot 2020-06-24 20:31 ` [PATCH 10/22] init: lto: fix PREL32 relocations Sami Tolvanen ` (16 subsequent siblings) 25 siblings, 2 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With LTO, the compiler doesn't necessarily obey the link order for initcalls, and initcall variables need globally unique names to avoid collisions at link time. This change exports __KBUILD_MODNAME and adds the initcall_id() macro, which uses it together with __COUNTER__ and __LINE__ to help ensure these variables have unique names, and moves each variable to its own section when LTO is enabled, so the correct order can be specified using a linker script. The generate_initcall_order.pl script uses nm to find initcalls from the object files passed to the linker, and generates a linker script that specifies the intended order. With LTO, the script is called in link-vmlinux.sh.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- include/linux/init.h | 52 +++++- scripts/Makefile.lib | 6 +- scripts/generate_initcall_order.pl | 270 +++++++++++++++++++++++++++++ scripts/link-vmlinux.sh | 14 ++ 4 files changed, 333 insertions(+), 9 deletions(-) create mode 100755 scripts/generate_initcall_order.pl diff --git a/include/linux/init.h b/include/linux/init.h index 212fc9e2f691..af638cd6dd52 100644 --- a/include/linux/init.h +++ b/include/linux/init.h @@ -184,19 +184,57 @@ extern bool initcall_debug; * as KEEP() in the linker script. */ +/* Format: <modname>__<counter>_<line>_<fn> */ +#define __initcall_id(fn) \ + __PASTE(__KBUILD_MODNAME, \ + __PASTE(__, \ + __PASTE(__COUNTER__, \ + __PASTE(_, \ + __PASTE(__LINE__, \ + __PASTE(_, fn)))))) + +/* Format: __<prefix>__<iid><id> */ +#define __initcall_name(prefix, __iid, id) \ + __PASTE(__, \ + __PASTE(prefix, \ + __PASTE(__, \ + __PASTE(__iid, id)))) + +#ifdef CONFIG_LTO_CLANG +/* + * With LTO, the compiler doesn't necessarily obey link order for + * initcalls. In order to preserve the correct order, we add each + * variable into its own section and generate a linker script (in + * scripts/link-vmlinux.sh) to specify the order of the sections. + */ +#define __initcall_section(__sec, __iid) \ + #__sec ".init.." #__iid +#else +#define __initcall_section(__sec, __iid) \ + #__sec ".init" +#endif + #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS -#define ___define_initcall(fn, id, __sec) \ +#define ____define_initcall(fn, __name, __sec) \ __ADDRESSABLE(fn) \ - asm(".section \"" #__sec ".init\", \"a\" \n" \ - "__initcall_" #fn #id ": \n" \ + asm(".section \"" __sec "\", \"a\" \n" \ + __stringify(__name) ": \n" \ ".long " #fn " - . 
\n" \ ".previous \n"); #else -#define ___define_initcall(fn, id, __sec) \ - static initcall_t __initcall_##fn##id __used \ - __attribute__((__section__(#__sec ".init"))) = fn; +#define ____define_initcall(fn, __name, __sec) \ + static initcall_t __name __used \ + __attribute__((__section__(__sec))) = fn; #endif +#define __unique_initcall(fn, id, __sec, __iid) \ + ____define_initcall(fn, \ + __initcall_name(initcall, __iid, id), \ + __initcall_section(__sec, __iid)) + +#define ___define_initcall(fn, id, __sec) \ + __unique_initcall(fn, id, __sec, __initcall_id(fn)) + #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id) /* @@ -236,7 +274,7 @@ extern bool initcall_debug; #define __exitcall(fn) \ static exitcall_t __exitcall_##fn __exit_call = fn -#define console_initcall(fn) ___define_initcall(fn,, .con_initcall) +#define console_initcall(fn) ___define_initcall(fn, con, .con_initcall) struct obs_kernel_param { const char *str; diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib index 99ac59c59826..17447354b543 100644 --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -106,9 +106,11 @@ target-stem = $(basename $(patsubst $(obj)/%,%,$@)) # These flags are needed for modversions and compiling, so we define them here # $(modname_flags) defines KBUILD_MODNAME as the name of the module it will # end up in (or would, if it gets compiled in) -name-fix = $(call stringify,$(subst $(comma),_,$(subst -,_,$1))) +name-fix-token = $(subst $(comma),_,$(subst -,_,$1)) +name-fix = $(call stringify,$(call name-fix-token,$1)) basename_flags = -DKBUILD_BASENAME=$(call name-fix,$(basetarget)) -modname_flags = -DKBUILD_MODNAME=$(call name-fix,$(modname)) +modname_flags = -DKBUILD_MODNAME=$(call name-fix,$(modname)) \ + -D__KBUILD_MODNAME=$(call name-fix-token,$(modname)) modfile_flags = -DKBUILD_MODFILE=$(call stringify,$(modfile)) orig_c_flags = $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) \ diff --git a/scripts/generate_initcall_order.pl 
b/scripts/generate_initcall_order.pl new file mode 100755 index 000000000000..fe83aec2b51e --- /dev/null +++ b/scripts/generate_initcall_order.pl @@ -0,0 +1,270 @@ +#!/usr/bin/env perl +# SPDX-License-Identifier: GPL-2.0 +# +# Generates a linker script that specifies the correct initcall order. +# +# Copyright (C) 2019 Google LLC + +use strict; +use warnings; +use IO::Handle; +use IO::Select; +use POSIX ":sys_wait_h"; + +my $nm = $ENV{'NM'} || die "$0: ERROR: NM not set?"; +my $objtree = $ENV{'objtree'} || '.'; + +## currently active child processes +my $jobs = {}; # child process pid -> file handle +## results from child processes +my $results = {}; # object index -> [ { level, secname }, ... ] + +## reads _NPROCESSORS_ONLN to determine the maximum number of processes to +## start +sub get_online_processors { + open(my $fh, "getconf _NPROCESSORS_ONLN 2>/dev/null |") + or die "$0: ERROR: failed to execute getconf: $!"; + my $procs = <$fh>; + close($fh); + + if (!($procs =~ /^\d+$/)) { + return 1; + } + + return int($procs); +} + +## writes results to the parent process +## format: <file index> <initcall level> <base initcall section name> +sub write_results { + my ($index, $initcalls) = @_; + + # sort by the counter value to ensure the order of initcalls within + # each object file is correct + foreach my $counter (sort { $a <=> $b } keys(%{$initcalls})) { + my $level = $initcalls->{$counter}->{'level'}; + + # section name for the initcall function + my $secname = $initcalls->{$counter}->{'module'} . '__' . + $counter . '_' . + $initcalls->{$counter}->{'line'} . '_' . 
+ $initcalls->{$counter}->{'function'}; + + print "$index $level $secname\n"; + } +} + +## reads a result line from a child process and adds it to the $results array +sub read_results{ + my ($fh) = @_; + + # each child prints out a full line w/ autoflush and exits after the + # last line, so even if buffered I/O blocks here, it shouldn't block + # very long + my $data = <$fh>; + + if (!defined($data)) { + return 0; + } + + chomp($data); + + my ($index, $level, $secname) = $data =~ + /^(\d+)\ ([^\ ]+)\ (.*)$/; + + if (!defined($index) || + !defined($level) || + !defined($secname)) { + die "$0: ERROR: child process returned invalid data: $data\n"; + } + + $index = int($index); + + if (!exists($results->{$index})) { + $results->{$index} = []; + } + + push (@{$results->{$index}}, { + 'level' => $level, + 'secname' => $secname + }); + + return 1; +} + +## finds initcalls from an object file or all object files in an archive, and +## writes results back to the parent process +sub find_initcalls { + my ($index, $file) = @_; + + die "$0: ERROR: file $file doesn't exist?" if (! 
-f $file); + + open(my $fh, "\"$nm\" --defined-only \"$file\" 2>/dev/null |") + or die "$0: ERROR: failed to execute \"$nm\": $!"; + + my $initcalls = {}; + + while (<$fh>) { + chomp; + + # check for the start of a new object file (if processing an + # archive) + my ($path)= $_ =~ /^(.+)\:$/; + + if (defined($path)) { + write_results($index, $initcalls); + $initcalls = {}; + next; + } + + # look for an initcall + my ($module, $counter, $line, $symbol) = $_ =~ + /[a-z]\s+__initcall__(\S*)__(\d+)_(\d+)_(.*)$/; + + if (!defined($module)) { + $module = '' + } + + if (!defined($counter) || + !defined($line) || + !defined($symbol)) { + next; + } + + # parse initcall level + my ($function, $level) = $symbol =~ + /^(.*)((early|rootfs|con|[0-9])s?)$/; + + die "$0: ERROR: invalid initcall name $symbol in $file($path)" + if (!defined($function) || !defined($level)); + + $initcalls->{$counter} = { + 'module' => $module, + 'line' => $line, + 'function' => $function, + 'level' => $level, + }; + } + + close($fh); + write_results($index, $initcalls); +} + +## waits for any child process to complete, reads the results, and adds them to +## the $results array for later processing +sub wait_for_results { + my ($select) = @_; + + my $pid = 0; + do { + # unblock children that may have a full write buffer + foreach my $fh ($select->can_read(0)) { + read_results($fh); + } + + # check for children that have exited, read the remaining data + # from them, and clean up + $pid = waitpid(-1, WNOHANG); + if ($pid > 0) { + if (!exists($jobs->{$pid})) { + next; + } + + my $fh = $jobs->{$pid}; + $select->remove($fh); + + while (read_results($fh)) { + # until eof + } + + close($fh); + delete($jobs->{$pid}); + } + } while ($pid > 0); +} + +## forks a child to process each file passed in the command line and collects +## the results +sub process_files { + my $index = 0; + my $njobs = get_online_processors(); + my $select = IO::Select->new(); + + while (my $file = shift(@ARGV)) { + # fork a child 
process and read its stdout + my $pid = open(my $fh, '-|'); + + if (!defined($pid)) { + die "$0: ERROR: failed to fork: $!"; + } elsif ($pid) { + # save the child process pid and the file handle + $select->add($fh); + $jobs->{$pid} = $fh; + } else { + # in the child process + STDOUT->autoflush(1); + find_initcalls($index, "$objtree/$file"); + exit; + } + + $index++; + + # limit the number of children to $njobs + if (scalar(keys(%{$jobs})) >= $njobs) { + wait_for_results($select); + } + } + + # wait for the remaining children to complete + while (scalar(keys(%{$jobs})) > 0) { + wait_for_results($select); + } +} + +sub generate_initcall_lds() { + process_files(); + + my $sections = {}; # level -> [ secname, ...] + + # sort results to retain link order and split to sections per + # initcall level + foreach my $index (sort { $a <=> $b } keys(%{$results})) { + foreach my $result (@{$results->{$index}}) { + my $level = $result->{'level'}; + + if (!exists($sections->{$level})) { + $sections->{$level} = []; + } + + push(@{$sections->{$level}}, $result->{'secname'}); + } + } + + die "$0: ERROR: no initcalls?" if (!keys(%{$sections})); + + # print out a linker script that defines the order of initcalls for + # each level + print "SECTIONS {\n"; + + foreach my $level (sort(keys(%{$sections}))) { + my $section; + + if ($level eq 'con') { + $section = '.con_initcall.init'; + } else { + $section = ".initcall${level}.init"; + } + + print "\t${section} : {\n"; + + foreach my $secname (@{$sections->{$level}}) { + print "\t\t*(${section}..${secname}) ;\n"; + } + + print "\t}\n"; + } + + print "}\n"; +} + +generate_initcall_lds(); diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index c72f5d0238f1..42c73e24e820 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -39,6 +39,16 @@ info() fi } +# Generate a linker script to ensure correct ordering of initcalls.
+gen_initcalls() +{ + info GEN .tmp_initcalls.lds + + ${srctree}/scripts/generate_initcall_order.pl \ + ${KBUILD_VMLINUX_OBJS} ${KBUILD_VMLINUX_LIBS} \ + > .tmp_initcalls.lds +} + # If CONFIG_LTO_CLANG is selected, collect generated symbol versions into # .tmp_symversions.lds gen_symversions() @@ -70,6 +80,9 @@ modpost_link() --end-group" if [ -n "${CONFIG_LTO_CLANG}" ]; then + gen_initcalls + lds="-T .tmp_initcalls.lds" + if [ -n "${CONFIG_MODVERSIONS}" ]; then gen_symversions lds="${lds} -T .tmp_symversions.lds" @@ -283,6 +296,7 @@ cleanup() { rm -f .btf.* rm -f .tmp_System.map + rm -f .tmp_initcalls.lds rm -f .tmp_symversions.lds rm -f .tmp_vmlinux* rm -f System.map -- 2.27.0.212.ge8ba1cc988-goog
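The final step of the Perl script, turning "<object index> <level> <section suffix>" records into a linker script, can be sketched in a few lines of shell. This is an illustration under assumptions rather than a replacement for generate_initcall_order.pl: emit_initcall_lds and the sample section suffixes are hypothetical, the nm parsing and parallel child processes are left out, and "sort -s" (stable sort) is an extension found in GNU and BSD sort rather than a POSIX requirement.

```shell
# Reads "<object index> <level> <section suffix>" lines on stdin and
# prints a SECTIONS block with one output section per initcall level.
emit_initcall_lds() {
    # Order records by object index to retain link order; the stable
    # sort keeps the per-object order of the input lines.
    input=$(sort -s -n -k1,1)

    echo "SECTIONS {"
    for level in $(printf '%s\n' "$input" | awk '{ print $2 }' | LC_ALL=C sort -u); do
        # The console level uses a different section prefix.
        if [ "$level" = con ]; then
            section=.con_initcall.init
        else
            section=.initcall${level}.init
        fi

        printf '\t%s : {\n' "$section"
        printf '%s\n' "$input" | awk -v l="$level" -v s="$section" \
            '$2 == l { printf "\t\t*(%s..%s) ;\n", s, $3 }'
        printf '\t}\n'
    done
    echo "}"
}

# Sample records, deliberately out of link order (made-up names).
printf '%s\n' \
    '1 6 modb__0_20_b_init' \
    '0 6 moda__0_10_a_init' \
    '0 early moda__1_12_a_early' | emit_initcall_lds
```

In the sample, the level 6 entry from object 0 is listed before the one from object 1 even though the input had them reversed, which is the property the generated .tmp_initcalls.lds relies on.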
* Re: [PATCH 09/22] init: lto: ensure initcall ordering 2020-06-24 20:31 ` [PATCH 09/22] init: lto: ensure initcall ordering Sami Tolvanen @ 2020-06-25 0:58 ` kernel test robot 2020-06-25 4:19 ` kernel test robot 1 sibling, 0 replies; 212+ messages in thread From: kernel test robot @ 2020-06-25 0:58 UTC (permalink / raw) To: Sami Tolvanen, Masahiro Yamada, Will Deacon Cc: linux-arch, kbuild-all, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Nick Desaulniers, clang-built-linux, linux-arm-kernel [-- Attachment #1: Type: text/plain, Size: 17039 bytes --] Hi Sami, Thank you for the patch! Yet something to improve: [auto build test ERROR on 26e122e97a3d0390ebec389347f64f3730fdf48f] url: https://github.com/0day-ci/linux/commits/Sami-Tolvanen/add-support-for-Clang-LTO/20200625-043816 base: 26e122e97a3d0390ebec389347f64f3730fdf48f config: m68k-defconfig (attached as .config) compiler: m68k-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All error/warnings (new ones prefixed by >>): In file included from arch/m68k/include/asm/io_mm.h:25, from arch/m68k/include/asm/io.h:8, from include/linux/io.h:13, from include/linux/irq.h:20, from include/asm-generic/hardirq.h:13, from ./arch/m68k/include/generated/asm/hardirq.h:1, from include/linux/hardirq.h:10, from include/linux/interrupt.h:11, from drivers/ide/gayle.c:13: arch/m68k/include/asm/raw_io.h: In function 'raw_rom_outsb': arch/m68k/include/asm/raw_io.h:83:7: warning: variable '__w' set but not used [-Wunused-but-set-variable] 83 | ({u8 __w, __v = (b); u32 _addr = ((u32) (addr)); \ | ^~~ arch/m68k/include/asm/raw_io.h:430:3: note: in expansion of 
macro 'rom_out_8' 430 | rom_out_8(port, *buf++); | ^~~~~~~~~ arch/m68k/include/asm/raw_io.h: In function 'raw_rom_outsw': arch/m68k/include/asm/raw_io.h:86:8: warning: variable '__w' set but not used [-Wunused-but-set-variable] 86 | ({u16 __w, __v = (w); u32 _addr = ((u32) (addr)); \ | ^~~ arch/m68k/include/asm/raw_io.h:448:3: note: in expansion of macro 'rom_out_be16' 448 | rom_out_be16(port, *buf++); | ^~~~~~~~~~~~ arch/m68k/include/asm/raw_io.h: In function 'raw_rom_outsw_swapw': arch/m68k/include/asm/raw_io.h:90:8: warning: variable '__w' set but not used [-Wunused-but-set-variable] 90 | ({u16 __w, __v = (w); u32 _addr = ((u32) (addr)); \ | ^~~ arch/m68k/include/asm/raw_io.h:466:3: note: in expansion of macro 'rom_out_le16' 466 | rom_out_le16(port, *buf++); | ^~~~~~~~~~~~ In file included from arch/m68k/include/asm/bug.h:32, from include/linux/bug.h:5, from include/linux/mmdebug.h:5, from include/linux/mm.h:9, from drivers/ide/gayle.c:12: include/linux/dma-mapping.h: In function 'dma_map_resource': arch/m68k/include/asm/page_mm.h:169:49: warning: ordered comparison of pointer with null pointer [-Wextra] 169 | #define virt_addr_valid(kaddr) ((void *)(kaddr) >= (void *)PAGE_OFFSET && (void *)(kaddr) < high_memory) | ^~ include/asm-generic/bug.h:144:27: note: in definition of macro 'WARN_ON_ONCE' 144 | int __ret_warn_once = !!(condition); \ | ^~~~~~~~~ arch/m68k/include/asm/page_mm.h:170:25: note: in expansion of macro 'virt_addr_valid' 170 | #define pfn_valid(pfn) virt_addr_valid(pfn_to_virt(pfn)) | ^~~~~~~~~~~~~~~ include/linux/dma-mapping.h:352:19: note: in expansion of macro 'pfn_valid' 352 | if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr)))) | ^~~~~~~~~ In file included from <command-line>: drivers/ide/gayle.c: At top level: >> arch/m68k/include/asm/amigayle.h:57:66: error: pasting ")" and "__279_185_amiga_gayle_ide_driver_init" does not give a valid preprocessing token 57 | #define gayle (*(volatile struct GAYLE *)(zTwoBase+GAYLE_ADDRESS)) | ^ 
include/linux/compiler_types.h:53:23: note: in definition of macro '___PASTE' 53 | #define ___PASTE(a,b) a##b | ^ >> include/linux/init.h:189:2: note: in expansion of macro '__PASTE' 189 | __PASTE(__KBUILD_MODNAME, \ | ^~~~~~~ >> <command-line>: note: in expansion of macro 'gayle' >> include/linux/init.h:189:10: note: in expansion of macro '__KBUILD_MODNAME' 189 | __PASTE(__KBUILD_MODNAME, \ | ^~~~~~~~~~~~~~~~ >> include/linux/init.h:236:35: note: in expansion of macro '__initcall_id' 236 | __unique_initcall(fn, id, __sec, __initcall_id(fn)) | ^~~~~~~~~~~~~ include/linux/init.h:238:35: note: in expansion of macro '___define_initcall' 238 | #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id) | ^~~~~~~~~~~~~~~~~~ include/linux/init.h:267:30: note: in expansion of macro '__define_initcall' 267 | #define device_initcall(fn) __define_initcall(fn, 6) | ^~~~~~~~~~~~~~~~~ >> include/linux/init.h:272:24: note: in expansion of macro 'device_initcall' 272 | #define __initcall(fn) device_initcall(fn) | ^~~~~~~~~~~~~~~ >> include/linux/module.h:88:24: note: in expansion of macro '__initcall' 88 | #define module_init(x) __initcall(x); | ^~~~~~~~~~ include/linux/platform_device.h:271:1: note: in expansion of macro 'module_init' 271 | module_init(__platform_driver##_init); \ | ^~~~~~~~~~~ drivers/ide/gayle.c:185:1: note: in expansion of macro 'module_platform_driver_probe' 185 | module_platform_driver_probe(amiga_gayle_ide_driver, amiga_gayle_ide_probe); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/init.h:200:10: error: pasting "__" and "(" does not give a valid preprocessing token 200 | __PASTE(__, \ | ^~ include/linux/compiler_types.h:53:23: note: in definition of macro '___PASTE' 53 | #define ___PASTE(a,b) a##b | ^ include/linux/init.h:200:2: note: in expansion of macro '__PASTE' 200 | __PASTE(__, \ | ^~~~~~~ >> include/linux/init.h:232:3: note: in expansion of macro '__initcall_name' 232 | __initcall_name(initcall, __iid, id), \ | ^~~~~~~~~~~~~~~ >> 
include/linux/init.h:236:2: note: in expansion of macro '__unique_initcall' 236 | __unique_initcall(fn, id, __sec, __initcall_id(fn)) | ^~~~~~~~~~~~~~~~~ include/linux/init.h:238:35: note: in expansion of macro '___define_initcall' 238 | #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id) | ^~~~~~~~~~~~~~~~~~ include/linux/init.h:267:30: note: in expansion of macro '__define_initcall' 267 | #define device_initcall(fn) __define_initcall(fn, 6) | ^~~~~~~~~~~~~~~~~ >> include/linux/init.h:272:24: note: in expansion of macro 'device_initcall' 272 | #define __initcall(fn) device_initcall(fn) | ^~~~~~~~~~~~~~~ >> include/linux/module.h:88:24: note: in expansion of macro '__initcall' 88 | #define module_init(x) __initcall(x); | ^~~~~~~~~~ include/linux/platform_device.h:271:1: note: in expansion of macro 'module_init' 271 | module_init(__platform_driver##_init); \ | ^~~~~~~~~~~ drivers/ide/gayle.c:185:1: note: in expansion of macro 'module_platform_driver_probe' 185 | module_platform_driver_probe(amiga_gayle_ide_driver, amiga_gayle_ide_probe); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from include/linux/printk.h:6, from include/linux/kernel.h:15, from include/asm-generic/bug.h:19, from arch/m68k/include/asm/bug.h:32, from include/linux/bug.h:5, from include/linux/mmdebug.h:5, from include/linux/mm.h:9, from drivers/ide/gayle.c:12: >> arch/m68k/include/asm/amigayle.h:57:16: error: expected declaration specifiers or '...' 
before '*' token 57 | #define gayle (*(volatile struct GAYLE *)(zTwoBase+GAYLE_ADDRESS)) | ^ include/linux/init.h:226:20: note: in definition of macro '____define_initcall' 226 | static initcall_t __name __used \ | ^~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/init.h:198:2: note: in expansion of macro '__PASTE' 198 | __PASTE(__, \ | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/init.h:199:2: note: in expansion of macro '__PASTE' 199 | __PASTE(prefix, \ | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/init.h:200:2: note: in expansion of macro '__PASTE' 200 | __PASTE(__, \ | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/init.h:201:2: note: in expansion of macro '__PASTE' 201 | __PASTE(__iid, id)))) | ^~~~~~~ >> include/linux/init.h:232:3: note: in expansion of macro '__initcall_name' 232 | __initcall_name(initcall, __iid, id), \ | ^~~~~~~~~~~~~~~ >> include/linux/init.h:236:2: note: in expansion of macro '__unique_initcall' 236 | __unique_initcall(fn, id, __sec, __initcall_id(fn)) | ^~~~~~~~~~~~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ >> include/linux/init.h:189:2: note: in expansion of macro '__PASTE' 189 | __PASTE(__KBUILD_MODNAME, \ | ^~~~~~~ >> <command-line>: note: in expansion of macro 'gayle' >> include/linux/init.h:189:10: note: in expansion of macro '__KBUILD_MODNAME' 189 | __PASTE(__KBUILD_MODNAME, \ | ^~~~~~~~~~~~~~~~ >> include/linux/init.h:236:35: note: in expansion of macro '__initcall_id' 236 | __unique_initcall(fn, id, __sec, __initcall_id(fn)) | 
^~~~~~~~~~~~~ include/linux/init.h:238:35: note: in expansion of macro '___define_initcall' 238 | #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id) | ^~~~~~~~~~~~~~~~~~ include/linux/init.h:267:30: note: in expansion of macro '__define_initcall' 267 | #define device_initcall(fn) __define_initcall(fn, 6) | ^~~~~~~~~~~~~~~~~ >> include/linux/init.h:272:24: note: in expansion of macro 'device_initcall' 272 | #define __initcall(fn) device_initcall(fn) | ^~~~~~~~~~~~~~~ include/linux/module.h:88:24: note: in expansion of macro '__initcall' 88 | #define module_init(x) __initcall(x); | ^~~~~~~~~~ include/linux/platform_device.h:271:1: note: in expansion of macro 'module_init' 271 | module_init(__platform_driver##_init); \ | ^~~~~~~~~~~ drivers/ide/gayle.c:185:1: note: in expansion of macro 'module_platform_driver_probe' 185 | module_platform_driver_probe(amiga_gayle_ide_driver, amiga_gayle_ide_probe); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from drivers/ide/gayle.c:19: drivers/ide/gayle.c:185:30: warning: 'amiga_gayle_ide_driver_init' defined but not used [-Wunused-function] 185 | module_platform_driver_probe(amiga_gayle_ide_driver, amiga_gayle_ide_probe); | ^~~~~~~~~~~~~~~~~~~~~~ include/linux/platform_device.h:266:19: note: in definition of macro 'module_platform_driver_probe' 266 | static int __init __platform_driver##_init(void) \ | ^~~~~~~~~~~~~~~~~ vim +200 include/linux/init.h 170 171 /* 172 * initcalls are now grouped by functionality into separate 173 * subsections. Ordering inside the subsections is determined 174 * by link order. 175 * For backwards compatibility, initcall() puts the call in 176 * the device init subsection. 177 * 178 * The `id' arg to __define_initcall() is needed so that multiple initcalls 179 * can point at the same handler without causing duplicate-symbol build errors. 180 * 181 * Initcalls are run by placing pointers in initcall sections that the 182 * kernel iterates at runtime. 
The linker can do dead code / data elimination 183 * and remove that completely, so the initcall sections have to be marked 184 * as KEEP() in the linker script. 185 */ 186 187 /* Format: <modname>__<counter>_<line>_<fn> */ 188 #define __initcall_id(fn) \ > 189 __PASTE(__KBUILD_MODNAME, \ 190 __PASTE(__, \ 191 __PASTE(__COUNTER__, \ 192 __PASTE(_, \ 193 __PASTE(__LINE__, \ 194 __PASTE(_, fn)))))) 195 196 /* Format: __<prefix>__<iid><id> */ 197 #define __initcall_name(prefix, __iid, id) \ 198 __PASTE(__, \ 199 __PASTE(prefix, \ > 200 __PASTE(__, \ 201 __PASTE(__iid, id)))) 202 203 #ifdef CONFIG_LTO_CLANG 204 /* 205 * With LTO, the compiler doesn't necessarily obey link order for 206 * initcalls. In order to preserve the correct order, we add each 207 * variable into its own section and generate a linker script (in 208 * scripts/link-vmlinux.sh) to specify the order of the sections. 209 */ 210 #define __initcall_section(__sec, __iid) \ 211 #__sec ".init.." #__iid 212 #else 213 #define __initcall_section(__sec, __iid) \ 214 #__sec ".init" 215 #endif 216 217 #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS 218 #define ____define_initcall(fn, __name, __sec) \ 219 __ADDRESSABLE(fn) \ 220 asm(".section \"" __sec "\", \"a\" \n" \ 221 __stringify(__name) ": \n" \ 222 ".long " #fn " - . \n" \ 223 ".previous \n"); 224 #else 225 #define ____define_initcall(fn, __name, __sec) \ 226 static initcall_t __name __used \ 227 __attribute__((__section__(__sec))) = fn; 228 #endif 229 230 #define __unique_initcall(fn, id, __sec, __iid) \ 231 ____define_initcall(fn, \ > 232 __initcall_name(initcall, __iid, id), \ 233 __initcall_section(__sec, __iid)) 234 235 #define ___define_initcall(fn, id, __sec) \ > 236 __unique_initcall(fn, id, __sec, __initcall_id(fn)) 237 238 #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id) 239 240 /* 241 * Early initcalls run before initializing SMP. 242 * 243 * Only for built-in code, not modules. 
244 */ 245 #define early_initcall(fn) __define_initcall(fn, early) 246 247 /* 248 * A "pure" initcall has no dependencies on anything else, and purely 249 * initializes variables that couldn't be statically initialized. 250 * 251 * This only exists for built-in code, not for modules. 252 * Keep main.c:initcall_level_names[] in sync. 253 */ 254 #define pure_initcall(fn) __define_initcall(fn, 0) 255 256 #define core_initcall(fn) __define_initcall(fn, 1) 257 #define core_initcall_sync(fn) __define_initcall(fn, 1s) 258 #define postcore_initcall(fn) __define_initcall(fn, 2) 259 #define postcore_initcall_sync(fn) __define_initcall(fn, 2s) 260 #define arch_initcall(fn) __define_initcall(fn, 3) 261 #define arch_initcall_sync(fn) __define_initcall(fn, 3s) 262 #define subsys_initcall(fn) __define_initcall(fn, 4) 263 #define subsys_initcall_sync(fn) __define_initcall(fn, 4s) 264 #define fs_initcall(fn) __define_initcall(fn, 5) 265 #define fs_initcall_sync(fn) __define_initcall(fn, 5s) 266 #define rootfs_initcall(fn) __define_initcall(fn, rootfs) 267 #define device_initcall(fn) __define_initcall(fn, 6) 268 #define device_initcall_sync(fn) __define_initcall(fn, 6s) 269 #define late_initcall(fn) __define_initcall(fn, 7) 270 #define late_initcall_sync(fn) __define_initcall(fn, 7s) 271 > 272 #define __initcall(fn) device_initcall(fn) 273 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 16942 bytes --] [-- Attachment #3: Type: text/plain, Size: 176 bytes --] _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 09/22] init: lto: ensure initcall ordering 2020-06-24 20:31 ` [PATCH 09/22] init: lto: ensure initcall ordering Sami Tolvanen 2020-06-25 0:58 ` kernel test robot @ 2020-06-25 4:19 ` kernel test robot 1 sibling, 0 replies; 212+ messages in thread From: kernel test robot @ 2020-06-25 4:19 UTC (permalink / raw) To: Sami Tolvanen, Masahiro Yamada, Will Deacon Cc: linux-arch, kbuild-all, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Nick Desaulniers, clang-built-linux, linux-arm-kernel [-- Attachment #1: Type: text/plain, Size: 11769 bytes --] Hi Sami, Thank you for the patch! Yet something to improve: [auto build test ERROR on 26e122e97a3d0390ebec389347f64f3730fdf48f] url: https://github.com/0day-ci/linux/commits/Sami-Tolvanen/add-support-for-Clang-LTO/20200625-043816 base: 26e122e97a3d0390ebec389347f64f3730fdf48f config: m68k-allyesconfig (attached as .config) compiler: m68k-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): In file included from arch/m68k/include/asm/io_mm.h:25, from arch/m68k/include/asm/io.h:8, from include/linux/io.h:13, from include/linux/irq.h:20, from include/asm-generic/hardirq.h:13, from ./arch/m68k/include/generated/asm/hardirq.h:1, from include/linux/hardirq.h:10, from include/linux/interrupt.h:11, from drivers/ide/gayle.c:13: arch/m68k/include/asm/raw_io.h: In function 'raw_rom_outsb': arch/m68k/include/asm/raw_io.h:83:7: warning: variable '__w' set but not used [-Wunused-but-set-variable] 83 | ({u8 __w, __v = (b); u32 _addr = ((u32) (addr)); \ | ^~~ arch/m68k/include/asm/raw_io.h:430:3: note: in expansion of macro 
'rom_out_8' 430 | rom_out_8(port, *buf++); | ^~~~~~~~~ arch/m68k/include/asm/raw_io.h: In function 'raw_rom_outsw': arch/m68k/include/asm/raw_io.h:86:8: warning: variable '__w' set but not used [-Wunused-but-set-variable] 86 | ({u16 __w, __v = (w); u32 _addr = ((u32) (addr)); \ | ^~~ arch/m68k/include/asm/raw_io.h:448:3: note: in expansion of macro 'rom_out_be16' 448 | rom_out_be16(port, *buf++); | ^~~~~~~~~~~~ arch/m68k/include/asm/raw_io.h: In function 'raw_rom_outsw_swapw': arch/m68k/include/asm/raw_io.h:90:8: warning: variable '__w' set but not used [-Wunused-but-set-variable] 90 | ({u16 __w, __v = (w); u32 _addr = ((u32) (addr)); \ | ^~~ arch/m68k/include/asm/raw_io.h:466:3: note: in expansion of macro 'rom_out_le16' 466 | rom_out_le16(port, *buf++); | ^~~~~~~~~~~~ In file included from include/asm-generic/bug.h:5, from arch/m68k/include/asm/bug.h:32, from include/linux/bug.h:5, from include/linux/mmdebug.h:5, from include/linux/mm.h:9, from drivers/ide/gayle.c:12: include/linux/scatterlist.h: In function 'sg_set_buf': arch/m68k/include/asm/page_mm.h:169:49: warning: ordered comparison of pointer with null pointer [-Wextra] 169 | #define virt_addr_valid(kaddr) ((void *)(kaddr) >= (void *)PAGE_OFFSET && (void *)(kaddr) < high_memory) | ^~ include/linux/compiler.h:78:42: note: in definition of macro 'unlikely' 78 | # define unlikely(x) __builtin_expect(!!(x), 0) | ^ include/linux/scatterlist.h:143:2: note: in expansion of macro 'BUG_ON' 143 | BUG_ON(!virt_addr_valid(buf)); | ^~~~~~ include/linux/scatterlist.h:143:10: note: in expansion of macro 'virt_addr_valid' 143 | BUG_ON(!virt_addr_valid(buf)); | ^~~~~~~~~~~~~~~ In file included from arch/m68k/include/asm/bug.h:32, from include/linux/bug.h:5, from include/linux/mmdebug.h:5, from include/linux/mm.h:9, from drivers/ide/gayle.c:12: include/linux/dma-mapping.h: In function 'dma_map_resource': arch/m68k/include/asm/page_mm.h:169:49: warning: ordered comparison of pointer with null pointer [-Wextra] 169 | #define 
virt_addr_valid(kaddr) ((void *)(kaddr) >= (void *)PAGE_OFFSET && (void *)(kaddr) < high_memory) | ^~ include/asm-generic/bug.h:144:27: note: in definition of macro 'WARN_ON_ONCE' 144 | int __ret_warn_once = !!(condition); \ | ^~~~~~~~~ arch/m68k/include/asm/page_mm.h:170:25: note: in expansion of macro 'virt_addr_valid' 170 | #define pfn_valid(pfn) virt_addr_valid(pfn_to_virt(pfn)) | ^~~~~~~~~~~~~~~ include/linux/dma-mapping.h:352:19: note: in expansion of macro 'pfn_valid' 352 | if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr)))) | ^~~~~~~~~ In file included from <command-line>: drivers/ide/gayle.c: At top level: >> arch/m68k/include/asm/amigayle.h:57:66: error: pasting ")" and "__282_185_amiga_gayle_ide_driver_init" does not give a valid preprocessing token 57 | #define gayle (*(volatile struct GAYLE *)(zTwoBase+GAYLE_ADDRESS)) | ^ include/linux/compiler_types.h:53:23: note: in definition of macro '___PASTE' 53 | #define ___PASTE(a,b) a##b | ^ include/linux/init.h:189:2: note: in expansion of macro '__PASTE' 189 | __PASTE(__KBUILD_MODNAME, \ | ^~~~~~~ <command-line>: note: in expansion of macro 'gayle' include/linux/init.h:189:10: note: in expansion of macro '__KBUILD_MODNAME' 189 | __PASTE(__KBUILD_MODNAME, \ | ^~~~~~~~~~~~~~~~ include/linux/init.h:236:35: note: in expansion of macro '__initcall_id' 236 | __unique_initcall(fn, id, __sec, __initcall_id(fn)) | ^~~~~~~~~~~~~ include/linux/init.h:238:35: note: in expansion of macro '___define_initcall' 238 | #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id) | ^~~~~~~~~~~~~~~~~~ include/linux/init.h:267:30: note: in expansion of macro '__define_initcall' 267 | #define device_initcall(fn) __define_initcall(fn, 6) | ^~~~~~~~~~~~~~~~~ include/linux/init.h:272:24: note: in expansion of macro 'device_initcall' 272 | #define __initcall(fn) device_initcall(fn) | ^~~~~~~~~~~~~~~ include/linux/module.h:88:24: note: in expansion of macro '__initcall' 88 | #define module_init(x) __initcall(x); | 
^~~~~~~~~~ include/linux/platform_device.h:271:1: note: in expansion of macro 'module_init' 271 | module_init(__platform_driver##_init); \ | ^~~~~~~~~~~ drivers/ide/gayle.c:185:1: note: in expansion of macro 'module_platform_driver_probe' 185 | module_platform_driver_probe(amiga_gayle_ide_driver, amiga_gayle_ide_probe); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/init.h:200:10: error: pasting "__" and "(" does not give a valid preprocessing token 200 | __PASTE(__, \ | ^~ include/linux/compiler_types.h:53:23: note: in definition of macro '___PASTE' 53 | #define ___PASTE(a,b) a##b | ^ include/linux/init.h:200:2: note: in expansion of macro '__PASTE' 200 | __PASTE(__, \ | ^~~~~~~ include/linux/init.h:232:3: note: in expansion of macro '__initcall_name' 232 | __initcall_name(initcall, __iid, id), \ | ^~~~~~~~~~~~~~~ include/linux/init.h:236:2: note: in expansion of macro '__unique_initcall' 236 | __unique_initcall(fn, id, __sec, __initcall_id(fn)) | ^~~~~~~~~~~~~~~~~ include/linux/init.h:238:35: note: in expansion of macro '___define_initcall' 238 | #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id) | ^~~~~~~~~~~~~~~~~~ include/linux/init.h:267:30: note: in expansion of macro '__define_initcall' 267 | #define device_initcall(fn) __define_initcall(fn, 6) | ^~~~~~~~~~~~~~~~~ include/linux/init.h:272:24: note: in expansion of macro 'device_initcall' 272 | #define __initcall(fn) device_initcall(fn) | ^~~~~~~~~~~~~~~ include/linux/module.h:88:24: note: in expansion of macro '__initcall' 88 | #define module_init(x) __initcall(x); | ^~~~~~~~~~ include/linux/platform_device.h:271:1: note: in expansion of macro 'module_init' 271 | module_init(__platform_driver##_init); \ | ^~~~~~~~~~~ drivers/ide/gayle.c:185:1: note: in expansion of macro 'module_platform_driver_probe' 185 | module_platform_driver_probe(amiga_gayle_ide_driver, amiga_gayle_ide_probe); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from include/linux/printk.h:6, from 
include/linux/kernel.h:15, from include/asm-generic/bug.h:19, from arch/m68k/include/asm/bug.h:32, from include/linux/bug.h:5, from include/linux/mmdebug.h:5, from include/linux/mm.h:9, from drivers/ide/gayle.c:12: arch/m68k/include/asm/amigayle.h:57:16: error: expected declaration specifiers or '...' before '*' token 57 | #define gayle (*(volatile struct GAYLE *)(zTwoBase+GAYLE_ADDRESS)) | ^ include/linux/init.h:226:20: note: in definition of macro '____define_initcall' 226 | static initcall_t __name __used \ | ^~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/init.h:198:2: note: in expansion of macro '__PASTE' 198 | __PASTE(__, \ | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/init.h:199:2: note: in expansion of macro '__PASTE' 199 | __PASTE(prefix, \ | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/init.h:200:2: note: in expansion of macro '__PASTE' 200 | __PASTE(__, \ | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) vim +57 arch/m68k/include/asm/amigayle.h ^1da177e4c3f41 include/asm-m68k/amigayle.h Linus Torvalds 2005-04-16 53 ^1da177e4c3f41 include/asm-m68k/amigayle.h Linus Torvalds 2005-04-16 54 #define GAYLE_RESET (0xa40000) /* write 0x00 to start reset, ^1da177e4c3f41 include/asm-m68k/amigayle.h Linus Torvalds 2005-04-16 55 read 1 byte to stop reset */ ^1da177e4c3f41 include/asm-m68k/amigayle.h Linus Torvalds 2005-04-16 56 ^1da177e4c3f41 include/asm-m68k/amigayle.h Linus Torvalds 2005-04-16 @57 #define gayle (*(volatile struct GAYLE *)(zTwoBase+GAYLE_ADDRESS)) ^1da177e4c3f41 include/asm-m68k/amigayle.h Linus Torvalds 2005-04-16 58 #define gayle_reset (*(volatile u_char 
*)(zTwoBase+GAYLE_RESET)) ^1da177e4c3f41 include/asm-m68k/amigayle.h Linus Torvalds 2005-04-16 59 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 57533 bytes --] [-- Attachment #3: Type: text/plain, Size: 176 bytes --] _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
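Reading the errors above, it appears that __KBUILD_MODNAME expands to `gayle` for drivers/ide/gayle.c, and that amigayle.h also defines `gayle` as an object-like macro. Because __PASTE() is a two-level macro, its arguments are macro-expanded before `##` is applied, so the pointer expression ends up being pasted instead of the plain name. The sketch below shows the same two-level paste in user space. All DEMO_* names, the fixed numbers 282 and 185 (standing in for __COUNTER__ and __LINE__), and `demo_init` are assumptions chosen for illustration.

```c
/* Two-level paste and stringify, same shape as ___PASTE()/__PASTE(). */
#define DEMO_PASTE_(a, b) a##b
#define DEMO_PASTE(a, b)  DEMO_PASTE_(a, b)
#define DEMO_STR_(x)      #x
#define DEMO_STR(x)       DEMO_STR_(x)

/* Hypothetical stand-in for a module name passed on the command line. */
#define DEMO_MODNAME demo

/*
 * Uncommenting the next line should reproduce the class of error in the
 * report: 'demo' would be expanded to an expression ending in ')' before
 * the paste, and ')' cannot be pasted onto an identifier.
 */
/* #define demo (*(volatile int *)0) */

/* Format mirrors <modname>__<counter>_<line>_<fn>, with fixed numbers. */
#define DEMO_INITCALL_ID(fn)                     \
    DEMO_PASTE(DEMO_MODNAME,                     \
    DEMO_PASTE(__,                               \
    DEMO_PASTE(282,                              \
    DEMO_PASTE(_,                                \
    DEMO_PASTE(185,                              \
    DEMO_PASTE(_, fn))))))

/* Returns the generated identifier as text so it can be inspected. */
static const char *demo_initcall_id_string(void)
{
    return DEMO_STR(DEMO_INITCALL_ID(demo_init));
}
```

With `demo` left undefined as a macro, the pastes succeed and produce a single identifier; the failure only shows up when the module name collides with an existing macro.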
* [PATCH 10/22] init: lto: fix PREL32 relocations
  2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen
                   ` (8 preceding siblings ...)
  2020-06-24 20:31 ` [PATCH 09/22] init: lto: ensure initcall ordering Sami Tolvanen
@ 2020-06-24 20:31 ` Sami Tolvanen
  2020-06-24 20:31 ` [PATCH 11/22] pci: " Sami Tolvanen
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw)
  To: Masahiro Yamada, Will Deacon
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

With LTO, the compiler can rename static functions to avoid global
naming collisions. As initcall functions are typically static,
renaming can break references to them in inline assembly. This
change adds a global stub with a stable name for each initcall to
fix the issue when PREL32 relocations are used.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 include/linux/init.h | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/include/linux/init.h b/include/linux/init.h
index af638cd6dd52..5b4bdc5a8399 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -209,26 +209,48 @@ extern bool initcall_debug;
  */
 #define __initcall_section(__sec, __iid)                       \
         #__sec ".init.." #__iid
+
+/*
+ * With LTO, the compiler can rename static functions to avoid
+ * global naming collisions. We use a global stub function for
+ * initcalls to create a stable symbol name whose address can be
+ * taken in inline assembly when PREL32 relocations are used.
+ */
+#define __initcall_stub(fn, __iid, id)                         \
+        __initcall_name(initstub, __iid, id)
+
+#define __define_initcall_stub(__stub, fn)                     \
+        int __init __stub(void)                                \
+        {                                                      \
+                return fn();                                   \
+        }                                                      \
+        __ADDRESSABLE(__stub)
 #else
 #define __initcall_section(__sec, __iid)                       \
         #__sec ".init"
+
+#define __initcall_stub(fn, __iid, id)  fn
+
+#define __define_initcall_stub(__stub, fn)                     \
+        __ADDRESSABLE(fn)
 #endif
 
 #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
-#define ____define_initcall(fn, __name, __sec)                 \
-        __ADDRESSABLE(fn)                                      \
+#define ____define_initcall(fn, __stub, __name, __sec)         \
+        __define_initcall_stub(__stub, fn)                     \
         asm(".section \"" __sec "\", \"a\" \n"                 \
             __stringify(__name) ": \n"                         \
-            ".long " #fn " - . \n"                             \
+            ".long " __stringify(__stub) " - . \n"             \
             ".previous \n");
 #else
-#define ____define_initcall(fn, __name, __sec)                 \
+#define ____define_initcall(fn, __unused, __name, __sec)       \
         static initcall_t __name __used                        \
                 __attribute__((__section__(__sec))) = fn;
 #endif
 
 #define __unique_initcall(fn, id, __sec, __iid)                \
         ____define_initcall(fn,                                \
+                __initcall_stub(fn, __iid, id),                \
                 __initcall_name(initcall, __iid, id),          \
                 __initcall_section(__sec, __iid))
 
-- 
2.27.0.212.ge8ba1cc988-goog

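The idea in the patch above, a non-static stub with a predictable name that forwards to the static initcall, can be sketched in plain user-space C. This is only an illustration of the forwarding pattern: it uses an ordinary pointer table rather than PREL32 relocations or inline assembly, and every demo_* name, including the stub name, is hypothetical.

```c
#include <stddef.h>

typedef int (*demo_initcall_t)(void);

/* A static init function; under LTO its symbol name may be changed. */
static int demo_driver_init(void)
{
    return 42;
}

/*
 * Define a global stub with a caller-chosen, stable name that simply
 * forwards to the static function. The prototype comes first so the
 * non-static definition has a prior declaration.
 */
#define DEMO_DEFINE_INITCALL_STUB(stub, fn) \
    int stub(void);                         \
    int stub(void)                          \
    {                                       \
        return fn();                        \
    }

DEMO_DEFINE_INITCALL_STUB(demo_initstub__demo__0_1_demo_driver_init6,
                          demo_driver_init)

/* The table refers to the stable stub name, not to the static function. */
static const demo_initcall_t demo_initcalls[] = {
    demo_initstub__demo__0_1_demo_driver_init6,
};

/* Run every entry in order and add up the return values. */
static int demo_run_initcalls(void)
{
    int sum = 0;
    size_t i;

    for (i = 0; i < sizeof(demo_initcalls) / sizeof(demo_initcalls[0]); i++)
        sum += demo_initcalls[i]();
    return sum;
}
```

The forwarding call costs one extra jump per initcall, but a reference made by name (as the inline assembly in the patch does) no longer depends on what the compiler decides to call the static function.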
* [PATCH 11/22] pci: lto: fix PREL32 relocations
  2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen
                   ` (9 preceding siblings ...)
  2020-06-24 20:31 ` [PATCH 10/22] init: lto: fix PREL32 relocations Sami Tolvanen
@ 2020-06-24 20:31 ` Sami Tolvanen
  2020-06-24 22:49   ` kernel test robot
  2020-07-17 20:26   ` Bjorn Helgaas
  2020-06-24 20:31 ` [PATCH 12/22] modpost: lto: strip .lto from module names Sami Tolvanen
                   ` (14 subsequent siblings)
  25 siblings, 2 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw)
  To: Masahiro Yamada, Will Deacon
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

With LTO, the compiler can rename static functions to avoid global
naming collisions. As PCI fixup functions are typically static,
renaming can break references to them in inline assembly. This
change adds a global stub to DECLARE_PCI_FIXUP_SECTION to fix the
issue when PREL32 relocations are used.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 include/linux/pci.h | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index c79d83304e52..1e65e16f165a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1909,19 +1909,24 @@ enum pci_fixup_pass {
 };
 
 #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
-#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,  \
-                                    class_shift, hook)                 \
-        __ADDRESSABLE(hook)                                            \
+#define ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
+                                    class_shift, hook, stub)           \
+        void stub(struct pci_dev *dev) { hook(dev); }                  \
         asm(".section " #sec ", \"a\" \n"                              \
             ".balign 16 \n"                                            \
             ".short " #vendor ", " #device " \n"                       \
             ".long " #class ", " #class_shift " \n"                    \
-            ".long " #hook " - . \n"                                   \
+            ".long " #stub " - . \n"                                   \
             ".previous \n");
+
+#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,  \
+                                  class_shift, hook, stub)             \
+        ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
+                                  class_shift, hook, stub)
 #define DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,    \
                                   class_shift, hook)                   \
         __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,  \
-                                  class_shift, hook)
+                                  class_shift, hook, __UNIQUE_ID(hook))
 #else
 /* Anonymous variables would be nice... */
 #define DECLARE_PCI_FIXUP_SECTION(section, name, vendor, device, class, \
-- 
2.27.0.212.ge8ba1cc988-goog

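The patch above passes __UNIQUE_ID(hook) through an extra macro level so that the generated stub name is expanded once and then reused wherever the stub is named. A rough user-space sketch of that pattern follows. It assumes a compiler that provides __COUNTER__ (GCC and Clang do), uses a plain function pointer instead of a PREL32 table entry, and all DEMO_*/demo_* names are hypothetical. The prototype emitted before each stub is an addition of this sketch, not something the patch does; it is one way to give a non-static function a prior declaration.

```c
/* Hypothetical stand-in for struct pci_dev. */
struct demo_pci_dev {
    int quirks_applied;
};

#define DEMO_PASTE_(a, b) a##b
#define DEMO_PASTE(a, b)  DEMO_PASTE_(a, b)

/* Unique name per expansion; relies on the __COUNTER__ extension. */
#define DEMO_UNIQUE_ID(prefix) \
    DEMO_PASTE(DEMO_PASTE(demo_unique_, prefix), __COUNTER__)

/*
 * Inner level: 'stub' arrives already expanded, so the same name is used
 * for the prototype, the definition, and the table entry.
 */
#define DEMO_FIXUP_(entry, hook, stub)                              \
    void stub(struct demo_pci_dev *dev);                            \
    void stub(struct demo_pci_dev *dev) { hook(dev); }              \
    static void (*const entry)(struct demo_pci_dev *) = stub

/* Outer level: generate the unique stub name exactly once. */
#define DEMO_FIXUP(entry, hook) \
    DEMO_FIXUP_(entry, hook, DEMO_UNIQUE_ID(hook))

static void demo_quirk_a(struct demo_pci_dev *dev) { dev->quirks_applied += 1; }
static void demo_quirk_b(struct demo_pci_dev *dev) { dev->quirks_applied += 10; }

/* The same hook may be registered twice; each gets its own stub symbol. */
DEMO_FIXUP(demo_fixup_a,  demo_quirk_a);
DEMO_FIXUP(demo_fixup_b,  demo_quirk_b);
DEMO_FIXUP(demo_fixup_a2, demo_quirk_a);

/* Apply all three fixups to one device and report the accumulated value. */
static int demo_run_fixups(void)
{
    struct demo_pci_dev dev = { 0 };

    demo_fixup_a(&dev);
    demo_fixup_b(&dev);
    demo_fixup_a2(&dev);
    return dev.quirks_applied;
}
```

Without the unique suffix, registering the same hook for several device IDs would define the same global stub more than once, which is presumably why the patch derives the stub name from __UNIQUE_ID().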
* Re: [PATCH 11/22] pci: lto: fix PREL32 relocations 2020-06-24 20:31 ` [PATCH 11/22] pci: " Sami Tolvanen @ 2020-06-24 22:49 ` kernel test robot 2020-06-24 23:03 ` Nick Desaulniers 2020-07-17 20:26 ` Bjorn Helgaas 1 sibling, 1 reply; 212+ messages in thread From: kernel test robot @ 2020-06-24 22:49 UTC (permalink / raw) To: Sami Tolvanen, Masahiro Yamada, Will Deacon Cc: linux-arch, kbuild-all, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Nick Desaulniers, clang-built-linux, linux-arm-kernel [-- Attachment #1: Type: text/plain, Size: 34069 bytes --] Hi Sami, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on 26e122e97a3d0390ebec389347f64f3730fdf48f] url: https://github.com/0day-ci/linux/commits/Sami-Tolvanen/add-support-for-Clang-LTO/20200625-043816 base: 26e122e97a3d0390ebec389347f64f3730fdf48f config: i386-alldefconfig (attached as .config) compiler: gcc-9 (Debian 9.3.0-13) 9.3.0 reproduce (this is a W=1 build): # save the attached .config to linux build tree make W=1 ARCH=i386 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): In file included from arch/x86/kernel/pci-dma.c:9: >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_via_no_dac190' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ >> include/linux/compiler-gcc.h:72:29: note: in 
expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ >> include/linux/pci.h:1949:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1949 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ >> arch/x86/kernel/pci-dma.c:154:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_CLASS_FINAL' 154 | DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_VIA, PCI_ANY_ID, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -- In file included from arch/x86/kernel/quirks.c:6: >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet180' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ 
include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ include/linux/pci.h:1976:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ >> arch/x86/kernel/quirks.c:156:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' 156 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ESB2_0, | ^~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet181' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ 
include/linux/pci.h:1976:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/kernel/quirks.c:158:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' 158 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH6_0, | ^~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet182' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ include/linux/pci.h:1976:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/kernel/quirks.c:160:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' 160 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 
PCI_DEVICE_ID_INTEL_ICH6_1, | ^~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet183' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ include/linux/pci.h:1976:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/kernel/quirks.c:162:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' 162 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_0, | ^~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet184' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of 
macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ include/linux/pci.h:1976:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/kernel/quirks.c:164:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' 164 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_1, | ^~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet185' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of 
macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ include/linux/pci.h:1976:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/kernel/quirks.c:166:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' 166 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_31, | ^~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet186' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) -- In file included from drivers/pci/vpd.c:8: >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_f0_vpd_link180' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ 
include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ include/linux/pci.h:1941:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1941 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_early, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ >> drivers/pci/vpd.c:543:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_CLASS_EARLY' 543 | DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd181' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | 
^~~~~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ include/linux/pci.h:1979:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1979 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ >> drivers/pci/vpd.c:560:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_FINAL' 560 | DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0060, quirk_blacklist_vpd); | ^~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd182' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ 
>> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ include/linux/pci.h:1979:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1979 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ drivers/pci/vpd.c:561:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_FINAL' 561 | DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x007c, quirk_blacklist_vpd); | ^~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd183' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/pci.h:1929:26: note: in expansion 
of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ include/linux/pci.h:1979:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1979 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ drivers/pci/vpd.c:562:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_FINAL' 562 | DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0413, quirk_blacklist_vpd); | ^~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd184' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ include/linux/pci.h:1979:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1979 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ drivers/pci/vpd.c:563:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_FINAL' 563 | 
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0078, quirk_blacklist_vpd); | ^~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd185' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~~~~~~ include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ | ^~~~ include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' 54 | #define __PASTE(a,b) ___PASTE(a,b) | ^~~~~~~~ include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) | ^~~~~~~ include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' 1929 | class_shift, hook, __UNIQUE_ID(hook)) | ^~~~~~~~~~~ include/linux/pci.h:1979:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' 1979 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ | ^~~~~~~~~~~~~~~~~~~~~~~~~ drivers/pci/vpd.c:564:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_FINAL' 564 | DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0079, quirk_blacklist_vpd); | ^~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd186' [-Wmissing-prototypes] 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) .. 
vim +/__DECLARE_PCI_FIXUP_SECTION +1928 include/linux/pci.h

^1da177e4c3f415 Linus Torvalds  2005-04-16  1910
c9d8b55fa019162 Ard Biesheuvel  2018-08-21  1911  #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
b1b820bb0420d08 Sami Tolvanen   2020-06-24  1912  #define ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
b1b820bb0420d08 Sami Tolvanen   2020-06-24  1913      class_shift, hook, stub) \
b1b820bb0420d08 Sami Tolvanen   2020-06-24 @1914      void stub(struct pci_dev *dev) { hook(dev); } \
c9d8b55fa019162 Ard Biesheuvel  2018-08-21  1915      asm(".section " #sec ", \"a\" \n" \
c9d8b55fa019162 Ard Biesheuvel  2018-08-21  1916          ".balign 16 \n" \
c9d8b55fa019162 Ard Biesheuvel  2018-08-21  1917          ".short " #vendor ", " #device " \n" \
c9d8b55fa019162 Ard Biesheuvel  2018-08-21  1918          ".long " #class ", " #class_shift " \n" \
b1b820bb0420d08 Sami Tolvanen   2020-06-24  1919          ".long " #stub " - . \n" \
c9d8b55fa019162 Ard Biesheuvel  2018-08-21  1920          ".previous \n");
b1b820bb0420d08 Sami Tolvanen   2020-06-24  1921
b1b820bb0420d08 Sami Tolvanen   2020-06-24  1922  #define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
b1b820bb0420d08 Sami Tolvanen   2020-06-24  1923      class_shift, hook, stub) \
b1b820bb0420d08 Sami Tolvanen   2020-06-24  1924      ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
b1b820bb0420d08 Sami Tolvanen   2020-06-24  1925      class_shift, hook, stub)
c9d8b55fa019162 Ard Biesheuvel  2018-08-21  1926  #define DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
c9d8b55fa019162 Ard Biesheuvel  2018-08-21  1927      class_shift, hook) \
c9d8b55fa019162 Ard Biesheuvel  2018-08-21 @1928      __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
b1b820bb0420d08 Sami Tolvanen   2020-06-24 @1929      class_shift, hook, __UNIQUE_ID(hook))
c9d8b55fa019162 Ard Biesheuvel  2018-08-21  1930  #else
^1da177e4c3f415 Linus Torvalds  2005-04-16  1931  /* Anonymous variables would be nice... */
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1932  #define DECLARE_PCI_FIXUP_SECTION(section, name, vendor, device, class, \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1933      class_shift, hook) \
ecf61c78bd787b9 Michal Marek    2013-11-11  1934      static const struct pci_fixup __PASTE(__pci_fixup_##name,__LINE__) __used \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1935      __attribute__((__section__(#section), aligned((sizeof(void *))))) \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1936      = { vendor, device, class, class_shift, hook };
c9d8b55fa019162 Ard Biesheuvel  2018-08-21  1937  #endif
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1938
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1939  #define DECLARE_PCI_FIXUP_CLASS_EARLY(vendor, device, class, \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1940      class_shift, hook) \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1941      DECLARE_PCI_FIXUP_SECTION(.pci_fixup_early, \
ecf61c78bd787b9 Michal Marek    2013-11-11  1942      hook, vendor, device, class, class_shift, hook)
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1943  #define DECLARE_PCI_FIXUP_CLASS_HEADER(vendor, device, class, \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1944      class_shift, hook) \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1945      DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \
ecf61c78bd787b9 Michal Marek    2013-11-11  1946      hook, vendor, device, class, class_shift, hook)
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1947  #define DECLARE_PCI_FIXUP_CLASS_FINAL(vendor, device, class, \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1948      class_shift, hook) \
f4ca5c6a56278ca Yinghai Lu      2012-02-23 @1949      DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \
ecf61c78bd787b9 Michal Marek    2013-11-11  1950      hook, vendor, device, class, class_shift, hook)
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1951  #define DECLARE_PCI_FIXUP_CLASS_ENABLE(vendor, device, class, \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1952      class_shift, hook) \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1953      DECLARE_PCI_FIXUP_SECTION(.pci_fixup_enable, \
ecf61c78bd787b9 Michal Marek    2013-11-11  1954      hook, vendor, device, class, class_shift, hook)
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1955  #define DECLARE_PCI_FIXUP_CLASS_RESUME(vendor, device, class, \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1956      class_shift, hook) \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1957      DECLARE_PCI_FIXUP_SECTION(.pci_fixup_resume, \
0aa0f5d1084ca1c Bjorn Helgaas   2017-12-02  1958      resume##hook, vendor, device, class, class_shift, hook)
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1959  #define DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY(vendor, device, class, \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1960      class_shift, hook) \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1961      DECLARE_PCI_FIXUP_SECTION(.pci_fixup_resume_early, \
0aa0f5d1084ca1c Bjorn Helgaas   2017-12-02  1962      resume_early##hook, vendor, device, class, class_shift, hook)
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1963  #define DECLARE_PCI_FIXUP_CLASS_SUSPEND(vendor, device, class, \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1964      class_shift, hook) \
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1965      DECLARE_PCI_FIXUP_SECTION(.pci_fixup_suspend, \
0aa0f5d1084ca1c Bjorn Helgaas   2017-12-02  1966      suspend##hook, vendor, device, class, class_shift, hook)
7d2a01b87f1682f Andreas Noever  2014-06-03  1967  #define DECLARE_PCI_FIXUP_CLASS_SUSPEND_LATE(vendor, device, class, \
7d2a01b87f1682f Andreas Noever  2014-06-03  1968      class_shift, hook) \
7d2a01b87f1682f Andreas Noever  2014-06-03  1969      DECLARE_PCI_FIXUP_SECTION(.pci_fixup_suspend_late, \
0aa0f5d1084ca1c Bjorn Helgaas   2017-12-02  1970      suspend_late##hook, vendor, device, class, class_shift, hook)
f4ca5c6a56278ca Yinghai Lu      2012-02-23  1971

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 16216 bytes --]

[-- Attachment #3: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
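The `".long " #stub " - . \n"` line in the pci.h listing above stores the distance from the table slot to the stub rather than the stub's absolute address, which is what a PREL32 relocation resolves. A minimal user-space sketch of that idea, assuming a data target instead of a function so that it stays portable C; `struct rel_entry` and the `rel_*` helpers are hypothetical names for illustration, not kernel APIs:

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a PREL32-style table entry: instead of an
 * absolute pointer it holds a signed 32-bit offset, measured from the
 * address of the offset field itself (the "target - ." form).
 */
struct rel_entry {
	int32_t target_offset;	/* distance from this field to the target */
	int target;		/* kept adjacent so the distance fits in 32 bits */
};

/* Fill in the entry: store the value and the relative offset to it. */
static void rel_entry_init(struct rel_entry *e, int value)
{
	e->target = value;
	e->target_offset = (int32_t)((intptr_t)&e->target -
				     (intptr_t)&e->target_offset);
}

/* Recover the absolute address: address of the field plus its content. */
static void *rel_offset_to_ptr(const int32_t *off)
{
	return (void *)((intptr_t)off + *off);
}

/* Follow the relative reference and read the target. */
static int rel_entry_read(const struct rel_entry *e)
{
	return *(int *)rel_offset_to_ptr(&e->target_offset);
}
```

Because the offset is relative to the slot itself, a copy of the whole entry still resolves to its own `target` without any adjustment.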
* Re: [PATCH 11/22] pci: lto: fix PREL32 relocations
  2020-06-24 22:49 ` kernel test robot
@ 2020-06-24 23:03 ` Nick Desaulniers
  2020-06-24 23:21   ` Sami Tolvanen
  0 siblings, 1 reply; 212+ messages in thread
From: Nick Desaulniers @ 2020-06-24 23:03 UTC (permalink / raw)
  To: kernel test robot
  Cc: linux-arch, kbuild-all, Kees Cook, Paul E. McKenney,
	Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada,
	clang-built-linux, Sami Tolvanen, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 3:50 PM kernel test robot <lkp@intel.com> wrote:
>
> Hi Sami,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on 26e122e97a3d0390ebec389347f64f3730fdf48f]
>
> url: https://github.com/0day-ci/linux/commits/Sami-Tolvanen/add-support-for-Clang-LTO/20200625-043816
> base: 26e122e97a3d0390ebec389347f64f3730fdf48f
> config: i386-alldefconfig (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-13) 9.3.0
> reproduce (this is a W=1 build):
>         # save the attached .config to linux build tree
>         make W=1 ARCH=i386

Note: W=1 ^

>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All warnings (new ones prefixed by >>):
>
>    In file included from arch/x86/kernel/pci-dma.c:9:
> >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_via_no_dac190' [-Wmissing-prototypes]
>       72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)
>          |                                             ^~~~~~~~~~~~
>    include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION'
>     1914 |  void stub(struct pci_dev *dev) { hook(dev); } \
>          |       ^~~~

Should `stub` be qualified as `static inline`? https://godbolt.org/z/cPBXxW
Or should stub be declared in this header, but implemented in a .c
file? (I'm guessing the former, since the `hook` callback comes from
the macro).
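For readers following the question above: -Wmissing-prototypes only asks that a non-static function have a declaration in scope before its definition. A minimal user-space sketch, assuming the stub has to remain a global symbol because the asm block refers to it by name; this `struct pci_dev`, `quirk_example`, and the two `DEFINE_STUB_*` macros are hypothetical stand-ins, not the kernel's macros and not necessarily the fix the series adopted:

```c
/* Hypothetical user-space stand-in for the kernel's struct pci_dev. */
struct pci_dev {
	int fixups_run;
};

/* A typical quirk: static, so only visible in this translation unit. */
static void quirk_example(struct pci_dev *dev)
{
	dev->fixups_run++;
}

/*
 * Shape of the stub in the patch: a global definition with no earlier
 * declaration, which is what -Wmissing-prototypes (enabled by W=1)
 * flags. Shown for comparison only; it is not instantiated here.
 */
#define DEFINE_STUB_NO_PROTO(stub, hook) \
	void stub(struct pci_dev *dev) { hook(dev); }

/*
 * One possible alternative: declare the stub immediately before
 * defining it. The symbol stays global, but a previous prototype now
 * exists, so the warning has nothing to report.
 */
#define DEFINE_STUB_WITH_PROTO(stub, hook) \
	void stub(struct pci_dev *dev); \
	void stub(struct pci_dev *dev) { hook(dev); }

/* Expands to a prototype followed by the definition of the stub. */
DEFINE_STUB_WITH_PROTO(quirk_example_stub, quirk_example)
```

Compiling this with `-Wmissing-prototypes` should stay quiet for `quirk_example_stub`, while swapping in `DEFINE_STUB_NO_PROTO` reproduces the class of warning quoted above.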
> >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > >> include/linux/pci.h:1949:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1949 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > >> arch/x86/kernel/pci-dma.c:154:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_CLASS_FINAL' > 154 | DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_VIA, PCI_ANY_ID, > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > -- > In file included from arch/x86/kernel/quirks.c:6: > >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet180' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > >> 
include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1976:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > >> arch/x86/kernel/quirks.c:156:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' > 156 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ESB2_0, > | ^~~~~~~~~~~~~~~~~~~~~~~~ > >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet181' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) 
__PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1976:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > arch/x86/kernel/quirks.c:158:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' > 158 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH6_0, > | ^~~~~~~~~~~~~~~~~~~~~~~~ > >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet182' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro 
'__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1976:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > arch/x86/kernel/quirks.c:160:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' > 160 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH6_1, > | ^~~~~~~~~~~~~~~~~~~~~~~~ > >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet183' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1976:2: 
note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > arch/x86/kernel/quirks.c:162:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' > 162 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_0, > | ^~~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet184' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1976:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > arch/x86/kernel/quirks.c:164:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' > 164 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 
PCI_DEVICE_ID_INTEL_ICH7_1, > | ^~~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet185' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1976:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1976 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > arch/x86/kernel/quirks.c:166:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_HEADER' > 166 | DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_31, > | ^~~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_ich_force_enable_hpet186' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | 
^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > -- > In file included from drivers/pci/vpd.c:8: > >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_f0_vpd_link180' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1941:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1941 | 
DECLARE_PCI_FIXUP_SECTION(.pci_fixup_early, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > >> drivers/pci/vpd.c:543:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_CLASS_EARLY' > 543 | DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd181' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1979:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1979 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > >> drivers/pci/vpd.c:560:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_FINAL' > 560 | DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0060, quirk_blacklist_vpd); > | ^~~~~~~~~~~~~~~~~~~~~~~ 
> >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd182' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > >> include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > >> include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > >> include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1979:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1979 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > drivers/pci/vpd.c:561:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_FINAL' > 561 | DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x007c, quirk_blacklist_vpd); > | ^~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd183' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: 
in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1979:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1979 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > drivers/pci/vpd.c:562:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_FINAL' > 562 | DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0413, quirk_blacklist_vpd); > | ^~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd184' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | 
^~~~~~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1979:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1979 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > drivers/pci/vpd.c:563:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_FINAL' > 563 | DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0078, quirk_blacklist_vpd); > | ^~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd185' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~~~~~~ > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > | ^~~~ > include/linux/pci.h:1928:2: note: in expansion of macro '__DECLARE_PCI_FIXUP_SECTION' > 1928 | __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:29: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) 
__PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/compiler_types.h:54:22: note: in expansion of macro '___PASTE' > 54 | #define __PASTE(a,b) ___PASTE(a,b) > | ^~~~~~~~ > include/linux/compiler-gcc.h:72:37: note: in expansion of macro '__PASTE' > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > | ^~~~~~~ > include/linux/pci.h:1929:26: note: in expansion of macro '__UNIQUE_ID' > 1929 | class_shift, hook, __UNIQUE_ID(hook)) > | ^~~~~~~~~~~ > include/linux/pci.h:1979:2: note: in expansion of macro 'DECLARE_PCI_FIXUP_SECTION' > 1979 | DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > drivers/pci/vpd.c:564:1: note: in expansion of macro 'DECLARE_PCI_FIXUP_FINAL' > 564 | DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0079, quirk_blacklist_vpd); > | ^~~~~~~~~~~~~~~~~~~~~~~ > include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_quirk_blacklist_vpd186' [-Wmissing-prototypes] > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > .. 
> > vim +/__DECLARE_PCI_FIXUP_SECTION +1928 include/linux/pci.h > > ^1da177e4c3f415 Linus Torvalds 2005-04-16 1910 > c9d8b55fa019162 Ard Biesheuvel 2018-08-21 1911 #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS > b1b820bb0420d08 Sami Tolvanen 2020-06-24 1912 #define ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > b1b820bb0420d08 Sami Tolvanen 2020-06-24 1913 class_shift, hook, stub) \ > b1b820bb0420d08 Sami Tolvanen 2020-06-24 @1914 void stub(struct pci_dev *dev) { hook(dev); } \ > c9d8b55fa019162 Ard Biesheuvel 2018-08-21 1915 asm(".section " #sec ", \"a\" \n" \ > c9d8b55fa019162 Ard Biesheuvel 2018-08-21 1916 ".balign 16 \n" \ > c9d8b55fa019162 Ard Biesheuvel 2018-08-21 1917 ".short " #vendor ", " #device " \n" \ > c9d8b55fa019162 Ard Biesheuvel 2018-08-21 1918 ".long " #class ", " #class_shift " \n" \ > b1b820bb0420d08 Sami Tolvanen 2020-06-24 1919 ".long " #stub " - . \n" \ > c9d8b55fa019162 Ard Biesheuvel 2018-08-21 1920 ".previous \n"); > b1b820bb0420d08 Sami Tolvanen 2020-06-24 1921 > b1b820bb0420d08 Sami Tolvanen 2020-06-24 1922 #define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > b1b820bb0420d08 Sami Tolvanen 2020-06-24 1923 class_shift, hook, stub) \ > b1b820bb0420d08 Sami Tolvanen 2020-06-24 1924 ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > b1b820bb0420d08 Sami Tolvanen 2020-06-24 1925 class_shift, hook, stub) > c9d8b55fa019162 Ard Biesheuvel 2018-08-21 1926 #define DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > c9d8b55fa019162 Ard Biesheuvel 2018-08-21 1927 class_shift, hook) \ > c9d8b55fa019162 Ard Biesheuvel 2018-08-21 @1928 __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ > b1b820bb0420d08 Sami Tolvanen 2020-06-24 @1929 class_shift, hook, __UNIQUE_ID(hook)) > c9d8b55fa019162 Ard Biesheuvel 2018-08-21 1930 #else > ^1da177e4c3f415 Linus Torvalds 2005-04-16 1931 /* Anonymous variables would be nice... 
*/ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1932 #define DECLARE_PCI_FIXUP_SECTION(section, name, vendor, device, class, \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1933 class_shift, hook) \ > ecf61c78bd787b9 Michal Marek 2013-11-11 1934 static const struct pci_fixup __PASTE(__pci_fixup_##name,__LINE__) __used \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1935 __attribute__((__section__(#section), aligned((sizeof(void *))))) \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1936 = { vendor, device, class, class_shift, hook }; > c9d8b55fa019162 Ard Biesheuvel 2018-08-21 1937 #endif > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1938 > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1939 #define DECLARE_PCI_FIXUP_CLASS_EARLY(vendor, device, class, \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1940 class_shift, hook) \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1941 DECLARE_PCI_FIXUP_SECTION(.pci_fixup_early, \ > ecf61c78bd787b9 Michal Marek 2013-11-11 1942 hook, vendor, device, class, class_shift, hook) > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1943 #define DECLARE_PCI_FIXUP_CLASS_HEADER(vendor, device, class, \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1944 class_shift, hook) \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1945 DECLARE_PCI_FIXUP_SECTION(.pci_fixup_header, \ > ecf61c78bd787b9 Michal Marek 2013-11-11 1946 hook, vendor, device, class, class_shift, hook) > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1947 #define DECLARE_PCI_FIXUP_CLASS_FINAL(vendor, device, class, \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1948 class_shift, hook) \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 @1949 DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \ > ecf61c78bd787b9 Michal Marek 2013-11-11 1950 hook, vendor, device, class, class_shift, hook) > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1951 #define DECLARE_PCI_FIXUP_CLASS_ENABLE(vendor, device, class, \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1952 class_shift, hook) \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1953 DECLARE_PCI_FIXUP_SECTION(.pci_fixup_enable, \ > ecf61c78bd787b9 Michal Marek 
2013-11-11 1954 hook, vendor, device, class, class_shift, hook) > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1955 #define DECLARE_PCI_FIXUP_CLASS_RESUME(vendor, device, class, \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1956 class_shift, hook) \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1957 DECLARE_PCI_FIXUP_SECTION(.pci_fixup_resume, \ > 0aa0f5d1084ca1c Bjorn Helgaas 2017-12-02 1958 resume##hook, vendor, device, class, class_shift, hook) > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1959 #define DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY(vendor, device, class, \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1960 class_shift, hook) \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1961 DECLARE_PCI_FIXUP_SECTION(.pci_fixup_resume_early, \ > 0aa0f5d1084ca1c Bjorn Helgaas 2017-12-02 1962 resume_early##hook, vendor, device, class, class_shift, hook) > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1963 #define DECLARE_PCI_FIXUP_CLASS_SUSPEND(vendor, device, class, \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1964 class_shift, hook) \ > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1965 DECLARE_PCI_FIXUP_SECTION(.pci_fixup_suspend, \ > 0aa0f5d1084ca1c Bjorn Helgaas 2017-12-02 1966 suspend##hook, vendor, device, class, class_shift, hook) > 7d2a01b87f1682f Andreas Noever 2014-06-03 1967 #define DECLARE_PCI_FIXUP_CLASS_SUSPEND_LATE(vendor, device, class, \ > 7d2a01b87f1682f Andreas Noever 2014-06-03 1968 class_shift, hook) \ > 7d2a01b87f1682f Andreas Noever 2014-06-03 1969 DECLARE_PCI_FIXUP_SECTION(.pci_fixup_suspend_late, \ > 0aa0f5d1084ca1c Bjorn Helgaas 2017-12-02 1970 suspend_late##hook, vendor, device, class, class_shift, hook) > f4ca5c6a56278ca Yinghai Lu 2012-02-23 1971 > > --- > 0-DAY CI Kernel Test Service, Intel Corporation > https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org -- Thanks, ~Nick Desaulniers _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply 
* Re: [PATCH 11/22] pci: lto: fix PREL32 relocations 2020-06-24 23:03 ` Nick Desaulniers @ 2020-06-24 23:21 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-06-24 23:21 UTC (permalink / raw) To: Nick Desaulniers Cc: linux-arch, kbuild-all, kernel test robot, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, clang-built-linux, Will Deacon, Linux ARM, Kees Cook On Wed, Jun 24, 2020 at 04:03:48PM -0700, Nick Desaulniers wrote: > On Wed, Jun 24, 2020 at 3:50 PM kernel test robot <lkp@intel.com> wrote: > > > > Hi Sami, > > > > Thank you for the patch! Perhaps something to improve: > > > > [auto build test WARNING on 26e122e97a3d0390ebec389347f64f3730fdf48f] > > > > url: https://github.com/0day-ci/linux/commits/Sami-Tolvanen/add-support-for-Clang-LTO/20200625-043816 > > base: 26e122e97a3d0390ebec389347f64f3730fdf48f > > config: i386-alldefconfig (attached as .config) > > compiler: gcc-9 (Debian 9.3.0-13) 9.3.0 > > reproduce (this is a W=1 build): > > # save the attached .config to linux build tree > > make W=1 ARCH=i386 > > Note: W=1 ^ > > > > > If you fix the issue, kindly add following tag as appropriate > > Reported-by: kernel test robot <lkp@intel.com> > > > > All warnings (new ones prefixed by >>): > > > > In file included from arch/x86/kernel/pci-dma.c:9: > > >> include/linux/compiler-gcc.h:72:45: warning: no previous prototype for '__UNIQUE_ID_via_no_dac190' [-Wmissing-prototypes] > > 72 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__) > > | ^~~~~~~~~~~~ > > include/linux/pci.h:1914:7: note: in definition of macro '___DECLARE_PCI_FIXUP_SECTION' > > 1914 | void stub(struct pci_dev *dev) { hook(dev); } \ > > | ^~~~ > > Should `stub` be qualified as `static inline`? https://godbolt.org/z/cPBXxW > Or should stub be declared in this header, but implemented in a .c > file? (I'm guessing the former, since the `hook` callback comes from > the macro). 
Does static inline guarantee that the compiler won't rename the symbol?
The purpose of this change is to have a stable symbol name, which we can
safely use in inline assembly.

Sami
* Re: [PATCH 11/22] pci: lto: fix PREL32 relocations
  2020-06-24 20:31 ` [PATCH 11/22] pci: " Sami Tolvanen
  2020-06-24 22:49   ` kernel test robot
@ 2020-07-17 20:26   ` Bjorn Helgaas
  2020-07-22 18:15     ` Sami Tolvanen
  1 sibling, 1 reply; 212+ messages in thread
From: Bjorn Helgaas @ 2020-07-17 20:26 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild,
	Nick Desaulniers, linux-kernel, clang-built-linux, linux-pci,
	Will Deacon, linux-arm-kernel

OK by me, but please update the subject to match convention:

  PCI: Fix PREL32 relocations for LTO

and include a hint in the commit log about what LTO is.  At least
expand the initialism once.  Googling for "LTO" isn't very useful.

  With Clang's Link Time Optimization (LTO), the compiler ... ?

On Wed, Jun 24, 2020 at 01:31:49PM -0700, Sami Tolvanen wrote:
> With LTO, the compiler can rename static functions to avoid global
> naming collisions. As PCI fixup functions are typically static,
> renaming can break references to them in inline assembly. This
> change adds a global stub to DECLARE_PCI_FIXUP_SECTION to fix the
> issue when PREL32 relocations are used.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  include/linux/pci.h | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index c79d83304e52..1e65e16f165a 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1909,19 +1909,24 @@ enum pci_fixup_pass {
>  };
>
>  #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> -#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
> -				    class_shift, hook) \
> -	__ADDRESSABLE(hook) \
> +#define ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
> +				    class_shift, hook, stub) \
> +	void stub(struct pci_dev *dev) { hook(dev); } \
>  	asm(".section " #sec ", \"a\" \n" \
>  	    ".balign 16 \n" \
>  	    ".short " #vendor ", " #device " \n" \
>  	    ".long " #class ", " #class_shift " \n" \
> -	    ".long " #hook " - . \n" \
> +	    ".long " #stub " - . \n" \
>  	    ".previous \n");
> +
> +#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
> +				  class_shift, hook, stub) \
> +	___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
> +				  class_shift, hook, stub)
>  #define DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
>  				  class_shift, hook) \
>  	__DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
> -				  class_shift, hook)
> +				  class_shift, hook, __UNIQUE_ID(hook))
>  #else
>  /* Anonymous variables would be nice... */
>  #define DECLARE_PCI_FIXUP_SECTION(section, name, vendor, device, class, \
> --
> 2.27.0.212.ge8ba1cc988-goog
>
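The stub approach in the quoted diff can be sketched in plain C, outside the kernel. This is only an illustration under stated assumptions: `struct fake_dev`, `UNIQUE_ID`, `DECLARE_FIXUP` and `mark_quirk` are made-up names, a function-pointer variable stands in for the `.pci_fixup_*` section entry that the real macro emits from inline assembly, and `__COUNTER__` is a GCC/Clang extension. The extra macro level exists so that the unique name is fully expanded before it is stringified, which mirrors why the patch adds `___DECLARE_PCI_FIXUP_SECTION`.

```c
#include <stdio.h>

/* Two-level paste so that __COUNTER__ is expanded before concatenation. */
#define PASTE_(a, b) a##b
#define PASTE(a, b) PASTE_(a, b)
#define UNIQUE_ID(prefix) PASTE(PASTE(unique_id_, prefix), __COUNTER__)

/* Hypothetical stand-in for struct pci_dev. */
struct fake_dev {
	int quirks_applied;
};

/*
 * Innermost level: by the time we get here, "stub" is already a single
 * expanded identifier, so #stub yields the generated name rather than
 * the text "UNIQUE_ID(hook)".
 */
#define DECLARE_FIXUP__(entry, hook, stub)                          \
	void stub(struct fake_dev *dev) { hook(dev); }              \
	void (*const entry)(struct fake_dev *) = stub;              \
	const char *const entry##_name = #stub

/* Middle level: forces expansion of the UNIQUE_ID() argument. */
#define DECLARE_FIXUP_(entry, hook, stub) DECLARE_FIXUP__(entry, hook, stub)

/* Public macro: one global stub with a unique, stable name per use. */
#define DECLARE_FIXUP(entry, hook) \
	DECLARE_FIXUP_(entry, hook, UNIQUE_ID(hook))

/* A static hook, like most PCI fixup functions. */
static void mark_quirk(struct fake_dev *dev)
{
	dev->quirks_applied++;
}

/* fixup_entry stands in for the table entry that refers to the stub. */
DECLARE_FIXUP(fixup_entry, mark_quirk);
```

Calling `fixup_entry(&dev)` reaches the static hook through the global stub, and `fixup_entry_name` holds the generated stub name. Because the stub is a non-static function with no prior declaration, a build with -Wmissing-prototypes is expected to warn about it, which is the warning the kernel test robot reported for this patch.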
* Re: [PATCH 11/22] pci: lto: fix PREL32 relocations
  2020-07-17 20:26   ` Bjorn Helgaas
@ 2020-07-22 18:15     ` Sami Tolvanen
  0 siblings, 0 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-07-22 18:15 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening,
	Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild,
	Nick Desaulniers, LKML, clang-built-linux, linux-pci,
	Will Deacon, linux-arm-kernel

Hi Bjorn,

On Fri, Jul 17, 2020 at 1:26 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> OK by me, but please update the subject to match convention:
>
>   PCI: Fix PREL32 relocations for LTO
>
> and include a hint in the commit log about what LTO is.  At least
> expand the initialism once.  Googling for "LTO" isn't very useful.
>
>   With Clang's Link Time Optimization (LTO), the compiler ... ?

Sure, I'll change this in the next version. Thanks for taking a look!

Sami
* [PATCH 12/22] modpost: lto: strip .lto from module names
  2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen
                   ` (10 preceding siblings ...)
  2020-06-24 20:31 ` [PATCH 11/22] pci: " Sami Tolvanen
@ 2020-06-24 20:31 ` Sami Tolvanen
  2020-06-24 22:05   ` Nick Desaulniers
  2020-06-24 20:31 ` [PATCH 13/22] scripts/mod: disable LTO for empty.c Sami Tolvanen
                   ` (13 subsequent siblings)
  25 siblings, 1 reply; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw)
  To: Masahiro Yamada, Will Deacon
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, Bill Wendling, Sami Tolvanen, linux-pci,
	linux-arm-kernel

With LTO, everything is compiled into LLVM bitcode, so we have to link
each module into native code before modpost. Kbuild uses the .lto.o
suffix for these files, which also ends up in module information. This
change strips the unnecessary .lto suffix from the module name.

Suggested-by: Bill Wendling <morbo@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 scripts/mod/modpost.c    | 16 +++++++---------
 scripts/mod/modpost.h    |  9 +++++++++
 scripts/mod/sumversion.c |  6 +++++-
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 6aea65c65745..8352f8a1a138 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -17,7 +17,6 @@
 #include <ctype.h>
 #include <string.h>
 #include <limits.h>
-#include <stdbool.h>
 #include <errno.h>
 #include "modpost.h"
 #include "../../include/linux/license.h"
@@ -80,14 +79,6 @@ modpost_log(enum loglevel loglevel, const char *fmt, ...)
 		exit(1);
 }
 
-static inline bool strends(const char *str, const char *postfix)
-{
-	if (strlen(str) < strlen(postfix))
-		return false;
-
-	return strcmp(str + strlen(str) - strlen(postfix), postfix) == 0;
-}
-
 void *do_nofail(void *ptr, const char *expr)
 {
 	if (!ptr)
@@ -1975,6 +1966,10 @@ static char *remove_dot(char *s)
 		size_t m = strspn(s + n + 1, "0123456789");
 		if (m && (s[n + m] == '.' || s[n + m] == 0))
 			s[n] = 0;
+
+		/* strip trailing .lto */
+		if (strends(s, ".lto"))
+			s[strlen(s) - 4] = '\0';
 	}
 	return s;
 }
@@ -1998,6 +1993,9 @@ static void read_symbols(const char *modname)
 		/* strip trailing .o */
 		tmp = NOFAIL(strdup(modname));
 		tmp[strlen(tmp) - 2] = '\0';
+		/* strip trailing .lto */
+		if (strends(tmp, ".lto"))
+			tmp[strlen(tmp) - 4] = '\0';
 		mod = new_module(tmp);
 		free(tmp);
 	}
diff --git a/scripts/mod/modpost.h b/scripts/mod/modpost.h
index 3aa052722233..fab30d201f9e 100644
--- a/scripts/mod/modpost.h
+++ b/scripts/mod/modpost.h
@@ -2,6 +2,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <stdarg.h>
+#include <stdbool.h>
 #include <string.h>
 #include <sys/types.h>
 #include <sys/stat.h>
@@ -180,6 +181,14 @@ static inline unsigned int get_secindex(const struct elf_info *info,
 	return info->symtab_shndx_start[sym - info->symtab_start];
 }
 
+static inline bool strends(const char *str, const char *postfix)
+{
+	if (strlen(str) < strlen(postfix))
+		return false;
+
+	return strcmp(str + strlen(str) - strlen(postfix), postfix) == 0;
+}
+
 /* file2alias.c */
 extern unsigned int cross_build;
 void handle_moddevtable(struct module *mod, struct elf_info *info,
diff --git a/scripts/mod/sumversion.c b/scripts/mod/sumversion.c
index d587f40f1117..760e6baa7eda 100644
--- a/scripts/mod/sumversion.c
+++ b/scripts/mod/sumversion.c
@@ -391,10 +391,14 @@ void get_src_version(const char *modname, char sum[], unsigned sumlen)
 	struct md4_ctx md;
 	char *fname;
 	char filelist[PATH_MAX + 1];
+	int postfix_len = 1;
+
+	if (strends(modname, ".lto.o"))
+		postfix_len = 5;
 
 	/* objects for a module are listed in the first line of *.mod file. */
 	snprintf(filelist, sizeof(filelist), "%.*smod",
-		 (int)strlen(modname) - 1, modname);
+		 (int)strlen(modname) - postfix_len, modname);
 
 	buf = read_text_file(filelist);
 
-- 
2.27.0.212.ge8ba1cc988-goog
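The suffix handling in this patch can be tried in isolation with a small user-space sketch. `strends()` follows the helper shown in the diff; `module_name_from_object()` and `mod_list_path()` are hypothetical wrappers written for this illustration (they are not functions in modpost) that imitate what `read_symbols()` and `get_src_version()` do with the name. The sketch assumes the input is a non-empty path ending in `.o`.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Same logic as the strends() helper that the patch moves into modpost.h. */
static bool strends(const char *str, const char *postfix)
{
	size_t len = strlen(str);
	size_t plen = strlen(postfix);

	if (len < plen)
		return false;

	return strcmp(str + len - plen, postfix) == 0;
}

/*
 * Hypothetical helper: strip a trailing ".o" and then a trailing ".lto"
 * in place, in the same order as read_symbols() in the diff.
 */
static void module_name_from_object(char *name)
{
	if (strends(name, ".o"))
		name[strlen(name) - 2] = '\0';
	if (strends(name, ".lto"))
		name[strlen(name) - 4] = '\0';
}

/*
 * Hypothetical helper: build the "<base>.mod" path the way
 * get_src_version() does. The dot before the suffix is kept and "mod"
 * is appended, so only "o" (1 char) or "lto.o" (5 chars) is dropped.
 * Assumes modname ends in ".o" and is longer than the dropped suffix.
 */
static void mod_list_path(const char *modname, char *out, size_t outlen)
{
	int postfix_len = 1;

	if (strends(modname, ".lto.o"))
		postfix_len = 5;

	snprintf(out, outlen, "%.*smod",
		 (int)strlen(modname) - postfix_len, modname);
}
```

With these helpers, both `drivers/foo.o` and `drivers/foo.lto.o` map to the module name `drivers/foo` and to the list file `drivers/foo.mod`, which is the behaviour the commit message describes.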
* Re: [PATCH 12/22] modpost: lto: strip .lto from module names
  2020-06-24 20:31 ` [PATCH 12/22] modpost: lto: strip .lto from module names Sami Tolvanen
@ 2020-06-24 22:05   ` Nick Desaulniers
  0 siblings, 0 replies; 212+ messages in thread
From: Nick Desaulniers @ 2020-06-24 22:05 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kees Cook, Paul E. McKenney, Kernel Hardening,
	Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list,
	LKML, clang-built-linux, Bill Wendling, linux-pci, Will Deacon,
	Linux ARM

On Wed, Jun 24, 2020 at 1:33 PM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> With LTO, everything is compiled into LLVM bitcode, so we have to link
> each module into native code before modpost. Kbuild uses the .lto.o
> suffix for these files, which also ends up in module information. This
> change strips the unnecessary .lto suffix from the module name.
>
> Suggested-by: Bill Wendling <morbo@google.com>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  scripts/mod/modpost.c    | 16 +++++++---------
>  scripts/mod/modpost.h    |  9 +++++++++
>  scripts/mod/sumversion.c |  6 +++++-
>  3 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> index 6aea65c65745..8352f8a1a138 100644
> --- a/scripts/mod/modpost.c
> +++ b/scripts/mod/modpost.c
> @@ -17,7 +17,6 @@
>  #include <ctype.h>
>  #include <string.h>
>  #include <limits.h>
> -#include <stdbool.h>

It looks like `bool` is used in the function signatures of other
functions in this TU, I'm not the biggest fan of hoisting the includes
out of the .c source into the header (I'd keep it in both), but I
don't feel strongly enough to NACK.

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

>  #include <errno.h>
>  #include "modpost.h"
>  #include "../../include/linux/license.h"
> @@ -80,14 +79,6 @@ modpost_log(enum loglevel loglevel, const char *fmt, ...)
>  		exit(1);
>  }
>
> -static inline bool strends(const char *str, const char *postfix)
> -{
> -	if (strlen(str) < strlen(postfix))
> -		return false;
> -
> -	return strcmp(str + strlen(str) - strlen(postfix), postfix) == 0;
> -}
> -
>  void *do_nofail(void *ptr, const char *expr)
>  {
>  	if (!ptr)
> @@ -1975,6 +1966,10 @@ static char *remove_dot(char *s)
>  		size_t m = strspn(s + n + 1, "0123456789");
>  		if (m && (s[n + m] == '.' || s[n + m] == 0))
>  			s[n] = 0;
> +
> +		/* strip trailing .lto */
> +		if (strends(s, ".lto"))
> +			s[strlen(s) - 4] = '\0';
>  	}
>  	return s;
>  }
> @@ -1998,6 +1993,9 @@ static void read_symbols(const char *modname)
>  		/* strip trailing .o */
>  		tmp = NOFAIL(strdup(modname));
>  		tmp[strlen(tmp) - 2] = '\0';
> +		/* strip trailing .lto */
> +		if (strends(tmp, ".lto"))
> +			tmp[strlen(tmp) - 4] = '\0';
>  		mod = new_module(tmp);
>  		free(tmp);
>  	}
> diff --git a/scripts/mod/modpost.h b/scripts/mod/modpost.h
> index 3aa052722233..fab30d201f9e 100644
> --- a/scripts/mod/modpost.h
> +++ b/scripts/mod/modpost.h
> @@ -2,6 +2,7 @@
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include <stdarg.h>
> +#include <stdbool.h>
>  #include <string.h>
>  #include <sys/types.h>
>  #include <sys/stat.h>
> @@ -180,6 +181,14 @@ static inline unsigned int get_secindex(const struct elf_info *info,
>  	return info->symtab_shndx_start[sym - info->symtab_start];
>  }
>
> +static inline bool strends(const char *str, const char *postfix)
> +{
> +	if (strlen(str) < strlen(postfix))
> +		return false;
> +
> +	return strcmp(str + strlen(str) - strlen(postfix), postfix) == 0;
> +}
> +
>  /* file2alias.c */
>  extern unsigned int cross_build;
>  void handle_moddevtable(struct module *mod, struct elf_info *info,
> diff --git a/scripts/mod/sumversion.c b/scripts/mod/sumversion.c
> index d587f40f1117..760e6baa7eda 100644
> --- a/scripts/mod/sumversion.c
> +++ b/scripts/mod/sumversion.c
> @@ -391,10 +391,14 @@ void get_src_version(const char *modname, char sum[], unsigned sumlen)
>  	struct md4_ctx md;
>  	char *fname;
>  	char filelist[PATH_MAX + 1];
> +	int postfix_len = 1;
> +
> +	if (strends(modname, ".lto.o"))
> +		postfix_len = 5;
>
>  	/* objects for a module are listed in the first line of *.mod file. */
>  	snprintf(filelist, sizeof(filelist), "%.*smod",
> -		 (int)strlen(modname) - 1, modname);
> +		 (int)strlen(modname) - postfix_len, modname);
>
>  	buf = read_text_file(filelist);
>
> --
> 2.27.0.212.ge8ba1cc988-goog
>

-- 
Thanks,
~Nick Desaulniers
* [PATCH 13/22] scripts/mod: disable LTO for empty.c
  2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen
                   ` (11 preceding siblings ...)
  2020-06-24 20:31 ` [PATCH 12/22] modpost: lto: strip .lto from module names Sami Tolvanen
@ 2020-06-24 20:31 ` Sami Tolvanen
  2020-06-24 20:57   ` Nick Desaulniers
  2020-06-24 20:31 ` [PATCH 14/22] efi/libstub: disable LTO Sami Tolvanen
                   ` (12 subsequent siblings)
  25 siblings, 1 reply; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw)
  To: Masahiro Yamada, Will Deacon
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

With CONFIG_LTO_CLANG, clang generates LLVM IR instead of ELF object
files. As empty.o is used for probing target properties, disable LTO
for it to produce an object file instead.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 scripts/mod/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile
index 296b6a3878b2..b6e3b40c6eeb 100644
--- a/scripts/mod/Makefile
+++ b/scripts/mod/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 OBJECT_FILES_NON_STANDARD := y
+CFLAGS_REMOVE_empty.o += $(CC_FLAGS_LTO)
 
 hostprogs := modpost mk_elfconfig
 always-y := $(hostprogs) empty.o
-- 
2.27.0.212.ge8ba1cc988-goog
* Re: [PATCH 13/22] scripts/mod: disable LTO for empty.c
  2020-06-24 20:31 ` [PATCH 13/22] scripts/mod: disable LTO for empty.c Sami Tolvanen
@ 2020-06-24 20:57   ` Nick Desaulniers
  0 siblings, 0 replies; 212+ messages in thread
From: Nick Desaulniers @ 2020-06-24 20:57 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Kees Cook, Paul E. McKenney, Kernel Hardening,
	Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list,
	LKML, clang-built-linux, linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 1:33 PM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> With CONFIG_LTO_CLANG, clang generates LLVM IR instead of ELF object
> files. As empty.o is used for probing target properties, disable LTO
> for it to produce an object file instead.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

> ---
>  scripts/mod/Makefile | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile
> index 296b6a3878b2..b6e3b40c6eeb 100644
> --- a/scripts/mod/Makefile
> +++ b/scripts/mod/Makefile
> @@ -1,5 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  OBJECT_FILES_NON_STANDARD := y
> +CFLAGS_REMOVE_empty.o += $(CC_FLAGS_LTO)
>
>  hostprogs := modpost mk_elfconfig
>  always-y := $(hostprogs) empty.o
> --
> 2.27.0.212.ge8ba1cc988-goog
>

-- 
Thanks,
~Nick Desaulniers
* [PATCH 14/22] efi/libstub: disable LTO
  2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen
                   ` (12 preceding siblings ...)
  2020-06-24 20:31 ` [PATCH 13/22] scripts/mod: disable LTO for empty.c Sami Tolvanen
@ 2020-06-24 20:31 ` Sami Tolvanen
  2020-06-24 20:31 ` [PATCH 15/22] drivers/misc/lkdtm: disable LTO for rodata.o Sami Tolvanen
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw)
  To: Masahiro Yamada, Will Deacon
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

With CONFIG_LTO_CLANG, we produce LLVM bitcode instead of ELF object
files. Since LTO is not really needed here and the Makefile assumes we
produce an object file, disable LTO for libstub.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 drivers/firmware/efi/libstub/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 75daaf20374e..95e12002cc7c 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -35,6 +35,8 @@ KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
 
 # remove SCS flags from all objects in this directory
 KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))
+# disable LTO
+KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO), $(KBUILD_CFLAGS))
 
 GCOV_PROFILE := n
 # Sanitizer runtimes are unavailable and cannot be linked here.
-- 
2.27.0.212.ge8ba1cc988-goog
* [PATCH 15/22] drivers/misc/lkdtm: disable LTO for rodata.o
  2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen
                   ` (13 preceding siblings ...)
  2020-06-24 20:31 ` [PATCH 14/22] efi/libstub: disable LTO Sami Tolvanen
@ 2020-06-24 20:31 ` Sami Tolvanen
  2020-06-24 20:31 ` [PATCH 16/22] arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY Sami Tolvanen
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 212+ messages in thread
From: Sami Tolvanen @ 2020-06-24 20:31 UTC (permalink / raw)
  To: Masahiro Yamada, Will Deacon
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel,
	clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

Disable LTO for rodata.o to allow objcopy to be used to
manipulate sections.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
---
 drivers/misc/lkdtm/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/misc/lkdtm/Makefile b/drivers/misc/lkdtm/Makefile
index c70b3822013f..dd4c936d4d73 100644
--- a/drivers/misc/lkdtm/Makefile
+++ b/drivers/misc/lkdtm/Makefile
@@ -13,6 +13,7 @@ lkdtm-$(CONFIG_LKDTM) += cfi.o
 
 KASAN_SANITIZE_stackleak.o := n
 KCOV_INSTRUMENT_rodata.o := n
+CFLAGS_REMOVE_rodata.o += $(CC_FLAGS_LTO)
 
 OBJCOPYFLAGS :=
 OBJCOPYFLAGS_rodata_objcopy.o := \
-- 
2.27.0.212.ge8ba1cc988-goog
* [PATCH 16/22] arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY
From: Sami Tolvanen @ 2020-06-24 20:31 UTC
To: Masahiro Yamada, Will Deacon
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

Since arm64 does not use -pg in CC_FLAGS_FTRACE with
DYNAMIC_FTRACE_WITH_REGS, skip running recordmcount by
exporting CC_USING_PATCHABLE_FUNCTION_ENTRY.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/arm64/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index a0d94d063fa8..fc6c20a10291 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -115,6 +115,7 @@ endif
 ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y)
 KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
 CC_FLAGS_FTRACE := -fpatchable-function-entry=2
+ export CC_USING_PATCHABLE_FUNCTION_ENTRY := 1
 endif
 # Default value
-- 
2.27.0.212.ge8ba1cc988-goog
* [PATCH 17/22] arm64: vdso: disable LTO
From: Sami Tolvanen @ 2020-06-24 20:31 UTC
To: Masahiro Yamada, Will Deacon
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

Filter out CC_FLAGS_LTO for the vDSO.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/arm64/kernel/vdso/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
index 556d424c6f52..cfad4c296ca1 100644
--- a/arch/arm64/kernel/vdso/Makefile
+++ b/arch/arm64/kernel/vdso/Makefile
@@ -29,8 +29,8 @@ ldflags-y := -shared -nostdlib -soname=linux-vdso.so.1 --hash-style=sysv \
 ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18
 ccflags-y += -DDISABLE_BRANCH_PROFILING
 
-CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS)
-KBUILD_CFLAGS += $(DISABLE_LTO)
+CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS) \
+ $(CC_FLAGS_LTO)
 KASAN_SANITIZE := n
 UBSAN_SANITIZE := n
 OBJECT_FILES_NON_STANDARD := y
-- 
2.27.0.212.ge8ba1cc988-goog
* Re: [PATCH 17/22] arm64: vdso: disable LTO
From: Nick Desaulniers @ 2020-06-24 20:58 UTC
To: Sami Tolvanen
Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux, linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 1:33 PM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> Filter out CC_FLAGS_LTO for the vDSO.

Just curious about this patch (and the following one for x86's vdso),
do you happen to recall specifically what the issues with the vdso's
are?

>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
> arch/arm64/kernel/vdso/Makefile | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
> index 556d424c6f52..cfad4c296ca1 100644
> --- a/arch/arm64/kernel/vdso/Makefile
> +++ b/arch/arm64/kernel/vdso/Makefile
> @@ -29,8 +29,8 @@ ldflags-y := -shared -nostdlib -soname=linux-vdso.so.1 --hash-style=sysv \
> ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18
> ccflags-y += -DDISABLE_BRANCH_PROFILING
>
> -CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS)
> -KBUILD_CFLAGS += $(DISABLE_LTO)
> +CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS) \
> + $(CC_FLAGS_LTO)
> KASAN_SANITIZE := n
> UBSAN_SANITIZE := n
> OBJECT_FILES_NON_STANDARD := y
> --
> 2.27.0.212.ge8ba1cc988-goog
>

-- 
Thanks,
~Nick Desaulniers
* Re: [PATCH 17/22] arm64: vdso: disable LTO
From: Nick Desaulniers @ 2020-06-24 21:09 UTC
To: Sami Tolvanen
Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux, Andi Kleen, linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 1:58 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Wed, Jun 24, 2020 at 1:33 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > Filter out CC_FLAGS_LTO for the vDSO.
>
> Just curious about this patch (and the following one for x86's vdso),
> do you happen to recall specifically what the issues with the vdso's
> are?

+ Andi (tangential, I actually have a bunch of tabs open with slides
from http://halobates.de/ right now)
58edae3aac9f2
67424d5a22124
$ git log -S DISABLE_LTO

>
> >
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > ---
> > arch/arm64/kernel/vdso/Makefile | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
> > index 556d424c6f52..cfad4c296ca1 100644
> > --- a/arch/arm64/kernel/vdso/Makefile
> > +++ b/arch/arm64/kernel/vdso/Makefile
> > @@ -29,8 +29,8 @@ ldflags-y := -shared -nostdlib -soname=linux-vdso.so.1 --hash-style=sysv \
> > ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18
> > ccflags-y += -DDISABLE_BRANCH_PROFILING
> >
> > -CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS)
> > -KBUILD_CFLAGS += $(DISABLE_LTO)
> > +CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS) \
> > + $(CC_FLAGS_LTO)
> > KASAN_SANITIZE := n
> > UBSAN_SANITIZE := n
> > OBJECT_FILES_NON_STANDARD := y
> > --

-- 
Thanks,
~Nick Desaulniers
* Re: [PATCH 17/22] arm64: vdso: disable LTO
From: Andi Kleen @ 2020-06-24 23:51 UTC
To: Nick Desaulniers
Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 02:09:40PM -0700, Nick Desaulniers wrote:
> On Wed, Jun 24, 2020 at 1:58 PM Nick Desaulniers
> <ndesaulniers@google.com> wrote:
> >
> > On Wed, Jun 24, 2020 at 1:33 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> > >
> > > Filter out CC_FLAGS_LTO for the vDSO.
> >
> > Just curious about this patch (and the following one for x86's vdso),
> > do you happen to recall specifically what the issues with the vdso's
> > are?
>
> + Andi (tangential, I actually have a bunch of tabs open with slides
> from http://halobates.de/ right now)
> 58edae3aac9f2
> 67424d5a22124
> $ git log -S DISABLE_LTO

I think I did it originally because the vDSO linker step didn't do
all the magic needed for gcc LTO. But it also doesn't seem to be very
useful for just a few functions that don't have complex interactions,
and somewhat risky for violating some assumptions.

-Andi
* Re: [PATCH 17/22] arm64: vdso: disable LTO
From: Sami Tolvanen @ 2020-06-24 21:52 UTC
To: Nick Desaulniers
Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux, linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 01:58:57PM -0700, 'Nick Desaulniers' via Clang Built Linux wrote:
> On Wed, Jun 24, 2020 at 1:33 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > Filter out CC_FLAGS_LTO for the vDSO.
>
> Just curious about this patch (and the following one for x86's vdso),
> do you happen to recall specifically what the issues with the vdso's
> are?

I recall the compiler optimizing away functions at some point, but as
LTO is not really needed in the vDSO, it's just easiest to disable it
there.

Sami
* Re: [PATCH 17/22] arm64: vdso: disable LTO
From: Nick Desaulniers @ 2020-06-24 23:05 UTC
To: Sami Tolvanen
Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux, linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 2:52 PM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> On Wed, Jun 24, 2020 at 01:58:57PM -0700, 'Nick Desaulniers' via Clang Built Linux wrote:
> > On Wed, Jun 24, 2020 at 1:33 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> > >
> > > Filter out CC_FLAGS_LTO for the vDSO.
> >
> > Just curious about this patch (and the following one for x86's vdso),
> > do you happen to recall specifically what the issues with the vdso's
> > are?
>
> I recall the compiler optimizing away functions at some point, but as
> LTO is not really needed in the vDSO, it's just easiest to disable it
> there.

Sounds fishy; with extern linkage then I would think it's not safe to
eliminate functions. Probably unnecessary for the initial
implementation, and something we can follow up on, but always good to
have an answer to the inevitable question "why?" in the commit
message.

-- 
Thanks,
~Nick Desaulniers
* Re: [PATCH 17/22] arm64: vdso: disable LTO
From: Sami Tolvanen @ 2020-06-24 23:39 UTC
To: Nick Desaulniers
Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux, linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 04:05:48PM -0700, Nick Desaulniers wrote:
> On Wed, Jun 24, 2020 at 2:52 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > On Wed, Jun 24, 2020 at 01:58:57PM -0700, 'Nick Desaulniers' via Clang Built Linux wrote:
> > > On Wed, Jun 24, 2020 at 1:33 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> > > >
> > > > Filter out CC_FLAGS_LTO for the vDSO.
> > >
> > > Just curious about this patch (and the following one for x86's vdso),
> > > do you happen to recall specifically what the issues with the vdso's
> > > are?
> >
> > I recall the compiler optimizing away functions at some point, but as
> > LTO is not really needed in the vDSO, it's just easiest to disable it
> > there.
>
> Sounds fishy; with extern linkage then I would think it's not safe to
> eliminate functions. Probably unnecessary for the initial
> implementation, and something we can follow up on, but always good to
> have an answer to the inevitable question "why?" in the commit
> message.

Sure. I can test this again with the current toolchain to see if
there are still problems.

Sami
* [PATCH 18/22] arm64: allow LTO_CLANG and THINLTO to be selected
From: Sami Tolvanen @ 2020-06-24 20:31 UTC
To: Masahiro Yamada, Will Deacon
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

Allow CONFIG_LTO_CLANG and CONFIG_THINLTO to be enabled.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/arm64/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a4a094bedcb2..e1961653964d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -72,6 +72,8 @@ config ARM64
 select ARCH_USE_SYM_ANNOTATIONS
 select ARCH_SUPPORTS_MEMORY_FAILURE
 select ARCH_SUPPORTS_SHADOW_CALL_STACK if CC_HAVE_SHADOW_CALL_STACK
+ select ARCH_SUPPORTS_LTO_CLANG
+ select ARCH_SUPPORTS_THINLTO
 select ARCH_SUPPORTS_ATOMIC_RMW
 select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 && (GCC_VERSION >= 50000 || CC_IS_CLANG)
 select ARCH_SUPPORTS_NUMA_BALANCING
-- 
2.27.0.212.ge8ba1cc988-goog
* [PATCH 19/22] x86, vdso: disable LTO only for vDSO
From: Sami Tolvanen @ 2020-06-24 20:31 UTC
To: Masahiro Yamada, Will Deacon
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

Remove the undefined DISABLE_LTO flag from the vDSO, and filter out
CC_FLAGS_LTO flags instead where needed.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/x86/entry/vdso/Makefile | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 04e65f0698f6..67f60662830a 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -9,8 +9,6 @@ ARCH_REL_TYPE_ABS := R_X86_64_JUMP_SLOT|R_X86_64_GLOB_DAT|R_X86_64_RELATIVE|
 ARCH_REL_TYPE_ABS += R_386_GLOB_DAT|R_386_JMP_SLOT|R_386_RELATIVE
 include $(srctree)/lib/vdso/Makefile
-KBUILD_CFLAGS += $(DISABLE_LTO)
-
 # Sanitizer runtimes are unavailable and cannot be linked here.
 KASAN_SANITIZE := n
 UBSAN_SANITIZE := n
@@ -92,7 +90,7 @@ ifneq ($(RETPOLINE_VDSO_CFLAGS),)
 endif
 endif
-$(vobjs): KBUILD_CFLAGS := $(filter-out $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS)) $(CFL)
+$(vobjs): KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO) $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS)) $(CFL)
 #
 # vDSO code runs in userspace and -pg doesn't help with profiling anyway.
@@ -150,6 +148,7 @@ KBUILD_CFLAGS_32 := $(filter-out -fno-pic,$(KBUILD_CFLAGS_32))
 KBUILD_CFLAGS_32 := $(filter-out -mfentry,$(KBUILD_CFLAGS_32))
 KBUILD_CFLAGS_32 := $(filter-out $(GCC_PLUGINS_CFLAGS),$(KBUILD_CFLAGS_32))
 KBUILD_CFLAGS_32 := $(filter-out $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS_32))
+KBUILD_CFLAGS_32 := $(filter-out $(CC_FLAGS_LTO),$(KBUILD_CFLAGS_32))
 KBUILD_CFLAGS_32 += -m32 -msoft-float -mregparm=0 -fpic
 KBUILD_CFLAGS_32 += $(call cc-option, -fno-stack-protector)
 KBUILD_CFLAGS_32 += $(call cc-option, -foptimize-sibling-calls)
-- 
2.27.0.212.ge8ba1cc988-goog
* [PATCH 20/22] x86, ftrace: disable recordmcount for ftrace_make_nop
From: Sami Tolvanen @ 2020-06-24 20:31 UTC
To: Masahiro Yamada, Will Deacon
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

Ignore mcount relocations in ftrace_make_nop.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/x86/kernel/ftrace.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 51504566b3a6..c3b28b81277b 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -121,6 +121,7 @@ ftrace_modify_code_direct(unsigned long ip, const char *old_code,
 return 0;
 }
+__nomcount
 int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, unsigned long addr)
 {
 unsigned long ip = rec->ip;
-- 
2.27.0.212.ge8ba1cc988-goog
* [PATCH 21/22] x86, relocs: Ignore L4_PAGE_OFFSET relocations
From: Sami Tolvanen @ 2020-06-24 20:31 UTC
To: Masahiro Yamada, Will Deacon
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

L4_PAGE_OFFSET is a constant value, so don't warn about absolute
relocations.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/x86/tools/relocs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index ce7188cbdae5..8f3bf34840ce 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -47,6 +47,7 @@ static const char * const sym_regex_kernel[S_NSYMTYPES] = {
 [S_ABS] =
 "^(xen_irq_disable_direct_reloc$|"
 "xen_save_fl_direct_reloc$|"
+ "L4_PAGE_OFFSET|"
 "VDSO|"
 "__crc_)",
-- 
2.27.0.212.ge8ba1cc988-goog
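To illustrate what the added alternative does, here is a minimal sketch that compiles only the pattern fragment visible in the hunk above and checks a few symbol names against it. It assumes the table entries are used as POSIX extended regular expressions; the helper name `is_ignored_abs_symbol` is hypothetical, and the real `[S_ABS]` entry may contain alternatives that the diff context does not show.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Pattern fragment copied from the diff context of patch 21/22.
 * Assumption: the full entry in relocs.c may list more alternatives.
 */
static const char abs_pattern[] =
	"^(xen_irq_disable_direct_reloc$|"
	"xen_save_fl_direct_reloc$|"
	"L4_PAGE_OFFSET|"
	"VDSO|"
	"__crc_)";

/*
 * Hypothetical helper: returns 1 if sym matches the pattern, 0 if it
 * does not, and -1 if the pattern fails to compile.
 */
int is_ignored_abs_symbol(const char *sym)
{
	regex_t re;
	int rc;

	/* Assumption: extended syntax, no sub-match reporting needed. */
	if (regcomp(&re, abs_pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;

	rc = regexec(&re, sym, 0, NULL, 0);
	regfree(&re);

	return rc == 0;
}
```

Note that, as written in the patch, the `L4_PAGE_OFFSET` alternative has no trailing `$`, so under these assumptions it behaves as a prefix match, like `VDSO` and `__crc_`, whereas the two `xen_*` alternatives only match the exact symbol name.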
* [PATCH 22/22] x86, build: allow LTO_CLANG and THINLTO to be selected
From: Sami Tolvanen @ 2020-06-24 20:32 UTC
To: Masahiro Yamada, Will Deacon
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

Allow CONFIG_LTO_CLANG and CONFIG_THINLTO to be enabled.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/x86/Kconfig | 2 ++
 arch/x86/Makefile | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6a0cc524882d..df335b1f9c31 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -92,6 +92,8 @@ config X86
 select ARCH_SUPPORTS_ACPI
 select ARCH_SUPPORTS_ATOMIC_RMW
 select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
+ select ARCH_SUPPORTS_LTO_CLANG if X86_64
+ select ARCH_SUPPORTS_THINLTO if X86_64
 select ARCH_USE_BUILTIN_BSWAP
 select ARCH_USE_QUEUED_RWLOCKS
 select ARCH_USE_QUEUED_SPINLOCKS
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 00e378de8bc0..a1abc1e081ad 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -188,6 +188,11 @@ ifdef CONFIG_X86_64
 KBUILD_LDFLAGS += $(call ld-option, -z max-page-size=0x200000)
 endif
+ifdef CONFIG_LTO_CLANG
+KBUILD_LDFLAGS += -plugin-opt=-code-model=kernel \
+ -plugin-opt=-stack-alignment=$(if $(CONFIG_X86_32),4,8)
+endif
+
 # Workaround for a gcc prelease that unfortunately was shipped in a suse release
 KBUILD_CFLAGS += -Wno-sign-compare
 #
-- 
2.27.0.212.ge8ba1cc988-goog
* Re: [PATCH 00/22] add support for Clang LTO
From: Peter Zijlstra @ 2020-06-24 21:15 UTC
To: Sami Tolvanen
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

On Wed, Jun 24, 2020 at 01:31:38PM -0700, Sami Tolvanen wrote:
> This patch series adds support for building x86_64 and arm64 kernels
> with Clang's Link Time Optimization (LTO).
>
> In addition to performance, the primary motivation for LTO is to allow
> Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's
> Pixel devices have shipped with LTO+CFI kernels since 2018.
>
> Most of the patches are build system changes for handling LLVM bitcode,
> which Clang produces with LTO instead of ELF object files, postponing
> ELF processing until a later stage, and ensuring initcall ordering.
>
> Note that first objtool patch in the series is already in linux-next,
> but as it's needed with LTO, I'm including it also here to make testing
> easier.

I'm very sad that yet again, memory ordering isn't addressed. LTO vastly
increases the range of the optimizer to wreck things.
* Re: [PATCH 00/22] add support for Clang LTO
From: Sami Tolvanen @ 2020-06-24 21:30 UTC
To: Peter Zijlstra
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

On Wed, Jun 24, 2020 at 11:15:40PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 24, 2020 at 01:31:38PM -0700, Sami Tolvanen wrote:
> > This patch series adds support for building x86_64 and arm64 kernels
> > with Clang's Link Time Optimization (LTO).
> >
> > In addition to performance, the primary motivation for LTO is to allow
> > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's
> > Pixel devices have shipped with LTO+CFI kernels since 2018.
> >
> > Most of the patches are build system changes for handling LLVM bitcode,
> > which Clang produces with LTO instead of ELF object files, postponing
> > ELF processing until a later stage, and ensuring initcall ordering.
> >
> > Note that first objtool patch in the series is already in linux-next,
> > but as it's needed with LTO, I'm including it also here to make testing
> > easier.
>
> I'm very sad that yet again, memory ordering isn't addressed. LTO vastly
> increases the range of the optimizer to wreck things.

I believe Will has some thoughts about this, and patches, but I'll let
him talk about it.

Sami
* Re: [PATCH 00/22] add support for Clang LTO
From: Will Deacon @ 2020-06-25 8:27 UTC
To: Sami Tolvanen
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, linux-pci, linux-arm-kernel

On Wed, Jun 24, 2020 at 02:30:14PM -0700, Sami Tolvanen wrote:
> On Wed, Jun 24, 2020 at 11:15:40PM +0200, Peter Zijlstra wrote:
> > On Wed, Jun 24, 2020 at 01:31:38PM -0700, Sami Tolvanen wrote:
> > > This patch series adds support for building x86_64 and arm64 kernels
> > > with Clang's Link Time Optimization (LTO).
> > >
> > > In addition to performance, the primary motivation for LTO is to allow
> > > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's
> > > Pixel devices have shipped with LTO+CFI kernels since 2018.
> > >
> > > Most of the patches are build system changes for handling LLVM bitcode,
> > > which Clang produces with LTO instead of ELF object files, postponing
> > > ELF processing until a later stage, and ensuring initcall ordering.
> > >
> > > Note that first objtool patch in the series is already in linux-next,
> > > but as it's needed with LTO, I'm including it also here to make testing
> > > easier.
> >
> > I'm very sad that yet again, memory ordering isn't addressed. LTO vastly
> > increases the range of the optimizer to wreck things.
>
> I believe Will has some thoughts about this, and patches, but I'll let
> him talk about it.

Thanks for reminding me! I will get patches out ASAP (I've been avoiding
the rebase from hell, but this is the motivation I need).

Will
* Re: [PATCH 00/22] add support for Clang LTO
From: Nick Desaulniers @ 2020-06-24 21:31 UTC
To: Peter Zijlstra
Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM

On Wed, Jun 24, 2020 at 2:15 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Jun 24, 2020 at 01:31:38PM -0700, Sami Tolvanen wrote:
> > This patch series adds support for building x86_64 and arm64 kernels
> > with Clang's Link Time Optimization (LTO).
> >
> > In addition to performance, the primary motivation for LTO is to allow
> > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's
> > Pixel devices have shipped with LTO+CFI kernels since 2018.
> >
> > Most of the patches are build system changes for handling LLVM bitcode,
> > which Clang produces with LTO instead of ELF object files, postponing
> > ELF processing until a later stage, and ensuring initcall ordering.
> >
> > Note that first objtool patch in the series is already in linux-next,
> > but as it's needed with LTO, I'm including it also here to make testing
> > easier.
>
> I'm very sad that yet again, memory ordering isn't addressed. LTO vastly
> increases the range of the optimizer to wreck things.

Hi Peter, could you expand on the issue for the folks on the thread?
I'm happy to try to hack something up in LLVM if we check that X does
or does not happen; maybe we can even come up with some concrete test
cases that can be added to LLVM's codebase?

-- 
Thanks,
~Nick Desaulniers
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-24 21:31 ` Nick Desaulniers @ 2020-06-25 8:03 ` Peter Zijlstra 2020-06-25 8:24 ` Peter Zijlstra 0 siblings, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-06-25 8:03 UTC (permalink / raw) To: Nick Desaulniers Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Wed, Jun 24, 2020 at 02:31:36PM -0700, Nick Desaulniers wrote: > On Wed, Jun 24, 2020 at 2:15 PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Wed, Jun 24, 2020 at 01:31:38PM -0700, Sami Tolvanen wrote: > > > This patch series adds support for building x86_64 and arm64 kernels > > > with Clang's Link Time Optimization (LTO). > > > > > > In addition to performance, the primary motivation for LTO is to allow > > > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's > > > Pixel devices have shipped with LTO+CFI kernels since 2018. > > > > > > Most of the patches are build system changes for handling LLVM bitcode, > > > which Clang produces with LTO instead of ELF object files, postponing > > > ELF processing until a later stage, and ensuring initcall ordering. > > > > > > Note that first objtool patch in the series is already in linux-next, > > > but as it's needed with LTO, I'm including it also here to make testing > > > easier. > > > > I'm very sad that yet again, memory ordering isn't addressed. LTO vastly > > increases the range of the optimizer to wreck things. > > Hi Peter, could you expand on the issue for the folks on the thread? > I'm happy to try to hack something up in LLVM if we check that X does > or does not happen; maybe we can even come up with some concrete test > cases that can be added to LLVM's codebase? I'm sure Will will respond, but the basic issue is the trainwreck C11 made of dependent loads. 
Anyway, here's a link to the last time this came up:

  https://lore.kernel.org/linux-arm-kernel/20171116174830.GX3624@linux.vnet.ibm.com/
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-25 8:03 ` Peter Zijlstra @ 2020-06-25 8:24 ` Peter Zijlstra 2020-06-25 8:57 ` Peter Zijlstra 0 siblings, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-06-25 8:24 UTC (permalink / raw) To: Nick Desaulniers Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Thu, Jun 25, 2020 at 10:03:13AM +0200, Peter Zijlstra wrote: > On Wed, Jun 24, 2020 at 02:31:36PM -0700, Nick Desaulniers wrote: > > On Wed, Jun 24, 2020 at 2:15 PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > On Wed, Jun 24, 2020 at 01:31:38PM -0700, Sami Tolvanen wrote: > > > > This patch series adds support for building x86_64 and arm64 kernels > > > > with Clang's Link Time Optimization (LTO). > > > > > > > > In addition to performance, the primary motivation for LTO is to allow > > > > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's > > > > Pixel devices have shipped with LTO+CFI kernels since 2018. > > > > > > > > Most of the patches are build system changes for handling LLVM bitcode, > > > > which Clang produces with LTO instead of ELF object files, postponing > > > > ELF processing until a later stage, and ensuring initcall ordering. > > > > > > > > Note that first objtool patch in the series is already in linux-next, > > > > but as it's needed with LTO, I'm including it also here to make testing > > > > easier. > > > > > > I'm very sad that yet again, memory ordering isn't addressed. LTO vastly > > > increases the range of the optimizer to wreck things. > > > > Hi Peter, could you expand on the issue for the folks on the thread? 
> > I'm happy to try to hack something up in LLVM if we check that X does
> > or does not happen; maybe we can even come up with some concrete test
> > cases that can be added to LLVM's codebase?
>
> I'm sure Will will respond, but the basic issue is the trainwreck C11
> made of dependent loads.
>
> Anyway, here's a link to the last time this came up:
>
> https://lore.kernel.org/linux-arm-kernel/20171116174830.GX3624@linux.vnet.ibm.com/

Another good read:

  https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/

and having (partially) re-read that, I now worry intensely about things
like latch_tree_find(), cyc2ns_read_begin, __ktime_get_fast_ns().

It looks like kernel/time/sched_clock.c uses raw_read_seqcount() which
deviates from the above patterns by, for some reason, using a primitive
that includes an extra smp_rmb().

And this is just the few things I could remember off the top of my head,
who knows what else is out there.
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-25 8:24 ` Peter Zijlstra @ 2020-06-25 8:57 ` Peter Zijlstra 2020-06-30 19:19 ` Marco Elver 0 siblings, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-06-25 8:57 UTC (permalink / raw) To: Nick Desaulniers Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM

On Thu, Jun 25, 2020 at 10:24:33AM +0200, Peter Zijlstra wrote:
> On Thu, Jun 25, 2020 at 10:03:13AM +0200, Peter Zijlstra wrote:
> > I'm sure Will will respond, but the basic issue is the trainwreck C11
> > made of dependent loads.
> >
> > Anyway, here's a link to the last time this came up:
> >
> > https://lore.kernel.org/linux-arm-kernel/20171116174830.GX3624@linux.vnet.ibm.com/
>
> Another good read:
>
> https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/
>
> and having (partially) re-read that, I now worry intensely about things
> like latch_tree_find(), cyc2ns_read_begin, __ktime_get_fast_ns().
>
> It looks like kernel/time/sched_clock.c uses raw_read_seqcount() which
> deviates from the above patterns by, for some reason, using a primitive
> that includes an extra smp_rmb().
>
> And this is just the few things I could remember off the top of my head,
> who knows what else is out there.

As an example, let us consider __ktime_get_fast_ns(), the critical bit
is:

    seq = raw_read_seqcount_latch(&tkf->seq);
    tkr = tkf->base + (seq & 0x01);
    now = tkr->base;

And we hard rely on that being a dependent load, so:

    LOAD    seq, (tkf->seq)
    LOAD    tkr, tkf->base
    AND     seq, 1
    MUL     seq, sizeof(tk_read_base)
    ADD     tkr, seq
    LOAD    now, (tkr->base)

Such that we obtain 'now' as a direct dependency on 'seq'. This ensures
the loads are ordered.
A compiler can wreck this by translating it into something like:

    LOAD    seq, (tkf->seq)
    LOAD    tkr, tkf->base
    AND     seq, 1
    CMP     seq, 0
    JE      1f
    ADD     tkr, sizeof(tk_read_base)
1:  LOAD    now, (tkr->base)

Because now the machine can speculate and load now before seq, breaking
the ordering.
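For readers who prefer C to pseudo-assembly, the two shapes above can be
sketched as follows. This is a minimal illustration, not kernel code:
struct read_base and struct fast_clock are hypothetical stand-ins for
tk_read_base and tk_fast, and a plain volatile read is assumed as a
stand-in for raw_read_seqcount_latch(). The first function computes the
address of the second load from seq (an address dependency); the second
selects the address with a branch on seq (only a control dependency).
Nothing stops a compiler from emitting either shape for either function,
which is the worry being described.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct tk_read_base. */
struct read_base {
    uint64_t base;
};

/* Hypothetical stand-in for struct tk_fast: two copies, picked by seq's low bit. */
struct fast_clock {
    unsigned int seq;
    struct read_base base[2];
};

/* Address of the second load is computed from seq: an address (data) dependency. */
static uint64_t read_now_data_dep(const struct fast_clock *fc)
{
    /* Assumption: a volatile read approximates raw_read_seqcount_latch(). */
    unsigned int seq = *(const volatile unsigned int *)&fc->seq;
    const struct read_base *rb = fc->base + (seq & 0x01);

    return rb->base;
}

/*
 * Same result, but the address is chosen by a branch on seq. The load
 * address no longer depends on seq, so only a control dependency is left.
 */
static uint64_t read_now_ctrl_dep(const struct fast_clock *fc)
{
    unsigned int seq = *(const volatile unsigned int *)&fc->seq;
    const struct read_base *rb = fc->base;

    if (seq & 0x01)
        rb = fc->base + 1;

    return rb->base;
}

/* Single-threaded, the two forms always agree. */
static void check_same_result(const struct fast_clock *fc)
{
    assert(read_now_data_dep(fc) == read_now_ctrl_dep(fc));
}
```

In a single-threaded run the two are indistinguishable, which is why the
transformation is legal for the compiler and why it would be so hard to
notice in testing.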
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-25 8:57 ` Peter Zijlstra @ 2020-06-30 19:19 ` Marco Elver 2020-06-30 20:12 ` Peter Zijlstra 0 siblings, 1 reply; 212+ messages in thread From: Marco Elver @ 2020-06-30 19:19 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM I was asked for input on this, and after a few days digging through some history, thought I'd comment. Hope you don't mind. On Thu, Jun 25, 2020 at 10:57AM +0200, Peter Zijlstra wrote: > On Thu, Jun 25, 2020 at 10:24:33AM +0200, Peter Zijlstra wrote: > > On Thu, Jun 25, 2020 at 10:03:13AM +0200, Peter Zijlstra wrote: > > > I'm sure Will will respond, but the basic issue is the trainwreck C11 > > > made of dependent loads. > > > > > > Anyway, here's a link to the last time this came up: > > > > > > https://lore.kernel.org/linux-arm-kernel/20171116174830.GX3624@linux.vnet.ibm.com/ > > > > Another good read: > > > > https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/ [...] > Because now the machine can speculate and load now before seq, breaking > the ordering. First of all, I agree with the concerns, but not because of LTO. To set the stage better, and summarize the fundamental problem again: we're in the unfortunate situation that no compiler today has a way to _efficiently_ deal with C11's memory_order_consume [https://lwn.net/Articles/588300/]. If we did, we could just use that and be done with it. But, sadly, that doesn't seem possible right now -- compilers just say consume==acquire. 
Will suggests doing the same in the kernel: https://lkml.kernel.org/r/20200630173734.14057-19-will@kernel.org What we're most worried about right now is the existence of compiler transformations that could break data dependencies by e.g. turning them into control dependencies. If this is a real worry, I don't think LTO is the magical feature that will uncover those optimizations. If these compiler transformations are real, they also exist in a normal build! And if we are worried about them, we need to stop relying on dependent load ordering across the board; or switch to -O0 for everything. Clearly, we don't want either. Why do we think LTO is special? With LTO, Clang just emits LLVM bitcode instead of ELF objects, and during the linker stage intermodular optimizations across translation unit boundaries are done that might not be possible otherwise [https://llvm.org/docs/LinkTimeOptimization.html]. From the memory model side of things, if we could fully convey our intent to the compiler (the imaginary consume), there would be no problem, because all optimization stages from bitcode generation to the final machine code generation after LTO know about the intended semantics. (Also, keep in mind that LTO is _not_ doing post link optimization of machine code binaries!) But as far as we can tell, there is no evidence of the dreaded "data dependency to control dependency" conversion with LTO that isn't there in non-LTO builds, if it's even there at all. Has the data to control dependency conversion been encountered in the wild? If not, is the resulting reaction an overreaction? If so, we need to be careful blaming LTO for something that it isn't even guilty of. So, we are probably better off untangling LTO from the story: 1. LTO or no LTO does not matter. The LTO series should not get tangled up with memory model issues. 2. The memory model question and problems need to be answered and addressed separately. Thoughts? 
Thanks,
-- Marco
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-30 19:19 ` Marco Elver @ 2020-06-30 20:12 ` Peter Zijlstra 2020-06-30 20:30 ` Paul E. McKenney 0 siblings, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-06-30 20:12 UTC (permalink / raw) To: Marco Elver Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM

On Tue, Jun 30, 2020 at 09:19:31PM +0200, Marco Elver wrote:
> I was asked for input on this, and after a few days digging through some
> history, thought I'd comment. Hope you don't mind.

Not at all, being the one that asked :-)

> First of all, I agree with the concerns, but not because of LTO.
>
> To set the stage better, and summarize the fundamental problem again:
> we're in the unfortunate situation that no compiler today has a way to
> _efficiently_ deal with C11's memory_order_consume
> [https://lwn.net/Articles/588300/]. If we did, we could just use that
> and be done with it. But, sadly, that doesn't seem possible right now --
> compilers just say consume==acquire.

I'm not convinced C11 memory_order_consume would actually work for us,
even if it would work. That is, given:

  https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/

only pointers can have consume, but like I pointed out, we have code
that relies on dependent loads from integers.

> Will suggests doing the same in the
> kernel: https://lkml.kernel.org/r/20200630173734.14057-19-will@kernel.org

PowerPC would need a similar thing, it too will not preserve causality
for control dependencies.

> What we're most worried about right now is the existence of compiler
> transformations that could break data dependencies by e.g. turning them
> into control dependencies.

Correct.
> If this is a real worry, I don't think LTO is the magical feature that > will uncover those optimizations. If these compiler transformations are > real, they also exist in a normal build! Agreed, _however_ with the caveat that LTO could make them more common. After all, with whole program analysis, the compiler might be able to more easily determine that our pointer @ptr is only ever assigned the values of &A, &B or &C, while without that visibility it would not be able to determine this. Once it knows @ptr has a limited number of determined values, the conversion into control dependencies becomes much more likely. > And if we are worried about them, we need to stop relying on dependent > load ordering across the board; or switch to -O0 for everything. > Clearly, we don't want either. Agreed. > Why do we think LTO is special? As argued above, whole-program analysis would make it more likely. But I agree the fundamental problem exists independent from LTO. > But as far as we can tell, there is no evidence of the dreaded "data > dependency to control dependency" conversion with LTO that isn't there > in non-LTO builds, if it's even there at all. Has the data to control > dependency conversion been encountered in the wild? If not, is the > resulting reaction an overreaction? If so, we need to be careful blaming > LTO for something that it isn't even guilty of. It is mostly paranoia; in a large part driven by the fact that even if such a conversion were to be done, it could go a very long time without actually causing problems, and longer still for such problems to be traced back to such an 'optimization'. That is, the collective hurt from debugging too many ordering issues. > So, we are probably better off untangling LTO from the story: > > 1. LTO or no LTO does not matter. The LTO series should not get tangled > up with memory model issues. > > 2. The memory model question and problems need to be answered and > addressed separately. > > Thoughts? 
How hard would it be to create something that analyzes a build and
looks for all 'dependent load -> control dependency' transformations
headed by a volatile (and/or from asm) load and issues a warning for
them?

This would give us an indication of how valuable this transformation is
for the kernel. I'm hoping/expecting it's vanishingly rare, but what do
I know.
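The whole-program argument in this message (a pointer that is only ever
assigned &A, &B or &C) can be made concrete with a small sketch. All
names here (struct item, A, B, C, ptr, publish) are hypothetical and
chosen only to mirror the wording above; the second reader is what a
compiler could emit once it has proven the set of possible pointer
values, not something any particular compiler is known to produce.

```c
#include <assert.h>

struct item {
    int val;
};

static struct item A = { 1 }, B = { 2 }, C = { 3 };

/* Published pointer; in this sketch it is only ever assigned &A, &B or &C. */
static struct item *ptr = &A;

/* The only writer: with whole-program visibility the compiler can see this. */
static void publish(int which)
{
    assert(which >= 0 && which <= 2);
    ptr = (which == 0) ? &A : (which == 1) ? &B : &C;
}

/* What the source says: load the pointer, then load through it. */
static int read_val_data_dep(void)
{
    struct item *p = *(struct item *volatile *)&ptr;

    return p->val;
}

/*
 * What an optimizer that knows p is one of &A, &B, &C could emit instead:
 * compare-and-branch, then load from a fixed address. The second load no
 * longer takes its address from the first one.
 */
static int read_val_ctrl_dep(void)
{
    struct item *p = *(struct item *volatile *)&ptr;

    if (p == &A)
        return A.val;
    if (p == &B)
        return B.val;
    return C.val;
}
```

Without LTO the writer may live in another translation unit, so the
compiler cannot know the set of values and has no basis for the second
form; that is the sense in which LTO could make the conversion more
likely.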
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-30 20:12 ` Peter Zijlstra @ 2020-06-30 20:30 ` Paul E. McKenney 2020-07-01 9:10 ` Peter Zijlstra 2020-07-01 9:41 ` Marco Elver 0 siblings, 2 replies; 212+ messages in thread From: Paul E. McKenney @ 2020-06-30 20:30 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Tue, Jun 30, 2020 at 10:12:43PM +0200, Peter Zijlstra wrote: > On Tue, Jun 30, 2020 at 09:19:31PM +0200, Marco Elver wrote: > > I was asked for input on this, and after a few days digging through some > > history, thought I'd comment. Hope you don't mind. > > Not at all, being the one that asked :-) > > > First of all, I agree with the concerns, but not because of LTO. > > > > To set the stage better, and summarize the fundamental problem again: > > we're in the unfortunate situation that no compiler today has a way to > > _efficiently_ deal with C11's memory_order_consume > > [https://lwn.net/Articles/588300/]. If we did, we could just use that > > and be done with it. But, sadly, that doesn't seem possible right now -- > > compilers just say consume==acquire. > > I'm not convinced C11 memory_order_consume would actually work for us, > even if it would work. That is, given: > > https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/ > > only pointers can have consume, but like I pointed out, we have code > that relies on dependent loads from integers. I agree that C11 memory_order_consume is not normally what we want, given that it is universally promoted to memory_order_acquire. However, dependent loads from integers are, if anything, more difficult to defend from the compiler than are control dependencies. 
This applies
doubly to integers that are used to index two-element arrays, in which
case you are just asking the compiler to destroy your dependent loads
by converting them into control dependencies.

> > Will suggests doing the same in the
> > kernel: https://lkml.kernel.org/r/20200630173734.14057-19-will@kernel.org
>
> PowerPC would need a similar thing, it too will not preserve causality
> for control dependencies.
>
> > What we're most worried about right now is the existence of compiler
> > transformations that could break data dependencies by e.g. turning them
> > into control dependencies.
>
> Correct.
>
> > If this is a real worry, I don't think LTO is the magical feature that
> > will uncover those optimizations. If these compiler transformations are
> > real, they also exist in a normal build!
>
> Agreed, _however_ with the caveat that LTO could make them more common.
>
> After all, with whole program analysis, the compiler might be able to
> more easily determine that our pointer @ptr is only ever assigned the
> values of &A, &B or &C, while without that visibility it would not be
> able to determine this.
>
> Once it knows @ptr has a limited number of determined values, the
> conversion into control dependencies becomes much more likely.

Which would of course break dependent loads.

> > And if we are worried about them, we need to stop relying on dependent
> > load ordering across the board; or switch to -O0 for everything.
> > Clearly, we don't want either.
>
> Agreed.
>
> > Why do we think LTO is special?
>
> As argued above, whole-program analysis would make it more likely. But I
> agree the fundamental problem exists independent from LTO.
>
> > But as far as we can tell, there is no evidence of the dreaded "data
> > dependency to control dependency" conversion with LTO that isn't there
> > in non-LTO builds, if it's even there at all. Has the data to control
> > dependency conversion been encountered in the wild?
If not, is the
> > resulting reaction an overreaction? If so, we need to be careful blaming
> > LTO for something that it isn't even guilty of.
>
> It is mostly paranoia; in a large part driven by the fact that even if
> such a conversion were to be done, it could go a very long time without
> actually causing problems, and longer still for such problems to be
> traced back to such an 'optimization'.
>
> That is, the collective hurt from debugging too many ordering issues.
>
> > So, we are probably better off untangling LTO from the story:
> >
> > 1. LTO or no LTO does not matter. The LTO series should not get tangled
> > up with memory model issues.
> >
> > 2. The memory model question and problems need to be answered and
> > addressed separately.
> >
> > Thoughts?
>
> How hard would it be to create something that analyzes a build and
> looks for all 'dependent load -> control dependency' transformations
> headed by a volatile (and/or from asm) load and issues a warning for
> them?
>
> This would give us an indication of how valuable this transformation is
> for the kernel. I'm hoping/expecting it's vanishingly rare, but what do
> I know.
>

This could be quite useful!

    Thanx, Paul
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-30 20:30 ` Paul E. McKenney @ 2020-07-01 9:10 ` Peter Zijlstra 2020-07-01 14:20 ` David Laight 2020-07-01 9:41 ` Marco Elver 1 sibling, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-07-01 9:10 UTC (permalink / raw) To: Paul E. McKenney Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Tue, Jun 30, 2020 at 01:30:16PM -0700, Paul E. McKenney wrote: > On Tue, Jun 30, 2020 at 10:12:43PM +0200, Peter Zijlstra wrote: > > I'm not convinced C11 memory_order_consume would actually work for us, > > even if it would work. That is, given: > > > > https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/ > > > > only pointers can have consume, but like I pointed out, we have code > > that relies on dependent loads from integers. > > I agree that C11 memory_order_consume is not normally what we want, > given that it is universally promoted to memory_order_acquire. > > However, dependent loads from integers are, if anything, more difficult > to defend from the compiler than are control dependencies. This applies > doubly to integers that are used to index two-element arrays, in which > case you are just asking the compiler to destroy your dependent loads > by converting them into control dependencies. Yes, I'm aware. However, as you might know, I'm firmly in the 'C is a glorified assembler' camp (as I expect most actual OS people are, out of necessity if nothing else) and if I wanted a control dependency I would've bloody well written one. I think an optimizing compiler is awesome, but only in so far as that optimization is actually helpful -- and yes, I just stepped into a giant twilight zone there. 
That is, any optimization that has _any_ controversy should be
controllable (like -fno-strict-overflow -fno-strict-aliasing) and I'd
very much like the same here.

In a larger context, I still think that eliminating speculative stores
is both necessary and sufficient to avoid out-of-thin-air. So I'd also
love to get some control on that.
* RE: [PATCH 00/22] add support for Clang LTO 2020-07-01 9:10 ` Peter Zijlstra @ 2020-07-01 14:20 ` David Laight 2020-07-01 16:06 ` Paul E. McKenney 0 siblings, 1 reply; 212+ messages in thread From: David Laight @ 2020-07-01 14:20 UTC (permalink / raw) To: 'Peter Zijlstra', Paul E. McKenney Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM From: Peter Zijlstra > Sent: 01 July 2020 10:11 > On Tue, Jun 30, 2020 at 01:30:16PM -0700, Paul E. McKenney wrote: > > On Tue, Jun 30, 2020 at 10:12:43PM +0200, Peter Zijlstra wrote: > > > > I'm not convinced C11 memory_order_consume would actually work for us, > > > even if it would work. That is, given: > > > > > > https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/ > > > > > > only pointers can have consume, but like I pointed out, we have code > > > that relies on dependent loads from integers. > > > > I agree that C11 memory_order_consume is not normally what we want, > > given that it is universally promoted to memory_order_acquire. > > > > However, dependent loads from integers are, if anything, more difficult > > to defend from the compiler than are control dependencies. This applies > > doubly to integers that are used to index two-element arrays, in which > > case you are just asking the compiler to destroy your dependent loads > > by converting them into control dependencies. > > Yes, I'm aware. However, as you might know, I'm firmly in the 'C is a > glorified assembler' camp (as I expect most actual OS people are, out of > necessity if nothing else) and if I wanted a control dependency I > would've bloody well written one. I write in C because doing register tracking is hard :-) I've got an hdlc implementation in C that is carefully adjusted so that the worst case path is bounded. 
I probably know every one of the 1000 instructions in it.

Would an asm statement that uses the same 'register' for input and
output but doesn't actually do anything help?
It won't generate any code, but the compiler ought to assume that
it might change the value - so can't do optimisations that track
the value across the call.

> I think an optimizing compiler is awesome, but only in so far as that
> optimization is actually helpful -- and yes, I just stepped into a giant
> twilight zone there. That is, any optimization that has _any_
> controversy should be controllable (like -fno-strict-overflow
> -fno-strict-aliasing) and I'd very much like the same here.

I'm fed up of gcc generating the code that uses SIMD instructions
for the 'tail' loop at the end of a function that is already doing
SIMD operations for the main part of the loop.
And compilers that convert a byte copy loop to 'rep movsb'.
If I'm copying 3 or 4 bytes I don't want a 40 clock overhead.

    David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-01 14:20 ` David Laight @ 2020-07-01 16:06 ` Paul E. McKenney 2020-07-02 9:37 ` David Laight 0 siblings, 1 reply; 212+ messages in thread From: Paul E. McKenney @ 2020-07-01 16:06 UTC (permalink / raw) To: David Laight Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, 'Peter Zijlstra', Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Wed, Jul 01, 2020 at 02:20:13PM +0000, David Laight wrote: > From: Peter Zijlstra > > Sent: 01 July 2020 10:11 > > On Tue, Jun 30, 2020 at 01:30:16PM -0700, Paul E. McKenney wrote: > > > On Tue, Jun 30, 2020 at 10:12:43PM +0200, Peter Zijlstra wrote: > > > > > > I'm not convinced C11 memory_order_consume would actually work for us, > > > > even if it would work. That is, given: > > > > > > > > https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/ > > > > > > > > only pointers can have consume, but like I pointed out, we have code > > > > that relies on dependent loads from integers. > > > > > > I agree that C11 memory_order_consume is not normally what we want, > > > given that it is universally promoted to memory_order_acquire. > > > > > > However, dependent loads from integers are, if anything, more difficult > > > to defend from the compiler than are control dependencies. This applies > > > doubly to integers that are used to index two-element arrays, in which > > > case you are just asking the compiler to destroy your dependent loads > > > by converting them into control dependencies. > > > > Yes, I'm aware. However, as you might know, I'm firmly in the 'C is a > > glorified assembler' camp (as I expect most actual OS people are, out of > > necessity if nothing else) and if I wanted a control dependency I > > would've bloody well written one. 
> > I write in C because doing register tracking is hard :-) > I've got an hdlc implementation in C that is carefully adjusted > so that the worst case path is bounded. > I probably know every one of the 1000 instructions in it. > > Would an asm statement that uses the same 'register' for input and > output but doesn't actually do anything help? > It won't generate any code, but the compiler ought to assume that > it might change the value - so can't do optimisations that track > the value across the call. It might replace the volatile load, but there are optimizations that apply to the downstream code as well. Or are you suggesting periodically pushing the dependent variable through this asm? That might work, but it would be easier and more maintainable to just mark the variable. > > I think an optimizing compiler is awesome, but only in so far as that > > optimization is actually helpful -- and yes, I just stepped into a giant > > twilight zone there. That is, any optimization that has _any_ > > controversy should be controllable (like -fno-strict-overflow > > -fno-strict-aliasing) and I'd very much like the same here. > > I'm fed up of gcc generating the code that uses SIMD instructions > for the 'tail' loop at the end of a function that is already doing > SIMD operations for the main part of the loop. > And compilers that convert a byte copy loop to 'rep movsb'. > If I'm copying 3 or 4 bytes I don't want a 40 clock overhead. Agreed, compilers can often be all too "helpful". :-( Thanx, Paul _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* RE: [PATCH 00/22] add support for Clang LTO 2020-07-01 16:06 ` Paul E. McKenney @ 2020-07-02 9:37 ` David Laight 2020-07-02 18:00 ` Paul E. McKenney 0 siblings, 1 reply; 212+ messages in thread From: David Laight @ 2020-07-02 9:37 UTC (permalink / raw) To: 'paulmck@kernel.org' Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, 'Peter Zijlstra', Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM

From: Paul E. McKenney
> Sent: 01 July 2020 17:06
...
> > Would an asm statement that uses the same 'register' for input and
> > output but doesn't actually do anything help?
> > It won't generate any code, but the compiler ought to assume that
> > it might change the value - so can't do optimisations that track
> > the value across the call.
>
> It might replace the volatile load, but there are optimizations that
> apply to the downstream code as well.
>
> Or are you suggesting periodically pushing the dependent variable
> through this asm? That might work, but it would be easier and
> more maintainable to just mark the variable.

Marking the variable requires compiler support.
Although what 'volatile register int foo;' means might be interesting.

So I was thinking that in the case mentioned earlier you do:
    ptr += LAUNDER(offset & 1);
to ensure the compiler didn't convert to:
    if (offset & 1) ptr++;
(Which is probably a pessimisation - the reverse is likely better.)

    David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
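The LAUNDER() idea above can be sketched with GCC/Clang extended asm.
The macro body below is an assumption about what such a helper might
look like (the thread does not define it): an empty asm statement whose
output operand is tied to its input via the "0" matching constraint, so
the compiler must treat the result as an unknown value of the same type.
The kernel's OPTIMIZER_HIDE_VAR() has a similar shape. pick_slot() is a
hypothetical caller. As noted in the reply quoted above, this only hides
the value at the point of use; it is not a documented ordering guarantee
and does not by itself constrain later transformations of downstream
code.

```c
#include <assert.h>

/*
 * Hypothetical LAUNDER(): pass a value through an empty asm statement.
 * "=r" makes the value an output in a register; "0" forces the input
 * into the same register. No instructions are emitted, but the compiler
 * can no longer assume anything about the resulting value.
 * Relies on GNU C extensions (statement expressions, __typeof__).
 */
#define LAUNDER(x)                                            \
    ({                                                        \
        __typeof__(x) laundered_ = (x);                       \
        __asm__("" : "=r"(laundered_) : "0"(laundered_));     \
        laundered_;                                           \
    })

/* Select one of two adjacent slots from the low bit of offset. */
static const int *pick_slot(const int *ptr, unsigned int offset)
{
    assert(ptr != 0);

    /* The laundered value is opaque, so "if (offset & 1) ptr++" is not derivable here. */
    return ptr + LAUNDER(offset & 1);
}
```

Because the asm is empty, the run-time cost should be nil; the price is
that every dependent use has to be wrapped by hand, which is the
maintainability concern raised in the quoted reply.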
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-02 9:37 ` David Laight @ 2020-07-02 18:00 ` Paul E. McKenney 0 siblings, 0 replies; 212+ messages in thread From: Paul E. McKenney @ 2020-07-02 18:00 UTC (permalink / raw) To: David Laight Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, 'Peter Zijlstra', Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Thu, Jul 02, 2020 at 09:37:26AM +0000, David Laight wrote: > From: Paul E. McKenney > > Sent: 01 July 2020 17:06 > ... > > > Would an asm statement that uses the same 'register' for input and > > > output but doesn't actually do anything help? > > > It won't generate any code, but the compiler ought to assume that > > > it might change the value - so can't do optimisations that track > > > the value across the call. > > > > It might replace the volatile load, but there are optimizations that > > apply to the downstream code as well. > > > > Or are you suggesting periodically pushing the dependent variable > > through this asm? That might work, but it would be easier and > > more maintainable to just mark the variable. > > Marking the variable requires compiler support. > Although what 'volatile register int foo;' means might be interesting. > > So I was thinking that in the case mentioned earlier you do: > ptr += LAUNDER(offset & 1); > to ensure the compiler didn't convert to: > if (offset & 1) ptr++; > (Which is probably a pessimisation - the reverse is likely better.) Indeed, Akshat's prototype follows the "volatile" qualifier in many ways. https://github.com/AKG001/gcc/ Thanx, Paul _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-30 20:30 ` Paul E. McKenney 2020-07-01 9:10 ` Peter Zijlstra @ 2020-07-01 9:41 ` Marco Elver 2020-07-01 10:03 ` Will Deacon 2020-07-01 11:40 ` Peter Zijlstra 1 sibling, 2 replies; 212+ messages in thread From: Marco Elver @ 2020-07-01 9:41 UTC (permalink / raw) To: Paul E. McKenney, Peter Zijlstra Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Tue, 30 Jun 2020 at 22:30, Paul E. McKenney <paulmck@kernel.org> wrote: > On Tue, Jun 30, 2020 at 10:12:43PM +0200, Peter Zijlstra wrote: > > On Tue, Jun 30, 2020 at 09:19:31PM +0200, Marco Elver wrote: > > > First of all, I agree with the concerns, but not because of LTO. > > > > > > To set the stage better, and summarize the fundamental problem again: > > > we're in the unfortunate situation that no compiler today has a way to > > > _efficiently_ deal with C11's memory_order_consume > > > [https://lwn.net/Articles/588300/]. If we did, we could just use that > > > and be done with it. But, sadly, that doesn't seem possible right now -- > > > compilers just say consume==acquire. > > > > I'm not convinced C11 memory_order_consume would actually work for us, > > even if it would work. That is, given: > > > > https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/ > > > > only pointers can have consume, but like I pointed out, we have code > > that relies on dependent loads from integers. > > I agree that C11 memory_order_consume is not normally what we want, > given that it is universally promoted to memory_order_acquire. > > However, dependent loads from integers are, if anything, more difficult > to defend from the compiler than are control dependencies. 
This applies > doubly to integers that are used to index two-element arrays, in which > case you are just asking the compiler to destroy your dependent loads > by converting them into control dependencies. > > > > Will suggests doing the same in the > > > kernel: https://lkml.kernel.org/r/20200630173734.14057-19-will@kernel.org > > > > PowerPC would need a similar thing, it too will not preserve causality > > for control dependecies. > > > > > What we're most worried about right now is the existence of compiler > > > transformations that could break data dependencies by e.g. turning them > > > into control dependencies. > > > > Correct. > > > > > If this is a real worry, I don't think LTO is the magical feature that > > > will uncover those optimizations. If these compiler transformations are > > > real, they also exist in a normal build! > > > > Agreed, _however_ with the caveat that LTO could make them more common. > > > > After all, with whole program analysis, the compiler might be able to > > more easily determine that our pointer @ptr is only ever assigned the > > values of &A, &B or &C, while without that visibility it would not be > > able to determine this. > > > > Once it knows @ptr has a limited number of determined values, the > > conversion into control dependencies becomes much more likely. > > Which would of course break dependent loads. > > > > And if we are worried about them, we need to stop relying on dependent > > > load ordering across the board; or switch to -O0 for everything. > > > Clearly, we don't want either. > > > > Agreed. > > > > > Why do we think LTO is special? > > > > As argued above, whole-program analysis would make it more likely. But I > > agree the fundamental problem exists independent from LTO. > > > > > But as far as we can tell, there is no evidence of the dreaded "data > > > dependency to control dependency" conversion with LTO that isn't there > > > in non-LTO builds, if it's even there at all. 
Has the data to control > > > dependency conversion been encountered in the wild? If not, is the > > > resulting reaction an overreaction? If so, we need to be careful blaming > > > LTO for something that it isn't even guilty of. > > > > It is mostly paranoia; in a large part driven by the fact that even if > > such a conversion were to be done, it could go a very long time without > > actually causing problems, and longer still for such problems to be > > traced back to such an 'optimization'. > > > > That is, the collective hurt from debugging too many ordering issues. > > > > > So, we are probably better off untangling LTO from the story: > > > > > > 1. LTO or no LTO does not matter. The LTO series should not get tangled > > > up with memory model issues. > > > > > > 2. The memory model question and problems need to be answered and > > > addressed separately. > > > > > > Thoughts? > > > > How hard would it be to creates something that analyzes a build and > > looks for all 'dependent load -> control dependency' transformations > > headed by a volatile (and/or from asm) load and issues a warning for > > them? I was thinking about this, but in the context of the "auto-promote to acquire" which you didn't like. Issuing a warning should certainly be simpler. I think there is no one place where we know these transformations happen, but rather, need to analyze the IR before transformations, take note of all the dependent loads headed by volatile+asm, and then run an analysis after optimizations checking the dependencies are still there. > > This would give us an indication of how valuable this transformation is > > for the kernel. I'm hoping/expecting it's vanishingly rare, but what do > > I know. > > This could be quite useful! We might then even be able to say, "if you get this warning, turn on CONFIG_ACQUIRE_READ_DEPENDENCIES" (or however the option will be named). Or some other tricks, like automatically recompile the TU where this happens with the option. 
But again, this is not something that should specifically block LTO, because if we have this, we'll need to turn it on for everything. Thanks, -- Marco
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-01 9:41 ` Marco Elver @ 2020-07-01 10:03 ` Will Deacon 2020-07-01 11:40 ` Peter Zijlstra 1 sibling, 0 replies; 212+ messages in thread From: Will Deacon @ 2020-07-01 10:03 UTC (permalink / raw) To: Marco Elver Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Linux ARM On Wed, Jul 01, 2020 at 11:41:17AM +0200, Marco Elver wrote: > On Tue, 30 Jun 2020 at 22:30, Paul E. McKenney <paulmck@kernel.org> wrote: > > On Tue, Jun 30, 2020 at 10:12:43PM +0200, Peter Zijlstra wrote: > > > On Tue, Jun 30, 2020 at 09:19:31PM +0200, Marco Elver wrote: > > > > So, we are probably better off untangling LTO from the story: > > > > > > > > 1. LTO or no LTO does not matter. The LTO series should not get tangled > > > > up with memory model issues. > > > > > > > > 2. The memory model question and problems need to be answered and > > > > addressed separately. > > > > > > > > Thoughts? > > > > > > How hard would it be to creates something that analyzes a build and > > > looks for all 'dependent load -> control dependency' transformations > > > headed by a volatile (and/or from asm) load and issues a warning for > > > them? > > I was thinking about this, but in the context of the "auto-promote to > acquire" which you didn't like. Issuing a warning should certainly be > simpler. > > I think there is no one place where we know these transformations > happen, but rather, need to analyze the IR before transformations, > take note of all the dependent loads headed by volatile+asm, and then > run an analysis after optimizations checking the dependencies are > still there. > > > > This would give us an indication of how valuable this transformation is > > > for the kernel. 
I'm hoping/expecting it's vanishingly rare, but what do > > > I know. > > > > This could be quite useful! > > We might then even be able to say, "if you get this warning, turn on > CONFIG_ACQUIRE_READ_DEPENDENCIES" (or however the option will be > named). Or some other tricks, like automatically recompile the TU > where this happens with the option. But again, this is not something > that should specifically block LTO, because if we have this, we'll > need to turn it on for everything. I'm not especially keen on solving this with additional config options -- all it does it further fragment the number of kernels we have to care about and distributions really won't be in a position to know whether this should be enabled or not. I would prefer that the build fails, and we figure out which compiler switch we need to stop the harmful optimisation taking place. As Peter says, it _should_ be a rare thing to see (empirically, the kernel seems to be getting away with it so far). The problem, as I see it, is that the C language doesn't provide us with a way to express dependency ordering and so we're at the mercy of the compiler when we roll our own implementation. Paul continues to fight the good fight at committee meetings to improve the situation, but in the meantime we'd benefit from two things: 1. A way to disable any compiler optimisations that break our dependency ordering in spite of READ_ONCE() 2. A way to detect at build time if these harmful optimisations are taking place Finally, while I agree that this problem isn't limited to LTO, my fear is that LTO provides enough information for address dependencies headed by a READ_ONCE() to be converted to control dependencies when some common values of the pointer can be determined by the compiler. If we can rule this sort of thing out, then great, but in the absence of (2) I think throwing in an acquire is a sensible safety measure. 
Doesn't CFI rely on LTO to do something similar for indirect branch targets, or have I got that totally mixed up? Will
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-01 9:41 ` Marco Elver 2020-07-01 10:03 ` Will Deacon @ 2020-07-01 11:40 ` Peter Zijlstra 2020-07-01 14:06 ` Paul E. McKenney 1 sibling, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-07-01 11:40 UTC (permalink / raw) To: Marco Elver Cc: linux-arch, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Wed, Jul 01, 2020 at 11:41:17AM +0200, Marco Elver wrote: > On Tue, 30 Jun 2020 at 22:30, Paul E. McKenney <paulmck@kernel.org> wrote: > > On Tue, Jun 30, 2020 at 10:12:43PM +0200, Peter Zijlstra wrote: > > > On Tue, Jun 30, 2020 at 09:19:31PM +0200, Marco Elver wrote: > > > > Thoughts? > > > > > > How hard would it be to create something that analyzes a build and > > > looks for all 'dependent load -> control dependency' transformations > > > headed by a volatile (and/or from asm) load and issues a warning for > > > them? > > I was thinking about this, but in the context of the "auto-promote to > acquire" which you didn't like. Issuing a warning should certainly be > simpler. > > I think there is no one place where we know these transformations > happen, but rather, need to analyze the IR before transformations, > take note of all the dependent loads headed by volatile+asm, and then > run an analysis after optimizations checking the dependencies are > still there. Urgh, that sounds nasty. The thing is, as I've hinted at in my other reply, I would really like a compiler switch to disable this optimization entirely -- knowing how relevant the transformation is, is simply a first step towards that. In order to control the transformation, you have to actually know where in the optimization passes it happens.
Also, if (big if in my book) we find the optimization is actually beneficial, we can invert the warning when using the switch and warn about lost optimization possibilities and manually re-write the code to use control deps. > > > This would give us an indication of how valuable this transformation is > > > for the kernel. I'm hoping/expecting it's vanishingly rare, but what do > > > I know. > > > > This could be quite useful! > > We might then even be able to say, "if you get this warning, turn on > CONFIG_ACQUIRE_READ_DEPENDENCIES" (or however the option will be > named). I was going to suggest: if this happens, employ -fno-wreck-dependencies :-)
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-01 11:40 ` Peter Zijlstra @ 2020-07-01 14:06 ` Paul E. McKenney 2020-07-01 15:05 ` Peter Zijlstra 0 siblings, 1 reply; 212+ messages in thread From: Paul E. McKenney @ 2020-07-01 14:06 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Wed, Jul 01, 2020 at 01:40:27PM +0200, Peter Zijlstra wrote: > On Wed, Jul 01, 2020 at 11:41:17AM +0200, Marco Elver wrote: > > On Tue, 30 Jun 2020 at 22:30, Paul E. McKenney <paulmck@kernel.org> wrote: > > > On Tue, Jun 30, 2020 at 10:12:43PM +0200, Peter Zijlstra wrote: > > > > On Tue, Jun 30, 2020 at 09:19:31PM +0200, Marco Elver wrote: > > > > > > Thoughts? > > > > > > > > How hard would it be to creates something that analyzes a build and > > > > looks for all 'dependent load -> control dependency' transformations > > > > headed by a volatile (and/or from asm) load and issues a warning for > > > > them? > > > > I was thinking about this, but in the context of the "auto-promote to > > acquire" which you didn't like. Issuing a warning should certainly be > > simpler. > > > > I think there is no one place where we know these transformations > > happen, but rather, need to analyze the IR before transformations, > > take note of all the dependent loads headed by volatile+asm, and then > > run an analysis after optimizations checking the dependencies are > > still there. > > Urgh, that sounds nasty. The thing is, as I've hinted at in my other > reply, I would really like a compiler switch to disable this > optimization entirely -- knowing how relevant the trnaformation is, is > simply a first step towards that. > > In order to control the tranformation, you have to actually know where > in the optimization passes it happens. 
> > Also, if (big if in my book) we find the optimization is actually > beneficial, we can invert the warning when using the switch and warn > about lost optimization possibilities and manually re-write the code to > use control deps. There are lots of optimization passes and any of them might decide to destroy dependencies. :-( > > > > This would give us an indication of how valuable this transformation is > > > > for the kernel. I'm hoping/expecting it's vanishingly rare, but what do > > > > I know. > > > > > > This could be quite useful! > > > > We might then even be able to say, "if you get this warning, turn on > > CONFIG_ACQUIRE_READ_DEPENDENCIES" (or however the option will be > > named). > > I was going to suggest: if this happens, employ -fno-wreck-dependencies > :-) The current state in the C++ committee is that marking variables carrying dependencies is the way forward. This is of course not what the Linux kernel community does, but it should not be hard to have a -fall-variables-dependent or some such that causes all variables to be treated as if they were marked. Though I was hoping for only pointers. Are they -sure- that they -absolutely- need to carry dependencies through integers??? Anyway, the next step is to provide this functionality in one of the major compilers. Akshat Garg started this in GCC as a GSoC project by duplicating "volatile" functionality with a _Dependent_ptr keyword. Next steps would include removing "volatile" functionality not required for dependencies. Here is a random posting, which if I remember correctly raised some doubts as to whether "volatile" was really carried through everywhere that it needs to for things like LTO: https://gcc.gnu.org/legacy-ml/gcc/2019-07/msg00139.html What happened to this effort? Akshat graduated and got an unrelated job, you know, the usual. 
;-) Thanx, Paul
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-01 14:06 ` Paul E. McKenney @ 2020-07-01 15:05 ` Peter Zijlstra 2020-07-01 16:03 ` Paul E. McKenney 0 siblings, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-07-01 15:05 UTC (permalink / raw) To: Paul E. McKenney Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Wed, Jul 01, 2020 at 07:06:54AM -0700, Paul E. McKenney wrote: > The current state in the C++ committee is that marking variables > carrying dependencies is the way forward. This is of course not what > the Linux kernel community does, but it should not be hard to have a > -fall-variables-dependent or some such that causes all variables to be > treated as if they were marked. Though I was hoping for only pointers. > Are they -sure- that they -absolutely- need to carry dependencies > through integers??? What's 'need'? :-) I'm thinking __ktime_get_fast_ns() is better off with a dependent load than it is with an extra smp_rmb(). Yes we can stick an smp_rmb() in there, but I don't like it. Like I wrote earlier, if I wanted a control dependency, I'd have written one. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-01 15:05 ` Peter Zijlstra @ 2020-07-01 16:03 ` Paul E. McKenney 2020-07-02 8:20 ` Peter Zijlstra 0 siblings, 1 reply; 212+ messages in thread From: Paul E. McKenney @ 2020-07-01 16:03 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Wed, Jul 01, 2020 at 05:05:12PM +0200, Peter Zijlstra wrote: > On Wed, Jul 01, 2020 at 07:06:54AM -0700, Paul E. McKenney wrote: > > > The current state in the C++ committee is that marking variables > > carrying dependencies is the way forward. This is of course not what > > the Linux kernel community does, but it should not be hard to have a > > -fall-variables-dependent or some such that causes all variables to be > > treated as if they were marked. Though I was hoping for only pointers. > > Are they -sure- that they -absolutely- need to carry dependencies > > through integers??? > > What's 'need'? :-) Turning off all dependency-killing optimizations on all pointers is likely a non-event. Turning off all dependency-killing optimizations on all integers is not the road to happiness. So whatever "need" might be, it would need to be rather earthshaking. ;-) It is probably not -that- hard to convert to pointers, even if they are indexing multiple arrays. > I'm thinking __ktime_get_fast_ns() is better off with a dependent load > than it is with an extra smp_rmb(). > > Yes we can stick an smp_rmb() in there, but I don't like it. Like I > wrote earlier, if I wanted a control dependency, I'd have written one. No argument here. But it looks like we are going to have to tell the compiler. 
Thanx, Paul
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-01 16:03 ` Paul E. McKenney @ 2020-07-02 8:20 ` Peter Zijlstra 2020-07-02 17:59 ` Paul E. McKenney 0 siblings, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-07-02 8:20 UTC (permalink / raw) To: Paul E. McKenney Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Wed, Jul 01, 2020 at 09:03:38AM -0700, Paul E. McKenney wrote: > But it looks like we are going to have to tell the compiler. What does the current proposal look like? I can certainly annotate the seqcount latch users, but who knows what other code is out there....
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-02 8:20 ` Peter Zijlstra @ 2020-07-02 17:59 ` Paul E. McKenney 2020-07-03 13:13 ` Peter Zijlstra 0 siblings, 1 reply; 212+ messages in thread From: Paul E. McKenney @ 2020-07-02 17:59 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM

On Thu, Jul 02, 2020 at 10:20:40AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 01, 2020 at 09:03:38AM -0700, Paul E. McKenney wrote:
>
> > But it looks like we are going to have to tell the compiler.
>
> What does the current proposal look like? I can certainly annotate the
> seqcount latch users, but who knows what other code is out there....

For pointers, yes, within the Linux kernel it is hopeless, thus the
thought of a -fall-dependent-ptr or some such that makes the compiler
pretend that each and every pointer is marked with the _Dependent_ptr
qualifier.

New non-Linux-kernel code might want to use this qualifier explicitly,
perhaps something like the following:

	_Dependent_ptr struct foo *p; // Or maybe after the "*"?

	rcu_read_lock();
	p = rcu_dereference(gp);
	// And so on...

If a function is to take a dependent pointer as a function argument,
then the corresponding parameter needs the _Dependent_ptr marking.
Ditto for return values.

The proposal did not cover integers due to concerns about the number of
optimization passes that would need to be reviewed to make that work.
Nevertheless, using a marked integer would be safer than using an unmarked
one, and if the review can be carried out, why not? Maybe something
like this:

	_Dependent_ptr int idx;

	rcu_read_lock();
	idx = READ_ONCE(gidx);
	d = rcuarray[idx];
	rcu_read_unlock();
	do_something_with(d);

So use of this qualifier is quite reasonable.

The prototype for GCC is here: https://github.com/AKG001/gcc/ Thanx, Paul
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-02 17:59 ` Paul E. McKenney @ 2020-07-03 13:13 ` Peter Zijlstra 2020-07-03 13:25 ` Peter Zijlstra 2020-07-03 14:42 ` Paul E. McKenney 0 siblings, 2 replies; 212+ messages in thread From: Peter Zijlstra @ 2020-07-03 13:13 UTC (permalink / raw) To: Paul E. McKenney Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Thu, Jul 02, 2020 at 10:59:48AM -0700, Paul E. McKenney wrote: > On Thu, Jul 02, 2020 at 10:20:40AM +0200, Peter Zijlstra wrote: > > On Wed, Jul 01, 2020 at 09:03:38AM -0700, Paul E. McKenney wrote: > > > > > But it looks like we are going to have to tell the compiler. > > > > What does the current proposal look like? I can certainly annotate the > > seqcount latch users, but who knows what other code is out there.... > > For pointers, yes, within the Linux kernel it is hopeless, thus the > thought of a -fall-dependent-ptr or some such that makes the compiler > pretend that each and every pointer is marked with the _Dependent_ptr > qualifier. > > New non-Linux-kernel code might want to use his qualifier explicitly, > perhaps something like the following: > > _Dependent_ptr struct foo *p; // Or maybe after the "*"? After, as you've written it, it's a pointer to a '_Dependent struct foo'. > > rcu_read_lock(); > p = rcu_dereference(gp); > // And so on... > > If a function is to take a dependent pointer as a function argument, > then the corresponding parameter need the _Dependent_ptr marking. > Ditto for return values. > > The proposal did not cover integers due to concerns about the number of > optimization passes that would need to be reviewed to make that work. > Nevertheless, using a marked integer would be safer than using an unmarked > one, and if the review can be carried out, why not? 
Maybe something > like this: > > _Dependent_ptr int idx; > > rcu_read_lock(); > idx = READ_ONCE(gidx); > d = rcuarray[idx]; > rcu_read_unlock(); > do_something_with(d); > > So use of this qualifier is quite reasonable. The above usage might warrant a rename of the qualifier though, since clearly there isn't anything ptr around. > The prototype for GCC is here: https://github.com/AKG001/gcc/ Thanks! Those test cases are somewhat over-qualified though: static volatile _Atomic (TYPE) * _Dependent_ptr a; \ Also, if C goes and specifies load dependencies, in any form, is then not the corollary that they need to specify control dependencies? How else can they exclude the transformation? And of course, once we're there, can we get explicit support for control dependencies too? :-) :-)
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-03 13:13 ` Peter Zijlstra @ 2020-07-03 13:25 ` Peter Zijlstra 2020-07-03 14:51 ` Paul E. McKenney 2020-07-03 14:42 ` Paul E. McKenney 1 sibling, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-07-03 13:25 UTC (permalink / raw) To: Paul E. McKenney Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Fri, Jul 03, 2020 at 03:13:30PM +0200, Peter Zijlstra wrote: > > The prototype for GCC is here: https://github.com/AKG001/gcc/ > > Thanks! Those test cases are somewhat over qualified though: > > static volatile _Atomic (TYPE) * _Dependent_ptr a; \ One question though; since it's a qualifier, and we've recently spent a whole lot of effort to strip qualifiers in say READ_ONCE(), how does, and how do we want, this qualifier to behave? C++ has very convenient means of manipulating qualifiers, so it's not much of a problem there, but for C it is, as we've found, really quite cumbersome. Even with _Generic() we can't manipulate individual qualifiers afaict.
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-03 13:25 ` Peter Zijlstra @ 2020-07-03 14:51 ` Paul E. McKenney 0 siblings, 0 replies; 212+ messages in thread From: Paul E. McKenney @ 2020-07-03 14:51 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Fri, Jul 03, 2020 at 03:25:23PM +0200, Peter Zijlstra wrote: > On Fri, Jul 03, 2020 at 03:13:30PM +0200, Peter Zijlstra wrote: > > > The prototype for GCC is here: https://github.com/AKG001/gcc/ > > > > Thanks! Those test cases are somewhat over qualified though: > > > > static volatile _Atomic (TYPE) * _Dependent_ptr a; \ > > One question though; since its a qualifier, and we've recently spend a > whole lot of effort to strip qualifiers in say READ_ONCE(), how does, > and how do we want, this qualifier to behave. Dereferencing a _Dependent_ptr pointer gives you something that is not _Dependent_ptr, unless the declaration was like this: _Dependent_ptr _Atomic (TYPE) * _Dependent_ptr a; And if I recall correctly, the current state is that assigning a _Dependent_ptr variable to a non-_Dependent_ptr variable strips this marking (though the thought was to be able to ask for a warning). So, yes, it would be nice to be able to explicitly strip the _Dependent_ptr, perhaps the kill_dependency() macro, which is already in the C standard. > C++ has very convenient means of manipulating qualifiers, so it's not > much of a problem there, but for C it is, as we've found, really quite > cumbersome. Even with _Generic() we can't manipulate individual > qualifiers afaict. Fair point, and in C++ this is a templated class, at least in the same sense that std::atomic<> is a templated class. But in this case, would kill_dependency do what you want? 
Thanx, Paul
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-03 13:13 ` Peter Zijlstra 2020-07-03 13:25 ` Peter Zijlstra @ 2020-07-03 14:42 ` Paul E. McKenney 2020-07-06 16:26 ` Paul E. McKenney 1 sibling, 1 reply; 212+ messages in thread From: Paul E. McKenney @ 2020-07-03 14:42 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Fri, Jul 03, 2020 at 03:13:30PM +0200, Peter Zijlstra wrote: > On Thu, Jul 02, 2020 at 10:59:48AM -0700, Paul E. McKenney wrote: > > On Thu, Jul 02, 2020 at 10:20:40AM +0200, Peter Zijlstra wrote: > > > On Wed, Jul 01, 2020 at 09:03:38AM -0700, Paul E. McKenney wrote: > > > > > > > But it looks like we are going to have to tell the compiler. > > > > > > What does the current proposal look like? I can certainly annotate the > > > seqcount latch users, but who knows what other code is out there.... > > > > For pointers, yes, within the Linux kernel it is hopeless, thus the > > thought of a -fall-dependent-ptr or some such that makes the compiler > > pretend that each and every pointer is marked with the _Dependent_ptr > > qualifier. > > > > New non-Linux-kernel code might want to use his qualifier explicitly, > > perhaps something like the following: > > > > _Dependent_ptr struct foo *p; // Or maybe after the "*"? > > After, as you've written it, it's a pointer to a '_Dependent struct > foo'. Yeah, I have to look that up every time. :-/ Thank you for checking! > > rcu_read_lock(); > > p = rcu_dereference(gp); > > // And so on... > > > > If a function is to take a dependent pointer as a function argument, > > then the corresponding parameter need the _Dependent_ptr marking. > > Ditto for return values. 
> > > > The proposal did not cover integers due to concerns about the number of > > optimization passes that would need to be reviewed to make that work. > > Nevertheless, using a marked integer would be safer than using an unmarked > > one, and if the review can be carried out, why not? Maybe something > > like this: > > > > _Dependent_ptr int idx; > > > > rcu_read_lock(); > > idx = READ_ONCE(gidx); > > d = rcuarray[idx]; > > rcu_read_unlock(); > > do_something_with(d); > > > > So use of this qualifier is quite reasonable. > > The above usage might warrant a rename of the qualifier though, since > clearly there isn't anything ptr around. Given the large number of additional optimizations that need to be suppressed in the non-pointer case, any discouragement based on the "_ptr" at the end of the name is all to the good. And if that line of reasoning is unconvincing, please look at the program at the end of this email, which compiles without errors with -Wall and gives the expected output. ;-) > > The prototype for GCC is here: https://github.com/AKG001/gcc/ > > Thanks! Those test cases are somewhat over qualified though: > > static volatile _Atomic (TYPE) * _Dependent_ptr a; \ Especially given that in C, _Atomic operations are implicitly volatile. But this is likely a holdover from Akshat's implementation strategy, which was to pattern _Dependent_ptr after the volatile keyword. > Also, if C goes and specifies load dependencies, in any form, is then > not the corrolary that they need to specify control dependencies? How > else can they exclude the transformation. By requiring that any temporaries generated from variables that are marked _Dependent_ptr also be marked _Dependent_ptr. This is of course one divergence of _Dependent_ptr from the volatile keyword. > And of course, once we're there, can we get explicit support for control > dependencies too? :-) :-) Keep talking like this and I am going to make sure that you attend a standards committee meeting. 
If need be, by arranging for you to be physically dragged there. ;-)

More seriously, for control dependencies, the variable that would need to be marked would be the program counter, which might require some additional syntax.

Thanx, Paul

------------------------------------------------------------------------

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int foo(int *p, int i)
{
	return i[p];
}

int arr[] = { 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, };

int main(int argc, char *argv[])
{
	int i = atoi(argv[1]);

	printf("%d[arr] = %d\n", i, foo(arr, i));
	return 0;
}
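As an illustration of the address dependency discussed in this message, here is a minimal userspace sketch. It is not the _Dependent_ptr proposal itself, which mainstream compilers do not implement. It assumes C11 memory_order_consume as a rough stand-in for rcu_dereference() (compilers typically treat consume as acquire), and the names struct foo, gp, publish_foo, and read_foo_a are hypothetical, chosen only to echo the snippet quoted above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo {
	int a;
};

/* Hypothetical global pointer, playing the role of "gp" in the thread. */
static _Atomic(struct foo *) gp;

/* Writer: initialize the structure first, then publish the pointer. */
static void publish_foo(struct foo *p, int a)
{
	p->a = a;
	atomic_store_explicit(&gp, p, memory_order_release);
}

/*
 * Reader: the dereference p->a uses the value just loaded from gp, so the
 * second load carries an address dependency on the first one.
 */
static int read_foo_a(int fallback)
{
	struct foo *p = atomic_load_explicit(&gp, memory_order_consume);

	return p ? p->a : fallback;
}

/* Single-threaded sanity check of the publish/read pairing. */
static void self_check(void)
{
	static struct foo f;

	assert(read_foo_a(-1) == -1);	/* nothing published yet */
	publish_foo(&f, 42);
	assert(read_foo_a(-1) == 42);
}
```

The ordering in read_foo_a() holds only as long as the compiler keeps the second load dependent on the first, which is the property the qualifier discussed in this thread is meant to guarantee.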
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-03 14:42 ` Paul E. McKenney @ 2020-07-06 16:26 ` Paul E. McKenney 2020-07-06 18:29 ` Peter Zijlstra 0 siblings, 1 reply; 212+ messages in thread From: Paul E. McKenney @ 2020-07-06 16:26 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Fri, Jul 03, 2020 at 07:42:28AM -0700, Paul E. McKenney wrote: > On Fri, Jul 03, 2020 at 03:13:30PM +0200, Peter Zijlstra wrote: > > On Thu, Jul 02, 2020 at 10:59:48AM -0700, Paul E. McKenney wrote: > > > On Thu, Jul 02, 2020 at 10:20:40AM +0200, Peter Zijlstra wrote: > > > > On Wed, Jul 01, 2020 at 09:03:38AM -0700, Paul E. McKenney wrote: [ . . . ] > > Also, if C goes and specifies load dependencies, in any form, is then > > not the corrolary that they need to specify control dependencies? How > > else can they exclude the transformation. > > By requiring that any temporaries generated from variables that are > marked _Dependent_ptr also be marked _Dependent_ptr. This is of course > one divergence of _Dependent_ptr from the volatile keyword. > > > And of course, once we're there, can we get explicit support for control > > dependencies too? :-) :-) > > Keep talking like this and I am going to make sure that you attend a > standards committee meeting. If need be, by arranging for you to be > physically dragged there. ;-) > > More seriously, for control dependencies, the variable that would need > to be marked would be the program counter, which might require some > additional syntax. And perhaps more constructively, we do need to prioritize address and data dependencies over control dependencies. 
For one thing, there are a lot more address/data dependencies in existing code than there are control dependencies, and (sadly, perhaps more importantly) there are a lot more people who are convinced that address/data dependencies are important. For another (admittedly more theoretical) thing, the OOTA scenarios stemming from control dependencies are a lot less annoying than those from address/data dependencies. And address/data dependencies are as far as I know vulnerable to things like conditional-move instructions that can cause problems for control dependencies. Nevertheless, yes, control dependencies also need attention. Thanx, Paul _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-06 16:26 ` Paul E. McKenney @ 2020-07-06 18:29 ` Peter Zijlstra 2020-07-06 18:39 ` Paul E. McKenney 0 siblings, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-07-06 18:29 UTC (permalink / raw) To: Paul E. McKenney Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Mon, Jul 06, 2020 at 09:26:33AM -0700, Paul E. McKenney wrote: > And perhaps more constructively, we do need to prioritize address and data > dependencies over control dependencies. For one thing, there are a lot > more address/data dependencies in existing code than there are control > dependencies, and (sadly, perhaps more importantly) there are a lot more > people who are convinced that address/data dependencies are important. If they do not consider their Linux OS running correctly :-) > For another (admittedly more theoretical) thing, the OOTA scenarios > stemming from control dependencies are a lot less annoying than those > from address/data dependencies. > > And address/data dependencies are as far as I know vulnerable to things > like conditional-move instructions that can cause problems for control > dependencies. > > Nevertheless, yes, control dependencies also need attention. Today I added one more \o/ _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-06 18:29 ` Peter Zijlstra @ 2020-07-06 18:39 ` Paul E. McKenney 2020-07-06 19:40 ` Peter Zijlstra 0 siblings, 1 reply; 212+ messages in thread From: Paul E. McKenney @ 2020-07-06 18:39 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Mon, Jul 06, 2020 at 08:29:26PM +0200, Peter Zijlstra wrote: > On Mon, Jul 06, 2020 at 09:26:33AM -0700, Paul E. McKenney wrote: > > > And perhaps more constructively, we do need to prioritize address and data > > dependencies over control dependencies. For one thing, there are a lot > > more address/data dependencies in existing code than there are control > > dependencies, and (sadly, perhaps more importantly) there are a lot more > > people who are convinced that address/data dependencies are important. > > If they do not consider their Linux OS running correctly :-) Many of them really do not care at all. In fact, some would consider Linux failing to run as an added bonus. > > For another (admittedly more theoretical) thing, the OOTA scenarios > > stemming from control dependencies are a lot less annoying than those > > from address/data dependencies. > > > > And address/data dependencies are as far as I know vulnerable to things > > like conditional-move instructions that can cause problems for control > > dependencies. > > > > Nevertheless, yes, control dependencies also need attention. > > Today I added one more \o/ Just make sure you continually check to make sure that compilers don't break it, along with the others you have added. 
;-) Thanx, Paul
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-06 18:39 ` Paul E. McKenney @ 2020-07-06 19:40 ` Peter Zijlstra 2020-07-06 23:41 ` Paul E. McKenney 0 siblings, 1 reply; 212+ messages in thread From: Peter Zijlstra @ 2020-07-06 19:40 UTC (permalink / raw) To: Paul E. McKenney Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Mon, Jul 06, 2020 at 11:39:33AM -0700, Paul E. McKenney wrote: > On Mon, Jul 06, 2020 at 08:29:26PM +0200, Peter Zijlstra wrote: > > On Mon, Jul 06, 2020 at 09:26:33AM -0700, Paul E. McKenney wrote: > > If they do not consider their Linux OS running correctly :-) > > Many of them really do not care at all. In fact, some would consider > Linux failing to run as an added bonus. This I think is why we have compiler people in the thread that care a lot more. > > > Nevertheless, yes, control dependencies also need attention. > > > > Today I added one more \o/ > > Just make sure you continually check to make sure that compilers > don't break it, along with the others you have added. 
;-)

There's:

kernel/locking/mcs_spinlock.h: smp_cond_load_acquire(l, VAL); \
kernel/sched/core.c: smp_cond_load_acquire(&p->on_cpu, !VAL);
kernel/smp.c: smp_cond_load_acquire(&csd->node.u_flags, !(VAL & CSD_FLAG_LOCK));

arch/x86/kernel/alternative.c: atomic_cond_read_acquire(&desc.refs, !VAL);
kernel/locking/qrwlock.c: atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
kernel/locking/qrwlock.c: atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
kernel/locking/qrwlock.c: atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
kernel/locking/qspinlock.c: atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_MASK));
kernel/locking/qspinlock.c: val = atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK));

include/linux/refcount.h: smp_acquire__after_ctrl_dep();
ipc/mqueue.c: smp_acquire__after_ctrl_dep();
ipc/msg.c: smp_acquire__after_ctrl_dep();
ipc/sem.c: smp_acquire__after_ctrl_dep();
kernel/locking/rwsem.c: smp_acquire__after_ctrl_dep();
kernel/sched/core.c: smp_acquire__after_ctrl_dep();

kernel/events/ring_buffer.c:__perf_output_begin()

And I'm fairly sure I'm forgetting some... One could argue there's too many of them to check already.

Both GCC and CLANG had better think about it.
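Most entries in the list above follow one pattern: spin on relaxed loads until a condition holds, then upgrade the resulting control dependency to acquire ordering. A rough userspace sketch of that pattern follows, assuming C11 atomics as a stand-in for the kernel primitives. The names cond_load_acquire_nonzero, payload, ready, producer, and consumer are hypothetical and do not appear in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Hypothetical analogue of smp_cond_load_acquire(ptr, VAL): spin with
 * relaxed loads until the value is non-zero. The loop-exit branch depends
 * on the loaded value, which is the control dependency.
 */
static int cond_load_acquire_nonzero(atomic_int *flag)
{
	int val;

	while ((val = atomic_load_explicit(flag, memory_order_relaxed)) == 0)
		;	/* busy-wait */

	/* Analogue of smp_acquire__after_ctrl_dep(): order later loads too. */
	atomic_thread_fence(memory_order_acquire);
	return val;
}

static int payload;		/* plain data guarded by "ready" */
static atomic_int ready;	/* zero until the payload is valid */

/* Producer: write the payload, then set the flag with release semantics. */
static void producer(int v)
{
	payload = v;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Consumer: wait for the flag, after which reading payload is ordered. */
static int consumer(void)
{
	cond_load_acquire_nonzero(&ready);
	return payload;
}

/* Single-threaded sanity check; the flag is set before the wait begins. */
static void self_check(void)
{
	producer(5);
	assert(consumer() == 5);
}
```

Without the fence, only stores after the branch would be ordered by the control dependency, and only if the compiler preserves the branch, which is the concern raised in this message.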
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-06 19:40 ` Peter Zijlstra @ 2020-07-06 23:41 ` Paul E. McKenney 0 siblings, 0 replies; 212+ messages in thread From: Paul E. McKenney @ 2020-07-06 23:41 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-arch, Marco Elver, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, Linux ARM On Mon, Jul 06, 2020 at 09:40:12PM +0200, Peter Zijlstra wrote: > On Mon, Jul 06, 2020 at 11:39:33AM -0700, Paul E. McKenney wrote: > > On Mon, Jul 06, 2020 at 08:29:26PM +0200, Peter Zijlstra wrote: > > > On Mon, Jul 06, 2020 at 09:26:33AM -0700, Paul E. McKenney wrote: > > > > If they do not consider their Linux OS running correctly :-) > > > > Many of them really do not care at all. In fact, some would consider > > Linux failing to run as an added bonus. > > This I think is why we have compiler people in the thread that care a > lot more. Here is hoping! ;-) > > > > Nevertheless, yes, control dependencies also need attention. > > > > > > Today I added one more \o/ > > > > Just make sure you continually check to make sure that compilers > > don't break it, along with the others you have added. 
;-)
>
> There's:
>
> kernel/locking/mcs_spinlock.h: smp_cond_load_acquire(l, VAL); \
> kernel/sched/core.c: smp_cond_load_acquire(&p->on_cpu, !VAL);
> kernel/smp.c: smp_cond_load_acquire(&csd->node.u_flags, !(VAL & CSD_FLAG_LOCK));
>
> arch/x86/kernel/alternative.c: atomic_cond_read_acquire(&desc.refs, !VAL);
> kernel/locking/qrwlock.c: atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
> kernel/locking/qrwlock.c: atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
> kernel/locking/qrwlock.c: atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> kernel/locking/qspinlock.c: atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_MASK));
> kernel/locking/qspinlock.c: val = atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK));
>
> include/linux/refcount.h: smp_acquire__after_ctrl_dep();
> ipc/mqueue.c: smp_acquire__after_ctrl_dep();
> ipc/msg.c: smp_acquire__after_ctrl_dep();
> ipc/sem.c: smp_acquire__after_ctrl_dep();
> kernel/locking/rwsem.c: smp_acquire__after_ctrl_dep();
> kernel/sched/core.c: smp_acquire__after_ctrl_dep();
>
> kernel/events/ring_buffer.c:__perf_output_begin()
>
> And I'm fairly sure I'm forgetting some... One could argue there's too
> many of them to check already.
>
> Both GCC and CLANG had better think about it.

That would be good! I won't list the number of address/data dependencies given that there are well over a thousand of them.

Thanx, Paul
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen ` (22 preceding siblings ...) 2020-06-24 21:15 ` [PATCH 00/22] add support for Clang LTO Peter Zijlstra @ 2020-06-28 16:56 ` Masahiro Yamada 2020-06-29 23:20 ` Sami Tolvanen 2020-07-11 16:32 ` Paul Menzel 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen 25 siblings, 1 reply; 212+ messages in thread From: Masahiro Yamada @ 2020-06-28 16:56 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Jun 25, 2020 at 5:32 AM 'Sami Tolvanen' via Clang Built Linux <clang-built-linux@googlegroups.com> wrote: > > This patch series adds support for building x86_64 and arm64 kernels > with Clang's Link Time Optimization (LTO). > > In addition to performance, the primary motivation for LTO is to allow > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's > Pixel devices have shipped with LTO+CFI kernels since 2018. > > Most of the patches are build system changes for handling LLVM bitcode, > which Clang produces with LTO instead of ELF object files, postponing > ELF processing until a later stage, and ensuring initcall ordering. > > Note that first objtool patch in the series is already in linux-next, > but as it's needed with LTO, I'm including it also here to make testing > easier. I put this series on a testing branch, and 0-day bot started reporting some issues. (but 0-day bot is quieter than I expected. Perhaps, 0-day bot does not turn on LLVM=1 ?) I also got an error for ARCH=arm64 allyesconfig + CONFIG_LTO_CLANG=y $ make ARCH=arm64 LLVM=1 LLVM_IAS=1 CROSS_COMPILE=~/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu- -j24 ... 
GEN .version CHK include/generated/compile.h UPD include/generated/compile.h CC init/version.o AR init/built-in.a GEN .tmp_initcalls.lds GEN .tmp_symversions.lds LTO vmlinux.o MODPOST vmlinux.symvers MODINFO modules.builtin.modinfo GEN modules.builtin LD .tmp_vmlinux.kallsyms1 ld.lld: error: undefined symbol: __compiletime_assert_905 >>> referenced by irqbypass.c >>> vmlinux.o:(jeq_imm) make: *** [Makefile:1161: vmlinux] Error 1 > Sami Tolvanen (22): > objtool: use sh_info to find the base for .rela sections > kbuild: add support for Clang LTO > kbuild: lto: fix module versioning > kbuild: lto: fix recordmcount > kbuild: lto: postpone objtool > kbuild: lto: limit inlining > kbuild: lto: merge module sections > kbuild: lto: remove duplicate dependencies from .mod files > init: lto: ensure initcall ordering > init: lto: fix PREL32 relocations > pci: lto: fix PREL32 relocations > modpost: lto: strip .lto from module names > scripts/mod: disable LTO for empty.c > efi/libstub: disable LTO > drivers/misc/lkdtm: disable LTO for rodata.o > arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY > arm64: vdso: disable LTO > arm64: allow LTO_CLANG and THINLTO to be selected > x86, vdso: disable LTO only for vDSO > x86, ftrace: disable recordmcount for ftrace_make_nop > x86, relocs: Ignore L4_PAGE_OFFSET relocations > x86, build: allow LTO_CLANG and THINLTO to be selected > > .gitignore | 1 + > Makefile | 27 ++- > arch/Kconfig | 65 +++++++ > arch/arm64/Kconfig | 2 + > arch/arm64/Makefile | 1 + > arch/arm64/kernel/vdso/Makefile | 4 +- > arch/x86/Kconfig | 2 + > arch/x86/Makefile | 5 + > arch/x86/entry/vdso/Makefile | 5 +- > arch/x86/kernel/ftrace.c | 1 + > arch/x86/tools/relocs.c | 1 + > drivers/firmware/efi/libstub/Makefile | 2 + > drivers/misc/lkdtm/Makefile | 1 + > include/asm-generic/vmlinux.lds.h | 12 +- > include/linux/compiler-clang.h | 4 + > include/linux/compiler.h | 2 +- > include/linux/compiler_types.h | 4 + > include/linux/init.h | 78 +++++++- > include/linux/pci.h | 15 +- 
> kernel/trace/ftrace.c | 1 + > lib/Kconfig.debug | 2 +- > scripts/Makefile.build | 55 +++++- > scripts/Makefile.lib | 6 +- > scripts/Makefile.modfinal | 40 +++- > scripts/Makefile.modpost | 26 ++- > scripts/generate_initcall_order.pl | 270 ++++++++++++++++++++++++++ > scripts/link-vmlinux.sh | 100 +++++++++- > scripts/mod/Makefile | 1 + > scripts/mod/modpost.c | 16 +- > scripts/mod/modpost.h | 9 + > scripts/mod/sumversion.c | 6 +- > scripts/module-lto.lds | 26 +++ > scripts/recordmcount.c | 3 +- > tools/objtool/elf.c | 2 +- > 34 files changed, 737 insertions(+), 58 deletions(-) > create mode 100755 scripts/generate_initcall_order.pl > create mode 100644 scripts/module-lto.lds > > > base-commit: 26e122e97a3d0390ebec389347f64f3730fdf48f > -- > 2.27.0.212.ge8ba1cc988-goog > > -- > You received this message because you are subscribed to the Google Groups "Clang Built Linux" group. > To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20200624203200.78870-1-samitolvanen%40google.com. -- Best Regards Masahiro Yamada _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-28 16:56 ` Masahiro Yamada @ 2020-06-29 23:20 ` Sami Tolvanen 2020-07-07 15:51 ` Sami Tolvanen 0 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-06-29 23:20 UTC (permalink / raw) To: Masahiro Yamada Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel Hi Masahiro, On Mon, Jun 29, 2020 at 01:56:19AM +0900, Masahiro Yamada wrote: > On Thu, Jun 25, 2020 at 5:32 AM 'Sami Tolvanen' via Clang Built Linux > <clang-built-linux@googlegroups.com> wrote: > > > > This patch series adds support for building x86_64 and arm64 kernels > > with Clang's Link Time Optimization (LTO). > > > > In addition to performance, the primary motivation for LTO is to allow > > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's > > Pixel devices have shipped with LTO+CFI kernels since 2018. > > > > Most of the patches are build system changes for handling LLVM bitcode, > > which Clang produces with LTO instead of ELF object files, postponing > > ELF processing until a later stage, and ensuring initcall ordering. > > > > Note that first objtool patch in the series is already in linux-next, > > but as it's needed with LTO, I'm including it also here to make testing > > easier. > > > I put this series on a testing branch, > and 0-day bot started reporting some issues. Yes, I'll fix those issues in v2. > (but 0-day bot is quieter than I expected. > Perhaps, 0-day bot does not turn on LLVM=1 ?) In order for it to test an LTO build, it would need to enable LTO_CLANG explicitly though, in addition to LLVM=1. > I also got an error for > ARCH=arm64 allyesconfig + CONFIG_LTO_CLANG=y > > > > $ make ARCH=arm64 LLVM=1 LLVM_IAS=1 > CROSS_COMPILE=~/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu- > -j24 > > ... 
> > GEN .version > CHK include/generated/compile.h > UPD include/generated/compile.h > CC init/version.o > AR init/built-in.a > GEN .tmp_initcalls.lds > GEN .tmp_symversions.lds > LTO vmlinux.o > MODPOST vmlinux.symvers > MODINFO modules.builtin.modinfo > GEN modules.builtin > LD .tmp_vmlinux.kallsyms1 > ld.lld: error: undefined symbol: __compiletime_assert_905 > >>> referenced by irqbypass.c > >>> vmlinux.o:(jeq_imm) > make: *** [Makefile:1161: vmlinux] Error 1 I can reproduce this with ToT LLVM and it's BUILD_BUG_ON_MSG(..., "value too large for the field") in drivers/net/ethernet/netronome/nfp/bpf/jit.c. Specifically, the FIELD_FIT / __BF_FIELD_CHECK macro in ur_load_imm_any. This compiles just fine with an earlier LLVM revision, so it could be a relatively recent regression. I'll take a look. Thanks for catching this! Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-29 23:20 ` Sami Tolvanen @ 2020-07-07 15:51 ` Sami Tolvanen 2020-07-07 16:05 ` Sami Tolvanen 0 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-07-07 15:51 UTC (permalink / raw) To: Masahiro Yamada, Jiong Wang, Jakub Kicinski Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Mon, Jun 29, 2020 at 04:20:59PM -0700, Sami Tolvanen wrote: > Hi Masahiro, > > On Mon, Jun 29, 2020 at 01:56:19AM +0900, Masahiro Yamada wrote: > > I also got an error for > > ARCH=arm64 allyesconfig + CONFIG_LTO_CLANG=y > > > > > > > > $ make ARCH=arm64 LLVM=1 LLVM_IAS=1 > > CROSS_COMPILE=~/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu- > > -j24 > > > > ... > > > > GEN .version > > CHK include/generated/compile.h > > UPD include/generated/compile.h > > CC init/version.o > > AR init/built-in.a > > GEN .tmp_initcalls.lds > > GEN .tmp_symversions.lds > > LTO vmlinux.o > > MODPOST vmlinux.symvers > > MODINFO modules.builtin.modinfo > > GEN modules.builtin > > LD .tmp_vmlinux.kallsyms1 > > ld.lld: error: undefined symbol: __compiletime_assert_905 > > >>> referenced by irqbypass.c > > >>> vmlinux.o:(jeq_imm) > > make: *** [Makefile:1161: vmlinux] Error 1 > > I can reproduce this with ToT LLVM and it's BUILD_BUG_ON_MSG(..., "value > too large for the field") in drivers/net/ethernet/netronome/nfp/bpf/jit.c. > Specifically, the FIELD_FIT / __BF_FIELD_CHECK macro in ur_load_imm_any. > > This compiles just fine with an earlier LLVM revision, so it could be a > relatively recent regression. I'll take a look. Thanks for catching this! 
After spending some time debugging this with Nick, it looks like the error is caused by a recent optimization change in LLVM, which together with the inlining of ur_load_imm_any into jeq_imm, changes a runtime check in FIELD_FIT that would always fail, to a compile-time check that breaks the build. In jeq_imm, we have:

	/* struct bpf_insn: _s32 imm */
	u64 imm = insn->imm; /* sign extend */
	...
	if (imm >> 32) { /* non-zero only if insn->imm is negative */
		/* inlined from ur_load_imm_any */
		u32 __imm = imm >> 32; /* therefore, always 0xffffffff */

		/*
		 * __imm has a value known at compile-time, which means
		 * __builtin_constant_p(__imm) is true and we end up with
		 * essentially this in __BF_FIELD_CHECK:
		 */
		if (__builtin_constant_p(__imm) && __imm <= 255)
			__compiletime_assert_N();

The compile-time check comes from the following BUILD_BUG_ON_MSG:

	#define __BF_FIELD_CHECK(_mask, _reg, _val, _pfx) \
	...
		BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ? \
				 ~((_mask) >> __bf_shf(_mask)) & (_val) : 0, \
				 _pfx "value too large for the field"); \

While we could stop the compiler from performing this optimization by telling it to never inline ur_load_imm_any, we feel like a better fix might be to replace FIELD_FIT(UR_REG_IMM_MAX, imm) with a simple imm <= UR_REG_IMM_MAX check that won't trip a compile-time assertion even when the condition is known to fail.

Jiong, Jakub, do you see any issues here?

Sami
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-07 15:51 ` Sami Tolvanen @ 2020-07-07 16:05 ` Sami Tolvanen 2020-07-07 16:56 ` Jakub Kicinski 0 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-07-07 16:05 UTC (permalink / raw) To: Masahiro Yamada, Jakub Kicinski Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Tue, Jul 07, 2020 at 08:51:07AM -0700, Sami Tolvanen wrote: > After spending some time debugging this with Nick, it looks like the > error is caused by a recent optimization change in LLVM, which together > with the inlining of ur_load_imm_any into jeq_imm, changes a runtime > check in FIELD_FIT that would always fail, to a compile-time check that > breaks the build. In jeq_imm, we have: > > /* struct bpf_insn: _s32 imm */ > u64 imm = insn->imm; /* sign extend */ > ... > if (imm >> 32) { /* non-zero only if insn->imm is negative */ > /* inlined from ur_load_imm_any */ > u32 __imm = imm >> 32; /* therefore, always 0xffffffff */ > > /* > * __imm has a value known at compile-time, which means > * __builtin_constant_p(__imm) is true and we end up with > * essentially this in __BF_FIELD_CHECK: > */ > if (__builtin_constant_p(__imm) && __imm <= 255) Should be __imm > 255, of course, which means the compiler will generate a call to __compiletime_assert. > Jiong, Jakub, do you see any issues here? (Jiong's email bounced, so removing from the recipient list.) Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
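The value-range reasoning in the analysis above can be checked in isolation. The sketch below assumes only standard C integer conversions. The helper names upper_half_after_sign_extend and exceeds_field, and the constant FIELD_MAX (set to the 255 quoted in the thread), are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit, matching the 255 used in the quoted condition. */
#define FIELD_MAX 255u

/*
 * Mirrors "u64 imm = insn->imm" followed by "imm >> 32": a signed 32-bit
 * immediate is sign-extended to 64 bits, then the upper half is extracted.
 */
static uint32_t upper_half_after_sign_extend(int32_t insn_imm)
{
	uint64_t imm = (uint64_t)(int64_t)insn_imm;

	return (uint32_t)(imm >> 32);
}

/* The corrected condition from the follow-up: the value does not fit. */
static bool exceeds_field(uint32_t val)
{
	return val > FIELD_MAX;
}

/* The upper half is 0 for non-negative inputs and all ones otherwise. */
static void self_check(void)
{
	assert(upper_half_after_sign_extend(0) == 0);
	assert(upper_half_after_sign_extend(INT32_MAX) == 0);
	assert(upper_half_after_sign_extend(-1) == 0xffffffffu);
	assert(upper_half_after_sign_extend(INT32_MIN) == 0xffffffffu);
	assert(exceeds_field(upper_half_after_sign_extend(-1)));
}
```

Because the upper half can only be 0 or 0xffffffff, a compiler that has already established it is non-zero can treat it as the constant 0xffffffff, which is what lets __builtin_constant_p() evaluate to true in the inlined code.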
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-07 16:05 ` Sami Tolvanen @ 2020-07-07 16:56 ` Jakub Kicinski 2020-07-07 17:17 ` Nick Desaulniers 0 siblings, 1 reply; 212+ messages in thread From: Jakub Kicinski @ 2020-07-07 16:56 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Tue, 7 Jul 2020 09:05:28 -0700 Sami Tolvanen wrote: > On Tue, Jul 07, 2020 at 08:51:07AM -0700, Sami Tolvanen wrote: > > After spending some time debugging this with Nick, it looks like the > > error is caused by a recent optimization change in LLVM, which together > > with the inlining of ur_load_imm_any into jeq_imm, changes a runtime > > check in FIELD_FIT that would always fail, to a compile-time check that > > breaks the build. In jeq_imm, we have: > > > > /* struct bpf_insn: _s32 imm */ > > u64 imm = insn->imm; /* sign extend */ > > ... > > if (imm >> 32) { /* non-zero only if insn->imm is negative */ > > /* inlined from ur_load_imm_any */ > > u32 __imm = imm >> 32; /* therefore, always 0xffffffff */ > > > > /* > > * __imm has a value known at compile-time, which means > > * __builtin_constant_p(__imm) is true and we end up with > > * essentially this in __BF_FIELD_CHECK: > > */ > > if (__builtin_constant_p(__imm) && __imm <= 255) > > Should be __imm > 255, of course, which means the compiler will generate > a call to __compiletime_assert. I think FIELD_FIT() should not pass the value into __BF_FIELD_CHECK(). 
So:

diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h
index 48ea093ff04c..4e035aca6f7e 100644
--- a/include/linux/bitfield.h
+++ b/include/linux/bitfield.h
@@ -77,7 +77,7 @@
  */
 #define FIELD_FIT(_mask, _val)						\
 	({								\
-		__BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_FIT: ");	\
+		__BF_FIELD_CHECK(_mask, 0ULL, 0ULL, "FIELD_FIT: ");	\
 		!((((typeof(_mask))_val) << __bf_shf(_mask)) & ~(_mask)); \
 	})

It's perfectly legal to pass a constant which does not fit, in which case FIELD_FIT() should just return false not break the build.

Right?
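The behavior this change aims for, a runtime answer of true or false with no compile-time assertion on the value, can be sketched in plain userspace C. This is not the kernel macro: field_fit and mask_shift are hypothetical function names, with mask_shift standing in for __bf_shf() via a portable loop, and the mask is assumed to be a non-zero contiguous bit range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Position of the lowest set bit in mask (stand-in for __bf_shf()). */
static unsigned int mask_shift(uint64_t mask)
{
	unsigned int shift = 0;

	while (mask && !(mask & 1)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/*
 * Returns true if val, once shifted into position, sets no bits outside
 * mask. An oversized value, constant or not, simply yields false.
 */
static bool field_fit(uint64_t mask, uint64_t val)
{
	return !((val << mask_shift(mask)) & ~mask);
}

/* Both fitting and non-fitting values produce a plain boolean result. */
static void self_check(void)
{
	assert(field_fit(0xff, 0x12));
	assert(!field_fit(0xff, 0xffffffffu));	/* the jeq_imm case */
	assert(field_fit(0xff00, 0x12));
	assert(!field_fit(0xff00, 0x100));
}
```

With this shape, the jeq_imm case from earlier in the thread (an 8-bit field and the value 0xffffffff) returns false at runtime instead of failing the build.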
* Re: [PATCH 00/22] add support for Clang LTO
  2020-07-07 16:56 ` Jakub Kicinski
@ 2020-07-07 17:17   ` Nick Desaulniers
  2020-07-07 17:30     ` Jakub Kicinski
  0 siblings, 1 reply; 212+ messages in thread
From: Nick Desaulniers @ 2020-07-07 17:17 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney,
	Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada,
	Linux Kbuild mailing list, Linux Kernel Mailing List,
	clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon,
	linux-arm-kernel

On Tue, Jul 7, 2020 at 9:56 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> > On Tue, Jul 07, 2020 at 08:51:07AM -0700, Sami Tolvanen wrote:
> > > After spending some time debugging this with Nick, it looks like the
> > > error is caused by a recent optimization change in LLVM, which together
> > > with the inlining of ur_load_imm_any into jeq_imm, changes a runtime
> > > check in FIELD_FIT that would always fail, to a compile-time check that
> > > breaks the build. In jeq_imm, we have:
> > >
> > >     /* struct bpf_insn: _s32 imm */
> > >     u64 imm = insn->imm; /* sign extend */
> > >     ...
> > >     if (imm >> 32) { /* non-zero only if insn->imm is negative */
> > >         /* inlined from ur_load_imm_any */
> > >         u32 __imm = imm >> 32; /* therefore, always 0xffffffff */
> > >
> > >         /*
> > >          * __imm has a value known at compile-time, which means
> > >          * __builtin_constant_p(__imm) is true and we end up with
> > >          * essentially this in __BF_FIELD_CHECK:
> > >          */
> > >         if (__builtin_constant_p(__imm) && __imm > 255)
>
> I think FIELD_FIT() should not pass the value into __BF_FIELD_CHECK().
>
> So:
>
> diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h
> index 48ea093ff04c..4e035aca6f7e 100644
> --- a/include/linux/bitfield.h
> +++ b/include/linux/bitfield.h
> @@ -77,7 +77,7 @@
>   */
>  #define FIELD_FIT(_mask, _val)						\
>  	({								\
> -		__BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_FIT: ");	\
> +		__BF_FIELD_CHECK(_mask, 0ULL, 0ULL, "FIELD_FIT: ");	\
>  		!((((typeof(_mask))_val) << __bf_shf(_mask)) & ~(_mask)); \
>  	})
>
> It's perfectly legal to pass a constant which does not fit, in which
> case FIELD_FIT() should just return false not break the build.
>
> Right?

I see the value of the __builtin_constant_p check; this is just a very
interesting case where rather than an integer literal appearing in the
source, the compiler is able to deduce that the parameter can only
have one value in one case, and allows __builtin_constant_p to
evaluate to true for it.

I had definitely asked Sami about the comment above FIELD_FIT:
"""
76  * Return: true if @_val can fit inside @_mask, false if @_val is
too big.
"""
in which FIELD_FIT doesn't return false if @_val is too big and a
compile time constant. (Rather it breaks the build).

Of the 14 expansion sites of FIELD_FIT I see in mainline, it doesn't
look like any integral literals are passed, so maybe the compile time
checks of _val are of little value for FIELD_FIT.

So I think your suggested diff is the most concise fix.
--
Thanks,
~Nick Desaulniers
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-07 17:17 ` Nick Desaulniers @ 2020-07-07 17:30 ` Jakub Kicinski 0 siblings, 0 replies; 212+ messages in thread From: Jakub Kicinski @ 2020-07-07 17:30 UTC (permalink / raw) To: Nick Desaulniers Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, Linux Kernel Mailing List, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, linux-arm-kernel On Tue, 7 Jul 2020 10:17:25 -0700 Nick Desaulniers wrote: > On Tue, Jul 7, 2020 at 9:56 AM Jakub Kicinski <kuba@kernel.org> wrote: > > > > > On Tue, Jul 07, 2020 at 08:51:07AM -0700, Sami Tolvanen wrote: > > > > After spending some time debugging this with Nick, it looks like the > > > > error is caused by a recent optimization change in LLVM, which together > > > > with the inlining of ur_load_imm_any into jeq_imm, changes a runtime > > > > check in FIELD_FIT that would always fail, to a compile-time check that > > > > breaks the build. In jeq_imm, we have: > > > > > > > > /* struct bpf_insn: _s32 imm */ > > > > u64 imm = insn->imm; /* sign extend */ > > > > ... > > > > if (imm >> 32) { /* non-zero only if insn->imm is negative */ > > > > /* inlined from ur_load_imm_any */ > > > > u32 __imm = imm >> 32; /* therefore, always 0xffffffff */ > > > > > > > > /* > > > > * __imm has a value known at compile-time, which means > > > > * __builtin_constant_p(__imm) is true and we end up with > > > > * essentially this in __BF_FIELD_CHECK: > > > > */ > > > > if (__builtin_constant_p(__imm) && __imm > 255) > > > > I think FIELD_FIT() should not pass the value into __BF_FIELD_CHECK(). 
> > > > So: > > > > diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h > > index 48ea093ff04c..4e035aca6f7e 100644 > > --- a/include/linux/bitfield.h > > +++ b/include/linux/bitfield.h > > @@ -77,7 +77,7 @@ > > */ > > #define FIELD_FIT(_mask, _val) \ > > ({ \ > > - __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_FIT: "); \ > > + __BF_FIELD_CHECK(_mask, 0ULL, 0ULL, "FIELD_FIT: "); \ > > !((((typeof(_mask))_val) << __bf_shf(_mask)) & ~(_mask)); \ > > }) > > > > It's perfectly legal to pass a constant which does not fit, in which > > case FIELD_FIT() should just return false not break the build. > > > > Right? > > I see the value of the __builtin_constant_p check; this is just a very > interesting case where rather than an integer literal appearing in the > source, the compiler is able to deduce that the parameter can only > have one value in one case, and allows __builtin_constant_p to > evaluate to true for it. > > I had definitely asked Sami about the comment above FIELD_FIT: > """ > 76 * Return: true if @_val can fit inside @_mask, false if @_val is > too big. > """ > in which FIELD_FIT doesn't return false if @_val is too big and a > compile time constant. (Rather it breaks the build). > > Of the 14 expansion sites of FIELD_FIT I see in mainline, it doesn't > look like any integral literals are passed, so maybe the compile time > checks of _val are of little value for FIELD_FIT. Also I just double checked and all FIELD_FIT() uses check the return value. > So I think your suggested diff is the most concise fix. Feel free to submit that officially as a patch if it fixes the build for you, here's my sign-off: Signed-off-by: Jakub Kicinski <kuba@kernel.org> _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 00/22] add support for Clang LTO 2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen ` (23 preceding siblings ...) 2020-06-28 16:56 ` Masahiro Yamada @ 2020-07-11 16:32 ` Paul Menzel 2020-07-12 8:59 ` Sedat Dilek 2020-07-12 23:34 ` Sami Tolvanen 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen 25 siblings, 2 replies; 212+ messages in thread From: Paul Menzel @ 2020-07-11 16:32 UTC (permalink / raw) To: Sami Tolvanen, Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, clang-built-linux, linux-pci, linux-arm-kernel Dear Sami, Am 24.06.20 um 22:31 schrieb Sami Tolvanen: > This patch series adds support for building x86_64 and arm64 kernels > with Clang's Link Time Optimization (LTO). > > In addition to performance, the primary motivation for LTO is to allow > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's > Pixel devices have shipped with LTO+CFI kernels since 2018. > > Most of the patches are build system changes for handling LLVM bitcode, > which Clang produces with LTO instead of ELF object files, postponing > ELF processing until a later stage, and ensuring initcall ordering. > > Note that first objtool patch in the series is already in linux-next, > but as it's needed with LTO, I'm including it also here to make testing > easier. […] Thank you very much for sending these changes. Do you have a branch, where your current work can be pulled from? Your branch on GitHub [1] seems 15 months old. 
Out of curiosity, I applied the changes, allowed the selection for i386 (x86), and with Clang 1:11~++20200701093119+ffee8040534-1~exp1 from Debian experimental, it failed with `Invalid absolute R_386_32 relocation: KERNEL_PAGES`: > make -f ./scripts/Makefile.build obj=arch/x86/boot arch/x86/boot/bzImage > make -f ./scripts/Makefile.build obj=arch/x86/boot/compressed arch/x86/boot/compressed/vmlinux > llvm-nm vmlinux | sed -n -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|__bss_start\|_end\)$/#define VO_ _AC(0x,UL)/p' > arch/x86/boot/compressed/../voffset.h > clang -Wp,-MMD,arch/x86/boot/compressed/.misc.o.d -nostdinc -isystem /usr/lib/llvm-11/lib/clang/11.0.0/include -I./arch/x86/include -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -Qunused-arguments -m32 -O2 -fno-strict-aliasing -fPIE -DDISABLE_BRANCH_PROFILING -march=i386 -mno-mmx -mno-sse -ffreestanding -fno-stack-protector -Wno-address-of-packed-member -Wno-gnu -Wno-pointer-sign -fmacro-prefix-map=./= -fno-asynchronous-unwind-tables -DKBUILD_MODFILE='"arch/x86/boot/compressed/misc"' -DKBUILD_BASENAME='"misc"' -DKBUILD_MODNAME='"misc"' -D__KBUILD_MODNAME=misc -c -o arch/x86/boot/compressed/misc.o arch/x86/boot/compressed/misc.c > llvm-objcopy -R .comment -S vmlinux arch/x86/boot/compressed/vmlinux.bin > arch/x86/tools/relocs vmlinux > arch/x86/boot/compressed/vmlinux.relocs;arch/x86/tools/relocs --abs-relocs vmlinux > Invalid absolute R_386_32 relocation: KERNEL_PAGES > make[2]: *** [arch/x86/boot/compressed/Makefile:134: arch/x86/boot/compressed/vmlinux.relocs] Error 1 > make[2]: *** Deleting file 'arch/x86/boot/compressed/vmlinux.relocs' > make[1]: *** [arch/x86/boot/Makefile:115: arch/x86/boot/compressed/vmlinux] Error 2 > make: *** [arch/x86/Makefile:268: bzImage] Error 2 Kind regards, Paul [1]: 
https://github.com/samitolvanen/linux/tree/clang-lto _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-11 16:32 ` Paul Menzel @ 2020-07-12 8:59 ` Sedat Dilek 2020-07-12 18:40 ` Nathan Chancellor 2020-07-12 23:34 ` Sami Tolvanen 1 sibling, 1 reply; 212+ messages in thread From: Sedat Dilek @ 2020-07-12 8:59 UTC (permalink / raw) To: Paul Menzel Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Clang-Built-Linux ML, Sami Tolvanen, linux-pci, Will Deacon, linux-arm-kernel On Sat, Jul 11, 2020 at 6:32 PM Paul Menzel <pmenzel@molgen.mpg.de> wrote: > > Dear Sami, > > > Am 24.06.20 um 22:31 schrieb Sami Tolvanen: > > This patch series adds support for building x86_64 and arm64 kernels > > with Clang's Link Time Optimization (LTO). > > > > In addition to performance, the primary motivation for LTO is to allow > > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's > > Pixel devices have shipped with LTO+CFI kernels since 2018. > > > > Most of the patches are build system changes for handling LLVM bitcode, > > which Clang produces with LTO instead of ELF object files, postponing > > ELF processing until a later stage, and ensuring initcall ordering. > > > > Note that first objtool patch in the series is already in linux-next, > > but as it's needed with LTO, I'm including it also here to make testing > > easier. > > […] > > Thank you very much for sending these changes. > > Do you have a branch, where your current work can be pulled from? Your > branch on GitHub [1] seems 15 months old. > Agreed it's easier to git-pull. I have seen [1] - not sure if this is the latest version. Alternatively, you can check patchwork LKML by searching for $submitter. ( You can open patch 01/22 and download the whole patch-series by following the link "series", see [3]. 
) - Sedat - [1] https://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git/log/?h=lto [2] https://lore.kernel.org/patchwork/project/lkml/list/?series=&submitter=19676 [3] https://lore.kernel.org/patchwork/series/450026/mbox/ _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 00/22] add support for Clang LTO
  2020-07-12  8:59   ` Sedat Dilek
@ 2020-07-12 18:40     ` Nathan Chancellor
  2020-07-14  9:44       ` Sedat Dilek
  0 siblings, 1 reply; 212+ messages in thread
From: Nathan Chancellor @ 2020-07-12 18:40 UTC (permalink / raw)
  To: Sedat Dilek
  Cc: linux-arch, Paul Menzel, x86, Kees Cook, Paul E. McKenney,
	kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada,
	linux-kbuild, Nick Desaulniers, linux-kernel,
	Clang-Built-Linux ML, Sami Tolvanen, linux-pci, Will Deacon,
	linux-arm-kernel

On Sun, Jul 12, 2020 at 10:59:17AM +0200, Sedat Dilek wrote:
> On Sat, Jul 11, 2020 at 6:32 PM Paul Menzel <pmenzel@molgen.mpg.de> wrote:
> >
> > Dear Sami,
> >
> >
> > On 24.06.20 at 22:31, Sami Tolvanen wrote:
> > > This patch series adds support for building x86_64 and arm64 kernels
> > > with Clang's Link Time Optimization (LTO).
> > >
> > > In addition to performance, the primary motivation for LTO is to allow
> > > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's
> > > Pixel devices have shipped with LTO+CFI kernels since 2018.
> > >
> > > Most of the patches are build system changes for handling LLVM bitcode,
> > > which Clang produces with LTO instead of ELF object files, postponing
> > > ELF processing until a later stage, and ensuring initcall ordering.
> > >
> > > Note that first objtool patch in the series is already in linux-next,
> > > but as it's needed with LTO, I'm including it also here to make testing
> > > easier.
> >
> > […]
> >
> > Thank you very much for sending these changes.
> >
> > Do you have a branch, where your current work can be pulled from? Your
> > branch on GitHub [1] seems 15 months old.
> >
>
> Agreed it's easier to git-pull.
> I have seen [1] - not sure if this is the latest version.
> Alternatively, you can check patchwork LKML by searching for $submitter.
> ( You can open patch 01/22 and download the whole patch-series by
> following the link "series", see [3]. )
>
> - Sedat -
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git/log/?h=lto
> [2] https://lore.kernel.org/patchwork/project/lkml/list/?series=&submitter=19676
> [3] https://lore.kernel.org/patchwork/series/450026/mbox/
>

Sami tagged this series on his GitHub:

https://github.com/samitolvanen/linux/releases/tag/lto-v1

git pull https://github.com/samitolvanen/linux lto-v1

Otherwise, he is updating the clang-cfi branch that includes both the
LTO and CFI patchsets. You can pull that and just turn on
CONFIG_LTO_CLANG.

Lastly, for the future, I would recommend grabbing b4 to easily apply
patches (specifically full series) from lore.kernel.org.

https://git.kernel.org/pub/scm/utils/b4/b4.git/
https://git.kernel.org/pub/scm/utils/b4/b4.git/tree/README.rst

You could grab this series and apply it easily by either downloading the
mbox file and following the instructions it gives for applying the mbox
file:

$ b4 am 20200624203200.78870-1-samitolvanen@google.com

or I prefer piping so that I don't have to clean up later:

$ b4 am -o - 20200624203200.78870-1-samitolvanen@google.com | git am

Cheers,
Nathan
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-12 18:40 ` Nathan Chancellor @ 2020-07-14 9:44 ` Sedat Dilek 2020-07-14 17:54 ` Nick Desaulniers 0 siblings, 1 reply; 212+ messages in thread From: Sedat Dilek @ 2020-07-14 9:44 UTC (permalink / raw) To: Nathan Chancellor Cc: linux-arch, Paul Menzel, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Clang-Built-Linux ML, Sami Tolvanen, linux-pci, Will Deacon, linux-arm-kernel On Sun, Jul 12, 2020 at 8:40 PM Nathan Chancellor <natechancellor@gmail.com> wrote: > > On Sun, Jul 12, 2020 at 10:59:17AM +0200, Sedat Dilek wrote: > > On Sat, Jul 11, 2020 at 6:32 PM Paul Menzel <pmenzel@molgen.mpg.de> wrote: > > > > > > Dear Sami, > > > > > > > > > Am 24.06.20 um 22:31 schrieb Sami Tolvanen: > > > > This patch series adds support for building x86_64 and arm64 kernels > > > > with Clang's Link Time Optimization (LTO). > > > > > > > > In addition to performance, the primary motivation for LTO is to allow > > > > Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google's > > > > Pixel devices have shipped with LTO+CFI kernels since 2018. > > > > > > > > Most of the patches are build system changes for handling LLVM bitcode, > > > > which Clang produces with LTO instead of ELF object files, postponing > > > > ELF processing until a later stage, and ensuring initcall ordering. > > > > > > > > Note that first objtool patch in the series is already in linux-next, > > > > but as it's needed with LTO, I'm including it also here to make testing > > > > easier. > > > > > > […] > > > > > > Thank you very much for sending these changes. > > > > > > Do you have a branch, where your current work can be pulled from? Your > > > branch on GitHub [1] seems 15 months old. > > > > > > > Agreed it's easier to git-pull. > > I have seen [1] - not sure if this is the latest version. 
> > Alternatively, you can check patchwork LKML by searching for $submitter. > > ( You can open patch 01/22 and download the whole patch-series by > > following the link "series", see [3]. ) > > > > - Sedat - > > > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git/log/?h=lto > > [2] https://lore.kernel.org/patchwork/project/lkml/list/?series=&submitter=19676 > > [3] https://lore.kernel.org/patchwork/series/450026/mbox/ > > > > Sami tagged this series on his GitHub: > > https://github.com/samitolvanen/linux/releases/tag/lto-v1 > > git pull https://github.com/samitolvanen/linux lto-v1 > > Otherwise, he is updating the clang-cfi branch that includes both the > LTO and CFI patchsets. You can pull that and just turn on > CONFIG_LTO_CLANG. > > Lastly, for the future, I would recommend grabbing b4 to easily apply > patches (specifically full series) from lore.kernel.org. > > https://git.kernel.org/pub/scm/utils/b4/b4.git/ > https://git.kernel.org/pub/scm/utils/b4/b4.git/tree/README.rst > > You could grab this series and apply it easily by either downloading the > mbox file and following the instructions it gives for applying the mbox > file: > > $ b4 am 20200624203200.78870-1-samitolvanen@google.com > > or I prefer piping so that I don't have to clean up later: > > $ b4 am -o - 20200624203200.78870-1-samitolvanen@google.com | git am > It is always a pleasure to read your replies and enrich my know-how beyond Linux-kernel hacking :-). Thanks for the tip with "b4" tool. Might add this to our ClangBuiltLinux wiki "Command line tips and tricks"? - Sedat - [1] https://github.com/ClangBuiltLinux/linux/wiki/Command-line-tips-and-tricks _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-14 9:44 ` Sedat Dilek @ 2020-07-14 17:54 ` Nick Desaulniers 0 siblings, 0 replies; 212+ messages in thread From: Nick Desaulniers @ 2020-07-14 17:54 UTC (permalink / raw) To: Sedat Dilek Cc: linux-arch, Paul Menzel, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, Linux Kbuild mailing list, LKML, Clang-Built-Linux ML, Sami Tolvanen, linux-pci, Nathan Chancellor, Will Deacon, Linux ARM On Tue, Jul 14, 2020 at 2:44 AM Sedat Dilek <sedat.dilek@gmail.com> wrote: > > On Sun, Jul 12, 2020 at 8:40 PM Nathan Chancellor > <natechancellor@gmail.com> wrote: > > > > Lastly, for the future, I would recommend grabbing b4 to easily apply > > patches (specifically full series) from lore.kernel.org. > > > > https://git.kernel.org/pub/scm/utils/b4/b4.git/ > > https://git.kernel.org/pub/scm/utils/b4/b4.git/tree/README.rst > > > > You could grab this series and apply it easily by either downloading the > > mbox file and following the instructions it gives for applying the mbox > > file: > > > > $ b4 am 20200624203200.78870-1-samitolvanen@google.com > > > > or I prefer piping so that I don't have to clean up later: > > > > $ b4 am -o - 20200624203200.78870-1-samitolvanen@google.com | git am > > > > It is always a pleasure to read your replies and enrich my know-how > beyond Linux-kernel hacking :-). > > Thanks for the tip with "b4" tool. > Might add this to our ClangBuiltLinux wiki "Command line tips and tricks"? > > - Sedat - > > [1] https://github.com/ClangBuiltLinux/linux/wiki/Command-line-tips-and-tricks Good idea, done. -- Thanks, ~Nick Desaulniers _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH 00/22] add support for Clang LTO
  2020-07-11 16:32 ` Paul Menzel
  2020-07-12  8:59   ` Sedat Dilek
@ 2020-07-12 23:34   ` Sami Tolvanen
  2020-07-14 12:16     ` Paul Menzel
  1 sibling, 1 reply; 212+ messages in thread
From: Sami Tolvanen @ 2020-07-12 23:34 UTC (permalink / raw)
  To: Paul Menzel
  Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney,
	Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada,
	linux-kbuild, Nick Desaulniers, LKML, clang-built-linux,
	linux-pci, Will Deacon, linux-arm-kernel

On Sat, Jul 11, 2020 at 9:32 AM Paul Menzel <pmenzel@molgen.mpg.de> wrote:
> Thank you very much for sending these changes.
>
> Do you have a branch, where your current work can be pulled from? Your
> branch on GitHub [1] seems 15 months old.

The clang-lto branch is rebased regularly on top of Linus' tree.
GitHub just looks at the commit date of the last commit in the tree,
which isn't all that informative.

> Out of curiosity, I applied the changes, allowed the selection for i386
> (x86), and with Clang 1:11~++20200701093119+ffee8040534-1~exp1 from
> Debian experimental, it failed with `Invalid absolute R_386_32
> relocation: KERNEL_PAGES`:

I haven't looked at getting this to work on i386, which is why we only
select ARCH_SUPPORTS_LTO for x86_64. I would expect there to be a few
issues to address.

> > arch/x86/tools/relocs vmlinux > arch/x86/boot/compressed/vmlinux.relocs;arch/x86/tools/relocs --abs-relocs vmlinux
> > Invalid absolute R_386_32 relocation: KERNEL_PAGES

KERNEL_PAGES looks like a constant, so it's probably safe to ignore
the absolute relocation in tools/relocs.c.

Sami
* Re: [PATCH 00/22] add support for Clang LTO
  2020-07-12 23:34   ` Sami Tolvanen
@ 2020-07-14 12:16     ` Paul Menzel
  2020-07-14 12:35       ` Sedat Dilek
  0 siblings, 1 reply; 212+ messages in thread
From: Paul Menzel @ 2020-07-14 12:16 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, Kernel Hardening,
	Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild,
	Nick Desaulniers, LKML, clang-built-linux, linux-pci,
	Will Deacon, linux-arm-kernel

Dear Sami,


On 13.07.20 at 01:34, Sami Tolvanen wrote:
> On Sat, Jul 11, 2020 at 9:32 AM Paul Menzel <pmenzel@molgen.mpg.de> wrote:
>> Thank you very much for sending these changes.
>>
>> Do you have a branch, where your current work can be pulled from? Your
>> branch on GitHub [1] seems 15 months old.
>
> The clang-lto branch is rebased regularly on top of Linus' tree.
> GitHub just looks at the commit date of the last commit in the tree,
> which isn't all that informative.

Thank you for clearing this up, and sorry for not checking myself.

>> Out of curiosity, I applied the changes, allowed the selection for i386
>> (x86), and with Clang 1:11~++20200701093119+ffee8040534-1~exp1 from
>> Debian experimental, it failed with `Invalid absolute R_386_32
>> relocation: KERNEL_PAGES`:
>
> I haven't looked at getting this to work on i386, which is why we only
> select ARCH_SUPPORTS_LTO for x86_64. I would expect there to be a few
> issues to address.
>
>>> arch/x86/tools/relocs vmlinux > arch/x86/boot/compressed/vmlinux.relocs;arch/x86/tools/relocs --abs-relocs vmlinux
>>> Invalid absolute R_386_32 relocation: KERNEL_PAGES
>
> KERNEL_PAGES looks like a constant, so it's probably safe to ignore
> the absolute relocation in tools/relocs.c.

Thank you for pointing me in the right direction. I am happy to report
that with the diff below (I was not sure which list the string should be
added to), Linux 5.8-rc5 with the LLVM/Clang/LTO patches on top builds
and boots on the ASRock E350M1.

```
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 8f3bf34840cef..e91af127ed3c0 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -79,6 +79,7 @@ static const char * const sym_regex_kernel[S_NSYMTYPES] = {
 	"__end_rodata_hpage_align|"
 #endif
 	"__vvar_page|"
+	"KERNEL_PAGES|"
 	"_end)$"
 };
```


Kind regards,

Paul
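As an illustration of what adding a name to such a pattern does, here is a small standalone sketch. It is not the code in arch/x86/tools/relocs.c: the pattern below is a shortened, hypothetical list in the same style, sketch_sym_pattern and sketch_sym_matches are made-up names, and the use of POSIX extended regular expressions with REG_NOSUB is an assumption about how such a pattern would typically be matched. Which of the S_* lists the new entry belongs in remains the open question from the message above.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, shortened pattern in the style of the entries in
 * sym_regex_kernel[]: one anchored alternation of exact symbol names.
 */
static const char sketch_sym_pattern[] =
	"^("
	"__vvar_page|"
	"KERNEL_PAGES|"
	"_end)$";

/* Returns true if sym_name matches one of the alternatives exactly. */
static bool sketch_sym_matches(const char *sym_name)
{
	regex_t re;
	bool match;

	/* REG_NOSUB: only a yes/no answer is needed, no capture groups. */
	if (regcomp(&re, sketch_sym_pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;

	match = regexec(&re, sym_name, 0, NULL, 0) == 0;
	regfree(&re);
	return match;
}

/* A few self-checks showing the effect of the anchors. */
static void sketch_self_check(void)
{
	assert(sketch_sym_matches("KERNEL_PAGES"));
	assert(sketch_sym_matches("_end"));
	assert(!sketch_sym_matches("KERNEL_PAGES_EXTRA"));	/* "$" anchor */
	assert(!sketch_sym_matches("MY_KERNEL_PAGES"));		/* "^" anchor */
}
```

Because the alternation is anchored at both ends, the added entry only affects a symbol whose name is exactly KERNEL_PAGES.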
* Re: [PATCH 00/22] add support for Clang LTO 2020-07-14 12:16 ` Paul Menzel @ 2020-07-14 12:35 ` Sedat Dilek 0 siblings, 0 replies; 212+ messages in thread From: Sedat Dilek @ 2020-07-14 12:35 UTC (permalink / raw) To: Paul Menzel Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, LKML, Clang-Built-Linux ML, Sami Tolvanen, linux-pci, Will Deacon, linux-arm-kernel On Tue, Jul 14, 2020 at 2:16 PM Paul Menzel <pmenzel@molgen.mpg.de> wrote: > > Dear Sami, > > > Am 13.07.20 um 01:34 schrieb Sami Tolvanen: > > On Sat, Jul 11, 2020 at 9:32 AM Paul Menzel <pmenzel@molgen.mpg.de> wrote: > >> Thank you very much for sending these changes. > >> > >> Do you have a branch, where your current work can be pulled from? Your > >> branch on GitHub [1] seems 15 months old. > > > > The clang-lto branch is rebased regularly on top of Linus' tree. > > GitHub just looks at the commit date of the last commit in the tree, > > which isn't all that informative. > > Thank you for clearing this up, and sorry for not checking myself. > > >> Out of curiosity, I applied the changes, allowed the selection for i386 > >> (x86), and with Clang 1:11~++20200701093119+ffee8040534-1~exp1 from > >> Debian experimental, it failed with `Invalid absolute R_386_32 > >> relocation: KERNEL_PAGES`: > > > > I haven't looked at getting this to work on i386, which is why we only > > select ARCH_SUPPORTS_LTO for x86_64. I would expect there to be a few > > issues to address. > > > >>> arch/x86/tools/relocs vmlinux > arch/x86/boot/compressed/vmlinux.relocs;arch/x86/tools/relocs --abs-relocs vmlinux > >>> Invalid absolute R_386_32 relocation: KERNEL_PAGES > > > > KERNEL_PAGES looks like a constant, so it's probably safe to ignore > > the absolute relocation in tools/relocs.c. > > Thank you for pointing me to the right direction. 
I am happy to report, > that with the diff below (no idea to what list to add the string), Linux > 5.8-rc5 with the LLVM/Clang/LTO patches on top, builds and boots on the > ASRock E350M1. > > ``` > diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c > index 8f3bf34840cef..e91af127ed3c0 100644 > --- a/arch/x86/tools/relocs.c > +++ b/arch/x86/tools/relocs.c > @@ -79,6 +79,7 @@ static const char * const > sym_regex_kernel[S_NSYMTYPES] = { > "__end_rodata_hpage_align|" > #endif > "__vvar_page|" > + "KERNEL_PAGES|" > "_end)$" > }; > ``` > What llvm-toolchain and version did you use? Can you post your linux-config? Thanks. - Sedat - _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 00/28] Add support for Clang LTO 2020-06-24 20:31 [PATCH 00/22] add support for Clang LTO Sami Tolvanen ` (24 preceding siblings ...) 2020-07-11 16:32 ` Paul Menzel @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 20:30 ` [PATCH v2 01/28] x86/boot/compressed: Disable relocation relaxation Sami Tolvanen ` (32 more replies) 25 siblings, 33 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel This patch series adds support for building x86_64 and arm64 kernels with Clang's Link Time Optimization (LTO). In addition to performance, the primary motivation for LTO is to allow Clang's Control-Flow Integrity (CFI) to be used in the kernel. Google has shipped millions of Pixel devices running three major kernel versions with LTO+CFI since 2018. Most of the patches are build system changes for handling LLVM bitcode, which Clang produces with LTO instead of ELF object files, postponing ELF processing until a later stage, and ensuring initcall ordering. Note that patches 1-4 are not directly related to LTO, but are needed to compile LTO kernels with ToT Clang, so I'm including them in the series for your convenience: - Patches 1-3 are required for building the kernel with ToT Clang, and IAS, and patch 4 is needed to build allmodconfig with LTO. - Patches 3-4 are already in linux-next, but not yet in 5.9-rc. --- Changes in v2: - Fixed -Wmissing-prototypes warnings with W=1. - Dropped cc-option from -fsplit-lto-unit and added .thinlto-cache scrubbing to make distclean. - Added a comment about Clang >=11 being required. - Added a patch to disable LTO for the arm64 KVM nVHE code. - Disabled objtool's noinstr validation with LTO unless enabled. 
- Included Peter's proposed objtool mcount patch in the series and replaced recordmcount with the objtool pass to avoid whitelisting relocations that are not calls. - Updated several commit messages with better explanations. Arvind Sankar (2): x86/boot/compressed: Disable relocation relaxation x86/asm: Replace __force_order with memory clobber Luca Stefani (1): RAS/CEC: Fix cec_init() prototype Nick Desaulniers (1): lib/string.c: implement stpcpy Peter Zijlstra (1): objtool: Add a pass for generating __mcount_loc Sami Tolvanen (23): objtool: Don't autodetect vmlinux.o kbuild: add support for objtool mcount x86, build: use objtool mcount kbuild: add support for Clang LTO kbuild: lto: fix module versioning kbuild: lto: postpone objtool kbuild: lto: limit inlining kbuild: lto: merge module sections kbuild: lto: remove duplicate dependencies from .mod files init: lto: ensure initcall ordering init: lto: fix PREL32 relocations PCI: Fix PREL32 relocations for LTO modpost: lto: strip .lto from module names scripts/mod: disable LTO for empty.c efi/libstub: disable LTO drivers/misc/lkdtm: disable LTO for rodata.o arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY arm64: vdso: disable LTO KVM: arm64: disable LTO for the nVHE directory arm64: allow LTO_CLANG and THINLTO to be selected x86, vdso: disable LTO only for vDSO x86, relocs: Ignore L4_PAGE_OFFSET relocations x86, build: allow LTO_CLANG and THINLTO to be selected .gitignore | 1 + Makefile | 65 ++++++- arch/Kconfig | 67 +++++++ arch/arm64/Kconfig | 2 + arch/arm64/Makefile | 1 + arch/arm64/kernel/vdso/Makefile | 4 +- arch/arm64/kvm/hyp/nvhe/Makefile | 4 +- arch/x86/Kconfig | 3 + arch/x86/Makefile | 5 + arch/x86/boot/compressed/Makefile | 2 + arch/x86/boot/compressed/pgtable_64.c | 9 - arch/x86/entry/vdso/Makefile | 5 +- arch/x86/include/asm/special_insns.h | 28 +-- arch/x86/kernel/cpu/common.c | 4 +- arch/x86/tools/relocs.c | 1 + drivers/firmware/efi/libstub/Makefile | 2 + drivers/misc/lkdtm/Makefile | 1 + 
drivers/ras/cec.c | 9 +- include/asm-generic/vmlinux.lds.h | 11 +- include/linux/init.h | 79 +++++++- include/linux/pci.h | 19 +- kernel/trace/Kconfig | 5 + lib/string.c | 24 +++ scripts/Makefile.build | 55 +++++- scripts/Makefile.lib | 6 +- scripts/Makefile.modfinal | 31 ++- scripts/Makefile.modpost | 26 ++- scripts/generate_initcall_order.pl | 270 ++++++++++++++++++++++++++ scripts/link-vmlinux.sh | 94 ++++++++- scripts/mod/Makefile | 1 + scripts/mod/modpost.c | 16 +- scripts/mod/modpost.h | 9 + scripts/mod/sumversion.c | 6 +- scripts/module-lto.lds | 26 +++ tools/objtool/builtin-check.c | 13 +- tools/objtool/builtin.h | 2 +- tools/objtool/check.c | 83 ++++++++ tools/objtool/check.h | 1 + tools/objtool/objtool.h | 1 + 39 files changed, 883 insertions(+), 108 deletions(-) create mode 100755 scripts/generate_initcall_order.pl create mode 100644 scripts/module-lto.lds base-commit: e28f0104343d0c132fa37f479870c9e43355fee4 -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 01/28] x86/boot/compressed: Disable relocation relaxation 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 21:44 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 02/28] x86/asm: Replace __force_order with memory clobber Sami Tolvanen ` (31 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Arvind Sankar, linux-pci, linux-arm-kernel From: Arvind Sankar <nivedita@alum.mit.edu> The x86-64 psABI [0] specifies special relocation types (R_X86_64_[REX_]GOTPCRELX) for indirection through the Global Offset Table, semantically equivalent to R_X86_64_GOTPCREL, which the linker can take advantage of for optimization (relaxation) at link time. This is supported by LLD and binutils versions 2.26 onwards. The compressed kernel is position-independent code, however, when using LLD or binutils versions before 2.27, it must be linked without the -pie option. In this case, the linker may optimize certain instructions into a non-position-independent form, by converting foo@GOTPCREL(%rip) to $foo. This potential issue has been present with LLD and binutils-2.26 for a long time, but it has never manifested itself before now: - LLD and binutils-2.26 only relax movq foo@GOTPCREL(%rip), %reg to leaq foo(%rip), %reg which is still position-independent, rather than mov $foo, %reg which is permitted by the psABI when -pie is not enabled. - gcc happens to only generate GOTPCREL relocations on mov instructions. 
- clang does generate GOTPCREL relocations on non-mov instructions, but when building the compressed kernel, it uses its integrated assembler (due to the redefinition of KBUILD_CFLAGS dropping -no-integrated-as), which has so far defaulted to not generating the GOTPCRELX relocations. Nick Desaulniers reports [1,2]: A recent change [3] to a default value of configuration variable (ENABLE_X86_RELAX_RELOCATIONS OFF -> ON) in LLVM now causes Clang's integrated assembler to emit R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX relocations. LLD will relax instructions with these relocations based on whether the image is being linked as position independent or not. When not, then LLD will relax these instructions to use absolute addressing mode (R_RELAX_GOT_PC_NOPIC). This causes kernels built with Clang and linked with LLD to fail to boot. Patch series [4] is a solution to allow the compressed kernel to be linked with -pie unconditionally, but even if merged is unlikely to be backported. As a simple solution that can be applied to stable as well, prevent the assembler from generating the relaxed relocation types using the -mrelax-relocations=no option. For ease of backporting, do this unconditionally. 
[0] https://gitlab.com/x86-psABIs/x86-64-ABI/-/blob/master/x86-64-ABI/linker-optimization.tex#L65 [1] https://lore.kernel.org/lkml/20200807194100.3570838-1-ndesaulniers@google.com/ [2] https://github.com/ClangBuiltLinux/linux/issues/1121 [3] https://reviews.llvm.org/rGc41a18cf61790fc898dcda1055c3efbf442c14c0 [4] https://lore.kernel.org/lkml/20200731202738.2577854-1-nivedita@alum.mit.edu/ Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Cc: stable@vger.kernel.org --- arch/x86/boot/compressed/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index 3962f592633d..ff7894f39e0e 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -43,6 +43,8 @@ KBUILD_CFLAGS += -Wno-pointer-sign KBUILD_CFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=) KBUILD_CFLAGS += -fno-asynchronous-unwind-tables KBUILD_CFLAGS += -D__DISABLE_EXPORTS +# Disable relocation relaxation in case the link is not PIE. +KBUILD_CFLAGS += $(call as-option,-Wa$(comma)-mrelax-relocations=no) KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__ GCOV_PROFILE := n -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
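Editorial illustration of the relaxation described in the commit message above. This is a minimal sketch and not LLD or binutils code: the function name sketch_relax_mov_to_lea and the foo_rel parameter are hypothetical. It models only the harmless case, where "movq foo@GOTPCREL(%rip), %reg" becomes "leaq foo(%rip), %reg" by changing one opcode byte (0x8b to 0x8d) and pointing the displacement at foo instead of its GOT slot. The non-PIE form "mov $foo, %reg" embeds an absolute address and is the one that breaks the compressed kernel; -mrelax-relocations=no avoids it by never emitting the GOTPCRELX relocation types in the first place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative model only: rewrite the 7-byte encoding
 *   REX.W 0x8b ModRM rel32   (movq foo@GOTPCREL(%rip), %reg)
 * in place as
 *   REX.W 0x8d ModRM rel32   (leaq foo(%rip), %reg)
 *
 * insn:    the 7 instruction bytes
 * foo_rel: hypothetical %rip-relative displacement of foo itself
 * Returns 1 if the instruction was relaxed, 0 if it is not a GOT-load mov.
 */
static int sketch_relax_mov_to_lea(uint8_t insn[7], int32_t foo_rel)
{
	uint32_t u = (uint32_t)foo_rel;

	assert(insn != NULL);

	/* Require a REX.W prefix, the mov opcode, and ModRM 00 reg 101 (RIP-relative). */
	if ((insn[0] & 0xf8) != 0x48 || insn[1] != 0x8b || (insn[2] & 0xc7) != 0x05)
		return 0;

	insn[1] = 0x8d;			/* mov r64, r/m64 -> lea r64, m */

	/* Store the new displacement as a little-endian rel32. */
	insn[3] = (uint8_t)(u & 0xff);
	insn[4] = (uint8_t)((u >> 8) & 0xff);
	insn[5] = (uint8_t)((u >> 16) & 0xff);
	insn[6] = (uint8_t)((u >> 24) & 0xff);
	return 1;
}
```

The result is still %rip-relative, which is why this particular relaxation never caused problems for the position-independent compressed kernel.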
* Re: [PATCH v2 01/28] x86/boot/compressed: Disable relocation relaxation 2020-09-03 20:30 ` [PATCH v2 01/28] x86/boot/compressed: Disable relocation relaxation Sami Tolvanen @ 2020-09-03 21:44 ` Kees Cook 2020-09-03 23:42 ` Arvind Sankar 0 siblings, 1 reply; 212+ messages in thread From: Kees Cook @ 2020-09-03 21:44 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Arvind Sankar, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:26PM -0700, Sami Tolvanen wrote: > From: Arvind Sankar <nivedita@alum.mit.edu> > > The x86-64 psABI [0] specifies special relocation types > (R_X86_64_[REX_]GOTPCRELX) for indirection through the Global Offset > Table, semantically equivalent to R_X86_64_GOTPCREL, which the linker > can take advantage of for optimization (relaxation) at link time. This > is supported by LLD and binutils versions 2.26 onwards. > > The compressed kernel is position-independent code, however, when using > LLD or binutils versions before 2.27, it must be linked without the -pie > option. In this case, the linker may optimize certain instructions into > a non-position-independent form, by converting foo@GOTPCREL(%rip) to $foo. > > This potential issue has been present with LLD and binutils-2.26 for a > long time, but it has never manifested itself before now: > - LLD and binutils-2.26 only relax > movq foo@GOTPCREL(%rip), %reg > to > leaq foo(%rip), %reg > which is still position-independent, rather than > mov $foo, %reg > which is permitted by the psABI when -pie is not enabled. > - gcc happens to only generate GOTPCREL relocations on mov instructions. 
> - clang does generate GOTPCREL relocations on non-mov instructions, but > when building the compressed kernel, it uses its integrated assembler > (due to the redefinition of KBUILD_CFLAGS dropping -no-integrated-as), > which has so far defaulted to not generating the GOTPCRELX > relocations. > > Nick Desaulniers reports [1,2]: > A recent change [3] to a default value of configuration variable > (ENABLE_X86_RELAX_RELOCATIONS OFF -> ON) in LLVM now causes Clang's > integrated assembler to emit R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX > relocations. LLD will relax instructions with these relocations based > on whether the image is being linked as position independent or not. > When not, then LLD will relax these instructions to use absolute > addressing mode (R_RELAX_GOT_PC_NOPIC). This causes kernels built with > Clang and linked with LLD to fail to boot. > > Patch series [4] is a solution to allow the compressed kernel to be > linked with -pie unconditionally, but even if merged is unlikely to be > backported. As a simple solution that can be applied to stable as well, > prevent the assembler from generating the relaxed relocation types using > the -mrelax-relocations=no option. For ease of backporting, do this > unconditionally. 
> > [0] https://gitlab.com/x86-psABIs/x86-64-ABI/-/blob/master/x86-64-ABI/linker-optimization.tex#L65 > [1] https://lore.kernel.org/lkml/20200807194100.3570838-1-ndesaulniers@google.com/ > [2] https://github.com/ClangBuiltLinux/linux/issues/1121 > [3] https://reviews.llvm.org/rGc41a18cf61790fc898dcda1055c3efbf442c14c0 > [4] https://lore.kernel.org/lkml/20200731202738.2577854-1-nivedita@alum.mit.edu/ > > Reported-by: Nick Desaulniers <ndesaulniers@google.com> > Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 01/28] x86/boot/compressed: Disable relocation relaxation 2020-09-03 21:44 ` Kees Cook @ 2020-09-03 23:42 ` Arvind Sankar 2020-09-04 7:14 ` Nathan Chancellor 0 siblings, 1 reply; 212+ messages in thread From: Arvind Sankar @ 2020-09-03 23:42 UTC (permalink / raw) To: Kees Cook Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Arvind Sankar, Sami Tolvanen, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 02:44:41PM -0700, Kees Cook wrote: > On Thu, Sep 03, 2020 at 01:30:26PM -0700, Sami Tolvanen wrote: > > From: Arvind Sankar <nivedita@alum.mit.edu> > > > > Patch series [4] is a solution to allow the compressed kernel to be > > linked with -pie unconditionally, but even if merged is unlikely to be > > backported. As a simple solution that can be applied to stable as well, > > prevent the assembler from generating the relaxed relocation types using > > the -mrelax-relocations=no option. For ease of backporting, do this > > unconditionally. > > > > [0] https://gitlab.com/x86-psABIs/x86-64-ABI/-/blob/master/x86-64-ABI/linker-optimization.tex#L65 > > [1] https://lore.kernel.org/lkml/20200807194100.3570838-1-ndesaulniers@google.com/ > > [2] https://github.com/ClangBuiltLinux/linux/issues/1121 > > [3] https://reviews.llvm.org/rGc41a18cf61790fc898dcda1055c3efbf442c14c0 > > [4] https://lore.kernel.org/lkml/20200731202738.2577854-1-nivedita@alum.mit.edu/ > > > > Reported-by: Nick Desaulniers <ndesaulniers@google.com> > > Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> > > Reviewed-by: Kees Cook <keescook@chromium.org> > > -- > Kees Cook Note that since [4] is now in tip, assuming it doesn't get dropped for some reason, this patch isn't necessary unless you need to backport this LTO series to 5.9 or below. Thanks. 
* Re: [PATCH v2 01/28] x86/boot/compressed: Disable relocation relaxation 2020-09-03 23:42 ` Arvind Sankar @ 2020-09-04 7:14 ` Nathan Chancellor 0 siblings, 0 replies; 212+ messages in thread From: Nathan Chancellor @ 2020-09-04 7:14 UTC (permalink / raw) To: Arvind Sankar Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 07:42:15PM -0400, Arvind Sankar wrote: > On Thu, Sep 03, 2020 at 02:44:41PM -0700, Kees Cook wrote: > > On Thu, Sep 03, 2020 at 01:30:26PM -0700, Sami Tolvanen wrote: > > > From: Arvind Sankar <nivedita@alum.mit.edu> > > > > > > Patch series [4] is a solution to allow the compressed kernel to be > > > linked with -pie unconditionally, but even if merged is unlikely to be > > > backported. As a simple solution that can be applied to stable as well, > > > prevent the assembler from generating the relaxed relocation types using > > > the -mrelax-relocations=no option. For ease of backporting, do this > > > unconditionally. > > > > > > [0] https://gitlab.com/x86-psABIs/x86-64-ABI/-/blob/master/x86-64-ABI/linker-optimization.tex#L65 > > > [1] https://lore.kernel.org/lkml/20200807194100.3570838-1-ndesaulniers@google.com/ > > > [2] https://github.com/ClangBuiltLinux/linux/issues/1121 > > > [3] https://reviews.llvm.org/rGc41a18cf61790fc898dcda1055c3efbf442c14c0 > > > [4] https://lore.kernel.org/lkml/20200731202738.2577854-1-nivedita@alum.mit.edu/ > > > > > > Reported-by: Nick Desaulniers <ndesaulniers@google.com> > > > Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> > > > > Reviewed-by: Kees Cook <keescook@chromium.org> > > > > -- > > Kees Cook > > Note that since [4] is now in tip, assuming it doesn't get dropped for > some reason, this patch isn't necessary unless you need to backport this > LTO series to 5.9 or below. 
> > Thanks. It is still necessary for tip of tree LLVM to work properly (specifically clang and ld.lld) regardless of whether or not LTO is used. [4] also fixes it but I don't think it can be backported to stable so it would still be nice to get it picked up so that it can be sent back there. We have been carrying it in our CI for a decent amount of time... Cheers, Nathan _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 02/28] x86/asm: Replace __force_order with memory clobber 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen 2020-09-03 20:30 ` [PATCH v2 01/28] x86/boot/compressed: Disable relocation relaxation Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 21:45 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 03/28] lib/string.c: implement stpcpy Sami Tolvanen ` (30 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Arvind Sankar, linux-pci, linux-arm-kernel From: Arvind Sankar <nivedita@alum.mit.edu> The CRn accessor functions use __force_order as a dummy operand to prevent the compiler from reordering CRn reads/writes with respect to each other. The fact that the asm is volatile should be enough to prevent this: volatile asm statements should be executed in program order. However GCC 4.9.x and 5.x have a bug that might result in reordering. This was fixed in 8.1, 7.3 and 6.5. Versions prior to these, including 5.x and 4.9.x, may reorder volatile asm statements with respect to each other. There are some issues with __force_order as implemented: - It is used only as an input operand for the write functions, and hence doesn't do anything additional to prevent reordering writes. - It allows memory accesses to be cached/reordered across write functions, but CRn writes affect the semantics of memory accesses, so this could be dangerous. 
- __force_order is not actually defined in the kernel proper, but the LLVM toolchain can in some cases require a definition: LLVM (as well as GCC 4.9) requires it for PIE code, which is why the compressed kernel has a definition, but also the clang integrated assembler may consider the address of __force_order to be significant, resulting in a reference that requires a definition. Fix this by: - Using a memory clobber for the write functions to additionally prevent caching/reordering memory accesses across CRn writes. - Using a dummy input operand with an arbitrary constant address for the read functions, instead of a global variable. This will prevent reads from being reordered across writes, while allowing memory loads to be cached/reordered across CRn reads, which should be safe. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82602 Link: https://lore.kernel.org/lkml/20200527135329.1172644-1-arnd@arndb.de/ --- arch/x86/boot/compressed/pgtable_64.c | 9 --------- arch/x86/include/asm/special_insns.h | 28 ++++++++++++++------------- arch/x86/kernel/cpu/common.c | 4 ++-- 3 files changed, 17 insertions(+), 24 deletions(-) diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c index c8862696a47b..7d0394f4ebf9 100644 --- a/arch/x86/boot/compressed/pgtable_64.c +++ b/arch/x86/boot/compressed/pgtable_64.c @@ -5,15 +5,6 @@ #include "pgtable.h" #include "../string.h" -/* - * __force_order is used by special_insns.h asm code to force instruction - * serialization. - * - * It is not referenced from the code, but GCC < 5 with -fPIE would fail - * due to an undefined symbol. Define it to make these ancient GCCs work. 
- */ -unsigned long __force_order; - #define BIOS_START_MIN 0x20000U /* 128K, less than this is insane */ #define BIOS_START_MAX 0x9f000U /* 640K, absolute maximum */ diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index 59a3e13204c3..d6e3bb9363d2 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -11,45 +11,47 @@ #include <linux/jump_label.h> /* - * Volatile isn't enough to prevent the compiler from reordering the - * read/write functions for the control registers and messing everything up. - * A memory clobber would solve the problem, but would prevent reordering of - * all loads stores around it, which can hurt performance. Solution is to - * use a variable and mimic reads and writes to it to enforce serialization + * The compiler should not reorder volatile asm statements with respect to each + * other: they should execute in program order. However GCC 4.9.x and 5.x have + * a bug (which was fixed in 8.1, 7.3 and 6.5) where they might reorder + * volatile asm. The write functions are not affected since they have memory + * clobbers preventing reordering. To prevent reads from being reordered with + * respect to writes, use a dummy memory operand. 
*/ -extern unsigned long __force_order; + +#define __FORCE_ORDER "m"(*(unsigned int *)0x1000UL) void native_write_cr0(unsigned long val); static inline unsigned long native_read_cr0(void) { unsigned long val; - asm volatile("mov %%cr0,%0\n\t" : "=r" (val), "=m" (__force_order)); + asm volatile("mov %%cr0,%0\n\t" : "=r" (val) : __FORCE_ORDER); return val; } static __always_inline unsigned long native_read_cr2(void) { unsigned long val; - asm volatile("mov %%cr2,%0\n\t" : "=r" (val), "=m" (__force_order)); + asm volatile("mov %%cr2,%0\n\t" : "=r" (val) : __FORCE_ORDER); return val; } static __always_inline void native_write_cr2(unsigned long val) { - asm volatile("mov %0,%%cr2": : "r" (val), "m" (__force_order)); + asm volatile("mov %0,%%cr2": : "r" (val) : "memory"); } static inline unsigned long __native_read_cr3(void) { unsigned long val; - asm volatile("mov %%cr3,%0\n\t" : "=r" (val), "=m" (__force_order)); + asm volatile("mov %%cr3,%0\n\t" : "=r" (val) : __FORCE_ORDER); return val; } static inline void native_write_cr3(unsigned long val) { - asm volatile("mov %0,%%cr3": : "r" (val), "m" (__force_order)); + asm volatile("mov %0,%%cr3": : "r" (val) : "memory"); } static inline unsigned long native_read_cr4(void) @@ -64,10 +66,10 @@ static inline unsigned long native_read_cr4(void) asm volatile("1: mov %%cr4, %0\n" "2:\n" _ASM_EXTABLE(1b, 2b) - : "=r" (val), "=m" (__force_order) : "0" (0)); + : "=r" (val) : "0" (0), __FORCE_ORDER); #else /* CR4 always exists on x86_64. 
*/ - asm volatile("mov %%cr4,%0\n\t" : "=r" (val), "=m" (__force_order)); + asm volatile("mov %%cr4,%0\n\t" : "=r" (val) : __FORCE_ORDER); #endif return val; } diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index c5d6f17d9b9d..178499f90366 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -359,7 +359,7 @@ void native_write_cr0(unsigned long val) unsigned long bits_missing = 0; set_register: - asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order)); + asm volatile("mov %0,%%cr0": "+r" (val) : : "memory"); if (static_branch_likely(&cr_pinning)) { if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) { @@ -378,7 +378,7 @@ void native_write_cr4(unsigned long val) unsigned long bits_changed = 0; set_register: - asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits)); + asm volatile("mov %0,%%cr4": "+r" (val) : : "memory"); if (static_branch_likely(&cr_pinning)) { if (unlikely((val & cr4_pinned_mask) != cr4_pinned_bits)) { -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
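Editorial illustration of the constraint scheme this patch adopts. This is a user-space sketch under stated assumptions, not the kernel code: control registers cannot be accessed outside ring 0, so fake_cr is a hypothetical stand-in and the asm templates are left empty. Only the operand and clobber lists mirror the patch: the read side takes a dummy "m" input at an arbitrary constant address, and the write side carries a "memory" clobber. The names sketch_read_cr, sketch_write_cr and SKETCH_FORCE_ORDER are made up for this example.

```c
#include <assert.h>

/* Hypothetical stand-in for a control register. */
static unsigned long fake_cr;

/* Dummy memory input operand; the address is never dereferenced. */
#define SKETCH_FORCE_ORDER "m"(*(unsigned int *)0x1000UL)

static inline unsigned long sketch_read_cr(void)
{
	unsigned long val = fake_cr;

	/*
	 * The "m" input marks this asm as a reader of memory, so the
	 * compiler may not move it across an asm that clobbers "memory".
	 * Ordinary loads may still be cached or reordered around it.
	 */
	__asm__ __volatile__("" : "+r" (val) : SKETCH_FORCE_ORDER);
	return val;
}

static inline void sketch_write_cr(unsigned long val)
{
	fake_cr = val;

	/*
	 * The "memory" clobber tells the compiler that memory may have
	 * changed: no cached loads or pending stores may cross this point.
	 */
	__asm__ __volatile__("" : : "r" (val) : "memory");
}
```

The asymmetry is deliberate in the patch: writes to CRn can change how memory accesses behave, so they act as full compiler barriers, while reads only need to stay ordered relative to those writes.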
* Re: [PATCH v2 02/28] x86/asm: Replace __force_order with memory clobber 2020-09-03 20:30 ` [PATCH v2 02/28] x86/asm: Replace __force_order with memory clobber Sami Tolvanen @ 2020-09-03 21:45 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 21:45 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Arvind Sankar, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:27PM -0700, Sami Tolvanen wrote: > From: Arvind Sankar <nivedita@alum.mit.edu> > > The CRn accessor functions use __force_order as a dummy operand to > prevent the compiler from reordering CRn reads/writes with respect to > each other. > > The fact that the asm is volatile should be enough to prevent this: > volatile asm statements should be executed in program order. However GCC > 4.9.x and 5.x have a bug that might result in reordering. This was fixed > in 8.1, 7.3 and 6.5. Versions prior to these, including 5.x and 4.9.x, > may reorder volatile asm statements with respect to each other. > > There are some issues with __force_order as implemented: > - It is used only as an input operand for the write functions, and hence > doesn't do anything additional to prevent reordering writes. > - It allows memory accesses to be cached/reordered across write > functions, but CRn writes affect the semantics of memory accesses, so > this could be dangerous. > - __force_order is not actually defined in the kernel proper, but the > LLVM toolchain can in some cases require a definition: LLVM (as well > as GCC 4.9) requires it for PIE code, which is why the compressed > kernel has a definition, but also the clang integrated assembler may > consider the address of __force_order to be significant, resulting in > a reference that requires a definition. 
> > Fix this by: > - Using a memory clobber for the write functions to additionally prevent > caching/reordering memory accesses across CRn writes. > - Using a dummy input operand with an arbitrary constant address for the > read functions, instead of a global variable. This will prevent reads > from being reordered across writes, while allowing memory loads to be > cached/reordered across CRn reads, which should be safe. > > Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> In the primary thread for this patch I sent a Reviewed tag, but for good measure, here it is again: Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 03/28] lib/string.c: implement stpcpy 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen 2020-09-03 20:30 ` [PATCH v2 01/28] x86/boot/compressed: Disable relocation relaxation Sami Tolvanen 2020-09-03 20:30 ` [PATCH v2 02/28] x86/asm: Replace __force_order with memory clobber Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 21:47 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 04/28] RAS/CEC: Fix cec_init() prototype Sami Tolvanen ` (29 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel From: Nick Desaulniers <ndesaulniers@google.com> LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. `stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This optimization was introduced into clang-12. Implement this so that we don't observe linkage failures due to missing symbol definitions for `stpcpy`. Similar to last year's fire drill with: commit 5f074f3e192f ("lib/string.c: implement a basic bcmp") The kernel is somewhere between a "freestanding" environment (no full libc) and "hosted" environment (many symbols from libc exist with the same type, function signature, and semantics). As H. Peter Anvin notes, there's not really a great way to inform the compiler that you're targeting a freestanding environment but would like to opt-in to some libcall optimizations (see pr/47280 below), rather than opt-out. 
Arvind notes, -fno-builtin-* behaves slightly differently between GCC and Clang, and Clang is missing many __builtin_* definitions, which I consider a bug in Clang and am working on fixing. Masahiro summarizes the subtle distinction between compilers justly: To prevent transformation from foo() into bar(), there are two ways in Clang to do that; -fno-builtin-foo, and -fno-builtin-bar. There is only one in GCC; -fno-builtin-foo. (Any difference in that behavior in Clang is likely a bug from a missing __builtin_* definition.) Masahiro also notes: We want to disable optimization from foo() to bar(), but we may still benefit from the optimization from foo() into something else. If GCC implements the same transform, we would run into a problem because it is not -fno-builtin-bar, but -fno-builtin-foo that disables that optimization. In this regard, -fno-builtin-foo would be more future-proof than -fno-builtin-bar, but -fno-builtin-foo is still potentially overkill. We may want to prevent calls from foo() being optimized into calls to bar(), but we still may want other optimization on calls to foo(). It seems that compilers today don't quite provide the fine-grained control over which libcall optimizations pseudo-freestanding environments would prefer. Finally, Kees notes that this interface is unsafe, so we should not encourage its use. As such, I've removed the declaration from any header, but it still needs to be exported to avoid linkage errors in modules.
Reported-by: Sami Tolvanen <samitolvanen@google.com> Suggested-by: Andy Lavr <andy.lavr@gmail.com> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu> Suggested-by: Joe Perches <joe@perches.com> Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Cc: stable@vger.kernel.org Link: https://bugs.llvm.org/show_bug.cgi?id=47162 Link: https://bugs.llvm.org/show_bug.cgi?id=47280 Link: https://github.com/ClangBuiltLinux/linux/issues/1126 Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html Link: https://reviews.llvm.org/D85963 --- lib/string.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/lib/string.c b/lib/string.c index 6012c385fb31..6bd0cf0fb009 100644 --- a/lib/string.c +++ b/lib/string.c @@ -272,6 +272,30 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count) } EXPORT_SYMBOL(strscpy_pad); +/** + * stpcpy - copy a string from src to dest returning a pointer to the new end + * of dest, including src's %NUL-terminator. May overrun dest. + * @dest: pointer to end of string being copied into. Must be large enough + * to receive copy. + * @src: pointer to the beginning of string being copied from. Must not overlap + * dest. + * + * stpcpy differs from strcpy in a key way: the return value is a pointer to + * the new %NUL-terminating character in dest. (For strcpy, the return value + * is a pointer to the start of dest.) This interface is considered unsafe as + * it doesn't perform bounds checking of the inputs. As such it's not + * recommended for usage. Instead, its definition is provided in case the + * compiler lowers other libcalls to stpcpy.
+ */ +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src); +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src) +{ + while ((*dest++ = *src++) != '\0') + /* nothing */; + return --dest; +} +EXPORT_SYMBOL(stpcpy); + #ifndef __HAVE_ARCH_STRCAT /** * strcat - Append one %NUL-terminated string to another -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
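Editorial illustration of the equivalence the commit message relies on: the return value of sprintf(dest, "%s", str) equals stpcpy(dest, str) - dest. This is a user-space sketch, assuming a hosted C library for sprintf; the copy loop matches the one in the patch but lives under the hypothetical name sketch_stpcpy so it does not clash with libc's own stpcpy, and sketch_same_as_sprintf is a made-up helper for the comparison.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same loop as the patch, renamed to avoid clashing with libc's stpcpy. */
static char *sketch_stpcpy(char *dest, const char *src)
{
	assert(dest != NULL && src != NULL);

	while ((*dest++ = *src++) != '\0')
		/* nothing */;
	return --dest;	/* points at the terminating NUL written into dest */
}

/* Returns 1 if both forms produce the same length and buffer contents. */
static int sketch_same_as_sprintf(const char *str)
{
	char a[64], b[64];
	int n;
	long m;

	assert(str != NULL && strlen(str) < sizeof(a));

	n = sprintf(a, "%s", str);		/* what the source code asked for */
	m = sketch_stpcpy(b, str) - b;		/* what the lowered call computes */
	return n == m && strcmp(a, b) == 0;
}
```

Because the compiler can introduce the stpcpy call on its own, the kernel has to provide the symbol even though no kernel source calls it directly.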
* Re: [PATCH v2 03/28] lib/string.c: implement stpcpy 2020-09-03 20:30 ` [PATCH v2 03/28] lib/string.c: implement stpcpy Sami Tolvanen @ 2020-09-03 21:47 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 21:47 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:28PM -0700, Sami Tolvanen wrote: > From: Nick Desaulniers <ndesaulniers@google.com> > > LLVM implemented a recent "libcall optimization" that lowers calls to > `sprintf(dest, "%s", str)` where the return value is used to > `stpcpy(dest, str) - dest`. This generally avoids the machinery involved > in parsing format strings. `stpcpy` is just like `strcpy` except it > returns the pointer to the new tail of `dest`. This optimization was > introduced into clang-12. > > Implement this so that we don't observe linkage failures due to missing > symbol definitions for `stpcpy`. > > Similar to last year's fire drill with: > commit 5f074f3e192f ("lib/string.c: implement a basic bcmp") > > The kernel is somewhere between a "freestanding" environment (no full libc) > and "hosted" environment (many symbols from libc exist with the same > type, function signature, and semantics). > > As H. Peter Anvin notes, there's not really a great way to inform the > compiler that you're targeting a freestanding environment but would like > to opt-in to some libcall optimizations (see pr/47280 below), rather than > opt-out. > > Arvind notes, -fno-builtin-* behaves slightly differently between GCC > and Clang, and Clang is missing many __builtin_* definitions, which I > consider a bug in Clang and am working on fixing. 
> > Masahiro summarizes the subtle distinction between compilers justly: > To prevent transformation from foo() into bar(), there are two ways in > Clang to do that; -fno-builtin-foo, and -fno-builtin-bar. There is > only one in GCC; -fno-builtin-foo. > > (Any difference in that behavior in Clang is likely a bug from a missing > __builtin_* definition.) > > Masahiro also notes: > We want to disable optimization from foo() to bar(), > but we may still benefit from the optimization from > foo() into something else. If GCC implements the same transform, we > would run into a problem because it is not -fno-builtin-bar, but > -fno-builtin-foo that disables that optimization. > > In this regard, -fno-builtin-foo would be more future-proof than > -fno-builtin-bar, but -fno-builtin-foo is still potentially overkill. We > may want to prevent calls from foo() being optimized into calls to > bar(), but we still may want other optimization on calls to foo(). > > It seems that compilers today don't quite provide the fine-grained control > over which libcall optimizations pseudo-freestanding environments would > prefer. > > Finally, Kees notes that this interface is unsafe, so we should not > encourage its use. As such, I've removed the declaration from any > header, but it still needs to be exported to avoid linkage errors in > modules. > > Reported-by: Sami Tolvanen <samitolvanen@google.com> > Suggested-by: Andy Lavr <andy.lavr@gmail.com> > Suggested-by: Arvind Sankar <nivedita@alum.mit.edu> > Suggested-by: Joe Perches <joe@perches.com> > Suggested-by: Masahiro Yamada <masahiroy@kernel.org> > Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> > Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> As you mentioned, this is in -next already (via -mm).
I think I sent a tag for this before, but maybe akpm missed it, so for good measure: Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook
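The equivalence described in the quoted commit message can be sketched in plain C. This is a minimal userspace illustration under stated assumptions, not the code from the patch: the names `sketch_stpcpy`, `len_via_sprintf`, and `len_via_stpcpy` are hypothetical and were chosen so they do not clash with a libc `stpcpy`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper with the semantics the commit message describes:
 * copy src into dest like strcpy, but return a pointer to the
 * terminating NUL in dest instead of dest itself.
 */
static char *sketch_stpcpy(char *dest, const char *src)
{
	while ((*dest++ = *src++) != '\0')
		/* nothing */;
	return --dest;	/* step back onto the NUL that was just written */
}

/* The original form: the return value of sprintf is the copied length. */
static int len_via_sprintf(char *dest, const char *str)
{
	return sprintf(dest, "%s", str);
}

/* The lowered form: the same length, computed without format parsing. */
static int len_via_stpcpy(char *dest, const char *str)
{
	return (int)(sketch_stpcpy(dest, str) - dest);
}
```

Both helpers leave the same bytes in `dest` and report the same length, which is why a compiler may substitute one for the other and why the kernel then needs a definition of the substituted symbol.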
* [PATCH v2 04/28] RAS/CEC: Fix cec_init() prototype 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (2 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 03/28] lib/string.c: implement stpcpy Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 21:50 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 05/28] objtool: Add a pass for generating __mcount_loc Sami Tolvanen ` (28 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, Luca Stefani, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel From: Luca Stefani <luca.stefani.ge1@gmail.com> late_initcall() expects a function that returns an integer. Update the function signature to match. [ bp: Massage commit message into proper sentences. ] Fixes: 9554bfe403nd ("x86/mce: Convert the CEC to use the MCE notifier") Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lkml.kernel.org/r/20200805095708.83939-1-luca.stefani.ge1@gmail.com --- drivers/ras/cec.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c index 569d9ad2c594..6939aa5b3dc7 100644 --- a/drivers/ras/cec.c +++ b/drivers/ras/cec.c @@ -553,20 +553,20 @@ static struct notifier_block cec_nb = { .priority = MCE_PRIO_CEC, }; -static void __init cec_init(void) +static int __init cec_init(void) { if (ce_arr.disabled) - return; + return -ENODEV; ce_arr.array = (void *)get_zeroed_page(GFP_KERNEL); if (!ce_arr.array) { pr_err("Error allocating CE array page!\n"); - return; + return -ENOMEM; } if (create_debugfs_nodes()) { free_page((unsigned 
long)ce_arr.array); - return; + return -ENOMEM; } INIT_DELAYED_WORK(&cec_work, cec_work_fn); @@ -575,6 +575,7 @@ static void __init cec_init(void) mce_register_decode_chain(&cec_nb); pr_info("Correctable Errors collector initialized.\n"); + return 0; } late_initcall(cec_init); -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 04/28] RAS/CEC: Fix cec_init() prototype 2020-09-03 20:30 ` [PATCH v2 04/28] RAS/CEC: Fix cec_init() prototype Sami Tolvanen @ 2020-09-03 21:50 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 21:50 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, Luca Stefani, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:29PM -0700, Sami Tolvanen wrote: > From: Luca Stefani <luca.stefani.ge1@gmail.com> > > late_initcall() expects a function that returns an integer. Update the > function signature to match. > > [ bp: Massage commit message into proper sentences. ] > > Fixes: 9554bfe403nd ("x86/mce: Convert the CEC to use the MCE notifier") > Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com> I don't see this in -next yet (next-20200903), but given Boris's SoB, I suspect it just hasn't snuck its way there from -tip. Regardless: Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook
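The prototype fix above can be illustrated with a small userspace sketch. It assumes only that an initcall is invoked through a pointer to a function taking no arguments and returning int; `demo_initcall_t`, `demo_cec_init`, `demo_run_initcall`, and `demo_disabled` are hypothetical names, not kernel identifiers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Same shape as an initcall: no arguments, returns an int status. */
typedef int (*demo_initcall_t)(void);

static bool demo_disabled;	/* stands in for a "feature disabled" flag */

/* Returns 0 on success or a negative errno, so the caller can see why it bailed out. */
static int demo_cec_init(void)
{
	if (demo_disabled)
		return -ENODEV;
	return 0;
}

/* The caller only ever sees the pointer type, so the callee must match it. */
static int demo_run_initcall(demo_initcall_t fn)
{
	return fn();
}
```

Calling a `void` function through an `int (*)(void)` pointer is undefined behaviour in C, and it is also the kind of mismatch that a type-based indirect-call check such as CFI is meant to catch, which is presumably why the fix travels with this series.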
* [PATCH v2 05/28] objtool: Add a pass for generating __mcount_loc 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (3 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 04/28] RAS/CEC: Fix cec_init() prototype Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 21:51 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 06/28] objtool: Don't autodetect vmlinux.o Sami Tolvanen ` (27 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel From: Peter Zijlstra <peterz@infradead.org> Add the --mcount option for generating __mcount_loc sections needed for dynamic ftrace. Using this pass requires the kernel to be compiled with -mfentry and CC_USING_NOP_MCOUNT to be defined in Makefile. Link: https://lore.kernel.org/lkml/20200625200235.GQ4781@hirez.programming.kicks-ass.net/ Signed-off-by: Peter Zijlstra <peterz@infradead.org> [Sami: rebased to mainline, dropped config changes, fixed to actually use --mcount, and wrote a commit message.] 
Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- tools/objtool/builtin-check.c | 3 +- tools/objtool/builtin.h | 2 +- tools/objtool/check.c | 83 +++++++++++++++++++++++++++++++++++ tools/objtool/check.h | 1 + tools/objtool/objtool.h | 1 + 5 files changed, 88 insertions(+), 2 deletions(-) diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c index 7a44174967b5..71595cf4946d 100644 --- a/tools/objtool/builtin-check.c +++ b/tools/objtool/builtin-check.c @@ -18,7 +18,7 @@ #include "builtin.h" #include "objtool.h" -bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux; +bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount; static const char * const check_usage[] = { "objtool check [<options>] file.o", @@ -35,6 +35,7 @@ const struct option check_options[] = { OPT_BOOLEAN('s', "stats", &stats, "print statistics"), OPT_BOOLEAN('d', "duplicate", &validate_dup, "duplicate validation for vmlinux.o"), OPT_BOOLEAN('l', "vmlinux", &vmlinux, "vmlinux.o validation"), + OPT_BOOLEAN('M', "mcount", &mcount, "generate __mcount_loc"), OPT_END(), }; diff --git a/tools/objtool/builtin.h b/tools/objtool/builtin.h index 85c979caa367..94565a72b701 100644 --- a/tools/objtool/builtin.h +++ b/tools/objtool/builtin.h @@ -8,7 +8,7 @@ #include <subcmd/parse-options.h> extern const struct option check_options[]; -extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux; +extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount; extern int cmd_check(int argc, const char **argv); extern int cmd_orc(int argc, const char **argv); diff --git a/tools/objtool/check.c b/tools/objtool/check.c index e034a8f24f46..6e0b478dc065 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -433,6 +433,65 @@ static int add_dead_ends(struct objtool_file *file) return 0; } +static int 
create_mcount_loc_sections(struct objtool_file *file) +{ + struct section *sec, *reloc_sec; + struct reloc *reloc; + unsigned long *loc; + struct instruction *insn; + int idx; + + sec = find_section_by_name(file->elf, "__mcount_loc"); + if (sec) { + INIT_LIST_HEAD(&file->mcount_loc_list); + WARN("file already has __mcount_loc section, skipping"); + return 0; + } + + if (list_empty(&file->mcount_loc_list)) + return 0; + + idx = 0; + list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) + idx++; + + sec = elf_create_section(file->elf, "__mcount_loc", sizeof(unsigned long), idx); + if (!sec) + return -1; + + reloc_sec = elf_create_reloc_section(file->elf, sec, SHT_RELA); + if (!reloc_sec) + return -1; + + idx = 0; + list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) { + + loc = (unsigned long *)sec->data->d_buf + idx; + memset(loc, 0, sizeof(unsigned long)); + + reloc = malloc(sizeof(*reloc)); + if (!reloc) { + perror("malloc"); + return -1; + } + memset(reloc, 0, sizeof(*reloc)); + + reloc->sym = insn->sec->sym; + reloc->addend = insn->offset; + reloc->type = R_X86_64_64; + reloc->offset = idx * sizeof(unsigned long); + reloc->sec = reloc_sec; + elf_add_reloc(file->elf, reloc); + + idx++; + } + + if (elf_rebuild_reloc_section(file->elf, reloc_sec)) + return -1; + + return 0; +} + /* * Warnings shouldn't be reported for ignored functions. */ @@ -784,6 +843,22 @@ static int add_call_destinations(struct objtool_file *file) insn->type = INSN_NOP; } + if (mcount && !strcmp(insn->call_dest->name, "__fentry__")) { + if (reloc) { + reloc->type = R_NONE; + elf_write_reloc(file->elf, reloc); + } + + elf_write_insn(file->elf, insn->sec, + insn->offset, insn->len, + arch_nop_insn(insn->len)); + + insn->type = INSN_NOP; + + list_add_tail(&insn->mcount_loc_node, + &file->mcount_loc_list); + } + /* * Whatever stack impact regular CALLs have, should be undone * by the RETURN of the called function. 
@@ -2791,6 +2866,7 @@ int check(const char *_objname, bool orc) INIT_LIST_HEAD(&file.insn_list); hash_init(file.insn_hash); + INIT_LIST_HEAD(&file.mcount_loc_list); file.c_file = !vmlinux && find_section_by_name(file.elf, ".comment"); file.ignore_unreachables = no_unreachable; file.hints = false; @@ -2838,6 +2914,13 @@ int check(const char *_objname, bool orc) warnings += ret; } + if (mcount) { + ret = create_mcount_loc_sections(&file); + if (ret < 0) + goto out; + warnings += ret; + } + if (orc) { ret = create_orc(&file); if (ret < 0) diff --git a/tools/objtool/check.h b/tools/objtool/check.h index 061aa96e15d3..b62afd3d970b 100644 --- a/tools/objtool/check.h +++ b/tools/objtool/check.h @@ -22,6 +22,7 @@ struct insn_state { struct instruction { struct list_head list; struct hlist_node hash; + struct list_head mcount_loc_node; struct section *sec; unsigned long offset; unsigned int len; diff --git a/tools/objtool/objtool.h b/tools/objtool/objtool.h index 528028a66816..427806079540 100644 --- a/tools/objtool/objtool.h +++ b/tools/objtool/objtool.h @@ -16,6 +16,7 @@ struct objtool_file { struct elf *elf; struct list_head insn_list; DECLARE_HASHTABLE(insn_hash, 20); + struct list_head mcount_loc_list; bool ignore_unreachables, c_file, hints, rodata; }; -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
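What the new pass produces can be sketched outside objtool. In the patch, each `__mcount_loc` slot gets an absolute 64-bit relocation whose symbol is the section symbol and whose addend is the instruction offset, so after linking the slot holds section address plus offset. The sketch below performs that arithmetic directly; `demo_call_site` and `demo_fill_mcount_loc` are hypothetical names and no ELF handling is attempted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one recorded __fentry__ call site. */
struct demo_call_site {
	unsigned long sec_base;	/* address the section symbol resolves to */
	unsigned long offset;	/* offset of the call within that section */
};

/*
 * Fill table[i] with sec_base + offset, i.e. the value an absolute
 * relocation (symbol value plus addend) would resolve to.
 * Returns the number of entries written.
 */
static size_t demo_fill_mcount_loc(const struct demo_call_site *sites,
				   size_t n, unsigned long *table)
{
	size_t i;

	for (i = 0; i < n; i++)
		table[i] = sites[i].sec_base + sites[i].offset;
	return n;
}
```

The real pass makes two walks over the list, one to count entries so the section can be sized and one to emit relocations; the sketch collapses this because the caller already knows `n`.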
* Re: [PATCH v2 05/28] objtool: Add a pass for generating __mcount_loc 2020-09-03 20:30 ` [PATCH v2 05/28] objtool: Add a pass for generating __mcount_loc Sami Tolvanen @ 2020-09-03 21:51 ` Kees Cook 2020-09-03 22:03 ` Sami Tolvanen 0 siblings, 1 reply; 212+ messages in thread From: Kees Cook @ 2020-09-03 21:51 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:30PM -0700, Sami Tolvanen wrote: > From: Peter Zijlstra <peterz@infradead.org> > > Add the --mcount option for generating __mcount_loc sections > needed for dynamic ftrace. Using this pass requires the kernel to > be compiled with -mfentry and CC_USING_NOP_MCOUNT to be defined > in Makefile. > > Link: https://lore.kernel.org/lkml/20200625200235.GQ4781@hirez.programming.kicks-ass.net/ > Signed-off-by: Peter Zijlstra <peterz@infradead.org> Hmm, I'm not sure why this hasn't gotten picked up yet. Is this expected to go through -tip or something else? Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 05/28] objtool: Add a pass for generating __mcount_loc 2020-09-03 21:51 ` Kees Cook @ 2020-09-03 22:03 ` Sami Tolvanen 2020-09-04 9:31 ` peterz 0 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 22:03 UTC (permalink / raw) To: Kees Cook Cc: linux-arch, X86 ML, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, LKML, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 3, 2020 at 2:51 PM Kees Cook <keescook@chromium.org> wrote: > > On Thu, Sep 03, 2020 at 01:30:30PM -0700, Sami Tolvanen wrote: > > From: Peter Zijlstra <peterz@infradead.org> > > > > Add the --mcount option for generating __mcount_loc sections > > needed for dynamic ftrace. Using this pass requires the kernel to > > be compiled with -mfentry and CC_USING_NOP_MCOUNT to be defined > > in Makefile. > > > > Link: https://lore.kernel.org/lkml/20200625200235.GQ4781@hirez.programming.kicks-ass.net/ > > Signed-off-by: Peter Zijlstra <peterz@infradead.org> > > Hmm, I'm not sure why this hasn't gotten picked up yet. Is this expected > to go through -tip or something else? Note that I picked up this patch from Peter's original email, to which I included a link in the commit message, but it wasn't officially submitted as a patch. However, the previous discussion seems to have died, so I included the patch in this series, as it cleanly solves the problem of whitelisting non-call references to __fentry__. I was hoping for Peter and Steven to comment on how they prefer to proceed here. Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 05/28] objtool: Add a pass for generating __mcount_loc 2020-09-03 22:03 ` Sami Tolvanen @ 2020-09-04 9:31 ` peterz 2020-09-10 18:29 ` Kees Cook 0 siblings, 1 reply; 212+ messages in thread From: peterz @ 2020-09-04 9:31 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, LKML, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 03:03:30PM -0700, Sami Tolvanen wrote: > On Thu, Sep 3, 2020 at 2:51 PM Kees Cook <keescook@chromium.org> wrote: > > > > On Thu, Sep 03, 2020 at 01:30:30PM -0700, Sami Tolvanen wrote: > > > From: Peter Zijlstra <peterz@infradead.org> > > > > > > Add the --mcount option for generating __mcount_loc sections > > > needed for dynamic ftrace. Using this pass requires the kernel to > > > be compiled with -mfentry and CC_USING_NOP_MCOUNT to be defined > > > in Makefile. > > > > > > Link: https://lore.kernel.org/lkml/20200625200235.GQ4781@hirez.programming.kicks-ass.net/ > > > Signed-off-by: Peter Zijlstra <peterz@infradead.org> > > > > Hmm, I'm not sure why this hasn't gotten picked up yet. Is this expected > > to go through -tip or something else? > > Note that I picked up this patch from Peter's original email, to which > I included a link in the commit message, but it wasn't officially > submitted as a patch. However, the previous discussion seems to have > died, so I included the patch in this series, as it cleanly solves the > problem of whitelisting non-call references to __fentry__. I was > hoping for Peter and Steven to comment on how they prefer to proceed > here. Right; so I'm obviously fine with this patch and I suppose I can pick it (and the next) into tip/objtool/core, provided Steve is okay with this approach. 
* Re: [PATCH v2 05/28] objtool: Add a pass for generating __mcount_loc 2020-09-04 9:31 ` peterz @ 2020-09-10 18:29 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-10 18:29 UTC (permalink / raw) To: Steven Rostedt Cc: linux-arch, X86 ML, Paul E. McKenney, Kernel Hardening, peterz, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, LKML, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, linux-arm-kernel On Fri, Sep 04, 2020 at 11:31:04AM +0200, peterz@infradead.org wrote: > On Thu, Sep 03, 2020 at 03:03:30PM -0700, Sami Tolvanen wrote: > > On Thu, Sep 3, 2020 at 2:51 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > On Thu, Sep 03, 2020 at 01:30:30PM -0700, Sami Tolvanen wrote: > > > > From: Peter Zijlstra <peterz@infradead.org> > > > > > > > > Add the --mcount option for generating __mcount_loc sections > > > > needed for dynamic ftrace. Using this pass requires the kernel to > > > > be compiled with -mfentry and CC_USING_NOP_MCOUNT to be defined > > > > in Makefile. > > > > > > > > Link: https://lore.kernel.org/lkml/20200625200235.GQ4781@hirez.programming.kicks-ass.net/ > > > > Signed-off-by: Peter Zijlstra <peterz@infradead.org> > > > > > > Hmm, I'm not sure why this hasn't gotten picked up yet. Is this expected > > > to go through -tip or something else? > > > > Note that I picked up this patch from Peter's original email, to which > > I included a link in the commit message, but it wasn't officially > > submitted as a patch. However, the previous discussion seems to have > > died, so I included the patch in this series, as it cleanly solves the > > problem of whitelisting non-call references to __fentry__. I was > > hoping for Peter and Steven to comment on how they prefer to proceed > > here. > > Right; so I'm obviously fine with this patch and I suppose I can pick it > (and the next) into tip/objtool/core, provided Steve is okay with this > approach. 
Hello Steven-of-the-future-after-4000-emails![1] ;) Getting your Ack on this would be very welcome, and would unblock a portion of this series. Thanks! :) [1] https://twitter.com/srostedt/status/1303697650592755712 -- Kees Cook
* [PATCH v2 06/28] objtool: Don't autodetect vmlinux.o 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (4 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 05/28] objtool: Add a pass for generating __mcount_loc Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 21:52 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 07/28] kbuild: add support for objtool mcount Sami Tolvanen ` (26 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With LTO, we run objtool on vmlinux.o, but don't want noinstr validation. This change requires --vmlinux to be passed to objtool explicitly. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- scripts/link-vmlinux.sh | 2 +- tools/objtool/builtin-check.c | 10 +--------- 2 files changed, 2 insertions(+), 10 deletions(-) diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index e6e2d9e5ff48..372c3719f94c 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -64,7 +64,7 @@ objtool_link() local objtoolopt; if [ -n "${CONFIG_VMLINUX_VALIDATION}" ]; then - objtoolopt="check" + objtoolopt="check --vmlinux" if [ -z "${CONFIG_FRAME_POINTER}" ]; then objtoolopt="${objtoolopt} --no-fp" fi diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c index 71595cf4946d..eaa06eb18690 100644 --- a/tools/objtool/builtin-check.c +++ b/tools/objtool/builtin-check.c @@ -41,18 +41,10 @@ const struct option check_options[] = { int cmd_check(int argc, const char **argv) { - const char *objname, *s; - argc = parse_options(argc, argv, check_options, check_usage, 0); if (argc != 1) usage_with_options(check_usage, 
check_options); - objname = argv[0]; - - s = strstr(objname, "vmlinux.o"); - if (s && !s[9]) - vmlinux = true; - - return check(objname, false); + return check(argv[0], false); } -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 06/28] objtool: Don't autodetect vmlinux.o 2020-09-03 20:30 ` [PATCH v2 06/28] objtool: Don't autodetect vmlinux.o Sami Tolvanen @ 2020-09-03 21:52 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 21:52 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:31PM -0700, Sami Tolvanen wrote: > With LTO, we run objtool on vmlinux.o, but don't want noinstr > validation. This change requires --vmlinux to be passed to objtool > explicitly. > > Suggested-by: Peter Zijlstra <peterz@infradead.org> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Looks right to me. Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
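For reference, the autodetection that patch 06/28 removes amounts to a suffix test on the object name. The sketch below restates that test as a standalone predicate so its behaviour is easy to see; `demo_looks_like_vmlinux` is a hypothetical name, and the logic mirrors the deleted `strstr()` lines rather than anything objtool does after the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * True when the first occurrence of "vmlinux.o" in objname is also the
 * end of the string. "vmlinux.o" is 9 characters long, so s[9] is the
 * character right after the match.
 */
static bool demo_looks_like_vmlinux(const char *objname)
{
	const char *s = strstr(objname, "vmlinux.o");

	return s && !s[9];
}
```

With LTO the file being processed is also named vmlinux.o, so a name-based guess would switch on vmlinux validation there too; an explicit `--vmlinux` flag leaves that decision to the build scripts.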
* [PATCH v2 07/28] kbuild: add support for objtool mcount 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (5 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 06/28] objtool: Don't autodetect vmlinux.o Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 21:56 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 08/28] x86, build: use " Sami Tolvanen ` (25 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel This change adds build support for using objtool to generate __mcount_loc sections. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- Makefile | 38 ++++++++++++++++++++++++++++++-------- kernel/trace/Kconfig | 5 +++++ scripts/Makefile.build | 9 +++++---- 3 files changed, 40 insertions(+), 12 deletions(-) diff --git a/Makefile b/Makefile index ff5e0731d26d..a9dae26c93b5 100644 --- a/Makefile +++ b/Makefile @@ -859,17 +859,34 @@ ifdef CONFIG_HAVE_FENTRY ifeq ($(call cc-option-yn, -mfentry),y) CC_FLAGS_FTRACE += -mfentry CC_FLAGS_USING += -DCC_USING_FENTRY + export CC_USING_FENTRY := 1 endif endif export CC_FLAGS_FTRACE -KBUILD_CFLAGS += $(CC_FLAGS_FTRACE) $(CC_FLAGS_USING) -KBUILD_AFLAGS += $(CC_FLAGS_USING) ifdef CONFIG_DYNAMIC_FTRACE - ifdef CONFIG_HAVE_C_RECORDMCOUNT - BUILD_C_RECORDMCOUNT := y - export BUILD_C_RECORDMCOUNT - endif + ifndef CC_USING_RECORD_MCOUNT + ifndef CC_USING_PATCHABLE_FUNCTION_ENTRY + # use objtool or recordmcount to generate mcount tables + ifdef CONFIG_HAVE_OBJTOOL_MCOUNT + ifdef CC_USING_FENTRY + USE_OBJTOOL_MCOUNT := y + CC_FLAGS_USING += -DCC_USING_NOP_MCOUNT + export USE_OBJTOOL_MCOUNT + endif + endif + ifndef USE_OBJTOOL_MCOUNT + USE_RECORDMCOUNT := y + export USE_RECORDMCOUNT + ifdef 
CONFIG_HAVE_C_RECORDMCOUNT + BUILD_C_RECORDMCOUNT := y + export BUILD_C_RECORDMCOUNT + endif + endif + endif + endif endif +KBUILD_CFLAGS += $(CC_FLAGS_FTRACE) $(CC_FLAGS_USING) +KBUILD_AFLAGS += $(CC_FLAGS_USING) endif # We trigger additional mismatches with less inlining @@ -1218,11 +1235,16 @@ uapi-asm-generic: PHONY += prepare-objtool prepare-resolve_btfids prepare-objtool: $(objtool_target) ifeq ($(SKIP_STACK_VALIDATION),1) +objtool-lib-prompt := "please install libelf-dev, libelf-devel or elfutils-libelf-devel" +ifdef USE_OBJTOOL_MCOUNT + @echo "error: Cannot generate __mcount_loc for CONFIG_DYNAMIC_FTRACE=y, $(objtool-lib-prompt)" >&2 + @false +endif ifdef CONFIG_UNWINDER_ORC - @echo "error: Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel" >&2 + @echo "error: Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, $(objtool-lib-prompt)" >&2 @false else - @echo "warning: Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel" >&2 + @echo "warning: Cannot use CONFIG_STACK_VALIDATION=y, $(objtool-lib-prompt)" >&2 endif endif diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index a4020c0b4508..b510af5b216c 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -56,6 +56,11 @@ config HAVE_C_RECORDMCOUNT help C version of recordmcount available? 
+config HAVE_OBJTOOL_MCOUNT + bool + help + Arch supports objtool --mcount + config TRACER_MAX_TRACE bool diff --git a/scripts/Makefile.build b/scripts/Makefile.build index a467b9323442..6ecf30c70ced 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -178,8 +178,7 @@ cmd_modversions_c = \ fi endif -ifdef CONFIG_FTRACE_MCOUNT_RECORD -ifndef CC_USING_RECORD_MCOUNT +ifdef USE_RECORDMCOUNT # compiler will not generate __mcount_loc use recordmcount or recordmcount.pl ifdef BUILD_C_RECORDMCOUNT ifeq ("$(origin RECORDMCOUNT_WARN)", "command line") @@ -206,8 +205,7 @@ recordmcount_source := $(srctree)/scripts/recordmcount.pl endif # BUILD_C_RECORDMCOUNT cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)), \ $(sub_cmd_record_mcount)) -endif # CC_USING_RECORD_MCOUNT -endif # CONFIG_FTRACE_MCOUNT_RECORD +endif # USE_RECORDMCOUNT ifdef CONFIG_STACK_VALIDATION ifneq ($(SKIP_STACK_VALIDATION),1) @@ -230,6 +228,9 @@ endif ifdef CONFIG_X86_SMAP objtool_args += --uaccess endif +ifdef USE_OBJTOOL_MCOUNT + objtool_args += --mcount +endif # 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 07/28] kbuild: add support for objtool mcount 2020-09-03 20:30 ` [PATCH v2 07/28] kbuild: add support for objtool mcount Sami Tolvanen @ 2020-09-03 21:56 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 21:56 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:32PM -0700, 'Sami Tolvanen' via Clang Built Linux wrote: > This change adds build support for using objtool to generate > __mcount_loc sections. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Looks right to me. (There is probably an argument to be made to do all of the tooling detection in the Kconfig, but that's a larger issue and orthogonal to this fix, IMO.) Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook
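The nested ifdef/ifndef logic that patch 07/28 adds to the top-level Makefile can be read as a small decision function. The sketch below is one reading of that hunk, written in C only for illustration; the enum, struct, and function names are hypothetical, and each field is annotated with the Makefile or Kconfig symbol it is assumed to represent.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_mcount_tool { DEMO_NONE, DEMO_OBJTOOL, DEMO_RECORDMCOUNT };

struct demo_cfg {
	bool dynamic_ftrace;			/* CONFIG_DYNAMIC_FTRACE */
	bool cc_record_mcount;			/* CC_USING_RECORD_MCOUNT */
	bool cc_patchable_function_entry;	/* CC_USING_PATCHABLE_FUNCTION_ENTRY */
	bool have_objtool_mcount;		/* CONFIG_HAVE_OBJTOOL_MCOUNT */
	bool cc_fentry;				/* CC_USING_FENTRY */
};

static enum demo_mcount_tool demo_pick_tool(const struct demo_cfg *c)
{
	if (!c->dynamic_ftrace)
		return DEMO_NONE;
	/* Neither USE_OBJTOOL_MCOUNT nor USE_RECORDMCOUNT is set in these cases. */
	if (c->cc_record_mcount || c->cc_patchable_function_entry)
		return DEMO_NONE;
	/* objtool is chosen only when the arch supports it and -mfentry is in use. */
	if (c->have_objtool_mcount && c->cc_fentry)
		return DEMO_OBJTOOL;
	/* Otherwise fall back to recordmcount (C or Perl variant). */
	return DEMO_RECORDMCOUNT;
}
```

Read this way, recordmcount stays the fallback and objtool takes over only when both of its preconditions hold, which matches the commit message's statement that the objtool pass requires `-mfentry`.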
* [PATCH v2 08/28] x86, build: use objtool mcount 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (6 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 07/28] kbuild: add support for objtool mcount Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 21:58 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 09/28] kbuild: add support for Clang LTO Sami Tolvanen ` (24 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel Select HAVE_OBJTOOL_MCOUNT if STACK_VALIDATION is selected to use objtool to generate __mcount_loc sections for dynamic ftrace with Clang and gcc <5. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 7101ac64bb20..6de2e5c0bdba 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -163,6 +163,7 @@ config X86 select HAVE_CMPXCHG_LOCAL select HAVE_CONTEXT_TRACKING if X86_64 select HAVE_C_RECORDMCOUNT + select HAVE_OBJTOOL_MCOUNT if STACK_VALIDATION select HAVE_DEBUG_KMEMLEAK select HAVE_DMA_CONTIGUOUS select HAVE_DYNAMIC_FTRACE -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 08/28] x86, build: use objtool mcount 2020-09-03 20:30 ` [PATCH v2 08/28] x86, build: use " Sami Tolvanen @ 2020-09-03 21:58 ` Kees Cook 2020-09-03 22:11 ` Sami Tolvanen 0 siblings, 1 reply; 212+ messages in thread From: Kees Cook @ 2020-09-03 21:58 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:33PM -0700, Sami Tolvanen wrote: > Select HAVE_OBJTOOL_MCOUNT if STACK_VALIDATION is selected to use > objtool to generate __mcount_loc sections for dynamic ftrace with > Clang and gcc <5. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Am I right to understand that this fixes mcount for Clang generally (i.e. it's not _strictly_ related to LTO, though LTO depends on this change)? And does this mean mcount was working for gcc < 5? Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 08/28] x86, build: use objtool mcount 2020-09-03 21:58 ` Kees Cook @ 2020-09-03 22:11 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 22:11 UTC (permalink / raw) To: Kees Cook Cc: linux-arch, X86 ML, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, LKML, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 3, 2020 at 2:58 PM Kees Cook <keescook@chromium.org> wrote: > > On Thu, Sep 03, 2020 at 01:30:33PM -0700, Sami Tolvanen wrote: > > Select HAVE_OBJTOOL_MCOUNT if STACK_VALIDATION is selected to use > > objtool to generate __mcount_loc sections for dynamic ftrace with > > Clang and gcc <5. > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > Am I right to understand that this fixes mcount for Clang generally > (i.e. it's not _strictly_ related to LTO, though LTO depends on this > change)? No, this works fine with Clang when LTO is disabled, because recordmcount ignores files named "ftrace.o". However, with LTO, we process vmlinux.o instead, so we need a different method of ignoring __fentry__ relocations that are not calls. In v1, I used a function attribute to whitelist functions that refer to __fentry__, but as Peter pointed out back then, objtool already knows where the call sites are, so using it to generate __mcount_loc is cleaner. > And does this mean mcount was working for gcc < 5? Yes. I should have been clearer in the commit message. The reason I mentioned gcc <5 is that later gcc versions support -mrecord-mcount, which means they don't need an external tool for generating __mcount_loc anymore. Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 09/28] kbuild: add support for Clang LTO
From: Sami Tolvanen @ 2020-09-03 20:30 UTC
To: Masahiro Yamada, Will Deacon
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

This change adds build system support for Clang's Link Time
Optimization (LTO). With -flto, instead of ELF object files, Clang
produces LLVM bitcode, which is compiled into native code at link
time, allowing the final binary to be optimized globally. For more
details, see:

  https://llvm.org/docs/LinkTimeOptimization.html

The Kconfig option CONFIG_LTO_CLANG is implemented as a choice,
which defaults to LTO being disabled. To use LTO, the architecture
must select ARCH_SUPPORTS_LTO_CLANG and support:

  - compiling with Clang,
  - compiling inline assembly with Clang's integrated assembler,
  - and linking with LLD.

While using full LTO results in the best runtime performance, the
compilation is not scalable in time or memory. CONFIG_THINLTO
enables ThinLTO, which allows parallel optimization and faster
incremental builds. ThinLTO is used by default if the architecture
also selects ARCH_SUPPORTS_THINLTO:

  https://clang.llvm.org/docs/ThinLTO.html

To enable LTO, LLVM tools must be used to handle bitcode files. The
easiest way is to pass the LLVM=1 option to make:

  $ make LLVM=1 defconfig
  $ scripts/config -e LTO_CLANG
  $ make LLVM=1

Alternatively, at least the following LLVM tools must be used:

  CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm

To prepare for LTO support with other compilers, common parts are
gated behind the CONFIG_LTO option, and LTO can be disabled for
specific files by filtering out CC_FLAGS_LTO.

Note that support for DYNAMIC_FTRACE and MODVERSIONS is added in
follow-up patches.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 Makefile                          | 18 +++++++-
 arch/Kconfig                      | 68 +++++++++++++++++++++++++++++++
 include/asm-generic/vmlinux.lds.h | 11 +++--
 scripts/Makefile.build            |  9 +++-
 scripts/Makefile.modfinal         |  9 +++-
 scripts/Makefile.modpost          | 24 ++++++++++-
 scripts/link-vmlinux.sh           | 32 +++++++++++----
 7 files changed, 154 insertions(+), 17 deletions(-)

diff --git a/Makefile b/Makefile
index a9dae26c93b5..dd49eaea7c25 100644
--- a/Makefile
+++ b/Makefile
@@ -909,6 +909,22 @@ KBUILD_CFLAGS += $(CC_FLAGS_SCS)
 export CC_FLAGS_SCS
 endif
 
+ifdef CONFIG_LTO_CLANG
+ifdef CONFIG_THINLTO
+CC_FLAGS_LTO_CLANG := -flto=thin -fsplit-lto-unit
+KBUILD_LDFLAGS += --thinlto-cache-dir=.thinlto-cache
+else
+CC_FLAGS_LTO_CLANG := -flto
+endif
+CC_FLAGS_LTO_CLANG += -fvisibility=default
+endif
+
+ifdef CONFIG_LTO
+CC_FLAGS_LTO := $(CC_FLAGS_LTO_CLANG)
+KBUILD_CFLAGS += $(CC_FLAGS_LTO)
+export CC_FLAGS_LTO
+endif
+
 ifdef CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_32B
 KBUILD_CFLAGS += -falign-functions=32
 endif
@@ -1499,7 +1515,7 @@ MRPROPER_FILES += include/config include/generated \
 	*.spec
 
 # Directories & files removed with 'make distclean'
-DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS
+DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS .thinlto-cache
 
 # clean - Delete most, but leave enough to build external modules
 #
diff --git a/arch/Kconfig b/arch/Kconfig
index af14a567b493..11bb2f48dfe8 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -552,6 +552,74 @@ config SHADOW_CALL_STACK
 	  reading and writing arbitrary memory may be able to locate them
 	  and hijack control flow by modifying the stacks.
 
+config LTO
+	bool
+
+config ARCH_SUPPORTS_LTO_CLANG
+	bool
+	help
+	  An architecture should select this option if it supports:
+	  - compiling with Clang,
+	  - compiling inline assembly with Clang's integrated assembler,
+	  - and linking with LLD.
+
+config ARCH_SUPPORTS_THINLTO
+	bool
+	help
+	  An architecture should select this option if it supports Clang's
+	  ThinLTO.
+
+config THINLTO
+	bool "Clang ThinLTO"
+	depends on LTO_CLANG && ARCH_SUPPORTS_THINLTO
+	default y
+	help
+	  This option enables Clang's ThinLTO, which allows for parallel
+	  optimization and faster incremental compiles. More information
+	  can be found from Clang's documentation:
+
+	    https://clang.llvm.org/docs/ThinLTO.html
+
+choice
+	prompt "Link Time Optimization (LTO)"
+	default LTO_NONE
+	help
+	  This option enables Link Time Optimization (LTO), which allows the
+	  compiler to optimize binaries globally.
+
+	  If unsure, select LTO_NONE.
+
+config LTO_NONE
+	bool "None"
+
+config LTO_CLANG
+	bool "Clang's Link Time Optimization (EXPERIMENTAL)"
+	# Clang >= 11: https://github.com/ClangBuiltLinux/linux/issues/510
+	depends on CC_IS_CLANG && CLANG_VERSION >= 110000 && LD_IS_LLD
+	depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm)
+	depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm)
+	depends on ARCH_SUPPORTS_LTO_CLANG
+	depends on !FTRACE_MCOUNT_RECORD
+	depends on !KASAN
+	depends on !GCOV_KERNEL
+	depends on !MODVERSIONS
+	select LTO
+	help
+	  This option enables Clang's Link Time Optimization (LTO), which
+	  allows the compiler to optimize the kernel globally. If you enable
+	  this option, the compiler generates LLVM bitcode instead of ELF
+	  object files, and the actual compilation from bitcode happens at
+	  the LTO link step, which may take several minutes depending on the
+	  kernel configuration. More information can be found from LLVM's
+	  documentation:
+
+	    https://llvm.org/docs/LinkTimeOptimization.html
+
+	  To select this option, you also need to use LLVM tools to handle
+	  the bitcode by passing LLVM=1 to make.
+
+endchoice
+
 config HAVE_ARCH_WITHIN_STACK_FRAMES
 	bool
 	help
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 5430febd34be..c1f0d58272bd 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -89,15 +89,18 @@
  * .data. We don't want to pull in .data..other sections, which Linux
  * has defined. Same for text and bss.
  *
+ * With LTO_CLANG, the linker also splits sections by default, so we need
+ * these macros to combine the sections during the final link.
+ *
  * RODATA_MAIN is not used because existing code already defines .rodata.x
  * sections to be brought in with rodata.
  */
-#ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
+#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG)
 #define TEXT_MAIN .text .text.[0-9a-zA-Z_]*
-#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..LPBX*
+#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..L* .data..compoundliteral*
 #define SDATA_MAIN .sdata .sdata.[0-9a-zA-Z_]*
-#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]*
-#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]*
+#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]* .rodata..L*
+#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]* .bss..compoundliteral*
 #define SBSS_MAIN .sbss .sbss.[0-9a-zA-Z_]*
 #else
 #define TEXT_MAIN .text
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 6ecf30c70ced..a5f4b5d407e6 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -111,7 +111,7 @@ endif
 # ---------------------------------------------------------------------------
 
 quiet_cmd_cc_s_c = CC $(quiet_modtag) $@
-      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS), $(c_flags)) $(DISABLE_LTO) -fverbose-asm -S -o $@ $<
+      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS) $(CC_FLAGS_LTO), $(c_flags)) -fverbose-asm -S -o $@ $<
 
 $(obj)/%.s: $(src)/%.c FORCE
 	$(call if_changed_dep,cc_s_c)
@@ -428,8 +428,15 @@ $(obj)/lib.a: $(lib-y) FORCE
 # Do not replace $(filter %.o,^) with $(real-prereqs). When a single object
 # module is turned into a multi object module, $^ will contain header file
 # dependencies recorded in the .*.cmd file.
+ifdef CONFIG_LTO_CLANG
+quiet_cmd_link_multi-m = AR [M] $@
+cmd_link_multi-m = \
+	rm -f $@; \
+	$(AR) rcsTP$(KBUILD_ARFLAGS) $@ $(filter %.o,$^)
+else
 quiet_cmd_link_multi-m = LD [M] $@
 cmd_link_multi-m = $(LD) $(ld_flags) -r -o $@ $(filter %.o,$^)
+endif
 
 $(multi-used-m): FORCE
 	$(call if_changed,link_multi-m)
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index 411c1e600e7d..1005b147abd0 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -6,6 +6,7 @@
 PHONY := __modfinal
 __modfinal:
 
+include $(objtree)/include/config/auto.conf
 include $(srctree)/scripts/Kbuild.include
 
 # for c_flags
@@ -29,6 +30,12 @@ quiet_cmd_cc_o_c = CC [M] $@
 
 ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
 
+ifdef CONFIG_LTO_CLANG
+# With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to
+# avoid a second slow LTO link
+prelink-ext := .lto
+endif
+
 quiet_cmd_ld_ko_o = LD [M] $@
       cmd_ld_ko_o = \
 	$(LD) -r $(KBUILD_LDFLAGS) \
@@ -37,7 +44,7 @@ quiet_cmd_ld_ko_o = LD [M] $@
 	-o $@ $(filter %.o, $^); \
 	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
-$(modules): %.ko: %.o %.mod.o $(KBUILD_LDS_MODULE) FORCE
+$(modules): %.ko: %$(prelink-ext).o %.mod.o $(KBUILD_LDS_MODULE) FORCE
 	+$(call if_changed,ld_ko_o)
 
 targets += $(modules) $(modules:.ko=.mod.o)
diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
index f54b6ac37ac2..a70f1f7da6aa 100644
--- a/scripts/Makefile.modpost
+++ b/scripts/Makefile.modpost
@@ -102,12 +102,32 @@ $(input-symdump):
 	@echo >&2 'WARNING: Symbol version dump "$@" is missing.'
 	@echo >&2 ' Modules may not have dependencies or modversions.'
 
+ifdef CONFIG_LTO_CLANG
+# With CONFIG_LTO_CLANG, .o files might be LLVM bitcode, so we need to run
+# LTO to compile them into native code before running modpost
+prelink-ext = .lto
+
+quiet_cmd_cc_lto_link_modules = LTO [M] $@
+cmd_cc_lto_link_modules = \
+	$(LD) $(ld_flags) -r -o $@ \
+		--whole-archive $(filter-out FORCE,$^)
+
+%.lto.o: %.o FORCE
+	$(call if_changed,cc_lto_link_modules)
+
+PHONY += FORCE
+FORCE:
+
+endif
+
+modules := $(sort $(shell cat $(MODORDER)))
+
 # Read out modules.order to pass in modpost.
 # Otherwise, allmodconfig would fail with "Argument list too long".
 quiet_cmd_modpost = MODPOST $@
-      cmd_modpost = sed 's/ko$$/o/' $< | $(MODPOST) -T -
+      cmd_modpost = sed 's/\.ko$$/$(prelink-ext)\.o/' $< | $(MODPOST) -T -
 
-$(output-symdump): $(MODORDER) $(input-symdump) FORCE
+$(output-symdump): $(MODORDER) $(input-symdump) $(modules:.ko=$(prelink-ext).o) FORCE
 	$(call if_changed,modpost)
 
 targets += $(output-symdump)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 372c3719f94c..ebb9f912aab6 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -56,6 +56,14 @@ modpost_link()
 		${KBUILD_VMLINUX_LIBS} \
 		--end-group"
 
+	if [ -n "${CONFIG_LTO_CLANG}" ]; then
+		# This might take a while, so indicate that we're doing
+		# an LTO link
+		info LTO ${1}
+	else
+		info LD ${1}
+	fi
+
 	${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${objects}
 }
 
@@ -103,13 +111,22 @@ vmlinux_link()
 	fi
 
 	if [ "${SRCARCH}" != "um" ]; then
-		objects="--whole-archive \
-			${KBUILD_VMLINUX_OBJS} \
-			--no-whole-archive \
-			--start-group \
-			${KBUILD_VMLINUX_LIBS} \
-			--end-group \
-			${@}"
+		if [ -n "${CONFIG_LTO_CLANG}" ]; then
+			# Use vmlinux.o instead of performing the slow LTO
+			# link again.
+			objects="--whole-archive \
+				vmlinux.o \
+				--no-whole-archive \
+				${@}"
+		else
+			objects="--whole-archive \
+				${KBUILD_VMLINUX_OBJS} \
+				--no-whole-archive \
+				--start-group \
+				${KBUILD_VMLINUX_LIBS} \
+				--end-group \
+				${@}"
+		fi
 
 		${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux} \
 			${strip_debug#-Wl,} \
@@ -274,7 +291,6 @@ fi;
 ${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init need-builtin=1
 
 #link vmlinux.o
-info LD vmlinux.o
 modpost_link vmlinux.o
 objtool_link vmlinux.o
 
--
2.28.0.402.g5ffc5be6b7-goog
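The LTO_CLANG Kconfig entry in this patch decides whether NM and AR are LLVM tools by looking at the first line of their --help output. The same probe can be tried by hand before configuring; the sketch below simply wraps it in a shell function. The name is_llvm_tool is made up for illustration, and the check is only as reliable as the Kconfig probe it copies.

```shell
#!/bin/sh
# Hypothetical helper mirroring the $(success,...) probes in the LTO_CLANG
# Kconfig entry: succeed if the first line of "<tool> --help" mentions
# "llvm" (case-insensitive), fail otherwise.
#   $1: tool to probe, for example the value of NM or AR
is_llvm_tool() {
	"$1" --help 2>/dev/null | head -n 1 | grep -qi llvm
}
```

A call such as `is_llvm_tool "${NM:-nm}" || echo "NM does not look like an LLVM tool"` would then explain why LTO_CLANG stays hidden in menuconfig when the build is not started with LLVM=1.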
* Re: [PATCH v2 09/28] kbuild: add support for Clang LTO
From: Kees Cook @ 2020-09-03 22:08 UTC
To: Sami Tolvanen
Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

On Thu, Sep 03, 2020 at 01:30:34PM -0700, Sami Tolvanen wrote:
> This change adds build system support for Clang's Link Time
> Optimization (LTO). With -flto, instead of ELF object files, Clang
> produces LLVM bitcode, which is compiled into native code at link
> time, allowing the final binary to be optimized globally. For more
> details, see:
>
>   https://llvm.org/docs/LinkTimeOptimization.html
>
> The Kconfig option CONFIG_LTO_CLANG is implemented as a choice,
> which defaults to LTO being disabled. To use LTO, the architecture
> must select ARCH_SUPPORTS_LTO_CLANG and support:
>
>   - compiling with Clang,
>   - compiling inline assembly with Clang's integrated assembler,
>   - and linking with LLD.
>
> While using full LTO results in the best runtime performance, the
> compilation is not scalable in time or memory. CONFIG_THINLTO
> enables ThinLTO, which allows parallel optimization and faster
> incremental builds. ThinLTO is used by default if the architecture
> also selects ARCH_SUPPORTS_THINLTO:
>
>   https://clang.llvm.org/docs/ThinLTO.html
>
> To enable LTO, LLVM tools must be used to handle bitcode files. The
> easiest way is to pass the LLVM=1 option to make:
>
>   $ make LLVM=1 defconfig
>   $ scripts/config -e LTO_CLANG
>   $ make LLVM=1
>
> Alternatively, at least the following LLVM tools must be used:
>
>   CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm
>
> To prepare for LTO support with other compilers, common parts are
> gated behind the CONFIG_LTO option, and LTO can be disabled for
> specific files by filtering out CC_FLAGS_LTO.
>
> Note that support for DYNAMIC_FTRACE and MODVERSIONS is added in
> follow-up patches.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>

I remain crazy excited about being able to use this in upstream. :)

The only suggestion I have here, if it might help with clarity, would be
to remove DISABLE_LTO globally as a separate patch, since it's entirely
unused in the kernel right now. This series removes it as it goes, which
I think is fine, but it might cause some reviewers to ponder "what's
this DISABLE_LTO thing? Don't we need that?" without realizing it's
currently unused in the kernel.

I'm glad to see the general CONFIG_LTO, as I think it should be easy for
GCC LTO support to get added when someone steps up to do it. The bulk of
the changes needed to support GCC LTO are part of this series already,
since the build problems involving non-ELF .o files and init ordering
are shared by Clang and GCC AFAICT.

Reviewed-by: Kees Cook <keescook@chromium.org>

--
Kees Cook
* Re: [PATCH v2 09/28] kbuild: add support for Clang LTO
From: Sami Tolvanen @ 2020-09-08 17:02 UTC
To: Kees Cook
Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

On Thu, Sep 03, 2020 at 03:08:59PM -0700, Kees Cook wrote:
> On Thu, Sep 03, 2020 at 01:30:34PM -0700, Sami Tolvanen wrote:
> > This change adds build system support for Clang's Link Time
> > Optimization (LTO). With -flto, instead of ELF object files, Clang
> > produces LLVM bitcode, which is compiled into native code at link
> > time, allowing the final binary to be optimized globally. For more
> > details, see:
> >
> >   https://llvm.org/docs/LinkTimeOptimization.html
> >
> > The Kconfig option CONFIG_LTO_CLANG is implemented as a choice,
> > which defaults to LTO being disabled. To use LTO, the architecture
> > must select ARCH_SUPPORTS_LTO_CLANG and support:
> >
> >   - compiling with Clang,
> >   - compiling inline assembly with Clang's integrated assembler,
> >   - and linking with LLD.
> >
> > While using full LTO results in the best runtime performance, the
> > compilation is not scalable in time or memory. CONFIG_THINLTO
> > enables ThinLTO, which allows parallel optimization and faster
> > incremental builds. ThinLTO is used by default if the architecture
> > also selects ARCH_SUPPORTS_THINLTO:
> >
> >   https://clang.llvm.org/docs/ThinLTO.html
> >
> > To enable LTO, LLVM tools must be used to handle bitcode files. The
> > easiest way is to pass the LLVM=1 option to make:
> >
> >   $ make LLVM=1 defconfig
> >   $ scripts/config -e LTO_CLANG
> >   $ make LLVM=1
> >
> > Alternatively, at least the following LLVM tools must be used:
> >
> >   CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm
> >
> > To prepare for LTO support with other compilers, common parts are
> > gated behind the CONFIG_LTO option, and LTO can be disabled for
> > specific files by filtering out CC_FLAGS_LTO.
> >
> > Note that support for DYNAMIC_FTRACE and MODVERSIONS is added in
> > follow-up patches.
> >
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
>
> I remain crazy excited about being able to use this in upstream. :)
>
> The only suggestion I have here, if it might help with clarity, would be
> to remove DISABLE_LTO globally as a separate patch, since it's entirely
> unused in the kernel right now. This series removes it as it goes, which
> I think is fine, but it might cause some reviewers to ponder "what's
> this DISABLE_LTO thing? Don't we need that?" without realizing it's
> currently unused in the kernel.

Sure, that makes sense. I'll add a patch to remove DISABLE_LTO treewide
in v3.

Sami
* Re: [PATCH v2 09/28] kbuild: add support for Clang LTO
From: Masahiro Yamada @ 2020-09-05 19:36 UTC
To: Sami Tolvanen
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

On Fri, Sep 4, 2020 at 5:31 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> This change adds build system support for Clang's Link Time
> Optimization (LTO). With -flto, instead of ELF object files, Clang
> produces LLVM bitcode, which is compiled into native code at link
> time, allowing the final binary to be optimized globally. For more
> details, see:
>
>   https://llvm.org/docs/LinkTimeOptimization.html
>
> The Kconfig option CONFIG_LTO_CLANG is implemented as a choice,
> which defaults to LTO being disabled.

What is the reason for doing this in a choice?
To turn off LTO_CLANG for compile-testing?

I would rather want to give LTO_CLANG more chances
to be enabled/tested.

> To use LTO, the architecture
> must select ARCH_SUPPORTS_LTO_CLANG and support:
>
>   - compiling with Clang,
>   - compiling inline assembly with Clang's integrated assembler,
>   - and linking with LLD.
>
> While using full LTO results in the best runtime performance, the
> compilation is not scalable in time or memory. CONFIG_THINLTO
> enables ThinLTO, which allows parallel optimization and faster
> incremental builds. ThinLTO is used by default if the architecture
> also selects ARCH_SUPPORTS_THINLTO:
>
>   https://clang.llvm.org/docs/ThinLTO.html
>
> To enable LTO, LLVM tools must be used to handle bitcode files. The
> easiest way is to pass the LLVM=1 option to make:
>
>   $ make LLVM=1 defconfig
>   $ scripts/config -e LTO_CLANG
>   $ make LLVM=1
>
> Alternatively, at least the following LLVM tools must be used:
>
>   CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm
>
> To prepare for LTO support with other compilers, common parts are
> gated behind the CONFIG_LTO option, and LTO can be disabled for
> specific files by filtering out CC_FLAGS_LTO.
>
> Note that support for DYNAMIC_FTRACE and MODVERSIONS is added in
> follow-up patches.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  Makefile                          | 18 +++++++-
>  arch/Kconfig                      | 68 +++++++++++++++++++++++++++++++
>  include/asm-generic/vmlinux.lds.h | 11 +++--
>  scripts/Makefile.build            |  9 +++-
>  scripts/Makefile.modfinal         |  9 +++-
>  scripts/Makefile.modpost          | 24 ++++++++++-
>  scripts/link-vmlinux.sh           | 32 +++++++++++----
>  7 files changed, 154 insertions(+), 17 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index a9dae26c93b5..dd49eaea7c25 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -909,6 +909,22 @@ KBUILD_CFLAGS += $(CC_FLAGS_SCS)
>  export CC_FLAGS_SCS
>  endif
>
> +ifdef CONFIG_LTO_CLANG
> +ifdef CONFIG_THINLTO
> +CC_FLAGS_LTO_CLANG := -flto=thin -fsplit-lto-unit
> +KBUILD_LDFLAGS += --thinlto-cache-dir=.thinlto-cache

I think this would break external module builds
because it would create cache files in the
kernel source tree.

External module builds should never ever touch
the kernel tree, which is usually located under
the read-only /usr/src/ in distros.

.thinlto-cache should be created in the module tree
when it is built with M=.

> +else
> +CC_FLAGS_LTO_CLANG := -flto
> +endif
> +CC_FLAGS_LTO_CLANG += -fvisibility=default
> +endif
> +
> +ifdef CONFIG_LTO
> +CC_FLAGS_LTO := $(CC_FLAGS_LTO_CLANG)
> +KBUILD_CFLAGS += $(CC_FLAGS_LTO)
> +export CC_FLAGS_LTO
> +endif
> +
>  ifdef CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_32B
>  KBUILD_CFLAGS += -falign-functions=32
>  endif
> @@ -1499,7 +1515,7 @@ MRPROPER_FILES += include/config include/generated \
>  	*.spec
>
>  # Directories & files removed with 'make distclean'
> -DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS
> +DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS .thinlto-cache

This was suggested in v1, but I could not understand
why doing this in distclean was appropriate.

Is keeping cache files of kernel objects
useful for external module builds?

Also, please clean up .thinlto-cache for external module builds.

>
>  # clean - Delete most, but leave enough to build external modules
>  #
> diff --git a/arch/Kconfig b/arch/Kconfig
> index af14a567b493..11bb2f48dfe8 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -552,6 +552,74 @@ config SHADOW_CALL_STACK
>  	  reading and writing arbitrary memory may be able to locate them
>  	  and hijack control flow by modifying the stacks.
>
> +config LTO
> +	bool
> +
> +config ARCH_SUPPORTS_LTO_CLANG
> +	bool
> +	help
> +	  An architecture should select this option if it supports:
> +	  - compiling with Clang,
> +	  - compiling inline assembly with Clang's integrated assembler,
> +	  - and linking with LLD.
> +
> +config ARCH_SUPPORTS_THINLTO
> +	bool
> +	help
> +	  An architecture should select this option if it supports Clang's
> +	  ThinLTO.
> +
> +config THINLTO
> +	bool "Clang ThinLTO"
> +	depends on LTO_CLANG && ARCH_SUPPORTS_THINLTO
> +	default y
> +	help
> +	  This option enables Clang's ThinLTO, which allows for parallel
> +	  optimization and faster incremental compiles. More information
> +	  can be found from Clang's documentation:
> +
> +	    https://clang.llvm.org/docs/ThinLTO.html
> +
> +choice
> +	prompt "Link Time Optimization (LTO)"
> +	default LTO_NONE
> +	help
> +	  This option enables Link Time Optimization (LTO), which allows the
> +	  compiler to optimize binaries globally.
> +
> +	  If unsure, select LTO_NONE.
> +
> +config LTO_NONE
> +	bool "None"
> +
> +config LTO_CLANG
> +	bool "Clang's Link Time Optimization (EXPERIMENTAL)"
> +	# Clang >= 11: https://github.com/ClangBuiltLinux/linux/issues/510
> +	depends on CC_IS_CLANG && CLANG_VERSION >= 110000 && LD_IS_LLD
> +	depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm)
> +	depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm)
> +	depends on ARCH_SUPPORTS_LTO_CLANG
> +	depends on !FTRACE_MCOUNT_RECORD
> +	depends on !KASAN
> +	depends on !GCOV_KERNEL
> +	depends on !MODVERSIONS
> +	select LTO
> +	help
> +	  This option enables Clang's Link Time Optimization (LTO), which
> +	  allows the compiler to optimize the kernel globally. If you enable
> +	  this option, the compiler generates LLVM bitcode instead of ELF
> +	  object files, and the actual compilation from bitcode happens at
> +	  the LTO link step, which may take several minutes depending on the
> +	  kernel configuration. More information can be found from LLVM's
> +	  documentation:
> +
> +	    https://llvm.org/docs/LinkTimeOptimization.html
> +
> +	  To select this option, you also need to use LLVM tools to handle
> +	  the bitcode by passing LLVM=1 to make.
> +
> +endchoice
> +
>  config HAVE_ARCH_WITHIN_STACK_FRAMES
>  	bool
>  	help
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 5430febd34be..c1f0d58272bd 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -89,15 +89,18 @@
>   * .data. We don't want to pull in .data..other sections, which Linux
>   * has defined. Same for text and bss.
>   *
> + * With LTO_CLANG, the linker also splits sections by default, so we need
> + * these macros to combine the sections during the final link.
> + *
>   * RODATA_MAIN is not used because existing code already defines .rodata.x
>   * sections to be brought in with rodata.
>   */
> -#ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
> +#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG)
>  #define TEXT_MAIN .text .text.[0-9a-zA-Z_]*
> -#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..LPBX*
> +#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..L* .data..compoundliteral*
>  #define SDATA_MAIN .sdata .sdata.[0-9a-zA-Z_]*
> -#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]*
> -#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]*
> +#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]* .rodata..L*
> +#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]* .bss..compoundliteral*
>  #define SBSS_MAIN .sbss .sbss.[0-9a-zA-Z_]*
>  #else
>  #define TEXT_MAIN .text
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index 6ecf30c70ced..a5f4b5d407e6 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -111,7 +111,7 @@ endif
>  # ---------------------------------------------------------------------------
>
>  quiet_cmd_cc_s_c = CC $(quiet_modtag) $@
> -      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS), $(c_flags)) $(DISABLE_LTO) -fverbose-asm -S -o $@ $<
> +      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS) $(CC_FLAGS_LTO), $(c_flags)) -fverbose-asm -S -o $@ $<
>
>  $(obj)/%.s: $(src)/%.c FORCE
>  	$(call if_changed_dep,cc_s_c)
> @@ -428,8 +428,15 @@ $(obj)/lib.a: $(lib-y) FORCE
>  # Do not replace $(filter %.o,^) with $(real-prereqs). When a single object
>  # module is turned into a multi object module, $^ will contain header file
>  # dependencies recorded in the .*.cmd file.
> +ifdef CONFIG_LTO_CLANG
> +quiet_cmd_link_multi-m = AR [M] $@
> +cmd_link_multi-m = \
> +	rm -f $@; \
> +	$(AR) rcsTP$(KBUILD_ARFLAGS) $@ $(filter %.o,$^)
> +else
>  quiet_cmd_link_multi-m = LD [M] $@
>  cmd_link_multi-m = $(LD) $(ld_flags) -r -o $@ $(filter %.o,$^)
> +endif
>
>  $(multi-used-m): FORCE
>  	$(call if_changed,link_multi-m)
> diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
> index 411c1e600e7d..1005b147abd0 100644
> --- a/scripts/Makefile.modfinal
> +++ b/scripts/Makefile.modfinal
> @@ -6,6 +6,7 @@
>  PHONY := __modfinal
>  __modfinal:
>
> +include $(objtree)/include/config/auto.conf
>  include $(srctree)/scripts/Kbuild.include
>
>  # for c_flags
> @@ -29,6 +30,12 @@ quiet_cmd_cc_o_c = CC [M] $@
>
>  ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
>
> +ifdef CONFIG_LTO_CLANG
> +# With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to
> +# avoid a second slow LTO link
> +prelink-ext := .lto
> +endif
> +
>  quiet_cmd_ld_ko_o = LD [M] $@
>        cmd_ld_ko_o = \
>  	$(LD) -r $(KBUILD_LDFLAGS) \
> @@ -37,7 +44,7 @@ quiet_cmd_ld_ko_o = LD [M] $@
>  	-o $@ $(filter %.o, $^); \
>  	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
>
> -$(modules): %.ko: %.o %.mod.o $(KBUILD_LDS_MODULE) FORCE
> +$(modules): %.ko: %$(prelink-ext).o %.mod.o $(KBUILD_LDS_MODULE) FORCE
>  	+$(call if_changed,ld_ko_o)
>
>  targets += $(modules) $(modules:.ko=.mod.o)
> diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
> index f54b6ac37ac2..a70f1f7da6aa 100644
> --- a/scripts/Makefile.modpost
> +++ b/scripts/Makefile.modpost
> @@ -102,12 +102,32 @@ $(input-symdump):
>  	@echo >&2 'WARNING: Symbol version dump "$@" is missing.'
>  	@echo >&2 ' Modules may not have dependencies or modversions.'
>
> +ifdef CONFIG_LTO_CLANG
> +# With CONFIG_LTO_CLANG, .o files might be LLVM bitcode, so we need to run
> +# LTO to compile them into native code before running modpost
> +prelink-ext = .lto
> +
> +quiet_cmd_cc_lto_link_modules = LTO [M] $@
> +cmd_cc_lto_link_modules = \
> +	$(LD) $(ld_flags) -r -o $@ \
> +		--whole-archive $(filter-out FORCE,$^)
> +
> +%.lto.o: %.o FORCE
> +	$(call if_changed,cc_lto_link_modules)
> +
> +PHONY += FORCE
> +FORCE:
> +
> +endif
> +
> +modules := $(sort $(shell cat $(MODORDER)))
> +
>  # Read out modules.order to pass in modpost.
>  # Otherwise, allmodconfig would fail with "Argument list too long".
>  quiet_cmd_modpost = MODPOST $@
> -      cmd_modpost = sed 's/ko$$/o/' $< | $(MODPOST) -T -
> +      cmd_modpost = sed 's/\.ko$$/$(prelink-ext)\.o/' $< | $(MODPOST) -T -
>
> -$(output-symdump): $(MODORDER) $(input-symdump) FORCE
> +$(output-symdump): $(MODORDER) $(input-symdump) $(modules:.ko=$(prelink-ext).o) FORCE
>  	$(call if_changed,modpost)
>
>  targets += $(output-symdump)
> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> index 372c3719f94c..ebb9f912aab6 100755
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@ -56,6 +56,14 @@ modpost_link()
>  		${KBUILD_VMLINUX_LIBS} \
>  		--end-group"
>
> +	if [ -n "${CONFIG_LTO_CLANG}" ]; then
> +		# This might take a while, so indicate that we're doing
> +		# an LTO link
> +		info LTO ${1}
> +	else
> +		info LD ${1}
> +	fi
> +
>  	${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${objects}
>  }
>
> @@ -103,13 +111,22 @@ vmlinux_link()
>  	fi
>
>  	if [ "${SRCARCH}" != "um" ]; then
> -		objects="--whole-archive \
> -			${KBUILD_VMLINUX_OBJS} \
> -			--no-whole-archive \
> -			--start-group \
> -			${KBUILD_VMLINUX_LIBS} \
> -			--end-group \
> -			${@}"
> +		if [ -n "${CONFIG_LTO_CLANG}" ]; then
> +			# Use vmlinux.o instead of performing the slow LTO
> +			# link again.
> +			objects="--whole-archive \
> +				vmlinux.o \
> +				--no-whole-archive \
> +				${@}"
> +		else
> +			objects="--whole-archive \
> +				${KBUILD_VMLINUX_OBJS} \
> +				--no-whole-archive \
> +				--start-group \
> +				${KBUILD_VMLINUX_LIBS} \
> +				--end-group \
> +				${@}"
> +		fi
>
>  		${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux} \
>  			${strip_debug#-Wl,} \
> @@ -274,7 +291,6 @@ fi;
>  ${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init need-builtin=1
>
>  #link vmlinux.o
> -info LD vmlinux.o
>  modpost_link vmlinux.o
>  objtool_link vmlinux.o
>
> --
> 2.28.0.402.g5ffc5be6b7-goog
>

--
Best Regards
Masahiro Yamada
* Re: [PATCH v2 09/28] kbuild: add support for Clang LTO 2020-09-05 19:36 ` Masahiro Yamada @ 2020-09-08 17:10 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-08 17:10 UTC (permalink / raw) To: Masahiro Yamada Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Sun, Sep 06, 2020 at 04:36:32AM +0900, Masahiro Yamada wrote: > On Fri, Sep 4, 2020 at 5:31 AM Sami Tolvanen <samitolvanen@google.com> wrote: > > > > This change adds build system support for Clang's Link Time > > Optimization (LTO). With -flto, instead of ELF object files, Clang > > produces LLVM bitcode, which is compiled into native code at link > > time, allowing the final binary to be optimized globally. For more > > details, see: > > > > https://llvm.org/docs/LinkTimeOptimization.html > > > > The Kconfig option CONFIG_LTO_CLANG is implemented as a choice, > > which defaults to LTO being disabled. > > What is the reason for doing this in a choice? > To turn off LTO_CLANG for compile-testing? > > I would rather want to give LTO_CLANG more chances > to be enabled/tested. It's a choice to prevent LTO from being enabled by default with allyesconfig and allmodconfig. It would take hours to build these even on a fast computer, and probably days on older hardware. > > +ifdef CONFIG_LTO_CLANG > > +ifdef CONFIG_THINLTO > > +CC_FLAGS_LTO_CLANG := -flto=thin -fsplit-lto-unit > > +KBUILD_LDFLAGS += --thinlto-cache-dir=.thinlto-cache > > > I think this would break external module builds > because it would create cache files in the > kernel source tree. > > External module builds should never ever touch > the kernel tree, which is usually located under > the read-only /usr/src/ in distros. 
> > > .thinlto-cache should be created in the module tree > when it is built with M=. Thanks for pointing this out, I'll fix the path in v3. > > # Directories & files removed with 'make distclean' > > -DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS > > +DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS .thinlto-cache > > > > This was suggested in v1, but I could not understand > why doing this in distclean was appropriate. > > Is keeping cache files of kernel objects > useful for external module builds? No, the cache only speeds up incremental kernel builds. > Also, please clean up .thinlto-cache for external module builds. Ack. Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
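A minimal sketch of the cache placement discussed above, assuming the external module directory is passed in the way kbuild exposes it through KBUILD_EXTMOD (set from M=). The helper name thinlto_cache_dir is hypothetical and this is not necessarily the fix posted in a later version; it only illustrates keeping the ThinLTO cache out of a possibly read-only kernel tree.

```shell
#!/bin/sh
# Hypothetical helper, not part of the series.
# $1: external module directory (value of M= / KBUILD_EXTMOD),
#     empty for a normal in-tree kernel build.
thinlto_cache_dir() {
	if [ -n "$1" ]; then
		# External module build: keep the cache next to the module
		# sources (one trailing slash, if present, is stripped).
		printf '%s/.thinlto-cache\n' "${1%/}"
	else
		# Kernel build: the cache lives at the top of the object tree.
		printf '%s\n' ".thinlto-cache"
	fi
}
```

The result would then be handed to the linker, for example as --thinlto-cache-dir="$(thinlto_cache_dir "$KBUILD_EXTMOD")", and the same path is the one a clean target for external modules would have to remove.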
* Re: [PATCH v2 09/28] kbuild: add support for Clang LTO 2020-09-03 20:30 ` [PATCH v2 09/28] kbuild: add support for Clang LTO Sami Tolvanen 2020-09-03 22:08 ` Kees Cook 2020-09-05 19:36 ` Masahiro Yamada @ 2020-09-05 20:17 ` Masahiro Yamada 2020-09-08 17:14 ` Sami Tolvanen 2020-09-07 15:30 ` Masahiro Yamada 3 siblings, 1 reply; 212+ messages in thread From: Masahiro Yamada @ 2020-09-05 20:17 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Fri, Sep 4, 2020 at 5:31 AM Sami Tolvanen <samitolvanen@google.com> wrote: > > This change adds build system support for Clang's Link Time > Optimization (LTO). With -flto, instead of ELF object files, Clang > produces LLVM bitcode, which is compiled into native code at link > time, allowing the final binary to be optimized globally. For more > details, see: > > https://llvm.org/docs/LinkTimeOptimization.html > > The Kconfig option CONFIG_LTO_CLANG is implemented as a choice, > which defaults to LTO being disabled. To use LTO, the architecture > must select ARCH_SUPPORTS_LTO_CLANG and support: > > - compiling with Clang, > - compiling inline assembly with Clang's integrated assembler, > - and linking with LLD. > > While using full LTO results in the best runtime performance, the > compilation is not scalable in time or memory. CONFIG_THINLTO > enables ThinLTO, which allows parallel optimization and faster > incremental builds. ThinLTO is used by default if the architecture > also selects ARCH_SUPPORTS_THINLTO: > > https://clang.llvm.org/docs/ThinLTO.html > > To enable LTO, LLVM tools must be used to handle bitcode files. 
The > easiest way is to pass the LLVM=1 option to make: > > $ make LLVM=1 defconfig > $ scripts/config -e LTO_CLANG > $ make LLVM=1 > > Alternatively, at least the following LLVM tools must be used: > > CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm > > To prepare for LTO support with other compilers, common parts are > gated behind the CONFIG_LTO option, and LTO can be disabled for > specific files by filtering out CC_FLAGS_LTO. > > Note that support for DYNAMIC_FTRACE and MODVERSIONS are added in > follow-up patches. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > --- > Makefile | 18 +++++++- > arch/Kconfig | 68 +++++++++++++++++++++++++++++++ > include/asm-generic/vmlinux.lds.h | 11 +++-- > scripts/Makefile.build | 9 +++- > scripts/Makefile.modfinal | 9 +++- > scripts/Makefile.modpost | 24 ++++++++++- > scripts/link-vmlinux.sh | 32 +++++++++++---- > 7 files changed, 154 insertions(+), 17 deletions(-) > > diff --git a/Makefile b/Makefile > index a9dae26c93b5..dd49eaea7c25 100644 > --- a/Makefile > +++ b/Makefile > @@ -909,6 +909,22 @@ KBUILD_CFLAGS += $(CC_FLAGS_SCS) > export CC_FLAGS_SCS > endif > > +ifdef CONFIG_LTO_CLANG > +ifdef CONFIG_THINLTO > +CC_FLAGS_LTO_CLANG := -flto=thin -fsplit-lto-unit > +KBUILD_LDFLAGS += --thinlto-cache-dir=.thinlto-cache > +else > +CC_FLAGS_LTO_CLANG := -flto > +endif > +CC_FLAGS_LTO_CLANG += -fvisibility=default > +endif > + > +ifdef CONFIG_LTO > +CC_FLAGS_LTO := $(CC_FLAGS_LTO_CLANG) $(CC_FLAGS_LTO_CLANG) is not used elsewhere. Why didn't you add the flags to CC_FLAGS_LTO directly? Will it be useful if LTO_GCC is supported ? > +KBUILD_CFLAGS += $(CC_FLAGS_LTO) > +export CC_FLAGS_LTO > +endif -- Best Regards Masahiro Yamada _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 09/28] kbuild: add support for Clang LTO 2020-09-05 20:17 ` Masahiro Yamada @ 2020-09-08 17:14 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-08 17:14 UTC (permalink / raw) To: Masahiro Yamada Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Sun, Sep 06, 2020 at 05:17:32AM +0900, Masahiro Yamada wrote: > On Fri, Sep 4, 2020 at 5:31 AM Sami Tolvanen <samitolvanen@google.com> wrote: > > > > This change adds build system support for Clang's Link Time > > Optimization (LTO). With -flto, instead of ELF object files, Clang > > produces LLVM bitcode, which is compiled into native code at link > > time, allowing the final binary to be optimized globally. For more > > details, see: > > > > https://llvm.org/docs/LinkTimeOptimization.html > > > > The Kconfig option CONFIG_LTO_CLANG is implemented as a choice, > > which defaults to LTO being disabled. To use LTO, the architecture > > must select ARCH_SUPPORTS_LTO_CLANG and support: > > > > - compiling with Clang, > > - compiling inline assembly with Clang's integrated assembler, > > - and linking with LLD. > > > > While using full LTO results in the best runtime performance, the > > compilation is not scalable in time or memory. CONFIG_THINLTO > > enables ThinLTO, which allows parallel optimization and faster > > incremental builds. ThinLTO is used by default if the architecture > > also selects ARCH_SUPPORTS_THINLTO: > > > > https://clang.llvm.org/docs/ThinLTO.html > > > > To enable LTO, LLVM tools must be used to handle bitcode files. 
The > > easiest way is to pass the LLVM=1 option to make: > > > > $ make LLVM=1 defconfig > > $ scripts/config -e LTO_CLANG > > $ make LLVM=1 > > > > Alternatively, at least the following LLVM tools must be used: > > > > CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm > > > > To prepare for LTO support with other compilers, common parts are > > gated behind the CONFIG_LTO option, and LTO can be disabled for > > specific files by filtering out CC_FLAGS_LTO. > > > > Note that support for DYNAMIC_FTRACE and MODVERSIONS are added in > > follow-up patches. > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > --- > > Makefile | 18 +++++++- > > arch/Kconfig | 68 +++++++++++++++++++++++++++++++ > > include/asm-generic/vmlinux.lds.h | 11 +++-- > > scripts/Makefile.build | 9 +++- > > scripts/Makefile.modfinal | 9 +++- > > scripts/Makefile.modpost | 24 ++++++++++- > > scripts/link-vmlinux.sh | 32 +++++++++++---- > > 7 files changed, 154 insertions(+), 17 deletions(-) > > > > diff --git a/Makefile b/Makefile > > index a9dae26c93b5..dd49eaea7c25 100644 > > --- a/Makefile > > +++ b/Makefile > > @@ -909,6 +909,22 @@ KBUILD_CFLAGS += $(CC_FLAGS_SCS) > > export CC_FLAGS_SCS > > endif > > > > +ifdef CONFIG_LTO_CLANG > > +ifdef CONFIG_THINLTO > > +CC_FLAGS_LTO_CLANG := -flto=thin -fsplit-lto-unit > > +KBUILD_LDFLAGS += --thinlto-cache-dir=.thinlto-cache > > +else > > +CC_FLAGS_LTO_CLANG := -flto > > +endif > > +CC_FLAGS_LTO_CLANG += -fvisibility=default > > +endif > > + > > +ifdef CONFIG_LTO > > +CC_FLAGS_LTO := $(CC_FLAGS_LTO_CLANG) > > > $(CC_FLAGS_LTO_CLANG) is not used elsewhere. > > Why didn't you add the flags to CC_FLAGS_LTO > directly? > > Will it be useful if LTO_GCC is supported ? The idea was to allow compiler-specific LTO flags to be filtered out separately if needed, but you're right, this is not really necessary right now. I'll drop CC_FLAGS_LTO_CLANG in v3. 
Sami
* Re: [PATCH v2 09/28] kbuild: add support for Clang LTO 2020-09-03 20:30 ` [PATCH v2 09/28] kbuild: add support for Clang LTO Sami Tolvanen ` (2 preceding siblings ...) 2020-09-05 20:17 ` Masahiro Yamada @ 2020-09-07 15:30 ` Masahiro Yamada 2020-09-08 17:30 ` Sami Tolvanen 3 siblings, 1 reply; 212+ messages in thread From: Masahiro Yamada @ 2020-09-07 15:30 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Fri, Sep 4, 2020 at 5:31 AM Sami Tolvanen <samitolvanen@google.com> wrote: > > This change adds build system support for Clang's Link Time > Optimization (LTO). With -flto, instead of ELF object files, Clang > produces LLVM bitcode, which is compiled into native code at link > time, allowing the final binary to be optimized globally. For more > details, see: > > https://llvm.org/docs/LinkTimeOptimization.html > > The Kconfig option CONFIG_LTO_CLANG is implemented as a choice, > which defaults to LTO being disabled. To use LTO, the architecture > must select ARCH_SUPPORTS_LTO_CLANG and support: > > - compiling with Clang, > - compiling inline assembly with Clang's integrated assembler, > - and linking with LLD. > > While using full LTO results in the best runtime performance, the > compilation is not scalable in time or memory. CONFIG_THINLTO > enables ThinLTO, which allows parallel optimization and faster > incremental builds. ThinLTO is used by default if the architecture > also selects ARCH_SUPPORTS_THINLTO: > > https://clang.llvm.org/docs/ThinLTO.html > > To enable LTO, LLVM tools must be used to handle bitcode files. 
The > easiest way is to pass the LLVM=1 option to make: > > $ make LLVM=1 defconfig > $ scripts/config -e LTO_CLANG > $ make LLVM=1 > > Alternatively, at least the following LLVM tools must be used: > > CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm > > To prepare for LTO support with other compilers, common parts are > gated behind the CONFIG_LTO option, and LTO can be disabled for > specific files by filtering out CC_FLAGS_LTO. > > Note that support for DYNAMIC_FTRACE and MODVERSIONS are added in > follow-up patches. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > --- > Makefile | 18 +++++++- > arch/Kconfig | 68 +++++++++++++++++++++++++++++++ > include/asm-generic/vmlinux.lds.h | 11 +++-- > scripts/Makefile.build | 9 +++- > scripts/Makefile.modfinal | 9 +++- > scripts/Makefile.modpost | 24 ++++++++++- > scripts/link-vmlinux.sh | 32 +++++++++++---- > 7 files changed, 154 insertions(+), 17 deletions(-) > #define TEXT_MAIN .text > diff --git a/scripts/Makefile.build b/scripts/Makefile.build > index 6ecf30c70ced..a5f4b5d407e6 100644 > --- a/scripts/Makefile.build > +++ b/scripts/Makefile.build > @@ -111,7 +111,7 @@ endif > # --------------------------------------------------------------------------- > > quiet_cmd_cc_s_c = CC $(quiet_modtag) $@ > - cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS), $(c_flags)) $(DISABLE_LTO) -fverbose-asm -S -o $@ $< > + cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS) $(CC_FLAGS_LTO), $(c_flags)) -fverbose-asm -S -o $@ $< > > $(obj)/%.s: $(src)/%.c FORCE > $(call if_changed_dep,cc_s_c) > @@ -428,8 +428,15 @@ $(obj)/lib.a: $(lib-y) FORCE > # Do not replace $(filter %.o,^) with $(real-prereqs). When a single object > # module is turned into a multi object module, $^ will contain header file > # dependencies recorded in the .*.cmd file. 
> +ifdef CONFIG_LTO_CLANG > +quiet_cmd_link_multi-m = AR [M] $@ > +cmd_link_multi-m = \ > + rm -f $@; \ > + $(AR) rcsTP$(KBUILD_ARFLAGS) $@ $(filter %.o,$^) KBUILD_ARFLAGS no longer exists in the mainline. (commit 13dc8c029cabf52ba95f60c56eb104d4d95d5889) > +else > quiet_cmd_link_multi-m = LD [M] $@ > cmd_link_multi-m = $(LD) $(ld_flags) -r -o $@ $(filter %.o,$^) > +endif > > $(multi-used-m): FORCE > $(call if_changed,link_multi-m) > diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal > index 411c1e600e7d..1005b147abd0 100644 > --- a/scripts/Makefile.modfinal > +++ b/scripts/Makefile.modfinal > @@ -6,6 +6,7 @@ > PHONY := __modfinal > __modfinal: > > +include $(objtree)/include/config/auto.conf > include $(srctree)/scripts/Kbuild.include > > # for c_flags > @@ -29,6 +30,12 @@ quiet_cmd_cc_o_c = CC [M] $@ > > ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink) > > +ifdef CONFIG_LTO_CLANG > +# With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to > +# avoid a second slow LTO link > +prelink-ext := .lto > +endif > + > quiet_cmd_ld_ko_o = LD [M] $@ > cmd_ld_ko_o = \ > $(LD) -r $(KBUILD_LDFLAGS) \ > @@ -37,7 +44,7 @@ quiet_cmd_ld_ko_o = LD [M] $@ > -o $@ $(filter %.o, $^); \ > $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true) > > -$(modules): %.ko: %.o %.mod.o $(KBUILD_LDS_MODULE) FORCE > +$(modules): %.ko: %$(prelink-ext).o %.mod.o $(KBUILD_LDS_MODULE) FORCE > +$(call if_changed,ld_ko_o) > > targets += $(modules) $(modules:.ko=.mod.o) > diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost > index f54b6ac37ac2..a70f1f7da6aa 100644 > --- a/scripts/Makefile.modpost > +++ b/scripts/Makefile.modpost > @@ -102,12 +102,32 @@ $(input-symdump): > @echo >&2 'WARNING: Symbol version dump "$@" is missing.' > @echo >&2 ' Modules may not have dependencies or modversions.' > > +ifdef CONFIG_LTO_CLANG > +# With CONFIG_LTO_CLANG, .o files might be LLVM bitcode, or, .o files might be even thin archives. 
For example, $ file net/ipv6/netfilter/nf_defrag_ipv6.o net/ipv6/netfilter/nf_defrag_ipv6.o: thin archive with 6 symbol entries Now we have 3 possibilities for .o files: - ELF (real .o) - LLVM bitcode (.bc) - Thin archive (.a) Let me discuss how to proceed with this... -- Best Regards Masahiro Yamada _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 09/28] kbuild: add support for Clang LTO 2020-09-07 15:30 ` Masahiro Yamada @ 2020-09-08 17:30 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-08 17:30 UTC (permalink / raw) To: Masahiro Yamada Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Tue, Sep 08, 2020 at 12:30:14AM +0900, Masahiro Yamada wrote: > On Fri, Sep 4, 2020 at 5:31 AM Sami Tolvanen <samitolvanen@google.com> wrote: > > > > This change adds build system support for Clang's Link Time > > Optimization (LTO). With -flto, instead of ELF object files, Clang > > produces LLVM bitcode, which is compiled into native code at link > > time, allowing the final binary to be optimized globally. For more > > details, see: > > > > https://llvm.org/docs/LinkTimeOptimization.html > > > > The Kconfig option CONFIG_LTO_CLANG is implemented as a choice, > > which defaults to LTO being disabled. To use LTO, the architecture > > must select ARCH_SUPPORTS_LTO_CLANG and support: > > > > - compiling with Clang, > > - compiling inline assembly with Clang's integrated assembler, > > - and linking with LLD. > > > > While using full LTO results in the best runtime performance, the > > compilation is not scalable in time or memory. CONFIG_THINLTO > > enables ThinLTO, which allows parallel optimization and faster > > incremental builds. ThinLTO is used by default if the architecture > > also selects ARCH_SUPPORTS_THINLTO: > > > > https://clang.llvm.org/docs/ThinLTO.html > > > > To enable LTO, LLVM tools must be used to handle bitcode files. 
The > > easiest way is to pass the LLVM=1 option to make: > > > > $ make LLVM=1 defconfig > > $ scripts/config -e LTO_CLANG > > $ make LLVM=1 > > > > Alternatively, at least the following LLVM tools must be used: > > > > CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm > > > > To prepare for LTO support with other compilers, common parts are > > gated behind the CONFIG_LTO option, and LTO can be disabled for > > specific files by filtering out CC_FLAGS_LTO. > > > > Note that support for DYNAMIC_FTRACE and MODVERSIONS are added in > > follow-up patches. > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > --- > > Makefile | 18 +++++++- > > arch/Kconfig | 68 +++++++++++++++++++++++++++++++ > > include/asm-generic/vmlinux.lds.h | 11 +++-- > > scripts/Makefile.build | 9 +++- > > scripts/Makefile.modfinal | 9 +++- > > scripts/Makefile.modpost | 24 ++++++++++- > > scripts/link-vmlinux.sh | 32 +++++++++++---- > > 7 files changed, 154 insertions(+), 17 deletions(-) > > > > > #define TEXT_MAIN .text > > diff --git a/scripts/Makefile.build b/scripts/Makefile.build > > index 6ecf30c70ced..a5f4b5d407e6 100644 > > --- a/scripts/Makefile.build > > +++ b/scripts/Makefile.build > > @@ -111,7 +111,7 @@ endif > > # --------------------------------------------------------------------------- > > > > quiet_cmd_cc_s_c = CC $(quiet_modtag) $@ > > - cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS), $(c_flags)) $(DISABLE_LTO) -fverbose-asm -S -o $@ $< > > + cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS) $(CC_FLAGS_LTO), $(c_flags)) -fverbose-asm -S -o $@ $< > > > > $(obj)/%.s: $(src)/%.c FORCE > > $(call if_changed_dep,cc_s_c) > > @@ -428,8 +428,15 @@ $(obj)/lib.a: $(lib-y) FORCE > > # Do not replace $(filter %.o,^) with $(real-prereqs). When a single object > > # module is turned into a multi object module, $^ will contain header file > > # dependencies recorded in the .*.cmd file. 
> > +ifdef CONFIG_LTO_CLANG > > +quiet_cmd_link_multi-m = AR [M] $@ > > +cmd_link_multi-m = \ > > + rm -f $@; \ > > + $(AR) rcsTP$(KBUILD_ARFLAGS) $@ $(filter %.o,$^) > > > KBUILD_ARFLAGS no longer exists in the mainline. > (commit 13dc8c029cabf52ba95f60c56eb104d4d95d5889) Thanks, I'll drop this in the next version. > > +ifdef CONFIG_LTO_CLANG > > +# With CONFIG_LTO_CLANG, .o files might be LLVM bitcode, > > or, .o files might be even thin archives. Right, and with LTO the thin archive might also point to a mix of bitcode and ELF to further complicate things. > For example, > > $ file net/ipv6/netfilter/nf_defrag_ipv6.o > net/ipv6/netfilter/nf_defrag_ipv6.o: thin archive with 6 symbol entries > > > Now we have 3 possibilities for .o files: > > - ELF (real .o) > - LLVM bitcode (.bc) > - Thin archive (.a) > > > Let me discuss how to proceed with this... Did you have something in mind to make this cleaner? Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
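To make the three cases above concrete, here is a rough sketch of how a script could tell them apart by looking at the leading magic bytes of a .o file. The function name classify_obj is hypothetical and nothing like it is part of the series; the magic values are the ELF header (0x7f 'E' 'L' 'F'), raw LLVM bitcode ('B' 'C' 0xC0 0xDE) or its wrapper header, and the ar signatures "!<thin>\n" and "!<arch>\n". Note that it only inspects the container: as mentioned above, a thin archive can itself reference a mix of bitcode and ELF members.

```shell
#!/bin/sh
# Hypothetical helper: classify a file by its first bytes.
# Prints one of: elf, bitcode, thin-archive, archive, unknown.
classify_obj() {
	# Hex dump of up to 8 leading bytes, with whitespace removed.
	magic=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
	case "$magic" in
	7f454c46*)         echo elf ;;           # 0x7f 'E' 'L' 'F'
	4243c0de*)         echo bitcode ;;       # raw bitcode: 'B' 'C' 0xC0 0xDE
	dec0170b*)         echo bitcode ;;       # bitcode wrapper header
	213c7468696e3e0a*) echo thin-archive ;;  # "!<thin>\n"
	213c617263683e0a*) echo archive ;;       # "!<arch>\n"
	*)                 echo unknown ;;
	esac
}
```

Something like classify_obj net/ipv6/netfilter/nf_defrag_ipv6.o would report thin-archive for the example quoted above, which is the same information file(1) gives, just in a form that is easy to branch on.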
* [PATCH v2 10/28] kbuild: lto: fix module versioning 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (8 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 09/28] kbuild: add support for Clang LTO Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:11 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 11/28] kbuild: lto: postpone objtool Sami Tolvanen ` (22 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With CONFIG_MODVERSIONS, version information is linked into each compilation unit that exports symbols. With LTO, we cannot use this method as all C code is compiled into LLVM bitcode instead. This change collects symbol versions into .symversions files and merges them in link-vmlinux.sh where they are all linked into vmlinux.o at the same time. 
Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- .gitignore | 1 + Makefile | 3 ++- arch/Kconfig | 1 - scripts/Makefile.build | 33 +++++++++++++++++++++++++++++++-- scripts/Makefile.modpost | 2 ++ scripts/link-vmlinux.sh | 25 ++++++++++++++++++++++++- 6 files changed, 60 insertions(+), 5 deletions(-) diff --git a/.gitignore b/.gitignore index 162bd2b67bdf..06e76dc39ffe 100644 --- a/.gitignore +++ b/.gitignore @@ -41,6 +41,7 @@ *.so.dbg *.su *.symtypes +*.symversions *.tab.[ch] *.tar *.xz diff --git a/Makefile b/Makefile index dd49eaea7c25..2752be67b460 100644 --- a/Makefile +++ b/Makefile @@ -1847,7 +1847,8 @@ clean: $(clean-dirs) -o -name '.tmp_*.o.*' \ -o -name '*.c.[012]*.*' \ -o -name '*.ll' \ - -o -name '*.gcno' \) -type f -print | xargs rm -f + -o -name '*.gcno' \ + -o -name '*.*.symversions' \) -type f -print | xargs rm -f # Generate tags for editors # --------------------------------------------------------------------------- diff --git a/arch/Kconfig b/arch/Kconfig index 11bb2f48dfe8..71392e4a8900 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -602,7 +602,6 @@ config LTO_CLANG depends on !FTRACE_MCOUNT_RECORD depends on !KASAN depends on !GCOV_KERNEL - depends on !MODVERSIONS select LTO help This option enables Clang's Link Time Optimization (LTO), which diff --git a/scripts/Makefile.build b/scripts/Makefile.build index a5f4b5d407e6..c348e6d6b436 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -166,6 +166,15 @@ ifdef CONFIG_MODVERSIONS # the actual value of the checksum generated by genksyms # o remove .tmp_<file>.o to <file>.o +ifdef CONFIG_LTO_CLANG +# Generate .o.symversions files for each .o with exported symbols, and link these +# to the kernel and/or modules at the end. 
+cmd_modversions_c = \ + if $(NM) $@ 2>/dev/null | grep -q __ksymtab; then \ + $(call cmd_gensymtypes_c,$(KBUILD_SYMTYPES),$(@:.o=.symtypes)) \ + > $@.symversions; \ + fi; +else cmd_modversions_c = \ if $(OBJDUMP) -h $@ | grep -q __ksymtab; then \ $(call cmd_gensymtypes_c,$(KBUILD_SYMTYPES),$(@:.o=.symtypes)) \ @@ -177,6 +186,7 @@ cmd_modversions_c = \ rm -f $(@D)/.tmp_$(@F:.o=.ver); \ fi endif +endif ifdef USE_RECORDMCOUNT # compiler will not generate __mcount_loc use recordmcount or recordmcount.pl @@ -393,6 +403,18 @@ $(obj)/%.asn1.c $(obj)/%.asn1.h: $(src)/%.asn1 $(objtree)/scripts/asn1_compiler $(subdir-builtin): $(obj)/%/built-in.a: $(obj)/% ; $(subdir-modorder): $(obj)/%/modules.order: $(obj)/% ; +# combine symversions for later processing +quiet_cmd_update_lto_symversions = SYMVER $@ +ifeq ($(CONFIG_LTO_CLANG) $(CONFIG_MODVERSIONS),y y) + cmd_update_lto_symversions = \ + rm -f $@.symversions \ + $(foreach n, $(filter-out FORCE,$^), \ + $(if $(wildcard $(n).symversions), \ + ; cat $(n).symversions >> $@.symversions)) +else + cmd_update_lto_symversions = echo >/dev/null +endif + # # Rule to compile a set of .o files into one .a file (without symbol table) # @@ -400,8 +422,11 @@ $(subdir-modorder): $(obj)/%/modules.order: $(obj)/% ; quiet_cmd_ar_builtin = AR $@ cmd_ar_builtin = rm -f $@; $(AR) cDPrST $@ $(real-prereqs) +quiet_cmd_ar_and_symver = AR $@ + cmd_ar_and_symver = $(cmd_update_lto_symversions); $(cmd_ar_builtin) + $(obj)/built-in.a: $(real-obj-y) FORCE - $(call if_changed,ar_builtin) + $(call if_changed,ar_and_symver) # # Rule to create modules.order file @@ -421,8 +446,11 @@ $(obj)/modules.order: $(obj-m) FORCE # # Rule to compile a set of .o files into one .a file (with symbol table) # +quiet_cmd_ar_lib = AR $@ + cmd_ar_lib = $(cmd_update_lto_symversions); $(cmd_ar) + $(obj)/lib.a: $(lib-y) FORCE - $(call if_changed,ar) + $(call if_changed,ar_lib) # NOTE: # Do not replace $(filter %.o,^) with $(real-prereqs). 
When a single object @@ -431,6 +459,7 @@ $(obj)/lib.a: $(lib-y) FORCE ifdef CONFIG_LTO_CLANG quiet_cmd_link_multi-m = AR [M] $@ cmd_link_multi-m = \ + $(cmd_update_lto_symversions); \ rm -f $@; \ $(AR) rcsTP$(KBUILD_ARFLAGS) $@ $(filter %.o,$^) else diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost index a70f1f7da6aa..f9718bf4172d 100644 --- a/scripts/Makefile.modpost +++ b/scripts/Makefile.modpost @@ -110,6 +110,8 @@ prelink-ext = .lto quiet_cmd_cc_lto_link_modules = LTO [M] $@ cmd_cc_lto_link_modules = \ $(LD) $(ld_flags) -r -o $@ \ + $(shell [ -s $(@:.lto.o=.o.symversions) ] && \ + echo -T $(@:.lto.o=.o.symversions)) \ --whole-archive $(filter-out FORCE,$^) %.lto.o: %.o FORCE diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index ebb9f912aab6..3e99a19b9195 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -43,11 +43,28 @@ info() fi } +# If CONFIG_LTO_CLANG is selected, collect generated symbol versions into +# .tmp_symversions.lds +gen_symversions() +{ + info GEN .tmp_symversions.lds + rm -f .tmp_symversions.lds + + for a in ${KBUILD_VMLINUX_OBJS} ${KBUILD_VMLINUX_LIBS}; do + for o in $(${AR} t $a 2>/dev/null); do + if [ -f ${o}.symversions ]; then + cat ${o}.symversions >> .tmp_symversions.lds + fi + done + done +} + # Link of vmlinux.o used for section mismatch analysis # ${1} output file modpost_link() { local objects + local lds="" objects="--whole-archive \ ${KBUILD_VMLINUX_OBJS} \ @@ -57,6 +74,11 @@ modpost_link() --end-group" if [ -n "${CONFIG_LTO_CLANG}" ]; then + if [ -n "${CONFIG_MODVERSIONS}" ]; then + gen_symversions + lds="${lds} -T .tmp_symversions.lds" + fi + # This might take a while, so indicate that we're doing # an LTO link info LTO ${1} @@ -64,7 +86,7 @@ modpost_link() info LD ${1} fi - ${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${objects} + ${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${lds} ${objects} } objtool_link() @@ -242,6 +264,7 @@ cleanup() { rm -f .btf.* rm -f .tmp_System.map + rm -f 
.tmp_symversions.lds rm -f .tmp_vmlinux* rm -f System.map rm -f vmlinux -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
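The merge step in gen_symversions() above can be tried in isolation with a sketch like the following. It is a simplified stand-in, not the code from the patch: the function name merge_symversions is hypothetical, and it takes the object paths as arguments instead of listing archive members with ${AR} t. It assumes each <object>.symversions file holds linker script assignments as emitted by genksyms, which is why the merged file can be passed to the linker with -T.

```shell
#!/bin/sh
# Hypothetical stand-in for the loop in gen_symversions().
# $1:   output file (plays the role of .tmp_symversions.lds)
# $2..: object paths; <object>.symversions is appended when it exists
merge_symversions() {
	out=$1
	shift
	# Start from scratch so stale entries never survive a rebuild.
	rm -f "$out"
	for o in "$@"; do
		if [ -f "$o.symversions" ]; then
			cat "$o.symversions" >> "$out"
		fi
	done
}
```

As in the patch, objects without exported symbols contribute nothing, and the output file is only created once at least one .symversions file has been found.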
* Re: [PATCH v2 10/28] kbuild: lto: fix module versioning 2020-09-03 20:30 ` [PATCH v2 10/28] kbuild: lto: fix module versioning Sami Tolvanen @ 2020-09-03 22:11 ` Kees Cook 2020-09-08 18:23 ` Sami Tolvanen 0 siblings, 1 reply; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:11 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:35PM -0700, Sami Tolvanen wrote: > With CONFIG_MODVERSIONS, version information is linked into each > compilation unit that exports symbols. With LTO, we cannot use this > method as all C code is compiled into LLVM bitcode instead. This > change collects symbol versions into .symversions files and merges > them in link-vmlinux.sh where they are all linked into vmlinux.o at > the same time. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> The only thought I have here is I wonder if this change could be made universally instead of gating on LTO? (i.e. is it noticeably slower to do it this way under non-LTO?) Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 10/28] kbuild: lto: fix module versioning 2020-09-03 22:11 ` Kees Cook @ 2020-09-08 18:23 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-08 18:23 UTC (permalink / raw) To: Kees Cook Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 03:11:54PM -0700, Kees Cook wrote: > On Thu, Sep 03, 2020 at 01:30:35PM -0700, Sami Tolvanen wrote: > > With CONFIG_MODVERSIONS, version information is linked into each > > compilation unit that exports symbols. With LTO, we cannot use this > > method as all C code is compiled into LLVM bitcode instead. This > > change collects symbol versions into .symversions files and merges > > them in link-vmlinux.sh where they are all linked into vmlinux.o at > > the same time. > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > The only thought I have here is I wonder if this change could be made > universally instead of gating on LTO? (i.e. is it noticeably slower to > do it this way under non-LTO?) I don't think it's noticeably slower, but keeping the version information in object files when possible is cleaner, in my opinion. Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 11/28] kbuild: lto: postpone objtool 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (9 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 10/28] kbuild: lto: fix module versioning Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:19 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 12/28] kbuild: lto: limit inlining Sami Tolvanen ` (21 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With LTO, LLVM bitcode won't be compiled into native code until modpost_link, or modfinal for modules. This change postpones calls to objtool until after these steps. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/Kconfig | 2 +- scripts/Makefile.build | 2 ++ scripts/Makefile.modfinal | 24 ++++++++++++++++++++++-- scripts/link-vmlinux.sh | 23 ++++++++++++++++++++++- 4 files changed, 47 insertions(+), 4 deletions(-) diff --git a/arch/Kconfig b/arch/Kconfig index 71392e4a8900..7a418907e686 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -599,7 +599,7 @@ config LTO_CLANG depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm) depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm) depends on ARCH_SUPPORTS_LTO_CLANG - depends on !FTRACE_MCOUNT_RECORD + depends on HAVE_OBJTOOL_MCOUNT || !(X86_64 && DYNAMIC_FTRACE) depends on !KASAN depends on !GCOV_KERNEL select LTO diff --git a/scripts/Makefile.build b/scripts/Makefile.build index c348e6d6b436..b8f1f0d65a73 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -218,6 +218,7 @@ cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)), endif # USE_RECORDMCOUNT ifdef CONFIG_STACK_VALIDATION +ifndef CONFIG_LTO_CLANG 
ifneq ($(SKIP_STACK_VALIDATION),1) __objtool_obj := $(objtree)/tools/objtool/objtool @@ -253,6 +254,7 @@ objtool_obj = $(if $(patsubst y%,, \ $(__objtool_obj)) endif # SKIP_STACK_VALIDATION +endif # CONFIG_LTO_CLANG endif # CONFIG_STACK_VALIDATION # Rebuild all objects when objtool changes, or is enabled/disabled. diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal index 1005b147abd0..909bd509edb4 100644 --- a/scripts/Makefile.modfinal +++ b/scripts/Makefile.modfinal @@ -34,10 +34,30 @@ ifdef CONFIG_LTO_CLANG # With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to # avoid a second slow LTO link prelink-ext := .lto -endif + +# ELF processing was skipped earlier because we didn't have native code, +# so let's now process the prelinked binary before we link the module. + +ifdef CONFIG_STACK_VALIDATION +ifneq ($(SKIP_STACK_VALIDATION),1) +cmd_ld_ko_o += \ + $(objtree)/tools/objtool/objtool \ + $(if $(CONFIG_UNWINDER_ORC),orc generate,check) \ + --module \ + $(if $(CONFIG_FRAME_POINTER),,--no-fp) \ + $(if $(CONFIG_GCOV_KERNEL),--no-unreachable,) \ + $(if $(CONFIG_RETPOLINE),--retpoline,) \ + $(if $(CONFIG_X86_SMAP),--uaccess,) \ + $(if $(USE_OBJTOOL_MCOUNT),--mcount,) \ + $(@:.ko=$(prelink-ext).o); + +endif # SKIP_STACK_VALIDATION +endif # CONFIG_STACK_VALIDATION + +endif # CONFIG_LTO_CLANG quiet_cmd_ld_ko_o = LD [M] $@ - cmd_ld_ko_o = \ + cmd_ld_ko_o += \ $(LD) -r $(KBUILD_LDFLAGS) \ $(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE) \ $(addprefix -T , $(KBUILD_LDS_MODULE)) \ diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index 3e99a19b9195..a352a5ad9ef7 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -93,8 +93,29 @@ objtool_link() { local objtoolopt; + if [ "${CONFIG_LTO_CLANG} ${CONFIG_STACK_VALIDATION}" = "y y" ]; then + # Don't perform vmlinux validation unless explicitly requested, + # but run objtool on vmlinux.o now that we have an object file. 
+ if [ -n "${CONFIG_UNWINDER_ORC}" ]; then + objtoolopt="orc generate" + else + objtoolopt="check" + fi + + if [ -n "${USE_OBJTOOL_MCOUNT}" ]; then + objtoolopt="${objtoolopt} --mcount" + fi + fi + if [ -n "${CONFIG_VMLINUX_VALIDATION}" ]; then - objtoolopt="check --vmlinux" + if [ -z "${objtoolopt}" ]; then + objtoolopt="check --vmlinux" + else + objtoolopt="${objtoolopt} --vmlinux" + fi + fi + + if [ -n "${objtoolopt}" ]; then if [ -z "${CONFIG_FRAME_POINTER}" ]; then objtoolopt="${objtoolopt} --no-fp" fi -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
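For readers following the objtool_link() hunk, the option selection can be sketched as a standalone shell function. This is only an illustration under stated assumptions: objtool_opts is a hypothetical helper that is not part of the patch, the variable names are taken from the hunk, and the later additions such as --no-fp are left out.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): echoes the objtool subcommand and
# flags that the objtool_link() hunk would select for the current config.
objtool_opts() {
	opts=""

	if [ "${CONFIG_LTO_CLANG} ${CONFIG_STACK_VALIDATION}" = "y y" ]; then
		# With LTO, per-object validation was skipped, so run it on vmlinux.o.
		if [ -n "${CONFIG_UNWINDER_ORC}" ]; then
			opts="orc generate"
		else
			opts="check"
		fi

		# The quotes matter: an unquoted empty expansion leaves `[ -n ]`,
		# which tests the literal string "-n" and is therefore always true.
		if [ -n "${USE_OBJTOOL_MCOUNT}" ]; then
			opts="${opts} --mcount"
		fi
	fi

	if [ -n "${CONFIG_VMLINUX_VALIDATION}" ]; then
		if [ -z "${opts}" ]; then
			opts="check --vmlinux"
		else
			opts="${opts} --vmlinux"
		fi
	fi

	echo "${opts}"
}

# Example: LTO build with ORC unwinder and vmlinux validation enabled.
CONFIG_LTO_CLANG=y CONFIG_STACK_VALIDATION=y CONFIG_UNWINDER_ORC=y \
	CONFIG_VMLINUX_VALIDATION=y objtool_opts
```

An empty result corresponds to the case where objtool is not run on vmlinux.o at all, which is why the hunk guards the remaining flags with a test on objtoolopt.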
* Re: [PATCH v2 11/28] kbuild: lto: postpone objtool 2020-09-03 20:30 ` [PATCH v2 11/28] kbuild: lto: postpone objtool Sami Tolvanen @ 2020-09-03 22:19 ` Kees Cook 2020-09-08 20:56 ` Sami Tolvanen 0 siblings, 1 reply; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:19 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:36PM -0700, Sami Tolvanen wrote: > With LTO, LLVM bitcode won't be compiled into native code until > modpost_link, or modfinal for modules. This change postpones calls > to objtool until after these steps. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> For a "fail fast" style of building, it makes sense to have objtool run as early as possible, so it makes sense to keep the current behavior in non-LTO mode. I do wonder, though, if there is a real benefit to having "fail fast" case. I imagine a lot of automated builds are using --keep-going with make, and actually waiting until the end to do the validation means more code will get build-tested before objtool rejects the results. 
*shrug* > --- > arch/Kconfig | 2 +- > scripts/Makefile.build | 2 ++ > scripts/Makefile.modfinal | 24 ++++++++++++++++++++++-- > scripts/link-vmlinux.sh | 23 ++++++++++++++++++++++- > 4 files changed, 47 insertions(+), 4 deletions(-) > > diff --git a/arch/Kconfig b/arch/Kconfig > index 71392e4a8900..7a418907e686 100644 > --- a/arch/Kconfig > +++ b/arch/Kconfig > @@ -599,7 +599,7 @@ config LTO_CLANG > depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm) > depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm) > depends on ARCH_SUPPORTS_LTO_CLANG > - depends on !FTRACE_MCOUNT_RECORD > + depends on HAVE_OBJTOOL_MCOUNT || !(X86_64 && DYNAMIC_FTRACE) > depends on !KASAN > depends on !GCOV_KERNEL > select LTO > diff --git a/scripts/Makefile.build b/scripts/Makefile.build > index c348e6d6b436..b8f1f0d65a73 100644 > --- a/scripts/Makefile.build > +++ b/scripts/Makefile.build > @@ -218,6 +218,7 @@ cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)), > endif # USE_RECORDMCOUNT > > ifdef CONFIG_STACK_VALIDATION > +ifndef CONFIG_LTO_CLANG > ifneq ($(SKIP_STACK_VALIDATION),1) > > __objtool_obj := $(objtree)/tools/objtool/objtool > @@ -253,6 +254,7 @@ objtool_obj = $(if $(patsubst y%,, \ > $(__objtool_obj)) > > endif # SKIP_STACK_VALIDATION > +endif # CONFIG_LTO_CLANG > endif # CONFIG_STACK_VALIDATION > > # Rebuild all objects when objtool changes, or is enabled/disabled. > diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal > index 1005b147abd0..909bd509edb4 100644 > --- a/scripts/Makefile.modfinal > +++ b/scripts/Makefile.modfinal > @@ -34,10 +34,30 @@ ifdef CONFIG_LTO_CLANG > # With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to > # avoid a second slow LTO link > prelink-ext := .lto > -endif > + > +# ELF processing was skipped earlier because we didn't have native code, > +# so let's now process the prelinked binary before we link the module. 
> + > +ifdef CONFIG_STACK_VALIDATION > +ifneq ($(SKIP_STACK_VALIDATION),1) > +cmd_ld_ko_o += \ > + $(objtree)/tools/objtool/objtool \ > + $(if $(CONFIG_UNWINDER_ORC),orc generate,check) \ > + --module \ > + $(if $(CONFIG_FRAME_POINTER),,--no-fp) \ > + $(if $(CONFIG_GCOV_KERNEL),--no-unreachable,) \ > + $(if $(CONFIG_RETPOLINE),--retpoline,) \ > + $(if $(CONFIG_X86_SMAP),--uaccess,) \ > + $(if $(USE_OBJTOOL_MCOUNT),--mcount,) \ > + $(@:.ko=$(prelink-ext).o); > + > +endif # SKIP_STACK_VALIDATION > +endif # CONFIG_STACK_VALIDATION I wonder if objtool_args could be reused here instead of having two places to keep in sync? It looks like that might mean moving things around a bit before this patch, since I can't quite see if Makefile.build's variables are visible to Makefile.modfinal? > + > +endif # CONFIG_LTO_CLANG > > quiet_cmd_ld_ko_o = LD [M] $@ > - cmd_ld_ko_o = \ > + cmd_ld_ko_o += \ > $(LD) -r $(KBUILD_LDFLAGS) \ > $(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE) \ > $(addprefix -T , $(KBUILD_LDS_MODULE)) \ > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh > index 3e99a19b9195..a352a5ad9ef7 100755 > --- a/scripts/link-vmlinux.sh > +++ b/scripts/link-vmlinux.sh > @@ -93,8 +93,29 @@ objtool_link() > { > local objtoolopt; > > + if [ "${CONFIG_LTO_CLANG} ${CONFIG_STACK_VALIDATION}" = "y y" ]; then > + # Don't perform vmlinux validation unless explicitly requested, > + # but run objtool on vmlinux.o now that we have an object file. 
> + if [ -n "${CONFIG_UNWINDER_ORC}" ]; then > + objtoolopt="orc generate" > + else > + objtoolopt="check" > + fi > + > + if [ -n ${USE_OBJTOOL_MCOUNT} ]; then > + objtoolopt="${objtoolopt} --mcount" > + fi > + fi > + > if [ -n "${CONFIG_VMLINUX_VALIDATION}" ]; then > - objtoolopt="check --vmlinux" > + if [ -z "${objtoolopt}" ]; then > + objtoolopt="check --vmlinux" > + else > + objtoolopt="${objtoolopt} --vmlinux" > + fi > + fi > + > + if [ -n "${objtoolopt}" ]; then > if [ -z "${CONFIG_FRAME_POINTER}" ]; then > objtoolopt="${objtoolopt} --no-fp" > fi > -- > 2.28.0.402.g5ffc5be6b7-goog > -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 11/28] kbuild: lto: postpone objtool 2020-09-03 22:19 ` Kees Cook @ 2020-09-08 20:56 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-08 20:56 UTC (permalink / raw) To: Kees Cook Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 03:19:43PM -0700, Kees Cook wrote: > On Thu, Sep 03, 2020 at 01:30:36PM -0700, Sami Tolvanen wrote: > > With LTO, LLVM bitcode won't be compiled into native code until > > modpost_link, or modfinal for modules. This change postpones calls > > to objtool until after these steps. > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > For a "fail fast" style of building, it makes sense to have objtool run > as early as possible, so it makes sense to keep the current behavior in > non-LTO mode. I do wonder, though, if there is a real benefit to having > "fail fast" case. I imagine a lot of automated builds are using > --keep-going with make, and actually waiting until the end to do the > validation means more code will get build-tested before objtool rejects > the results. 
*shrug* > > > --- > > arch/Kconfig | 2 +- > > scripts/Makefile.build | 2 ++ > > scripts/Makefile.modfinal | 24 ++++++++++++++++++++++-- > > scripts/link-vmlinux.sh | 23 ++++++++++++++++++++++- > > 4 files changed, 47 insertions(+), 4 deletions(-) > > > > diff --git a/arch/Kconfig b/arch/Kconfig > > index 71392e4a8900..7a418907e686 100644 > > --- a/arch/Kconfig > > +++ b/arch/Kconfig > > @@ -599,7 +599,7 @@ config LTO_CLANG > > depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm) > > depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm) > > depends on ARCH_SUPPORTS_LTO_CLANG > > - depends on !FTRACE_MCOUNT_RECORD > > + depends on HAVE_OBJTOOL_MCOUNT || !(X86_64 && DYNAMIC_FTRACE) > > depends on !KASAN > > depends on !GCOV_KERNEL > > select LTO > > diff --git a/scripts/Makefile.build b/scripts/Makefile.build > > index c348e6d6b436..b8f1f0d65a73 100644 > > --- a/scripts/Makefile.build > > +++ b/scripts/Makefile.build > > @@ -218,6 +218,7 @@ cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)), > > endif # USE_RECORDMCOUNT > > > > ifdef CONFIG_STACK_VALIDATION > > +ifndef CONFIG_LTO_CLANG > > ifneq ($(SKIP_STACK_VALIDATION),1) > > > > __objtool_obj := $(objtree)/tools/objtool/objtool > > @@ -253,6 +254,7 @@ objtool_obj = $(if $(patsubst y%,, \ > > $(__objtool_obj)) > > > > endif # SKIP_STACK_VALIDATION > > +endif # CONFIG_LTO_CLANG > > endif # CONFIG_STACK_VALIDATION > > > > # Rebuild all objects when objtool changes, or is enabled/disabled. 
> > diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal > > index 1005b147abd0..909bd509edb4 100644 > > --- a/scripts/Makefile.modfinal > > +++ b/scripts/Makefile.modfinal > > @@ -34,10 +34,30 @@ ifdef CONFIG_LTO_CLANG > > # With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to > > # avoid a second slow LTO link > > prelink-ext := .lto > > -endif > > + > > +# ELF processing was skipped earlier because we didn't have native code, > > +# so let's now process the prelinked binary before we link the module. > > + > > +ifdef CONFIG_STACK_VALIDATION > > +ifneq ($(SKIP_STACK_VALIDATION),1) > > +cmd_ld_ko_o += \ > > + $(objtree)/tools/objtool/objtool \ > > + $(if $(CONFIG_UNWINDER_ORC),orc generate,check) \ > > + --module \ > > + $(if $(CONFIG_FRAME_POINTER),,--no-fp) \ > > + $(if $(CONFIG_GCOV_KERNEL),--no-unreachable,) \ > > + $(if $(CONFIG_RETPOLINE),--retpoline,) \ > > + $(if $(CONFIG_X86_SMAP),--uaccess,) \ > > + $(if $(USE_OBJTOOL_MCOUNT),--mcount,) \ > > + $(@:.ko=$(prelink-ext).o); > > + > > +endif # SKIP_STACK_VALIDATION > > +endif # CONFIG_STACK_VALIDATION > > I wonder if objtool_args could be reused here instead of having two > places to keep in sync? It looks like that might mean moving things > around a bit before this patch, since I can't quite see if > Makefile.build's variables are visible to Makefile.modfinal? It doesn't look like they are. I suppose we could move objtool_args to Makefile.lib. Masahiro, any thoughts? Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 12/28] kbuild: lto: limit inlining
  2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen
                     ` (10 preceding siblings ...)
  2020-09-03 20:30   ` [PATCH v2 11/28] kbuild: lto: postpone objtool Sami Tolvanen
@ 2020-09-03 20:30   ` Sami Tolvanen
  2020-09-03 22:20     ` Kees Cook
  2020-09-03 20:30   ` [PATCH v2 13/28] kbuild: lto: merge module sections Sami Tolvanen
                     ` (20 subsequent siblings)
  32 siblings, 1 reply; 212+ messages in thread
From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw)
  To: Masahiro Yamada, Will Deacon
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild,
	Nick Desaulniers, linux-kernel, Steven Rostedt,
	clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

This change limits function inlining across translation unit boundaries
in order to reduce the binary size with LTO. The -import-instr-limit
flag defines a size limit, as the number of LLVM IR instructions, for
importing functions from other TUs, defaulting to 100.

Based on testing with arm64 defconfig, we found that a limit of 5 is a
reasonable compromise between performance and binary size, reducing the
size of a stripped vmlinux by 11%.

Suggested-by: George Burgess IV <gbiv@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 Makefile | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Makefile b/Makefile
index 2752be67b460..c69e07bd506a 100644
--- a/Makefile
+++ b/Makefile
@@ -917,6 +917,10 @@ else
 CC_FLAGS_LTO_CLANG := -flto
 endif
 CC_FLAGS_LTO_CLANG += -fvisibility=default
+
+# Limit inlining across translation units to reduce binary size
+LD_FLAGS_LTO_CLANG := -mllvm -import-instr-limit=5
+KBUILD_LDFLAGS += $(LD_FLAGS_LTO_CLANG)
 endif
 
 ifdef CONFIG_LTO
-- 
2.28.0.402.g5ffc5be6b7-goog

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 12/28] kbuild: lto: limit inlining
  2020-09-03 20:30   ` [PATCH v2 12/28] kbuild: lto: limit inlining Sami Tolvanen
@ 2020-09-03 22:20     ` Kees Cook
  0 siblings, 0 replies; 212+ messages in thread
From: Kees Cook @ 2020-09-03 22:20 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening,
	Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada,
	linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt,
	clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

On Thu, Sep 03, 2020 at 01:30:37PM -0700, Sami Tolvanen wrote:
> This change limits function inlining across translation unit boundaries
> in order to reduce the binary size with LTO. The -import-instr-limit
> flag defines a size limit, as the number of LLVM IR instructions, for
> importing functions from other TUs, defaulting to 100.
>
> Based on testing with arm64 defconfig, we found that a limit of 5 is a
> reasonable compromise between performance and binary size, reducing the
> size of a stripped vmlinux by 11%.
>
> Suggested-by: George Burgess IV <gbiv@google.com>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 13/28] kbuild: lto: merge module sections 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (11 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 12/28] kbuild: lto: limit inlining Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:23 ` Kees Cook 2020-09-07 15:25 ` Masahiro Yamada 2020-09-03 20:30 ` [PATCH v2 14/28] kbuild: lto: remove duplicate dependencies from .mod files Sami Tolvanen ` (19 subsequent siblings) 32 siblings, 2 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel LLD always splits sections with LTO, which increases module sizes. This change adds a linker script that merges the split sections in the final module. Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- Makefile | 2 ++ scripts/module-lto.lds | 26 ++++++++++++++++++++++++++ 2 files changed, 28 insertions(+) create mode 100644 scripts/module-lto.lds diff --git a/Makefile b/Makefile index c69e07bd506a..bb82a4323f1d 100644 --- a/Makefile +++ b/Makefile @@ -921,6 +921,8 @@ CC_FLAGS_LTO_CLANG += -fvisibility=default # Limit inlining across translation units to reduce binary size LD_FLAGS_LTO_CLANG := -mllvm -import-instr-limit=5 KBUILD_LDFLAGS += $(LD_FLAGS_LTO_CLANG) + +KBUILD_LDS_MODULE += $(srctree)/scripts/module-lto.lds endif ifdef CONFIG_LTO diff --git a/scripts/module-lto.lds b/scripts/module-lto.lds new file mode 100644 index 000000000000..cbb11dc3639a --- /dev/null +++ b/scripts/module-lto.lds @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * With CONFIG_LTO_CLANG, LLD always enables -fdata-sections and + * -ffunction-sections, which increases the size of the final module. 
+ * Merge the split sections in the final binary. + */ +SECTIONS { + __patchable_function_entries : { *(__patchable_function_entries) } + + .bss : { + *(.bss .bss.[0-9a-zA-Z_]*) + *(.bss..L*) + } + + .data : { + *(.data .data.[0-9a-zA-Z_]*) + *(.data..L*) + } + + .rodata : { + *(.rodata .rodata.[0-9a-zA-Z_]*) + *(.rodata..L*) + } + + .text : { *(.text .text.[0-9a-zA-Z_]*) } +} -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
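To see which input sections the wildcards in module-lto.lds would fold together, the patterns can be tried out with shell case globs, which treat [set] and * the same way for these names. This is a rough sketch, not part of the patch: merged_section is a hypothetical helper, the section names in the loop are made up for illustration, and the __patchable_function_entries rule is omitted because it has no wildcard.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL   # keep the bracket ranges byte-based

# Hypothetical helper: prints the output section that the rules in the patch
# would place an input section in, or the name unchanged if no rule matches.
merged_section() {
	case "$1" in
	.bss|.bss.[0-9a-zA-Z_]*|.bss..L*)          echo .bss ;;
	.data|.data.[0-9a-zA-Z_]*|.data..L*)       echo .data ;;
	.rodata|.rodata.[0-9a-zA-Z_]*|.rodata..L*) echo .rodata ;;
	.text|.text.[0-9a-zA-Z_]*)                 echo .text ;;
	*)                                         echo "$1" ;;
	esac
}

# Illustrative section names only.
for sec in .text.my_init .data.my_state .data..Lanon .data..percpu .text..Lfoo; do
	printf '%s -> %s\n' "$sec" "$(merged_section "$sec")"
done
```

The character class after the first dot is what keeps double-dot names such as .data..percpu out of the merged sections. Only the ..L* variants of .bss, .data and .rodata are merged explicitly, and .text has no such rule.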
* Re: [PATCH v2 13/28] kbuild: lto: merge module sections 2020-09-03 20:30 ` [PATCH v2 13/28] kbuild: lto: merge module sections Sami Tolvanen @ 2020-09-03 22:23 ` Kees Cook 2020-09-07 15:25 ` Masahiro Yamada 1 sibling, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:23 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:38PM -0700, Sami Tolvanen wrote: > LLD always splits sections with LTO, which increases module sizes. This > change adds a linker script that merges the split sections in the final > module. > > Suggested-by: Nick Desaulniers <ndesaulniers@google.com> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> We'll likely need to come back around to this for FGKASLR (to keep the .text.* sections separated), but that's no different than the existing concerns for FGKASLR on the main kernel. Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 13/28] kbuild: lto: merge module sections 2020-09-03 20:30 ` [PATCH v2 13/28] kbuild: lto: merge module sections Sami Tolvanen 2020-09-03 22:23 ` Kees Cook @ 2020-09-07 15:25 ` Masahiro Yamada 2020-09-08 21:07 ` Sami Tolvanen 1 sibling, 1 reply; 212+ messages in thread From: Masahiro Yamada @ 2020-09-07 15:25 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Fri, Sep 4, 2020 at 5:31 AM Sami Tolvanen <samitolvanen@google.com> wrote: > > LLD always splits sections with LTO, which increases module sizes. This > change adds a linker script that merges the split sections in the final > module. > > Suggested-by: Nick Desaulniers <ndesaulniers@google.com> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > --- > Makefile | 2 ++ > scripts/module-lto.lds | 26 ++++++++++++++++++++++++++ > 2 files changed, 28 insertions(+) > create mode 100644 scripts/module-lto.lds > > diff --git a/Makefile b/Makefile > index c69e07bd506a..bb82a4323f1d 100644 > --- a/Makefile > +++ b/Makefile > @@ -921,6 +921,8 @@ CC_FLAGS_LTO_CLANG += -fvisibility=default > # Limit inlining across translation units to reduce binary size > LD_FLAGS_LTO_CLANG := -mllvm -import-instr-limit=5 > KBUILD_LDFLAGS += $(LD_FLAGS_LTO_CLANG) > + > +KBUILD_LDS_MODULE += $(srctree)/scripts/module-lto.lds > endif > > ifdef CONFIG_LTO > diff --git a/scripts/module-lto.lds b/scripts/module-lto.lds > new file mode 100644 > index 000000000000..cbb11dc3639a > --- /dev/null > +++ b/scripts/module-lto.lds > @@ -0,0 +1,26 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * With CONFIG_LTO_CLANG, LLD always enables -fdata-sections and > + * -ffunction-sections, which increases the size of the final module. > + * Merge the split sections in the final binary. 
> + */ > +SECTIONS { > + __patchable_function_entries : { *(__patchable_function_entries) } > + > + .bss : { > + *(.bss .bss.[0-9a-zA-Z_]*) > + *(.bss..L*) > + } > + > + .data : { > + *(.data .data.[0-9a-zA-Z_]*) > + *(.data..L*) > + } > + > + .rodata : { > + *(.rodata .rodata.[0-9a-zA-Z_]*) > + *(.rodata..L*) > + } > + > + .text : { *(.text .text.[0-9a-zA-Z_]*) } > +} > -- > 2.28.0.402.g5ffc5be6b7-goog > After I apply https://patchwork.kernel.org/patch/11757323/, is it possible to do like this ? #ifdef CONFIG_LTO SECTIONS { ... }; #endif in scripts/module.lds.S -- Best Regards Masahiro Yamada _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 13/28] kbuild: lto: merge module sections 2020-09-07 15:25 ` Masahiro Yamada @ 2020-09-08 21:07 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-08 21:07 UTC (permalink / raw) To: Masahiro Yamada Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Tue, Sep 08, 2020 at 12:25:54AM +0900, Masahiro Yamada wrote: > On Fri, Sep 4, 2020 at 5:31 AM Sami Tolvanen <samitolvanen@google.com> wrote: > > > > LLD always splits sections with LTO, which increases module sizes. This > > change adds a linker script that merges the split sections in the final > > module. > > > > Suggested-by: Nick Desaulniers <ndesaulniers@google.com> > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > --- > > Makefile | 2 ++ > > scripts/module-lto.lds | 26 ++++++++++++++++++++++++++ > > 2 files changed, 28 insertions(+) > > create mode 100644 scripts/module-lto.lds > > > > diff --git a/Makefile b/Makefile > > index c69e07bd506a..bb82a4323f1d 100644 > > --- a/Makefile > > +++ b/Makefile > > @@ -921,6 +921,8 @@ CC_FLAGS_LTO_CLANG += -fvisibility=default > > # Limit inlining across translation units to reduce binary size > > LD_FLAGS_LTO_CLANG := -mllvm -import-instr-limit=5 > > KBUILD_LDFLAGS += $(LD_FLAGS_LTO_CLANG) > > + > > +KBUILD_LDS_MODULE += $(srctree)/scripts/module-lto.lds > > endif > > > > ifdef CONFIG_LTO > > diff --git a/scripts/module-lto.lds b/scripts/module-lto.lds > > new file mode 100644 > > index 000000000000..cbb11dc3639a > > --- /dev/null > > +++ b/scripts/module-lto.lds > > @@ -0,0 +1,26 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > +/* > > + * With CONFIG_LTO_CLANG, LLD always enables -fdata-sections and > > + * -ffunction-sections, which increases the size of the final module. 
> > + * Merge the split sections in the final binary. > > + */ > > +SECTIONS { > > + __patchable_function_entries : { *(__patchable_function_entries) } > > + > > + .bss : { > > + *(.bss .bss.[0-9a-zA-Z_]*) > > + *(.bss..L*) > > + } > > + > > + .data : { > > + *(.data .data.[0-9a-zA-Z_]*) > > + *(.data..L*) > > + } > > + > > + .rodata : { > > + *(.rodata .rodata.[0-9a-zA-Z_]*) > > + *(.rodata..L*) > > + } > > + > > + .text : { *(.text .text.[0-9a-zA-Z_]*) } > > +} > > -- > > 2.28.0.402.g5ffc5be6b7-goog > > > > > After I apply https://patchwork.kernel.org/patch/11757323/, > is it possible to do like this ? > > > #ifdef CONFIG_LTO > SECTIONS { > ... > }; > #endif > > in scripts/module.lds.S Yes, that should work. I'll change this in v3 after your change is applied. Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 14/28] kbuild: lto: remove duplicate dependencies from .mod files
  2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen
                     ` (12 preceding siblings ...)
  2020-09-03 20:30   ` [PATCH v2 13/28] kbuild: lto: merge module sections Sami Tolvanen
@ 2020-09-03 20:30   ` Sami Tolvanen
  2020-09-03 22:29     ` Kees Cook
  2020-09-03 20:30   ` [PATCH v2 15/28] init: lto: ensure initcall ordering Sami Tolvanen
                     ` (18 subsequent siblings)
  32 siblings, 1 reply; 212+ messages in thread
From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw)
  To: Masahiro Yamada, Will Deacon
  Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening,
	Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild,
	Nick Desaulniers, linux-kernel, Steven Rostedt,
	clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel

With LTO, llvm-nm prints out symbols for each archive member
separately, which results in a lot of duplicate dependencies in the
.mod file when CONFIG_TRIM_UNUSED_KSYMS is enabled. When a module
consists of several compilation units, the output can exceed the
default xargs command size limit and split the dependency list to
multiple lines, which results in used symbols getting trimmed.

This change removes duplicate dependencies, which will reduce the
probability of this happening and makes .mod files smaller and
easier to read.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 scripts/Makefile.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index b8f1f0d65a73..3bb36b4b853c 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -286,7 +286,7 @@ endef
 
 # List module undefined symbols (or empty line if not enabled)
 ifdef CONFIG_TRIM_UNUSED_KSYMS
-cmd_undef_syms = $(NM) $< | sed -n 's/^ *U //p' | xargs echo
+cmd_undef_syms = $(NM) $< | sed -n 's/^ *U //p' | sort -u | xargs echo
 else
 cmd_undef_syms = echo
 endif
-- 
2.28.0.402.g5ffc5be6b7-goog

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related [flat|nested] 212+ messages in thread
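The effect of the added sort -u stage can be reproduced without a kernel tree. The sketch below is illustrative only: fake_nm_output stands in for the `$(NM) $<` output on an archive, the object and symbol names are invented, and LC_ALL=C is an addition here (the patch uses plain sort -u) to keep the ordering reproducible.

```shell
#!/bin/sh
# Stand-in for nm output on an archive: each member lists its own undefined
# symbols, so dependencies shared between members show up more than once.
fake_nm_output() {
	cat <<'EOF'
a.o:
                 U printk
                 U kfree
0000000000000000 T a_init
b.o:
                 U printk
                 U kmalloc
                 U kfree
EOF
}

# Pipeline before the patch: duplicates are passed on to xargs.
undef_syms_old() { sed -n 's/^ *U //p' | xargs echo; }

# Pipeline after the patch: each undefined symbol is listed once.
undef_syms_new() { sed -n 's/^ *U //p' | LC_ALL=C sort -u | xargs echo; }

fake_nm_output | undef_syms_old   # prints "printk kfree printk kmalloc kfree"
fake_nm_output | undef_syms_new   # prints "kfree kmalloc printk"
```

Deduplicating shortens the argument list handed to xargs, which is what lowers the chance of the list being split across several echo invocations.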
* Re: [PATCH v2 14/28] kbuild: lto: remove duplicate dependencies from .mod files
  2020-09-03 20:30   ` [PATCH v2 14/28] kbuild: lto: remove duplicate dependencies from .mod files Sami Tolvanen
@ 2020-09-03 22:29     ` Kees Cook
  0 siblings, 0 replies; 212+ messages in thread
From: Kees Cook @ 2020-09-03 22:29 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening,
	Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada,
	linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt,
	clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

On Thu, Sep 03, 2020 at 01:30:39PM -0700, Sami Tolvanen wrote:
> With LTO, llvm-nm prints out symbols for each archive member
> separately, which results in a lot of duplicate dependencies in the
> .mod file when CONFIG_TRIM_UNUSED_KSYMS is enabled. When a module
> consists of several compilation units, the output can exceed the
> default xargs command size limit and split the dependency list to
> multiple lines, which results in used symbols getting trimmed.
>
> This change removes duplicate dependencies, which will reduce the
> probability of this happening and makes .mod files smaller and
> easier to read.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 15/28] init: lto: ensure initcall ordering 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (13 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 14/28] kbuild: lto: remove duplicate dependencies from .mod files Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:40 ` Kees Cook 2020-09-10 9:25 ` David Woodhouse 2020-09-03 20:30 ` [PATCH v2 16/28] init: lto: fix PREL32 relocations Sami Tolvanen ` (17 subsequent siblings) 32 siblings, 2 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With LTO, the compiler doesn't necessarily obey the link order for initcalls, and initcall variables need globally unique names to avoid collisions at link time. This change exports __KBUILD_MODNAME and adds the __initcall_id() macro, which uses it together with __COUNTER__ and __LINE__ to help ensure these variables have unique names, and moves each variable to its own section when LTO is enabled, so the correct order can be specified using a linker script. The generate_initcall_order.pl script uses nm to find initcalls from the object files passed to the linker, and generates a linker script that specifies the intended order. With LTO, the script is called in link-vmlinux.sh.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- include/linux/init.h | 52 +++++- scripts/Makefile.lib | 6 +- scripts/generate_initcall_order.pl | 270 +++++++++++++++++++++++++++++ scripts/link-vmlinux.sh | 14 ++ 4 files changed, 333 insertions(+), 9 deletions(-) create mode 100755 scripts/generate_initcall_order.pl diff --git a/include/linux/init.h b/include/linux/init.h index 212fc9e2f691..af638cd6dd52 100644 --- a/include/linux/init.h +++ b/include/linux/init.h @@ -184,19 +184,57 @@ extern bool initcall_debug; * as KEEP() in the linker script. */ +/* Format: <modname>__<counter>_<line>_<fn> */ +#define __initcall_id(fn) \ + __PASTE(__KBUILD_MODNAME, \ + __PASTE(__, \ + __PASTE(__COUNTER__, \ + __PASTE(_, \ + __PASTE(__LINE__, \ + __PASTE(_, fn)))))) + +/* Format: __<prefix>__<iid><id> */ +#define __initcall_name(prefix, __iid, id) \ + __PASTE(__, \ + __PASTE(prefix, \ + __PASTE(__, \ + __PASTE(__iid, id)))) + +#ifdef CONFIG_LTO_CLANG +/* + * With LTO, the compiler doesn't necessarily obey link order for + * initcalls. In order to preserve the correct order, we add each + * variable into its own section and generate a linker script (in + * scripts/link-vmlinux.sh) to specify the order of the sections. + */ +#define __initcall_section(__sec, __iid) \ + #__sec ".init.." #__iid +#else +#define __initcall_section(__sec, __iid) \ + #__sec ".init" +#endif + #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS -#define ___define_initcall(fn, id, __sec) \ +#define ____define_initcall(fn, __name, __sec) \ __ADDRESSABLE(fn) \ - asm(".section \"" #__sec ".init\", \"a\" \n" \ - "__initcall_" #fn #id ": \n" \ + asm(".section \"" __sec "\", \"a\" \n" \ + __stringify(__name) ": \n" \ ".long " #fn " - . 
\n" \ ".previous \n"); #else -#define ___define_initcall(fn, id, __sec) \ - static initcall_t __initcall_##fn##id __used \ - __attribute__((__section__(#__sec ".init"))) = fn; +#define ____define_initcall(fn, __name, __sec) \ + static initcall_t __name __used \ + __attribute__((__section__(__sec))) = fn; #endif +#define __unique_initcall(fn, id, __sec, __iid) \ + ____define_initcall(fn, \ + __initcall_name(initcall, __iid, id), \ + __initcall_section(__sec, __iid)) + +#define ___define_initcall(fn, id, __sec) \ + __unique_initcall(fn, id, __sec, __initcall_id(fn)) + #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id) /* @@ -236,7 +274,7 @@ extern bool initcall_debug; #define __exitcall(fn) \ static exitcall_t __exitcall_##fn __exit_call = fn -#define console_initcall(fn) ___define_initcall(fn,, .con_initcall) +#define console_initcall(fn) ___define_initcall(fn, con, .con_initcall) struct obs_kernel_param { const char *str; diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib index 3d599716940c..7e382d12d309 100644 --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -117,9 +117,11 @@ target-stem = $(basename $(patsubst $(obj)/%,%,$@)) # These flags are needed for modversions and compiling, so we define them here # $(modname_flags) defines KBUILD_MODNAME as the name of the module it will # end up in (or would, if it gets compiled in) -name-fix = $(call stringify,$(subst $(comma),_,$(subst -,_,$1))) +name-fix-token = $(subst $(comma),_,$(subst -,_,$1)) +name-fix = $(call stringify,$(call name-fix-token,$1)) basename_flags = -DKBUILD_BASENAME=$(call name-fix,$(basetarget)) -modname_flags = -DKBUILD_MODNAME=$(call name-fix,$(modname)) +modname_flags = -DKBUILD_MODNAME=$(call name-fix,$(modname)) \ + -D__KBUILD_MODNAME=kmod_$(call name-fix-token,$(modname)) modfile_flags = -DKBUILD_MODFILE=$(call stringify,$(modfile)) _c_flags = $(filter-out $(CFLAGS_REMOVE_$(target-stem).o), \ diff --git a/scripts/generate_initcall_order.pl 
b/scripts/generate_initcall_order.pl new file mode 100755 index 000000000000..fe83aec2b51e --- /dev/null +++ b/scripts/generate_initcall_order.pl @@ -0,0 +1,270 @@ +#!/usr/bin/env perl +# SPDX-License-Identifier: GPL-2.0 +# +# Generates a linker script that specifies the correct initcall order. +# +# Copyright (C) 2019 Google LLC + +use strict; +use warnings; +use IO::Handle; +use IO::Select; +use POSIX ":sys_wait_h"; + +my $nm = $ENV{'NM'} || die "$0: ERROR: NM not set?"; +my $objtree = $ENV{'objtree'} || '.'; + +## currently active child processes +my $jobs = {}; # child process pid -> file handle +## results from child processes +my $results = {}; # object index -> [ { level, secname }, ... ] + +## reads _NPROCESSORS_ONLN to determine the maximum number of processes to +## start +sub get_online_processors { + open(my $fh, "getconf _NPROCESSORS_ONLN 2>/dev/null |") + or die "$0: ERROR: failed to execute getconf: $!"; + my $procs = <$fh>; + close($fh); + + if (!($procs =~ /^\d+$/)) { + return 1; + } + + return int($procs); +} + +## writes results to the parent process +## format: <file index> <initcall level> <base initcall section name> +sub write_results { + my ($index, $initcalls) = @_; + + # sort by the counter value to ensure the order of initcalls within + # each object file is correct + foreach my $counter (sort { $a <=> $b } keys(%{$initcalls})) { + my $level = $initcalls->{$counter}->{'level'}; + + # section name for the initcall function + my $secname = $initcalls->{$counter}->{'module'} . '__' . + $counter . '_' . + $initcalls->{$counter}->{'line'} . '_' . 
+ $initcalls->{$counter}->{'function'}; + + print "$index $level $secname\n"; + } +} + +## reads a result line from a child process and adds it to the $results array +sub read_results{ + my ($fh) = @_; + + # each child prints out a full line w/ autoflush and exits after the + # last line, so even if buffered I/O blocks here, it shouldn't block + # very long + my $data = <$fh>; + + if (!defined($data)) { + return 0; + } + + chomp($data); + + my ($index, $level, $secname) = $data =~ + /^(\d+)\ ([^\ ]+)\ (.*)$/; + + if (!defined($index) || + !defined($level) || + !defined($secname)) { + die "$0: ERROR: child process returned invalid data: $data\n"; + } + + $index = int($index); + + if (!exists($results->{$index})) { + $results->{$index} = []; + } + + push (@{$results->{$index}}, { + 'level' => $level, + 'secname' => $secname + }); + + return 1; +} + +## finds initcalls from an object file or all object files in an archive, and +## writes results back to the parent process +sub find_initcalls { + my ($index, $file) = @_; + + die "$0: ERROR: file $file doesn't exist?" if (! 
-f $file); + + open(my $fh, "\"$nm\" --defined-only \"$file\" 2>/dev/null |") + or die "$0: ERROR: failed to execute \"$nm\": $!"; + + my $initcalls = {}; + + while (<$fh>) { + chomp; + + # check for the start of a new object file (if processing an + # archive) + my ($path)= $_ =~ /^(.+)\:$/; + + if (defined($path)) { + write_results($index, $initcalls); + $initcalls = {}; + next; + } + + # look for an initcall + my ($module, $counter, $line, $symbol) = $_ =~ + /[a-z]\s+__initcall__(\S*)__(\d+)_(\d+)_(.*)$/; + + if (!defined($module)) { + $module = '' + } + + if (!defined($counter) || + !defined($line) || + !defined($symbol)) { + next; + } + + # parse initcall level + my ($function, $level) = $symbol =~ + /^(.*)((early|rootfs|con|[0-9])s?)$/; + + die "$0: ERROR: invalid initcall name $symbol in $file($path)" + if (!defined($function) || !defined($level)); + + $initcalls->{$counter} = { + 'module' => $module, + 'line' => $line, + 'function' => $function, + 'level' => $level, + }; + } + + close($fh); + write_results($index, $initcalls); +} + +## waits for any child process to complete, reads the results, and adds them to +## the $results array for later processing +sub wait_for_results { + my ($select) = @_; + + my $pid = 0; + do { + # unblock children that may have a full write buffer + foreach my $fh ($select->can_read(0)) { + read_results($fh); + } + + # check for children that have exited, read the remaining data + # from them, and clean up + $pid = waitpid(-1, WNOHANG); + if ($pid > 0) { + if (!exists($jobs->{$pid})) { + next; + } + + my $fh = $jobs->{$pid}; + $select->remove($fh); + + while (read_results($fh)) { + # until eof + } + + close($fh); + delete($jobs->{$pid}); + } + } while ($pid > 0); +} + +## forks a child to process each file passed in the command line and collects +## the results +sub process_files { + my $index = 0; + my $njobs = get_online_processors(); + my $select = IO::Select->new(); + + while (my $file = shift(@ARGV)) { + # fork a child 
process and read it's stdout + my $pid = open(my $fh, '-|'); + + if (!defined($pid)) { + die "$0: ERROR: failed to fork: $!"; + } elsif ($pid) { + # save the child process pid and the file handle + $select->add($fh); + $jobs->{$pid} = $fh; + } else { + # in the child process + STDOUT->autoflush(1); + find_initcalls($index, "$objtree/$file"); + exit; + } + + $index++; + + # limit the number of children to $njobs + if (scalar(keys(%{$jobs})) >= $njobs) { + wait_for_results($select); + } + } + + # wait for the remaining children to complete + while (scalar(keys(%{$jobs})) > 0) { + wait_for_results($select); + } +} + +sub generate_initcall_lds() { + process_files(); + + my $sections = {}; # level -> [ secname, ...] + + # sort results to retain link order and split to sections per + # initcall level + foreach my $index (sort { $a <=> $b } keys(%{$results})) { + foreach my $result (@{$results->{$index}}) { + my $level = $result->{'level'}; + + if (!exists($sections->{$level})) { + $sections->{$level} = []; + } + + push(@{$sections->{$level}}, $result->{'secname'}); + } + } + + die "$0: ERROR: no initcalls?" if (!keys(%{$sections})); + + # print out a linker script that defines the order of initcalls for + # each level + print "SECTIONS {\n"; + + foreach my $level (sort(keys(%{$sections}))) { + my $section; + + if ($level eq 'con') { + $section = '.con_initcall.init'; + } else { + $section = ".initcall${level}.init"; + } + + print "\t${section} : {\n"; + + foreach my $secname (@{$sections->{$level}}) { + print "\t\t*(${section}..${secname}) ;\n"; + } + + print "\t}\n"; + } + + print "}\n"; +} + +generate_initcall_lds(); diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index a352a5ad9ef7..1d5730176bed 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -43,6 +43,16 @@ info() fi } +# Generate a linker script to ensure correct ordering of initcalls. 
+gen_initcalls() +{ + info GEN .tmp_initcalls.lds + + ${srctree}/scripts/generate_initcall_order.pl \ + ${KBUILD_VMLINUX_OBJS} ${KBUILD_VMLINUX_LIBS} \ + > .tmp_initcalls.lds +} + # If CONFIG_LTO_CLANG is selected, collect generated symbol versions into # .tmp_symversions.lds gen_symversions() @@ -74,6 +84,9 @@ modpost_link() --end-group" if [ -n "${CONFIG_LTO_CLANG}" ]; then + gen_initcalls + lds="-T .tmp_initcalls.lds" + if [ -n "${CONFIG_MODVERSIONS}" ]; then gen_symversions lds="${lds} -T .tmp_symversions.lds" @@ -285,6 +298,7 @@ cleanup() { rm -f .btf.* rm -f .tmp_System.map + rm -f .tmp_initcalls.lds rm -f .tmp_symversions.lds rm -f .tmp_vmlinux* rm -f System.map -- 2.28.0.402.g5ffc5be6b7-goog
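The naming scheme and the script's symbol matching can be tried outside the build. What follows is a minimal Python sketch, not part of the patch: it rebuilds the symbol name that `__initcall_name()` would produce and parses an `nm`-style line with the same two regular expressions that `find_initcalls()` uses. The helper names (`kbuild_modname_token`, `initcall_symbol`, `parse_nm_line`) and all sample module, counter, and line values are hypothetical and exist only for illustration.

```python
import re

# The two patterns from find_initcalls() in the Perl script, ported to
# Python's re module. Everything else in this sketch is illustrative.
INITCALL_RE = re.compile(r"[a-z]\s+__initcall__(\S*)__(\d+)_(\d+)_(.*)$")
LEVEL_RE = re.compile(r"^(.*)((early|rootfs|con|[0-9])s?)$")


def kbuild_modname_token(modname):
    """Mimic name-fix-token plus the kmod_ prefix from scripts/Makefile.lib."""
    return "kmod_" + modname.replace("-", "_").replace(",", "_")


def initcall_symbol(modname, counter, line, fn, level_id):
    """Build __initcall__<modname>__<counter>_<line>_<fn><id>."""
    iid = f"{kbuild_modname_token(modname)}__{counter}_{line}_{fn}"
    return f"__initcall__{iid}{level_id}"


def parse_nm_line(text):
    """Return a dict describing one initcall symbol, or None if no match."""
    m = INITCALL_RE.search(text)
    if m is None:
        return None
    module, counter, line, symbol = m.groups()
    # The level suffix (early, rootfs, con, or a digit, optionally followed
    # by "s" for the sync variants) is split off the end of the symbol.
    lm = LEVEL_RE.match(symbol)
    if lm is None:
        raise ValueError(f"invalid initcall name {symbol}")
    return {
        "module": module,
        "counter": int(counter),
        "line": int(line),
        "function": lm.group(1),
        "level": lm.group(2),
    }
```

Because the first group of the level pattern is greedy, a function whose own name ends in a digit still parses as intended: only the final level suffix is split off.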
* Re: [PATCH v2 15/28] init: lto: ensure initcall ordering 2020-09-03 20:30 ` [PATCH v2 15/28] init: lto: ensure initcall ordering Sami Tolvanen @ 2020-09-03 22:40 ` Kees Cook 2020-09-08 21:16 ` Sami Tolvanen 2020-09-10 9:25 ` David Woodhouse 1 sibling, 1 reply; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:40 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:40PM -0700, Sami Tolvanen wrote: > With LTO, the compiler doesn't necessarily obey the link order for > initcalls, and initcall variables need globally unique names to avoid > collisions at link time. > > This change exports __KBUILD_MODNAME and adds the __initcall_id() macro, > which uses it together with __COUNTER__ and __LINE__ to help ensure > these variables have unique names, and moves each variable to its own > section when LTO is enabled, so the correct order can be specified using > a linker script. > > The generate_initcall_order.pl script uses nm to find initcalls from > the object files passed to the linker, and generates a linker script > that specifies the intended order. With LTO, the script is called in > link-vmlinux.sh. I think I asked before about this being made unconditional, but the hit on final link time was noticeable. Am I remembering that right? If so, sure, let's keep it separate.
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > --- > include/linux/init.h | 52 +++++- > scripts/Makefile.lib | 6 +- > scripts/generate_initcall_order.pl | 270 +++++++++++++++++++++++++++++ > scripts/link-vmlinux.sh | 14 ++ > 4 files changed, 333 insertions(+), 9 deletions(-) > create mode 100755 scripts/generate_initcall_order.pl > > diff --git a/include/linux/init.h b/include/linux/init.h > index 212fc9e2f691..af638cd6dd52 100644 > --- a/include/linux/init.h > +++ b/include/linux/init.h > @@ -184,19 +184,57 @@ extern bool initcall_debug; > * as KEEP() in the linker script. > */ > > +/* Format: <modname>__<counter>_<line>_<fn> */ > +#define __initcall_id(fn) \ > + __PASTE(__KBUILD_MODNAME, \ > + __PASTE(__, \ > + __PASTE(__COUNTER__, \ > + __PASTE(_, \ > + __PASTE(__LINE__, \ > + __PASTE(_, fn)))))) > + > +/* Format: __<prefix>__<iid><id> */ > +#define __initcall_name(prefix, __iid, id) \ > + __PASTE(__, \ > + __PASTE(prefix, \ > + __PASTE(__, \ > + __PASTE(__iid, id)))) > + > +#ifdef CONFIG_LTO_CLANG > +/* > + * With LTO, the compiler doesn't necessarily obey link order for > + * initcalls. In order to preserve the correct order, we add each > + * variable into its own section and generate a linker script (in > + * scripts/link-vmlinux.sh) to specify the order of the sections. > + */ > +#define __initcall_section(__sec, __iid) \ > + #__sec ".init.." #__iid > +#else > +#define __initcall_section(__sec, __iid) \ > + #__sec ".init" > +#endif > + > #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS > -#define ___define_initcall(fn, id, __sec) \ > +#define ____define_initcall(fn, __name, __sec) \ > __ADDRESSABLE(fn) \ > - asm(".section \"" #__sec ".init\", \"a\" \n" \ > - "__initcall_" #fn #id ": \n" \ > + asm(".section \"" __sec "\", \"a\" \n" \ > + __stringify(__name) ": \n" \ > ".long " #fn " - . 
\n" \ > ".previous \n"); > #else > -#define ___define_initcall(fn, id, __sec) \ > - static initcall_t __initcall_##fn##id __used \ > - __attribute__((__section__(#__sec ".init"))) = fn; > +#define ____define_initcall(fn, __name, __sec) \ > + static initcall_t __name __used \ > + __attribute__((__section__(__sec))) = fn; > #endif > > +#define __unique_initcall(fn, id, __sec, __iid) \ > + ____define_initcall(fn, \ > + __initcall_name(initcall, __iid, id), \ > + __initcall_section(__sec, __iid)) > + > +#define ___define_initcall(fn, id, __sec) \ > + __unique_initcall(fn, id, __sec, __initcall_id(fn)) > + > #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id) > > /* > @@ -236,7 +274,7 @@ extern bool initcall_debug; > #define __exitcall(fn) \ > static exitcall_t __exitcall_##fn __exit_call = fn > > -#define console_initcall(fn) ___define_initcall(fn,, .con_initcall) > +#define console_initcall(fn) ___define_initcall(fn, con, .con_initcall) > > struct obs_kernel_param { > const char *str; > diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib > index 3d599716940c..7e382d12d309 100644 > --- a/scripts/Makefile.lib > +++ b/scripts/Makefile.lib > @@ -117,9 +117,11 @@ target-stem = $(basename $(patsubst $(obj)/%,%,$@)) > # These flags are needed for modversions and compiling, so we define them here > # $(modname_flags) defines KBUILD_MODNAME as the name of the module it will > # end up in (or would, if it gets compiled in) > -name-fix = $(call stringify,$(subst $(comma),_,$(subst -,_,$1))) > +name-fix-token = $(subst $(comma),_,$(subst -,_,$1)) > +name-fix = $(call stringify,$(call name-fix-token,$1)) > basename_flags = -DKBUILD_BASENAME=$(call name-fix,$(basetarget)) > -modname_flags = -DKBUILD_MODNAME=$(call name-fix,$(modname)) > +modname_flags = -DKBUILD_MODNAME=$(call name-fix,$(modname)) \ > + -D__KBUILD_MODNAME=kmod_$(call name-fix-token,$(modname)) > modfile_flags = -DKBUILD_MODFILE=$(call stringify,$(modfile)) > > _c_flags = $(filter-out 
$(CFLAGS_REMOVE_$(target-stem).o), \ > diff --git a/scripts/generate_initcall_order.pl b/scripts/generate_initcall_order.pl > new file mode 100755 > index 000000000000..fe83aec2b51e > --- /dev/null > +++ b/scripts/generate_initcall_order.pl > @@ -0,0 +1,270 @@ > +#!/usr/bin/env perl > +# SPDX-License-Identifier: GPL-2.0 > +# > +# Generates a linker script that specifies the correct initcall order. > +# > +# Copyright (C) 2019 Google LLC > + > +use strict; > +use warnings; > +use IO::Handle; > +use IO::Select; > +use POSIX ":sys_wait_h"; > + > +my $nm = $ENV{'NM'} || die "$0: ERROR: NM not set?"; > +my $objtree = $ENV{'objtree'} || '.'; > + > +## currently active child processes > +my $jobs = {}; # child process pid -> file handle > +## results from child processes > +my $results = {}; # object index -> [ { level, secname }, ... ] > + > +## reads _NPROCESSORS_ONLN to determine the maximum number of processes to > +## start > +sub get_online_processors { > + open(my $fh, "getconf _NPROCESSORS_ONLN 2>/dev/null |") > + or die "$0: ERROR: failed to execute getconf: $!"; > + my $procs = <$fh>; > + close($fh); > + > + if (!($procs =~ /^\d+$/)) { > + return 1; > + } > + > + return int($procs); > +} > + > +## writes results to the parent process > +## format: <file index> <initcall level> <base initcall section name> > +sub write_results { > + my ($index, $initcalls) = @_; > + > + # sort by the counter value to ensure the order of initcalls within > + # each object file is correct > + foreach my $counter (sort { $a <=> $b } keys(%{$initcalls})) { > + my $level = $initcalls->{$counter}->{'level'}; > + > + # section name for the initcall function > + my $secname = $initcalls->{$counter}->{'module'} . '__' . > + $counter . '_' . > + $initcalls->{$counter}->{'line'} . '_' . 
> + $initcalls->{$counter}->{'function'}; > + > + print "$index $level $secname\n"; > + } > +} > + > +## reads a result line from a child process and adds it to the $results array > +sub read_results{ > + my ($fh) = @_; > + > + # each child prints out a full line w/ autoflush and exits after the > + # last line, so even if buffered I/O blocks here, it shouldn't block > + # very long > + my $data = <$fh>; > + > + if (!defined($data)) { > + return 0; > + } > + > + chomp($data); > + > + my ($index, $level, $secname) = $data =~ > + /^(\d+)\ ([^\ ]+)\ (.*)$/; > + > + if (!defined($index) || > + !defined($level) || > + !defined($secname)) { > + die "$0: ERROR: child process returned invalid data: $data\n"; > + } > + > + $index = int($index); > + > + if (!exists($results->{$index})) { > + $results->{$index} = []; > + } > + > + push (@{$results->{$index}}, { > + 'level' => $level, > + 'secname' => $secname > + }); > + > + return 1; > +} > + > +## finds initcalls from an object file or all object files in an archive, and > +## writes results back to the parent process > +sub find_initcalls { > + my ($index, $file) = @_; > + > + die "$0: ERROR: file $file doesn't exist?" if (! 
-f $file); > + > + open(my $fh, "\"$nm\" --defined-only \"$file\" 2>/dev/null |") > + or die "$0: ERROR: failed to execute \"$nm\": $!"; > + > + my $initcalls = {}; > + > + while (<$fh>) { > + chomp; > + > + # check for the start of a new object file (if processing an > + # archive) > + my ($path)= $_ =~ /^(.+)\:$/; > + > + if (defined($path)) { > + write_results($index, $initcalls); > + $initcalls = {}; > + next; > + } > + > + # look for an initcall > + my ($module, $counter, $line, $symbol) = $_ =~ > + /[a-z]\s+__initcall__(\S*)__(\d+)_(\d+)_(.*)$/; > + > + if (!defined($module)) { > + $module = '' > + } > + > + if (!defined($counter) || > + !defined($line) || > + !defined($symbol)) { > + next; > + } > + > + # parse initcall level > + my ($function, $level) = $symbol =~ > + /^(.*)((early|rootfs|con|[0-9])s?)$/; > + > + die "$0: ERROR: invalid initcall name $symbol in $file($path)" > + if (!defined($function) || !defined($level)); > + > + $initcalls->{$counter} = { > + 'module' => $module, > + 'line' => $line, > + 'function' => $function, > + 'level' => $level, > + }; > + } > + > + close($fh); > + write_results($index, $initcalls); > +} > + > +## waits for any child process to complete, reads the results, and adds them to > +## the $results array for later processing > +sub wait_for_results { > + my ($select) = @_; > + > + my $pid = 0; > + do { > + # unblock children that may have a full write buffer > + foreach my $fh ($select->can_read(0)) { > + read_results($fh); > + } > + > + # check for children that have exited, read the remaining data > + # from them, and clean up > + $pid = waitpid(-1, WNOHANG); > + if ($pid > 0) { > + if (!exists($jobs->{$pid})) { > + next; > + } > + > + my $fh = $jobs->{$pid}; > + $select->remove($fh); > + > + while (read_results($fh)) { > + # until eof > + } > + > + close($fh); > + delete($jobs->{$pid}); > + } > + } while ($pid > 0); > +} > + > +## forks a child to process each file passed in the command line and collects > +## the 
results > +sub process_files { > + my $index = 0; > + my $njobs = get_online_processors(); > + my $select = IO::Select->new(); > + > + while (my $file = shift(@ARGV)) { > + # fork a child process and read it's stdout > + my $pid = open(my $fh, '-|'); /me makes noises about make -jN and the jobserver and not using all processors on a machine if we were asked nicely not to. I wrote a jobserver aware tool for the documentation builds, but it's in python (scripts/jobserver-exec). Instead of reinventing that wheel (and in Perl), we could: 1) ignore this problem and assume anyone using LTO is fine with using all CPUs 2) implement a jobserver-aware Perl script to do this 3) make Python a build dependency of CONFIG_LTO and re-use scripts/jobserver-exec > + > + if (!defined($pid)) { > + die "$0: ERROR: failed to fork: $!"; > + } elsif ($pid) { > + # save the child process pid and the file handle > + $select->add($fh); > + $jobs->{$pid} = $fh; > + } else { > + # in the child process > + STDOUT->autoflush(1); > + find_initcalls($index, "$objtree/$file"); > + exit; > + } > + > + $index++; > + > + # limit the number of children to $njobs > + if (scalar(keys(%{$jobs})) >= $njobs) { > + wait_for_results($select); > + } > + } > + > + # wait for the remaining children to complete > + while (scalar(keys(%{$jobs})) > 0) { > + wait_for_results($select); > + } > +} > + > +sub generate_initcall_lds() { > + process_files(); > + > + my $sections = {}; # level -> [ secname, ...] > + > + # sort results to retain link order and split to sections per > + # initcall level > + foreach my $index (sort { $a <=> $b } keys(%{$results})) { > + foreach my $result (@{$results->{$index}}) { > + my $level = $result->{'level'}; > + > + if (!exists($sections->{$level})) { > + $sections->{$level} = []; > + } > + > + push(@{$sections->{$level}}, $result->{'secname'}); > + } > + } > + > + die "$0: ERROR: no initcalls?" 
if (!keys(%{$sections})); > + > + # print out a linker script that defines the order of initcalls for > + # each level > + print "SECTIONS {\n"; > + > + foreach my $level (sort(keys(%{$sections}))) { > + my $section; > + > + if ($level eq 'con') { > + $section = '.con_initcall.init'; > + } else { > + $section = ".initcall${level}.init"; > + } > + > + print "\t${section} : {\n"; > + > + foreach my $secname (@{$sections->{$level}}) { > + print "\t\t*(${section}..${secname}) ;\n"; > + } > + > + print "\t}\n"; > + } > + > + print "}\n"; > +} > + > +generate_initcall_lds(); > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh > index a352a5ad9ef7..1d5730176bed 100755 > --- a/scripts/link-vmlinux.sh > +++ b/scripts/link-vmlinux.sh > @@ -43,6 +43,16 @@ info() > fi > } > > +# Generate a linker script to ensure correct ordering of initcalls. > +gen_initcalls() > +{ > + info GEN .tmp_initcalls.lds > + > + ${srctree}/scripts/generate_initcall_order.pl \ > + ${KBUILD_VMLINUX_OBJS} ${KBUILD_VMLINUX_LIBS} \ > + > .tmp_initcalls.lds > +} > + > # If CONFIG_LTO_CLANG is selected, collect generated symbol versions into > # .tmp_symversions.lds > gen_symversions() > @@ -74,6 +84,9 @@ modpost_link() > --end-group" > > if [ -n "${CONFIG_LTO_CLANG}" ]; then > + gen_initcalls > + lds="-T .tmp_initcalls.lds" Oh, I think lds should be explicitly a "local" at the start of this function, perhaps back in the symversions patch that touches this? 
> + > if [ -n "${CONFIG_MODVERSIONS}" ]; then > gen_symversions > lds="${lds} -T .tmp_symversions.lds" > @@ -285,6 +298,7 @@ cleanup() > { > rm -f .btf.* > rm -f .tmp_System.map > + rm -f .tmp_initcalls.lds > rm -f .tmp_symversions.lds > rm -f .tmp_vmlinux* > rm -f System.map > -- > 2.28.0.402.g5ffc5be6b7-goog > -- Kees Cook
* Re: [PATCH v2 15/28] init: lto: ensure initcall ordering 2020-09-03 22:40 ` Kees Cook @ 2020-09-08 21:16 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-08 21:16 UTC (permalink / raw) To: Kees Cook Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 03:40:31PM -0700, Kees Cook wrote: > On Thu, Sep 03, 2020 at 01:30:40PM -0700, Sami Tolvanen wrote: > > With LTO, the compiler doesn't necessarily obey the link order for > > initcalls, and initcall variables need globally unique names to avoid > > collisions at link time. > > > > This change exports __KBUILD_MODNAME and adds the __initcall_id() macro, > > which uses it together with __COUNTER__ and __LINE__ to help ensure > > these variables have unique names, and moves each variable to its own > > section when LTO is enabled, so the correct order can be specified using > > a linker script. > > > > The generate_initcall_order.pl script uses nm to find initcalls from > > the object files passed to the linker, and generates a linker script > > that specifies the intended order. With LTO, the script is called in > > link-vmlinux.sh. > > I think I asked before about this being made unconditional, but the hit > on final link time was noticeable. Am I remembering that right? If so, > sure, let's keep it separate. Yes, it was noticeable when compiling on systems with fewer CPU cores, so I would prefer to keep it separate.
> > +## forks a child to process each file passed in the command line and collects > > +## the results > > +sub process_files { > > + my $index = 0; > > + my $njobs = get_online_processors(); > > + my $select = IO::Select->new(); > > + > > + while (my $file = shift(@ARGV)) { > > + # fork a child process and read it's stdout > > + my $pid = open(my $fh, '-|'); > > /me makes noises about make -jN and the jobserver and not using all > processors on a machine if we were asked nicely not to. > > I wrote a jobserver aware tool for the documentation builds, but it's in > python (scripts/jobserver-exec). Instead of reinventing that wheel (and > in Perl), we could: > > 1) ignore this problem and assume anyone using LTO is fine with using all CPUs > > 2) implement a jobserver-aware Perl script to do this > > 3) make Python a build dependency of CONFIG_LTO and re-use scripts/jobserver-exec I'm fine with any of these options, although I'm not sure why anyone would want to compile an LTO kernel without using all the available cores... :) Using jobserver-exec seems like the easiest option if we want to limit the number of cores used here. Any preferences? > > # If CONFIG_LTO_CLANG is selected, collect generated symbol versions into > > # .tmp_symversions.lds > > gen_symversions() > > @@ -74,6 +84,9 @@ modpost_link() > > --end-group" > > > > if [ -n "${CONFIG_LTO_CLANG}" ]; then > > + gen_initcalls > > + lds="-T .tmp_initcalls.lds" > > Oh, I think lds should be explicitly a "local" at the start of this > function, perhaps back in the symversions patch that touches this? It's already local, that part is just not visible in this patch. Sami
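For readers weighing the jobserver options above, the first step any jobserver-aware helper needs is to find out whether make handed it a jobserver at all. The following is a rough Python sketch of that detection step only, assuming the `--jobserver-auth=R,W`, `--jobserver-auth=fifo:PATH`, and older `--jobserver-fds=R,W` spellings that GNU make places in `MAKEFLAGS`. The function name `jobserver_auth` is hypothetical, and this is not a description of how scripts/jobserver-exec is implemented. Claiming and returning job tokens is deliberately left out.

```python
import re


def jobserver_auth(makeflags):
    """Describe the jobserver advertised in a MAKEFLAGS string.

    Returns ("pipe", read_fd, write_fd), ("fifo", path), or None when no
    jobserver option is present. Illustrative helper, not kernel code.
    """
    m = re.search(r"--jobserver-(?:auth|fds)=(\S+)", makeflags)
    if m is None:
        return None
    value = m.group(1)
    if value.startswith("fifo:"):
        # Named-pipe form used by newer GNU make releases.
        return ("fifo", value[len("fifo:"):])
    # Classic form: a pair of inherited file descriptors.
    read_fd, write_fd = value.split(",")
    return ("pipe", int(read_fd), int(write_fd))
```

A caller would typically pass `os.environ.get("MAKEFLAGS", "")` and fall back to the online CPU count, as the Perl script does today, when the result is `None`.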
* Re: [PATCH v2 15/28] init: lto: ensure initcall ordering 2020-09-03 20:30 ` [PATCH v2 15/28] init: lto: ensure initcall ordering Sami Tolvanen 2020-09-03 22:40 ` Kees Cook @ 2020-09-10 9:25 ` David Woodhouse 2020-09-10 15:07 ` Sami Tolvanen 1 sibling, 1 reply; 212+ messages in thread From: David Woodhouse @ 2020-09-10 9:25 UTC (permalink / raw) To: Sami Tolvanen, Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, linux-arm-kernel On Thu, 2020-09-03 at 13:30 -0700, Sami Tolvanen wrote: > With LTO, the compiler doesn't necessarily obey the link order for > initcalls, and initcall variables need globally unique names to avoid > collisions at link time. > > This change exports __KBUILD_MODNAME and adds the __initcall_id() macro, > which uses it together with __COUNTER__ and __LINE__ to help ensure > these variables have unique names, and moves each variable to its own > section when LTO is enabled, so the correct order can be specified using > a linker script. > > The generate_initcall_order.pl script uses nm to find initcalls from > the object files passed to the linker, and generates a linker script > that specifies the intended order. With LTO, the script is called in > link-vmlinux.sh. Is this guaranteed to give you the *same* initcall ordering with LTO as without?
* Re: [PATCH v2 15/28] init: lto: ensure initcall ordering 2020-09-10 9:25 ` David Woodhouse @ 2020-09-10 15:07 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-10 15:07 UTC (permalink / raw) To: David Woodhouse Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 10, 2020 at 10:25:40AM +0100, David Woodhouse wrote: > On Thu, 2020-09-03 at 13:30 -0700, Sami Tolvanen wrote: > > With LTO, the compiler doesn't necessarily obey the link order for > > initcalls, and initcall variables need globally unique names to avoid > > collisions at link time. > > > > This change exports __KBUILD_MODNAME and adds the __initcall_id() macro, > > which uses it together with __COUNTER__ and __LINE__ to help ensure > > these variables have unique names, and moves each variable to its own > > section when LTO is enabled, so the correct order can be specified using > > a linker script. > > > > The generate_initcall_order.pl script uses nm to find initcalls from > > the object files passed to the linker, and generates a linker script > > that specifies the intended order. With LTO, the script is called in > > link-vmlinux.sh. > > Is this guaranteed to give you the *same* initcall ordering with LTO as > without? Yes. It follows the link order, just like the linker without LTO, and also preserves the order within each file. Sami
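The ordering rule stated in this reply (link order across files, `__COUNTER__` order within a file) can be made concrete with a small model of what `generate_initcall_lds()` emits. This is a simplified Python sketch rather than the script itself: it skips the forking and the `nm` parsing, and takes already-parsed initcalls as plain dictionaries. The function name `initcall_linker_script` and the sample data used with it are hypothetical.

```python
def initcall_linker_script(objects):
    """Emit a SECTIONS block that pins the initcall order.

    objects: a list in link order; each item is the list of initcalls found
    in one object file, as dicts with the keys 'module', 'counter', 'line',
    'function' and 'level'.
    """
    sections = {}  # level -> [secname, ...]
    for initcalls in objects:  # outer loop keeps the link order of the files
        # Inner sort restores declaration order within one file.
        for ic in sorted(initcalls, key=lambda i: i["counter"]):
            secname = (f"{ic['module']}__{ic['counter']}_"
                       f"{ic['line']}_{ic['function']}")
            sections.setdefault(ic["level"], []).append(secname)

    out = ["SECTIONS {"]
    for level in sorted(sections):
        if level == "con":
            section = ".con_initcall.init"
        else:
            section = f".initcall{level}.init"
        out.append(f"\t{section} : {{")
        for secname in sections[level]:
            out.append(f"\t\t*({section}..{secname}) ;")
        out.append("\t}")
    out.append("}")
    return "\n".join(out) + "\n"
```

Each output line names exactly one per-initcall input section, so the linker places the entries in the listed order regardless of how LTO arranged them in the generated object.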
* [PATCH v2 16/28] init: lto: fix PREL32 relocations 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (14 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 15/28] init: lto: ensure initcall ordering Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:41 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 17/28] PCI: Fix PREL32 relocations for LTO Sami Tolvanen ` (16 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With LTO, the compiler can rename static functions to avoid global naming collisions. As initcall functions are typically static, renaming can break references to them in inline assembly. This change adds a global stub with a stable name for each initcall to fix the issue when PREL32 relocations are used. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- include/linux/init.h | 31 +++++++++++++++++++++++++++---- 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/include/linux/init.h b/include/linux/init.h index af638cd6dd52..cea63f7e7705 100644 --- a/include/linux/init.h +++ b/include/linux/init.h @@ -209,26 +209,49 @@ extern bool initcall_debug; */ #define __initcall_section(__sec, __iid) \ #__sec ".init.." #__iid + +/* + * With LTO, the compiler can rename static functions to avoid + * global naming collisions. We use a global stub function for + * initcalls to create a stable symbol name whose address can be + * taken in inline assembly when PREL32 relocations are used. 
+ */ +#define __initcall_stub(fn, __iid, id) \ + __initcall_name(initstub, __iid, id) + +#define __define_initcall_stub(__stub, fn) \ + int __init __stub(void); \ + int __init __stub(void) \ + { \ + return fn(); \ + } \ + __ADDRESSABLE(__stub) #else #define __initcall_section(__sec, __iid) \ #__sec ".init" + +#define __initcall_stub(fn, __iid, id) fn + +#define __define_initcall_stub(__stub, fn) \ + __ADDRESSABLE(fn) #endif #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS -#define ____define_initcall(fn, __name, __sec) \ - __ADDRESSABLE(fn) \ +#define ____define_initcall(fn, __stub, __name, __sec) \ + __define_initcall_stub(__stub, fn) \ asm(".section \"" __sec "\", \"a\" \n" \ __stringify(__name) ": \n" \ - ".long " #fn " - . \n" \ + ".long " __stringify(__stub) " - . \n" \ ".previous \n"); #else -#define ____define_initcall(fn, __name, __sec) \ +#define ____define_initcall(fn, __unused, __name, __sec) \ static initcall_t __name __used \ __attribute__((__section__(__sec))) = fn; #endif #define __unique_initcall(fn, id, __sec, __iid) \ ____define_initcall(fn, \ + __initcall_stub(fn, __iid, id), \ __initcall_name(initcall, __iid, id), \ __initcall_section(__sec, __iid)) -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 16/28] init: lto: fix PREL32 relocations 2020-09-03 20:30 ` [PATCH v2 16/28] init: lto: fix PREL32 relocations Sami Tolvanen @ 2020-09-03 22:41 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:41 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:41PM -0700, 'Sami Tolvanen' via Clang Built Linux wrote: > With LTO, the compiler can rename static functions to avoid global > naming collisions. As initcall functions are typically static, > renaming can break references to them in inline assembly. This > change adds a global stub with a stable name for each initcall to > fix the issue when PREL32 relocations are used. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> This was a Delight(tm) to get right. Thanks for finding the right magic here. :) Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
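For readers following the thread: the ".long <stub> - ." directive in the patch above stores a place-relative 32-bit offset rather than an absolute pointer, which is why the assembler needs a symbol name that survives LTO renaming. Below is a minimal userspace sketch of that encoding, not code from the series. The helper names rel32_store() and rel32_load() are hypothetical, and the sketch assumes the slot and its target lie within 2 GiB of each other.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not from the patch). The slot holds the signed
 * 32-bit distance from the slot's own address to the target, which is
 * the same quantity that ".long sym - ." asks the assembler to emit.
 */
static inline void rel32_store(int32_t *slot, const void *target)
{
	*slot = (int32_t)((intptr_t)target - (intptr_t)slot);
}

/* Recover the absolute address: slot address plus the stored offset. */
static inline void *rel32_load(const int32_t *slot)
{
	return (void *)((intptr_t)slot + *slot);
}
```

Because the offset is resolved against the slot's own location, the entry needs no absolute relocation and takes 4 bytes instead of 8 on 64-bit targets. The price is that the referenced symbol must be nameable from inline assembly, which is what the global stub provides.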
* [PATCH v2 17/28] PCI: Fix PREL32 relocations for LTO 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (15 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 16/28] init: lto: fix PREL32 relocations Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:42 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 18/28] modpost: lto: strip .lto from module names Sami Tolvanen ` (15 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With Clang's Link Time Optimization (LTO), the compiler can rename static functions to avoid global naming collisions. As PCI fixup functions are typically static, renaming can break references to them in inline assembly. This change adds a global stub to DECLARE_PCI_FIXUP_SECTION to fix the issue when PREL32 relocations are used. 
Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> --- include/linux/pci.h | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/include/linux/pci.h b/include/linux/pci.h index 835530605c0d..4e64421981c7 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1909,19 +1909,28 @@ enum pci_fixup_pass { }; #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS -#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ - class_shift, hook) \ - __ADDRESSABLE(hook) \ +#define ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ + class_shift, hook, stub) \ + void stub(struct pci_dev *dev); \ + void stub(struct pci_dev *dev) \ + { \ + hook(dev); \ + } \ asm(".section " #sec ", \"a\" \n" \ ".balign 16 \n" \ ".short " #vendor ", " #device " \n" \ ".long " #class ", " #class_shift " \n" \ - ".long " #hook " - . \n" \ + ".long " #stub " - . \n" \ ".previous \n"); + +#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ + class_shift, hook, stub) \ + ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ + class_shift, hook, stub) #define DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ class_shift, hook) \ __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \ - class_shift, hook) + class_shift, hook, __UNIQUE_ID(hook)) #else /* Anonymous variables would be nice... */ #define DECLARE_PCI_FIXUP_SECTION(section, name, vendor, device, class, \ -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 17/28] PCI: Fix PREL32 relocations for LTO 2020-09-03 20:30 ` [PATCH v2 17/28] PCI: Fix PREL32 relocations for LTO Sami Tolvanen @ 2020-09-03 22:42 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:42 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:42PM -0700, Sami Tolvanen wrote: > With Clang's Link Time Optimization (LTO), the compiler can rename > static functions to avoid global naming collisions. As PCI fixup > functions are typically static, renaming can break references > to them in inline assembly. This change adds a global stub to > DECLARE_PCI_FIXUP_SECTION to fix the issue when PREL32 relocations > are used. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
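A note on the extra ___DECLARE_PCI_FIXUP_SECTION layer in the patch above: the commit message does not explain it, but one likely reason is standard preprocessor behaviour. An argument used with '#' is stringified without being macro-expanded, so __UNIQUE_ID(hook) has to pass through one forwarding macro before "#stub" can see the generated name. The sketch below shows that behaviour in isolation. All macro and function names in it are hypothetical, with STUB_NAME standing in for __UNIQUE_ID(hook).

```c
/* '#' stringifies its argument exactly as written, with no expansion. */
#define STR_DIRECT(x)   #x

/* Forwarding through another macro expands the argument first. */
#define STR_EXPANDED(x) STR_DIRECT(x)

/* Hypothetical stand-in for the identifier __UNIQUE_ID(hook) would produce. */
#define STUB_NAME       fixup_stub_0

/* Yields the macro's own name, which is not a usable symbol. */
static inline const char *name_direct(void)
{
	return STR_DIRECT(STUB_NAME);
}

/* Yields the expanded identifier, suitable for use inside the asm string. */
static inline const char *name_expanded(void)
{
	return STR_EXPANDED(STUB_NAME);
}
```

With a single macro layer, the asm string would refer to the unexpanded text instead of the generated stub symbol.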
* [PATCH v2 18/28] modpost: lto: strip .lto from module names 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (16 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 17/28] PCI: Fix PREL32 relocations for LTO Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:42 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 19/28] scripts/mod: disable LTO for empty.c Sami Tolvanen ` (14 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With LTO, everything is compiled into LLVM bitcode, so we have to link each module into native code before modpost. Kbuild uses the .lto.o suffix for these files, which also ends up in module information. This change strips the unnecessary .lto suffix from the module name. Suggested-by: Bill Wendling <morbo@google.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- scripts/mod/modpost.c | 16 +++++++--------- scripts/mod/modpost.h | 9 +++++++++ scripts/mod/sumversion.c | 6 +++++- 3 files changed, 21 insertions(+), 10 deletions(-) diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c index 69341b36f271..5a329df55cc3 100644 --- a/scripts/mod/modpost.c +++ b/scripts/mod/modpost.c @@ -17,7 +17,6 @@ #include <ctype.h> #include <string.h> #include <limits.h> -#include <stdbool.h> #include <errno.h> #include "modpost.h" #include "../../include/linux/license.h" @@ -80,14 +79,6 @@ modpost_log(enum loglevel loglevel, const char *fmt, ...) 
exit(1); } -static inline bool strends(const char *str, const char *postfix) -{ - if (strlen(str) < strlen(postfix)) - return false; - - return strcmp(str + strlen(str) - strlen(postfix), postfix) == 0; -} - void *do_nofail(void *ptr, const char *expr) { if (!ptr) @@ -1984,6 +1975,10 @@ static char *remove_dot(char *s) size_t m = strspn(s + n + 1, "0123456789"); if (m && (s[n + m] == '.' || s[n + m] == 0)) s[n] = 0; + + /* strip trailing .lto */ + if (strends(s, ".lto")) + s[strlen(s) - 4] = '\0'; } return s; } @@ -2007,6 +2002,9 @@ static void read_symbols(const char *modname) /* strip trailing .o */ tmp = NOFAIL(strdup(modname)); tmp[strlen(tmp) - 2] = '\0'; + /* strip trailing .lto */ + if (strends(tmp, ".lto")) + tmp[strlen(tmp) - 4] = '\0'; mod = new_module(tmp); free(tmp); } diff --git a/scripts/mod/modpost.h b/scripts/mod/modpost.h index 3aa052722233..fab30d201f9e 100644 --- a/scripts/mod/modpost.h +++ b/scripts/mod/modpost.h @@ -2,6 +2,7 @@ #include <stdio.h> #include <stdlib.h> #include <stdarg.h> +#include <stdbool.h> #include <string.h> #include <sys/types.h> #include <sys/stat.h> @@ -180,6 +181,14 @@ static inline unsigned int get_secindex(const struct elf_info *info, return info->symtab_shndx_start[sym - info->symtab_start]; } +static inline bool strends(const char *str, const char *postfix) +{ + if (strlen(str) < strlen(postfix)) + return false; + + return strcmp(str + strlen(str) - strlen(postfix), postfix) == 0; +} + /* file2alias.c */ extern unsigned int cross_build; void handle_moddevtable(struct module *mod, struct elf_info *info, diff --git a/scripts/mod/sumversion.c b/scripts/mod/sumversion.c index d587f40f1117..760e6baa7eda 100644 --- a/scripts/mod/sumversion.c +++ b/scripts/mod/sumversion.c @@ -391,10 +391,14 @@ void get_src_version(const char *modname, char sum[], unsigned sumlen) struct md4_ctx md; char *fname; char filelist[PATH_MAX + 1]; + int postfix_len = 1; + + if (strends(modname, ".lto.o")) + postfix_len = 5; /* objects for a module 
are listed in the first line of *.mod file. */ snprintf(filelist, sizeof(filelist), "%.*smod", - (int)strlen(modname) - 1, modname); + (int)strlen(modname) - postfix_len, modname); buf = read_text_file(filelist); -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 18/28] modpost: lto: strip .lto from module names 2020-09-03 20:30 ` [PATCH v2 18/28] modpost: lto: strip .lto from module names Sami Tolvanen @ 2020-09-03 22:42 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:42 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:43PM -0700, Sami Tolvanen wrote: > With LTO, everything is compiled into LLVM bitcode, so we have to link > each module into native code before modpost. Kbuild uses the .lto.o > suffix for these files, which also ends up in module information. This > change strips the unnecessary .lto suffix from the module name. > > Suggested-by: Bill Wendling <morbo@google.com> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
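The suffix arithmetic in the patch above is easy to misread in diff form, so here is a small standalone sketch of it. strends() is copied from the patch. strip_module_suffix() and mod_list_name() are hypothetical wrappers, written only to isolate the logic added to read_symbols() and get_src_version(); they are not functions in modpost.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* As in the patch: true if str ends with postfix. */
static inline bool strends(const char *str, const char *postfix)
{
	if (strlen(str) < strlen(postfix))
		return false;

	return strcmp(str + strlen(str) - strlen(postfix), postfix) == 0;
}

/* Hypothetical wrapper: strip a trailing ".o", then a trailing ".lto". */
static inline void strip_module_suffix(char *name)
{
	if (strends(name, ".o"))
		name[strlen(name) - 2] = '\0';
	if (strends(name, ".lto"))
		name[strlen(name) - 4] = '\0';
}

/*
 * Hypothetical wrapper around the sumversion.c change: drop "o" (1 char)
 * or "lto.o" (5 chars) but keep the preceding dot, then append "mod".
 */
static inline void mod_list_name(const char *modname, char *out, size_t outlen)
{
	int postfix_len = strends(modname, ".lto.o") ? 5 : 1;

	snprintf(out, outlen, "%.*smod",
		 (int)strlen(modname) - postfix_len, modname);
}
```

Under these assumptions both "bar.o" and "bar.lto.o" map to the module name "bar" and to the file list "bar.mod", which is the behaviour the commit message describes.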
* [PATCH v2 19/28] scripts/mod: disable LTO for empty.c 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (17 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 18/28] modpost: lto: strip .lto from module names Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:43 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 20/28] efi/libstub: disable LTO Sami Tolvanen ` (13 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With CONFIG_LTO_CLANG, clang generates LLVM IR instead of ELF object files. As empty.o is used for probing target properties, disable LTO for it to produce an object file instead. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- scripts/mod/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile index 78071681d924..c9e38ad937fd 100644 --- a/scripts/mod/Makefile +++ b/scripts/mod/Makefile @@ -1,5 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 OBJECT_FILES_NON_STANDARD := y +CFLAGS_REMOVE_empty.o += $(CC_FLAGS_LTO) hostprogs-always-y += modpost mk_elfconfig always-y += empty.o -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 19/28] scripts/mod: disable LTO for empty.c 2020-09-03 20:30 ` [PATCH v2 19/28] scripts/mod: disable LTO for empty.c Sami Tolvanen @ 2020-09-03 22:43 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:43 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:44PM -0700, Sami Tolvanen wrote: > With CONFIG_LTO_CLANG, clang generates LLVM IR instead of ELF object > files. As empty.o is used for probing target properties, disable LTO > for it to produce an object file instead. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
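Several patches in this series disable LTO because a later build step expects an ELF object and gets LLVM bitcode instead. The sketch below shows how the two can be told apart by their leading bytes. It illustrates the failure mode only and is not code from the kernel tree; the enum and function names are hypothetical. The magic values are the standard ELF identifier (0x7f 'E' 'L' 'F') and the raw LLVM bitcode magic ('B' 'C' 0xC0 0xDE).

```c
#include <stddef.h>

enum obj_kind { OBJ_UNKNOWN, OBJ_ELF, OBJ_LLVM_BITCODE };

/* Hypothetical helper: classify a buffer by its first four bytes. */
static inline enum obj_kind classify_object(const unsigned char *buf, size_t len)
{
	if (len < 4)
		return OBJ_UNKNOWN;

	/* ELF identification bytes: 0x7f 'E' 'L' 'F' */
	if (buf[0] == 0x7f && buf[1] == 'E' && buf[2] == 'L' && buf[3] == 'F')
		return OBJ_ELF;

	/* Raw LLVM bitcode magic: 'B' 'C' 0xC0 0xDE */
	if (buf[0] == 'B' && buf[1] == 'C' && buf[2] == 0xC0 && buf[3] == 0xDE)
		return OBJ_LLVM_BITCODE;

	return OBJ_UNKNOWN;
}
```

A tool that reads target properties from an ELF header has nothing to parse in the second case, hence the per-object CFLAGS_REMOVE of $(CC_FLAGS_LTO) for empty.o.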
* [PATCH v2 20/28] efi/libstub: disable LTO 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (18 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 19/28] scripts/mod: disable LTO for empty.c Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:43 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 21/28] drivers/misc/lkdtm: disable LTO for rodata.o Sami Tolvanen ` (12 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel With CONFIG_LTO_CLANG, we produce LLVM bitcode instead of ELF object files. Since LTO is not really needed here and the Makefile assumes we produce an object file, disable LTO for libstub. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- drivers/firmware/efi/libstub/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile index 296b18fbd7a2..0ea5aa52c7fa 100644 --- a/drivers/firmware/efi/libstub/Makefile +++ b/drivers/firmware/efi/libstub/Makefile @@ -35,6 +35,8 @@ KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \ # remove SCS flags from all objects in this directory KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS)) +# disable LTO +KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO), $(KBUILD_CFLAGS)) GCOV_PROFILE := n # Sanitizer runtimes are unavailable and cannot be linked here. -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 20/28] efi/libstub: disable LTO 2020-09-03 20:30 ` [PATCH v2 20/28] efi/libstub: disable LTO Sami Tolvanen @ 2020-09-03 22:43 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:43 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:45PM -0700, Sami Tolvanen wrote: > With CONFIG_LTO_CLANG, we produce LLVM bitcode instead of ELF object > files. Since LTO is not really needed here and the Makefile assumes we > produce an object file, disable LTO for libstub. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 21/28] drivers/misc/lkdtm: disable LTO for rodata.o 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (19 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 20/28] efi/libstub: disable LTO Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 20:30 ` [PATCH v2 22/28] arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY Sami Tolvanen ` (11 subsequent siblings) 32 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel Disable LTO for rodata.o to allow objcopy to be used to manipulate sections. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Kees Cook <keescook@chromium.org> --- drivers/misc/lkdtm/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/misc/lkdtm/Makefile b/drivers/misc/lkdtm/Makefile index c70b3822013f..dd4c936d4d73 100644 --- a/drivers/misc/lkdtm/Makefile +++ b/drivers/misc/lkdtm/Makefile @@ -13,6 +13,7 @@ lkdtm-$(CONFIG_LKDTM) += cfi.o KASAN_SANITIZE_stackleak.o := n KCOV_INSTRUMENT_rodata.o := n +CFLAGS_REMOVE_rodata.o += $(CC_FLAGS_LTO) OBJCOPYFLAGS := OBJCOPYFLAGS_rodata_objcopy.o := \ -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* [PATCH v2 22/28] arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (20 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 21/28] drivers/misc/lkdtm: disable LTO for rodata.o Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:44 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 23/28] arm64: vdso: disable LTO Sami Tolvanen ` (10 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel Since arm64 does not use -pg in CC_FLAGS_FTRACE with DYNAMIC_FTRACE_WITH_REGS, skip running recordmcount by exporting CC_USING_PATCHABLE_FUNCTION_ENTRY. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/arm64/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile index 130569f90c54..eeaf3c2e0971 100644 --- a/arch/arm64/Makefile +++ b/arch/arm64/Makefile @@ -127,6 +127,7 @@ endif ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y) KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY CC_FLAGS_FTRACE := -fpatchable-function-entry=2 + export CC_USING_PATCHABLE_FUNCTION_ENTRY := 1 endif # Default value -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 22/28] arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY 2020-09-03 20:30 ` [PATCH v2 22/28] arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY Sami Tolvanen @ 2020-09-03 22:44 ` Kees Cook 2020-09-08 21:23 ` Sami Tolvanen 0 siblings, 1 reply; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:44 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:47PM -0700, Sami Tolvanen wrote: > Since arm64 does not use -pg in CC_FLAGS_FTRACE with > DYNAMIC_FTRACE_WITH_REGS, skip running recordmcount by > exporting CC_USING_PATCHABLE_FUNCTION_ENTRY. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> How stand-alone is this? Does it depend on the earlier mcount fixes? Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 22/28] arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY 2020-09-03 22:44 ` Kees Cook @ 2020-09-08 21:23 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-08 21:23 UTC (permalink / raw) To: Kees Cook Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 03:44:18PM -0700, Kees Cook wrote: > On Thu, Sep 03, 2020 at 01:30:47PM -0700, Sami Tolvanen wrote: > > Since arm64 does not use -pg in CC_FLAGS_FTRACE with > > DYNAMIC_FTRACE_WITH_REGS, skip running recordmcount by > > exporting CC_USING_PATCHABLE_FUNCTION_ENTRY. > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > How stand-alone is this? Does it depend on the earlier mcount fixes? It does, because exporting CC_USING_PATCHABLE_FUNCTION_ENTRY doesn't change anything without the earlier mcount changes. Sami _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 23/28] arm64: vdso: disable LTO 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (21 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 22/28] arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:45 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 24/28] KVM: arm64: disable LTO for the nVHE directory Sami Tolvanen ` (9 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel Disable LTO for the vDSO by filtering out CC_FLAGS_LTO, as there's no point in using link-time optimization for the small amount of C code. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/arm64/kernel/vdso/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile index 45d5cfe46429..aa47070a3ccf 100644 --- a/arch/arm64/kernel/vdso/Makefile +++ b/arch/arm64/kernel/vdso/Makefile @@ -30,8 +30,8 @@ ldflags-y := -shared -nostdlib -soname=linux-vdso.so.1 --hash-style=sysv \ ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18 ccflags-y += -DDISABLE_BRANCH_PROFILING -CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS) $(GCC_PLUGINS_CFLAGS) -KBUILD_CFLAGS += $(DISABLE_LTO) +CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS) $(GCC_PLUGINS_CFLAGS) \ + $(CC_FLAGS_LTO) KASAN_SANITIZE := n UBSAN_SANITIZE := n OBJECT_FILES_NON_STANDARD := y -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink
raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 23/28] arm64: vdso: disable LTO 2020-09-03 20:30 ` [PATCH v2 23/28] arm64: vdso: disable LTO Sami Tolvanen @ 2020-09-03 22:45 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:45 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:48PM -0700, Sami Tolvanen wrote: > Disable LTO for the vDSO by filtering out CC_FLAGS_LTO, as there's no > point in using link-time optimization for the small amount of C code. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Yup. (And another replacement of the non-functional DISABLE_LTO...) Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 24/28] KVM: arm64: disable LTO for the nVHE directory 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (22 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 23/28] arm64: vdso: disable LTO Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:45 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 25/28] arm64: allow LTO_CLANG and THINLTO to be selected Sami Tolvanen ` (8 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel We use objcopy to manipulate ELF binaries for the nvhe code, which fails with LTO as the compiler produces LLVM bitcode instead. Disable LTO for this code to allow objcopy to be used. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/arm64/kvm/hyp/nvhe/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile index aef76487edc2..c903c8f31280 100644 --- a/arch/arm64/kvm/hyp/nvhe/Makefile +++ b/arch/arm64/kvm/hyp/nvhe/Makefile @@ -45,9 +45,9 @@ quiet_cmd_hypcopy = HYPCOPY $@ --rename-section=.text=.hyp.text \ $< $@ -# Remove ftrace and Shadow Call Stack CFLAGS. +# Remove ftrace, LTO, and Shadow Call Stack CFLAGS. # This is equivalent to the 'notrace' and '__noscs' annotations. 
-KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS), $(KBUILD_CFLAGS)) +KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_LTO) $(CC_FLAGS_SCS), $(KBUILD_CFLAGS)) # KVM nVHE code is run at a different exception code with a different map, so # compiler instrumentation that inserts callbacks or checks into the code may -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 24/28] KVM: arm64: disable LTO for the nVHE directory 2020-09-03 20:30 ` [PATCH v2 24/28] KVM: arm64: disable LTO for the nVHE directory Sami Tolvanen @ 2020-09-03 22:45 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:45 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:49PM -0700, 'Sami Tolvanen' via Clang Built Linux wrote: > We use objcopy to manipulate ELF binaries for the nvhe code, > which fails with LTO as the compiler produces LLVM bitcode > instead. Disable LTO for this code to allow objcopy to be used. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* [PATCH v2 25/28] arm64: allow LTO_CLANG and THINLTO to be selected 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (23 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 24/28] KVM: arm64: disable LTO for the nVHE directory Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:45 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 26/28] x86, vdso: disable LTO only for vDSO Sami Tolvanen ` (7 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel Allow CONFIG_LTO_CLANG and CONFIG_THINLTO to be enabled. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/arm64/Kconfig | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 6d232837cbee..2699fc5d332e 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -72,6 +72,8 @@ config ARM64 select ARCH_USE_SYM_ANNOTATIONS select ARCH_SUPPORTS_MEMORY_FAILURE select ARCH_SUPPORTS_SHADOW_CALL_STACK if CC_HAVE_SHADOW_CALL_STACK + select ARCH_SUPPORTS_LTO_CLANG + select ARCH_SUPPORTS_THINLTO select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 && (GCC_VERSION >= 50000 || CC_IS_CLANG) select ARCH_SUPPORTS_NUMA_BALANCING -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 25/28] arm64: allow LTO_CLANG and THINLTO to be selected 2020-09-03 20:30 ` [PATCH v2 25/28] arm64: allow LTO_CLANG and THINLTO to be selected Sami Tolvanen @ 2020-09-03 22:45 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:45 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:50PM -0700, Sami Tolvanen wrote: > Allow CONFIG_LTO_CLANG and CONFIG_THINLTO to be enabled. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook
* [PATCH v2 26/28] x86, vdso: disable LTO only for vDSO 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (24 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 25/28] arm64: allow LTO_CLANG and THINLTO to be selected Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:46 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 27/28] x86, relocs: Ignore L4_PAGE_OFFSET relocations Sami Tolvanen ` (6 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel Remove the undefined DISABLE_LTO flag from the vDSO, and filter out CC_FLAGS_LTO flags instead where needed. Note that while we could use Clang's LTO for the 64-bit vDSO, it won't add noticeable benefit for the small amount of C code. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/entry/vdso/Makefile | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 215376d975a2..9b742f21d2db 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -9,8 +9,6 @@ ARCH_REL_TYPE_ABS := R_X86_64_JUMP_SLOT|R_X86_64_GLOB_DAT|R_X86_64_RELATIVE| ARCH_REL_TYPE_ABS += R_386_GLOB_DAT|R_386_JMP_SLOT|R_386_RELATIVE include $(srctree)/lib/vdso/Makefile -KBUILD_CFLAGS += $(DISABLE_LTO) - # Sanitizer runtimes are unavailable and cannot be linked here. 
KASAN_SANITIZE := n UBSAN_SANITIZE := n @@ -92,7 +90,7 @@ ifneq ($(RETPOLINE_VDSO_CFLAGS),) endif endif -$(vobjs): KBUILD_CFLAGS := $(filter-out $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS)) $(CFL) +$(vobjs): KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO) $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS)) $(CFL) # # vDSO code runs in userspace and -pg doesn't help with profiling anyway. @@ -150,6 +148,7 @@ KBUILD_CFLAGS_32 := $(filter-out -fno-pic,$(KBUILD_CFLAGS_32)) KBUILD_CFLAGS_32 := $(filter-out -mfentry,$(KBUILD_CFLAGS_32)) KBUILD_CFLAGS_32 := $(filter-out $(GCC_PLUGINS_CFLAGS),$(KBUILD_CFLAGS_32)) KBUILD_CFLAGS_32 := $(filter-out $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS_32)) +KBUILD_CFLAGS_32 := $(filter-out $(CC_FLAGS_LTO),$(KBUILD_CFLAGS_32)) KBUILD_CFLAGS_32 += -m32 -msoft-float -mregparm=0 -fpic KBUILD_CFLAGS_32 += -fno-stack-protector KBUILD_CFLAGS_32 += $(call cc-option, -foptimize-sibling-calls) -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 26/28] x86, vdso: disable LTO only for vDSO 2020-09-03 20:30 ` [PATCH v2 26/28] x86, vdso: disable LTO only for vDSO Sami Tolvanen @ 2020-09-03 22:46 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:46 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:51PM -0700, Sami Tolvanen wrote: > Remove the undefined DISABLE_LTO flag from the vDSO, and filter out > CC_FLAGS_LTO flags instead where needed. Note that while we could use > Clang's LTO for the 64-bit vDSO, it won't add noticeable benefit for > the small amount of C code. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Moar DISABLE_LTO... Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook
* [PATCH v2 27/28] x86, relocs: Ignore L4_PAGE_OFFSET relocations 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (25 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 26/28] x86, vdso: disable LTO only for vDSO Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:47 ` Kees Cook 2020-09-03 20:30 ` [PATCH v2 28/28] x86, build: allow LTO_CLANG and THINLTO to be selected Sami Tolvanen ` (5 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel L4_PAGE_OFFSET is a constant value, so don't warn about absolute relocations. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/tools/relocs.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c index ce7188cbdae5..8f3bf34840ce 100644 --- a/arch/x86/tools/relocs.c +++ b/arch/x86/tools/relocs.c @@ -47,6 +47,7 @@ static const char * const sym_regex_kernel[S_NSYMTYPES] = { [S_ABS] = "^(xen_irq_disable_direct_reloc$|" "xen_save_fl_direct_reloc$|" + "L4_PAGE_OFFSET|" "VDSO|" "__crc_)", -- 2.28.0.402.g5ffc5be6b7-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
* Re: [PATCH v2 27/28] x86, relocs: Ignore L4_PAGE_OFFSET relocations 2020-09-03 20:30 ` [PATCH v2 27/28] x86, relocs: Ignore L4_PAGE_OFFSET relocations Sami Tolvanen @ 2020-09-03 22:47 ` Kees Cook 2020-09-08 23:28 ` Sami Tolvanen 0 siblings, 1 reply; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:47 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:52PM -0700, Sami Tolvanen wrote: > L4_PAGE_OFFSET is a constant value, so don't warn about absolute > relocations. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Any other details on this? I assume this is an ld.lld-ism. Any idea why this is only a problem under LTO? (Or is this an LLVM integrated assembler-ism?) Regardless, yes, let's nail it down: Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook
* Re: [PATCH v2 27/28] x86, relocs: Ignore L4_PAGE_OFFSET relocations 2020-09-03 22:47 ` Kees Cook @ 2020-09-08 23:28 ` Sami Tolvanen 0 siblings, 0 replies; 212+ messages in thread From: Sami Tolvanen @ 2020-09-08 23:28 UTC (permalink / raw) To: Kees Cook Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 03:47:32PM -0700, Kees Cook wrote: > On Thu, Sep 03, 2020 at 01:30:52PM -0700, Sami Tolvanen wrote: > > L4_PAGE_OFFSET is a constant value, so don't warn about absolute > > relocations. > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > Any other details on this? I assume this is an ld.lld-ism. Any idea why > this is only a problem under LTO? (Or is this an LLVM integrated > assembler-ism?) Regardless, yes, let's nail it down: With the LTO v1 series, LLD generated this relocation somewhere in the .init.data section, but only with LTO: $ arch/x86/tools/relocs --abs-relocs vmlinux WARNING: Absolute relocations present Offset Info Type Sym.Value Sym.Name ffffffff828e7fe0 0000000100000001 R_X86_64_64 0000000000000111 L4_PAGE_OFFSET It actually looks like this might not be a problem anymore with the current ToT kernel and the v2 series, but I'll do some more testing to confirm this and drop the patch from v3 if it's no longer needed. Sami
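Editorial aside on the patch discussed above: it adds "L4_PAGE_OFFSET|" to the S_ABS entry of sym_regex_kernel in arch/x86/tools/relocs.c, so a symbol of that name is treated like the other listed absolute symbols. The following is only a minimal sketch of such a name check in isolation, assuming POSIX regcomp()/regexec() with extended syntax. The pattern string is copied from the patch; abs_pattern and is_whitelisted_abs() are hypothetical names made up for this illustration and are not taken from relocs.c.

```c
#include <regex.h>
#include <stddef.h>

/* Pattern string as it reads after the patch (S_ABS entry). */
static const char *abs_pattern =
	"^(xen_irq_disable_direct_reloc$|"
	"xen_save_fl_direct_reloc$|"
	"L4_PAGE_OFFSET|"
	"VDSO|"
	"__crc_)";

/*
 * Hypothetical helper, not part of relocs.c: returns 1 if the symbol
 * name matches the pattern above, 0 otherwise (including when the
 * pattern fails to compile).
 */
static int is_whitelisted_abs(const char *sym)
{
	regex_t re;
	int match;

	if (regcomp(&re, abs_pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return 0;

	/* With REG_NOSUB the nmatch/pmatch arguments are ignored. */
	match = (regexec(&re, sym, 0, NULL, 0) == 0);
	regfree(&re);
	return match;
}
```

Under these assumptions, is_whitelisted_abs("L4_PAGE_OFFSET") and is_whitelisted_abs("__crc_printk") report a match, while an unrelated name such as "jiffies" does not. Note that the new alternative has no trailing "$", so it matches any symbol whose name starts with L4_PAGE_OFFSET.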
* [PATCH v2 28/28] x86, build: allow LTO_CLANG and THINLTO to be selected 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (26 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 27/28] x86, relocs: Ignore L4_PAGE_OFFSET relocations Sami Tolvanen @ 2020-09-03 20:30 ` Sami Tolvanen 2020-09-03 22:48 ` Kees Cook 2020-09-03 23:34 ` [PATCH v2 00/28] Add support for Clang LTO Kees Cook ` (4 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Sami Tolvanen @ 2020-09-03 20:30 UTC (permalink / raw) To: Masahiro Yamada, Will Deacon Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, linux-arm-kernel Allow CONFIG_LTO_CLANG and CONFIG_THINLTO to be enabled. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/x86/Kconfig | 2 ++ arch/x86/Makefile | 5 +++++ 2 files changed, 7 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 6de2e5c0bdba..0a49008c2363 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -92,6 +92,8 @@ config X86 select ARCH_SUPPORTS_ACPI select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_NUMA_BALANCING if X86_64 + select ARCH_SUPPORTS_LTO_CLANG if X86_64 + select ARCH_SUPPORTS_THINLTO if X86_64 select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_QUEUED_RWLOCKS select ARCH_USE_QUEUED_SPINLOCKS diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 4346ffb2e39f..49e3b8674eb5 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -173,6 +173,11 @@ ifeq ($(ACCUMULATE_OUTGOING_ARGS), 1) KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args,) endif +ifdef CONFIG_LTO_CLANG +KBUILD_LDFLAGS += -plugin-opt=-code-model=kernel \ + -plugin-opt=-stack-alignment=$(if $(CONFIG_X86_32),4,8) +endif + # Workaround for a gcc prelease that unfortunately was shipped in a suse release KBUILD_CFLAGS += -Wno-sign-compare # -- 2.28.0.402.g5ffc5be6b7-goog 
* Re: [PATCH v2 28/28] x86, build: allow LTO_CLANG and THINLTO to be selected 2020-09-03 20:30 ` [PATCH v2 28/28] x86, build: allow LTO_CLANG and THINLTO to be selected Sami Tolvanen @ 2020-09-03 22:48 ` Kees Cook 0 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 22:48 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:53PM -0700, Sami Tolvanen wrote: > Allow CONFIG_LTO_CLANG and CONFIG_THINLTO to be enabled. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> I think it might be worth detailing why these arguments aren't handled in the normal fashion under Clang's LTO. Regardless, it's needed to make it work, so: Reviewed-by: Kees Cook <keescook@chromium.org> -- Kees Cook
* Re: [PATCH v2 00/28] Add support for Clang LTO 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (27 preceding siblings ...) 2020-09-03 20:30 ` [PATCH v2 28/28] x86, build: allow LTO_CLANG and THINLTO to be selected Sami Tolvanen @ 2020-09-03 23:34 ` Kees Cook 2020-09-04 4:45 ` Nathan Chancellor 2020-09-03 23:38 ` Kees Cook ` (3 subsequent siblings) 32 siblings, 1 reply; 212+ messages in thread From: Kees Cook @ 2020-09-03 23:34 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:25PM -0700, Sami Tolvanen wrote: > This patch series adds support for building x86_64 and arm64 kernels > with Clang's Link Time Optimization (LTO). Tested-by: Kees Cook <keescook@chromium.org> FWIW, this gives me a happy booting x86 kernel: # cat /proc/version Linux version 5.9.0-rc3+ (kees@amarok) (clang version 12.0.0 (https://github.com/llvm/llvm-project.git db1ec04963cce70f2593e58cecac55f2e6accf52), LLD 12.0.0 (https://github.com/llvm/llvm-project.git db1ec04963cce70f2593e58cecac55f2e6accf52)) #1 SMP Thu Sep 3 15:54:14 PDT 2020 # zgrep 'LTO[_=]' /proc/config.gz CONFIG_LTO=y CONFIG_ARCH_SUPPORTS_LTO_CLANG=y CONFIG_ARCH_SUPPORTS_THINLTO=y CONFIG_THINLTO=y # CONFIG_LTO_NONE is not set CONFIG_LTO_CLANG=y I'd like to find a way to get this series landing sanely. It has dependencies on fixes/features in a few trees, and it looks like it's been difficult to keep forward momentum on LTO while trying to simultaneously chase changes in those trees, especially since it means no one can carry LTO in -next without shared branches. To that end, I'd like to find a way forward where Sami doesn't have to keep carrying a couple dozen patches.
:) The fixes/features outside of, or partially overlapping, Masahiro's kbuild tree appear to be: [PATCH v2 01/28] x86/boot/compressed: Disable relocation relaxation [PATCH v2 02/28] x86/asm: Replace __force_order with memory clobber [PATCH v2 03/28] lib/string.c: implement stpcpy [PATCH v2 04/28] RAS/CEC: Fix cec_init() prototype [PATCH v2 05/28] objtool: Add a pass for generating __mcount_loc [PATCH v2 06/28] objtool: Don't autodetect vmlinux.o [PATCH v2 07/28] kbuild: add support for objtool mcount [PATCH v2 08/28] x86, build: use objtool mcount [PATCH v2 17/28] PCI: Fix PREL32 relocations for LTO [PATCH v2 20/28] efi/libstub: disable LTO [PATCH v2 21/28] drivers/misc/lkdtm: disable LTO for rodata.o [PATCH v2 22/28] arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY [PATCH v2 23/28] arm64: vdso: disable LTO [PATCH v2 24/28] KVM: arm64: disable LTO for the nVHE directory [PATCH v2 25/28] arm64: allow LTO_CLANG and THINLTO to be selected [PATCH v2 26/28] x86, vdso: disable LTO only for vDSO [PATCH v2 27/28] x86, relocs: Ignore L4_PAGE_OFFSET relocations [PATCH v2 28/28] x86, build: allow LTO_CLANG and THINLTO to be selected The distinctly kbuild patches are: [PATCH v2 09/28] kbuild: add support for Clang LTO [PATCH v2 10/28] kbuild: lto: fix module versioning [PATCH v2 11/28] kbuild: lto: postpone objtool [PATCH v2 12/28] kbuild: lto: limit inlining [PATCH v2 13/28] kbuild: lto: merge module sections [PATCH v2 14/28] kbuild: lto: remove duplicate dependencies from .mod files [PATCH v2 15/28] init: lto: ensure initcall ordering [PATCH v2 16/28] init: lto: fix PREL32 relocations [PATCH v2 18/28] modpost: lto: strip .lto from module names [PATCH v2 19/28] scripts/mod: disable LTO for empty.c Patch 3 is in -mm and I expect it will land in the next rc (I hope, since it's needed universally for Clang builds). Patch 4 is living in -tip, to appear shortly in -next, AFAICT? I would expect 1 and 2 to appear in -tip soon, but I'm not sure? 
For patches 5, 6, 7, and 8 I would expect them to normally go via -tip's objtool tree, but getting an Ack would let them land elsewhere. Patch 17 I'd expect to normally go via Bjorn's tree, but he's given an Ack so it can live elsewhere without surprises. :) Patches 19, 20, 21, 23, 24, 26 are all simple "just disable LTO" patches. This leaves 9-16 and 18. Patches 10, 12, 14, 16, and 18 seem mostly "mechanical" in nature, leaving the bulk of the review on patches 9, 11, 13, and 15. Masahiro, given the spread of dependent patches between 2 (or more?) -tip branches and -mm, how do you want to proceed? I wonder if it might be possible to create a shared branch to avoid merge headaches, and I (or -tip folks, or you) could carry patches 1-8 there so patches 9 and later could have a common base? Thanks! -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 212+ messages in thread
* Re: [PATCH v2 00/28] Add support for Clang LTO 2020-09-03 23:34 ` [PATCH v2 00/28] Add support for Clang LTO Kees Cook @ 2020-09-04 4:45 ` Nathan Chancellor 0 siblings, 0 replies; 212+ messages in thread From: Nathan Chancellor @ 2020-09-04 4:45 UTC (permalink / raw) To: Kees Cook Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 04:34:09PM -0700, Kees Cook wrote: > On Thu, Sep 03, 2020 at 01:30:25PM -0700, Sami Tolvanen wrote: > > This patch series adds support for building x86_64 and arm64 kernels > > with Clang's Link Time Optimization (LTO). > > Tested-by: Kees Cook <keescook@chromium.org> Tested-by: Nathan Chancellor <natechancellor@gmail.com> I have been continuously running this series on virtualized x86_64 (WSL2 on my home workstation) and bare metal arm64 (Raspberry Pi 4) with no major issues or regressions noticed. > FWIW, this gives me a happy booting x86 kernel: > > # cat /proc/version > Linux version 5.9.0-rc3+ (kees@amarok) (clang version 12.0.0 (https://github.com/llvm/llvm-project.git db1ec04963cce70f2593e58cecac55f2e6accf52), LLD 12.0.0 (https://github.com/llvm/llvm-project.git db1ec04963cce70f2593e58cecac55f2e6accf52)) #1 SMP Thu Sep 3 15:54:14 PDT 2020 > # zgrep 'LTO[_=]' /proc/config.gz > CONFIG_LTO=y > CONFIG_ARCH_SUPPORTS_LTO_CLANG=y > CONFIG_ARCH_SUPPORTS_THINLTO=y > CONFIG_THINLTO=y > # CONFIG_LTO_NONE is not set > CONFIG_LTO_CLANG=y > > I'd like to find a way to get this series landing sanely. It has > dependencies on fixes/features in a few trees, and it looks like > it's been difficult to keep forward momentum on LTO while trying to > simultaneously chase changes in those trees, especially since it means > no one can carry LTO in -next without shared branches.
To that end, > I'd like to find a way forward where Sami doesn't have to keep carrying > a couple dozen patches. :) > > The fixes/features outside of, or partially overlapping, Masahiro's > kbuild tree appear to be: > > [PATCH v2 01/28] x86/boot/compressed: Disable relocation relaxation > [PATCH v2 02/28] x86/asm: Replace __force_order with memory clobber > [PATCH v2 03/28] lib/string.c: implement stpcpy > [PATCH v2 04/28] RAS/CEC: Fix cec_init() prototype > [PATCH v2 05/28] objtool: Add a pass for generating __mcount_loc > [PATCH v2 06/28] objtool: Don't autodetect vmlinux.o > [PATCH v2 07/28] kbuild: add support for objtool mcount > [PATCH v2 08/28] x86, build: use objtool mcount > [PATCH v2 17/28] PCI: Fix PREL32 relocations for LTO > [PATCH v2 20/28] efi/libstub: disable LTO > [PATCH v2 21/28] drivers/misc/lkdtm: disable LTO for rodata.o > [PATCH v2 22/28] arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY > [PATCH v2 23/28] arm64: vdso: disable LTO > [PATCH v2 24/28] KVM: arm64: disable LTO for the nVHE directory > [PATCH v2 25/28] arm64: allow LTO_CLANG and THINLTO to be selected > [PATCH v2 26/28] x86, vdso: disable LTO only for vDSO > [PATCH v2 27/28] x86, relocs: Ignore L4_PAGE_OFFSET relocations > [PATCH v2 28/28] x86, build: allow LTO_CLANG and THINLTO to be selected > > The distinctly kbuild patches are: > > [PATCH v2 09/28] kbuild: add support for Clang LTO > [PATCH v2 10/28] kbuild: lto: fix module versioning > [PATCH v2 11/28] kbuild: lto: postpone objtool > [PATCH v2 12/28] kbuild: lto: limit inlining > [PATCH v2 13/28] kbuild: lto: merge module sections > [PATCH v2 14/28] kbuild: lto: remove duplicate dependencies from .mod files > [PATCH v2 15/28] init: lto: ensure initcall ordering > [PATCH v2 16/28] init: lto: fix PREL32 relocations > [PATCH v2 18/28] modpost: lto: strip .lto from module names > [PATCH v2 19/28] scripts/mod: disable LTO for empty.c > > Patch 3 is in -mm and I expect it will land in the next rc (I hope, > since it's needed 
universally for Clang builds). > > Patch 4 is living in -tip, to appear shortly in -next, AFAICT? > > I would expect 1 and 2 to appear in -tip soon, but I'm not sure? > > For patches 5, 6, 7, and 8 I would expect them to normally go via -tip's > objtool tree, but getting an Ack would let them land elsewhere. > > Patch 17 I'd expect to normally go via Bjorn's tree, but he's given an > Ack so it can live elsewhere without surprises. :) > > Patches 19, 20, 21, 23, 24, 26 are all simple "just disable LTO" > patches. > > This leaves 9-16 and 18. Patches 10, 12, 14, 16, and 18 seem mostly > "mechanical" in nature, leaving the bulk of the review on patches 9, > 11, 13, and 15. > > Masahiro, given the spread of dependent patches between 2 (or more?) -tip > branches and -mm, how do you want to proceed? I wonder if it might > be possible to create a shared branch to avoid merge headaches, and I > (or -tip folks, or you) could carry patches 1-8 there so patches 9 and > later could have a common base? > > Thanks! > > -- > Kees Cook > For what it's worth, the static call series that is in -tip and about to land in -next conflicts relatively heavily with this. There are fairly innocuous conflicts in some objtool files but two contextual changes are needed to keep things building. It probably makes sense for most if not all of this to live in -tip with acks. Ideally, if the stpcpy patch gets merged into an -rc, this can just be based on that. check.c:556:80: error: too few arguments to function call, expected 5, have 4 sec = elf_create_section(file->elf, "__mcount_loc", sizeof(unsigned long), idx); ~~~~~~~~~~~~~~~~~~ ^ ./elf.h:124:17: note: 'elf_create_section' declared here struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr); ^ 1 error generated.
kernel/static_call.c:438:16: error: returning 'void' from a function with incompatible result type 'int' early_initcall(static_call_init); ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ include/linux/init.h:268:47: note: expanded from macro 'early_initcall' #define early_initcall(fn) __define_initcall(fn, early) ~~~~~~~~~~~~~~~~~~^~~~~~~~~~ include/linux/init.h:261:54: note: expanded from macro '__define_initcall' #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id) ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~ include/linux/init.h:259:20: note: expanded from macro '___define_initcall' __unique_initcall(fn, id, __sec, __initcall_id(fn)) ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/init.h:253:22: note: expanded from macro '__unique_initcall' ____define_initcall(fn, \ ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/init.h:241:33: note: expanded from macro '____define_initcall' __define_initcall_stub(__stub, fn) \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~ include/linux/init.h:226:10: note: expanded from macro '__define_initcall_stub' return fn(); \ ^~~~ 1 error generated. Below is what I ended up with for fixes. 
Cheers, Nathan diff --git a/include/linux/static_call.h b/include/linux/static_call.h index bfa2ba39be57..61034e9798d6 100644 --- a/include/linux/static_call.h +++ b/include/linux/static_call.h @@ -136,7 +136,7 @@ extern void arch_static_call_transform(void *site, void *tramp, void *func, bool #ifdef CONFIG_HAVE_STATIC_CALL_INLINE -extern void __init static_call_init(void); +extern int __init static_call_init(void); struct static_call_mod { struct static_call_mod *next; diff --git a/kernel/static_call.c b/kernel/static_call.c index f8362b3f8fd5..84565c2a41b8 100644 --- a/kernel/static_call.c +++ b/kernel/static_call.c @@ -410,12 +410,12 @@ int static_call_text_reserved(void *start, void *end) return __static_call_mod_text_reserved(start, end); } -void __init static_call_init(void) +int __init static_call_init(void) { int ret; if (static_call_initialized) - return; + return 0; cpus_read_lock(); static_call_lock(); @@ -434,6 +434,7 @@ void __init static_call_init(void) #ifdef CONFIG_MODULES register_module_notifier(&static_call_module_nb); #endif + return 0; } early_initcall(static_call_init); diff --git a/tools/objtool/check.c b/tools/objtool/check.c index d31554adcf4e..34db58110f3d 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -553,7 +553,7 @@ static int create_mcount_loc_sections(struct objtool_file *file) list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) idx++; - sec = elf_create_section(file->elf, "__mcount_loc", sizeof(unsigned long), idx); + sec = elf_create_section(file->elf, "__mcount_loc", 0, sizeof(unsigned long), idx); if (!sec) return -1; _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 212+ messages in thread
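Editorial aside on the second build error quoted above: the expansion trace ends in __define_initcall_stub(), whose body is "return fn();". A stub like that cannot wrap a function declared void, which is why the fix changes static_call_init() to return int. The following is only a simplified, userspace sketch of that constraint. demo_initcall_t, DEMO_DEFINE_INITCALL_STUB, demo_init() and demo_run_initcalls() are hypothetical names invented for this illustration; they are not the kernel's macros or functions.

```c
#include <stddef.h>

/* Simplified stand-in for an initcall function pointer type. */
typedef int (*demo_initcall_t)(void);

static int demo_initialized;

/*
 * Returns int, mirroring the fixed static_call_init() signature:
 * the early-exit path returns 0 instead of a bare "return;".
 */
static int demo_init(void)
{
	if (demo_initialized)
		return 0;

	demo_initialized = 1;
	return 0;
}

/*
 * Same shape as the stub body shown in the error trace: the stub
 * forwards the return value of fn, so fn must not return void.
 */
#define DEMO_DEFINE_INITCALL_STUB(stub, fn) \
	static int stub(void) { return fn(); }

DEMO_DEFINE_INITCALL_STUB(demo_init_stub, demo_init)

static demo_initcall_t demo_initcalls[] = { demo_init_stub };

/* Calls every registered stub, stopping at the first non-zero result. */
static int demo_run_initcalls(void)
{
	size_t i;

	for (i = 0; i < sizeof(demo_initcalls) / sizeof(demo_initcalls[0]); i++) {
		int ret = demo_initcalls[i]();

		if (ret)
			return ret;
	}
	return 0;
}
```

In this sketch, changing demo_init() back to a void function makes the stub fail to compile with the same kind of "returning 'void' from a function with incompatible result type 'int'" diagnostic that Nathan reports for static_call_init().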
* Re: [PATCH v2 00/28] Add support for Clang LTO 2020-09-03 20:30 ` [PATCH v2 00/28] Add " Sami Tolvanen ` (28 preceding siblings ...) 2020-09-03 23:34 ` [PATCH v2 00/28] Add support for Clang LTO Kees Cook @ 2020-09-03 23:38 ` Kees Cook 2020-09-04 7:53 ` Sedat Dilek ` (2 subsequent siblings) 32 siblings, 0 replies; 212+ messages in thread From: Kees Cook @ 2020-09-03 23:38 UTC (permalink / raw) To: Sami Tolvanen Cc: linux-arch, x86, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel On Thu, Sep 03, 2020 at 01:30:25PM -0700, Sami Tolvanen wrote: > This patch series adds support for building x86_64 and arm64 kernels > with Clang's Link Time Optimization (LTO). > [...] > base-commit: e28f0104343d0c132fa37f479870c9e43355fee4 And if you're not a b4 user, this tree can be found at either of these places: https://github.com/samitolvanen/linux/commits/clang-lto git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git kspp/sami/lto/v2 -- Kees Cook
* Re: [PATCH v2 00/28] Add support for Clang LTO
From: Sedat Dilek @ 2020-09-04 7:53 UTC
To: Sami Tolvanen
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Peter Zijlstra, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, Clang-Built-Linux ML, linux-pci, Will Deacon, linux-arm-kernel

On Thu, Sep 3, 2020 at 10:30 PM 'Sami Tolvanen' via Clang Built Linux
<clang-built-linux@googlegroups.com> wrote:
>
> This patch series adds support for building x86_64 and arm64 kernels
> with Clang's Link Time Optimization (LTO).
>
> In addition to performance, the primary motivation for LTO is
> to allow Clang's Control-Flow Integrity (CFI) to be used in the
> kernel. Google has shipped millions of Pixel devices running three
> major kernel versions with LTO+CFI since 2018.
>
> Most of the patches are build system changes for handling LLVM
> bitcode, which Clang produces with LTO instead of ELF object files,
> postponing ELF processing until a later stage, and ensuring initcall
> ordering.
>
> Note that patches 1-4 are not directly related to LTO, but are
> needed to compile LTO kernels with ToT Clang, so I'm including them
> in the series for your convenience:
>
>  - Patches 1-3 are required for building the kernel with ToT Clang,
>    and IAS, and patch 4 is needed to build allmodconfig with LTO.
>
>  - Patches 3-4 are already in linux-next, but not yet in 5.9-rc.
>

I jumped to Sami's clang-cfi Git tree [1], which includes clang-lto v2.

My LLVM toolchain is version 11.0.0.0-rc2+, more precisely git
97ac9e82002d6b12831ca2c78f739cca65a4fa05.

If this is OK, feel free to add my...

Tested-by: Sedat Dilek <sedat.dilek@gmail.com>

- Sedat -

[1] https://github.com/samitolvanen/linux/commits/clang-cfi

> ---
> Changes in v2:
>
>   - Fixed -Wmissing-prototypes warnings with W=1.
>
>   - Dropped cc-option from -fsplit-lto-unit and added .thinlto-cache
>     scrubbing to make distclean.
>
>   - Added a comment about Clang >=11 being required.
>
>   - Added a patch to disable LTO for the arm64 KVM nVHE code.
>
>   - Disabled objtool's noinstr validation with LTO unless enabled.
>
>   - Included Peter's proposed objtool mcount patch in the series
>     and replaced recordmcount with the objtool pass to avoid
>     whitelisting relocations that are not calls.
>
>   - Updated several commit messages with better explanations.
>
>
> Arvind Sankar (2):
>   x86/boot/compressed: Disable relocation relaxation
>   x86/asm: Replace __force_order with memory clobber
>
> Luca Stefani (1):
>   RAS/CEC: Fix cec_init() prototype
>
> Nick Desaulniers (1):
>   lib/string.c: implement stpcpy
>
> Peter Zijlstra (1):
>   objtool: Add a pass for generating __mcount_loc
>
> Sami Tolvanen (23):
>   objtool: Don't autodetect vmlinux.o
>   kbuild: add support for objtool mcount
>   x86, build: use objtool mcount
>   kbuild: add support for Clang LTO
>   kbuild: lto: fix module versioning
>   kbuild: lto: postpone objtool
>   kbuild: lto: limit inlining
>   kbuild: lto: merge module sections
>   kbuild: lto: remove duplicate dependencies from .mod files
>   init: lto: ensure initcall ordering
>   init: lto: fix PREL32 relocations
>   PCI: Fix PREL32 relocations for LTO
>   modpost: lto: strip .lto from module names
>   scripts/mod: disable LTO for empty.c
>   efi/libstub: disable LTO
>   drivers/misc/lkdtm: disable LTO for rodata.o
>   arm64: export CC_USING_PATCHABLE_FUNCTION_ENTRY
>   arm64: vdso: disable LTO
>   KVM: arm64: disable LTO for the nVHE directory
>   arm64: allow LTO_CLANG and THINLTO to be selected
>   x86, vdso: disable LTO only for vDSO
>   x86, relocs: Ignore L4_PAGE_OFFSET relocations
>   x86, build: allow LTO_CLANG and THINLTO to be selected
>
>  .gitignore                            |   1 +
>  Makefile                              |  65 ++++++-
>  arch/Kconfig                          |  67 +++++++
>  arch/arm64/Kconfig                    |   2 +
>  arch/arm64/Makefile                   |   1 +
>  arch/arm64/kernel/vdso/Makefile       |   4 +-
>  arch/arm64/kvm/hyp/nvhe/Makefile      |   4 +-
>  arch/x86/Kconfig                      |   3 +
>  arch/x86/Makefile                     |   5 +
>  arch/x86/boot/compressed/Makefile     |   2 +
>  arch/x86/boot/compressed/pgtable_64.c |   9 -
>  arch/x86/entry/vdso/Makefile          |   5 +-
>  arch/x86/include/asm/special_insns.h  |  28 +--
>  arch/x86/kernel/cpu/common.c          |   4 +-
>  arch/x86/tools/relocs.c               |   1 +
>  drivers/firmware/efi/libstub/Makefile |   2 +
>  drivers/misc/lkdtm/Makefile           |   1 +
>  drivers/ras/cec.c                     |   9 +-
>  include/asm-generic/vmlinux.lds.h     |  11 +-
>  include/linux/init.h                  |  79 +++++++-
>  include/linux/pci.h                   |  19 +-
>  kernel/trace/Kconfig                  |   5 +
>  lib/string.c                          |  24 +++
>  scripts/Makefile.build                |  55 +++++-
>  scripts/Makefile.lib                  |   6 +-
>  scripts/Makefile.modfinal             |  31 ++-
>  scripts/Makefile.modpost              |  26 ++-
>  scripts/generate_initcall_order.pl    | 270 ++++++++++++++++++++++++++
>  scripts/link-vmlinux.sh               |  94 ++++++++-
>  scripts/mod/Makefile                  |   1 +
>  scripts/mod/modpost.c                 |  16 +-
>  scripts/mod/modpost.h                 |   9 +
>  scripts/mod/sumversion.c              |   6 +-
>  scripts/module-lto.lds                |  26 +++
>  tools/objtool/builtin-check.c         |  13 +-
>  tools/objtool/builtin.h               |   2 +-
>  tools/objtool/check.c                 |  83 ++++++++
>  tools/objtool/check.h                 |   1 +
>  tools/objtool/objtool.h               |   1 +
>  39 files changed, 883 insertions(+), 108 deletions(-)
>  create mode 100755 scripts/generate_initcall_order.pl
>  create mode 100644 scripts/module-lto.lds
>
>
> base-commit: e28f0104343d0c132fa37f479870c9e43355fee4
> --
> 2.28.0.402.g5ffc5be6b7-goog
* Re: [PATCH v2 00/28] Add support for Clang LTO
From: peterz @ 2020-09-04 8:55 UTC
To: Sami Tolvanen
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

Please don't nest series!

Start a new thread for every posting.
* Re: [PATCH v2 00/28] Add support for Clang LTO
From: Sedat Dilek @ 2020-09-04 9:08 UTC
To: peterz
Cc: linux-arch, x86, Kees Cook, Paul E. McKenney, kernel-hardening, Greg Kroah-Hartman, Masahiro Yamada, linux-kbuild, Nick Desaulniers, linux-kernel, Steven Rostedt, Clang-Built-Linux ML, Sami Tolvanen, linux-pci, Will Deacon, linux-arm-kernel

On Fri, Sep 4, 2020 at 10:55 AM <peterz@infradead.org> wrote:
>
>
> Please don't nest series!
>
> Start a new thread for every posting.
>

You are right Peter, my apologies.

- Sedat -
* Re: [PATCH v2 00/28] Add support for Clang LTO
From: Masahiro Yamada @ 2020-09-06 0:24 UTC
To: Sami Tolvanen
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

On Fri, Sep 4, 2020 at 5:30 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> This patch series adds support for building x86_64 and arm64 kernels
> with Clang's Link Time Optimization (LTO).
>
> In addition to performance, the primary motivation for LTO is
> to allow Clang's Control-Flow Integrity (CFI) to be used in the
> kernel. Google has shipped millions of Pixel devices running three
> major kernel versions with LTO+CFI since 2018.
>
> Most of the patches are build system changes for handling LLVM
> bitcode, which Clang produces with LTO instead of ELF object files,
> postponing ELF processing until a later stage, and ensuring initcall
> ordering.
>
> Note that patches 1-4 are not directly related to LTO, but are
> needed to compile LTO kernels with ToT Clang, so I'm including them
> in the series for your convenience:
>
>  - Patches 1-3 are required for building the kernel with ToT Clang,
>    and IAS, and patch 4 is needed to build allmodconfig with LTO.
>
>  - Patches 3-4 are already in linux-next, but not yet in 5.9-rc.
>

I still do not understand how this patch set works.
(only me?)

Please let me ask fundamental questions.

I applied this series on top of Linus' tree,
and compiled for ARCH=arm64.

I compared the kernel size with/without LTO.

[1] No LTO (arm64 defconfig, CONFIG_LTO_NONE)

$ llvm-size vmlinux
    text      data     bss       dec      hex  filename
15848692  10099449  493060  26441201  19375f1  vmlinux

[2] Clang LTO (arm64 defconfig + CONFIG_LTO_CLANG)

$ llvm-size vmlinux
    text      data     bss       dec      hex  filename
15906864  10197445  490804  26595113  195cf29  vmlinux

I compared the size of raw binary, arch/arm64/boot/Image.
Its size increased too.

So, in my experiment, enabling CONFIG_LTO_CLANG
increases the kernel size.
Is this correct?

One more thing, could you teach me
how Clang LTO optimizes the code against
relocatable objects?

When I learned Clang LTO first, I read this document:
https://llvm.org/docs/LinkTimeOptimization.html

It is easy to confirm the final executable
does not contain foo2, foo3...

In contrast to userspace programs,
kernel modules are basically relocatable objects.

Does Clang drop unused symbols from relocatable objects?
If so, how?

I implemented an example module (see the attachment),
and checked the symbols.
Nothing was dropped.

The situation is the same for built-in
because LTO is run against vmlinux.o, which is
relocatable as well.

--
Best Regards
Masahiro Yamada

[-- Attachment #2: 0001-lto-test-module.patch --]
[-- Type: application/x-patch, Size: 3428 bytes --]
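For readers who want the size difference as numbers, the two llvm-size runs in the message above can be compared with a few lines of Python. This is only an illustrative sketch: the helper names `size_delta` and `percent_change` are made up for this note and are not part of the patch series or of any LLVM tool, and the section sizes are copied by hand from the two tables in the message.

```python
# Section sizes copied from the two llvm-size runs quoted in the message
# (arm64 defconfig, without and with CONFIG_LTO_CLANG).
NO_LTO = {"text": 15848692, "data": 10099449, "bss": 493060}
CLANG_LTO = {"text": 15906864, "data": 10197445, "bss": 490804}


def size_delta(before, after):
    """Return per-section byte deltas plus the delta of the 'dec' total."""
    delta = {name: after[name] - before[name] for name in before}
    delta["dec"] = sum(after.values()) - sum(before.values())
    return delta


def percent_change(before, after):
    """Relative change of the summed section sizes, in percent."""
    total_before = sum(before.values())
    return 100.0 * (sum(after.values()) - total_before) / total_before


if __name__ == "__main__":
    for name, value in size_delta(NO_LTO, CLANG_LTO).items():
        print(f"{name:>4}: {value:+d} bytes")
    print(f"total: {percent_change(NO_LTO, CLANG_LTO):+.2f}%")
```

With these inputs, text grows by 58172 bytes and data by 97996 bytes, bss shrinks by 2256 bytes, and the total grows by 153912 bytes, which is roughly 0.58% of the non-LTO image.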
* Re: [PATCH v2 00/28] Add support for Clang LTO
From: Sami Tolvanen @ 2020-09-08 23:46 UTC
To: Masahiro Yamada
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

On Sun, Sep 06, 2020 at 09:24:38AM +0900, Masahiro Yamada wrote:
> On Fri, Sep 4, 2020 at 5:30 AM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > This patch series adds support for building x86_64 and arm64 kernels
> > with Clang's Link Time Optimization (LTO).
> >
> > In addition to performance, the primary motivation for LTO is
> > to allow Clang's Control-Flow Integrity (CFI) to be used in the
> > kernel. Google has shipped millions of Pixel devices running three
> > major kernel versions with LTO+CFI since 2018.
> >
> > Most of the patches are build system changes for handling LLVM
> > bitcode, which Clang produces with LTO instead of ELF object files,
> > postponing ELF processing until a later stage, and ensuring initcall
> > ordering.
> >
> > Note that patches 1-4 are not directly related to LTO, but are
> > needed to compile LTO kernels with ToT Clang, so I'm including them
> > in the series for your convenience:
> >
> >  - Patches 1-3 are required for building the kernel with ToT Clang,
> >    and IAS, and patch 4 is needed to build allmodconfig with LTO.
> >
> >  - Patches 3-4 are already in linux-next, but not yet in 5.9-rc.
> >
>
> I still do not understand how this patch set works.
> (only me?)
>
> Please let me ask fundamental questions.
>
> I applied this series on top of Linus' tree,
> and compiled for ARCH=arm64.
>
> I compared the kernel size with/without LTO.
>
> [1] No LTO (arm64 defconfig, CONFIG_LTO_NONE)
>
> $ llvm-size vmlinux
>     text      data     bss       dec      hex  filename
> 15848692  10099449  493060  26441201  19375f1  vmlinux
>
> [2] Clang LTO (arm64 defconfig + CONFIG_LTO_CLANG)
>
> $ llvm-size vmlinux
>     text      data     bss       dec      hex  filename
> 15906864  10197445  490804  26595113  195cf29  vmlinux
>
> I compared the size of raw binary, arch/arm64/boot/Image.
> Its size increased too.
>
> So, in my experiment, enabling CONFIG_LTO_CLANG
> increases the kernel size.
> Is this correct?

Yes. LTO does produce larger binaries, mostly due to function
inlining between translation units, I believe. The compiler people
can probably give you a more detailed answer here. Without -mllvm
-import-instr-limit, the binaries would be even larger.

> One more thing, could you teach me
> how Clang LTO optimizes the code against
> relocatable objects?
>
> When I learned Clang LTO first, I read this document:
> https://llvm.org/docs/LinkTimeOptimization.html
>
> It is easy to confirm the final executable
> does not contain foo2, foo3...
>
> In contrast to userspace programs,
> kernel modules are basically relocatable objects.
>
> Does Clang drop unused symbols from relocatable objects?
> If so, how?

I don't think the compiler can legally drop global symbols from
relocatable objects, but it can rename and possibly even drop static
functions. This is why we need global wrappers for initcalls, for
example, to have stable symbol names.

Sami
* Re: [PATCH v2 00/28] Add support for Clang LTO
From: Masahiro Yamada @ 2020-09-10 1:18 UTC
To: Sami Tolvanen
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

On Wed, Sep 9, 2020 at 8:46 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> On Sun, Sep 06, 2020 at 09:24:38AM +0900, Masahiro Yamada wrote:
> > On Fri, Sep 4, 2020 at 5:30 AM Sami Tolvanen <samitolvanen@google.com> wrote:
> > >
> > > This patch series adds support for building x86_64 and arm64 kernels
> > > with Clang's Link Time Optimization (LTO).
> > >
> > > In addition to performance, the primary motivation for LTO is
> > > to allow Clang's Control-Flow Integrity (CFI) to be used in the
> > > kernel. Google has shipped millions of Pixel devices running three
> > > major kernel versions with LTO+CFI since 2018.
> > >
> > > Most of the patches are build system changes for handling LLVM
> > > bitcode, which Clang produces with LTO instead of ELF object files,
> > > postponing ELF processing until a later stage, and ensuring initcall
> > > ordering.
> > >
> > > Note that patches 1-4 are not directly related to LTO, but are
> > > needed to compile LTO kernels with ToT Clang, so I'm including them
> > > in the series for your convenience:
> > >
> > >  - Patches 1-3 are required for building the kernel with ToT Clang,
> > >    and IAS, and patch 4 is needed to build allmodconfig with LTO.
> > >
> > >  - Patches 3-4 are already in linux-next, but not yet in 5.9-rc.
> > >
> >
> > I still do not understand how this patch set works.
> > (only me?)
> >
> > Please let me ask fundamental questions.
> >
> > I applied this series on top of Linus' tree,
> > and compiled for ARCH=arm64.
> >
> > I compared the kernel size with/without LTO.
> >
> > [1] No LTO (arm64 defconfig, CONFIG_LTO_NONE)
> >
> > $ llvm-size vmlinux
> >     text      data     bss       dec      hex  filename
> > 15848692  10099449  493060  26441201  19375f1  vmlinux
> >
> > [2] Clang LTO (arm64 defconfig + CONFIG_LTO_CLANG)
> >
> > $ llvm-size vmlinux
> >     text      data     bss       dec      hex  filename
> > 15906864  10197445  490804  26595113  195cf29  vmlinux
> >
> > I compared the size of raw binary, arch/arm64/boot/Image.
> > Its size increased too.
> >
> > So, in my experiment, enabling CONFIG_LTO_CLANG
> > increases the kernel size.
> > Is this correct?
>
> Yes. LTO does produce larger binaries, mostly due to function
> inlining between translation units, I believe. The compiler people
> can probably give you a more detailed answer here. Without -mllvm
> -import-instr-limit, the binaries would be even larger.
>
> > One more thing, could you teach me
> > how Clang LTO optimizes the code against
> > relocatable objects?
> >
> > When I learned Clang LTO first, I read this document:
> > https://llvm.org/docs/LinkTimeOptimization.html
> >
> > It is easy to confirm the final executable
> > does not contain foo2, foo3...
> >
> > In contrast to userspace programs,
> > kernel modules are basically relocatable objects.
> >
> > Does Clang drop unused symbols from relocatable objects?
> > If so, how?
>
> I don't think the compiler can legally drop global symbols from
> relocatable objects, but it can rename and possibly even drop static
> functions.

Compilers can drop static functions without LTO.
Rather, it is a compiler warning
(-Wunused-function), so the code should be cleaned up.

> This is why we need global wrappers for initcalls, for
> example, to have stable symbol names.
>
> Sami

At first, I thought the motivation of LTO
was to remove unused global symbols, and
to perform further optimization.

It is true for userspace programs.
In fact, the example of
https://llvm.org/docs/LinkTimeOptimization.html
produces a smaller binary.

In contrast, this patch set produces a bigger kernel
because LTO cannot remove any unused symbol.

So, I do not understand what the benefit is.

Is inlining beneficial?
I am not sure.

Documentation/process/coding-style.rst
"15) The inline disease"
mentions that inlining is not always
a good thing.

As a whole, I still do not understand
the motivation of this patch set.

--
Best Regards
Masahiro Yamada
* Re: [PATCH v2 00/28] Add support for Clang LTO
From: Sami Tolvanen @ 2020-09-10 15:17 UTC
To: Masahiro Yamada
Cc: linux-arch, X86 ML, Kees Cook, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, linux-pci, Will Deacon, linux-arm-kernel

On Thu, Sep 10, 2020 at 10:18:05AM +0900, Masahiro Yamada wrote:
> On Wed, Sep 9, 2020 at 8:46 AM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > On Sun, Sep 06, 2020 at 09:24:38AM +0900, Masahiro Yamada wrote:
> > > On Fri, Sep 4, 2020 at 5:30 AM Sami Tolvanen <samitolvanen@google.com> wrote:
> > > >
> > > > This patch series adds support for building x86_64 and arm64 kernels
> > > > with Clang's Link Time Optimization (LTO).
> > > >
> > > > In addition to performance, the primary motivation for LTO is
> > > > to allow Clang's Control-Flow Integrity (CFI) to be used in the
> > > > kernel. Google has shipped millions of Pixel devices running three
> > > > major kernel versions with LTO+CFI since 2018.
> > > >
> > > > Most of the patches are build system changes for handling LLVM
> > > > bitcode, which Clang produces with LTO instead of ELF object files,
> > > > postponing ELF processing until a later stage, and ensuring initcall
> > > > ordering.
> > > >
> > > > Note that patches 1-4 are not directly related to LTO, but are
> > > > needed to compile LTO kernels with ToT Clang, so I'm including them
> > > > in the series for your convenience:
> > > >
> > > >  - Patches 1-3 are required for building the kernel with ToT Clang,
> > > >    and IAS, and patch 4 is needed to build allmodconfig with LTO.
> > > >
> > > >  - Patches 3-4 are already in linux-next, but not yet in 5.9-rc.
> > > >
> > >
> > > I still do not understand how this patch set works.
> > > (only me?)
> > >
> > > Please let me ask fundamental questions.
> > >
> > > I applied this series on top of Linus' tree,
> > > and compiled for ARCH=arm64.
> > >
> > > I compared the kernel size with/without LTO.
> > >
> > > [1] No LTO (arm64 defconfig, CONFIG_LTO_NONE)
> > >
> > > $ llvm-size vmlinux
> > >     text      data     bss       dec      hex  filename
> > > 15848692  10099449  493060  26441201  19375f1  vmlinux
> > >
> > > [2] Clang LTO (arm64 defconfig + CONFIG_LTO_CLANG)
> > >
> > > $ llvm-size vmlinux
> > >     text      data     bss       dec      hex  filename
> > > 15906864  10197445  490804  26595113  195cf29  vmlinux
> > >
> > > I compared the size of raw binary, arch/arm64/boot/Image.
> > > Its size increased too.
> > >
> > > So, in my experiment, enabling CONFIG_LTO_CLANG
> > > increases the kernel size.
> > > Is this correct?
> >
> > Yes. LTO does produce larger binaries, mostly due to function
> > inlining between translation units, I believe. The compiler people
> > can probably give you a more detailed answer here. Without -mllvm
> > -import-instr-limit, the binaries would be even larger.
> >
> > > One more thing, could you teach me
> > > how Clang LTO optimizes the code against
> > > relocatable objects?
> > >
> > > When I learned Clang LTO first, I read this document:
> > > https://llvm.org/docs/LinkTimeOptimization.html
> > >
> > > It is easy to confirm the final executable
> > > does not contain foo2, foo3...
> > >
> > > In contrast to userspace programs,
> > > kernel modules are basically relocatable objects.
> > >
> > > Does Clang drop unused symbols from relocatable objects?
> > > If so, how?
> >
> > I don't think the compiler can legally drop global symbols from
> > relocatable objects, but it can rename and possibly even drop static
> > functions.
>
> Compilers can drop static functions without LTO.
> Rather, it is a compiler warning
> (-Wunused-function), so the code should be cleaned up.
>
> > This is why we need global wrappers for initcalls, for
> > example, to have stable symbol names.
> >
> > Sami
>
> At first, I thought the motivation of LTO
> was to remove unused global symbols, and
> to perform further optimization.
>
> It is true for userspace programs.
> In fact, the example of
> https://llvm.org/docs/LinkTimeOptimization.html
> produces a smaller binary.
>
> In contrast, this patch set produces a bigger kernel
> because LTO cannot remove any unused symbol.
>
> So, I do not understand what the benefit is.
>
> Is inlining beneficial?
> I am not sure.
>
> Documentation/process/coding-style.rst
> "15) The inline disease"
> mentions that inlining is not always
> a good thing.
>
> As a whole, I still do not understand
> the motivation of this patch set.

Clang produces faster code with LTO even if unused functions are not
removed, and I'm not sure how many unused globals there really are in
the kernel that aren't exported for modules.

However, as I mentioned in the cover letter, we also need LTO for
Control-Flow Integrity (CFI), which we have used in Pixel kernels for
a couple of years now, and plan to use in more Android devices in
future:

https://clang.llvm.org/docs/ControlFlowIntegrity.html

Sami
* Re: [PATCH v2 00/28] Add support for Clang LTO
From: Kees Cook @ 2020-09-10 18:18 UTC
To: Masahiro Yamada
Cc: linux-arch, X86 ML, Paul E. McKenney, Kernel Hardening, Peter Zijlstra, Greg Kroah-Hartman, Linux Kbuild mailing list, Nick Desaulniers, Linux Kernel Mailing List, Steven Rostedt, clang-built-linux, Sami Tolvanen, linux-pci, Will Deacon, linux-arm-kernel

On Thu, Sep 10, 2020 at 10:18:05AM +0900, Masahiro Yamada wrote:
> On Wed, Sep 9, 2020 at 8:46 AM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > On Sun, Sep 06, 2020 at 09:24:38AM +0900, Masahiro Yamada wrote:
> > > On Fri, Sep 4, 2020 at 5:30 AM Sami Tolvanen <samitolvanen@google.com> wrote:
> > > >
> > > > This patch series adds support for building x86_64 and arm64 kernels
> > > > with Clang's Link Time Optimization (LTO).
> > > [...]
> > > One more thing, could you teach me
> > > how Clang LTO optimizes the code against
> > > relocatable objects?
> > >
> > > When I learned Clang LTO first, I read this document:
> > > https://llvm.org/docs/LinkTimeOptimization.html
> > >
> > > It is easy to confirm the final executable
> > > does not contain foo2, foo3...
> > >
> > > In contrast to userspace programs,
> > > kernel modules are basically relocatable objects.
> > >
> > > Does Clang drop unused symbols from relocatable objects?
> > > If so, how?
> >
> > I don't think the compiler can legally drop global symbols from
> > relocatable objects, but it can rename and possibly even drop static
> > functions.
>
> Compilers can drop static functions without LTO.
> Rather, it is a compiler warning
> (-Wunused-function), so the code should be cleaned up.

Right -- I think you're both saying the same thing. Unused static
functions can be dropped (modulo a warning) in both regular and LTO
builds.

> At first, I thought the motivation of LTO
> was to remove unused global symbols, and
> to perform further optimization.

One of LTO's benefits is the performance optimizations, but that's not
the driving motivation for it here. The performance optimizations are
possible because LTO provides the compiler with a view of the entire
built-in portion of the kernel (i.e. not shared objects). That "visible
all at once" state is the central concern because CFI (Control Flow
Integrity, the driving motivation for this series) needs it in the same
way that the performance optimization passes need it.

i.e. to gain CFI coverage, LTO is required. Since LTO is a distinct
first step independent of CFI, it was split out to be upstreamed while
fixes for CFI continued to land independently[1]. Once LTO is landed,
CFI comes next.

> In contrast, this patch set produces a bigger kernel
> because LTO cannot remove any unused symbol.
>
> So, I do not understand what the benefit is.
>
> Is inlining beneficial?
> I am not sure.

This is just a side-effect of LTO. As Sami mentions, it's entirely
tunable, and that tuning was chosen based on measurements made for the
kernel being built with LTO[2].

> As a whole, I still do not understand
> the motivation of this patch set.

It is a prerequisite for CFI, and CFI has been protecting *mumble*billion
Android device kernels against code-reuse attacks for the last 2ish
years[3]. I want this available for the entire Linux ecosystem, not just
Android; it is a strong security flaw mitigation technique.

I hope that helps explain it!

-Kees

[1] for example, these are some:
    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=Control+Flow+Integrity
[2] https://lore.kernel.org/lkml/20200624203200.78870-1-samitolvanen@google.com/T/#m6b576c3af79bdacada10f21651a2b02d33a4e32e
[3] https://android-developers.googleblog.com/2018/10/control-flow-integrity-in-android-kernel.html

--
Kees Cook