linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 00/25] Add support for Clang LTO
@ 2020-10-13  0:31 Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 01/25] kbuild: preprocess module linker script Sami Tolvanen
                   ` (24 more replies)
  0 siblings, 25 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

This patch series adds support for building the x86_64 kernel with
Clang's Link Time Optimization (LTO).

In addition to performance, the primary motivation for LTO is
to allow Clang's Control-Flow Integrity (CFI) to be used in the
kernel. Google has shipped millions of Pixel devices running three
major kernel versions with LTO+CFI since 2018.

Most of the patches are build system changes for handling LLVM
bitcode, which Clang produces with LTO instead of ELF object files,
postponing ELF processing until a later stage, and ensuring initcall
ordering.

Note that this version is based on tip/master to reduce the number
of prerequisite patches, and to make it easier to manage changes to
objtool. Patch 1 is from Masahiro's kbuild tree, and while it's not
directly related to LTO, it makes the module linker script changes
cleaner.

Furthermore, patches 2-6 include Peter's patch for generating
__mcount_loc with objtool, and build system changes to enable it on
x86. With these patches, we no longer need to annotate functions
that have non-call references to __fentry__ with LTO, which greatly
simplifies supporting dynamic ftrace.

You can also pull this series from

  https://github.com/samitolvanen/linux.git lto-v6

---
Changes in v6:

  - Added the missing --mcount flag to patch 5.

  - Dropped the arm64 patches from this series and will repost them
    later.

Changes in v5:

  - Rebased on top of tip/master.

  - Changed the command line for objtool to use --vmlinux --duplicate
    to disable warnings about retpoline thunks and to fix .orc_unwind
    generation for vmlinux.o.

  - Added --noinstr flag to objtool, so we can use --vmlinux without
    also enabling noinstr validation.

  - Disabled objtool's unreachable instruction warnings with LTO to
    disable false positives for the int3 padding in vmlinux.o.

  - Added ANNOTATE_RETPOLINE_SAFE annotations to the indirect jumps
    in x86 assembly code to fix objtool warnings with retpoline.

  - Fixed modpost warnings about missing version information with
    CONFIG_MODVERSIONS.

  - Included Makefile.lib into Makefile.modpost for ld_flags. Thanks
    to Sedat for pointing this out.

  - Updated the help text for ThinLTO to better explain the trade-offs.

  - Updated commit messages with better explanations.

Changes in v4:

  - Fixed a typo in Makefile.lib to correctly pass --no-fp to objtool.

  - Moved ftrace configs related to generating __mcount_loc to Kconfig,
    so they are available also in Makefile.modfinal.

  - Dropped two prerequisite patches that were merged to Linus' tree.

Changes in v3:

  - Added a separate patch to remove the unused DISABLE_LTO treewide,
    as filtering out CC_FLAGS_LTO instead is preferred.

  - Updated the Kconfig help to explain why LTO is behind a choice
    and disabled by default.

  - Dropped CC_FLAGS_LTO_CLANG, compiler-specific LTO flags are now
    appended directly to CC_FLAGS_LTO.

  - Updated $(AR) flags as KBUILD_ARFLAGS was removed earlier.

  - Fixed ThinLTO cache handling for external module builds.

  - Rebased on top of Masahiro's patch for preprocessing modules.lds,
    and moved the contents of module-lto.lds to modules.lds.S.

  - Moved objtool_args to Makefile.lib to avoid duplication of the
    command line parameters in Makefile.modfinal.

  - Clarified in the commit message for the initcall ordering patch
    that the initcall order remains the same as without LTO.

  - Changed link-vmlinux.sh to use jobserver-exec to control the
    number of jobs started by generate_initcall_ordering.pl.

  - Dropped the x86/relocs patch to whitelist L4_PAGE_OFFSET as it's
    no longer needed with ToT kernel.

  - Disabled LTO for arch/x86/power/cpu.c to work around a Clang bug
    with stack protector attributes.

Changes in v2:

  - Fixed -Wmissing-prototypes warnings with W=1.

  - Dropped cc-option from -fsplit-lto-unit and added .thinlto-cache
    scrubbing to make distclean.

  - Added a comment about Clang >=11 being required.

  - Added a patch to disable LTO for the arm64 KVM nVHE code.

  - Disabled objtool's noinstr validation with LTO unless enabled.

  - Included Peter's proposed objtool mcount patch in the series
    and replaced recordmcount with the objtool pass to avoid
    whitelisting relocations that are not calls.

  - Updated several commit messages with better explanations.


Masahiro Yamada (1):
  kbuild: preprocess module linker script

Peter Zijlstra (1):
  objtool: Add a pass for generating __mcount_loc

Sami Tolvanen (23):
  objtool: Don't autodetect vmlinux.o
  tracing: move function tracer options to Kconfig
  tracing: add support for objtool mcount
  x86, build: use objtool mcount
  treewide: remove DISABLE_LTO
  kbuild: add support for Clang LTO
  kbuild: lto: fix module versioning
  objtool: Split noinstr validation from --vmlinux
  kbuild: lto: postpone objtool
  kbuild: lto: limit inlining
  kbuild: lto: merge module sections
  kbuild: lto: remove duplicate dependencies from .mod files
  init: lto: ensure initcall ordering
  init: lto: fix PREL32 relocations
  PCI: Fix PREL32 relocations for LTO
  modpost: lto: strip .lto from module names
  scripts/mod: disable LTO for empty.c
  efi/libstub: disable LTO
  drivers/misc/lkdtm: disable LTO for rodata.o
  x86/asm: annotate indirect jumps
  x86, vdso: disable LTO only for vDSO
  x86, cpu: disable LTO for cpu.c
  x86, build: allow LTO_CLANG and THINLTO to be selected

 .gitignore                                    |   1 +
 Makefile                                      |  68 +++--
 arch/Kconfig                                  |  74 +++++
 arch/arm/Makefile                             |   4 -
 .../module.lds => include/asm/module.lds.h}   |   2 +
 arch/arm64/Makefile                           |   4 -
 .../module.lds => include/asm/module.lds.h}   |   2 +
 arch/arm64/kernel/vdso/Makefile               |   1 -
 arch/ia64/Makefile                            |   1 -
 .../{module.lds => include/asm/module.lds.h}  |   0
 arch/m68k/Makefile                            |   1 -
 .../module.lds => include/asm/module.lds.h}   |   0
 arch/powerpc/Makefile                         |   1 -
 .../module.lds => include/asm/module.lds.h}   |   0
 arch/riscv/Makefile                           |   3 -
 .../module.lds => include/asm/module.lds.h}   |   3 +-
 arch/sparc/vdso/Makefile                      |   2 -
 arch/um/include/asm/Kbuild                    |   1 +
 arch/x86/Kconfig                              |   3 +
 arch/x86/Makefile                             |   5 +
 arch/x86/entry/vdso/Makefile                  |   5 +-
 arch/x86/kernel/acpi/wakeup_64.S              |   2 +
 arch/x86/platform/pvh/head.S                  |   2 +
 arch/x86/power/Makefile                       |   4 +
 arch/x86/power/hibernate_asm_64.S             |   3 +
 drivers/firmware/efi/libstub/Makefile         |   2 +
 drivers/misc/lkdtm/Makefile                   |   1 +
 include/asm-generic/Kbuild                    |   1 +
 include/asm-generic/module.lds.h              |  10 +
 include/asm-generic/vmlinux.lds.h             |  11 +-
 include/linux/init.h                          |  79 ++++-
 include/linux/pci.h                           |  19 +-
 kernel/Makefile                               |   3 -
 kernel/trace/Kconfig                          |  29 ++
 scripts/.gitignore                            |   1 +
 scripts/Makefile                              |   3 +
 scripts/Makefile.build                        |  69 +++--
 scripts/Makefile.lib                          |  17 +-
 scripts/Makefile.modfinal                     |  29 +-
 scripts/Makefile.modpost                      |  25 +-
 scripts/generate_initcall_order.pl            | 270 ++++++++++++++++++
 scripts/link-vmlinux.sh                       |  98 ++++++-
 scripts/mod/Makefile                          |   1 +
 scripts/mod/modpost.c                         |  16 +-
 scripts/mod/modpost.h                         |   9 +
 scripts/mod/sumversion.c                      |   6 +-
 scripts/{module-common.lds => module.lds.S}   |  31 ++
 scripts/package/builddeb                      |   2 +-
 tools/objtool/builtin-check.c                 |  10 +-
 tools/objtool/builtin.h                       |   2 +-
 tools/objtool/check.c                         |  84 +++++-
 tools/objtool/check.h                         |   1 +
 tools/objtool/objtool.c                       |   1 +
 tools/objtool/objtool.h                       |   1 +
 54 files changed, 895 insertions(+), 128 deletions(-)
 rename arch/arm/{kernel/module.lds => include/asm/module.lds.h} (72%)
 rename arch/arm64/{kernel/module.lds => include/asm/module.lds.h} (76%)
 rename arch/ia64/{module.lds => include/asm/module.lds.h} (100%)
 rename arch/m68k/{kernel/module.lds => include/asm/module.lds.h} (100%)
 rename arch/powerpc/{kernel/module.lds => include/asm/module.lds.h} (100%)
 rename arch/riscv/{kernel/module.lds => include/asm/module.lds.h} (84%)
 create mode 100644 include/asm-generic/module.lds.h
 create mode 100755 scripts/generate_initcall_order.pl
 rename scripts/{module-common.lds => module.lds.S} (59%)


base-commit: a292570e9f694ed50d3e69afd6d54272fd40deca
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [PATCH v6 01/25] kbuild: preprocess module linker script
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 02/25] objtool: Add a pass for generating __mcount_loc Sami Tolvanen
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86

From: Masahiro Yamada <masahiroy@kernel.org>

There was a request to preprocess the module linker script like we
do for the vmlinux one. (https://lkml.org/lkml/2020/8/21/512)

The difference between vmlinux.lds and module.lds is that the latter
is needed for external module builds, thus must be cleaned up by
'make mrproper' instead of 'make clean'. Also, it must be created
by 'make modules_prepare'.

You cannot put it in arch/$(SRCARCH)/kernel/, which is cleaned up by
'make clean'. I moved arch/$(SRCARCH)/kernel/module.lds to
arch/$(SRCARCH)/include/asm/module.lds.h, which is included from
scripts/module.lds.S.

scripts/module.lds is fine because 'make clean' keeps all the
build artifacts under scripts/.

You can add arch-specific sections in <asm/module.lds.h>.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
---
 Makefile                                               | 10 ++++++----
 arch/arm/Makefile                                      |  4 ----
 .../{kernel/module.lds => include/asm/module.lds.h}    |  2 ++
 arch/arm64/Makefile                                    |  4 ----
 .../{kernel/module.lds => include/asm/module.lds.h}    |  2 ++
 arch/ia64/Makefile                                     |  1 -
 arch/ia64/{module.lds => include/asm/module.lds.h}     |  0
 arch/m68k/Makefile                                     |  1 -
 .../{kernel/module.lds => include/asm/module.lds.h}    |  0
 arch/powerpc/Makefile                                  |  1 -
 .../{kernel/module.lds => include/asm/module.lds.h}    |  0
 arch/riscv/Makefile                                    |  3 ---
 .../{kernel/module.lds => include/asm/module.lds.h}    |  3 ++-
 arch/um/include/asm/Kbuild                             |  1 +
 include/asm-generic/Kbuild                             |  1 +
 include/asm-generic/module.lds.h                       | 10 ++++++++++
 scripts/.gitignore                                     |  1 +
 scripts/Makefile                                       |  3 +++
 scripts/Makefile.modfinal                              |  5 ++---
 scripts/{module-common.lds => module.lds.S}            |  3 +++
 scripts/package/builddeb                               |  2 +-
 21 files changed, 34 insertions(+), 23 deletions(-)
 rename arch/arm/{kernel/module.lds => include/asm/module.lds.h} (72%)
 rename arch/arm64/{kernel/module.lds => include/asm/module.lds.h} (76%)
 rename arch/ia64/{module.lds => include/asm/module.lds.h} (100%)
 rename arch/m68k/{kernel/module.lds => include/asm/module.lds.h} (100%)
 rename arch/powerpc/{kernel/module.lds => include/asm/module.lds.h} (100%)
 rename arch/riscv/{kernel/module.lds => include/asm/module.lds.h} (84%)
 create mode 100644 include/asm-generic/module.lds.h
 rename scripts/{module-common.lds => module.lds.S} (93%)

diff --git a/Makefile b/Makefile
index 51540b291738..0dcf302fe2da 100644
--- a/Makefile
+++ b/Makefile
@@ -505,7 +505,6 @@ KBUILD_CFLAGS_KERNEL :=
 KBUILD_AFLAGS_MODULE  := -DMODULE
 KBUILD_CFLAGS_MODULE  := -DMODULE
 KBUILD_LDFLAGS_MODULE :=
-export KBUILD_LDS_MODULE := $(srctree)/scripts/module-common.lds
 KBUILD_LDFLAGS :=
 CLANG_FLAGS :=
 
@@ -1384,7 +1383,7 @@ endif
 # using awk while concatenating to the final file.
 
 PHONY += modules
-modules: $(if $(KBUILD_BUILTIN),vmlinux) modules_check
+modules: $(if $(KBUILD_BUILTIN),vmlinux) modules_check modules_prepare
 	$(Q)$(MAKE) -f $(srctree)/scripts/Makefile.modpost
 
 PHONY += modules_check
@@ -1401,6 +1400,7 @@ targets += modules.order
 # Target to prepare building external modules
 PHONY += modules_prepare
 modules_prepare: prepare
+	$(Q)$(MAKE) $(build)=scripts scripts/module.lds
 
 # Target to install modules
 PHONY += modules_install
@@ -1722,7 +1722,9 @@ help:
 	@echo  '  clean           - remove generated files in module directory only'
 	@echo  ''
 
-PHONY += prepare
+# no-op for external module builds
+PHONY += prepare modules_prepare
+
 endif # KBUILD_EXTMOD
 
 # Single targets
@@ -1755,7 +1757,7 @@ MODORDER := .modules.tmp
 endif
 
 PHONY += single_modpost
-single_modpost: $(single-no-ko)
+single_modpost: $(single-no-ko) modules_prepare
 	$(Q){ $(foreach m, $(single-ko), echo $(extmod-prefix)$m;) } > $(MODORDER)
 	$(Q)$(MAKE) -f $(srctree)/scripts/Makefile.modpost
 
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index e589da3c8949..9c4d19ae7d61 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -20,10 +20,6 @@ endif
 # linker. All sections should be explicitly named in the linker script.
 LDFLAGS_vmlinux += $(call ld-option, --orphan-handling=warn)
 
-ifeq ($(CONFIG_ARM_MODULE_PLTS),y)
-KBUILD_LDS_MODULE	+= $(srctree)/arch/arm/kernel/module.lds
-endif
-
 GZFLAGS		:=-9
 #KBUILD_CFLAGS	+=-pipe
 
diff --git a/arch/arm/kernel/module.lds b/arch/arm/include/asm/module.lds.h
similarity index 72%
rename from arch/arm/kernel/module.lds
rename to arch/arm/include/asm/module.lds.h
index 79cb6af565e5..0e7cb4e314b4 100644
--- a/arch/arm/kernel/module.lds
+++ b/arch/arm/include/asm/module.lds.h
@@ -1,5 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
+#ifdef CONFIG_ARM_MODULE_PLTS
 SECTIONS {
 	.plt : { BYTE(0) }
 	.init.plt : { BYTE(0) }
 }
+#endif
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index aff994473eb2..32d7e07b0acf 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -127,10 +127,6 @@ endif
 
 CHECKFLAGS	+= -D__aarch64__
 
-ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
-KBUILD_LDS_MODULE	+= $(srctree)/arch/arm64/kernel/module.lds
-endif
-
 ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y)
   KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
   CC_FLAGS_FTRACE := -fpatchable-function-entry=2
diff --git a/arch/arm64/kernel/module.lds b/arch/arm64/include/asm/module.lds.h
similarity index 76%
rename from arch/arm64/kernel/module.lds
rename to arch/arm64/include/asm/module.lds.h
index 22e36a21c113..691f15af788e 100644
--- a/arch/arm64/kernel/module.lds
+++ b/arch/arm64/include/asm/module.lds.h
@@ -1,5 +1,7 @@
+#ifdef CONFIG_ARM64_MODULE_PLTS
 SECTIONS {
 	.plt (NOLOAD) : { BYTE(0) }
 	.init.plt (NOLOAD) : { BYTE(0) }
 	.text.ftrace_trampoline (NOLOAD) : { BYTE(0) }
 }
+#endif
diff --git a/arch/ia64/Makefile b/arch/ia64/Makefile
index 2876a7df1b0a..703b1c4f6d12 100644
--- a/arch/ia64/Makefile
+++ b/arch/ia64/Makefile
@@ -20,7 +20,6 @@ CHECKFLAGS	+= -D__ia64=1 -D__ia64__=1 -D_LP64 -D__LP64__
 
 OBJCOPYFLAGS	:= --strip-all
 LDFLAGS_vmlinux	:= -static
-KBUILD_LDS_MODULE += $(srctree)/arch/ia64/module.lds
 KBUILD_AFLAGS_KERNEL := -mconstant-gp
 EXTRA		:=
 
diff --git a/arch/ia64/module.lds b/arch/ia64/include/asm/module.lds.h
similarity index 100%
rename from arch/ia64/module.lds
rename to arch/ia64/include/asm/module.lds.h
diff --git a/arch/m68k/Makefile b/arch/m68k/Makefile
index 4438ffb4bbe1..ea14f2046fb4 100644
--- a/arch/m68k/Makefile
+++ b/arch/m68k/Makefile
@@ -75,7 +75,6 @@ KBUILD_CPPFLAGS += -D__uClinux__
 endif
 
 KBUILD_LDFLAGS := -m m68kelf
-KBUILD_LDS_MODULE += $(srctree)/arch/m68k/kernel/module.lds
 
 ifdef CONFIG_SUN3
 LDFLAGS_vmlinux = -N
diff --git a/arch/m68k/kernel/module.lds b/arch/m68k/include/asm/module.lds.h
similarity index 100%
rename from arch/m68k/kernel/module.lds
rename to arch/m68k/include/asm/module.lds.h
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 3e8da9cf2eb9..8935658fcd06 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -65,7 +65,6 @@ UTS_MACHINE := $(subst $(space),,$(machine-y))
 ifdef CONFIG_PPC32
 KBUILD_LDFLAGS_MODULE += arch/powerpc/lib/crtsavres.o
 else
-KBUILD_LDS_MODULE += $(srctree)/arch/powerpc/kernel/module.lds
 ifeq ($(call ld-ifversion, -ge, 225000000, y),y)
 # Have the linker provide sfpr if possible.
 # There is a corresponding test in arch/powerpc/lib/Makefile
diff --git a/arch/powerpc/kernel/module.lds b/arch/powerpc/include/asm/module.lds.h
similarity index 100%
rename from arch/powerpc/kernel/module.lds
rename to arch/powerpc/include/asm/module.lds.h
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index fb6e37db836d..8edaa8bd86d6 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -53,9 +53,6 @@ endif
 ifeq ($(CONFIG_CMODEL_MEDANY),y)
 	KBUILD_CFLAGS += -mcmodel=medany
 endif
-ifeq ($(CONFIG_MODULE_SECTIONS),y)
-	KBUILD_LDS_MODULE += $(srctree)/arch/riscv/kernel/module.lds
-endif
 ifeq ($(CONFIG_PERF_EVENTS),y)
         KBUILD_CFLAGS += -fno-omit-frame-pointer
 endif
diff --git a/arch/riscv/kernel/module.lds b/arch/riscv/include/asm/module.lds.h
similarity index 84%
rename from arch/riscv/kernel/module.lds
rename to arch/riscv/include/asm/module.lds.h
index 295ecfb341a2..4254ff2ff049 100644
--- a/arch/riscv/kernel/module.lds
+++ b/arch/riscv/include/asm/module.lds.h
@@ -1,8 +1,9 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /* Copyright (C) 2017 Andes Technology Corporation */
-
+#ifdef CONFIG_MODULE_SECTIONS
 SECTIONS {
 	.plt (NOLOAD) : { BYTE(0) }
 	.got (NOLOAD) : { BYTE(0) }
 	.got.plt (NOLOAD) : { BYTE(0) }
 }
+#endif
diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild
index 8d435f8a6dec..1c63b260ecc4 100644
--- a/arch/um/include/asm/Kbuild
+++ b/arch/um/include/asm/Kbuild
@@ -16,6 +16,7 @@ generic-y += kdebug.h
 generic-y += mcs_spinlock.h
 generic-y += mm-arch-hooks.h
 generic-y += mmiowb.h
+generic-y += module.lds.h
 generic-y += param.h
 generic-y += pci.h
 generic-y += percpu.h
diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild
index 74b0612601dd..7cd4e627e00e 100644
--- a/include/asm-generic/Kbuild
+++ b/include/asm-generic/Kbuild
@@ -40,6 +40,7 @@ mandatory-y += mmiowb.h
 mandatory-y += mmu.h
 mandatory-y += mmu_context.h
 mandatory-y += module.h
+mandatory-y += module.lds.h
 mandatory-y += msi.h
 mandatory-y += pci.h
 mandatory-y += percpu.h
diff --git a/include/asm-generic/module.lds.h b/include/asm-generic/module.lds.h
new file mode 100644
index 000000000000..f210d5c1b78b
--- /dev/null
+++ b/include/asm-generic/module.lds.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_GENERIC_MODULE_LDS_H
+#define __ASM_GENERIC_MODULE_LDS_H
+
+/*
+ * <asm/module.lds.h> can specify arch-specific sections for linking modules.
+ * Empty for the asm-generic header.
+ */
+
+#endif /* __ASM_GENERIC_MODULE_LDS_H */
diff --git a/scripts/.gitignore b/scripts/.gitignore
index 0d1c8e217cd7..a6c11316c969 100644
--- a/scripts/.gitignore
+++ b/scripts/.gitignore
@@ -8,3 +8,4 @@ asn1_compiler
 extract-cert
 sign-file
 insert-sys-cert
+/module.lds
diff --git a/scripts/Makefile b/scripts/Makefile
index bc018e4b733e..b5418ec587fb 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -29,6 +29,9 @@ endif
 # The following programs are only built on demand
 hostprogs += unifdef
 
+# The module linker script is preprocessed on demand
+targets += module.lds
+
 subdir-$(CONFIG_GCC_PLUGINS) += gcc-plugins
 subdir-$(CONFIG_MODVERSIONS) += genksyms
 subdir-$(CONFIG_SECURITY_SELINUX) += selinux
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index 411c1e600e7d..ae01baf96f4e 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -33,11 +33,10 @@ quiet_cmd_ld_ko_o = LD [M]  $@
       cmd_ld_ko_o =                                                     \
 	$(LD) -r $(KBUILD_LDFLAGS)					\
 		$(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE)		\
-		$(addprefix -T , $(KBUILD_LDS_MODULE))			\
-		-o $@ $(filter %.o, $^);				\
+		-T scripts/module.lds -o $@ $(filter %.o, $^);		\
 	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
-$(modules): %.ko: %.o %.mod.o $(KBUILD_LDS_MODULE) FORCE
+$(modules): %.ko: %.o %.mod.o scripts/module.lds FORCE
 	+$(call if_changed,ld_ko_o)
 
 targets += $(modules) $(modules:.ko=.mod.o)
diff --git a/scripts/module-common.lds b/scripts/module.lds.S
similarity index 93%
rename from scripts/module-common.lds
rename to scripts/module.lds.S
index d61b9e8678e8..69b9b71a6a47 100644
--- a/scripts/module-common.lds
+++ b/scripts/module.lds.S
@@ -24,3 +24,6 @@ SECTIONS {
 
 	__jump_table		0 : ALIGN(8) { KEEP(*(__jump_table)) }
 }
+
+/* bring in arch-specific sections */
+#include <asm/module.lds.h>
diff --git a/scripts/package/builddeb b/scripts/package/builddeb
index 6df3c9f8b2da..44f212e37935 100755
--- a/scripts/package/builddeb
+++ b/scripts/package/builddeb
@@ -55,7 +55,7 @@ deploy_kernel_headers () {
 		cd $srctree
 		find . arch/$SRCARCH -maxdepth 1 -name Makefile\*
 		find include scripts -type f -o -type l
-		find arch/$SRCARCH -name module.lds -o -name Kbuild.platforms -o -name Platform
+		find arch/$SRCARCH -name Kbuild.platforms -o -name Platform
 		find $(find arch/$SRCARCH -name include -o -name scripts -type d) -type f
 	) > debian/hdrsrcfiles
 
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 02/25] objtool: Add a pass for generating __mcount_loc
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 01/25] kbuild: preprocess module linker script Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-14 16:50   ` Ingo Molnar
  2020-10-13  0:31 ` [PATCH v6 03/25] objtool: Don't autodetect vmlinux.o Sami Tolvanen
                   ` (22 subsequent siblings)
  24 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

From: Peter Zijlstra <peterz@infradead.org>

Add the --mcount option for generating __mcount_loc sections
needed for dynamic ftrace. Using this pass requires the kernel to
be compiled with -mfentry and CC_USING_NOP_MCOUNT to be defined
in Makefile.

Link: https://lore.kernel.org/lkml/20200625200235.GQ4781@hirez.programming.kicks-ass.net/
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
[Sami: rebased, dropped config changes, fixed to actually use --mcount,
       and wrote a commit message.]
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 tools/objtool/builtin-check.c |  3 +-
 tools/objtool/builtin.h       |  2 +-
 tools/objtool/check.c         | 82 +++++++++++++++++++++++++++++++++++
 tools/objtool/check.h         |  1 +
 tools/objtool/objtool.c       |  1 +
 tools/objtool/objtool.h       |  1 +
 6 files changed, 88 insertions(+), 2 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index c6d199bfd0ae..e92e76f69176 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -18,7 +18,7 @@
 #include "builtin.h"
 #include "objtool.h"
 
-bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux;
+bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount;
 
 static const char * const check_usage[] = {
 	"objtool check [<options>] file.o",
@@ -35,6 +35,7 @@ const struct option check_options[] = {
 	OPT_BOOLEAN('s', "stats", &stats, "print statistics"),
 	OPT_BOOLEAN('d', "duplicate", &validate_dup, "duplicate validation for vmlinux.o"),
 	OPT_BOOLEAN('l', "vmlinux", &vmlinux, "vmlinux.o validation"),
+	OPT_BOOLEAN('M', "mcount", &mcount, "generate __mcount_loc"),
 	OPT_END(),
 };
 
diff --git a/tools/objtool/builtin.h b/tools/objtool/builtin.h
index 85c979caa367..94565a72b701 100644
--- a/tools/objtool/builtin.h
+++ b/tools/objtool/builtin.h
@@ -8,7 +8,7 @@
 #include <subcmd/parse-options.h>
 
 extern const struct option check_options[];
-extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux;
+extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount;
 
 extern int cmd_check(int argc, const char **argv);
 extern int cmd_orc(int argc, const char **argv);
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index c6ab44543c92..c39c4d2b432a 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -523,6 +523,65 @@ static int create_static_call_sections(struct objtool_file *file)
 	return 0;
 }
 
+static int create_mcount_loc_sections(struct objtool_file *file)
+{
+	struct section *sec, *reloc_sec;
+	struct reloc *reloc;
+	unsigned long *loc;
+	struct instruction *insn;
+	int idx;
+
+	sec = find_section_by_name(file->elf, "__mcount_loc");
+	if (sec) {
+		INIT_LIST_HEAD(&file->mcount_loc_list);
+		WARN("file already has __mcount_loc section, skipping");
+		return 0;
+	}
+
+	if (list_empty(&file->mcount_loc_list))
+		return 0;
+
+	idx = 0;
+	list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node)
+		idx++;
+
+	sec = elf_create_section(file->elf, "__mcount_loc", 0, sizeof(unsigned long), idx);
+	if (!sec)
+		return -1;
+
+	reloc_sec = elf_create_reloc_section(file->elf, sec, SHT_RELA);
+	if (!reloc_sec)
+		return -1;
+
+	idx = 0;
+	list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) {
+
+		loc = (unsigned long *)sec->data->d_buf + idx;
+		memset(loc, 0, sizeof(unsigned long));
+
+		reloc = malloc(sizeof(*reloc));
+		if (!reloc) {
+			perror("malloc");
+			return -1;
+		}
+		memset(reloc, 0, sizeof(*reloc));
+
+		reloc->sym = insn->sec->sym;
+		reloc->addend = insn->offset;
+		reloc->type = R_X86_64_64;
+		reloc->offset = idx * sizeof(unsigned long);
+		reloc->sec = reloc_sec;
+		elf_add_reloc(file->elf, reloc);
+
+		idx++;
+	}
+
+	if (elf_rebuild_reloc_section(file->elf, reloc_sec))
+		return -1;
+
+	return 0;
+}
+
 /*
  * Warnings shouldn't be reported for ignored functions.
  */
@@ -949,6 +1008,22 @@ static int add_call_destinations(struct objtool_file *file)
 			insn->type = INSN_NOP;
 		}
 
+		if (mcount && !strcmp(insn->call_dest->name, "__fentry__")) {
+			if (reloc) {
+				reloc->type = R_NONE;
+				elf_write_reloc(file->elf, reloc);
+			}
+
+			elf_write_insn(file->elf, insn->sec,
+				       insn->offset, insn->len,
+				       arch_nop_insn(insn->len));
+
+			insn->type = INSN_NOP;
+
+			list_add_tail(&insn->mcount_loc_node,
+				      &file->mcount_loc_list);
+		}
+
 		/*
 		 * Whatever stack impact regular CALLs have, should be undone
 		 * by the RETURN of the called function.
@@ -2920,6 +2995,13 @@ int check(struct objtool_file *file)
 		goto out;
 	warnings += ret;
 
+	if (mcount) {
+		ret = create_mcount_loc_sections(file);
+		if (ret < 0)
+			goto out;
+		warnings += ret;
+	}
+
 out:
 	if (ret < 0) {
 		/*
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index 5ec00a4b891b..070baec2050a 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -23,6 +23,7 @@ struct instruction {
 	struct list_head list;
 	struct hlist_node hash;
 	struct list_head static_call_node;
+	struct list_head mcount_loc_node;
 	struct section *sec;
 	unsigned long offset;
 	unsigned int len;
diff --git a/tools/objtool/objtool.c b/tools/objtool/objtool.c
index 9df0cd86d310..c1819a6f2a18 100644
--- a/tools/objtool/objtool.c
+++ b/tools/objtool/objtool.c
@@ -62,6 +62,7 @@ struct objtool_file *objtool_open_read(const char *_objname)
 	INIT_LIST_HEAD(&file.insn_list);
 	hash_init(file.insn_hash);
 	INIT_LIST_HEAD(&file.static_call_list);
+	INIT_LIST_HEAD(&file.mcount_loc_list);
 	file.c_file = !vmlinux && find_section_by_name(file.elf, ".comment");
 	file.ignore_unreachables = no_unreachable;
 	file.hints = false;
diff --git a/tools/objtool/objtool.h b/tools/objtool/objtool.h
index 4125d4578b23..cf004dd60c2b 100644
--- a/tools/objtool/objtool.h
+++ b/tools/objtool/objtool.h
@@ -19,6 +19,7 @@ struct objtool_file {
 	struct list_head insn_list;
 	DECLARE_HASHTABLE(insn_hash, 20);
 	struct list_head static_call_list;
+	struct list_head mcount_loc_list;
 	bool ignore_unreachables, c_file, hints, rodata;
 };
 
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 03/25] objtool: Don't autodetect vmlinux.o
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 01/25] kbuild: preprocess module linker script Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 02/25] objtool: Add a pass for generating __mcount_loc Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 04/25] tracing: move function tracer options to Kconfig Sami Tolvanen
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

With LTO, we run objtool on vmlinux.o, but don't want noinstr
validation. This change requires --vmlinux to be passed to objtool
explicitly.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 scripts/link-vmlinux.sh       | 2 +-
 tools/objtool/builtin-check.c | 6 +-----
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index e6e2d9e5ff48..372c3719f94c 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -64,7 +64,7 @@ objtool_link()
 	local objtoolopt;
 
 	if [ -n "${CONFIG_VMLINUX_VALIDATION}" ]; then
-		objtoolopt="check"
+		objtoolopt="check --vmlinux"
 		if [ -z "${CONFIG_FRAME_POINTER}" ]; then
 			objtoolopt="${objtoolopt} --no-fp"
 		fi
diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index e92e76f69176..facfc10bc5dc 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -41,7 +41,7 @@ const struct option check_options[] = {
 
 int cmd_check(int argc, const char **argv)
 {
-	const char *objname, *s;
+	const char *objname;
 	struct objtool_file *file;
 	int ret;
 
@@ -52,10 +52,6 @@ int cmd_check(int argc, const char **argv)
 
 	objname = argv[0];
 
-	s = strstr(objname, "vmlinux.o");
-	if (s && !s[9])
-		vmlinux = true;
-
 	file = objtool_open_read(objname);
 	if (!file)
 		return 1;
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 04/25] tracing: move function tracer options to Kconfig
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (2 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 03/25] objtool: Don't autodetect vmlinux.o Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 05/25] tracing: add support for objtool mcount Sami Tolvanen
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

Move function tracer options to Kconfig to make it easier to add
new methods for generating __mcount_loc, and to make the options
available also when building kernel modules.

Note that FTRACE_MCOUNT_USE_* options are updated on rebuild and
therefore, work even if the .config was generated in a different
environment.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 Makefile               | 20 ++++++++------------
 kernel/trace/Kconfig   | 16 ++++++++++++++++
 scripts/Makefile.build |  6 ++----
 3 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/Makefile b/Makefile
index 0dcf302fe2da..129001b38357 100644
--- a/Makefile
+++ b/Makefile
@@ -841,12 +841,8 @@ KBUILD_CFLAGS += $(DEBUG_CFLAGS)
 export DEBUG_CFLAGS
 
 ifdef CONFIG_FUNCTION_TRACER
-ifdef CONFIG_FTRACE_MCOUNT_RECORD
-  # gcc 5 supports generating the mcount tables directly
-  ifeq ($(call cc-option-yn,-mrecord-mcount),y)
-    CC_FLAGS_FTRACE	+= -mrecord-mcount
-    export CC_USING_RECORD_MCOUNT := 1
-  endif
+ifdef CONFIG_FTRACE_MCOUNT_USE_CC
+  CC_FLAGS_FTRACE	+= -mrecord-mcount
   ifdef CONFIG_HAVE_NOP_MCOUNT
     ifeq ($(call cc-option-yn, -mnop-mcount),y)
       CC_FLAGS_FTRACE	+= -mnop-mcount
@@ -854,6 +850,12 @@ ifdef CONFIG_FTRACE_MCOUNT_RECORD
     endif
   endif
 endif
+ifdef CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT
+  ifdef CONFIG_HAVE_C_RECORDMCOUNT
+    BUILD_C_RECORDMCOUNT := y
+    export BUILD_C_RECORDMCOUNT
+  endif
+endif
 ifdef CONFIG_HAVE_FENTRY
   ifeq ($(call cc-option-yn, -mfentry),y)
     CC_FLAGS_FTRACE	+= -mfentry
@@ -863,12 +865,6 @@ endif
 export CC_FLAGS_FTRACE
 KBUILD_CFLAGS	+= $(CC_FLAGS_FTRACE) $(CC_FLAGS_USING)
 KBUILD_AFLAGS	+= $(CC_FLAGS_USING)
-ifdef CONFIG_DYNAMIC_FTRACE
-	ifdef CONFIG_HAVE_C_RECORDMCOUNT
-		BUILD_C_RECORDMCOUNT := y
-		export BUILD_C_RECORDMCOUNT
-	endif
-endif
 endif
 
 # We trigger additional mismatches with less inlining
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index a4020c0b4508..927ad004888a 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -595,6 +595,22 @@ config FTRACE_MCOUNT_RECORD
 	depends on DYNAMIC_FTRACE
 	depends on HAVE_FTRACE_MCOUNT_RECORD
 
+config FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
+	bool
+	depends on FTRACE_MCOUNT_RECORD
+
+config FTRACE_MCOUNT_USE_CC
+	def_bool y
+	depends on $(cc-option,-mrecord-mcount)
+	depends on !FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
+	depends on FTRACE_MCOUNT_RECORD
+
+config FTRACE_MCOUNT_USE_RECORDMCOUNT
+	def_bool y
+	depends on !FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
+	depends on !FTRACE_MCOUNT_USE_CC
+	depends on FTRACE_MCOUNT_RECORD
+
 config TRACING_MAP
 	bool
 	depends on ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index a467b9323442..a4634aae1506 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -178,8 +178,7 @@ cmd_modversions_c =								\
 	fi
 endif
 
-ifdef CONFIG_FTRACE_MCOUNT_RECORD
-ifndef CC_USING_RECORD_MCOUNT
+ifdef CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT
 # compiler will not generate __mcount_loc use recordmcount or recordmcount.pl
 ifdef BUILD_C_RECORDMCOUNT
 ifeq ("$(origin RECORDMCOUNT_WARN)", "command line")
@@ -206,8 +205,7 @@ recordmcount_source := $(srctree)/scripts/recordmcount.pl
 endif # BUILD_C_RECORDMCOUNT
 cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)),	\
 	$(sub_cmd_record_mcount))
-endif # CC_USING_RECORD_MCOUNT
-endif # CONFIG_FTRACE_MCOUNT_RECORD
+endif # CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT
 
 ifdef CONFIG_STACK_VALIDATION
 ifneq ($(SKIP_STACK_VALIDATION),1)
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 05/25] tracing: add support for objtool mcount
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (3 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 04/25] tracing: move function tracer options to Kconfig Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 06/25] x86, build: use " Sami Tolvanen
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

This change adds build support for using objtool to generate
__mcount_loc sections.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 Makefile               | 12 ++++++++++--
 kernel/trace/Kconfig   | 13 +++++++++++++
 scripts/Makefile.build |  3 +++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 129001b38357..fda1f8a0b1c7 100644
--- a/Makefile
+++ b/Makefile
@@ -850,6 +850,9 @@ ifdef CONFIG_FTRACE_MCOUNT_USE_CC
     endif
   endif
 endif
+ifdef CONFIG_FTRACE_MCOUNT_USE_OBJTOOL
+  CC_FLAGS_USING	+= -DCC_USING_NOP_MCOUNT
+endif
 ifdef CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT
   ifdef CONFIG_HAVE_C_RECORDMCOUNT
     BUILD_C_RECORDMCOUNT := y
@@ -1209,11 +1212,16 @@ uapi-asm-generic:
 PHONY += prepare-objtool prepare-resolve_btfids
 prepare-objtool: $(objtool_target)
 ifeq ($(SKIP_STACK_VALIDATION),1)
+objtool-lib-prompt := "please install libelf-dev, libelf-devel or elfutils-libelf-devel"
+ifdef CONFIG_FTRACE_MCOUNT_USE_OBJTOOL
+	@echo "error: Cannot generate __mcount_loc for CONFIG_DYNAMIC_FTRACE=y, $(objtool-lib-prompt)" >&2
+	@false
+endif
 ifdef CONFIG_UNWINDER_ORC
-	@echo "error: Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel" >&2
+	@echo "error: Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, $(objtool-lib-prompt)" >&2
 	@false
 else
-	@echo "warning: Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel" >&2
+	@echo "warning: Cannot use CONFIG_STACK_VALIDATION=y, $(objtool-lib-prompt)" >&2
 endif
 endif
 
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 927ad004888a..89263210ab26 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -51,6 +51,11 @@ config HAVE_NOP_MCOUNT
 	help
 	  Arch supports the gcc options -pg with -mrecord-mcount and -nop-mcount
 
+config HAVE_OBJTOOL_MCOUNT
+	bool
+	help
+	  Arch supports objtool --mcount
+
 config HAVE_C_RECORDMCOUNT
 	bool
 	help
@@ -605,10 +610,18 @@ config FTRACE_MCOUNT_USE_CC
 	depends on !FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
 	depends on FTRACE_MCOUNT_RECORD
 
+config FTRACE_MCOUNT_USE_OBJTOOL
+	def_bool y
+	depends on HAVE_OBJTOOL_MCOUNT
+	depends on !FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
+	depends on !FTRACE_MCOUNT_USE_CC
+	depends on FTRACE_MCOUNT_RECORD
+
 config FTRACE_MCOUNT_USE_RECORDMCOUNT
 	def_bool y
 	depends on !FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
 	depends on !FTRACE_MCOUNT_USE_CC
+	depends on !FTRACE_MCOUNT_USE_OBJTOOL
 	depends on FTRACE_MCOUNT_RECORD
 
 config TRACING_MAP
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index a4634aae1506..cd4294435fef 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -228,6 +228,9 @@ endif
 ifdef CONFIG_X86_SMAP
   objtool_args += --uaccess
 endif
+ifdef CONFIG_FTRACE_MCOUNT_USE_OBJTOOL
+  objtool_args += --mcount
+endif
 
 # 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory
 # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 06/25] x86, build: use objtool mcount
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (4 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 05/25] tracing: add support for objtool mcount Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 07/25] treewide: remove DISABLE_LTO Sami Tolvanen
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

Select HAVE_OBJTOOL_MCOUNT if STACK_VALIDATION is selected to use
objtool to generate __mcount_loc sections for dynamic ftrace with
Clang and gcc <5 (later versions of gcc use -mrecord-mcount).

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5e832fd520b5..6d67646153bc 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -163,6 +163,7 @@ config X86
 	select HAVE_CMPXCHG_LOCAL
 	select HAVE_CONTEXT_TRACKING		if X86_64
 	select HAVE_C_RECORDMCOUNT
+	select HAVE_OBJTOOL_MCOUNT		if STACK_VALIDATION
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 07/25] treewide: remove DISABLE_LTO
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (5 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 06/25] x86, build: use " Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-14 22:43   ` Kees Cook
  2020-10-13  0:31 ` [PATCH v6 08/25] kbuild: add support for Clang LTO Sami Tolvanen
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

This change removes all instances of DISABLE_LTO from
Makefiles, as they are currently unused, and the preferred
method of disabling LTO is to filter out the flags instead.

Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/arm64/kernel/vdso/Makefile | 1 -
 arch/sparc/vdso/Makefile        | 2 --
 arch/x86/entry/vdso/Makefile    | 2 --
 kernel/Makefile                 | 3 ---
 scripts/Makefile.build          | 2 +-
 5 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
index 45d5cfe46429..e836e300440f 100644
--- a/arch/arm64/kernel/vdso/Makefile
+++ b/arch/arm64/kernel/vdso/Makefile
@@ -31,7 +31,6 @@ ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18
 ccflags-y += -DDISABLE_BRANCH_PROFILING
 
 CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS) $(GCC_PLUGINS_CFLAGS)
-KBUILD_CFLAGS			+= $(DISABLE_LTO)
 KASAN_SANITIZE			:= n
 UBSAN_SANITIZE			:= n
 OBJECT_FILES_NON_STANDARD	:= y
diff --git a/arch/sparc/vdso/Makefile b/arch/sparc/vdso/Makefile
index f44355e46f31..476c4b315505 100644
--- a/arch/sparc/vdso/Makefile
+++ b/arch/sparc/vdso/Makefile
@@ -3,8 +3,6 @@
 # Building vDSO images for sparc.
 #
 
-KBUILD_CFLAGS += $(DISABLE_LTO)
-
 VDSO64-$(CONFIG_SPARC64)	:= y
 VDSOCOMPAT-$(CONFIG_COMPAT)	:= y
 
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 215376d975a2..ecc27018ae13 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -9,8 +9,6 @@ ARCH_REL_TYPE_ABS := R_X86_64_JUMP_SLOT|R_X86_64_GLOB_DAT|R_X86_64_RELATIVE|
 ARCH_REL_TYPE_ABS += R_386_GLOB_DAT|R_386_JMP_SLOT|R_386_RELATIVE
 include $(srctree)/lib/vdso/Makefile
 
-KBUILD_CFLAGS += $(DISABLE_LTO)
-
 # Sanitizer runtimes are unavailable and cannot be linked here.
 KASAN_SANITIZE			:= n
 UBSAN_SANITIZE			:= n
diff --git a/kernel/Makefile b/kernel/Makefile
index 16ec9262ce9d..2561abc91961 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -38,9 +38,6 @@ KASAN_SANITIZE_kcov.o := n
 KCSAN_SANITIZE_kcov.o := n
 CFLAGS_kcov.o := $(call cc-option, -fno-conserve-stack) -fno-stack-protector
 
-# cond_syscall is currently not LTO compatible
-CFLAGS_sys_ni.o = $(DISABLE_LTO)
-
 obj-y += sched/
 obj-y += locking/
 obj-y += power/
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index cd4294435fef..6db5b1f55b14 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -111,7 +111,7 @@ endif
 # ---------------------------------------------------------------------------
 
 quiet_cmd_cc_s_c = CC $(quiet_modtag)  $@
-      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS), $(c_flags)) $(DISABLE_LTO) -fverbose-asm -S -o $@ $<
+      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS), $(c_flags)) -fverbose-asm -S -o $@ $<
 
 $(obj)/%.s: $(src)/%.c FORCE
 	$(call if_changed_dep,cc_s_c)
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 08/25] kbuild: add support for Clang LTO
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (6 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 07/25] treewide: remove DISABLE_LTO Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 09/25] kbuild: lto: fix module versioning Sami Tolvanen
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

This change adds build system support for Clang's Link Time
Optimization (LTO). With -flto, instead of ELF object files, Clang
produces LLVM bitcode, which is compiled into native code at link
time, allowing the final binary to be optimized globally. For more
details, see:

  https://llvm.org/docs/LinkTimeOptimization.html

The Kconfig option CONFIG_LTO_CLANG is implemented as a choice,
which defaults to LTO being disabled. To use LTO, the architecture
must select ARCH_SUPPORTS_LTO_CLANG and support:

  - compiling with Clang,
  - compiling inline assembly with Clang's integrated assembler,
  - and linking with LLD.

While using full LTO results in the best runtime performance, the
compilation is not scalable in time or memory. CONFIG_THINLTO
enables ThinLTO, which allows parallel optimization and faster
incremental builds. ThinLTO is used by default if the architecture
also selects ARCH_SUPPORTS_THINLTO:

  https://clang.llvm.org/docs/ThinLTO.html

To enable LTO, LLVM tools must be used to handle bitcode files. The
easiest way is to pass the LLVM=1 option to make:

  $ make LLVM=1 defconfig
  $ scripts/config -e LTO_CLANG
  $ make LLVM=1

Alternatively, at least the following LLVM tools must be used:

  CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm

To prepare for LTO support with other compilers, common parts are
gated behind the CONFIG_LTO option, and LTO can be disabled for
specific files by filtering out CC_FLAGS_LTO.

Note that support for DYNAMIC_FTRACE and MODVERSIONS are added in
follow-up patches.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 Makefile                          | 20 ++++++++-
 arch/Kconfig                      | 75 +++++++++++++++++++++++++++++++
 include/asm-generic/vmlinux.lds.h | 11 +++--
 scripts/Makefile.build            |  9 +++-
 scripts/Makefile.modfinal         |  9 +++-
 scripts/Makefile.modpost          | 21 ++++++++-
 scripts/link-vmlinux.sh           | 32 +++++++++----
 7 files changed, 159 insertions(+), 18 deletions(-)

diff --git a/Makefile b/Makefile
index fda1f8a0b1c7..e91347082163 100644
--- a/Makefile
+++ b/Makefile
@@ -886,6 +886,21 @@ KBUILD_CFLAGS	+= $(CC_FLAGS_SCS)
 export CC_FLAGS_SCS
 endif
 
+ifdef CONFIG_LTO_CLANG
+ifdef CONFIG_THINLTO
+CC_FLAGS_LTO	+= -flto=thin -fsplit-lto-unit
+KBUILD_LDFLAGS	+= --thinlto-cache-dir=$(extmod-prefix).thinlto-cache
+else
+CC_FLAGS_LTO	+= -flto
+endif
+CC_FLAGS_LTO	+= -fvisibility=default
+endif
+
+ifdef CONFIG_LTO
+KBUILD_CFLAGS	+= $(CC_FLAGS_LTO)
+export CC_FLAGS_LTO
+endif
+
 ifdef CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_32B
 KBUILD_CFLAGS += -falign-functions=32
 endif
@@ -1477,7 +1492,7 @@ MRPROPER_FILES += include/config include/generated          \
 		  *.spec
 
 # Directories & files removed with 'make distclean'
-DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS
+DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS .thinlto-cache
 
 # clean - Delete most, but leave enough to build external modules
 #
@@ -1714,7 +1729,8 @@ _emodinst_post: _emodinst_
 	$(call cmd,depmod)
 
 clean-dirs := $(KBUILD_EXTMOD)
-clean: rm-files := $(KBUILD_EXTMOD)/Module.symvers $(KBUILD_EXTMOD)/modules.nsdeps
+clean: rm-files := $(KBUILD_EXTMOD)/Module.symvers $(KBUILD_EXTMOD)/modules.nsdeps \
+		   $(KBUILD_EXTMOD)/.thinlto-cache
 
 PHONY += help
 help:
diff --git a/arch/Kconfig b/arch/Kconfig
index 76ec3395b843..4ac5dda6d873 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -558,6 +558,81 @@ config SHADOW_CALL_STACK
 	  reading and writing arbitrary memory may be able to locate them
 	  and hijack control flow by modifying the stacks.
 
+config LTO
+	bool
+
+config ARCH_SUPPORTS_LTO_CLANG
+	bool
+	help
+	  An architecture should select this option if it supports:
+	  - compiling with Clang,
+	  - compiling inline assembly with Clang's integrated assembler,
+	  - and linking with LLD.
+
+config ARCH_SUPPORTS_THINLTO
+	bool
+	help
+	  An architecture should select this option if it supports Clang's
+	  ThinLTO.
+
+config THINLTO
+	bool "Clang ThinLTO"
+	depends on LTO_CLANG && ARCH_SUPPORTS_THINLTO
+	default y
+	help
+	  This option enables Clang's ThinLTO, which allows for parallel
+	  optimization and faster incremental compiles. More information
+	  can be found from Clang's documentation:
+
+	    https://clang.llvm.org/docs/ThinLTO.html
+
+	  If you say N here, the compiler will use full LTO, which may
+	  produce faster code, but building the kernel will be significantly
+	  slower as the linker won't efficiently utilize multiple threads.
+
+	  If unsure, say Y.
+
+choice
+	prompt "Link Time Optimization (LTO)"
+	default LTO_NONE
+	help
+	  This option enables Link Time Optimization (LTO), which allows the
+	  compiler to optimize binaries globally.
+
+	  If unsure, select LTO_NONE. Note that LTO is very resource-intensive
+	  so it's disabled by default.
+
+config LTO_NONE
+	bool "None"
+
+config LTO_CLANG
+	bool "Clang's Link Time Optimization (EXPERIMENTAL)"
+	# Clang >= 11: https://github.com/ClangBuiltLinux/linux/issues/510
+	depends on CC_IS_CLANG && CLANG_VERSION >= 110000 && LD_IS_LLD
+	depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm)
+	depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm)
+	depends on ARCH_SUPPORTS_LTO_CLANG
+	depends on !FTRACE_MCOUNT_RECORD
+	depends on !KASAN
+	depends on !GCOV_KERNEL
+	depends on !MODVERSIONS
+	select LTO
+	help
+          This option enables Clang's Link Time Optimization (LTO), which
+          allows the compiler to optimize the kernel globally. If you enable
+          this option, the compiler generates LLVM bitcode instead of ELF
+          object files, and the actual compilation from bitcode happens at
+          the LTO link step, which may take several minutes depending on the
+          kernel configuration. More information can be found from LLVM's
+          documentation:
+
+	    https://llvm.org/docs/LinkTimeOptimization.html
+
+	  To select this option, you also need to use LLVM tools to handle
+	  the bitcode by passing LLVM=1 to make.
+
+endchoice
+
 config HAVE_ARCH_WITHIN_STACK_FRAMES
 	bool
 	help
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index e1843976754a..77e5bd069dd4 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -90,15 +90,18 @@
  * .data. We don't want to pull in .data..other sections, which Linux
  * has defined. Same for text and bss.
  *
+ * With LTO_CLANG, the linker also splits sections by default, so we need
+ * these macros to combine the sections during the final link.
+ *
  * RODATA_MAIN is not used because existing code already defines .rodata.x
  * sections to be brought in with rodata.
  */
-#ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
+#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG)
 #define TEXT_MAIN .text .text.[0-9a-zA-Z_]*
-#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..LPBX*
+#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..L* .data..compoundliteral*
 #define SDATA_MAIN .sdata .sdata.[0-9a-zA-Z_]*
-#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]*
-#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]*
+#define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]* .rodata..L*
+#define BSS_MAIN .bss .bss.[0-9a-zA-Z_]* .bss..compoundliteral*
 #define SBSS_MAIN .sbss .sbss.[0-9a-zA-Z_]*
 #else
 #define TEXT_MAIN .text
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 6db5b1f55b14..81750ef4a633 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -111,7 +111,7 @@ endif
 # ---------------------------------------------------------------------------
 
 quiet_cmd_cc_s_c = CC $(quiet_modtag)  $@
-      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS), $(c_flags)) -fverbose-asm -S -o $@ $<
+      cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS) $(CC_FLAGS_LTO), $(c_flags)) -fverbose-asm -S -o $@ $<
 
 $(obj)/%.s: $(src)/%.c FORCE
 	$(call if_changed_dep,cc_s_c)
@@ -428,8 +428,15 @@ $(obj)/lib.a: $(lib-y) FORCE
 # Do not replace $(filter %.o,^) with $(real-prereqs). When a single object
 # module is turned into a multi object module, $^ will contain header file
 # dependencies recorded in the .*.cmd file.
+ifdef CONFIG_LTO_CLANG
+quiet_cmd_link_multi-m = AR [M]  $@
+cmd_link_multi-m =						\
+	rm -f $@; 						\
+	$(AR) cDPrsT $@ $(filter %.o,$^)
+else
 quiet_cmd_link_multi-m = LD [M]  $@
       cmd_link_multi-m = $(LD) $(ld_flags) -r -o $@ $(filter %.o,$^)
+endif
 
 $(multi-used-m): FORCE
 	$(call if_changed,link_multi-m)
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index ae01baf96f4e..2cb9a1d88434 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -6,6 +6,7 @@
 PHONY := __modfinal
 __modfinal:
 
+include $(objtree)/include/config/auto.conf
 include $(srctree)/scripts/Kbuild.include
 
 # for c_flags
@@ -29,6 +30,12 @@ quiet_cmd_cc_o_c = CC [M]  $@
 
 ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
 
+ifdef CONFIG_LTO_CLANG
+# With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to
+# avoid a second slow LTO link
+prelink-ext := .lto
+endif
+
 quiet_cmd_ld_ko_o = LD [M]  $@
       cmd_ld_ko_o =                                                     \
 	$(LD) -r $(KBUILD_LDFLAGS)					\
@@ -36,7 +43,7 @@ quiet_cmd_ld_ko_o = LD [M]  $@
 		-T scripts/module.lds -o $@ $(filter %.o, $^);		\
 	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
-$(modules): %.ko: %.o %.mod.o scripts/module.lds FORCE
+$(modules): %.ko: %$(prelink-ext).o %.mod.o scripts/module.lds FORCE
 	+$(call if_changed,ld_ko_o)
 
 targets += $(modules) $(modules:.ko=.mod.o)
diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
index f54b6ac37ac2..9ff8bfdb574d 100644
--- a/scripts/Makefile.modpost
+++ b/scripts/Makefile.modpost
@@ -43,6 +43,9 @@ __modpost:
 include include/config/auto.conf
 include scripts/Kbuild.include
 
+# for ld_flags
+include scripts/Makefile.lib
+
 MODPOST = scripts/mod/modpost								\
 	$(if $(CONFIG_MODVERSIONS),-m)							\
 	$(if $(CONFIG_MODULE_SRCVERSION_ALL),-a)					\
@@ -102,12 +105,26 @@ $(input-symdump):
 	@echo >&2 'WARNING: Symbol version dump "$@" is missing.'
 	@echo >&2 '         Modules may not have dependencies or modversions.'
 
+ifdef CONFIG_LTO_CLANG
+# With CONFIG_LTO_CLANG, .o files might be LLVM bitcode, so we need to run
+# LTO to compile them into native code before running modpost
+prelink-ext := .lto
+
+quiet_cmd_cc_lto_link_modules = LTO [M] $@
+cmd_cc_lto_link_modules = $(LD) $(ld_flags) -r -o $@ --whole-archive $^
+
+%.lto.o: %.o
+	$(call if_changed,cc_lto_link_modules)
+endif
+
+modules := $(sort $(shell cat $(MODORDER)))
+
 # Read out modules.order to pass in modpost.
 # Otherwise, allmodconfig would fail with "Argument list too long".
 quiet_cmd_modpost = MODPOST $@
-      cmd_modpost = sed 's/ko$$/o/' $< | $(MODPOST) -T -
+      cmd_modpost = sed 's/\.ko$$/$(prelink-ext)\.o/' $< | $(MODPOST) -T -
 
-$(output-symdump): $(MODORDER) $(input-symdump) FORCE
+$(output-symdump): $(MODORDER) $(input-symdump) $(modules:.ko=$(prelink-ext).o) FORCE
 	$(call if_changed,modpost)
 
 targets += $(output-symdump)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 372c3719f94c..ebb9f912aab6 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -56,6 +56,14 @@ modpost_link()
 		${KBUILD_VMLINUX_LIBS}				\
 		--end-group"
 
+	if [ -n "${CONFIG_LTO_CLANG}" ]; then
+		# This might take a while, so indicate that we're doing
+		# an LTO link
+		info LTO ${1}
+	else
+		info LD ${1}
+	fi
+
 	${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${objects}
 }
 
@@ -103,13 +111,22 @@ vmlinux_link()
 	fi
 
 	if [ "${SRCARCH}" != "um" ]; then
-		objects="--whole-archive			\
-			${KBUILD_VMLINUX_OBJS}			\
-			--no-whole-archive			\
-			--start-group				\
-			${KBUILD_VMLINUX_LIBS}			\
-			--end-group				\
-			${@}"
+		if [ -n "${CONFIG_LTO_CLANG}" ]; then
+			# Use vmlinux.o instead of performing the slow LTO
+			# link again.
+			objects="--whole-archive		\
+				vmlinux.o 			\
+				--no-whole-archive		\
+				${@}"
+		else
+			objects="--whole-archive		\
+				${KBUILD_VMLINUX_OBJS}		\
+				--no-whole-archive		\
+				--start-group			\
+				${KBUILD_VMLINUX_LIBS}		\
+				--end-group			\
+				${@}"
+		fi
 
 		${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux}	\
 			${strip_debug#-Wl,}			\
@@ -274,7 +291,6 @@ fi;
 ${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init need-builtin=1
 
 #link vmlinux.o
-info LD vmlinux.o
 modpost_link vmlinux.o
 objtool_link vmlinux.o
 
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 09/25] kbuild: lto: fix module versioning
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (7 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 08/25] kbuild: add support for Clang LTO Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 10/25] objtool: Split noinstr validation from --vmlinux Sami Tolvanen
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

With CONFIG_MODVERSIONS, version information is linked into each
compilation unit that exports symbols. With LTO, we cannot use this
method as all C code is compiled into LLVM bitcode instead. This
change collects symbol versions into .symversions files and merges
them in link-vmlinux.sh where they are all linked into vmlinux.o at
the same time.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 .gitignore               |  1 +
 Makefile                 |  3 ++-
 arch/Kconfig             |  1 -
 scripts/Makefile.build   | 33 +++++++++++++++++++++++++++++++--
 scripts/Makefile.modpost |  6 +++++-
 scripts/link-vmlinux.sh  | 23 ++++++++++++++++++++++-
 6 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/.gitignore b/.gitignore
index 162bd2b67bdf..06e76dc39ffe 100644
--- a/.gitignore
+++ b/.gitignore
@@ -41,6 +41,7 @@
 *.so.dbg
 *.su
 *.symtypes
+*.symversions
 *.tab.[ch]
 *.tar
 *.xz
diff --git a/Makefile b/Makefile
index e91347082163..91cd6caefa6e 100644
--- a/Makefile
+++ b/Makefile
@@ -1827,7 +1827,8 @@ clean: $(clean-dirs)
 		-o -name '.tmp_*.o.*' \
 		-o -name '*.c.[012]*.*' \
 		-o -name '*.ll' \
-		-o -name '*.gcno' \) -type f -print | xargs rm -f
+		-o -name '*.gcno' \
+		-o -name '*.*.symversions' \) -type f -print | xargs rm -f
 
 # Generate tags for editors
 # ---------------------------------------------------------------------------
diff --git a/arch/Kconfig b/arch/Kconfig
index 4ac5dda6d873..caeb6feb517e 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -615,7 +615,6 @@ config LTO_CLANG
 	depends on !FTRACE_MCOUNT_RECORD
 	depends on !KASAN
 	depends on !GCOV_KERNEL
-	depends on !MODVERSIONS
 	select LTO
 	help
           This option enables Clang's Link Time Optimization (LTO), which
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 81750ef4a633..e08e66413dbe 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -166,6 +166,15 @@ ifdef CONFIG_MODVERSIONS
 #   the actual value of the checksum generated by genksyms
 # o remove .tmp_<file>.o to <file>.o
 
+ifdef CONFIG_LTO_CLANG
+# Generate .o.symversions files for each .o with exported symbols, and link these
+# to the kernel and/or modules at the end.
+cmd_modversions_c =								\
+	if $(NM) $@ 2>/dev/null | grep -q __ksymtab; then			\
+		$(call cmd_gensymtypes_c,$(KBUILD_SYMTYPES),$(@:.o=.symtypes))	\
+		    > $@.symversions;						\
+	fi;
+else
 cmd_modversions_c =								\
 	if $(OBJDUMP) -h $@ | grep -q __ksymtab; then				\
 		$(call cmd_gensymtypes_c,$(KBUILD_SYMTYPES),$(@:.o=.symtypes))	\
@@ -177,6 +186,7 @@ cmd_modversions_c =								\
 		rm -f $(@D)/.tmp_$(@F:.o=.ver);					\
 	fi
 endif
+endif
 
 ifdef CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT
 # compiler will not generate __mcount_loc use recordmcount or recordmcount.pl
@@ -393,6 +403,18 @@ $(obj)/%.asn1.c $(obj)/%.asn1.h: $(src)/%.asn1 $(objtree)/scripts/asn1_compiler
 $(subdir-builtin): $(obj)/%/built-in.a: $(obj)/% ;
 $(subdir-modorder): $(obj)/%/modules.order: $(obj)/% ;
 
+# combine symversions for later processing
+quiet_cmd_update_lto_symversions = SYMVER  $@
+ifeq ($(CONFIG_LTO_CLANG) $(CONFIG_MODVERSIONS),y y)
+      cmd_update_lto_symversions =					\
+	rm -f $@.symversions						\
+	$(foreach n, $(filter-out FORCE,$^),				\
+		$(if $(wildcard $(n).symversions),			\
+			; cat $(n).symversions >> $@.symversions))
+else
+      cmd_update_lto_symversions = echo >/dev/null
+endif
+
 #
 # Rule to compile a set of .o files into one .a file (without symbol table)
 #
@@ -400,8 +422,11 @@ $(subdir-modorder): $(obj)/%/modules.order: $(obj)/% ;
 quiet_cmd_ar_builtin = AR      $@
       cmd_ar_builtin = rm -f $@; $(AR) cDPrST $@ $(real-prereqs)
 
+quiet_cmd_ar_and_symver = AR      $@
+      cmd_ar_and_symver = $(cmd_update_lto_symversions); $(cmd_ar_builtin)
+
 $(obj)/built-in.a: $(real-obj-y) FORCE
-	$(call if_changed,ar_builtin)
+	$(call if_changed,ar_and_symver)
 
 #
 # Rule to create modules.order file
@@ -421,8 +446,11 @@ $(obj)/modules.order: $(obj-m) FORCE
 #
 # Rule to compile a set of .o files into one .a file (with symbol table)
 #
+quiet_cmd_ar_lib = AR      $@
+      cmd_ar_lib = $(cmd_update_lto_symversions); $(cmd_ar)
+
 $(obj)/lib.a: $(lib-y) FORCE
-	$(call if_changed,ar)
+	$(call if_changed,ar_lib)
 
 # NOTE:
 # Do not replace $(filter %.o,^) with $(real-prereqs). When a single object
@@ -431,6 +459,7 @@ $(obj)/lib.a: $(lib-y) FORCE
 ifdef CONFIG_LTO_CLANG
 quiet_cmd_link_multi-m = AR [M]  $@
 cmd_link_multi-m =						\
+	$(cmd_update_lto_symversions);				\
 	rm -f $@; 						\
 	$(AR) cDPrsT $@ $(filter %.o,$^)
 else
diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
index 9ff8bfdb574d..066beffca09a 100644
--- a/scripts/Makefile.modpost
+++ b/scripts/Makefile.modpost
@@ -111,7 +111,11 @@ ifdef CONFIG_LTO_CLANG
 prelink-ext := .lto
 
 quiet_cmd_cc_lto_link_modules = LTO [M] $@
-cmd_cc_lto_link_modules = $(LD) $(ld_flags) -r -o $@ --whole-archive $^
+cmd_cc_lto_link_modules =						\
+	$(LD) $(ld_flags) -r -o $@					\
+		$(shell [ -s $(@:.lto.o=.o.symversions) ] &&		\
+			echo -T $(@:.lto.o=.o.symversions))		\
+		--whole-archive $^
 
 %.lto.o: %.o
 	$(call if_changed,cc_lto_link_modules)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index ebb9f912aab6..1a48ef525f46 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -43,11 +43,26 @@ info()
 	fi
 }
 
+# If CONFIG_LTO_CLANG is selected, collect generated symbol versions into
+# .tmp_symversions.lds
+gen_symversions()
+{
+	info GEN .tmp_symversions.lds
+	rm -f .tmp_symversions.lds
+
+	for o in ${KBUILD_VMLINUX_OBJS} ${KBUILD_VMLINUX_LIBS}; do
+		if [ -f ${o}.symversions ]; then
+			cat ${o}.symversions >> .tmp_symversions.lds
+		fi
+	done
+}
+
 # Link of vmlinux.o used for section mismatch analysis
 # ${1} output file
 modpost_link()
 {
 	local objects
+	local lds=""
 
 	objects="--whole-archive				\
 		${KBUILD_VMLINUX_OBJS}				\
@@ -57,6 +72,11 @@ modpost_link()
 		--end-group"
 
 	if [ -n "${CONFIG_LTO_CLANG}" ]; then
+		if [ -n "${CONFIG_MODVERSIONS}" ]; then
+			gen_symversions
+			lds="${lds} -T .tmp_symversions.lds"
+		fi
+
 		# This might take a while, so indicate that we're doing
 		# an LTO link
 		info LTO ${1}
@@ -64,7 +84,7 @@ modpost_link()
 		info LD ${1}
 	fi
 
-	${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${objects}
+	${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${lds} ${objects}
 }
 
 objtool_link()
@@ -242,6 +262,7 @@ cleanup()
 {
 	rm -f .btf.*
 	rm -f .tmp_System.map
+	rm -f .tmp_symversions.lds
 	rm -f .tmp_vmlinux*
 	rm -f System.map
 	rm -f vmlinux
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 10/25] objtool: Split noinstr validation from --vmlinux
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (8 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 09/25] kbuild: lto: fix module versioning Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 11/25] kbuild: lto: postpone objtool Sami Tolvanen
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

This change adds a --noinstr flag to objtool to allow us to specify
that we're processing vmlinux.o without also enabling noinstr
validation. This is needed to avoid false positives with LTO when we
run objtool on vmlinux.o without CONFIG_DEBUG_ENTRY.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 scripts/link-vmlinux.sh       | 2 +-
 tools/objtool/builtin-check.c | 3 ++-
 tools/objtool/builtin.h       | 2 +-
 tools/objtool/check.c         | 2 +-
 4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 1a48ef525f46..5ace1dc43993 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -92,7 +92,7 @@ objtool_link()
 	local objtoolopt;
 
 	if [ -n "${CONFIG_VMLINUX_VALIDATION}" ]; then
-		objtoolopt="check --vmlinux"
+		objtoolopt="check --vmlinux --noinstr"
 		if [ -z "${CONFIG_FRAME_POINTER}" ]; then
 			objtoolopt="${objtoolopt} --no-fp"
 		fi
diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index facfc10bc5dc..b84cdc72b51f 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -18,7 +18,7 @@
 #include "builtin.h"
 #include "objtool.h"
 
-bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount;
+bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount, noinstr;
 
 static const char * const check_usage[] = {
 	"objtool check [<options>] file.o",
@@ -34,6 +34,7 @@ const struct option check_options[] = {
 	OPT_BOOLEAN('a', "uaccess", &uaccess, "enable uaccess checking"),
 	OPT_BOOLEAN('s', "stats", &stats, "print statistics"),
 	OPT_BOOLEAN('d', "duplicate", &validate_dup, "duplicate validation for vmlinux.o"),
+	OPT_BOOLEAN('n', "noinstr", &noinstr, "noinstr validation for vmlinux.o"),
 	OPT_BOOLEAN('l', "vmlinux", &vmlinux, "vmlinux.o validation"),
 	OPT_BOOLEAN('M', "mcount", &mcount, "generate __mcount_loc"),
 	OPT_END(),
diff --git a/tools/objtool/builtin.h b/tools/objtool/builtin.h
index 94565a72b701..2502bb27de17 100644
--- a/tools/objtool/builtin.h
+++ b/tools/objtool/builtin.h
@@ -8,7 +8,7 @@
 #include <subcmd/parse-options.h>
 
 extern const struct option check_options[];
-extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount;
+extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount, noinstr;
 
 extern int cmd_check(int argc, const char **argv);
 extern int cmd_orc(int argc, const char **argv);
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index c39c4d2b432a..c216dd4d662c 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -244,7 +244,7 @@ static void init_insn_state(struct insn_state *state, struct section *sec)
 	 * not correctly determine insn->call_dest->sec (external symbols do
 	 * not have a section).
 	 */
-	if (vmlinux && sec)
+	if (vmlinux && noinstr && sec)
 		state->noinstr = sec->noinstr;
 }
 
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 11/25] kbuild: lto: postpone objtool
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (9 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 10/25] objtool: Split noinstr validation from --vmlinux Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 12/25] kbuild: lto: limit inlining Sami Tolvanen
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

With LTO, LLVM bitcode won't be compiled into native code until
modpost_link, or modfinal for modules. This change postpones calls
to objtool until after these steps, and moves objtool_args to
Makefile.lib, so the arguments can be reused in Makefile.modfinal.

As we didn't have objects to process earlier, we use --duplicate
when processing vmlinux.o. This change also disables unreachable
instruction warnings with LTO to avoid warnings about the int3
padding between functions.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/Kconfig              |  2 +-
 scripts/Makefile.build    | 22 ++--------------------
 scripts/Makefile.lib      | 11 +++++++++++
 scripts/Makefile.modfinal | 19 ++++++++++++++++---
 scripts/link-vmlinux.sh   | 28 +++++++++++++++++++++++++---
 5 files changed, 55 insertions(+), 27 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index caeb6feb517e..74cbd6e3b116 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -612,7 +612,7 @@ config LTO_CLANG
 	depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm)
 	depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm)
 	depends on ARCH_SUPPORTS_LTO_CLANG
-	depends on !FTRACE_MCOUNT_RECORD
+	depends on !FTRACE_MCOUNT_USE_RECORDMCOUNT
 	depends on !KASAN
 	depends on !GCOV_KERNEL
 	select LTO
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index e08e66413dbe..ab0ddf4884fd 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -218,30 +218,11 @@ cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)),
 endif # CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT
 
 ifdef CONFIG_STACK_VALIDATION
+ifndef CONFIG_LTO_CLANG
 ifneq ($(SKIP_STACK_VALIDATION),1)
 
 __objtool_obj := $(objtree)/tools/objtool/objtool
 
-objtool_args = $(if $(CONFIG_UNWINDER_ORC),orc generate,check)
-
-objtool_args += $(if $(part-of-module), --module,)
-
-ifndef CONFIG_FRAME_POINTER
-objtool_args += --no-fp
-endif
-ifdef CONFIG_GCOV_KERNEL
-objtool_args += --no-unreachable
-endif
-ifdef CONFIG_RETPOLINE
-  objtool_args += --retpoline
-endif
-ifdef CONFIG_X86_SMAP
-  objtool_args += --uaccess
-endif
-ifdef CONFIG_FTRACE_MCOUNT_USE_OBJTOOL
-  objtool_args += --mcount
-endif
-
 # 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory
 # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file
 # 'OBJECT_FILES_NON_STANDARD_foo.o := 'n': override directory skip for a file
@@ -253,6 +234,7 @@ objtool_obj = $(if $(patsubst y%,, \
 	$(__objtool_obj))
 
 endif # SKIP_STACK_VALIDATION
+endif # CONFIG_LTO_CLANG
 endif # CONFIG_STACK_VALIDATION
 
 # Rebuild all objects when objtool changes, or is enabled/disabled.
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 3d599716940c..ecb97c9f5feb 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -216,6 +216,17 @@ dtc_cpp_flags  = -Wp,-MMD,$(depfile).pre.tmp -nostdinc                    \
 		 $(addprefix -I,$(DTC_INCLUDE))                          \
 		 -undef -D__DTS__
 
+# Objtool arguments are also needed for modfinal with LTO, so we define
+# then here to avoid duplication.
+objtool_args =								\
+	$(if $(CONFIG_UNWINDER_ORC),orc generate,check)			\
+	$(if $(part-of-module), --module,)				\
+	$(if $(CONFIG_FRAME_POINTER),, --no-fp)				\
+	$(if $(CONFIG_GCOV_KERNEL), --no-unreachable,)			\
+	$(if $(CONFIG_RETPOLINE), --retpoline,)				\
+	$(if $(CONFIG_X86_SMAP), --uaccess,)				\
+	$(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount,)
+
 # Useful for describing the dependency of composite objects
 # Usage:
 #   $(call multi_depend, multi_used_targets, suffix_to_remove, suffix_to_add)
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index 2cb9a1d88434..1bd2953b11c4 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -9,7 +9,7 @@ __modfinal:
 include $(objtree)/include/config/auto.conf
 include $(srctree)/scripts/Kbuild.include
 
-# for c_flags
+# for c_flags and objtool_args
 include $(srctree)/scripts/Makefile.lib
 
 # find all modules listed in modules.order
@@ -34,10 +34,23 @@ ifdef CONFIG_LTO_CLANG
 # With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to
 # avoid a second slow LTO link
 prelink-ext := .lto
-endif
+
+# ELF processing was skipped earlier because we didn't have native code,
+# so let's now process the prelinked binary before we link the module.
+
+ifdef CONFIG_STACK_VALIDATION
+ifneq ($(SKIP_STACK_VALIDATION),1)
+cmd_ld_ko_o +=								\
+	$(objtree)/tools/objtool/objtool $(objtool_args)		\
+		$(@:.ko=$(prelink-ext).o);
+
+endif # SKIP_STACK_VALIDATION
+endif # CONFIG_STACK_VALIDATION
+
+endif # CONFIG_LTO_CLANG
 
 quiet_cmd_ld_ko_o = LD [M]  $@
-      cmd_ld_ko_o =                                                     \
+      cmd_ld_ko_o +=							\
 	$(LD) -r $(KBUILD_LDFLAGS)					\
 		$(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE)		\
 		-T scripts/module.lds -o $@ $(filter %.o, $^);		\
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 5ace1dc43993..7f4d19271180 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -89,14 +89,36 @@ modpost_link()
 
 objtool_link()
 {
+	local objtoolcmd;
 	local objtoolopt;
 
+	if [ "${CONFIG_LTO_CLANG} ${CONFIG_STACK_VALIDATION}" = "y y" ]; then
+		# Don't perform vmlinux validation unless explicitly requested,
+		# but run objtool on vmlinux.o now that we have an object file.
+		if [ -n "${CONFIG_UNWINDER_ORC}" ]; then
+			objtoolcmd="orc generate"
+		fi
+
+		objtoolopt="${objtoolopt} --duplicate"
+
+		if [ -n "${CONFIG_FTRACE_MCOUNT_USE_OBJTOOL}" ]; then
+			objtoolopt="${objtoolopt} --mcount"
+		fi
+	fi
+
 	if [ -n "${CONFIG_VMLINUX_VALIDATION}" ]; then
-		objtoolopt="check --vmlinux --noinstr"
+		objtoolopt="${objtoolopt} --noinstr"
+	fi
+
+	if [ -n "${objtoolopt}" ]; then
+		if [ -z "${objtoolcmd}" ]; then
+			objtoolcmd="check"
+		fi
+		objtoolopt="${objtoolopt} --vmlinux"
 		if [ -z "${CONFIG_FRAME_POINTER}" ]; then
 			objtoolopt="${objtoolopt} --no-fp"
 		fi
-		if [ -n "${CONFIG_GCOV_KERNEL}" ]; then
+		if [ -n "${CONFIG_GCOV_KERNEL}" ] || [ -n "${CONFIG_LTO_CLANG}" ]; then
 			objtoolopt="${objtoolopt} --no-unreachable"
 		fi
 		if [ -n "${CONFIG_RETPOLINE}" ]; then
@@ -106,7 +128,7 @@ objtool_link()
 			objtoolopt="${objtoolopt} --uaccess"
 		fi
 		info OBJTOOL ${1}
-		tools/objtool/objtool ${objtoolopt} ${1}
+		tools/objtool/objtool ${objtoolcmd} ${objtoolopt} ${1}
 	fi
 }
 
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 12/25] kbuild: lto: limit inlining
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (10 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 11/25] kbuild: lto: postpone objtool Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 13/25] kbuild: lto: merge module sections Sami Tolvanen
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

This change limits function inlining across translation unit boundaries
in order to reduce the binary size with LTO. The -import-instr-limit
flag defines a size limit, as the number of LLVM IR instructions, for
importing functions from other TUs, defaulting to 100.

Based on testing with arm64 defconfig, we found that a limit of 5 is a
reasonable compromise between performance and binary size, reducing the
size of a stripped vmlinux by 11%.

Suggested-by: George Burgess IV <gbiv@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Makefile b/Makefile
index 91cd6caefa6e..41b01248d400 100644
--- a/Makefile
+++ b/Makefile
@@ -894,6 +894,9 @@ else
 CC_FLAGS_LTO	+= -flto
 endif
 CC_FLAGS_LTO	+= -fvisibility=default
+
+# Limit inlining across translation units to reduce binary size
+KBUILD_LDFLAGS += -mllvm -import-instr-limit=5
 endif
 
 ifdef CONFIG_LTO
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 13/25] kbuild: lto: merge module sections
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (11 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 12/25] kbuild: lto: limit inlining Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-14 22:49   ` Kees Cook
  2020-10-13  0:31 ` [PATCH v6 14/25] kbuild: lto: remove duplicate dependencies from .mod files Sami Tolvanen
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

LLD always splits sections with LTO, which increases module sizes. This
change adds linker script rules to merge the split sections in the final
module.

Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 scripts/module.lds.S | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/scripts/module.lds.S b/scripts/module.lds.S
index 69b9b71a6a47..037120173a22 100644
--- a/scripts/module.lds.S
+++ b/scripts/module.lds.S
@@ -25,5 +25,33 @@ SECTIONS {
 	__jump_table		0 : ALIGN(8) { KEEP(*(__jump_table)) }
 }
 
+#ifdef CONFIG_LTO_CLANG
+/*
+ * With CONFIG_LTO_CLANG, LLD always enables -fdata-sections and
+ * -ffunction-sections, which increases the size of the final module.
+ * Merge the split sections in the final binary.
+ */
+SECTIONS {
+	__patchable_function_entries : { *(__patchable_function_entries) }
+
+	.bss : {
+		*(.bss .bss.[0-9a-zA-Z_]*)
+		*(.bss..L*)
+	}
+
+	.data : {
+		*(.data .data.[0-9a-zA-Z_]*)
+		*(.data..L*)
+	}
+
+	.rodata : {
+		*(.rodata .rodata.[0-9a-zA-Z_]*)
+		*(.rodata..L*)
+	}
+
+	.text : { *(.text .text.[0-9a-zA-Z_]*) }
+}
+#endif
+
 /* bring in arch-specific sections */
 #include <asm/module.lds.h>
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 14/25] kbuild: lto: remove duplicate dependencies from .mod files
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (12 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 13/25] kbuild: lto: merge module sections Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-14 22:50   ` Kees Cook
  2020-10-13  0:31 ` [PATCH v6 15/25] init: lto: ensure initcall ordering Sami Tolvanen
                   ` (10 subsequent siblings)
  24 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

With LTO, llvm-nm prints out symbols for each archive member
separately, which results in a lot of duplicate dependencies in the
.mod file when CONFIG_TRIM_UNUSED_SYMS is enabled. When a module
consists of several compilation units, the output can exceed the
default xargs command size limit and split the dependency list to
multiple lines, which results in used symbols getting trimmed.

This change removes duplicate dependencies, which will reduce the
probability of this happening and makes .mod files smaller and
easier to read.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 scripts/Makefile.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index ab0ddf4884fd..96d6c9e18901 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -266,7 +266,7 @@ endef
 
 # List module undefined symbols (or empty line if not enabled)
 ifdef CONFIG_TRIM_UNUSED_KSYMS
-cmd_undef_syms = $(NM) $< | sed -n 's/^  *U //p' | xargs echo
+cmd_undef_syms = $(NM) $< | sed -n 's/^  *U //p' | sort -u | xargs echo
 else
 cmd_undef_syms = echo
 endif
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 15/25] init: lto: ensure initcall ordering
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (13 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 14/25] kbuild: lto: remove duplicate dependencies from .mod files Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 16/25] init: lto: fix PREL32 relocations Sami Tolvanen
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

With LTO, the compiler doesn't necessarily obey the link order for
initcalls, and initcall variables need globally unique names to avoid
collisions at link time.

This change exports __KBUILD_MODNAME and adds the initcall_id() macro,
which uses it together with __COUNTER__ and __LINE__ to help ensure
these variables have unique names, and moves each variable to its own
section when LTO is enabled, so the correct order can be specified using
a linker script.

The generate_initcall_ordering.pl script uses nm to find initcalls from
the object files passed to the linker, and generates a linker script
that specifies the same order for initcalls that we would have without
LTO. With LTO enabled, the script is called in link-vmlinux.sh through
jobserver-exec to limit the number of jobs spawned.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 include/linux/init.h               |  52 +++++-
 scripts/Makefile.lib               |   6 +-
 scripts/generate_initcall_order.pl | 270 +++++++++++++++++++++++++++++
 scripts/link-vmlinux.sh            |  15 ++
 4 files changed, 334 insertions(+), 9 deletions(-)
 create mode 100755 scripts/generate_initcall_order.pl

diff --git a/include/linux/init.h b/include/linux/init.h
index 212fc9e2f691..af638cd6dd52 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -184,19 +184,57 @@ extern bool initcall_debug;
  * as KEEP() in the linker script.
  */
 
+/* Format: <modname>__<counter>_<line>_<fn> */
+#define __initcall_id(fn)					\
+	__PASTE(__KBUILD_MODNAME,				\
+	__PASTE(__,						\
+	__PASTE(__COUNTER__,					\
+	__PASTE(_,						\
+	__PASTE(__LINE__,					\
+	__PASTE(_, fn))))))
+
+/* Format: __<prefix>__<iid><id> */
+#define __initcall_name(prefix, __iid, id)			\
+	__PASTE(__,						\
+	__PASTE(prefix,						\
+	__PASTE(__,						\
+	__PASTE(__iid, id))))
+
+#ifdef CONFIG_LTO_CLANG
+/*
+ * With LTO, the compiler doesn't necessarily obey link order for
+ * initcalls. In order to preserve the correct order, we add each
+ * variable into its own section and generate a linker script (in
+ * scripts/link-vmlinux.sh) to specify the order of the sections.
+ */
+#define __initcall_section(__sec, __iid)			\
+	#__sec ".init.." #__iid
+#else
+#define __initcall_section(__sec, __iid)			\
+	#__sec ".init"
+#endif
+
 #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
-#define ___define_initcall(fn, id, __sec)			\
+#define ____define_initcall(fn, __name, __sec)			\
 	__ADDRESSABLE(fn)					\
-	asm(".section	\"" #__sec ".init\", \"a\"	\n"	\
-	"__initcall_" #fn #id ":			\n"	\
+	asm(".section	\"" __sec "\", \"a\"		\n"	\
+	    __stringify(__name) ":			\n"	\
 	    ".long	" #fn " - .			\n"	\
 	    ".previous					\n");
 #else
-#define ___define_initcall(fn, id, __sec) \
-	static initcall_t __initcall_##fn##id __used \
-		__attribute__((__section__(#__sec ".init"))) = fn;
+#define ____define_initcall(fn, __name, __sec)			\
+	static initcall_t __name __used 			\
+		__attribute__((__section__(__sec))) = fn;
 #endif
 
+#define __unique_initcall(fn, id, __sec, __iid)			\
+	____define_initcall(fn,					\
+		__initcall_name(initcall, __iid, id),		\
+		__initcall_section(__sec, __iid))
+
+#define ___define_initcall(fn, id, __sec)			\
+	__unique_initcall(fn, id, __sec, __initcall_id(fn))
+
 #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id)
 
 /*
@@ -236,7 +274,7 @@ extern bool initcall_debug;
 #define __exitcall(fn)						\
 	static exitcall_t __exitcall_##fn __exit_call = fn
 
-#define console_initcall(fn)	___define_initcall(fn,, .con_initcall)
+#define console_initcall(fn)	___define_initcall(fn, con, .con_initcall)
 
 struct obs_kernel_param {
 	const char *str;
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index ecb97c9f5feb..1142de6a4161 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -117,9 +117,11 @@ target-stem = $(basename $(patsubst $(obj)/%,%,$@))
 # These flags are needed for modversions and compiling, so we define them here
 # $(modname_flags) defines KBUILD_MODNAME as the name of the module it will
 # end up in (or would, if it gets compiled in)
-name-fix = $(call stringify,$(subst $(comma),_,$(subst -,_,$1)))
+name-fix-token = $(subst $(comma),_,$(subst -,_,$1))
+name-fix = $(call stringify,$(call name-fix-token,$1))
 basename_flags = -DKBUILD_BASENAME=$(call name-fix,$(basetarget))
-modname_flags  = -DKBUILD_MODNAME=$(call name-fix,$(modname))
+modname_flags  = -DKBUILD_MODNAME=$(call name-fix,$(modname)) \
+		 -D__KBUILD_MODNAME=kmod_$(call name-fix-token,$(modname))
 modfile_flags  = -DKBUILD_MODFILE=$(call stringify,$(modfile))
 
 _c_flags       = $(filter-out $(CFLAGS_REMOVE_$(target-stem).o), \
diff --git a/scripts/generate_initcall_order.pl b/scripts/generate_initcall_order.pl
new file mode 100755
index 000000000000..1a88d3f1b913
--- /dev/null
+++ b/scripts/generate_initcall_order.pl
@@ -0,0 +1,270 @@
+#!/usr/bin/env perl
+# SPDX-License-Identifier: GPL-2.0
+#
+# Generates a linker script that specifies the correct initcall order.
+#
+# Copyright (C) 2019 Google LLC
+
+use strict;
+use warnings;
+use IO::Handle;
+use IO::Select;
+use POSIX ":sys_wait_h";
+
+my $nm = $ENV{'NM'} || die "$0: ERROR: NM not set?";
+my $objtree = $ENV{'objtree'} || '.';
+
+## currently active child processes
+my $jobs = {};		# child process pid -> file handle
+## results from child processes
+my $results = {};	# object index -> [ { level, secname }, ... ]
+
+## reads _NPROCESSORS_ONLN to determine the maximum number of processes to
+## start
+sub get_online_processors {
+	open(my $fh, "getconf _NPROCESSORS_ONLN 2>/dev/null |")
+		or die "$0: ERROR: failed to execute getconf: $!";
+	my $procs = <$fh>;
+	close($fh);
+
+	if (!($procs =~ /^\d+$/)) {
+		return 1;
+	}
+
+	return int($procs);
+}
+
+## writes results to the parent process
+## format: <file index> <initcall level> <base initcall section name>
+sub write_results {
+	my ($index, $initcalls) = @_;
+
+	# sort by the counter value to ensure the order of initcalls within
+	# each object file is correct
+	foreach my $counter (sort { $a <=> $b } keys(%{$initcalls})) {
+		my $level = $initcalls->{$counter}->{'level'};
+
+		# section name for the initcall function
+		my $secname = $initcalls->{$counter}->{'module'} . '__' .
+			      $counter . '_' .
+			      $initcalls->{$counter}->{'line'} . '_' .
+			      $initcalls->{$counter}->{'function'};
+
+		print "$index $level $secname\n";
+	}
+}
+
+## reads a result line from a child process and adds it to the $results array
+sub read_results{
+	my ($fh) = @_;
+
+	# each child prints out a full line w/ autoflush and exits after the
+	# last line, so even if buffered I/O blocks here, it shouldn't block
+	# very long
+	my $data = <$fh>;
+
+	if (!defined($data)) {
+		return 0;
+	}
+
+	chomp($data);
+
+	my ($index, $level, $secname) = $data =~
+		/^(\d+)\ ([^\ ]+)\ (.*)$/;
+
+	if (!defined($index) ||
+		!defined($level) ||
+		!defined($secname)) {
+		die "$0: ERROR: child process returned invalid data: $data\n";
+	}
+
+	$index = int($index);
+
+	if (!exists($results->{$index})) {
+		$results->{$index} = [];
+	}
+
+	push (@{$results->{$index}}, {
+		'level'   => $level,
+		'secname' => $secname
+	});
+
+	return 1;
+}
+
+## finds initcalls from an object file or all object files in an archive, and
+## writes results back to the parent process
+sub find_initcalls {
+	my ($index, $file) = @_;
+
+	die "$0: ERROR: file $file doesn't exist?" if (! -f $file);
+
+	open(my $fh, "\"$nm\" --defined-only \"$file\" 2>/dev/null |")
+		or die "$0: ERROR: failed to execute \"$nm\": $!";
+
+	my $initcalls = {};
+
+	while (<$fh>) {
+		chomp;
+
+		# check for the start of a new object file (if processing an
+		# archive)
+		my ($path)= $_ =~ /^(.+)\:$/;
+
+		if (defined($path)) {
+			write_results($index, $initcalls);
+			$initcalls = {};
+			next;
+		}
+
+		# look for an initcall
+		my ($module, $counter, $line, $symbol) = $_ =~
+			/[a-z]\s+__initcall__(\S*)__(\d+)_(\d+)_(.*)$/;
+
+		if (!defined($module)) {
+			$module = ''
+		}
+
+		if (!defined($counter) ||
+			!defined($line) ||
+			!defined($symbol)) {
+			next;
+		}
+
+		# parse initcall level
+		my ($function, $level) = $symbol =~
+			/^(.*)((early|rootfs|con|[0-9])s?)$/;
+
+		die "$0: ERROR: invalid initcall name $symbol in $file($path)"
+			if (!defined($function) || !defined($level));
+
+		$initcalls->{$counter} = {
+			'module'   => $module,
+			'line'     => $line,
+			'function' => $function,
+			'level'    => $level,
+		};
+	}
+
+	close($fh);
+	write_results($index, $initcalls);
+}
+
+## waits for any child process to complete, reads the results, and adds them to
+## the $results array for later processing
+sub wait_for_results {
+	my ($select) = @_;
+
+	my $pid = 0;
+	do {
+		# unblock children that may have a full write buffer
+		foreach my $fh ($select->can_read(0)) {
+			read_results($fh);
+		}
+
+		# check for children that have exited, read the remaining data
+		# from them, and clean up
+		$pid = waitpid(-1, WNOHANG);
+		if ($pid > 0) {
+			if (!exists($jobs->{$pid})) {
+				next;
+			}
+
+			my $fh = $jobs->{$pid};
+			$select->remove($fh);
+
+			while (read_results($fh)) {
+				# until eof
+			}
+
+			close($fh);
+			delete($jobs->{$pid});
+		}
+	} while ($pid > 0);
+}
+
+## forks a child to process each file passed in the command line and collects
+## the results
+sub process_files {
+	my $index = 0;
+	my $njobs = $ENV{'PARALLELISM'} || get_online_processors();
+	my $select = IO::Select->new();
+
+	while (my $file = shift(@ARGV)) {
+		# fork a child process and read it's stdout
+		my $pid = open(my $fh, '-|');
+
+		if (!defined($pid)) {
+			die "$0: ERROR: failed to fork: $!";
+		} elsif ($pid) {
+			# save the child process pid and the file handle
+			$select->add($fh);
+			$jobs->{$pid} = $fh;
+		} else {
+			# in the child process
+			STDOUT->autoflush(1);
+			find_initcalls($index, "$objtree/$file");
+			exit;
+		}
+
+		$index++;
+
+		# limit the number of children to $njobs
+		if (scalar(keys(%{$jobs})) >= $njobs) {
+			wait_for_results($select);
+		}
+	}
+
+	# wait for the remaining children to complete
+	while (scalar(keys(%{$jobs})) > 0) {
+		wait_for_results($select);
+	}
+}
+
+sub generate_initcall_lds() {
+	process_files();
+
+	my $sections = {};	# level -> [ secname, ...]
+
+	# sort results to retain link order and split to sections per
+	# initcall level
+	foreach my $index (sort { $a <=> $b } keys(%{$results})) {
+		foreach my $result (@{$results->{$index}}) {
+			my $level = $result->{'level'};
+
+			if (!exists($sections->{$level})) {
+				$sections->{$level} = [];
+			}
+
+			push(@{$sections->{$level}}, $result->{'secname'});
+		}
+	}
+
+	die "$0: ERROR: no initcalls?" if (!keys(%{$sections}));
+
+	# print out a linker script that defines the order of initcalls for
+	# each level
+	print "SECTIONS {\n";
+
+	foreach my $level (sort(keys(%{$sections}))) {
+		my $section;
+
+		if ($level eq 'con') {
+			$section = '.con_initcall.init';
+		} else {
+			$section = ".initcall${level}.init";
+		}
+
+		print "\t${section} : {\n";
+
+		foreach my $secname (@{$sections->{$level}}) {
+			print "\t\t*(${section}..${secname}) ;\n";
+		}
+
+		print "\t}\n";
+	}
+
+	print "}\n";
+}
+
+generate_initcall_lds();
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 7f4d19271180..cd649dc21c04 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -43,6 +43,17 @@ info()
 	fi
 }
 
+# Generate a linker script to ensure correct ordering of initcalls.
+gen_initcalls()
+{
+	info GEN .tmp_initcalls.lds
+
+	${PYTHON} ${srctree}/scripts/jobserver-exec		\
+	${PERL} ${srctree}/scripts/generate_initcall_order.pl	\
+		${KBUILD_VMLINUX_OBJS} ${KBUILD_VMLINUX_LIBS}	\
+		> .tmp_initcalls.lds
+}
+
 # If CONFIG_LTO_CLANG is selected, collect generated symbol versions into
 # .tmp_symversions.lds
 gen_symversions()
@@ -72,6 +83,9 @@ modpost_link()
 		--end-group"
 
 	if [ -n "${CONFIG_LTO_CLANG}" ]; then
+		gen_initcalls
+		lds="-T .tmp_initcalls.lds"
+
 		if [ -n "${CONFIG_MODVERSIONS}" ]; then
 			gen_symversions
 			lds="${lds} -T .tmp_symversions.lds"
@@ -284,6 +298,7 @@ cleanup()
 {
 	rm -f .btf.*
 	rm -f .tmp_System.map
+	rm -f .tmp_initcalls.lds
 	rm -f .tmp_symversions.lds
 	rm -f .tmp_vmlinux*
 	rm -f System.map
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 16/25] init: lto: fix PREL32 relocations
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (14 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 15/25] init: lto: ensure initcall ordering Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-14 22:53   ` Kees Cook
  2020-10-15  0:12   ` Jann Horn
  2020-10-13  0:31 ` [PATCH v6 17/25] PCI: Fix PREL32 relocations for LTO Sami Tolvanen
                   ` (8 subsequent siblings)
  24 siblings, 2 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

With LTO, the compiler can rename static functions to avoid global
naming collisions. As initcall functions are typically static,
renaming can break references to them in inline assembly. This
change adds a global stub with a stable name for each initcall to
fix the issue when PREL32 relocations are used.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 include/linux/init.h | 31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/include/linux/init.h b/include/linux/init.h
index af638cd6dd52..cea63f7e7705 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -209,26 +209,49 @@ extern bool initcall_debug;
  */
 #define __initcall_section(__sec, __iid)			\
 	#__sec ".init.." #__iid
+
+/*
+ * With LTO, the compiler can rename static functions to avoid
+ * global naming collisions. We use a global stub function for
+ * initcalls to create a stable symbol name whose address can be
+ * taken in inline assembly when PREL32 relocations are used.
+ */
+#define __initcall_stub(fn, __iid, id)				\
+	__initcall_name(initstub, __iid, id)
+
+#define __define_initcall_stub(__stub, fn)			\
+	int __init __stub(void);				\
+	int __init __stub(void)					\
+	{ 							\
+		return fn();					\
+	}							\
+	__ADDRESSABLE(__stub)
 #else
 #define __initcall_section(__sec, __iid)			\
 	#__sec ".init"
+
+#define __initcall_stub(fn, __iid, id)	fn
+
+#define __define_initcall_stub(__stub, fn)			\
+	__ADDRESSABLE(fn)
 #endif
 
 #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
-#define ____define_initcall(fn, __name, __sec)			\
-	__ADDRESSABLE(fn)					\
+#define ____define_initcall(fn, __stub, __name, __sec)		\
+	__define_initcall_stub(__stub, fn)			\
 	asm(".section	\"" __sec "\", \"a\"		\n"	\
 	    __stringify(__name) ":			\n"	\
-	    ".long	" #fn " - .			\n"	\
+	    ".long	" __stringify(__stub) " - .	\n"	\
 	    ".previous					\n");
 #else
-#define ____define_initcall(fn, __name, __sec)			\
+#define ____define_initcall(fn, __unused, __name, __sec)	\
 	static initcall_t __name __used 			\
 		__attribute__((__section__(__sec))) = fn;
 #endif
 
 #define __unique_initcall(fn, id, __sec, __iid)			\
 	____define_initcall(fn,					\
+		__initcall_stub(fn, __iid, id),			\
 		__initcall_name(initcall, __iid, id),		\
 		__initcall_section(__sec, __iid))
 
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 17/25] PCI: Fix PREL32 relocations for LTO
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (15 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 16/25] init: lto: fix PREL32 relocations Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-14 22:58   ` Kees Cook
  2020-10-13  0:31 ` [PATCH v6 18/25] modpost: lto: strip .lto from module names Sami Tolvanen
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

With Clang's Link Time Optimization (LTO), the compiler can rename
static functions to avoid global naming collisions. As PCI fixup
functions are typically static, renaming can break references
to them in inline assembly. This change adds a global stub to
DECLARE_PCI_FIXUP_SECTION to fix the issue when PREL32 relocations
are used.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 include/linux/pci.h | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 835530605c0d..4e64421981c7 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1909,19 +1909,28 @@ enum pci_fixup_pass {
 };
 
 #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
-#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
-				    class_shift, hook)			\
-	__ADDRESSABLE(hook)						\
+#define ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
+				    class_shift, hook, stub)		\
+	void stub(struct pci_dev *dev);					\
+	void stub(struct pci_dev *dev)					\
+	{ 								\
+		hook(dev); 						\
+	}								\
 	asm(".section "	#sec ", \"a\"				\n"	\
 	    ".balign	16					\n"	\
 	    ".short "	#vendor ", " #device "			\n"	\
 	    ".long "	#class ", " #class_shift "		\n"	\
-	    ".long "	#hook " - .				\n"	\
+	    ".long "	#stub " - .				\n"	\
 	    ".previous						\n");
+
+#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
+				  class_shift, hook, stub)		\
+	___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
+				  class_shift, hook, stub)
 #define DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
 				  class_shift, hook)			\
 	__DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
-				  class_shift, hook)
+				  class_shift, hook, __UNIQUE_ID(hook))
 #else
 /* Anonymous variables would be nice... */
 #define DECLARE_PCI_FIXUP_SECTION(section, name, vendor, device, class,	\
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 18/25] modpost: lto: strip .lto from module names
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (16 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 17/25] PCI: Fix PREL32 relocations for LTO Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 19/25] scripts/mod: disable LTO for empty.c Sami Tolvanen
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

With LTO, everything is compiled into LLVM bitcode, so we have to link
each module into native code before modpost. Kbuild uses the .lto.o
suffix for these files, which also ends up in module information. This
change strips the unnecessary .lto suffix from the module name.

Suggested-by: Bill Wendling <morbo@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 scripts/mod/modpost.c    | 16 +++++++---------
 scripts/mod/modpost.h    |  9 +++++++++
 scripts/mod/sumversion.c |  6 +++++-
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 69341b36f271..5a329df55cc3 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -17,7 +17,6 @@
 #include <ctype.h>
 #include <string.h>
 #include <limits.h>
-#include <stdbool.h>
 #include <errno.h>
 #include "modpost.h"
 #include "../../include/linux/license.h"
@@ -80,14 +79,6 @@ modpost_log(enum loglevel loglevel, const char *fmt, ...)
 		exit(1);
 }
 
-static inline bool strends(const char *str, const char *postfix)
-{
-	if (strlen(str) < strlen(postfix))
-		return false;
-
-	return strcmp(str + strlen(str) - strlen(postfix), postfix) == 0;
-}
-
 void *do_nofail(void *ptr, const char *expr)
 {
 	if (!ptr)
@@ -1984,6 +1975,10 @@ static char *remove_dot(char *s)
 		size_t m = strspn(s + n + 1, "0123456789");
 		if (m && (s[n + m] == '.' || s[n + m] == 0))
 			s[n] = 0;
+
+		/* strip trailing .lto */
+		if (strends(s, ".lto"))
+			s[strlen(s) - 4] = '\0';
 	}
 	return s;
 }
@@ -2007,6 +2002,9 @@ static void read_symbols(const char *modname)
 		/* strip trailing .o */
 		tmp = NOFAIL(strdup(modname));
 		tmp[strlen(tmp) - 2] = '\0';
+		/* strip trailing .lto */
+		if (strends(tmp, ".lto"))
+			tmp[strlen(tmp) - 4] = '\0';
 		mod = new_module(tmp);
 		free(tmp);
 	}
diff --git a/scripts/mod/modpost.h b/scripts/mod/modpost.h
index 3aa052722233..fab30d201f9e 100644
--- a/scripts/mod/modpost.h
+++ b/scripts/mod/modpost.h
@@ -2,6 +2,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <stdarg.h>
+#include <stdbool.h>
 #include <string.h>
 #include <sys/types.h>
 #include <sys/stat.h>
@@ -180,6 +181,14 @@ static inline unsigned int get_secindex(const struct elf_info *info,
 	return info->symtab_shndx_start[sym - info->symtab_start];
 }
 
+static inline bool strends(const char *str, const char *postfix)
+{
+	if (strlen(str) < strlen(postfix))
+		return false;
+
+	return strcmp(str + strlen(str) - strlen(postfix), postfix) == 0;
+}
+
 /* file2alias.c */
 extern unsigned int cross_build;
 void handle_moddevtable(struct module *mod, struct elf_info *info,
diff --git a/scripts/mod/sumversion.c b/scripts/mod/sumversion.c
index d587f40f1117..760e6baa7eda 100644
--- a/scripts/mod/sumversion.c
+++ b/scripts/mod/sumversion.c
@@ -391,10 +391,14 @@ void get_src_version(const char *modname, char sum[], unsigned sumlen)
 	struct md4_ctx md;
 	char *fname;
 	char filelist[PATH_MAX + 1];
+	int postfix_len = 1;
+
+	if (strends(modname, ".lto.o"))
+		postfix_len = 5;
 
 	/* objects for a module are listed in the first line of *.mod file. */
 	snprintf(filelist, sizeof(filelist), "%.*smod",
-		 (int)strlen(modname) - 1, modname);
+		 (int)strlen(modname) - postfix_len, modname);
 
 	buf = read_text_file(filelist);
 
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 19/25] scripts/mod: disable LTO for empty.c
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (17 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 18/25] modpost: lto: strip .lto from module names Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 20/25] efi/libstub: disable LTO Sami Tolvanen
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

With CONFIG_LTO_CLANG, clang generates LLVM IR instead of ELF object
files. As empty.o is used for probing target properties, disable LTO
for it to produce an object file instead.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 scripts/mod/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile
index 78071681d924..c9e38ad937fd 100644
--- a/scripts/mod/Makefile
+++ b/scripts/mod/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 OBJECT_FILES_NON_STANDARD := y
+CFLAGS_REMOVE_empty.o += $(CC_FLAGS_LTO)
 
 hostprogs-always-y	+= modpost mk_elfconfig
 always-y		+= empty.o
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 20/25] efi/libstub: disable LTO
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (18 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 19/25] scripts/mod: disable LTO for empty.c Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:31 ` [PATCH v6 21/25] drivers/misc/lkdtm: disable LTO for rodata.o Sami Tolvanen
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

With CONFIG_LTO_CLANG, we produce LLVM bitcode instead of ELF object
files. Since LTO is not really needed here and the Makefile assumes we
produce an object file, disable LTO for libstub.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 drivers/firmware/efi/libstub/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 0c911e391d75..e927876f3a05 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -36,6 +36,8 @@ KBUILD_CFLAGS			:= $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
 
 # remove SCS flags from all objects in this directory
 KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))
+# disable LTO
+KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO), $(KBUILD_CFLAGS))
 
 GCOV_PROFILE			:= n
 # Sanitizer runtimes are unavailable and cannot be linked here.
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 21/25] drivers/misc/lkdtm: disable LTO for rodata.o
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (19 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 20/25] efi/libstub: disable LTO Sami Tolvanen
@ 2020-10-13  0:31 ` Sami Tolvanen
  2020-10-13  0:32 ` [PATCH v6 22/25] x86/asm: annotate indirect jumps Sami Tolvanen
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:31 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

Disable LTO for rodata.o to allow objcopy to be used to
manipulate sections.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
---
 drivers/misc/lkdtm/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/misc/lkdtm/Makefile b/drivers/misc/lkdtm/Makefile
index c70b3822013f..dd4c936d4d73 100644
--- a/drivers/misc/lkdtm/Makefile
+++ b/drivers/misc/lkdtm/Makefile
@@ -13,6 +13,7 @@ lkdtm-$(CONFIG_LKDTM)		+= cfi.o
 
 KASAN_SANITIZE_stackleak.o	:= n
 KCOV_INSTRUMENT_rodata.o	:= n
+CFLAGS_REMOVE_rodata.o		+= $(CC_FLAGS_LTO)
 
 OBJCOPYFLAGS :=
 OBJCOPYFLAGS_rodata_objcopy.o	:= \
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (20 preceding siblings ...)
  2020-10-13  0:31 ` [PATCH v6 21/25] drivers/misc/lkdtm: disable LTO for rodata.o Sami Tolvanen
@ 2020-10-13  0:32 ` Sami Tolvanen
  2020-10-14 22:46   ` Kees Cook
  2020-10-14 23:23   ` Jann Horn
  2020-10-13  0:32 ` [PATCH v6 23/25] x86, vdso: disable LTO only for vDSO Sami Tolvanen
                   ` (2 subsequent siblings)
  24 siblings, 2 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:32 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

Running objtool --vmlinux --duplicate on vmlinux.o produces a few
warnings about indirect jumps with retpoline:

  vmlinux.o: warning: objtool: wakeup_long64()+0x61: indirect jump
  found in RETPOLINE build
  ...

This change adds ANNOTATE_RETPOLINE_SAFE annotations to the jumps
in assembly code to stop the warnings.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/x86/kernel/acpi/wakeup_64.S  | 2 ++
 arch/x86/platform/pvh/head.S      | 2 ++
 arch/x86/power/hibernate_asm_64.S | 3 +++
 3 files changed, 7 insertions(+)

diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S
index c8daa92f38dc..041e79c4e195 100644
--- a/arch/x86/kernel/acpi/wakeup_64.S
+++ b/arch/x86/kernel/acpi/wakeup_64.S
@@ -7,6 +7,7 @@
 #include <asm/msr.h>
 #include <asm/asm-offsets.h>
 #include <asm/frame.h>
+#include <asm/nospec-branch.h>
 
 # Copyright 2003 Pavel Machek <pavel@suse.cz
 
@@ -39,6 +40,7 @@ SYM_FUNC_START(wakeup_long64)
 	movq	saved_rbp, %rbp
 
 	movq	saved_rip, %rax
+	ANNOTATE_RETPOLINE_SAFE
 	jmp	*%rax
 SYM_FUNC_END(wakeup_long64)
 
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index 43b4d864817e..640b79cc64b8 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -15,6 +15,7 @@
 #include <asm/asm.h>
 #include <asm/boot.h>
 #include <asm/processor-flags.h>
+#include <asm/nospec-branch.h>
 #include <asm/msr.h>
 #include <xen/interface/elfnote.h>
 
@@ -105,6 +106,7 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
 	/* startup_64 expects boot_params in %rsi. */
 	mov $_pa(pvh_bootparams), %rsi
 	mov $_pa(startup_64), %rax
+	ANNOTATE_RETPOLINE_SAFE
 	jmp *%rax
 
 #else /* CONFIG_X86_64 */
diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index 7918b8415f13..715509d94fa3 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -21,6 +21,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/processor-flags.h>
 #include <asm/frame.h>
+#include <asm/nospec-branch.h>
 
 SYM_FUNC_START(swsusp_arch_suspend)
 	movq	$saved_context, %rax
@@ -66,6 +67,7 @@ SYM_CODE_START(restore_image)
 
 	/* jump to relocated restore code */
 	movq	relocated_restore_code(%rip), %rcx
+	ANNOTATE_RETPOLINE_SAFE
 	jmpq	*%rcx
 SYM_CODE_END(restore_image)
 
@@ -97,6 +99,7 @@ SYM_CODE_START(core_restore_code)
 
 .Ldone:
 	/* jump to the restore_registers address from the image header */
+	ANNOTATE_RETPOLINE_SAFE
 	jmpq	*%r8
 SYM_CODE_END(core_restore_code)
 
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 23/25] x86, vdso: disable LTO only for vDSO
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (21 preceding siblings ...)
  2020-10-13  0:32 ` [PATCH v6 22/25] x86/asm: annotate indirect jumps Sami Tolvanen
@ 2020-10-13  0:32 ` Sami Tolvanen
  2020-10-13  0:32 ` [PATCH v6 24/25] x86, cpu: disable LTO for cpu.c Sami Tolvanen
  2020-10-13  0:32 ` [PATCH v6 25/25] x86, build: allow LTO_CLANG and THINLTO to be selected Sami Tolvanen
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:32 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

Disable LTO for the vDSO. Note that while we could use Clang's LTO
for the 64-bit vDSO, it won't add noticeable benefit for the small
amount of C code.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/entry/vdso/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index ecc27018ae13..9b742f21d2db 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -90,7 +90,7 @@ ifneq ($(RETPOLINE_VDSO_CFLAGS),)
 endif
 endif
 
-$(vobjs): KBUILD_CFLAGS := $(filter-out $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS)) $(CFL)
+$(vobjs): KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO) $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS)) $(CFL)
 
 #
 # vDSO code runs in userspace and -pg doesn't help with profiling anyway.
@@ -148,6 +148,7 @@ KBUILD_CFLAGS_32 := $(filter-out -fno-pic,$(KBUILD_CFLAGS_32))
 KBUILD_CFLAGS_32 := $(filter-out -mfentry,$(KBUILD_CFLAGS_32))
 KBUILD_CFLAGS_32 := $(filter-out $(GCC_PLUGINS_CFLAGS),$(KBUILD_CFLAGS_32))
 KBUILD_CFLAGS_32 := $(filter-out $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS_32))
+KBUILD_CFLAGS_32 := $(filter-out $(CC_FLAGS_LTO),$(KBUILD_CFLAGS_32))
 KBUILD_CFLAGS_32 += -m32 -msoft-float -mregparm=0 -fpic
 KBUILD_CFLAGS_32 += -fno-stack-protector
 KBUILD_CFLAGS_32 += $(call cc-option, -foptimize-sibling-calls)
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 24/25] x86, cpu: disable LTO for cpu.c
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (22 preceding siblings ...)
  2020-10-13  0:32 ` [PATCH v6 23/25] x86, vdso: disable LTO only for vDSO Sami Tolvanen
@ 2020-10-13  0:32 ` Sami Tolvanen
  2020-10-13  0:32 ` [PATCH v6 25/25] x86, build: allow LTO_CLANG and THINLTO to be selected Sami Tolvanen
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:32 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

Clang incorrectly inlines functions with differing stack protector
attributes, which breaks __restore_processor_state() that relies on
stack protector being disabled. This change disables LTO for cpu.c
to work aroung the bug.

Link: https://bugs.llvm.org/show_bug.cgi?id=47479
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/x86/power/Makefile | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/power/Makefile b/arch/x86/power/Makefile
index 6907b523e856..5f711a441623 100644
--- a/arch/x86/power/Makefile
+++ b/arch/x86/power/Makefile
@@ -5,5 +5,9 @@ OBJECT_FILES_NON_STANDARD_hibernate_asm_$(BITS).o := y
 # itself be stack-protected
 CFLAGS_cpu.o	:= -fno-stack-protector
 
+# Clang may incorrectly inline functions with stack protector enabled into
+# __restore_processor_state(): https://bugs.llvm.org/show_bug.cgi?id=47479
+CFLAGS_REMOVE_cpu.o := $(CC_FLAGS_LTO)
+
 obj-$(CONFIG_PM_SLEEP)		+= cpu.o
 obj-$(CONFIG_HIBERNATION)	+= hibernate_$(BITS).o hibernate_asm_$(BITS).o hibernate.o
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v6 25/25] x86, build: allow LTO_CLANG and THINLTO to be selected
  2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
                   ` (23 preceding siblings ...)
  2020-10-13  0:32 ` [PATCH v6 24/25] x86, cpu: disable LTO for cpu.c Sami Tolvanen
@ 2020-10-13  0:32 ` Sami Tolvanen
  24 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-13  0:32 UTC (permalink / raw)
  To: Masahiro Yamada, Steven Rostedt
  Cc: Will Deacon, Peter Zijlstra, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	kernel-hardening, linux-arch, linux-arm-kernel, linux-kbuild,
	linux-kernel, linux-pci, x86, Sami Tolvanen

Pass code model and stack alignment to the linker as these are
not stored in LLVM bitcode, and allow both CONFIG_LTO_CLANG and
CONFIG_THINLTO to be selected.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/Kconfig  | 2 ++
 arch/x86/Makefile | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6d67646153bc..c579d7000b67 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -92,6 +92,8 @@ config X86
 	select ARCH_SUPPORTS_ACPI
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_NUMA_BALANCING	if X86_64
+	select ARCH_SUPPORTS_LTO_CLANG		if X86_64
+	select ARCH_SUPPORTS_THINLTO		if X86_64
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_QUEUED_RWLOCKS
 	select ARCH_USE_QUEUED_SPINLOCKS
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 154259f18b8b..774a7debb27c 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -173,6 +173,11 @@ ifeq ($(ACCUMULATE_OUTGOING_ARGS), 1)
 	KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args,)
 endif
 
+ifdef CONFIG_LTO_CLANG
+KBUILD_LDFLAGS	+= -plugin-opt=-code-model=kernel \
+		   -plugin-opt=-stack-alignment=$(if $(CONFIG_X86_32),4,8)
+endif
+
 # Workaround for a gcc prelease that unfortunately was shipped in a suse release
 KBUILD_CFLAGS += -Wno-sign-compare
 #
-- 
2.28.0.1011.ga647a8990f-goog


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 02/25] objtool: Add a pass for generating __mcount_loc
  2020-10-13  0:31 ` [PATCH v6 02/25] objtool: Add a pass for generating __mcount_loc Sami Tolvanen
@ 2020-10-14 16:50   ` Ingo Molnar
  2020-10-14 18:21     ` Peter Zijlstra
  0 siblings, 1 reply; 73+ messages in thread
From: Ingo Molnar @ 2020-10-14 16:50 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Masahiro Yamada, Steven Rostedt, Will Deacon, Peter Zijlstra,
	Greg Kroah-Hartman, Paul E. McKenney, Kees Cook,
	Nick Desaulniers, clang-built-linux, kernel-hardening,
	linux-arch, linux-arm-kernel, linux-kbuild, linux-kernel,
	linux-pci, x86, Josh Poimboeuf


* Sami Tolvanen <samitolvanen@google.com> wrote:

> From: Peter Zijlstra <peterz@infradead.org>
> 
> Add the --mcount option for generating __mcount_loc sections
> needed for dynamic ftrace. Using this pass requires the kernel to
> be compiled with -mfentry and CC_USING_NOP_MCOUNT to be defined
> in Makefile.
> 
> Link: https://lore.kernel.org/lkml/20200625200235.GQ4781@hirez.programming.kicks-ass.net/
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> [Sami: rebased, dropped config changes, fixed to actually use --mcount,
>        and wrote a commit message.]
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> ---
>  tools/objtool/builtin-check.c |  3 +-
>  tools/objtool/builtin.h       |  2 +-
>  tools/objtool/check.c         | 82 +++++++++++++++++++++++++++++++++++
>  tools/objtool/check.h         |  1 +
>  tools/objtool/objtool.c       |  1 +
>  tools/objtool/objtool.h       |  1 +
>  6 files changed, 88 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
> index c6d199bfd0ae..e92e76f69176 100644
> --- a/tools/objtool/builtin-check.c
> +++ b/tools/objtool/builtin-check.c
> @@ -18,7 +18,7 @@
>  #include "builtin.h"
>  #include "objtool.h"
>  
> -bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux;
> +bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount;
>  
>  static const char * const check_usage[] = {
>  	"objtool check [<options>] file.o",
> @@ -35,6 +35,7 @@ const struct option check_options[] = {
>  	OPT_BOOLEAN('s', "stats", &stats, "print statistics"),
>  	OPT_BOOLEAN('d', "duplicate", &validate_dup, "duplicate validation for vmlinux.o"),
>  	OPT_BOOLEAN('l', "vmlinux", &vmlinux, "vmlinux.o validation"),
> +	OPT_BOOLEAN('M', "mcount", &mcount, "generate __mcount_loc"),
>  	OPT_END(),
>  };

Meh, adding --mcount as an option to 'objtool check' was a valid hack for a 
prototype patchset, but please turn this into a proper subcommand, just 
like 'objtool orc' is.

'objtool check' should ... keep checking. :-)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 02/25] objtool: Add a pass for generating __mcount_loc
  2020-10-14 16:50   ` Ingo Molnar
@ 2020-10-14 18:21     ` Peter Zijlstra
  2020-10-15 20:10       ` Josh Poimboeuf
  0 siblings, 1 reply; 73+ messages in thread
From: Peter Zijlstra @ 2020-10-14 18:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Sami Tolvanen, Masahiro Yamada, Steven Rostedt, Will Deacon,
	Greg Kroah-Hartman, Paul E. McKenney, Kees Cook,
	Nick Desaulniers, clang-built-linux, kernel-hardening,
	linux-arch, linux-arm-kernel, linux-kbuild, linux-kernel,
	linux-pci, x86, Josh Poimboeuf

On Wed, Oct 14, 2020 at 06:50:04PM +0200, Ingo Molnar wrote:
> Meh, adding --mcount as an option to 'objtool check' was a valid hack for a 
> prototype patchset, but please turn this into a proper subcommand, just 
> like 'objtool orc' is.
> 
> 'objtool check' should ... keep checking. :-)

No, no subcommands. orc being a subcommand was a mistake.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 07/25] treewide: remove DISABLE_LTO
  2020-10-13  0:31 ` [PATCH v6 07/25] treewide: remove DISABLE_LTO Sami Tolvanen
@ 2020-10-14 22:43   ` Kees Cook
  2020-10-17  1:46     ` Masahiro Yamada
  0 siblings, 1 reply; 73+ messages in thread
From: Kees Cook @ 2020-10-14 22:43 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Sami Tolvanen, Steven Rostedt, Will Deacon, Peter Zijlstra,
	Greg Kroah-Hartman, Paul E. McKenney, Nick Desaulniers,
	clang-built-linux, kernel-hardening, linux-arch,
	linux-arm-kernel, linux-kbuild, linux-kernel, linux-pci, x86

On Mon, Oct 12, 2020 at 05:31:45PM -0700, Sami Tolvanen wrote:
> This change removes all instances of DISABLE_LTO from
> Makefiles, as they are currently unused, and the preferred
> method of disabling LTO is to filter out the flags instead.
> 
> Suggested-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>

Hi Masahiro,

Since this is independent of anything else and could be seen as a
general cleanup, can this patch be taken into your tree, just to
separate it from the list of dependencies for this series?

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-13  0:32 ` [PATCH v6 22/25] x86/asm: annotate indirect jumps Sami Tolvanen
@ 2020-10-14 22:46   ` Kees Cook
  2020-10-14 23:23   ` Jann Horn
  1 sibling, 0 replies; 73+ messages in thread
From: Kees Cook @ 2020-10-14 22:46 UTC (permalink / raw)
  To: Thomas Gleixner, Josh Poimboeuf
  Cc: Masahiro Yamada, Sami Tolvanen, Steven Rostedt, Will Deacon,
	Peter Zijlstra, Greg Kroah-Hartman, Paul E. McKenney,
	Nick Desaulniers, clang-built-linux, kernel-hardening,
	linux-arch, linux-arm-kernel, linux-kbuild, linux-kernel,
	linux-pci, x86

On Mon, Oct 12, 2020 at 05:32:00PM -0700, Sami Tolvanen wrote:
> Running objtool --vmlinux --duplicate on vmlinux.o produces a few
> warnings about indirect jumps with retpoline:
> 
>   vmlinux.o: warning: objtool: wakeup_long64()+0x61: indirect jump
>   found in RETPOLINE build
>   ...
> 
> This change adds ANNOTATE_RETPOLINE_SAFE annotations to the jumps
> in assembly code to stop the warnings.
> 
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

This looks like it's an independent fix -- can an x86 maintainer pick
up this patch directly?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 13/25] kbuild: lto: merge module sections
  2020-10-13  0:31 ` [PATCH v6 13/25] kbuild: lto: merge module sections Sami Tolvanen
@ 2020-10-14 22:49   ` Kees Cook
  2020-10-20 16:42     ` Sami Tolvanen
  0 siblings, 1 reply; 73+ messages in thread
From: Kees Cook @ 2020-10-14 22:49 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Masahiro Yamada, Steven Rostedt, Will Deacon, Peter Zijlstra,
	Greg Kroah-Hartman, Paul E. McKenney, Nick Desaulniers,
	clang-built-linux, kernel-hardening, linux-arch,
	linux-arm-kernel, linux-kbuild, linux-kernel, linux-pci, x86

On Mon, Oct 12, 2020 at 05:31:51PM -0700, Sami Tolvanen wrote:
> LLD always splits sections with LTO, which increases module sizes. This
> change adds linker script rules to merge the split sections in the final
> module.
> 
> Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> ---
>  scripts/module.lds.S | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/scripts/module.lds.S b/scripts/module.lds.S
> index 69b9b71a6a47..037120173a22 100644
> --- a/scripts/module.lds.S
> +++ b/scripts/module.lds.S
> @@ -25,5 +25,33 @@ SECTIONS {
>  	__jump_table		0 : ALIGN(8) { KEEP(*(__jump_table)) }
>  }
>  
> +#ifdef CONFIG_LTO_CLANG

In looking at this again -- is this ifdef needed? Couldn't this be done
unconditionally? (Which would make it an independent change...)

-Kees

> +/*
> + * With CONFIG_LTO_CLANG, LLD always enables -fdata-sections and
> + * -ffunction-sections, which increases the size of the final module.
> + * Merge the split sections in the final binary.
> + */
> +SECTIONS {
> +	__patchable_function_entries : { *(__patchable_function_entries) }
> +
> +	.bss : {
> +		*(.bss .bss.[0-9a-zA-Z_]*)
> +		*(.bss..L*)
> +	}
> +
> +	.data : {
> +		*(.data .data.[0-9a-zA-Z_]*)
> +		*(.data..L*)
> +	}
> +
> +	.rodata : {
> +		*(.rodata .rodata.[0-9a-zA-Z_]*)
> +		*(.rodata..L*)
> +	}
> +
> +	.text : { *(.text .text.[0-9a-zA-Z_]*) }
> +}
> +#endif
> +
>  /* bring in arch-specific sections */
>  #include <asm/module.lds.h>
> -- 
> 2.28.0.1011.ga647a8990f-goog
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 14/25] kbuild: lto: remove duplicate dependencies from .mod files
  2020-10-13  0:31 ` [PATCH v6 14/25] kbuild: lto: remove duplicate dependencies from .mod files Sami Tolvanen
@ 2020-10-14 22:50   ` Kees Cook
  2020-12-03 17:59     ` Masahiro Yamada
  0 siblings, 1 reply; 73+ messages in thread
From: Kees Cook @ 2020-10-14 22:50 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Sami Tolvanen, Steven Rostedt, Will Deacon, Peter Zijlstra,
	Greg Kroah-Hartman, Paul E. McKenney, Nick Desaulniers,
	clang-built-linux, kernel-hardening, linux-arch,
	linux-arm-kernel, linux-kbuild, linux-kernel, linux-pci, x86

On Mon, Oct 12, 2020 at 05:31:52PM -0700, Sami Tolvanen wrote:
> With LTO, llvm-nm prints out symbols for each archive member
> separately, which results in a lot of duplicate dependencies in the
> .mod file when CONFIG_TRIM_UNUSED_SYMS is enabled. When a module
> consists of several compilation units, the output can exceed the
> default xargs command size limit and split the dependency list to
> multiple lines, which results in used symbols getting trimmed.
> 
> This change removes duplicate dependencies, which will reduce the
> probability of this happening and makes .mod files smaller and
> easier to read.
> 
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>

Hi Masahiro,

This appears to be a general improvement as well. This looks like it can
land without depending on the rest of the series.

-Kees

> ---
>  scripts/Makefile.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index ab0ddf4884fd..96d6c9e18901 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -266,7 +266,7 @@ endef
>  
>  # List module undefined symbols (or empty line if not enabled)
>  ifdef CONFIG_TRIM_UNUSED_KSYMS
> -cmd_undef_syms = $(NM) $< | sed -n 's/^  *U //p' | xargs echo
> +cmd_undef_syms = $(NM) $< | sed -n 's/^  *U //p' | sort -u | xargs echo
>  else
>  cmd_undef_syms = echo
>  endif
> -- 
> 2.28.0.1011.ga647a8990f-goog
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 16/25] init: lto: fix PREL32 relocations
  2020-10-13  0:31 ` [PATCH v6 16/25] init: lto: fix PREL32 relocations Sami Tolvanen
@ 2020-10-14 22:53   ` Kees Cook
  2020-10-15  0:12   ` Jann Horn
  1 sibling, 0 replies; 73+ messages in thread
From: Kees Cook @ 2020-10-14 22:53 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Sami Tolvanen, Steven Rostedt, Will Deacon, Peter Zijlstra,
	Greg Kroah-Hartman, Paul E. McKenney, Nick Desaulniers,
	clang-built-linux, kernel-hardening, linux-arch,
	linux-arm-kernel, linux-kbuild, linux-kernel, linux-pci, x86

On Mon, Oct 12, 2020 at 05:31:54PM -0700, Sami Tolvanen wrote:
> With LTO, the compiler can rename static functions to avoid global
> naming collisions. As initcall functions are typically static,
> renaming can break references to them in inline assembly. This
> change adds a global stub with a stable name for each initcall to
> fix the issue when PREL32 relocations are used.
> 
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>

This is another independent improvement... this could land before the
other portions of the series.

-Kees

> ---
>  include/linux/init.h | 31 +++++++++++++++++++++++++++----
>  1 file changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/init.h b/include/linux/init.h
> index af638cd6dd52..cea63f7e7705 100644
> --- a/include/linux/init.h
> +++ b/include/linux/init.h
> @@ -209,26 +209,49 @@ extern bool initcall_debug;
>   */
>  #define __initcall_section(__sec, __iid)			\
>  	#__sec ".init.." #__iid
> +
> +/*
> + * With LTO, the compiler can rename static functions to avoid
> + * global naming collisions. We use a global stub function for
> + * initcalls to create a stable symbol name whose address can be
> + * taken in inline assembly when PREL32 relocations are used.
> + */
> +#define __initcall_stub(fn, __iid, id)				\
> +	__initcall_name(initstub, __iid, id)
> +
> +#define __define_initcall_stub(__stub, fn)			\
> +	int __init __stub(void);				\
> +	int __init __stub(void)					\
> +	{ 							\
> +		return fn();					\
> +	}							\
> +	__ADDRESSABLE(__stub)
>  #else
>  #define __initcall_section(__sec, __iid)			\
>  	#__sec ".init"
> +
> +#define __initcall_stub(fn, __iid, id)	fn
> +
> +#define __define_initcall_stub(__stub, fn)			\
> +	__ADDRESSABLE(fn)
>  #endif
>  
>  #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> -#define ____define_initcall(fn, __name, __sec)			\
> -	__ADDRESSABLE(fn)					\
> +#define ____define_initcall(fn, __stub, __name, __sec)		\
> +	__define_initcall_stub(__stub, fn)			\
>  	asm(".section	\"" __sec "\", \"a\"		\n"	\
>  	    __stringify(__name) ":			\n"	\
> -	    ".long	" #fn " - .			\n"	\
> +	    ".long	" __stringify(__stub) " - .	\n"	\
>  	    ".previous					\n");
>  #else
> -#define ____define_initcall(fn, __name, __sec)			\
> +#define ____define_initcall(fn, __unused, __name, __sec)	\
>  	static initcall_t __name __used 			\
>  		__attribute__((__section__(__sec))) = fn;
>  #endif
>  
>  #define __unique_initcall(fn, id, __sec, __iid)			\
>  	____define_initcall(fn,					\
> +		__initcall_stub(fn, __iid, id),			\
>  		__initcall_name(initcall, __iid, id),		\
>  		__initcall_section(__sec, __iid))
>  
> -- 
> 2.28.0.1011.ga647a8990f-goog
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 17/25] PCI: Fix PREL32 relocations for LTO
  2020-10-13  0:31 ` [PATCH v6 17/25] PCI: Fix PREL32 relocations for LTO Sami Tolvanen
@ 2020-10-14 22:58   ` Kees Cook
  0 siblings, 0 replies; 73+ messages in thread
From: Kees Cook @ 2020-10-14 22:58 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Masahiro Yamada, Sami Tolvanen, Steven Rostedt, Will Deacon,
	Peter Zijlstra, Greg Kroah-Hartman, Paul E. McKenney,
	Nick Desaulniers, clang-built-linux, kernel-hardening,
	linux-arch, linux-arm-kernel, linux-kbuild, linux-kernel,
	linux-pci, x86

On Mon, Oct 12, 2020 at 05:31:55PM -0700, Sami Tolvanen wrote:
> With Clang's Link Time Optimization (LTO), the compiler can rename
> static functions to avoid global naming collisions. As PCI fixup
> functions are typically static, renaming can break references
> to them in inline assembly. This change adds a global stub to
> DECLARE_PCI_FIXUP_SECTION to fix the issue when PREL32 relocations
> are used.
> 
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>

Another independent patch! :) Bjorn, since you've already Acked this
patch, would be be willing to pick it up for your tree?

-Kees

> ---
>  include/linux/pci.h | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 835530605c0d..4e64421981c7 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1909,19 +1909,28 @@ enum pci_fixup_pass {
>  };
>  
>  #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> -#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
> -				    class_shift, hook)			\
> -	__ADDRESSABLE(hook)						\
> +#define ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
> +				    class_shift, hook, stub)		\
> +	void stub(struct pci_dev *dev);					\
> +	void stub(struct pci_dev *dev)					\
> +	{ 								\
> +		hook(dev); 						\
> +	}								\
>  	asm(".section "	#sec ", \"a\"				\n"	\
>  	    ".balign	16					\n"	\
>  	    ".short "	#vendor ", " #device "			\n"	\
>  	    ".long "	#class ", " #class_shift "		\n"	\
> -	    ".long "	#hook " - .				\n"	\
> +	    ".long "	#stub " - .				\n"	\
>  	    ".previous						\n");
> +
> +#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
> +				  class_shift, hook, stub)		\
> +	___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
> +				  class_shift, hook, stub)
>  #define DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
>  				  class_shift, hook)			\
>  	__DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
> -				  class_shift, hook)
> +				  class_shift, hook, __UNIQUE_ID(hook))
>  #else
>  /* Anonymous variables would be nice... */
>  #define DECLARE_PCI_FIXUP_SECTION(section, name, vendor, device, class,	\
> -- 
> 2.28.0.1011.ga647a8990f-goog
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-13  0:32 ` [PATCH v6 22/25] x86/asm: annotate indirect jumps Sami Tolvanen
  2020-10-14 22:46   ` Kees Cook
@ 2020-10-14 23:23   ` Jann Horn
  2020-10-15 10:22     ` Peter Zijlstra
  1 sibling, 1 reply; 73+ messages in thread
From: Jann Horn @ 2020-10-14 23:23 UTC (permalink / raw)
  To: Sami Tolvanen, Josh Poimboeuf, Peter Zijlstra, the arch/x86 maintainers
  Cc: Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

+objtool folks

On Tue, Oct 13, 2020 at 2:35 AM Sami Tolvanen <samitolvanen@google.com> wrote:
> Running objtool --vmlinux --duplicate on vmlinux.o produces a few
> warnings about indirect jumps with retpoline:
>
>   vmlinux.o: warning: objtool: wakeup_long64()+0x61: indirect jump
>   found in RETPOLINE build
>   ...
>
> This change adds ANNOTATE_RETPOLINE_SAFE annotations to the jumps
> in assembly code to stop the warnings.

In other words, this patch deals with the fact that
OBJECT_FILES_NON_STANDARD stops being effective for object files that
are linked into the main kernel when LTO is on, right?
All the files you're touching here are supposed to be excluded from
objtool warnings at the moment:

$ grep OBJECT_FILES_NON_STANDARD arch/x86/kernel/acpi/Makefile
OBJECT_FILES_NON_STANDARD_wakeup_$(BITS).o := y
$ grep OBJECT_FILES_NON_STANDARD arch/x86/platform/pvh/Makefile
OBJECT_FILES_NON_STANDARD_head.o := y
$ grep OBJECT_FILES_NON_STANDARD arch/x86/power/Makefile
OBJECT_FILES_NON_STANDARD_hibernate_asm_$(BITS).o := y

It would probably be good to keep LTO and non-LTO builds in sync about
which files are subjected to objtool checks. So either you should be
removing the OBJECT_FILES_NON_STANDARD annotations for anything that
is linked into the main kernel (which would be a nice cleanup, if that
is possible), or alternatively ensure that code from these files is
excluded from objtool checks even with LTO (that'd probably be messy
and a bad idea?).

Grepping for other files marked as OBJECT_FILES_NON_STANDARD that
might be included in the main kernel on x86, I also see stuff like:

    5 arch/x86/crypto/Makefile                            5
OBJECT_FILES_NON_STANDARD := y
   10 arch/x86/kernel/Makefile                           39
OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o          := y
   12 arch/x86/kvm/Makefile                               7
OBJECT_FILES_NON_STANDARD_vmenter.o := y

for which I think the same thing applies.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 16/25] init: lto: fix PREL32 relocations
  2020-10-13  0:31 ` [PATCH v6 16/25] init: lto: fix PREL32 relocations Sami Tolvanen
  2020-10-14 22:53   ` Kees Cook
@ 2020-10-15  0:12   ` Jann Horn
  1 sibling, 0 replies; 73+ messages in thread
From: Jann Horn @ 2020-10-15  0:12 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Masahiro Yamada, Steven Rostedt, Will Deacon, Peter Zijlstra,
	Greg Kroah-Hartman, Paul E. McKenney, Kees Cook,
	Nick Desaulniers, clang-built-linux, Kernel Hardening,
	linux-arch, Linux ARM, linux-kbuild, kernel list, linux-pci,
	the arch/x86 maintainers

On Tue, Oct 13, 2020 at 2:34 AM Sami Tolvanen <samitolvanen@google.com> wrote:
> With LTO, the compiler can rename static functions to avoid global
> naming collisions. As initcall functions are typically static,
> renaming can break references to them in inline assembly. This
> change adds a global stub with a stable name for each initcall to
> fix the issue when PREL32 relocations are used.

While I understand that this may be necessary for now, are there any
plans to fix this in the compiler in the future? There was a thread
about this issue at
<http://lists.llvm.org/pipermail/llvm-dev/2016-April/thread.html#98047>,
and possible solutions were discussed there, but it looks like that
fizzled out...

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-14 23:23   ` Jann Horn
@ 2020-10-15 10:22     ` Peter Zijlstra
  2020-10-15 20:39       ` Josh Poimboeuf
  0 siblings, 1 reply; 73+ messages in thread
From: Peter Zijlstra @ 2020-10-15 10:22 UTC (permalink / raw)
  To: Jann Horn
  Cc: Sami Tolvanen, Josh Poimboeuf, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Thu, Oct 15, 2020 at 01:23:41AM +0200, Jann Horn wrote:

> It would probably be good to keep LTO and non-LTO builds in sync about
> which files are subjected to objtool checks. So either you should be
> removing the OBJECT_FILES_NON_STANDARD annotations for anything that
> is linked into the main kernel (which would be a nice cleanup, if that
> is possible), 

This, I've had to do that for a number of files already for the limited
vmlinux.o passes we needed for noinstr validation.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 02/25] objtool: Add a pass for generating __mcount_loc
  2020-10-14 18:21     ` Peter Zijlstra
@ 2020-10-15 20:10       ` Josh Poimboeuf
  0 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2020-10-15 20:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Sami Tolvanen, Masahiro Yamada, Steven Rostedt,
	Will Deacon, Greg Kroah-Hartman, Paul E. McKenney, Kees Cook,
	Nick Desaulniers, clang-built-linux, kernel-hardening,
	linux-arch, linux-arm-kernel, linux-kbuild, linux-kernel,
	linux-pci, x86

On Wed, Oct 14, 2020 at 08:21:15PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 14, 2020 at 06:50:04PM +0200, Ingo Molnar wrote:
> > Meh, adding --mcount as an option to 'objtool check' was a valid hack for a 
> > prototype patchset, but please turn this into a proper subcommand, just 
> > like 'objtool orc' is.
> > 
> > 'objtool check' should ... keep checking. :-)
> 
> No, no subcommands. orc being a subcommand was a mistake.

Yup, it gets real awkward when trying to combine subcommands.

I proposed a more logical design:

  https://lkml.kernel.org/r/20201002141303.hyl72to37wudoi66@treble

-- 
Josh


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-15 10:22     ` Peter Zijlstra
@ 2020-10-15 20:39       ` Josh Poimboeuf
  2020-10-20 16:45         ` Sami Tolvanen
  0 siblings, 1 reply; 73+ messages in thread
From: Josh Poimboeuf @ 2020-10-15 20:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jann Horn, Sami Tolvanen, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Thu, Oct 15, 2020 at 12:22:16PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 15, 2020 at 01:23:41AM +0200, Jann Horn wrote:
> 
> > It would probably be good to keep LTO and non-LTO builds in sync about
> > which files are subjected to objtool checks. So either you should be
> > removing the OBJECT_FILES_NON_STANDARD annotations for anything that
> > is linked into the main kernel (which would be a nice cleanup, if that
> > is possible), 
> 
> This, I've had to do that for a number of files already for the limited
> vmlinux.o passes we needed for noinstr validation.

Getting rid of OBJECT_FILES_NON_STANDARD is indeed the end goal, though
I'm not sure how practical that will be for some of the weirder edge
case.

On a related note, I have some old crypto cleanups which need dusting
off.

-- 
Josh


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 07/25] treewide: remove DISABLE_LTO
  2020-10-14 22:43   ` Kees Cook
@ 2020-10-17  1:46     ` Masahiro Yamada
  0 siblings, 0 replies; 73+ messages in thread
From: Masahiro Yamada @ 2020-10-17  1:46 UTC (permalink / raw)
  To: Kees Cook
  Cc: Sami Tolvanen, Steven Rostedt, Will Deacon, Peter Zijlstra,
	Greg Kroah-Hartman, Paul E. McKenney, Nick Desaulniers,
	clang-built-linux, Kernel Hardening, linux-arch,
	linux-arm-kernel, Linux Kbuild mailing list,
	Linux Kernel Mailing List, linux-pci, X86 ML

On Thu, Oct 15, 2020 at 7:43 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Mon, Oct 12, 2020 at 05:31:45PM -0700, Sami Tolvanen wrote:
> > This change removes all instances of DISABLE_LTO from
> > Makefiles, as they are currently unused, and the preferred
> > method of disabling LTO is to filter out the flags instead.
> >
> > Suggested-by: Kees Cook <keescook@chromium.org>
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > Reviewed-by: Kees Cook <keescook@chromium.org>
>
> Hi Masahiro,
>
> Since this is independent of anything else and could be seen as a
> general cleanup, can this patch be taken into your tree, just to
> separate it from the list of dependencies for this series?
>
> -Kees
>
> --
> Kees Cook



Yes, this is stale code because GCC LTO was not pulled.

Applied to linux-kbuild.

I added the following historical background.



Note added by Masahiro Yamada:
DISABLE_LTO was added as preparation for GCC LTO, but GCC LTO was
not pulled into the mainline. (https://lkml.org/lkml/2014/4/8/272)




-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 13/25] kbuild: lto: merge module sections
  2020-10-14 22:49   ` Kees Cook
@ 2020-10-20 16:42     ` Sami Tolvanen
  0 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-20 16:42 UTC (permalink / raw)
  To: Kees Cook
  Cc: Masahiro Yamada, Steven Rostedt, Will Deacon, Peter Zijlstra,
	Greg Kroah-Hartman, Paul E. McKenney, Nick Desaulniers,
	clang-built-linux, Kernel Hardening, linux-arch,
	linux-arm-kernel, linux-kbuild, LKML, linux-pci, X86 ML

On Wed, Oct 14, 2020 at 3:49 PM Kees Cook <keescook@chromium.org> wrote:
> In looking at this again -- is this ifdef needed? Couldn't this be done
> unconditionally? (Which would make it an independent change...)

No, I suppose it's not needed. I can drop the ifdef from the next version.

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-15 20:39       ` Josh Poimboeuf
@ 2020-10-20 16:45         ` Sami Tolvanen
  2020-10-20 18:52           ` Josh Poimboeuf
  0 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-20 16:45 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Thu, Oct 15, 2020 at 1:39 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Thu, Oct 15, 2020 at 12:22:16PM +0200, Peter Zijlstra wrote:
> > On Thu, Oct 15, 2020 at 01:23:41AM +0200, Jann Horn wrote:
> >
> > > It would probably be good to keep LTO and non-LTO builds in sync about
> > > which files are subjected to objtool checks. So either you should be
> > > removing the OBJECT_FILES_NON_STANDARD annotations for anything that
> > > is linked into the main kernel (which would be a nice cleanup, if that
> > > is possible),
> >
> > This, I've had to do that for a number of files already for the limited
> > vmlinux.o passes we needed for noinstr validation.
>
> Getting rid of OBJECT_FILES_NON_STANDARD is indeed the end goal, though
> I'm not sure how practical that will be for some of the weirder edge
> case.
>
> On a related note, I have some old crypto cleanups which need dusting
> off.

Building allyesconfig with this series and LTO enabled, I still see
the following objtool warnings for vmlinux.o, grouped by source file:

arch/x86/entry/entry_64.S:
__switch_to_asm()+0x0: undefined stack state
.entry.text+0xffd: sibling call from callable instruction with
modified stack frame
.entry.text+0x48: stack state mismatch: cfa1=7-8 cfa2=-1+0

arch/x86/entry/entry_64_compat.S:
.entry.text+0x1754: unsupported instruction in callable function
.entry.text+0x1634: redundant CLD
.entry.text+0x15fd: stack state mismatch: cfa1=7-8 cfa2=-1+0
.entry.text+0x168c: stack state mismatch: cfa1=7-8 cfa2=-1+0

arch/x86/kernel/head_64.S:
.head.text+0xfb: unsupported instruction in callable function

arch/x86/kernel/acpi/wakeup_64.S:
do_suspend_lowlevel()+0x116: sibling call from callable instruction
with modified stack frame

arch/x86/crypto/camellia-aesni-avx2-asm_64.S:
camellia_cbc_dec_32way()+0xb3: stack state mismatch: cfa1=7+520 cfa2=7+8
camellia_ctr_32way()+0x1a: stack state mismatch: cfa1=7+520 cfa2=7+8

arch/x86/crypto/aesni-intel_avx-x86_64.S:
aesni_gcm_init_avx_gen2()+0x12: unsupported stack pointer realignment
aesni_gcm_enc_update_avx_gen2()+0x12: unsupported stack pointer realignment
aesni_gcm_dec_update_avx_gen2()+0x12: unsupported stack pointer realignment
aesni_gcm_finalize_avx_gen2()+0x12: unsupported stack pointer realignment
aesni_gcm_init_avx_gen4()+0x12: unsupported stack pointer realignment
aesni_gcm_enc_update_avx_gen4()+0x12: unsupported stack pointer realignment
aesni_gcm_dec_update_avx_gen4()+0x12: unsupported stack pointer realignment
aesni_gcm_finalize_avx_gen4()+0x12: unsupported stack pointer realignment

arch/x86/crypto/sha1_avx2_x86_64_asm.S:
sha1_transform_avx2()+0xc: unsupported stack pointer realignment

arch/x86/crypto/sha1_ni_asm.S:
sha1_ni_transform()+0x7: unsupported stack pointer realignment

arch/x86/crypto/sha256-avx2-asm.S:
sha256_transform_rorx()+0x13: unsupported stack pointer realignment

arch/x86/crypto/sha512-ssse3-asm.S:
sha512_transform_ssse3()+0x14: unsupported stack pointer realignment

arch/x86/crypto/sha512-avx-asm.S:
sha512_transform_avx()+0x14: unsupported stack pointer realignment

arch/x86/crypto/sha512-avx2-asm.S:
sha512_transform_rorx()+0x7: unsupported stack pointer realignment

arch/x86/lib/retpoline.S:
__x86_retpoline_rdi()+0x10: return with modified stack frame
__x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=7+8
__x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=-1+0

Josh, Peter, any thoughts on what would be the preferred way to fix
these, or how to tell objtool to ignore this code?

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-20 16:45         ` Sami Tolvanen
@ 2020-10-20 18:52           ` Josh Poimboeuf
  2020-10-20 19:24             ` Sami Tolvanen
  2020-10-21  9:51             ` Peter Zijlstra
  0 siblings, 2 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2020-10-20 18:52 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Tue, Oct 20, 2020 at 09:45:06AM -0700, Sami Tolvanen wrote:
> On Thu, Oct 15, 2020 at 1:39 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > On Thu, Oct 15, 2020 at 12:22:16PM +0200, Peter Zijlstra wrote:
> > > On Thu, Oct 15, 2020 at 01:23:41AM +0200, Jann Horn wrote:
> > >
> > > > It would probably be good to keep LTO and non-LTO builds in sync about
> > > > which files are subjected to objtool checks. So either you should be
> > > > removing the OBJECT_FILES_NON_STANDARD annotations for anything that
> > > > is linked into the main kernel (which would be a nice cleanup, if that
> > > > is possible),
> > >
> > > This, I've had to do that for a number of files already for the limited
> > > vmlinux.o passes we needed for noinstr validation.
> >
> > Getting rid of OBJECT_FILES_NON_STANDARD is indeed the end goal, though
> > I'm not sure how practical that will be for some of the weirder edge
> > case.
> >
> > On a related note, I have some old crypto cleanups which need dusting
> > off.
> 
> Building allyesconfig with this series and LTO enabled, I still see
> the following objtool warnings for vmlinux.o, grouped by source file:
> 
> arch/x86/entry/entry_64.S:
> __switch_to_asm()+0x0: undefined stack state
> .entry.text+0xffd: sibling call from callable instruction with
> modified stack frame
> .entry.text+0x48: stack state mismatch: cfa1=7-8 cfa2=-1+0

Not sure what this one's about, there's no OBJECT_FILES_NON_STANDARD?

> arch/x86/entry/entry_64_compat.S:
> .entry.text+0x1754: unsupported instruction in callable function
> .entry.text+0x1634: redundant CLD
> .entry.text+0x15fd: stack state mismatch: cfa1=7-8 cfa2=-1+0
> .entry.text+0x168c: stack state mismatch: cfa1=7-8 cfa2=-1+0

Ditto.

> arch/x86/kernel/head_64.S:
> .head.text+0xfb: unsupported instruction in callable function

Ditto.

> arch/x86/kernel/acpi/wakeup_64.S:
> do_suspend_lowlevel()+0x116: sibling call from callable instruction
> with modified stack frame

We'll need to look at how to handle this one.

> arch/x86/crypto/camellia-aesni-avx2-asm_64.S:
> camellia_cbc_dec_32way()+0xb3: stack state mismatch: cfa1=7+520 cfa2=7+8
> camellia_ctr_32way()+0x1a: stack state mismatch: cfa1=7+520 cfa2=7+8

I can clean off my patches for all the crypto warnings.

> arch/x86/lib/retpoline.S:
> __x86_retpoline_rdi()+0x10: return with modified stack frame
> __x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=7+8
> __x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=-1+0

Is this with upstream?  I thought we fixed that with
UNWIND_HINT_RET_OFFSET.

> Josh, Peter, any thoughts on what would be the preferred way to fix
> these, or how to tell objtool to ignore this code?

One way or another, the patches need to be free of warnings before
getting merged.  I can help, though I'm traveling and only have limited
bandwidth for at least the rest of the month.

Ideally we'd want to have objtool understand everything, with no
whitelisting, but some cases (e.g. suspend code) can be tricky.

I wouldn't be opposed to embedding the whitelist in the binary, in a
discardable section.  It should be relatively easy, but as I mentioned I
may or may not have time to work on it for a bit.  I'm working half
days, and now the ocean beckons from the window of my camper.

-- 
Josh


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-20 18:52           ` Josh Poimboeuf
@ 2020-10-20 19:24             ` Sami Tolvanen
  2020-10-21  8:56               ` Peter Zijlstra
  2020-10-21  9:51             ` Peter Zijlstra
  1 sibling, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-20 19:24 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Tue, Oct 20, 2020 at 11:52 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Tue, Oct 20, 2020 at 09:45:06AM -0700, Sami Tolvanen wrote:
> > On Thu, Oct 15, 2020 at 1:39 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > >
> > > On Thu, Oct 15, 2020 at 12:22:16PM +0200, Peter Zijlstra wrote:
> > > > On Thu, Oct 15, 2020 at 01:23:41AM +0200, Jann Horn wrote:
> > > >
> > > > > It would probably be good to keep LTO and non-LTO builds in sync about
> > > > > which files are subjected to objtool checks. So either you should be
> > > > > removing the OBJECT_FILES_NON_STANDARD annotations for anything that
> > > > > is linked into the main kernel (which would be a nice cleanup, if that
> > > > > is possible),
> > > >
> > > > This, I've had to do that for a number of files already for the limited
> > > > vmlinux.o passes we needed for noinstr validation.
> > >
> > > Getting rid of OBJECT_FILES_NON_STANDARD is indeed the end goal, though
> > > I'm not sure how practical that will be for some of the weirder edge
> > > case.
> > >
> > > On a related note, I have some old crypto cleanups which need dusting
> > > off.
> >
> > Building allyesconfig with this series and LTO enabled, I still see
> > the following objtool warnings for vmlinux.o, grouped by source file:
> >
> > arch/x86/entry/entry_64.S:
> > __switch_to_asm()+0x0: undefined stack state
> > .entry.text+0xffd: sibling call from callable instruction with
> > modified stack frame
> > .entry.text+0x48: stack state mismatch: cfa1=7-8 cfa2=-1+0
>
> Not sure what this one's about, there's no OBJECT_FILES_NON_STANDARD?

Correct, because with LTO, we won't have an ELF binary to process
until we compile everything into vmlinux.o, and at that point we can
no longer skip individual object files.

The sibling call warning is in
swapgs_restore_regs_and_return_to_usermode and the stack state
mismatch in entry_SYSCALL_64_after_hwframe.

> > arch/x86/entry/entry_64_compat.S:
> > .entry.text+0x1754: unsupported instruction in callable function

This comes from a sysretl instruction in entry_SYSCALL_compat.

> > .entry.text+0x1634: redundant CLD
> > .entry.text+0x15fd: stack state mismatch: cfa1=7-8 cfa2=-1+0
> > .entry.text+0x168c: stack state mismatch: cfa1=7-8 cfa2=-1+0
>
> Ditto.

These are all from entry_SYSENTER_compat_after_hwframe.

> > arch/x86/kernel/head_64.S:
> > .head.text+0xfb: unsupported instruction in callable function
>
> Ditto.

This is lretq in secondary_startup_64_no_verify.

> > arch/x86/crypto/camellia-aesni-avx2-asm_64.S:
> > camellia_cbc_dec_32way()+0xb3: stack state mismatch: cfa1=7+520 cfa2=7+8
> > camellia_ctr_32way()+0x1a: stack state mismatch: cfa1=7+520 cfa2=7+8
>
> I can clean off my patches for all the crypto warnings.

Great, sounds good.

> > arch/x86/lib/retpoline.S:
> > __x86_retpoline_rdi()+0x10: return with modified stack frame
> > __x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=7+8
> > __x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=-1+0
>
> Is this with upstream?  I thought we fixed that with
> UNWIND_HINT_RET_OFFSET.

Yes, and the UNWIND_HINT_RET_OFFSET is there.

> > Josh, Peter, any thoughts on what would be the preferred way to fix
> > these, or how to tell objtool to ignore this code?
>
> One way or another, the patches need to be free of warnings before
> getting merged.  I can help, though I'm traveling and only have limited
> bandwidth for at least the rest of the month.
>
> Ideally we'd want to have objtool understand everything, with no
> whitelisting, but some cases (e.g. suspend code) can be tricky.
>
> I wouldn't be opposed to embedding the whitelist in the binary, in a
> discardable section.  It should be relatively easy, but as I mentioned I
> may or may not have time to work on it for a bit.  I'm working half
> days, and now the ocean beckons from the window of my camper.

Something similar to STACK_FRAME_NON_STANDARD()? Using that seems to
result in "BUG: why am I validating an ignored function?" warnings, so
I suspect some additional changes are needed.

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-20 19:24             ` Sami Tolvanen
@ 2020-10-21  8:56               ` Peter Zijlstra
  2020-10-21  9:08                 ` Peter Zijlstra
                                   ` (3 more replies)
  0 siblings, 4 replies; 73+ messages in thread
From: Peter Zijlstra @ 2020-10-21  8:56 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Josh Poimboeuf, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Tue, Oct 20, 2020 at 12:24:37PM -0700, Sami Tolvanen wrote:
> > > Building allyesconfig with this series and LTO enabled, I still see
> > > the following objtool warnings for vmlinux.o, grouped by source file:
> > >
> > > arch/x86/entry/entry_64.S:
> > > __switch_to_asm()+0x0: undefined stack state
> > > .entry.text+0xffd: sibling call from callable instruction with
> > > modified stack frame
> > > .entry.text+0x48: stack state mismatch: cfa1=7-8 cfa2=-1+0
> >
> > Not sure what this one's about, there's no OBJECT_FILES_NON_STANDARD?
> 
> Correct, because with LTO, we won't have an ELF binary to process
> until we compile everything into vmlinux.o, and at that point we can
> no longer skip individual object files.

I think what Josh was trying to say is; this file is subject to objtool
on a normal build and does not generate warnings. So why would it
generate warnings when subject to objtool as result of a vmlinux run
(due to LTO or otherwise).

In fact, when I build a x86_64-defconfig and then run:

  $ objtool check -barf defconfig-build/vmlinux.o

I do not see these in particular, although I do see a lot of:

  "sibling call from callable instruction with modified stack frame"
  "falls through to next function"

that did not show up in the individual objtool runs during the build.

The "falls through to next function" seems to be limited to things like:

  warning: objtool: setup_vq() falls through to next function setup_vq.cold()
  warning: objtool: e1000_xmit_frame() falls through to next function e1000_xmit_frame.cold()

So something's weird with the .cold thing on vmlinux.o runs.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-21  8:56               ` Peter Zijlstra
@ 2020-10-21  9:08                 ` Peter Zijlstra
  2020-10-21  9:32                 ` Peter Zijlstra
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 73+ messages in thread
From: Peter Zijlstra @ 2020-10-21  9:08 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Josh Poimboeuf, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Wed, Oct 21, 2020 at 10:56:06AM +0200, Peter Zijlstra wrote:

> The "falls through to next function" seems to be limited to things like:
> 
>   warning: objtool: setup_vq() falls through to next function setup_vq.cold()
>   warning: objtool: e1000_xmit_frame() falls through to next function e1000_xmit_frame.cold()
> 
> So something's weird with the .cold thing on vmlinux.o runs.

Shiny, check this:

$ nm defconfig-build/vmlinux.o | grep setup_vq
00000000004d33a0 t setup_vq
00000000004d4c20 t setup_vq
000000000001edcc t setup_vq.cold
000000000001ee31 t setup_vq.cold
00000000004d3dc0 t vp_setup_vq

$ nm defconfig-build/vmlinux.o | grep e1000_xmit_frame
0000000000741490 t e1000_xmit_frame
0000000000763620 t e1000_xmit_frame
000000000002f579 t e1000_xmit_frame.cold
0000000000032b6e t e1000_xmit_frame.cold

$ nm defconfig-build/vmlinux.o | grep e1000_diag_test
000000000074c220 t e1000_diag_test
000000000075eb70 t e1000_diag_test
000000000002fc2a t e1000_diag_test.cold
0000000000030880 t e1000_diag_test.cold

I guess objtool goes sideways when there's multiple symbols with the
same name in a single object file. This obvously never happens on single
TU .o files.

Not sure what to do about that.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-21  8:56               ` Peter Zijlstra
  2020-10-21  9:08                 ` Peter Zijlstra
@ 2020-10-21  9:32                 ` Peter Zijlstra
  2020-10-21 21:27                   ` Josh Poimboeuf
  2020-10-21 15:01                 ` Sami Tolvanen
  2020-10-22  0:22                 ` Sami Tolvanen
  3 siblings, 1 reply; 73+ messages in thread
From: Peter Zijlstra @ 2020-10-21  9:32 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Josh Poimboeuf, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Wed, Oct 21, 2020 at 10:56:06AM +0200, Peter Zijlstra wrote:

> I do not see these in particular, although I do see a lot of:
> 
>   "sibling call from callable instruction with modified stack frame"

defconfig-build/vmlinux.o: warning: objtool: msr_write()+0x10a: sibling call from callable instruction with modified stack frame
defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x99: (branch)
defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x3e: (branch)
defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x0: <=== (sym)

$ nm defconfig-build/vmlinux.o | grep msr_write
0000000000043250 t msr_write
00000000004289c0 T msr_write
0000000000003056 t msr_write.cold

Below 'fixes' it. So this is also caused by duplicate symbols.

---
diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index 3bd905e10ee2..e36331f8f217 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -48,17 +48,6 @@ int msr_read(u32 msr, struct msr *m)
 	return err;
 }
 
-/**
- * Write an MSR with error handling
- *
- * @msr: MSR to write
- * @m: value to write
- */
-int msr_write(u32 msr, struct msr *m)
-{
-	return wrmsrl_safe(msr, m->q);
-}
-
 static inline int __flip_bit(u32 msr, u8 bit, bool set)
 {
 	struct msr m, m1;
@@ -80,7 +69,7 @@ static inline int __flip_bit(u32 msr, u8 bit, bool set)
 	if (m1.q == m.q)
 		return 0;
 
-	err = msr_write(msr, &m1);
+	err = wrmsr_safe(msr, m1.q);
 	if (err)
 		return err;
 

^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-20 18:52           ` Josh Poimboeuf
  2020-10-20 19:24             ` Sami Tolvanen
@ 2020-10-21  9:51             ` Peter Zijlstra
  2020-10-21 18:30               ` Josh Poimboeuf
  1 sibling, 1 reply; 73+ messages in thread
From: Peter Zijlstra @ 2020-10-21  9:51 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Sami Tolvanen, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Tue, Oct 20, 2020 at 01:52:17PM -0500, Josh Poimboeuf wrote:
> > arch/x86/lib/retpoline.S:
> > __x86_retpoline_rdi()+0x10: return with modified stack frame
> > __x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=7+8
> > __x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=-1+0
> 
> Is this with upstream?  I thought we fixed that with
> UNWIND_HINT_RET_OFFSET.

I can't reproduce this one either; but I do get different warnings:

gcc (Debian 10.2.0-13) 10.2.0, x86_64-defconfig:

defconfig-build/vmlinux.o: warning: objtool: __x86_indirect_thunk_rax() falls through to next function __x86_retpoline_rax()
defconfig-build/vmlinux.o: warning: objtool:   .altinstr_replacement+0x1063: (branch)
defconfig-build/vmlinux.o: warning: objtool:   __x86_indirect_thunk_rax()+0x0: (alt)
defconfig-build/vmlinux.o: warning: objtool:   __x86_indirect_thunk_rax()+0x0: <=== (sym)

(for every single register, not just rax)

Which is daft as well, because the retpoline.o run is clean. It also
doesn't make sense because __x86_retpoline_rax isn't in fact STT_FUNC,
so WTH ?!

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-21  8:56               ` Peter Zijlstra
  2020-10-21  9:08                 ` Peter Zijlstra
  2020-10-21  9:32                 ` Peter Zijlstra
@ 2020-10-21 15:01                 ` Sami Tolvanen
  2020-10-22  0:22                 ` Sami Tolvanen
  3 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-21 15:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Wed, Oct 21, 2020 at 1:56 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Oct 20, 2020 at 12:24:37PM -0700, Sami Tolvanen wrote:
> > > > Building allyesconfig with this series and LTO enabled, I still see
> > > > the following objtool warnings for vmlinux.o, grouped by source file:
> > > >
> > > > arch/x86/entry/entry_64.S:
> > > > __switch_to_asm()+0x0: undefined stack state
> > > > .entry.text+0xffd: sibling call from callable instruction with
> > > > modified stack frame
> > > > .entry.text+0x48: stack state mismatch: cfa1=7-8 cfa2=-1+0
> > >
> > > Not sure what this one's about, there's no OBJECT_FILES_NON_STANDARD?
> >
> > Correct, because with LTO, we won't have an ELF binary to process
> > until we compile everything into vmlinux.o, and at that point we can
> > no longer skip individual object files.
>
> I think what Josh was trying to say is; this file is subject to objtool
> on a normal build and does not generate warnings. So why would it
> generate warnings when subject to objtool as result of a vmlinux run
> (due to LTO or otherwise).

Ah, right. It also doesn't generate warnings when I build defconfig
with LTO, so clearly something confuses objtool here.

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-21  9:51             ` Peter Zijlstra
@ 2020-10-21 18:30               ` Josh Poimboeuf
  0 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2020-10-21 18:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sami Tolvanen, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Wed, Oct 21, 2020 at 11:51:33AM +0200, Peter Zijlstra wrote:
> On Tue, Oct 20, 2020 at 01:52:17PM -0500, Josh Poimboeuf wrote:
> > > arch/x86/lib/retpoline.S:
> > > __x86_retpoline_rdi()+0x10: return with modified stack frame
> > > __x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=7+8
> > > __x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=-1+0
> > 
> > Is this with upstream?  I thought we fixed that with
> > UNWIND_HINT_RET_OFFSET.
> 
> I can't reproduce this one either; but I do get different warnings:
> 
> gcc (Debian 10.2.0-13) 10.2.0, x86_64-defconfig:
> 
> defconfig-build/vmlinux.o: warning: objtool: __x86_indirect_thunk_rax() falls through to next function __x86_retpoline_rax()
> defconfig-build/vmlinux.o: warning: objtool:   .altinstr_replacement+0x1063: (branch)
> defconfig-build/vmlinux.o: warning: objtool:   __x86_indirect_thunk_rax()+0x0: (alt)
> defconfig-build/vmlinux.o: warning: objtool:   __x86_indirect_thunk_rax()+0x0: <=== (sym)
> 
> (for every single register, not just rax)
> 
> Which is daft as well, because the retpoline.o run is clean. It also
> doesn't make sense because __x86_retpoline_rax isn't in fact STT_FUNC,
> so WTH ?!

It is STT_FUNC:

  SYM_FUNC_START_NOALIGN(__x86_retpoline_\reg)

  $ readelf -s vmlinux.o |grep __x86_retpoline_rax
  129749: 0000000000000005    17 FUNC    GLOBAL DEFAULT   39 __x86_retpoline_rax

-- 
Josh


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-21  9:32                 ` Peter Zijlstra
@ 2020-10-21 21:27                   ` Josh Poimboeuf
  2020-10-22  7:25                     ` Peter Zijlstra
  0 siblings, 1 reply; 73+ messages in thread
From: Josh Poimboeuf @ 2020-10-21 21:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sami Tolvanen, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Wed, Oct 21, 2020 at 11:32:13AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 21, 2020 at 10:56:06AM +0200, Peter Zijlstra wrote:
> 
> > I do not see these in particular, although I do see a lot of:
> > 
> >   "sibling call from callable instruction with modified stack frame"
> 
> defconfig-build/vmlinux.o: warning: objtool: msr_write()+0x10a: sibling call from callable instruction with modified stack frame
> defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x99: (branch)
> defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x3e: (branch)
> defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x0: <=== (sym)
> 
> $ nm defconfig-build/vmlinux.o | grep msr_write
> 0000000000043250 t msr_write
> 00000000004289c0 T msr_write
> 0000000000003056 t msr_write.cold
> 
> Below 'fixes' it. So this is also caused by duplicate symbols.

There's a new linker flag for renaming duplicates:

  https://sourceware.org/bugzilla/show_bug.cgi?id=26391

But I guess that doesn't help us now.

I don't have access to GCC 10 at the moment so I can't recreate it.
Does this fix it?

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 4e1d7460574b..aecdf25b2381 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -217,11 +217,14 @@ struct symbol *find_func_containing(struct section *sec, unsigned long offset)
 	return NULL;
 }
 
+#define for_each_possible_sym(elf, name, sym)				\
+	elf_hash_for_each_possible(elf->symbol_name_hash, sym, name_hash, str_hash(name))
+
 struct symbol *find_symbol_by_name(const struct elf *elf, const char *name)
 {
 	struct symbol *sym;
 
-	elf_hash_for_each_possible(elf->symbol_name_hash, sym, name_hash, str_hash(name))
+	for_each_possible_sym(elf, name, sym)
 		if (!strcmp(sym->name, name))
 			return sym;
 
@@ -432,6 +435,8 @@ static int read_symbols(struct elf *elf)
 		list_for_each_entry(sym, &sec->symbol_list, list) {
 			char pname[MAX_NAME_LEN + 1];
 			size_t pnamelen;
+			struct symbol *psym;
+
 			if (sym->type != STT_FUNC)
 				continue;
 
@@ -454,8 +459,14 @@ static int read_symbols(struct elf *elf)
 
 			strncpy(pname, sym->name, pnamelen);
 			pname[pnamelen] = '\0';
-			pfunc = find_symbol_by_name(elf, pname);
-
+			pfunc = NULL;
+			for_each_possible_sym(elf, pname, psym) {
+				if ((!psym->cfunc || psym->cfunc == psym) &&
+				    !strcmp(psym->name, pname)) {
+					pfunc = psym;
+					break;
+				}
+			}
 			if (!pfunc) {
 				WARN("%s(): can't find parent function",
 				     sym->name);


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-21  8:56               ` Peter Zijlstra
                                   ` (2 preceding siblings ...)
  2020-10-21 15:01                 ` Sami Tolvanen
@ 2020-10-22  0:22                 ` Sami Tolvanen
  2020-10-23 17:36                   ` Sami Tolvanen
  3 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-22  0:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Wed, Oct 21, 2020 at 1:56 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Oct 20, 2020 at 12:24:37PM -0700, Sami Tolvanen wrote:
> > > > Building allyesconfig with this series and LTO enabled, I still see
> > > > the following objtool warnings for vmlinux.o, grouped by source file:
> > > >
> > > > arch/x86/entry/entry_64.S:
> > > > __switch_to_asm()+0x0: undefined stack state
> > > > .entry.text+0xffd: sibling call from callable instruction with
> > > > modified stack frame
> > > > .entry.text+0x48: stack state mismatch: cfa1=7-8 cfa2=-1+0
> > >
> > > Not sure what this one's about, there's no OBJECT_FILES_NON_STANDARD?
> >
> > Correct, because with LTO, we won't have an ELF binary to process
> > until we compile everything into vmlinux.o, and at that point we can
> > no longer skip individual object files.
>
> I think what Josh was trying to say is; this file is subject to objtool
> on a normal build and does not generate warnings. So why would it
> generate warnings when subject to objtool as result of a vmlinux run
> (due to LTO or otherwise).
>
> In fact, when I build a x86_64-defconfig and then run:
>
>   $ objtool check -barf defconfig-build/vmlinux.o

Note that I'm passing also --vmlinux and --duplicate to objtool when
processing vmlinux.o, and this series has a patch to split noinstr
validation from --vmlinux, so that's skipped.

> I do not see these in particular, although I do see a lot of:
>
>   "sibling call from callable instruction with modified stack frame"
>   "falls through to next function"
>
> that did not show up in the individual objtool runs during the build.

I'm able to reproduce these warnings with gcc 9.3 + allyesconfig, with
KASAN and GCOV_KERNEL disabled, as they are not enabled in LTO builds
either. This is without the LTO series applied, so we also have the
retpoline warnings:

$ ./tools/objtool/objtool check -arfld vmlinux.o 2>&1 | grep -vE
'(sibling|instr)'
vmlinux.o: warning: objtool: wakeup_long64()+0x61: indirect jump found
in RETPOLINE build
vmlinux.o: warning: objtool: .text+0x826308a: indirect jump found in
RETPOLINE build
vmlinux.o: warning: objtool: .text+0x82630c5: indirect jump found in
RETPOLINE build
vmlinux.o: warning: objtool: .head.text+0x748: indirect jump found in
RETPOLINE build
vmlinux.o: warning: objtool:
set_bringup_idt_handler.constprop.0()+0x0: undefined stack state
vmlinux.o: warning: objtool: .entry.text+0x1634: redundant CLD
vmlinux.o: warning: objtool: camellia_cbc_dec_32way()+0xb3: stack
state mismatch: cfa1=7+520 cfa2=7+8
vmlinux.o: warning: objtool: camellia_ctr_32way()+0x1a: stack state
mismatch: cfa1=7+520 cfa2=7+8
vmlinux.o: warning: objtool: aesni_gcm_init_avx_gen2()+0x12:
unsupported stack pointer realignment
vmlinux.o: warning: objtool: aesni_gcm_enc_update_avx_gen2()+0x12:
unsupported stack pointer realignment
vmlinux.o: warning: objtool: aesni_gcm_dec_update_avx_gen2()+0x12:
unsupported stack pointer realignment
vmlinux.o: warning: objtool: aesni_gcm_finalize_avx_gen2()+0x12:
unsupported stack pointer realignment
vmlinux.o: warning: objtool: aesni_gcm_init_avx_gen4()+0x12:
unsupported stack pointer realignment
vmlinux.o: warning: objtool: aesni_gcm_enc_update_avx_gen4()+0x12:
unsupported stack pointer realignment
vmlinux.o: warning: objtool: aesni_gcm_dec_update_avx_gen4()+0x12:
unsupported stack pointer realignment
vmlinux.o: warning: objtool: aesni_gcm_finalize_avx_gen4()+0x12:
unsupported stack pointer realignment
vmlinux.o: warning: objtool: sha1_transform_avx2()+0xc: unsupported
stack pointer realignment
vmlinux.o: warning: objtool: sha1_ni_transform()+0x7: unsupported
stack pointer realignment
vmlinux.o: warning: objtool: sha256_transform_rorx()+0x13: unsupported
stack pointer realignment
vmlinux.o: warning: objtool: sha512_transform_ssse3()+0x14:
unsupported stack pointer realignment
vmlinux.o: warning: objtool: sha512_transform_avx()+0x14: unsupported
stack pointer realignment
vmlinux.o: warning: objtool: sha512_transform_rorx()+0x7: unsupported
stack pointer realignment
vmlinux.o: warning: objtool: __x86_retpoline_rdi()+0x10: return with
modified stack frame
vmlinux.o: warning: objtool: __x86_retpoline_rdi()+0x0: stack state
mismatch: cfa1=7+32 cfa2=7+8
vmlinux.o: warning: objtool: __x86_retpoline_rdi()+0x0: stack state
mismatch: cfa1=7+32 cfa2=-1+0
vmlinux.o: warning: objtool: reset_early_page_tables()+0x0: stack
state mismatch: cfa1=7+8 cfa2=-1+0
vmlinux.o: warning: objtool: .entry.text+0x48: stack state mismatch:
cfa1=7-8 cfa2=-1+0
vmlinux.o: warning: objtool: .entry.text+0x15fd: stack state mismatch:
cfa1=7-8 cfa2=-1+0
vmlinux.o: warning: objtool: .entry.text+0x168c: stack state mismatch:
cfa1=7-8 cfa2=-1+0

There are a couple of differences, like the first "undefined stack
state" warning pointing to set_bringup_idt_handler.constprop.0()
instead of __switch_to_asm(). I tried running this with --backtrace,
but objtool segfaults at the first .entry.text warning:

$ ./tools/objtool/objtool check -barfld vmlinux.o
...
vmlinux.o: warning: objtool:
set_bringup_idt_handler.constprop.0()+0x0: undefined stack state
vmlinux.o: warning: objtool:   xen_hypercall_set_trap_table()+0x0: <=== (sym)
...
vmlinux.o: warning: objtool: .entry.text+0xffd: sibling call from
callable instruction with modified stack frame
vmlinux.o: warning: objtool:   .entry.text+0xfcb: (branch)
Segmentation fault

Going back to the allyesconfig+LTO vmlinux.o, the "undefined stack
state" warning looks quite similar:

$ ./tools/objtool/objtool check -barlfd vmlinux.o
vmlinux.o: warning: objtool: __switch_to_asm()+0x0: undefined stack state
vmlinux.o: warning: objtool:   xen_hypercall_set_trap_table()+0x0: <=== (sym)
vmlinux.o: warning: objtool: .entry.text+0xffd: sibling call from
callable instruction with modified stack frame
vmlinux.o: warning: objtool:   .entry.text+0xfcb: (branch)
Segmentation fault

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-21 21:27                   ` Josh Poimboeuf
@ 2020-10-22  7:25                     ` Peter Zijlstra
  2020-10-23 17:48                       ` Sami Tolvanen
  0 siblings, 1 reply; 73+ messages in thread
From: Peter Zijlstra @ 2020-10-22  7:25 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Sami Tolvanen, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Wed, Oct 21, 2020 at 04:27:47PM -0500, Josh Poimboeuf wrote:
> On Wed, Oct 21, 2020 at 11:32:13AM +0200, Peter Zijlstra wrote:
> > On Wed, Oct 21, 2020 at 10:56:06AM +0200, Peter Zijlstra wrote:
> > 
> > > I do not see these in particular, although I do see a lot of:
> > > 
> > >   "sibling call from callable instruction with modified stack frame"
> > 
> > defconfig-build/vmlinux.o: warning: objtool: msr_write()+0x10a: sibling call from callable instruction with modified stack frame
> > defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x99: (branch)
> > defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x3e: (branch)
> > defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x0: <=== (sym)
> > 
> > $ nm defconfig-build/vmlinux.o | grep msr_write
> > 0000000000043250 t msr_write
> > 00000000004289c0 T msr_write
> > 0000000000003056 t msr_write.cold
> > 
> > Below 'fixes' it. So this is also caused by duplicate symbols.
> 
> There's a new linker flag for renaming duplicates:
> 
>   https://sourceware.org/bugzilla/show_bug.cgi?id=26391
> 
> But I guess that doesn't help us now.

Well, depends a bit if clang can do it; we only need this for LTO builds
for now.

> I don't have access to GCC 10 at the moment so I can't recreate it.
> Does this fix it?

Doesn't seem to do the trick :/ I'll try and have a poke later.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-22  0:22                 ` Sami Tolvanen
@ 2020-10-23 17:36                   ` Sami Tolvanen
  2020-11-09 23:11                     ` Sami Tolvanen
  0 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-23 17:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Wed, Oct 21, 2020 at 05:22:59PM -0700, Sami Tolvanen wrote:
> There are a couple of differences, like the first "undefined stack
> state" warning pointing to set_bringup_idt_handler.constprop.0()
> instead of __switch_to_asm(). I tried running this with --backtrace,
> but objtool segfaults at the first .entry.text warning:

Looks like it segfaults when calling BT_FUNC() for an instruction that
doesn't have a section (?). Applying this patch allows objtool to finish
with --backtrace:

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index c216dd4d662c..618b0c4f2890 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2604,7 +2604,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 				ret = validate_branch(file, func,
 						      insn->jump_dest, state);
 				if (ret) {
-					if (backtrace)
+					if (backtrace && insn->sec)
 						BT_FUNC("(branch)", insn);
 					return ret;
 				}


Running objtool -barfld on an allyesconfig+LTO vmlinux.o prints out the
following, ignoring the crypto warnings for now:

__switch_to_asm()+0x0: undefined stack state
  xen_hypercall_set_trap_table()+0x0: <=== (sym)
.entry.text+0xffd: sibling call from callable instruction with modified stack frame
  .entry.text+0xfcb: (branch)
  .entry.text+0xfb5: (alt)
  .entry.text+0xfb0: (alt)
  .entry.text+0xf78: (branch)
  .entry.text+0x9c: (branch)
  xen_syscall_target()+0x15: (branch)
  xen_syscall_target()+0x0: <=== (sym)
.entry.text+0x1754: unsupported instruction in callable function
  .entry.text+0x171d: (branch)
  .entry.text+0x1707: (alt)
  .entry.text+0x1701: (alt)
  xen_syscall32_target()+0x15: (branch)
  xen_syscall32_target()+0x0: <=== (sym)
.entry.text+0x1634: redundant CLD
do_suspend_lowlevel()+0x116: sibling call from callable instruction with modified stack frame
  do_suspend_lowlevel()+0x9a: (branch)
  do_suspend_lowlevel()+0x0: <=== (sym)
... [skipping crypto stack pointer alignment warnings] ...
__x86_retpoline_rdi()+0x10: return with modified stack frame
  __x86_retpoline_rdi()+0x0: (branch)
  .altinstr_replacement+0x13d: (branch)
  .text+0xaf4c7: (alt)
  .text+0xb03b0: (branch)
  .text+0xaf482: (branch)
  crc_pcl()+0x10: (branch)
  crc_pcl()+0x0: <=== (sym)
__x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=7+8
  .altinstr_replacement+0x20b: (branch)
  __x86_indirect_thunk_rdi()+0x0: (alt)
  __x86_indirect_thunk_rdi()+0x0: <=== (sym)
.head.text+0xfb: unsupported instruction in callable function
  .head.text+0x207: (branch)
  sev_es_play_dead()+0xff: (branch)
  sev_es_play_dead()+0xd2: (branch)
  sev_es_play_dead()+0xa8: (alt)
  sev_es_play_dead()+0x144: (branch)
  sev_es_play_dead()+0x10b: (branch)
  sev_es_play_dead()+0x1f: (branch)
  sev_es_play_dead()+0x0: <=== (sym)
__x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=-1+0
  .altinstr_replacement+0x107: (branch)
  .text+0x2885: (alt)
  .text+0x2860: <=== (hint)
.entry.text+0x48: stack state mismatch: cfa1=7-8 cfa2=-1+0
  .altinstr_replacement+0xffffffffffffffff: (branch)
  .entry.text+0x21: (alt)
  .entry.text+0x1c: (alt)
  .entry.text+0x10: <=== (hint)
.entry.text+0x15fd: stack state mismatch: cfa1=7-8 cfa2=-1+0
  .altinstr_replacement+0xffffffffffffffff: (branch)
  .entry.text+0x15dc: (alt)
  .entry.text+0x15d7: (alt)
  .entry.text+0x15d0: <=== (hint)
.entry.text+0x168c: stack state mismatch: cfa1=7-8 cfa2=-1+0
  .altinstr_replacement+0xffffffffffffffff: (branch)
  .entry.text+0x166b: (alt)
  .entry.text+0x1666: (alt)
  .entry.text+0x1660: <=== (hint)

Sami

^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-22  7:25                     ` Peter Zijlstra
@ 2020-10-23 17:48                       ` Sami Tolvanen
  2020-10-23 18:04                         ` Nick Desaulniers
  0 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-10-23 17:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Thu, Oct 22, 2020 at 09:25:53AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 21, 2020 at 04:27:47PM -0500, Josh Poimboeuf wrote:
> > On Wed, Oct 21, 2020 at 11:32:13AM +0200, Peter Zijlstra wrote:
> > > On Wed, Oct 21, 2020 at 10:56:06AM +0200, Peter Zijlstra wrote:
> > > 
> > > > I do not see these in particular, although I do see a lot of:
> > > > 
> > > >   "sibling call from callable instruction with modified stack frame"
> > > 
> > > defconfig-build/vmlinux.o: warning: objtool: msr_write()+0x10a: sibling call from callable instruction with modified stack frame
> > > defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x99: (branch)
> > > defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x3e: (branch)
> > > defconfig-build/vmlinux.o: warning: objtool:   msr_write()+0x0: <=== (sym)
> > > 
> > > $ nm defconfig-build/vmlinux.o | grep msr_write
> > > 0000000000043250 t msr_write
> > > 00000000004289c0 T msr_write
> > > 0000000000003056 t msr_write.cold
> > > 
> > > Below 'fixes' it. So this is also caused by duplicate symbols.
> > 
> > There's a new linker flag for renaming duplicates:
> > 
> >   https://sourceware.org/bugzilla/show_bug.cgi?id=26391
> > 
> > But I guess that doesn't help us now.
> 
> Well, depends a bit if clang can do it; we only need this for LTO builds
> for now.

LLD doesn't seem to support -z unique-symbol.

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-23 17:48                       ` Sami Tolvanen
@ 2020-10-23 18:04                         ` Nick Desaulniers
  0 siblings, 0 replies; 73+ messages in thread
From: Nick Desaulniers @ 2020-10-23 18:04 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Peter Zijlstra, Josh Poimboeuf, Jann Horn,
	the arch/x86 maintainers, Masahiro Yamada, Steven Rostedt,
	Will Deacon, Greg Kroah-Hartman, Paul E. McKenney, Kees Cook,
	clang-built-linux, Kernel Hardening, linux-arch, Linux ARM,
	linux-kbuild, kernel list, linux-pci

On Fri, Oct 23, 2020 at 10:48 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> On Thu, Oct 22, 2020 at 09:25:53AM +0200, Peter Zijlstra wrote:
> > > There's a new linker flag for renaming duplicates:
> > >
> > >   https://sourceware.org/bugzilla/show_bug.cgi?id=26391
> > >
> > > But I guess that doesn't help us now.
> >
> > Well, depends a bit if clang can do it; we only need this for LTO builds
> > for now.
>
> LLD doesn't seem to support -z unique-symbol.

https://github.com/ClangBuiltLinux/linux/issues/1184
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-10-23 17:36                   ` Sami Tolvanen
@ 2020-11-09 23:11                     ` Sami Tolvanen
  2020-11-10  2:29                       ` Josh Poimboeuf
  0 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-11-09 23:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Fri, Oct 23, 2020 at 10:36 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> On Wed, Oct 21, 2020 at 05:22:59PM -0700, Sami Tolvanen wrote:
> > There are a couple of differences, like the first "undefined stack
> > state" warning pointing to set_bringup_idt_handler.constprop.0()
> > instead of __switch_to_asm(). I tried running this with --backtrace,
> > but objtool segfaults at the first .entry.text warning:
>
> Looks like it segfaults when calling BT_FUNC() for an instruction that
> doesn't have a section (?). Applying this patch allows objtool to finish
> with --backtrace:
>
> diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> index c216dd4d662c..618b0c4f2890 100644
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -2604,7 +2604,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
>                                 ret = validate_branch(file, func,
>                                                       insn->jump_dest, state);
>                                 if (ret) {
> -                                       if (backtrace)
> +                                       if (backtrace && insn->sec)
>                                                 BT_FUNC("(branch)", insn);
>                                         return ret;
>                                 }
>
>
> Running objtool -barfld on an allyesconfig+LTO vmlinux.o prints out the
> following, ignoring the crypto warnings for now:

OK, I spent some time looking at these warnings and the configs needed
to reproduce them without building allyesconfig:

CONFIG_XEN

__switch_to_asm()+0x0: undefined stack state
  xen_hypercall_set_trap_table()+0x0: <=== (sym)

CONFIG_XEN_PV

.entry.text+0xffd: sibling call from callable instruction with
modified stack frame
  .entry.text+0xfcb: (branch)
  .entry.text+0xfb5: (alt)
  .entry.text+0xfb0: (alt)
  .entry.text+0xf78: (branch)
  .entry.text+0x9c: (branch)
  xen_syscall_target()+0x15: (branch)
  xen_syscall_target()+0x0: <=== (sym)
.entry.text+0x1754: unsupported instruction in callable function
  .entry.text+0x171d: (branch)
  .entry.text+0x1707: (alt)
  .entry.text+0x1701: (alt)
  xen_syscall32_target()+0x15: (branch)
  xen_syscall32_target()+0x0: <=== (sym)
.entry.text+0x1634: redundant CLD

Backtrace doesn’t print out anything useful for the “redundant CLD”
error, but it occurs when validate_branch is looking at
xen_sysenter_target.

do_suspend_lowlevel()+0x116: sibling call from callable instruction
with modified stack frame
  do_suspend_lowlevel()+0x9a: (branch)
  do_suspend_lowlevel()+0x0: <=== (sym)

.entry.text+0x48: stack state mismatch: cfa1=7-8 cfa2=-1+0
  .altinstr_replacement+0xffffffffffffffff: (branch)
  .entry.text+0x21: (alt)
  .entry.text+0x1c: (alt)
  .entry.text+0x10: <=== (hint)
.entry.text+0x15fd: stack state mismatch: cfa1=7-8 cfa2=-1+0
  .altinstr_replacement+0xffffffffffffffff: (branch)
  .entry.text+0x15dc: (alt)
  .entry.text+0x15d7: (alt)
  .entry.text+0x15d0: <=== (hint)
.entry.text+0x168c: stack state mismatch: cfa1=7-8 cfa2=-1+0
  .altinstr_replacement+0xffffffffffffffff: (branch)
  .entry.text+0x166b: (alt)
  .entry.text+0x1666: (alt)
  .entry.text+0x1660: <=== (hint)

It looks like the stack state mismatch warnings can be fixed by adding
unwind hints also to entry_SYSCALL_64_after_hwframe,
entry_SYSENTER_compat_after_hwframe, and
entry_SYSCALL_compat_after_hwframe. Does that sound correct?

CONFIG_AMD_MEM_ENCRYPT

.head.text+0xfb: unsupported instruction in callable function
  .head.text+0x207: (branch)
  sev_es_play_dead()+0xff: (branch)
  sev_es_play_dead()+0xd2: (branch)
  sev_es_play_dead()+0xa8: (alt)
  sev_es_play_dead()+0x144: (branch)
  sev_es_play_dead()+0x10b: (branch)
  sev_es_play_dead()+0x1f: (branch)
  sev_es_play_dead()+0x0: <=== (sym)

This happens because sev_es_play_dead calls start_cpu0. It always has,
but objtool hasn’t been able to follow the call when processing only
sev-es.o. Any thoughts on the preferred way to fix this one?

CONFIG_CRYPTO_CRC32C_INTEL

__x86_retpoline_rdi()+0x10: return with modified stack frame
  __x86_retpoline_rdi()+0x0: (branch)
  .altinstr_replacement+0x147: (branch)
  .text+0xaf4c7: (alt)
  .text+0xb03b0: (branch)
  .text+0xaf482: (branch)
  crc_pcl()+0x10: (branch)
  crc_pcl()+0x0: <=== (sym)

__x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=7+8
  .altinstr_replacement+0x265: (branch)
  __x86_indirect_thunk_rdi()+0x0: (alt)
  __x86_indirect_thunk_rdi()+0x0: <=== (sym)

This is different from the warnings in the rest of the arch/x86/crypto
code. Do we need some kind of a hint before the JMP_NOSPEC in crc_pcl?

CONFIG_FUNCTION_TRACER

__x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=-1+0
  .altinstr_replacement+0x111: (branch)
  .text+0x28a5: (alt)
  .text+0x2880: <=== (hint)

This unwind hint is in return_to_handler. Removing it obviously stops
the warning and doesn’t seem to result in any other complaints from
objtool. Is this hint correct?

The remaining warnings are all “unsupported stack pointer realignment”
issues in the crypto code and can be reproduced with the following
configs:

CONFIG_CRYPTO_AES_NI_INTEL
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64
CONFIG_CRYPTO_SHA1_SSSE3
CONFIG_CRYPTO_SHA256_SSSE3
CONFIG_CRYPTO_SHA512_SSSE3

Josh, have you had a chance to look at the crypto patches you mentioned earlier?

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-09 23:11                     ` Sami Tolvanen
@ 2020-11-10  2:29                       ` Josh Poimboeuf
  2020-11-10  3:18                         ` Nick Desaulniers
                                           ` (2 more replies)
  0 siblings, 3 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2020-11-10  2:29 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Mon, Nov 09, 2020 at 03:11:41PM -0800, Sami Tolvanen wrote:
> On Fri, Oct 23, 2020 at 10:36 AM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > On Wed, Oct 21, 2020 at 05:22:59PM -0700, Sami Tolvanen wrote:
> > > There are a couple of differences, like the first "undefined stack
> > > state" warning pointing to set_bringup_idt_handler.constprop.0()
> > > instead of __switch_to_asm(). I tried running this with --backtrace,
> > > but objtool segfaults at the first .entry.text warning:
> >
> > Looks like it segfaults when calling BT_FUNC() for an instruction that
> > doesn't have a section (?). Applying this patch allows objtool to finish
> > with --backtrace:
> >
> > diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> > index c216dd4d662c..618b0c4f2890 100644
> > --- a/tools/objtool/check.c
> > +++ b/tools/objtool/check.c
> > @@ -2604,7 +2604,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
> >                                 ret = validate_branch(file, func,
> >                                                       insn->jump_dest, state);
> >                                 if (ret) {
> > -                                       if (backtrace)
> > +                                       if (backtrace && insn->sec)
> >                                                 BT_FUNC("(branch)", insn);
> >                                         return ret;
> >                                 }
> >
> >
> > Running objtool -barfld on an allyesconfig+LTO vmlinux.o prints out the
> > following, ignoring the crypto warnings for now:
> 
> OK, I spent some time looking at these warnings and the configs needed
> to reproduce them without building allyesconfig:
> 
> CONFIG_XEN
> 
> __switch_to_asm()+0x0: undefined stack state
>   xen_hypercall_set_trap_table()+0x0: <=== (sym)
> 
> CONFIG_XEN_PV
> 
> .entry.text+0xffd: sibling call from callable instruction with
> modified stack frame
>   .entry.text+0xfcb: (branch)
>   .entry.text+0xfb5: (alt)
>   .entry.text+0xfb0: (alt)
>   .entry.text+0xf78: (branch)
>   .entry.text+0x9c: (branch)
>   xen_syscall_target()+0x15: (branch)
>   xen_syscall_target()+0x0: <=== (sym)
> .entry.text+0x1754: unsupported instruction in callable function
>   .entry.text+0x171d: (branch)
>   .entry.text+0x1707: (alt)
>   .entry.text+0x1701: (alt)
>   xen_syscall32_target()+0x15: (branch)
>   xen_syscall32_target()+0x0: <=== (sym)
> .entry.text+0x1634: redundant CLD
> 
> Backtrace doesn’t print out anything useful for the “redundant CLD”
> error, but it occurs when validate_branch is looking at
> xen_sysenter_target.
> 
> do_suspend_lowlevel()+0x116: sibling call from callable instruction
> with modified stack frame
>   do_suspend_lowlevel()+0x9a: (branch)
>   do_suspend_lowlevel()+0x0: <=== (sym)
> 
> .entry.text+0x48: stack state mismatch: cfa1=7-8 cfa2=-1+0
>   .altinstr_replacement+0xffffffffffffffff: (branch)
>   .entry.text+0x21: (alt)
>   .entry.text+0x1c: (alt)
>   .entry.text+0x10: <=== (hint)
> .entry.text+0x15fd: stack state mismatch: cfa1=7-8 cfa2=-1+0
>   .altinstr_replacement+0xffffffffffffffff: (branch)
>   .entry.text+0x15dc: (alt)
>   .entry.text+0x15d7: (alt)
>   .entry.text+0x15d0: <=== (hint)
> .entry.text+0x168c: stack state mismatch: cfa1=7-8 cfa2=-1+0
>   .altinstr_replacement+0xffffffffffffffff: (branch)
>   .entry.text+0x166b: (alt)
>   .entry.text+0x1666: (alt)
>   .entry.text+0x1660: <=== (hint)

I can't make much sense of most of these warnings.  Disassembly would
help.

(Also, something like the patch below should help objtool show more
symbol names.)

> It looks like the stack state mismatch warnings can be fixed by adding
> unwind hints also to entry_SYSCALL_64_after_hwframe,
> entry_SYSENTER_compat_after_hwframe, and
> entry_SYSCALL_compat_after_hwframe. Does that sound correct?

No, those code paths should already have the hints they need, unless I'm
missing something.

> CONFIG_AMD_MEM_ENCRYPT
> 
> .head.text+0xfb: unsupported instruction in callable function
>   .head.text+0x207: (branch)
>   sev_es_play_dead()+0xff: (branch)
>   sev_es_play_dead()+0xd2: (branch)
>   sev_es_play_dead()+0xa8: (alt)
>   sev_es_play_dead()+0x144: (branch)
>   sev_es_play_dead()+0x10b: (branch)
>   sev_es_play_dead()+0x1f: (branch)
>   sev_es_play_dead()+0x0: <=== (sym)
> 
> This happens because sev_es_play_dead calls start_cpu0. It always has,
> but objtool hasn’t been able to follow the call when processing only
> sev-es.o. Any thoughts on the preferred way to fix this one?

Objtool isn't supposed to traverse through call instructions like that.
Is LTO inlining the call or something?

> CONFIG_CRYPTO_CRC32C_INTEL
> 
> __x86_retpoline_rdi()+0x10: return with modified stack frame
>   __x86_retpoline_rdi()+0x0: (branch)
>   .altinstr_replacement+0x147: (branch)
>   .text+0xaf4c7: (alt)
>   .text+0xb03b0: (branch)
>   .text+0xaf482: (branch)
>   crc_pcl()+0x10: (branch)
>   crc_pcl()+0x0: <=== (sym)
> 
> __x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=7+8
>   .altinstr_replacement+0x265: (branch)
>   __x86_indirect_thunk_rdi()+0x0: (alt)
>   __x86_indirect_thunk_rdi()+0x0: <=== (sym)
> 
> This is different from the warnings in the rest of the arch/x86/crypto
> code. Do we need some kind of a hint before the JMP_NOSPEC in crc_pcl?

I'll need to look more into that one.

> CONFIG_FUNCTION_TRACER
> 
> __x86_retpoline_rdi()+0x0: stack state mismatch: cfa1=7+32 cfa2=-1+0
>   .altinstr_replacement+0x111: (branch)
>   .text+0x28a5: (alt)
>   .text+0x2880: <=== (hint)
> 
> This unwind hint is in return_to_handler. Removing it obviously stops
> the warning and doesn’t seem to result in any other complaints from
> objtool. Is this hint correct?

The hint is supposed to be there.  I don't understand this one either.
Did it inline the call to ftrace_return_to_handler()?

> The remaining warnings are all “unsupported stack pointer realignment”
> issues in the crypto code and can be reproduced with the following
> configs:
> 
> CONFIG_CRYPTO_AES_NI_INTEL
> CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64
> CONFIG_CRYPTO_SHA1_SSSE3
> CONFIG_CRYPTO_SHA256_SSSE3
> CONFIG_CRYPTO_SHA512_SSSE3
> 
> Josh, have you had a chance to look at the crypto patches you mentioned earlier?

I've been traveling for several weeks, but now my work schedule is
getting more normal, so I'll hopefully be able to spend time on this.

How would I recreate all these warnings?

Is it

  https://github.com/samitolvanen/linux.git lto-v6

plus a certain version of clang?

Also, any details on how to build clang would be appreciated, it's been
a while since I tried.


Here's the patch for hopefully making the warnings more helpful:


diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index c6ab44543c92..e5f5cb107664 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2432,6 +2432,9 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 	sec = insn->sec;
 
 	while (1) {
+
+		if (insn->offset == 0x48)
+			WARN_FUNC("yo", sec, insn->offset);
 		next_insn = next_insn_same_sec(file, insn);
 
 		if (file->c_file && func && insn->func && func != insn->func->pfunc) {
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 4e1d7460574b..ced7e4754cba 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -217,6 +217,21 @@ struct symbol *find_func_containing(struct section *sec, unsigned long offset)
 	return NULL;
 }
 
+struct symbol *find_symbol_preceding(struct section *sec, unsigned long offset)
+{
+	struct symbol *sym;
+
+	/*
+	 * This is slow, but used for warning messages.
+	 */
+	while (1) {
+		sym = find_symbol_by_offset(sec, offset);
+		if (sym || !offset)
+			return sym;
+		offset--;
+	}
+}
+
 struct symbol *find_symbol_by_name(const struct elf *elf, const char *name)
 {
 	struct symbol *sym;
diff --git a/tools/objtool/elf.h b/tools/objtool/elf.h
index 807f8c670097..841902ed381e 100644
--- a/tools/objtool/elf.h
+++ b/tools/objtool/elf.h
@@ -136,10 +136,11 @@ struct symbol *find_func_by_offset(struct section *sec, unsigned long offset);
 struct symbol *find_symbol_by_offset(struct section *sec, unsigned long offset);
 struct symbol *find_symbol_by_name(const struct elf *elf, const char *name);
 struct symbol *find_symbol_containing(const struct section *sec, unsigned long offset);
+struct symbol *find_func_containing(struct section *sec, unsigned long offset);
+struct symbol *find_symbol_preceding(struct section *sec, unsigned long offset);
 struct reloc *find_reloc_by_dest(const struct elf *elf, struct section *sec, unsigned long offset);
 struct reloc *find_reloc_by_dest_range(const struct elf *elf, struct section *sec,
 				     unsigned long offset, unsigned int len);
-struct symbol *find_func_containing(struct section *sec, unsigned long offset);
 int elf_rebuild_reloc_section(struct elf *elf, struct section *sec);
 
 #define for_each_sec(file, sec)						\
diff --git a/tools/objtool/warn.h b/tools/objtool/warn.h
index 7799f60de80a..33da0f2ed9d5 100644
--- a/tools/objtool/warn.h
+++ b/tools/objtool/warn.h
@@ -22,6 +22,8 @@ static inline char *offstr(struct section *sec, unsigned long offset)
 	unsigned long name_off;
 
 	func = find_func_containing(sec, offset);
+	if (!func)
+		func = find_symbol_preceding(sec, offset);
 	if (func) {
 		name = func->name;
 		name_off = offset - func->offset;


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-10  2:29                       ` Josh Poimboeuf
@ 2020-11-10  3:18                         ` Nick Desaulniers
  2020-11-10  4:48                         ` Sami Tolvanen
  2020-11-10 17:46                         ` Josh Poimboeuf
  2 siblings, 0 replies; 73+ messages in thread
From: Nick Desaulniers @ 2020-11-10  3:18 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Sami Tolvanen, Peter Zijlstra, Jann Horn,
	the arch/x86 maintainers, Masahiro Yamada, Steven Rostedt,
	Will Deacon, Greg Kroah-Hartman, Paul E. McKenney, Kees Cook,
	clang-built-linux, Kernel Hardening, linux-arch, Linux ARM,
	linux-kbuild, kernel list, linux-pci

On Mon, Nov 9, 2020 at 6:29 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> Also, any details on how to build clang would be appreciated, it's been
> a while since I tried.

$ git clone https://github.com/llvm/llvm-project.git --depth 1
$ mkdir llvm-project/llvm/build
$ cd !$
$ cmake .. -DCMAKE_BUILD_TYPE=Release -G Ninja
-DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt"
$ ninja
$ export PATH=$(pwd)/bin:$PATH
$ clang --version
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-10  2:29                       ` Josh Poimboeuf
  2020-11-10  3:18                         ` Nick Desaulniers
@ 2020-11-10  4:48                         ` Sami Tolvanen
  2020-11-10 16:11                           ` Josh Poimboeuf
  2020-11-10 17:46                         ` Josh Poimboeuf
  2 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-11-10  4:48 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Mon, Nov 9, 2020 at 6:29 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> How would I recreate all these warnings?

You can reproduce all of these using a normal gcc build without any of
the LTO patches by running objtool check -arfld vmlinux.o. However,
with gcc you'll see even more warnings due to duplicate symbol names,
as Peter pointed out elsewhere in the thread, so I looked at only the
warnings that objtool also prints with LTO.

Note that the LTO series contains a patch to split noinstr validation
from --vmlinux, as we need to run objtool here even if
CONFIG_VMLINUX_VALIDATION isn't selected, so I have not looked at the
noinstr warnings. The latest LTO tree is available here:

https://github.com/samitolvanen/linux/commits/clang-lto

> Here's the patch for hopefully making the warnings more helpful:

Thanks, I'll give it a try.

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-10  4:48                         ` Sami Tolvanen
@ 2020-11-10 16:11                           ` Josh Poimboeuf
  0 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2020-11-10 16:11 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Mon, Nov 09, 2020 at 08:48:01PM -0800, Sami Tolvanen wrote:
> On Mon, Nov 9, 2020 at 6:29 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > How would I recreate all these warnings?
> 
> You can reproduce all of these using a normal gcc build without any of
> the LTO patches by running objtool check -arfld vmlinux.o. However,
> with gcc you'll see even more warnings due to duplicate symbol names,
> as Peter pointed out elsewhere in the thread, so I looked at only the
> warnings that objtool also prints with LTO.
> 
> Note that the LTO series contains a patch to split noinstr validation
> from --vmlinux, as we need to run objtool here even if
> CONFIG_VMLINUX_VALIDATION isn't selected, so I have not looked at the
> noinstr warnings. The latest LTO tree is available here:
> 
> https://github.com/samitolvanen/linux/commits/clang-lto
> 
> > Here's the patch for hopefully making the warnings more helpful:
> 
> Thanks, I'll give it a try.

Here's the version without the nonsensical debug warning :-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 4e1d7460574b..ced7e4754cba 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -217,6 +217,21 @@ struct symbol *find_func_containing(struct section *sec, unsigned long offset)
 	return NULL;
 }
 
+struct symbol *find_symbol_preceding(struct section *sec, unsigned long offset)
+{
+	struct symbol *sym;
+
+	/*
+	 * This is slow, but used for warning messages.
+	 */
+	while (1) {
+		sym = find_symbol_by_offset(sec, offset);
+		if (sym || !offset)
+			return sym;
+		offset--;
+	}
+}
+
 struct symbol *find_symbol_by_name(const struct elf *elf, const char *name)
 {
 	struct symbol *sym;
diff --git a/tools/objtool/elf.h b/tools/objtool/elf.h
index 807f8c670097..841902ed381e 100644
--- a/tools/objtool/elf.h
+++ b/tools/objtool/elf.h
@@ -136,10 +136,11 @@ struct symbol *find_func_by_offset(struct section *sec, unsigned long offset);
 struct symbol *find_symbol_by_offset(struct section *sec, unsigned long offset);
 struct symbol *find_symbol_by_name(const struct elf *elf, const char *name);
 struct symbol *find_symbol_containing(const struct section *sec, unsigned long offset);
+struct symbol *find_func_containing(struct section *sec, unsigned long offset);
+struct symbol *find_symbol_preceding(struct section *sec, unsigned long offset);
 struct reloc *find_reloc_by_dest(const struct elf *elf, struct section *sec, unsigned long offset);
 struct reloc *find_reloc_by_dest_range(const struct elf *elf, struct section *sec,
 				     unsigned long offset, unsigned int len);
-struct symbol *find_func_containing(struct section *sec, unsigned long offset);
 int elf_rebuild_reloc_section(struct elf *elf, struct section *sec);
 
 #define for_each_sec(file, sec)						\
diff --git a/tools/objtool/warn.h b/tools/objtool/warn.h
index 7799f60de80a..33da0f2ed9d5 100644
--- a/tools/objtool/warn.h
+++ b/tools/objtool/warn.h
@@ -22,6 +22,8 @@ static inline char *offstr(struct section *sec, unsigned long offset)
 	unsigned long name_off;
 
 	func = find_func_containing(sec, offset);
+	if (!func)
+		func = find_symbol_preceding(sec, offset);
 	if (func) {
 		name = func->name;
 		name_off = offset - func->offset;


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-10  2:29                       ` Josh Poimboeuf
  2020-11-10  3:18                         ` Nick Desaulniers
  2020-11-10  4:48                         ` Sami Tolvanen
@ 2020-11-10 17:46                         ` Josh Poimboeuf
  2020-11-10 18:59                           ` Sami Tolvanen
  2 siblings, 1 reply; 73+ messages in thread
From: Josh Poimboeuf @ 2020-11-10 17:46 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Mon, Nov 09, 2020 at 08:29:24PM -0600, Josh Poimboeuf wrote:
> On Mon, Nov 09, 2020 at 03:11:41PM -0800, Sami Tolvanen wrote:
> > CONFIG_XEN
> > 
> > __switch_to_asm()+0x0: undefined stack state
> >   xen_hypercall_set_trap_table()+0x0: <=== (sym)

With your branch + GCC 9 I can recreate all the warnings except this
one.

Will do some digging on the others...

-- 
Josh


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-10 17:46                         ` Josh Poimboeuf
@ 2020-11-10 18:59                           ` Sami Tolvanen
  2020-11-13 19:54                             ` Josh Poimboeuf
  0 siblings, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-11-10 18:59 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Tue, Nov 10, 2020 at 9:46 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Mon, Nov 09, 2020 at 08:29:24PM -0600, Josh Poimboeuf wrote:
> > On Mon, Nov 09, 2020 at 03:11:41PM -0800, Sami Tolvanen wrote:
> > > CONFIG_XEN
> > >
> > > __switch_to_asm()+0x0: undefined stack state
> > >   xen_hypercall_set_trap_table()+0x0: <=== (sym)
>
> With your branch + GCC 9 I can recreate all the warnings except this
> one.

In a gcc build this warning is replaced with a different one:

vmlinux.o: warning: objtool: __startup_secondary_64()+0x7: return with
modified stack frame

This just seems to depend on which function is placed right after the
code in xen-head.S. With gcc, the disassembly looks like this:

0000000000000000 <asm_cpu_bringup_and_idle>:
       0:       e8 00 00 00 00          callq  5 <asm_cpu_bringup_and_idle+0x5>
                        1: R_X86_64_PLT32       cpu_bringup_and_idle-0x4
       5:       e9 f6 0f 00 00          jmpq   1000
<xen_hypercall_set_trap_table>
...
0000000000001000 <xen_hypercall_set_trap_table>:
        ...
...
0000000000002000 <__startup_secondary_64>:

With Clang+LTO, we end up with __switch_to_asm here instead of
__startup_secondary_64.

> Will do some digging on the others...

Thanks!

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-10 18:59                           ` Sami Tolvanen
@ 2020-11-13 19:54                             ` Josh Poimboeuf
  2020-11-13 20:24                               ` Sami Tolvanen
  0 siblings, 1 reply; 73+ messages in thread
From: Josh Poimboeuf @ 2020-11-13 19:54 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Tue, Nov 10, 2020 at 10:59:55AM -0800, Sami Tolvanen wrote:
> On Tue, Nov 10, 2020 at 9:46 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > On Mon, Nov 09, 2020 at 08:29:24PM -0600, Josh Poimboeuf wrote:
> > > On Mon, Nov 09, 2020 at 03:11:41PM -0800, Sami Tolvanen wrote:
> > > > CONFIG_XEN
> > > >
> > > > __switch_to_asm()+0x0: undefined stack state
> > > >   xen_hypercall_set_trap_table()+0x0: <=== (sym)
> >
> > With your branch + GCC 9 I can recreate all the warnings except this
> > one.
> 
> In a gcc build this warning is replaced with a different one:
> 
> vmlinux.o: warning: objtool: __startup_secondary_64()+0x7: return with
> modified stack frame
> 
> This just seems to depend on which function is placed right after the
> code in xen-head.S. With gcc, the disassembly looks like this:
> 
> 0000000000000000 <asm_cpu_bringup_and_idle>:
>        0:       e8 00 00 00 00          callq  5 <asm_cpu_bringup_and_idle+0x5>
>                         1: R_X86_64_PLT32       cpu_bringup_and_idle-0x4
>        5:       e9 f6 0f 00 00          jmpq   1000
> <xen_hypercall_set_trap_table>
> ...
> 0000000000001000 <xen_hypercall_set_trap_table>:
>         ...
> ...
> 0000000000002000 <__startup_secondary_64>:
> 
> With Clang+LTO, we end up with __switch_to_asm here instead of
> __startup_secondary_64.

I still don't see this warning for some reason.

Is it fixed by adding cpu_bringup_and_idle() to global_noreturns[] in
tools/objtool/check.c?

-- 
Josh


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-13 19:54                             ` Josh Poimboeuf
@ 2020-11-13 20:24                               ` Sami Tolvanen
  2020-11-13 20:52                                 ` Josh Poimboeuf
  2020-11-13 22:34                                 ` Josh Poimboeuf
  0 siblings, 2 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-11-13 20:24 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Fri, Nov 13, 2020 at 11:54 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Tue, Nov 10, 2020 at 10:59:55AM -0800, Sami Tolvanen wrote:
> > On Tue, Nov 10, 2020 at 9:46 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > >
> > > On Mon, Nov 09, 2020 at 08:29:24PM -0600, Josh Poimboeuf wrote:
> > > > On Mon, Nov 09, 2020 at 03:11:41PM -0800, Sami Tolvanen wrote:
> > > > > CONFIG_XEN
> > > > >
> > > > > __switch_to_asm()+0x0: undefined stack state
> > > > >   xen_hypercall_set_trap_table()+0x0: <=== (sym)
> > >
> > > With your branch + GCC 9 I can recreate all the warnings except this
> > > one.
> >
> > In a gcc build this warning is replaced with a different one:
> >
> > vmlinux.o: warning: objtool: __startup_secondary_64()+0x7: return with
> > modified stack frame
> >
> > This just seems to depend on which function is placed right after the
> > code in xen-head.S. With gcc, the disassembly looks like this:
> >
> > 0000000000000000 <asm_cpu_bringup_and_idle>:
> >        0:       e8 00 00 00 00          callq  5 <asm_cpu_bringup_and_idle+0x5>
> >                         1: R_X86_64_PLT32       cpu_bringup_and_idle-0x4
> >        5:       e9 f6 0f 00 00          jmpq   1000
> > <xen_hypercall_set_trap_table>
> > ...
> > 0000000000001000 <xen_hypercall_set_trap_table>:
> >         ...
> > ...
> > 0000000000002000 <__startup_secondary_64>:
> >
> > With Clang+LTO, we end up with __switch_to_asm here instead of
> > __startup_secondary_64.
>
> I still don't see this warning for some reason.

Do you have CONFIG_XEN enabled? I can reproduce this on ToT master as follows:

$ git rev-parse HEAD
585e5b17b92dead8a3aca4e3c9876fbca5f7e0ba
$ make defconfig && \
./scripts/config -e HYPERVISOR_GUEST -e PARAVIRT -e XEN && \
make olddefconfig && \
make -j110
...
$ ./tools/objtool/objtool check -arfld vmlinux.o 2>&1 | grep secondary
vmlinux.o: warning: objtool: __startup_secondary_64()+0x2: return with
modified stack frame

> Is it fixed by adding cpu_bringup_and_idle() to global_noreturns[] in
> tools/objtool/check.c?

No, that didn't fix the warning. Here's what I tested:

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index c6ab44543c92..f1f65f5688cf 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -156,6 +156,7 @@ static bool __dead_end_function(struct
objtool_file *file, struct symbol *func,
                "machine_real_restart",
                "rewind_stack_do_exit",
                "kunit_try_catch_throw",
+               "cpu_bringup_and_idle",
        };

        if (!func)

Sami

^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-13 20:24                               ` Sami Tolvanen
@ 2020-11-13 20:52                                 ` Josh Poimboeuf
  2020-11-13 22:34                                 ` Josh Poimboeuf
  1 sibling, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2020-11-13 20:52 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Fri, Nov 13, 2020 at 12:24:32PM -0800, Sami Tolvanen wrote:
> On Fri, Nov 13, 2020 at 11:54 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > On Tue, Nov 10, 2020 at 10:59:55AM -0800, Sami Tolvanen wrote:
> > > On Tue, Nov 10, 2020 at 9:46 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > > >
> > > > On Mon, Nov 09, 2020 at 08:29:24PM -0600, Josh Poimboeuf wrote:
> > > > > On Mon, Nov 09, 2020 at 03:11:41PM -0800, Sami Tolvanen wrote:
> > > > > > CONFIG_XEN
> > > > > >
> > > > > > __switch_to_asm()+0x0: undefined stack state
> > > > > >   xen_hypercall_set_trap_table()+0x0: <=== (sym)
> > > >
> > > > With your branch + GCC 9 I can recreate all the warnings except this
> > > > one.
> > >
> > > In a gcc build this warning is replaced with a different one:
> > >
> > > vmlinux.o: warning: objtool: __startup_secondary_64()+0x7: return with
> > > modified stack frame
> > >
> > > This just seems to depend on which function is placed right after the
> > > code in xen-head.S. With gcc, the disassembly looks like this:
> > >
> > > 0000000000000000 <asm_cpu_bringup_and_idle>:
> > >        0:       e8 00 00 00 00          callq  5 <asm_cpu_bringup_and_idle+0x5>
> > >                         1: R_X86_64_PLT32       cpu_bringup_and_idle-0x4
> > >        5:       e9 f6 0f 00 00          jmpq   1000
> > > <xen_hypercall_set_trap_table>
> > > ...
> > > 0000000000001000 <xen_hypercall_set_trap_table>:
> > >         ...
> > > ...
> > > 0000000000002000 <__startup_secondary_64>:
> > >
> > > With Clang+LTO, we end up with __switch_to_asm here instead of
> > > __startup_secondary_64.
> >
> > I still don't see this warning for some reason.
> 
> Do you have CONFIG_XEN enabled? I can reproduce this on ToT master as follows:
> 
> $ git rev-parse HEAD
> 585e5b17b92dead8a3aca4e3c9876fbca5f7e0ba
> $ make defconfig && \
> ./scripts/config -e HYPERVISOR_GUEST -e PARAVIRT -e XEN && \
> make olddefconfig && \
> make -j110
> ...
> $ ./tools/objtool/objtool check -arfld vmlinux.o 2>&1 | grep secondary
> vmlinux.o: warning: objtool: __startup_secondary_64()+0x2: return with
> modified stack frame

Ok, I see it now on Linus' tree.  I just didn't see it on your clang-lto
branch.

-- 
Josh


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-13 20:24                               ` Sami Tolvanen
  2020-11-13 20:52                                 ` Josh Poimboeuf
@ 2020-11-13 22:34                                 ` Josh Poimboeuf
  2020-11-13 22:54                                   ` Sami Tolvanen
  2020-11-13 23:31                                   ` Sami Tolvanen
  1 sibling, 2 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2020-11-13 22:34 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Fri, Nov 13, 2020 at 12:24:32PM -0800, Sami Tolvanen wrote:
> > I still don't see this warning for some reason.
> 
> Do you have CONFIG_XEN enabled? I can reproduce this on ToT master as follows:
> 
> $ git rev-parse HEAD
> 585e5b17b92dead8a3aca4e3c9876fbca5f7e0ba
> $ make defconfig && \
> ./scripts/config -e HYPERVISOR_GUEST -e PARAVIRT -e XEN && \
> make olddefconfig && \
> make -j110
> ...
> $ ./tools/objtool/objtool check -arfld vmlinux.o 2>&1 | grep secondary
> vmlinux.o: warning: objtool: __startup_secondary_64()+0x2: return with
> modified stack frame
> 
> > Is it fixed by adding cpu_bringup_and_idle() to global_noreturns[] in
> > tools/objtool/check.c?
> 
> No, that didn't fix the warning. Here's what I tested:

I think this fixes it:

From: Josh Poimboeuf <jpoimboe@redhat.com>
Subject: [PATCH] x86/xen: Fix objtool vmlinux.o validation of xen hypercalls

Objtool vmlinux.o validation is showing warnings like the following:

  # tools/objtool/objtool check -barfld vmlinux.o
  vmlinux.o: warning: objtool: __startup_secondary_64()+0x2: return with modified stack frame
  vmlinux.o: warning: objtool:   xen_hypercall_set_trap_table()+0x0: <=== (sym)

Objtool falls through all the empty hypercall text and gets confused
when it encounters the first real function afterwards.  The empty unwind
hints in the hypercalls aren't working for some reason.  Replace them
with a more straightforward use of STACK_FRAME_NON_STANDARD.

Reported-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/xen/xen-head.S | 9 ++++-----
 include/linux/objtool.h | 8 ++++++++
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 2d7c8f34f56c..3c538b1ff4a6 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -6,6 +6,7 @@
 
 #include <linux/elfnote.h>
 #include <linux/init.h>
+#include <linux/objtool.h>
 
 #include <asm/boot.h>
 #include <asm/asm.h>
@@ -67,14 +68,12 @@ SYM_CODE_END(asm_cpu_bringup_and_idle)
 .pushsection .text
 	.balign PAGE_SIZE
 SYM_CODE_START(hypercall_page)
-	.rept (PAGE_SIZE / 32)
-		UNWIND_HINT_EMPTY
-		.skip 32
-	.endr
+	.skip PAGE_SIZE
 
 #define HYPERCALL(n) \
 	.equ xen_hypercall_##n, hypercall_page + __HYPERVISOR_##n * 32; \
-	.type xen_hypercall_##n, @function; .size xen_hypercall_##n, 32
+	.type xen_hypercall_##n, @function; .size xen_hypercall_##n, 32; \
+	STACK_FRAME_NON_STANDARD xen_hypercall_##n
 #include <asm/xen-hypercalls.h>
 #undef HYPERCALL
 SYM_CODE_END(hypercall_page)
diff --git a/include/linux/objtool.h b/include/linux/objtool.h
index 577f51436cf9..746617265236 100644
--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -109,6 +109,12 @@ struct unwind_hint {
 	.popsection
 .endm
 
+.macro STACK_FRAME_NON_STANDARD func:req
+	.pushsection .discard.func_stack_frame_non_standard
+		.long \func - .
+	.popsection
+.endm
+
 #endif /* __ASSEMBLY__ */
 
 #else /* !CONFIG_STACK_VALIDATION */
@@ -123,6 +129,8 @@ struct unwind_hint {
 .macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
 .endm
 #endif
+.macro STACK_FRAME_NON_STANDARD func:req
+.endm
 
 #endif /* CONFIG_STACK_VALIDATION */
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-13 22:34                                 ` Josh Poimboeuf
@ 2020-11-13 22:54                                   ` Sami Tolvanen
  2020-11-13 22:56                                     ` Josh Poimboeuf
  2020-11-13 23:31                                   ` Sami Tolvanen
  1 sibling, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-11-13 22:54 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Fri, Nov 13, 2020 at 2:34 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Fri, Nov 13, 2020 at 12:24:32PM -0800, Sami Tolvanen wrote:
> > > I still don't see this warning for some reason.
> >
> > Do you have CONFIG_XEN enabled? I can reproduce this on ToT master as follows:
> >
> > $ git rev-parse HEAD
> > 585e5b17b92dead8a3aca4e3c9876fbca5f7e0ba
> > $ make defconfig && \
> > ./scripts/config -e HYPERVISOR_GUEST -e PARAVIRT -e XEN && \
> > make olddefconfig && \
> > make -j110
> > ...
> > $ ./tools/objtool/objtool check -arfld vmlinux.o 2>&1 | grep secondary
> > vmlinux.o: warning: objtool: __startup_secondary_64()+0x2: return with
> > modified stack frame
> >
> > > Is it fixed by adding cpu_bringup_and_idle() to global_noreturns[] in
> > > tools/objtool/check.c?
> >
> > No, that didn't fix the warning. Here's what I tested:
>
> I think this fixes it:
>
> From: Josh Poimboeuf <jpoimboe@redhat.com>
> Subject: [PATCH] x86/xen: Fix objtool vmlinux.o validation of xen hypercalls
>
> Objtool vmlinux.o validation is showing warnings like the following:
>
>   # tools/objtool/objtool check -barfld vmlinux.o
>   vmlinux.o: warning: objtool: __startup_secondary_64()+0x2: return with modified stack frame
>   vmlinux.o: warning: objtool:   xen_hypercall_set_trap_table()+0x0: <=== (sym)
>
> Objtool falls through all the empty hypercall text and gets confused
> when it encounters the first real function afterwards.  The empty unwind
> hints in the hypercalls aren't working for some reason.  Replace them
> with a more straightforward use of STACK_FRAME_NON_STANDARD.
>
> Reported-by: Sami Tolvanen <samitolvanen@google.com>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  arch/x86/xen/xen-head.S | 9 ++++-----
>  include/linux/objtool.h | 8 ++++++++
>  2 files changed, 12 insertions(+), 5 deletions(-)

Confirmed, this fixes the warning, also in LTO builds. Thanks!

Tested-by: Sami Tolvanen <samitolvanen@google.com>

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-13 22:54                                   ` Sami Tolvanen
@ 2020-11-13 22:56                                     ` Josh Poimboeuf
  0 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2020-11-13 22:56 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Fri, Nov 13, 2020 at 02:54:32PM -0800, Sami Tolvanen wrote:
> On Fri, Nov 13, 2020 at 2:34 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > On Fri, Nov 13, 2020 at 12:24:32PM -0800, Sami Tolvanen wrote:
> > > > I still don't see this warning for some reason.
> > >
> > > Do you have CONFIG_XEN enabled? I can reproduce this on ToT master as follows:
> > >
> > > $ git rev-parse HEAD
> > > 585e5b17b92dead8a3aca4e3c9876fbca5f7e0ba
> > > $ make defconfig && \
> > > ./scripts/config -e HYPERVISOR_GUEST -e PARAVIRT -e XEN && \
> > > make olddefconfig && \
> > > make -j110
> > > ...
> > > $ ./tools/objtool/objtool check -arfld vmlinux.o 2>&1 | grep secondary
> > > vmlinux.o: warning: objtool: __startup_secondary_64()+0x2: return with
> > > modified stack frame
> > >
> > > > Is it fixed by adding cpu_bringup_and_idle() to global_noreturns[] in
> > > > tools/objtool/check.c?
> > >
> > > No, that didn't fix the warning. Here's what I tested:
> >
> > I think this fixes it:
> >
> > From: Josh Poimboeuf <jpoimboe@redhat.com>
> > Subject: [PATCH] x86/xen: Fix objtool vmlinux.o validation of xen hypercalls
> >
> > Objtool vmlinux.o validation is showing warnings like the following:
> >
> >   # tools/objtool/objtool check -barfld vmlinux.o
> >   vmlinux.o: warning: objtool: __startup_secondary_64()+0x2: return with modified stack frame
> >   vmlinux.o: warning: objtool:   xen_hypercall_set_trap_table()+0x0: <=== (sym)
> >
> > Objtool falls through all the empty hypercall text and gets confused
> > when it encounters the first real function afterwards.  The empty unwind
> > hints in the hypercalls aren't working for some reason.  Replace them
> > with a more straightforward use of STACK_FRAME_NON_STANDARD.
> >
> > Reported-by: Sami Tolvanen <samitolvanen@google.com>
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > ---
> >  arch/x86/xen/xen-head.S | 9 ++++-----
> >  include/linux/objtool.h | 8 ++++++++
> >  2 files changed, 12 insertions(+), 5 deletions(-)
> 
> Confirmed, this fixes the warning, also in LTO builds. Thanks!
> 
> Tested-by: Sami Tolvanen <samitolvanen@google.com>

Good... I'll work through the rest of them.

-- 
Josh


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-13 22:34                                 ` Josh Poimboeuf
  2020-11-13 22:54                                   ` Sami Tolvanen
@ 2020-11-13 23:31                                   ` Sami Tolvanen
  2020-11-14  0:49                                     ` Josh Poimboeuf
  1 sibling, 1 reply; 73+ messages in thread
From: Sami Tolvanen @ 2020-11-13 23:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Fri, Nov 13, 2020 at 2:34 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Fri, Nov 13, 2020 at 12:24:32PM -0800, Sami Tolvanen wrote:
> > > I still don't see this warning for some reason.
> >
> > Do you have CONFIG_XEN enabled? I can reproduce this on ToT master as follows:
> >
> > $ git rev-parse HEAD
> > 585e5b17b92dead8a3aca4e3c9876fbca5f7e0ba
> > $ make defconfig && \
> > ./scripts/config -e HYPERVISOR_GUEST -e PARAVIRT -e XEN && \
> > make olddefconfig && \
> > make -j110
> > ...
> > $ ./tools/objtool/objtool check -arfld vmlinux.o 2>&1 | grep secondary
> > vmlinux.o: warning: objtool: __startup_secondary_64()+0x2: return with
> > modified stack frame
> >
> > > Is it fixed by adding cpu_bringup_and_idle() to global_noreturns[] in
> > > tools/objtool/check.c?
> >
> > No, that didn't fix the warning. Here's what I tested:
>
> I think this fixes it:
>
> From: Josh Poimboeuf <jpoimboe@redhat.com>
> Subject: [PATCH] x86/xen: Fix objtool vmlinux.o validation of xen hypercalls
>
> Objtool vmlinux.o validation is showing warnings like the following:
>
>   # tools/objtool/objtool check -barfld vmlinux.o
>   vmlinux.o: warning: objtool: __startup_secondary_64()+0x2: return with modified stack frame
>   vmlinux.o: warning: objtool:   xen_hypercall_set_trap_table()+0x0: <=== (sym)
>
> Objtool falls through all the empty hypercall text and gets confused
> when it encounters the first real function afterwards.  The empty unwind
> hints in the hypercalls aren't working for some reason.  Replace them
> with a more straightforward use of STACK_FRAME_NON_STANDARD.
>
> Reported-by: Sami Tolvanen <samitolvanen@google.com>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  arch/x86/xen/xen-head.S | 9 ++++-----
>  include/linux/objtool.h | 8 ++++++++
>  2 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
> index 2d7c8f34f56c..3c538b1ff4a6 100644
> --- a/arch/x86/xen/xen-head.S
> +++ b/arch/x86/xen/xen-head.S
> @@ -6,6 +6,7 @@
>
>  #include <linux/elfnote.h>
>  #include <linux/init.h>
> +#include <linux/objtool.h>
>
>  #include <asm/boot.h>
>  #include <asm/asm.h>
> @@ -67,14 +68,12 @@ SYM_CODE_END(asm_cpu_bringup_and_idle)
>  .pushsection .text
>         .balign PAGE_SIZE
>  SYM_CODE_START(hypercall_page)
> -       .rept (PAGE_SIZE / 32)
> -               UNWIND_HINT_EMPTY
> -               .skip 32
> -       .endr
> +       .skip PAGE_SIZE
>
>  #define HYPERCALL(n) \
>         .equ xen_hypercall_##n, hypercall_page + __HYPERVISOR_##n * 32; \
> -       .type xen_hypercall_##n, @function; .size xen_hypercall_##n, 32
> +       .type xen_hypercall_##n, @function; .size xen_hypercall_##n, 32; \
> +       STACK_FRAME_NON_STANDARD xen_hypercall_##n
>  #include <asm/xen-hypercalls.h>
>  #undef HYPERCALL
>  SYM_CODE_END(hypercall_page)
> diff --git a/include/linux/objtool.h b/include/linux/objtool.h
> index 577f51436cf9..746617265236 100644
> --- a/include/linux/objtool.h
> +++ b/include/linux/objtool.h
> @@ -109,6 +109,12 @@ struct unwind_hint {
>         .popsection
>  .endm
>
> +.macro STACK_FRAME_NON_STANDARD func:req
> +       .pushsection .discard.func_stack_frame_non_standard
> +               .long \func - .
> +       .popsection
> +.endm
> +
>  #endif /* __ASSEMBLY__ */
>
>  #else /* !CONFIG_STACK_VALIDATION */
> @@ -123,6 +129,8 @@ struct unwind_hint {
>  .macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
>  .endm
>  #endif
> +.macro STACK_FRAME_NON_STANDARD func:req
> +.endm

This macro needs to be before the #endif, so it's defined only for
assembly code. This breaks my arm64 builds even though x86 curiously
worked just fine.

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 22/25] x86/asm: annotate indirect jumps
  2020-11-13 23:31                                   ` Sami Tolvanen
@ 2020-11-14  0:49                                     ` Josh Poimboeuf
  0 siblings, 0 replies; 73+ messages in thread
From: Josh Poimboeuf @ 2020-11-14  0:49 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Peter Zijlstra, Jann Horn, the arch/x86 maintainers,
	Masahiro Yamada, Steven Rostedt, Will Deacon, Greg Kroah-Hartman,
	Paul E. McKenney, Kees Cook, Nick Desaulniers, clang-built-linux,
	Kernel Hardening, linux-arch, Linux ARM, linux-kbuild,
	kernel list, linux-pci

On Fri, Nov 13, 2020 at 03:31:34PM -0800, Sami Tolvanen wrote:
> >  #else /* !CONFIG_STACK_VALIDATION */
> > @@ -123,6 +129,8 @@ struct unwind_hint {
> >  .macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
> >  .endm
> >  #endif
> > +.macro STACK_FRAME_NON_STANDARD func:req
> > +.endm
> 
> This macro needs to be before the #endif, so it's defined only for
> assembly code. This breaks my arm64 builds even though x86 curiously
> worked just fine.

Yeah, I noticed that after syncing objtool.h with the tools copy.  Fixed
now.

I've got fixes for some of the other warnings, but I'll queue them up
and post when they're all ready.

-- 
Josh


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 14/25] kbuild: lto: remove duplicate dependencies from .mod files
  2020-10-14 22:50   ` Kees Cook
@ 2020-12-03 17:59     ` Masahiro Yamada
  2020-12-03 18:47       ` Sami Tolvanen
  0 siblings, 1 reply; 73+ messages in thread
From: Masahiro Yamada @ 2020-12-03 17:59 UTC (permalink / raw)
  To: Kees Cook
  Cc: Sami Tolvanen, Steven Rostedt, Will Deacon, Peter Zijlstra,
	Greg Kroah-Hartman, Paul E. McKenney, Nick Desaulniers,
	clang-built-linux, Kernel Hardening, linux-arch,
	linux-arm-kernel, Linux Kbuild mailing list,
	Linux Kernel Mailing List, linux-pci, X86 ML

On Thu, Oct 15, 2020 at 7:50 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Mon, Oct 12, 2020 at 05:31:52PM -0700, Sami Tolvanen wrote:
> > With LTO, llvm-nm prints out symbols for each archive member
> > separately, which results in a lot of duplicate dependencies in the
> > .mod file when CONFIG_TRIM_UNUSED_SYMS is enabled. When a module
> > consists of several compilation units, the output can exceed the
> > default xargs command size limit and split the dependency list to
> > multiple lines, which results in used symbols getting trimmed.
> >
> > This change removes duplicate dependencies, which will reduce the
> > probability of this happening and makes .mod files smaller and
> > easier to read.
> >
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > Reviewed-by: Kees Cook <keescook@chromium.org>
>
> Hi Masahiro,
>
> This appears to be a general improvement as well. This looks like it can
> land without depending on the rest of the series.

It cannot.
Adding "sort -u" is pointless without the rest of the series
since the symbol duplication happens only with Clang LTO.

This is not a solution.
"reduce the probability of this happening" well describes it.

I wrote a different patch.



> -Kees
>
> > ---
> >  scripts/Makefile.build | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> > index ab0ddf4884fd..96d6c9e18901 100644
> > --- a/scripts/Makefile.build
> > +++ b/scripts/Makefile.build
> > @@ -266,7 +266,7 @@ endef
> >
> >  # List module undefined symbols (or empty line if not enabled)
> >  ifdef CONFIG_TRIM_UNUSED_KSYMS
> > -cmd_undef_syms = $(NM) $< | sed -n 's/^  *U //p' | xargs echo
> > +cmd_undef_syms = $(NM) $< | sed -n 's/^  *U //p' | sort -u | xargs echo
> >  else
> >  cmd_undef_syms = echo
> >  endif
> > --
> > 2.28.0.1011.ga647a8990f-goog
> >
>
> --
> Kees Cook
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/202010141549.412F2BF0%40keescook.



-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v6 14/25] kbuild: lto: remove duplicate dependencies from .mod files
  2020-12-03 17:59     ` Masahiro Yamada
@ 2020-12-03 18:47       ` Sami Tolvanen
  0 siblings, 0 replies; 73+ messages in thread
From: Sami Tolvanen @ 2020-12-03 18:47 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Kees Cook, Steven Rostedt, Will Deacon, Peter Zijlstra,
	Greg Kroah-Hartman, Paul E. McKenney, Nick Desaulniers,
	clang-built-linux, Kernel Hardening, linux-arch,
	linux-arm-kernel, Linux Kbuild mailing list,
	Linux Kernel Mailing List, PCI, X86 ML

On Thu, Dec 3, 2020 at 10:00 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> On Thu, Oct 15, 2020 at 7:50 AM Kees Cook <keescook@chromium.org> wrote:
> >
> > On Mon, Oct 12, 2020 at 05:31:52PM -0700, Sami Tolvanen wrote:
> > > With LTO, llvm-nm prints out symbols for each archive member
> > > separately, which results in a lot of duplicate dependencies in the
> > > .mod file when CONFIG_TRIM_UNUSED_SYMS is enabled. When a module
> > > consists of several compilation units, the output can exceed the
> > > default xargs command size limit and split the dependency list to
> > > multiple lines, which results in used symbols getting trimmed.
> > >
> > > This change removes duplicate dependencies, which will reduce the
> > > probability of this happening and makes .mod files smaller and
> > > easier to read.
> > >
> > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > > Reviewed-by: Kees Cook <keescook@chromium.org>
> >
> > Hi Masahiro,
> >
> > This appears to be a general improvement as well. This looks like it can
> > land without depending on the rest of the series.
>
> It cannot.
> Adding "sort -u" is pointless without the rest of the series
> since the symbol duplication happens only with Clang LTO.
>
> This is not a solution.
> "reduce the probability of this happening" well describes it.
>
> I wrote a different patch.

Great, thanks for looking into this. I'll drop this patch from the next version.

Sami

^ permalink raw reply	[flat|nested] 73+ messages in thread

end of thread, other threads:[~2020-12-03 18:48 UTC | newest]

Thread overview: 73+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-13  0:31 [PATCH v6 00/25] Add support for Clang LTO Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 01/25] kbuild: preprocess module linker script Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 02/25] objtool: Add a pass for generating __mcount_loc Sami Tolvanen
2020-10-14 16:50   ` Ingo Molnar
2020-10-14 18:21     ` Peter Zijlstra
2020-10-15 20:10       ` Josh Poimboeuf
2020-10-13  0:31 ` [PATCH v6 03/25] objtool: Don't autodetect vmlinux.o Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 04/25] tracing: move function tracer options to Kconfig Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 05/25] tracing: add support for objtool mcount Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 06/25] x86, build: use " Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 07/25] treewide: remove DISABLE_LTO Sami Tolvanen
2020-10-14 22:43   ` Kees Cook
2020-10-17  1:46     ` Masahiro Yamada
2020-10-13  0:31 ` [PATCH v6 08/25] kbuild: add support for Clang LTO Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 09/25] kbuild: lto: fix module versioning Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 10/25] objtool: Split noinstr validation from --vmlinux Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 11/25] kbuild: lto: postpone objtool Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 12/25] kbuild: lto: limit inlining Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 13/25] kbuild: lto: merge module sections Sami Tolvanen
2020-10-14 22:49   ` Kees Cook
2020-10-20 16:42     ` Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 14/25] kbuild: lto: remove duplicate dependencies from .mod files Sami Tolvanen
2020-10-14 22:50   ` Kees Cook
2020-12-03 17:59     ` Masahiro Yamada
2020-12-03 18:47       ` Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 15/25] init: lto: ensure initcall ordering Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 16/25] init: lto: fix PREL32 relocations Sami Tolvanen
2020-10-14 22:53   ` Kees Cook
2020-10-15  0:12   ` Jann Horn
2020-10-13  0:31 ` [PATCH v6 17/25] PCI: Fix PREL32 relocations for LTO Sami Tolvanen
2020-10-14 22:58   ` Kees Cook
2020-10-13  0:31 ` [PATCH v6 18/25] modpost: lto: strip .lto from module names Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 19/25] scripts/mod: disable LTO for empty.c Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 20/25] efi/libstub: disable LTO Sami Tolvanen
2020-10-13  0:31 ` [PATCH v6 21/25] drivers/misc/lkdtm: disable LTO for rodata.o Sami Tolvanen
2020-10-13  0:32 ` [PATCH v6 22/25] x86/asm: annotate indirect jumps Sami Tolvanen
2020-10-14 22:46   ` Kees Cook
2020-10-14 23:23   ` Jann Horn
2020-10-15 10:22     ` Peter Zijlstra
2020-10-15 20:39       ` Josh Poimboeuf
2020-10-20 16:45         ` Sami Tolvanen
2020-10-20 18:52           ` Josh Poimboeuf
2020-10-20 19:24             ` Sami Tolvanen
2020-10-21  8:56               ` Peter Zijlstra
2020-10-21  9:08                 ` Peter Zijlstra
2020-10-21  9:32                 ` Peter Zijlstra
2020-10-21 21:27                   ` Josh Poimboeuf
2020-10-22  7:25                     ` Peter Zijlstra
2020-10-23 17:48                       ` Sami Tolvanen
2020-10-23 18:04                         ` Nick Desaulniers
2020-10-21 15:01                 ` Sami Tolvanen
2020-10-22  0:22                 ` Sami Tolvanen
2020-10-23 17:36                   ` Sami Tolvanen
2020-11-09 23:11                     ` Sami Tolvanen
2020-11-10  2:29                       ` Josh Poimboeuf
2020-11-10  3:18                         ` Nick Desaulniers
2020-11-10  4:48                         ` Sami Tolvanen
2020-11-10 16:11                           ` Josh Poimboeuf
2020-11-10 17:46                         ` Josh Poimboeuf
2020-11-10 18:59                           ` Sami Tolvanen
2020-11-13 19:54                             ` Josh Poimboeuf
2020-11-13 20:24                               ` Sami Tolvanen
2020-11-13 20:52                                 ` Josh Poimboeuf
2020-11-13 22:34                                 ` Josh Poimboeuf
2020-11-13 22:54                                   ` Sami Tolvanen
2020-11-13 22:56                                     ` Josh Poimboeuf
2020-11-13 23:31                                   ` Sami Tolvanen
2020-11-14  0:49                                     ` Josh Poimboeuf
2020-10-21  9:51             ` Peter Zijlstra
2020-10-21 18:30               ` Josh Poimboeuf
2020-10-13  0:32 ` [PATCH v6 23/25] x86, vdso: disable LTO only for vDSO Sami Tolvanen
2020-10-13  0:32 ` [PATCH v6 24/25] x86, cpu: disable LTO for cpu.c Sami Tolvanen
2020-10-13  0:32 ` [PATCH v6 25/25] x86, build: allow LTO_CLANG and THINLTO to be selected Sami Tolvanen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).