linux-arch.vger.kernel.org archive mirror
* [PATCH 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS
@ 2021-02-25 16:02 Masahiro Yamada
  2021-02-25 16:02 ` [PATCH 1/4] kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO Masahiro Yamada
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Masahiro Yamada @ 2021-02-25 16:02 UTC
  To: linux-kbuild
  Cc: Christoph Hellwig, Linus Torvalds, Jessica Yu, Nicolas Pitre,
	Sami Tolvanen, linux-kernel, linux-arch, Masahiro Yamada


CONFIG_TRIM_UNUSED_KSYMS has been revived, but Linus is still unhappy
about the build speed.

I re-implemented this feature, and the build-time cost is now
almost unnoticeable.

I hope this makes Linus happy.



Masahiro Yamada (4):
  kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO
  export.h: make __ksymtab_strings per-symbol section
  kbuild: separate out vmlinux.lds generation
  kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in
    one-pass

 Makefile                          | 34 ++++++------
 arch/alpha/kernel/Makefile        |  3 +-
 arch/arc/kernel/Makefile          |  3 +-
 arch/arm/kernel/Makefile          |  3 +-
 arch/arm64/kernel/Makefile        |  3 +-
 arch/csky/kernel/Makefile         |  3 +-
 arch/h8300/kernel/Makefile        |  2 +-
 arch/hexagon/kernel/Makefile      |  3 +-
 arch/ia64/kernel/Makefile         |  3 +-
 arch/m68k/kernel/Makefile         |  2 +-
 arch/microblaze/kernel/Makefile   |  3 +-
 arch/mips/kernel/Makefile         |  3 +-
 arch/nds32/kernel/Makefile        |  3 +-
 arch/nios2/kernel/Makefile        |  2 +-
 arch/openrisc/kernel/Makefile     |  3 +-
 arch/parisc/kernel/Makefile       |  3 +-
 arch/powerpc/kernel/Makefile      |  2 +-
 arch/riscv/kernel/Makefile        |  2 +-
 arch/s390/kernel/Makefile         |  3 +-
 arch/sh/kernel/Makefile           |  3 +-
 arch/sparc/kernel/Makefile        |  2 +-
 arch/um/kernel/Makefile           |  2 +-
 arch/x86/kernel/Makefile          |  2 +-
 arch/xtensa/kernel/Makefile       |  3 +-
 include/asm-generic/export.h      | 25 +--------
 include/asm-generic/vmlinux.lds.h | 29 +++++++++--
 include/linux/export.h            | 56 +++++---------------
 init/Kconfig                      |  4 +-
 scripts/Makefile.build            |  7 +--
 scripts/adjust_autoksyms.sh       | 76 ---------------------------
 scripts/gen-keep-ksyms.sh         | 86 +++++++++++++++++++++++++++++++
 scripts/gen_autoksyms.sh          | 55 --------------------
 scripts/gen_ksymdeps.sh           | 25 ---------
 scripts/lto-used-symbollist.txt   |  5 --
 scripts/module.lds.S              | 38 ++++++++++----
 35 files changed, 210 insertions(+), 291 deletions(-)
 delete mode 100755 scripts/adjust_autoksyms.sh
 create mode 100755 scripts/gen-keep-ksyms.sh
 delete mode 100755 scripts/gen_autoksyms.sh
 delete mode 100755 scripts/gen_ksymdeps.sh
 delete mode 100644 scripts/lto-used-symbollist.txt

-- 
2.27.0



* [PATCH 1/4] kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO
  2021-02-25 16:02 [PATCH 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS Masahiro Yamada
@ 2021-02-25 16:02 ` Masahiro Yamada
  2021-02-25 17:45   ` Sami Tolvanen
  2021-02-25 16:02 ` [PATCH 2/4] export.h: make __ksymtab_strings per-symbol section Masahiro Yamada
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Masahiro Yamada @ 2021-02-25 16:02 UTC
  To: linux-kbuild
  Cc: Christoph Hellwig, Linus Torvalds, Jessica Yu, Nicolas Pitre,
	Sami Tolvanen, linux-kernel, linux-arch, Masahiro Yamada

Commit fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
does not work as expected if the .config file has already specified
CONFIG_UNUSED_KSYMS_WHITELIST="my/own/white/list" before enabling
CONFIG_LTO_CLANG, because a Kconfig default is not applied to a symbol
that already has a user-supplied value.

So, the user-supplied whitelist and the LTO-specific whitelist must be
independent of each other.

I refactored the shell script so that CONFIG_MODVERSIONS and
CONFIG_LTO_CLANG handle their whitelists in the same way.
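
As a rough illustration (not part of the patch), the merging described
above can be sketched as a standalone script. merge_ksyms is a
hypothetical helper name, and my_driver_sym is a made-up symbol; only the
__KSYM_ prefix and the overall pipeline shape come from
scripts/gen_autoksyms.sh.

```shell
#!/bin/sh
# Sketch only: merge the built-in "needed" symbols with an optional
# user whitelist file and emit one #define per unique symbol.
# merge_ksyms is a hypothetical helper, not something the patch adds.
merge_ksyms() {
	needed_symbols=$1	# space-separated list, may be empty
	ksym_wl=$2		# path to a whitelist file, may be empty

	{
		echo "$needed_symbols"
		if [ -n "$ksym_wl" ]; then cat "$ksym_wl"; fi
	} |
	tr ' ' '\n' |		# one symbol per line
	sed -e '/^$/d' |	# drop blank lines
	LC_ALL=C sort -u |	# de-duplicate across both sources
	sed -e 's/\(.*\)/#define __KSYM_\1 1/'
}

# Example: LTO intrinsics plus a user list that repeats one of them.
tmp=$(mktemp)
printf 'memcpy\nmy_driver_sym\n' > "$tmp"
merge_ksyms "memcpy memset" "$tmp"
rm -f "$tmp"
```

Because both sources go through the same sort -u, a symbol listed in
both places is defined only once, and neither list can override the
other.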

Fixes: fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---

 init/Kconfig                    |  1 -
 scripts/gen_autoksyms.sh        | 33 ++++++++++++++++++++++++---------
 scripts/lto-used-symbollist.txt |  5 -----
 3 files changed, 24 insertions(+), 15 deletions(-)
 delete mode 100644 scripts/lto-used-symbollist.txt

diff --git a/init/Kconfig b/init/Kconfig
index 0bf5b340b80e..351161326e3c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2277,7 +2277,6 @@ config TRIM_UNUSED_KSYMS
 config UNUSED_KSYMS_WHITELIST
 	string "Whitelist of symbols to keep in ksymtab"
 	depends on TRIM_UNUSED_KSYMS
-	default "scripts/lto-used-symbollist.txt" if LTO_CLANG
 	help
 	  By default, all unused exported symbols will be un-exported from the
 	  build when TRIM_UNUSED_KSYMS is selected.
diff --git a/scripts/gen_autoksyms.sh b/scripts/gen_autoksyms.sh
index d54dfba15bf2..b74d5949fea6 100755
--- a/scripts/gen_autoksyms.sh
+++ b/scripts/gen_autoksyms.sh
@@ -19,7 +19,24 @@ esac
 # We need access to CONFIG_ symbols
 . include/config/auto.conf
 
-ksym_wl=/dev/null
+needed_symbols=
+
+# Special case for modversions (see modpost.c)
+if [ -n "$CONFIG_MODVERSIONS" ]; then
+	needed_symbols="$needed_symbols module_layout"
+fi
+
+# With CONFIG_LTO_CLANG, LLVM bitcode has not yet been compiled into a binary
+# when the .mod files are generated, which means they don't yet contain
+# references to certain symbols that will be present in the final binaries.
+if [ -n "$CONFIG_LTO_CLANG" ]; then
+	# intrinsic functions
+	needed_symbols="$needed_symbols memcpy memmove memset"
+	# stack protector symbols
+	needed_symbols="$needed_symbols __stack_chk_fail __stack_chk_guard"
+fi
+
+ksym_wl=
 if [ -n "$CONFIG_UNUSED_KSYMS_WHITELIST" ]; then
 	# Use 'eval' to expand the whitelist path and check if it is relative
 	eval ksym_wl="$CONFIG_UNUSED_KSYMS_WHITELIST"
@@ -40,16 +57,14 @@ cat > "$output_file" << EOT
 EOT
 
 [ -f modules.order ] && modlist=modules.order || modlist=/dev/null
-sed 's/ko$/mod/' $modlist |
-xargs -n1 sed -n -e '2{s/ /\n/g;/^$/!p;}' -- |
-cat - "$ksym_wl" |
+
+{
+	sed 's/ko$/mod/' $modlist | xargs -n1 sed -n -e '2p'
+	echo "$needed_symbols"
+	[ -n "$ksym_wl" ] && cat "$ksym_wl"
+} | sed -e 's/ /\n/g' | sed -n -e '/^$/!p' |
 # Remove the dot prefix for ppc64; symbol names with a dot (.) hold entry
 # point addresses.
 sed -e 's/^\.//' |
 sort -u |
 sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$output_file"
-
-# Special case for modversions (see modpost.c)
-if [ -n "$CONFIG_MODVERSIONS" ]; then
-	echo "#define __KSYM_module_layout 1" >> "$output_file"
-fi
diff --git a/scripts/lto-used-symbollist.txt b/scripts/lto-used-symbollist.txt
deleted file mode 100644
index 38e7bb9ebaae..000000000000
--- a/scripts/lto-used-symbollist.txt
+++ /dev/null
@@ -1,5 +0,0 @@
-memcpy
-memmove
-memset
-__stack_chk_fail
-__stack_chk_guard
-- 
2.27.0



* [PATCH 2/4] export.h: make __ksymtab_strings per-symbol section
  2021-02-25 16:02 [PATCH 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS Masahiro Yamada
  2021-02-25 16:02 ` [PATCH 1/4] kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO Masahiro Yamada
@ 2021-02-25 16:02 ` Masahiro Yamada
  2021-02-25 16:02 ` [PATCH 3/4] kbuild: separate out vmlinux.lds generation Masahiro Yamada
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Masahiro Yamada @ 2021-02-25 16:02 UTC
  To: linux-kbuild
  Cc: Christoph Hellwig, Linus Torvalds, Jessica Yu, Nicolas Pitre,
	Sami Tolvanen, linux-kernel, linux-arch, Masahiro Yamada

The export symbol tables are placed in their own sections
(__ksymtab*+<sym>) and sorted by SORT (an alias of SORT_BY_NAME) because
the module subsystem uses binary search for symbol resolution.

Until now there was no good reason to do the same for
__ksymtab_strings, but now there is.

To make CONFIG_TRIM_UNUSED_KSYMS work in one pass, the linker needs
to trim unused strings of symbols and namespaces. To allow a per-symbol
keep/drop choice, __ksymtab_strings must be placed in per-symbol
sections. SORT is unnecessary here, though.
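
To make the per-symbol choice concrete, here is a small shell sketch
that mimics the keep/drop decision by section name. It is only an
analogy: the real selection is done by the linker's input-section
patterns, and classify_section is a hypothetical helper invented for
this illustration.

```shell
#!/bin/sh
# Sketch only: decide keep/discard from a per-symbol section name.
# classify_section is hypothetical; the linker does this for real.
classify_section() {
	needed=" $1 "	# space-separated list of needed symbols
	sec=$2		# input section name

	case $sec in
	__ksymtab_strings+*)
		# The symbol name is the part after the '+'.
		sym=${sec#__ksymtab_strings+}
		case $needed in
		*" $sym "*) echo keep ;;
		*) echo discard ;;
		esac
		;;
	*)
		# A single shared section gives no per-symbol handle.
		echo other
		;;
	esac
}

classify_section "printk kmalloc" "__ksymtab_strings+printk"
classify_section "printk kmalloc" "__ksymtab_strings+unused_sym"
classify_section "printk kmalloc" "__ksymtab_strings"
```

With the old single __ksymtab_strings section, the last case is the
only one available, so the strings could only be kept or dropped as a
whole.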

This keeps the string unification introduced by commit ce2b617ce8cb
("export.h: reduce __ksymtab_strings string duplication by using "MS"
section flags").

For example, the empty namespaces share the same address.

  $ nm -n
  [ snip ]
  ffffffff8233b53a r __kstrtabns_IO_APIC_get_PCI_irq_vector
  ffffffff8233b53a r __kstrtabns_I_BDEV
  ffffffff8233b53a r __kstrtabns_LZ4_decompress_fast
  ffffffff8233b53a r __kstrtabns_LZ4_decompress_fast_continue
  ffffffff8233b53a r __kstrtabns_LZ4_decompress_fast_usingDict
  ffffffff8233b53a r __kstrtabns_LZ4_decompress_safe
  ffffffff8233b53a r __kstrtabns_LZ4_decompress_safe_continue
    ...

I confirmed no size change in vmlinux.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---

 include/asm-generic/export.h      | 2 +-
 include/asm-generic/vmlinux.lds.h | 2 +-
 include/linux/export.h            | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h
index 07a36a874dca..e847f1fde367 100644
--- a/include/asm-generic/export.h
+++ b/include/asm-generic/export.h
@@ -39,7 +39,7 @@
 __ksymtab_\name:
 	__put \val, __kstrtab_\name
 	.previous
-	.section __ksymtab_strings,"aMS",%progbits,1
+	.section __ksymtab_strings+\name,"aMS",%progbits,1
 __kstrtab_\name:
 	.asciz "\name"
 	.previous
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index c54adce8f6f6..5a2b31890bb8 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -513,7 +513,7 @@
 									\
 	/* Kernel symbol table: strings */				\
         __ksymtab_strings : AT(ADDR(__ksymtab_strings) - LOAD_OFFSET) {	\
-		*(__ksymtab_strings)					\
+		*(__ksymtab_strings+*)					\
 	}								\
 									\
 	/* __*init sections */						\
diff --git a/include/linux/export.h b/include/linux/export.h
index 6271a5d9c988..01e6ab19b226 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -99,7 +99,7 @@ struct kernel_symbol {
 	extern const char __kstrtab_##sym[];					\
 	extern const char __kstrtabns_##sym[];					\
 	__CRC_SYMBOL(sym, sec);							\
-	asm("	.section \"__ksymtab_strings\",\"aMS\",%progbits,1	\n"	\
+	asm("	.section \"__ksymtab_strings+" #sym "\",\"aMS\",%progbits,1\n"	\
 	    "__kstrtab_" #sym ":					\n"	\
 	    "	.asciz 	\"" #sym "\"					\n"	\
 	    "__kstrtabns_" #sym ":					\n"	\
-- 
2.27.0



* [PATCH 3/4] kbuild: separate out vmlinux.lds generation
  2021-02-25 16:02 [PATCH 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS Masahiro Yamada
  2021-02-25 16:02 ` [PATCH 1/4] kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO Masahiro Yamada
  2021-02-25 16:02 ` [PATCH 2/4] export.h: make __ksymtab_strings per-symbol section Masahiro Yamada
@ 2021-02-25 16:02 ` Masahiro Yamada
  2021-02-25 16:02 ` [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass Masahiro Yamada
  2021-02-25 17:19 ` [PATCH 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS Nicolas Pitre
  4 siblings, 0 replies; 17+ messages in thread
From: Masahiro Yamada @ 2021-02-25 16:02 UTC
  To: linux-kbuild
  Cc: Christoph Hellwig, Linus Torvalds, Jessica Yu, Nicolas Pitre,
	Sami Tolvanen, linux-kernel, linux-arch, Masahiro Yamada

This is a preparation for the CONFIG_TRIM_UNUSED_KSYMS improvement.

In the new implementation of CONFIG_TRIM_UNUSED_KSYMS (next commit),
unused export symbols are trimmed at the link stage. Kbuild needs to
build the entire tree to know which symbols are needed by modules for
symbol resolution.

The list of needed symbols shall be generated after the directory
traversal, and included from vmlinux.lds.S and module.lds.S.

The build rule of module.lds.S is already separated as modules_prepare.

The build of vmlinux.lds must be delayed because such a list is not yet
available when Kbuild is visiting arch/$(SRCARCH)/kernel/Makefile.

Separate the build rule of vmlinux.lds, and invoke it from the top
Makefile.

I guarded the $(warning ) in scripts/Makefile.build; otherwise a
false-positive warning would be displayed, for example, when building
ARCH=ia64 with CONFIG_IA64_PALINFO=m. Ideally, vmlinux.lds.S could be
moved to a different directory, but I am just making less invasive
changes for now.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---

 Makefile                        | 8 ++++++--
 arch/alpha/kernel/Makefile      | 3 ++-
 arch/arc/kernel/Makefile        | 3 ++-
 arch/arm/kernel/Makefile        | 3 ++-
 arch/arm64/kernel/Makefile      | 3 ++-
 arch/csky/kernel/Makefile       | 3 ++-
 arch/h8300/kernel/Makefile      | 2 +-
 arch/hexagon/kernel/Makefile    | 3 ++-
 arch/ia64/kernel/Makefile       | 3 ++-
 arch/m68k/kernel/Makefile       | 2 +-
 arch/microblaze/kernel/Makefile | 3 ++-
 arch/mips/kernel/Makefile       | 3 ++-
 arch/nds32/kernel/Makefile      | 3 ++-
 arch/nios2/kernel/Makefile      | 2 +-
 arch/openrisc/kernel/Makefile   | 3 ++-
 arch/parisc/kernel/Makefile     | 3 ++-
 arch/powerpc/kernel/Makefile    | 2 +-
 arch/riscv/kernel/Makefile      | 2 +-
 arch/s390/kernel/Makefile       | 3 ++-
 arch/sh/kernel/Makefile         | 3 ++-
 arch/sparc/kernel/Makefile      | 2 +-
 arch/um/kernel/Makefile         | 2 +-
 arch/x86/kernel/Makefile        | 2 +-
 arch/xtensa/kernel/Makefile     | 3 ++-
 scripts/Makefile.build          | 2 ++
 25 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/Makefile b/Makefile
index b18dbc634690..34393fd72fe1 100644
--- a/Makefile
+++ b/Makefile
@@ -1184,6 +1184,9 @@ quiet_cmd_autoksyms_h = GEN     $@
 $(autoksyms_h):
 	$(call cmd,autoksyms_h)
 
+$(KBUILD_LDS): prepare FORCE
+	$(Q)$(MAKE) $(build)=$(patsubst %/,%,$(dir $@)) $@
+
 ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
 
 # Final link of vmlinux with optional arch pass after final link
@@ -1191,14 +1194,15 @@ cmd_link-vmlinux =                                                 \
 	$(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)";    \
 	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
-vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) FORCE
+vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(KBUILD_LDS) \
+			$(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS) FORCE
 	+$(call if_changed,link-vmlinux)
 
 targets := vmlinux
 
 # The actual objects are generated when descending,
 # make sure no implicit rule kicks in
-$(sort $(vmlinux-deps) $(subdir-modorder)): descend ;
+$(sort $(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS) $(subdir-modorder)): descend ;
 
 filechk_kernel.release = \
 	echo "$(KERNELVERSION)$$($(CONFIG_SHELL) $(srctree)/scripts/setlocalversion $(srctree))"
diff --git a/arch/alpha/kernel/Makefile b/arch/alpha/kernel/Makefile
index 5a74581bf0ee..6e2baaebdee3 100644
--- a/arch/alpha/kernel/Makefile
+++ b/arch/alpha/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the linux kernel.
 #
 
-extra-y		:= head.o vmlinux.lds
+extra-y		:= head.o
+targets		+= vmlinux.lds
 asflags-y	:= $(KBUILD_CFLAGS)
 ccflags-y	:= -Wno-sign-compare
 
diff --git a/arch/arc/kernel/Makefile b/arch/arc/kernel/Makefile
index 8c4fc4b54c14..0a06c018f0cd 100644
--- a/arch/arc/kernel/Makefile
+++ b/arch/arc/kernel/Makefile
@@ -31,4 +31,5 @@ else
 obj-y += ctx_sw_asm.o
 endif
 
-extra-y := vmlinux.lds head.o
+targets += vmlinux.lds
+extra-y := head.o
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index ae295a3bcfef..7483916c034d 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -106,4 +106,5 @@ endif
 
 obj-$(CONFIG_HAVE_ARM_SMCCC)	+= smccc-call.o
 
-extra-y := $(head-y) vmlinux.lds
+extra-y := $(head-y)
+targets += vmlinux.lds
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index ed65576ce710..32e530c22cba 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -64,7 +64,8 @@ obj-$(CONFIG_COMPAT_VDSO)		+= vdso32-wrap.o
 
 obj-y					+= probes/
 head-y					:= head.o
-extra-y					+= $(head-y) vmlinux.lds
+extra-y					+= $(head-y)
+targets					+= vmlinux.lds
 
 ifeq ($(CONFIG_DEBUG_EFI),y)
 AFLAGS_head.o += -DVMLINUX_PATH="\"$(realpath $(objtree)/vmlinux)\""
diff --git a/arch/csky/kernel/Makefile b/arch/csky/kernel/Makefile
index 37f37c0e934a..2ebc393b57f4 100644
--- a/arch/csky/kernel/Makefile
+++ b/arch/csky/kernel/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds
 
 obj-y += entry.o atomic.o signal.o traps.o irq.o time.o vdso.o
 obj-y += power.o syscall.o syscall_table.o setup.o
diff --git a/arch/h8300/kernel/Makefile b/arch/h8300/kernel/Makefile
index 307aa51576dd..7ef912ee576f 100644
--- a/arch/h8300/kernel/Makefile
+++ b/arch/h8300/kernel/Makefile
@@ -3,7 +3,7 @@
 # Makefile for the linux kernel.
 #
 
-extra-y := vmlinux.lds
+targets += vmlinux.lds
 
 obj-y := process.o traps.o ptrace.o \
 	 signal.o setup.o syscalls.o \
diff --git a/arch/hexagon/kernel/Makefile b/arch/hexagon/kernel/Makefile
index fae3dce32fde..9765301d2672 100644
--- a/arch/hexagon/kernel/Makefile
+++ b/arch/hexagon/kernel/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds
 
 obj-$(CONFIG_SMP) += smp.o
 
diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile
index c89bd5f8cbf8..d430230b21af 100644
--- a/arch/ia64/kernel/Makefile
+++ b/arch/ia64/kernel/Makefile
@@ -7,7 +7,8 @@ ifdef CONFIG_DYNAMIC_FTRACE
 CFLAGS_REMOVE_ftrace.o = -pg
 endif
 
-extra-y	:= head.o vmlinux.lds
+extra-y	:= head.o
+targets	+= vmlinux.lds
 
 obj-y := entry.o efi.o efi_stub.o gate-data.o fsys.o ia64_ksyms.o irq.o irq_ia64.o	\
 	 irq_lsapic.o ivt.o pal.o patch.o process.o ptrace.o sal.o		\
diff --git a/arch/m68k/kernel/Makefile b/arch/m68k/kernel/Makefile
index dbac7f8743fc..b054f4198e63 100644
--- a/arch/m68k/kernel/Makefile
+++ b/arch/m68k/kernel/Makefile
@@ -12,7 +12,7 @@ extra-$(CONFIG_HP300)	:= head.o
 extra-$(CONFIG_Q40)	:= head.o
 extra-$(CONFIG_SUN3X)	:= head.o
 extra-$(CONFIG_SUN3)	:= sun3-head.o
-extra-y			+= vmlinux.lds
+targets			+= vmlinux.lds
 
 obj-y	:= entry.o irq.o module.o process.o ptrace.o
 obj-y	+= setup.o signal.o sys_m68k.o syscalltable.o time.o traps.o
diff --git a/arch/microblaze/kernel/Makefile b/arch/microblaze/kernel/Makefile
index 15a20eb814ce..cdf98cbfcce9 100644
--- a/arch/microblaze/kernel/Makefile
+++ b/arch/microblaze/kernel/Makefile
@@ -12,7 +12,8 @@ CFLAGS_REMOVE_ftrace.o = -pg
 CFLAGS_REMOVE_process.o = -pg
 endif
 
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds
 
 obj-y += dma.o exceptions.o \
 	hw_exception_handler.o irq.o \
diff --git a/arch/mips/kernel/Makefile b/arch/mips/kernel/Makefile
index b4a57f1de772..f2e82faa06c4 100644
--- a/arch/mips/kernel/Makefile
+++ b/arch/mips/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the Linux/MIPS kernel.
 #
 
-extra-y		:= head.o vmlinux.lds
+extra-y		:= head.o
+targets		+= vmlinux.lds
 
 obj-y		+= branch.o cmpxchg.o elf.o entry.o genex.o idle.o irq.o \
 		   process.o prom.o ptrace.o reset.o setup.o signal.o \
diff --git a/arch/nds32/kernel/Makefile b/arch/nds32/kernel/Makefile
index 394df3f6442c..ec061f18f00f 100644
--- a/arch/nds32/kernel/Makefile
+++ b/arch/nds32/kernel/Makefile
@@ -19,7 +19,8 @@ obj-$(CONFIG_OF)		+= devtree.o
 obj-$(CONFIG_CACHE_L2)		+= atl2c.o
 obj-$(CONFIG_PERF_EVENTS) += perf_event_cpu.o
 obj-$(CONFIG_PM)		+= pm.o sleep.o
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds
 
 CFLAGS_fpu.o += -mext-fpu-sp -mext-fpu-dp
 
diff --git a/arch/nios2/kernel/Makefile b/arch/nios2/kernel/Makefile
index 0b645e1e3158..1ec4be68462e 100644
--- a/arch/nios2/kernel/Makefile
+++ b/arch/nios2/kernel/Makefile
@@ -4,7 +4,7 @@
 #
 
 extra-y	+= head.o
-extra-y	+= vmlinux.lds
+targets	+= vmlinux.lds
 
 obj-y	+= cpuinfo.o
 obj-y	+= entry.o
diff --git a/arch/openrisc/kernel/Makefile b/arch/openrisc/kernel/Makefile
index 2d172e79f58d..6be5c65ea3e9 100644
--- a/arch/openrisc/kernel/Makefile
+++ b/arch/openrisc/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the linux kernel.
 #
 
-extra-y	:= head.o vmlinux.lds
+extra-y	:= head.o
+targets	+= vmlinux.lds
 
 obj-y	:= setup.o or32_ksyms.o process.o dma.o \
 	   traps.o time.o irq.o entry.o ptrace.o signal.o \
diff --git a/arch/parisc/kernel/Makefile b/arch/parisc/kernel/Makefile
index 068d90950d93..31e5109251aa 100644
--- a/arch/parisc/kernel/Makefile
+++ b/arch/parisc/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for arch/parisc/kernel
 #
 
-extra-y			:= head.o vmlinux.lds
+extra-y		:= head.o
+targets		+= vmlinux.lds
 
 obj-y	     	:= cache.o pacache.o setup.o pdt.o traps.o time.o irq.o \
 		   pa7300lc.o syscall.o entry.o sys_parisc.o firmware.o \
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 6084fa499aa3..c7576957f05a 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -101,7 +101,7 @@ extra-$(CONFIG_40x)		:= head_40x.o
 extra-$(CONFIG_44x)		:= head_44x.o
 extra-$(CONFIG_FSL_BOOKE)	:= head_fsl_booke.o
 extra-$(CONFIG_PPC_8xx)		:= head_8xx.o
-extra-y				+= vmlinux.lds
+targets				+= vmlinux.lds
 
 obj-$(CONFIG_RELOCATABLE)	+= reloc_$(BITS).o
 
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index f6caf4d9ca15..fcebdb13bcda 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -9,7 +9,7 @@ CFLAGS_REMOVE_patch.o	= -pg
 endif
 
 extra-y += head.o
-extra-y += vmlinux.lds
+targets += vmlinux.lds
 
 obj-y	+= soc.o
 obj-y	+= cpu.o
diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile
index c97818a382f3..15d3ee771f22 100644
--- a/arch/s390/kernel/Makefile
+++ b/arch/s390/kernel/Makefile
@@ -42,7 +42,8 @@ obj-y	+= entry.o reipl.o relocate_kernel.o kdebugfs.o alternative.o
 obj-y	+= nospec-branch.o ipl_vmparm.o machine_kexec_reloc.o unwind_bc.o
 obj-y	+= smp.o
 
-extra-y				+= head64.o vmlinux.lds
+extra-y				+= head64.o
+targets				+= vmlinux.lds
 
 obj-$(CONFIG_SYSFS)		+= nospec-sysfs.o
 CFLAGS_REMOVE_nospec-branch.o	+= $(CC_FLAGS_EXPOLINE)
diff --git a/arch/sh/kernel/Makefile b/arch/sh/kernel/Makefile
index aa0fbc9202b1..e8384889f5f0 100644
--- a/arch/sh/kernel/Makefile
+++ b/arch/sh/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the Linux/SuperH kernel.
 #
 
-extra-y	:= head_32.o vmlinux.lds
+extra-y	:= head_32.o
+targets	+= vmlinux.lds
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not profile debug and lowlevel utilities
diff --git a/arch/sparc/kernel/Makefile b/arch/sparc/kernel/Makefile
index d3a0e072ebe8..685669edb9f8 100644
--- a/arch/sparc/kernel/Makefile
+++ b/arch/sparc/kernel/Makefile
@@ -12,7 +12,7 @@ extra-y     := head_$(BITS).o
 # Undefine sparc when processing vmlinux.lds - it is used
 # And teach CPP we are doing $(BITS) builds (for this case)
 CPPFLAGS_vmlinux.lds := -Usparc -m$(BITS)
-extra-y              += vmlinux.lds
+targets              += vmlinux.lds
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not profile debug and lowlevel utilities
diff --git a/arch/um/kernel/Makefile b/arch/um/kernel/Makefile
index 5aa882011e04..76eea4cc00f0 100644
--- a/arch/um/kernel/Makefile
+++ b/arch/um/kernel/Makefile
@@ -12,7 +12,7 @@ CPPFLAGS_vmlinux.lds := -DSTART=$(LDS_START)		\
                         -DELF_ARCH=$(LDS_ELF_ARCH)	\
                         -DELF_FORMAT=$(LDS_ELF_FORMAT)	\
 			$(LDS_EXTRA)
-extra-y := vmlinux.lds
+targets += vmlinux.lds
 
 obj-y = config.o exec.o exitcode.o irq.o ksyms.o mem.o \
 	physmem.o process.o ptrace.o reboot.o sigio.o \
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 2ddf08351f0b..7d6fce044f97 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -7,7 +7,7 @@ extra-y	:= head_$(BITS).o
 extra-y	+= head$(BITS).o
 extra-y	+= ebda.o
 extra-y	+= platform-quirks.o
-extra-y	+= vmlinux.lds
+targets	+= vmlinux.lds
 
 CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
 
diff --git a/arch/xtensa/kernel/Makefile b/arch/xtensa/kernel/Makefile
index d4082c6a121b..79be7bfdf989 100644
--- a/arch/xtensa/kernel/Makefile
+++ b/arch/xtensa/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the Linux/Xtensa kernel.
 #
 
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds
 
 obj-y := align.o coprocessor.o entry.o irq.o platform.o process.o \
 	 ptrace.o setup.o signal.o stacktrace.o syscall.o time.o traps.o \
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 3f6bf0ea7c0e..fd573e5ca0b9 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -63,12 +63,14 @@ ifndef obj
 $(warning kbuild: Makefile.build is included improperly)
 endif
 
+ifeq ($(filter-out %.mod, $(MAKECMDGOALS)),)
 ifeq ($(need-modorder),)
 ifneq ($(obj-m),)
 $(warning $(patsubst %.o,'%.ko',$(obj-m)) will not be built even though obj-m is specified.)
 $(warning You cannot use subdir-y/m to visit a module Makefile. Use obj-y/m instead.)
 endif
 endif
+endif
 
 # ===========================================================================
 
-- 
2.27.0



* [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-02-25 16:02 [PATCH 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS Masahiro Yamada
                   ` (2 preceding siblings ...)
  2021-02-25 16:02 ` [PATCH 3/4] kbuild: separate out vmlinux.lds generation Masahiro Yamada
@ 2021-02-25 16:02 ` Masahiro Yamada
  2021-02-25 18:46   ` kernel test robot
  2021-02-25 18:56   ` kernel test robot
  2021-02-25 17:19 ` [PATCH 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS Nicolas Pitre
  4 siblings, 2 replies; 17+ messages in thread
From: Masahiro Yamada @ 2021-02-25 16:02 UTC
  To: linux-kbuild
  Cc: Christoph Hellwig, Linus Torvalds, Jessica Yu, Nicolas Pitre,
	Sami Tolvanen, linux-kernel, linux-arch, Masahiro Yamada

Commit a555bdd0c58c ("Kbuild: enable TRIM_UNUSED_KSYMS again, with some
guarding") re-enabled this feature, but Linus is still unhappy about
the build time.

The slowness comes from the recursion: after updating
<generated/autoksyms.h> (which contains all symbols needed by modules),
Kbuild begins a second traversal, rebuilding objects whose EXPORT_SYMBOL
needs flipping.

This commit re-implements CONFIG_TRIM_UNUSED_KSYMS to make it work
in one pass. After the tree traversal, a linker script snippet
<generated/keep-ksyms.h> is generated. It feeds the list of necessary
sections to vmlinux.lds.S and module.lds.S. The other sections fall
into DISCARDS.
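
For illustration only, the idea of turning a symbol list into a linker
script snippet can be sketched as below. gen_keep_macro and the exact
output layout are assumptions made for this example; the real
scripts/gen-keep-ksyms.sh in this patch may generate something
different.

```shell
#!/bin/sh
# Sketch only: read needed symbols (one per line) on stdin and print a
# cpp macro that names each per-symbol input section explicitly.
# gen_keep_macro and its output format are hypothetical.
gen_keep_macro() {
	macro=$1	# macro to define, e.g. KSYMTAB
	section=$2	# section prefix, e.g. ___ksymtab

	printf '#define %s' "$macro"
	# Emit entries in byte order so the table stays sorted by name.
	LC_ALL=C sort -u | while read -r sym; do
		[ -n "$sym" ] || continue
		printf ' \\\n\tKEEP(*(%s+%s))' "$section" "$sym"
	done
	printf '\n'
}

printf 'printk\nkmalloc\n' | gen_keep_macro KSYMTAB ___ksymtab
```

Any input section that is not named in such a list would then be left
for the discard rule to pick up, which is what removes the need for a
second compile pass.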

I believe there are no more build issues, so I dropped the 'if EXPERT'
and 'depends on !COMPILE_TEST' guarding.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---

 Makefile                                      | 30 +++-----
 include/asm-generic/export.h                  | 23 ------
 include/asm-generic/vmlinux.lds.h             | 29 +++++--
 include/linux/export.h                        | 54 +++----------
 init/Kconfig                                  |  3 +-
 scripts/Makefile.build                        |  5 --
 scripts/adjust_autoksyms.sh                   | 76 -------------------
 .../{gen_autoksyms.sh => gen-keep-ksyms.sh}   | 34 ++++++---
 scripts/gen_ksymdeps.sh                       | 25 ------
 scripts/module.lds.S                          | 38 +++++++---
 10 files changed, 103 insertions(+), 214 deletions(-)
 delete mode 100755 scripts/adjust_autoksyms.sh
 rename scripts/{gen_autoksyms.sh => gen-keep-ksyms.sh} (78%)
 delete mode 100755 scripts/gen_ksymdeps.sh

diff --git a/Makefile b/Makefile
index 34393fd72fe1..cda800fa2f78 100644
--- a/Makefile
+++ b/Makefile
@@ -1160,29 +1160,23 @@ export KBUILD_LDS          := arch/$(SRCARCH)/kernel/vmlinux.lds
 # used by scripts/Makefile.package
 export KBUILD_ALLDIRS := $(sort $(filter-out arch/%,$(vmlinux-alldirs)) LICENSES arch include scripts tools)
 
-vmlinux-deps := $(KBUILD_LDS) $(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS)
+targets :=
 
-# Recurse until adjust_autoksyms.sh is satisfied
-PHONY += autoksyms_recursive
 ifdef CONFIG_TRIM_UNUSED_KSYMS
 # For the kernel to actually contain only the needed exported symbols,
 # we have to build modules as well to determine what those symbols are.
 # (this can be evaluated only once include/config/auto.conf has been included)
 KBUILD_MODULES := 1
 
-autoksyms_recursive: descend modules.order
-	$(Q)$(CONFIG_SHELL) $(srctree)/scripts/adjust_autoksyms.sh \
-	  "$(MAKE) -f $(srctree)/Makefile vmlinux"
-endif
-
-autoksyms_h := $(if $(CONFIG_TRIM_UNUSED_KSYMS), include/generated/autoksyms.h)
+quiet_cmd_gen_used_ksyms = GEN     $@
+      cmd_gen_used_ksyms = $(CONFIG_SHELL) $(srctree)/scripts/gen-keep-ksyms.sh $< > $@
 
-quiet_cmd_autoksyms_h = GEN     $@
-      cmd_autoksyms_h = mkdir -p $(dir $@); \
-			$(CONFIG_SHELL) $(srctree)/scripts/gen_autoksyms.sh $@
+include/generated/keep-ksyms.h: modules.order FORCE
+	$(call if_changed,gen_used_ksyms)
+targets += include/generated/keep-ksyms.h
 
-$(autoksyms_h):
-	$(call cmd,autoksyms_h)
+$(KBUILD_LDS) modules_prepare: include/generated/keep-ksyms.h
+endif
 
 $(KBUILD_LDS): prepare FORCE
 	$(Q)$(MAKE) $(build)=$(patsubst %/,%,$(dir $@)) $@
@@ -1194,11 +1188,11 @@ cmd_link-vmlinux =                                                 \
 	$(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)";    \
 	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
-vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(KBUILD_LDS) \
+vmlinux: scripts/link-vmlinux.sh $(KBUILD_LDS) \
 			$(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS) FORCE
 	+$(call if_changed,link-vmlinux)
 
-targets := vmlinux
+targets += vmlinux
 
 # The actual objects are generated when descending,
 # make sure no implicit rule kicks in
@@ -1227,7 +1221,7 @@ scripts: scripts_basic scripts_dtc
 PHONY += prepare archprepare
 
 archprepare: outputmakefile archheaders archscripts scripts include/config/kernel.release \
-	asm-generic $(version_h) $(autoksyms_h) include/generated/utsrelease.h \
+	asm-generic $(version_h) include/generated/utsrelease.h \
 	include/generated/autoconf.h
 
 prepare0: archprepare
@@ -1503,7 +1497,7 @@ endif # CONFIG_MODULES
 # make distclean Remove editor backup files, patch leftover files and the like
 
 # Directories & files removed with 'make clean'
-CLEAN_FILES += include/ksym vmlinux.symvers \
+CLEAN_FILES += vmlinux.symvers \
 	       modules.builtin modules.builtin.modinfo modules.nsdeps \
 	       compile_commands.json
 
diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h
index e847f1fde367..b9be5b1dd7e6 100644
--- a/include/asm-generic/export.h
+++ b/include/asm-generic/export.h
@@ -57,30 +57,7 @@ __kstrtab_\name:
 #endif
 .endm
 
-#if defined(CONFIG_TRIM_UNUSED_KSYMS)
-
-#include <linux/kconfig.h>
-#include <generated/autoksyms.h>
-
-.macro __ksym_marker sym
-	.section ".discard.ksym","a"
-__ksym_marker_\sym:
-	 .previous
-.endm
-
-#define __EXPORT_SYMBOL(sym, val, sec)				\
-	__ksym_marker sym;					\
-	__cond_export_sym(sym, val, sec, __is_defined(__KSYM_##sym))
-#define __cond_export_sym(sym, val, sec, conf)			\
-	___cond_export_sym(sym, val, sec, conf)
-#define ___cond_export_sym(sym, val, sec, enabled)		\
-	__cond_export_sym_##enabled(sym, val, sec)
-#define __cond_export_sym_1(sym, val, sec) ___EXPORT_SYMBOL sym, val, sec
-#define __cond_export_sym_0(sym, val, sec) /* nothing */
-
-#else
 #define __EXPORT_SYMBOL(sym, val, sec) ___EXPORT_SYMBOL sym, val, sec
-#endif
 
 #define EXPORT_SYMBOL(name)					\
 	__EXPORT_SYMBOL(name, KSYM_FUNC(name),)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 5a2b31890bb8..f2b0990be159 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -50,6 +50,24 @@
  *               [__nosave_begin, __nosave_end] for the nosave data
  */
 
+#if CONFIG_TRIM_UNUSED_KSYMS
+#include <generated/keep-ksyms.h>
+
+#define KSYM_DISCARDS		*(___ksymtab+*) \
+				*(___ksymtab_gpl+*) \
+				*(___kcrctab+*) \
+				*(___kcrctab_gpl+*) \
+				*(__ksymtab_strings+*)
+
+#else
+#define KSYMTAB			KEEP(*(SORT(___ksymtab+*)))
+#define KSYMTAB_GPL		KEEP(*(SORT(___ksymtab_gpl+*)))
+#define KCRCTAB			KEEP(*(SORT(___kcrctab+*)))
+#define KCRCTAB_GPL		KEEP(*(SORT(___kcrctab_gpl+*)))
+#define KSYMTAB_STRINGS		*(__ksymtab_strings+*)
+#define KSYM_DISCARDS
+#endif
+
 #ifndef LOAD_OFFSET
 #define LOAD_OFFSET 0
 #endif
@@ -486,34 +504,34 @@
 	/* Kernel symbol table: Normal symbols */			\
 	__ksymtab         : AT(ADDR(__ksymtab) - LOAD_OFFSET) {		\
 		__start___ksymtab = .;					\
-		KEEP(*(SORT(___ksymtab+*)))				\
+		KSYMTAB							\
 		__stop___ksymtab = .;					\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__ksymtab_gpl     : AT(ADDR(__ksymtab_gpl) - LOAD_OFFSET) {	\
 		__start___ksymtab_gpl = .;				\
-		KEEP(*(SORT(___ksymtab_gpl+*)))				\
+		KSYMTAB_GPL						\
 		__stop___ksymtab_gpl = .;				\
 	}								\
 									\
 	/* Kernel symbol table: Normal symbols */			\
 	__kcrctab         : AT(ADDR(__kcrctab) - LOAD_OFFSET) {		\
 		__start___kcrctab = .;					\
-		KEEP(*(SORT(___kcrctab+*)))				\
+		KCRCTAB							\
 		__stop___kcrctab = .;					\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__kcrctab_gpl     : AT(ADDR(__kcrctab_gpl) - LOAD_OFFSET) {	\
 		__start___kcrctab_gpl = .;				\
-		KEEP(*(SORT(___kcrctab_gpl+*)))				\
+		KCRCTAB_GPL						\
 		__stop___kcrctab_gpl = .;				\
 	}								\
 									\
 	/* Kernel symbol table: strings */				\
         __ksymtab_strings : AT(ADDR(__ksymtab_strings) - LOAD_OFFSET) {	\
-		*(__ksymtab_strings+*)					\
+		KSYMTAB_STRINGS						\
 	}								\
 									\
 	/* __*init sections */						\
@@ -993,6 +1011,7 @@
 	/DISCARD/ : {							\
 	EXIT_DISCARDS							\
 	EXIT_CALL							\
+	KSYM_DISCARDS							\
 	COMMON_DISCARDS							\
 	}
 
diff --git a/include/linux/export.h b/include/linux/export.h
index 01e6ab19b226..f9cc13cd2c8c 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -76,9 +76,18 @@ struct kernel_symbol {
 };
 #endif
 
-#ifdef __GENKSYMS__
+#if !defined(CONFIG_MODULES) || defined(__DISABLE_EXPORTS)
+
+/*
+ * Allow symbol exports to be disabled completely so that C code may
+ * be reused in other execution contexts such as the UEFI stub or the
+ * decompressor.
+ */
+#define __EXPORT_SYMBOL(sym, sec, ns)
+
+#elif defined(__GENKSYMS__)
 
-#define ___EXPORT_SYMBOL(sym, sec, ns)	__GENKSYMS_EXPORT_SYMBOL(sym)
+#define __EXPORT_SYMBOL(sym, sec, ns)	__GENKSYMS_EXPORT_SYMBOL(sym)
 
 #else
 
@@ -94,7 +103,7 @@ struct kernel_symbol {
  * section flag requires it. Use '%progbits' instead of '@progbits' since the
  * former apparently works on all arches according to the binutils source.
  */
-#define ___EXPORT_SYMBOL(sym, sec, ns)						\
+#define __EXPORT_SYMBOL(sym, sec, ns)						\
 	extern typeof(sym) sym;							\
 	extern const char __kstrtab_##sym[];					\
 	extern const char __kstrtabns_##sym[];					\
@@ -107,45 +116,6 @@ struct kernel_symbol {
 	    "	.previous						\n");	\
 	__KSYMTAB_ENTRY(sym, sec)
 
-#endif
-
-#if !defined(CONFIG_MODULES) || defined(__DISABLE_EXPORTS)
-
-/*
- * Allow symbol exports to be disabled completely so that C code may
- * be reused in other execution contexts such as the UEFI stub or the
- * decompressor.
- */
-#define __EXPORT_SYMBOL(sym, sec, ns)
-
-#elif defined(CONFIG_TRIM_UNUSED_KSYMS)
-
-#include <generated/autoksyms.h>
-
-/*
- * For fine grained build dependencies, we want to tell the build system
- * about each possible exported symbol even if they're not actually exported.
- * We use a symbol pattern __ksym_marker_<symbol> that the build system filters
- * from the $(NM) output (see scripts/gen_ksymdeps.sh). These symbols are
- * discarded in the final link stage.
- */
-#define __ksym_marker(sym)	\
-	static int __ksym_marker_##sym[0] __section(".discard.ksym") __used
-
-#define __EXPORT_SYMBOL(sym, sec, ns)					\
-	__ksym_marker(sym);						\
-	__cond_export_sym(sym, sec, ns, __is_defined(__KSYM_##sym))
-#define __cond_export_sym(sym, sec, ns, conf)				\
-	___cond_export_sym(sym, sec, ns, conf)
-#define ___cond_export_sym(sym, sec, ns, enabled)			\
-	__cond_export_sym_##enabled(sym, sec, ns)
-#define __cond_export_sym_1(sym, sec, ns) ___EXPORT_SYMBOL(sym, sec, ns)
-#define __cond_export_sym_0(sym, sec, ns) /* nothing */
-
-#else
-
-#define __EXPORT_SYMBOL(sym, sec, ns)	___EXPORT_SYMBOL(sym, sec, ns)
-
 #endif /* CONFIG_MODULES */
 
 #ifdef DEFAULT_SYMBOL_NAMESPACE
diff --git a/init/Kconfig b/init/Kconfig
index 351161326e3c..e52034f66aeb 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2259,8 +2259,7 @@ config MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS
 	  If unsure, say N.
 
 config TRIM_UNUSED_KSYMS
-	bool "Trim unused exported kernel symbols" if EXPERT
-	depends on !COMPILE_TEST
+	bool "Trim unused exported kernel symbols"
 	help
 	  The kernel and some modules make many symbols available for
 	  other modules to use via EXPORT_SYMBOL() and variants. Depending
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index fd573e5ca0b9..fd2d7517a652 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -245,16 +245,12 @@ objtool_dep = $(objtool_obj)					\
 			 include/config/stack/validation.h)
 
 ifdef CONFIG_TRIM_UNUSED_KSYMS
-cmd_gen_ksymdeps = \
-	$(CONFIG_SHELL) $(srctree)/scripts/gen_ksymdeps.sh $@ >> $(dot-target).cmd
-
 # List module undefined symbols
 undefined_syms = $(NM) $< | $(AWK) '$$1 == "U" { printf("%s%s", x++ ? " " : "", $$2) }';
 endif
 
 define rule_cc_o_c
 	$(call cmd_and_fixdep,cc_o_c)
-	$(call cmd,gen_ksymdeps)
 	$(call cmd,checksrc)
 	$(call cmd,checkdoc)
 	$(call cmd,objtool)
@@ -264,7 +260,6 @@ endef
 
 define rule_as_o_S
 	$(call cmd_and_fixdep,as_o_S)
-	$(call cmd,gen_ksymdeps)
 	$(call cmd,objtool)
 	$(call cmd,modversions_S)
 endef
diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh
deleted file mode 100755
index 2b366d945ccb..000000000000
--- a/scripts/adjust_autoksyms.sh
+++ /dev/null
@@ -1,76 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: GPL-2.0-only
-
-# Script to update include/generated/autoksyms.h and dependency files
-#
-# Copyright:	(C) 2016  Linaro Limited
-# Created by:	Nicolas Pitre, January 2016
-#
-
-# Update the include/generated/autoksyms.h file.
-#
-# For each symbol being added or removed, the corresponding dependency
-# file's timestamp is updated to force a rebuild of the affected source
-# file. All arguments passed to this script are assumed to be a command
-# to be exec'd to trigger a rebuild of those files.
-
-set -e
-
-cur_ksyms_file="include/generated/autoksyms.h"
-new_ksyms_file="include/generated/autoksyms.h.tmpnew"
-
-info() {
-	if [ "$quiet" != "silent_" ]; then
-		printf "  %-7s %s\n" "$1" "$2"
-	fi
-}
-
-info "CHK" "$cur_ksyms_file"
-
-# Use "make V=1" to debug this script.
-case "$KBUILD_VERBOSE" in
-*1*)
-	set -x
-	;;
-esac
-
-# We need access to CONFIG_ symbols
-. include/config/auto.conf
-
-# Generate a new symbol list file
-$CONFIG_SHELL $srctree/scripts/gen_autoksyms.sh "$new_ksyms_file"
-
-# Extract changes between old and new list and touch corresponding
-# dependency files.
-changed=$(
-count=0
-sort "$cur_ksyms_file" "$new_ksyms_file" | uniq -u |
-sed -n 's/^#define __KSYM_\(.*\) 1/\1/p' | tr "A-Z_" "a-z/" |
-while read sympath; do
-	if [ -z "$sympath" ]; then continue; fi
-	depfile="include/ksym/${sympath}.h"
-	mkdir -p "$(dirname "$depfile")"
-	touch "$depfile"
-	# Filesystems with coarse time precision may create timestamps
-	# equal to the one from a file that was very recently built and that
-	# needs to be rebuild. Let's guard against that by making sure our
-	# dep files are always newer than the first file we created here.
-	while [ ! "$depfile" -nt "$new_ksyms_file" ]; do
-		touch "$depfile"
-	done
-	echo $((count += 1))
-done | tail -1 )
-changed=${changed:-0}
-
-if [ $changed -gt 0 ]; then
-	# Replace the old list with tne new one
-	old=$(grep -c "^#define __KSYM_" "$cur_ksyms_file" || true)
-	new=$(grep -c "^#define __KSYM_" "$new_ksyms_file" || true)
-	info "KSYMS" "symbols: before=$old, after=$new, changed=$changed"
-	info "UPD" "$cur_ksyms_file"
-	mv -f "$new_ksyms_file" "$cur_ksyms_file"
-	# Then trigger a rebuild of affected source files
-	exec $@
-else
-	rm -f "$new_ksyms_file"
-fi
diff --git a/scripts/gen_autoksyms.sh b/scripts/gen-keep-ksyms.sh
similarity index 78%
rename from scripts/gen_autoksyms.sh
rename to scripts/gen-keep-ksyms.sh
index b74d5949fea6..cedb18fac46b 100755
--- a/scripts/gen_autoksyms.sh
+++ b/scripts/gen-keep-ksyms.sh
@@ -1,13 +1,23 @@
 #!/bin/sh
 # SPDX-License-Identifier: GPL-2.0-only
 
-# Create an autoksyms.h header file from the list of all module's needed symbols
-# as recorded on the second line of *.mod files and the user-provided symbol
-# whitelist.
-
 set -e
 
-output_file="$1"
+modlist=$1
+
+emit ()
+{
+	local macro="$1"
+	local prefix="$2"
+	local syms="$3"
+
+	echo "#define $macro \\"
+	for s in $syms
+	do
+		echo "	KEEP(*($prefix$s)) \\"
+	done
+	echo
+}
 
 # Use "make V=1" to debug this script.
 case "$KBUILD_VERBOSE" in
@@ -49,15 +59,14 @@ fi
 
 # Generate a new ksym list file with symbols needed by the current
 # set of modules.
-cat > "$output_file" << EOT
+cat << EOT
 /*
  * Automatically generated file; DO NOT EDIT.
  */
 
 EOT
 
-[ -f modules.order ] && modlist=modules.order || modlist=/dev/null
-
+syms=$(
 {
 	sed 's/ko$/mod/' $modlist | xargs -n1 sed -n -e '2p'
 	echo "$needed_symbols"
@@ -67,4 +76,11 @@ EOT
 # point addresses.
 sed -e 's/^\.//' |
 sort -u |
-sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$output_file"
+sed -e 's/\(.*\)/\1/'
+)
+
+emit "KSYMTAB"		"___ksymtab+"		"$syms"
+emit "KSYMTAB_GPL"	"___ksymtab_gpl+"	"$syms"
+emit "KCRCTAB"		"___kcrctab+"		"$syms"
+emit "KCRCTAB_GPL"	"___kcrctab_gpl+"	"$syms"
+emit "KSYMTAB_STRINGS"	"__ksymtab_strings+"	"$syms"
diff --git a/scripts/gen_ksymdeps.sh b/scripts/gen_ksymdeps.sh
deleted file mode 100755
index 1324986e1362..000000000000
--- a/scripts/gen_ksymdeps.sh
+++ /dev/null
@@ -1,25 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: GPL-2.0
-
-set -e
-
-# List of exported symbols
-ksyms=$($NM $1 | sed -n 's/.*__ksym_marker_\(.*\)/\1/p' | tr A-Z a-z)
-
-if [ -z "$ksyms" ]; then
-	exit 0
-fi
-
-echo
-echo "ksymdeps_$1 := \\"
-
-for s in $ksyms
-do
-	echo $s | sed -e 's:^_*:    $(wildcard include/ksym/:' \
-			-e 's:__*:/:g' -e 's/$/.h) \\/'
-done
-
-echo
-echo "$1: \$(ksymdeps_$1)"
-echo
-echo "\$(ksymdeps_$1):"
diff --git a/scripts/module.lds.S b/scripts/module.lds.S
index 168cd27e6122..a6d2d96e29f0 100644
--- a/scripts/module.lds.S
+++ b/scripts/module.lds.S
@@ -3,16 +3,30 @@
  * Archs are free to supply their own linker scripts.  ld will
  * combine them automatically.
  */
-SECTIONS {
-	/DISCARD/ : {
-		*(.discard)
-		*(.discard.*)
-	}
 
-	__ksymtab		0 : { *(SORT(___ksymtab+*)) }
-	__ksymtab_gpl		0 : { *(SORT(___ksymtab_gpl+*)) }
-	__kcrctab		0 : { *(SORT(___kcrctab+*)) }
-	__kcrctab_gpl		0 : { *(SORT(___kcrctab_gpl+*)) }
+#if CONFIG_TRIM_UNUSED_KSYMS
+#include <generated/keep-ksyms.h>
+
+#define KSYM_DISCARDS	*(___ksymtab+*) \
+			*(___ksymtab_gpl+*) \
+			*(___kcrctab+*) \
+			*(___kcrctab_gpl+*) \
+			*(__ksymtab_strings+*)
+#else
+#define KSYMTAB		KEEP(*(SORT(___ksymtab+*)))
+#define KSYMTAB_GPL	KEEP(*(SORT(___ksymtab_gpl+*)))
+#define KCRCTAB		KEEP(*(SORT(___kcrctab+*)))
+#define KCRCTAB_GPL	KEEP(*(SORT(___kcrctab_gpl+*)))
+#define KSYMTAB_STRINGS		*(__ksymtab_strings+*)
+#define KSYM_DISCARDS
+#endif
+
+SECTIONS {
+	__ksymtab		0 : { KSYMTAB }
+	__ksymtab_gpl		0 : { KSYMTAB_GPL }
+	__kcrctab		0 : { KCRCTAB }
+	__kcrctab_gpl		0 : { KCRCTAB_GPL }
+	__ksymtab_strings	0 : { KSYMTAB_STRINGS }
 
 	.init_array		0 : ALIGN(8) { *(SORT(.init_array.*)) *(.init_array) }
 
@@ -41,6 +55,12 @@ SECTIONS {
 	}
 
 	.text : { *(.text .text.[0-9a-zA-Z_]*) }
+
+	/DISCARD/ : {
+		*(.discard)
+		*(.discard.*)
+		KSYM_DISCARDS
+	}
 }
 
 /* bring in arch-specific sections */
-- 
2.27.0



* Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS
  2021-02-25 16:02 [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS Masahiro Yamada
                   ` (3 preceding siblings ...)
  2021-02-25 16:02 ` [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass Masahiro Yamada
@ 2021-02-25 17:19 ` Nicolas Pitre
  2021-02-25 18:57   ` Masahiro Yamada
  4 siblings, 1 reply; 17+ messages in thread
From: Nicolas Pitre @ 2021-02-25 17:19 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linux-kbuild, Christoph Hellwig, Linus Torvalds, Jessica Yu,
	Sami Tolvanen, linux-kernel, linux-arch

On Fri, 26 Feb 2021, Masahiro Yamada wrote:

> 
> Now CONFIG_TRIM_UNUSED_KSYMS is revived, but Linus is still unhappy
> about the build speed.
> 
> I re-implemented this feature, and the build time cost is now
> almost unnoticeable level.
> 
> I hope this makes Linus happy.

:-)

I'm surprised to see that Linus is using this feature. When disabled 
(the default) this should have had no impact on the build time.

This feature provides a nice security advantage by significantly
reducing the kernel input surface. People are also using it to better
control what third-party vendors can and cannot do with a distro kernel,
etc. But that's not the reason why I implemented this feature in the
first place.

My primary goal was to efficiently reduce the kernel binary size using
LTO even with kernel modules enabled. Each EXPORT_SYMBOL() created a
symbol dependency that prevented LTO from optimizing out the related
code even though only a tiny fraction of those exported symbols were needed.

The idea behind the recursion was to catch those cases where disabling 
an exported symbol within a module would optimize out references to more 
exported symbols that, in turn, could be disabled and possibly trigger 
yet more code elimination. There is no way that can be achieved without 
extra compiler passes in a recursive manner.
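
To make the recursion concrete, here is a minimal sketch on a made-up
dependency table. It is not the real kbuild logic: the table, the "root"
owner (code that is always live), and the function names needed_once and
trim_fixed_point are all hypothetical. Each pass drops exports that no live
code references, which may in turn orphan further exports, so the loop
repeats until the list stops shrinking:

```shell
#!/bin/sh
# Hypothetical table of "<owner> <symbol the owner's code references>".
# "root" stands for code that is always live (e.g. the modules themselves).
refs='root a
a b
c d
d e'

# One pass: given the currently exported symbols ($1), list every symbol
# still referenced by live code (root plus the code behind each export).
needed_once() {
	exported=" root $1 "
	printf '%s\n' "$refs" | while read -r owner sym; do
		case "$exported" in
		*" $owner "*) echo "$sym" ;;
		esac
	done | sort -u | paste -s -d ' ' -
}

# Repeat the pass until the export list no longer changes.
trim_fixed_point() {
	cur="$1"
	while :; do
		next=$(needed_once "$cur")
		[ "$next" = "$cur" ] && break
		cur="$next"
	done
	echo "$cur"
}

trim_fixed_point "a b c d e"
```

In this toy table, the first pass can only drop "c"; "d" and "e" become
removable in later passes, which is why a single pass is not enough.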


Nicolas


* Re: [PATCH 1/4] kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO
  2021-02-25 16:02 ` [PATCH 1/4] kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO Masahiro Yamada
@ 2021-02-25 17:45   ` Sami Tolvanen
  2021-02-25 19:08     ` Masahiro Yamada
  0 siblings, 1 reply; 17+ messages in thread
From: Sami Tolvanen @ 2021-02-25 17:45 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linux-kbuild, Christoph Hellwig, Linus Torvalds, Jessica Yu,
	Nicolas Pitre, LKML, linux-arch, Arnd Bergmann

Hi Masahiro,

On Thu, Feb 25, 2021 at 8:03 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> Commit fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
> does not work as expected if the .config file has already specified
> CONFIG_UNUSED_KSYMS_WHITELIST="my/own/white/list" before enabling
> CONFIG_LTO_CLANG.
>
> So, the user-supplied whitelist and LTO-specific whitelist must be
> independent of each other.
>
> I refactored the shell script so CONFIG_MODVERSIONS and CONFIG_LTO_CLANG
> handle whitelists in the same way.
>
> Fixes: fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> ---
>
>  init/Kconfig                    |  1 -
>  scripts/gen_autoksyms.sh        | 33 ++++++++++++++++++++++++---------
>  scripts/lto-used-symbollist.txt |  5 -----
>  3 files changed, 24 insertions(+), 15 deletions(-)
>  delete mode 100644 scripts/lto-used-symbollist.txt

> +
> +ksym_wl=
>  if [ -n "$CONFIG_UNUSED_KSYMS_WHITELIST" ]; then
>         # Use 'eval' to expand the whitelist path and check if it is relative
>         eval ksym_wl="$CONFIG_UNUSED_KSYMS_WHITELIST"
> @@ -40,16 +57,14 @@ cat > "$output_file" << EOT
>  EOT
>
>  [ -f modules.order ] && modlist=modules.order || modlist=/dev/null
> -sed 's/ko$/mod/' $modlist |
> -xargs -n1 sed -n -e '2{s/ /\n/g;/^$/!p;}' -- |
> -cat - "$ksym_wl" |
> +
> +{
> +       sed 's/ko$/mod/' $modlist | xargs -n1 sed -n -e '2p'
> +       echo "$needed_symbols"
> +       [ -n "$ksym_wl" ] && cat "$ksym_wl"
> +} | sed -e 's/ /\n/g' | sed -n -e '/^$/!p' |
>  # Remove the dot prefix for ppc64; symbol names with a dot (.) hold entry
>  # point addresses.
>  sed -e 's/^\.//' |
>  sort -u |
>  sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$output_file"
> -
> -# Special case for modversions (see modpost.c)
> -if [ -n "$CONFIG_MODVERSIONS" ]; then
> -       echo "#define __KSYM_module_layout 1" >> "$output_file"
> -fi
> diff --git a/scripts/lto-used-symbollist.txt b/scripts/lto-used-symbollist.txt
> deleted file mode 100644
> index 38e7bb9ebaae..000000000000
> --- a/scripts/lto-used-symbollist.txt
> +++ /dev/null
> @@ -1,5 +0,0 @@
> -memcpy
> -memmove
> -memset
> -__stack_chk_fail
> -__stack_chk_guard
> --
> 2.27.0
>
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 0bf5b340b80e..351161326e3c 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -2277,7 +2277,6 @@ config TRIM_UNUSED_KSYMS
>  config UNUSED_KSYMS_WHITELIST
>         string "Whitelist of symbols to keep in ksymtab"
>         depends on TRIM_UNUSED_KSYMS
> -       default "scripts/lto-used-symbollist.txt" if LTO_CLANG
>         help
>           By default, all unused exported symbols will be un-exported from the
>           build when TRIM_UNUSED_KSYMS is selected.
> diff --git a/scripts/gen_autoksyms.sh b/scripts/gen_autoksyms.sh
> index d54dfba15bf2..b74d5949fea6 100755
> --- a/scripts/gen_autoksyms.sh
> +++ b/scripts/gen_autoksyms.sh
> @@ -19,7 +19,24 @@ esac
>  # We need access to CONFIG_ symbols
>  . include/config/auto.conf
>
> -ksym_wl=/dev/null
> +needed_symbols=
> +
> +# Special case for modversions (see modpost.c)
> +if [ -n "$CONFIG_MODVERSIONS" ]; then
> +       needed_symbols="$needed_symbols module_layout"
> +fi
> +
> +# With CONFIG_LTO_CLANG, LLVM bitcode has not yet been compiled into a binary
> +# when the .mod files are generated, which means they don't yet contain
> +# references to certain symbols that will be present in the final binaries.
> +if [ -n "$CONFIG_LTO_CLANG" ]; then
> +       # intrinsic functions
> +       needed_symbols="$needed_symbols memcpy memmove memset"
> +       # stack protector symbols
> +       needed_symbols="$needed_symbols __stack_chk_fail __stack_chk_guard"
> +fi

Thank you for the patch!

Arnd just reported that _mcount is also needed with some
configurations. Would you mind including that in the next version?

https://lore.kernel.org/r/20210225143456.3829513-1-arnd@kernel.org/
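
A minimal sketch of what that could look like, written as a standalone
function so it can be tried outside the build. The function name
list_needed_symbols and its two positional arguments are hypothetical, and
adding _mcount unconditionally under CONFIG_LTO_CLANG is an assumption;
the exact condition is not stated in this thread:

```shell
#!/bin/sh
# Hypothetical helper: $1 is the value of CONFIG_MODVERSIONS and $2 the
# value of CONFIG_LTO_CLANG (an empty string means "disabled").
list_needed_symbols() {
	needed=

	# Special case for modversions (see modpost.c)
	if [ -n "$1" ]; then
		needed="$needed module_layout"
	fi

	if [ -n "$2" ]; then
		# intrinsic functions
		needed="$needed memcpy memmove memset"
		# stack protector symbols
		needed="$needed __stack_chk_fail __stack_chk_guard"
		# Assumption: _mcount is kept whenever LTO is enabled.
		needed="$needed _mcount"
	fi

	# Unquoted on purpose: word splitting drops the leading space.
	echo $needed
}

list_needed_symbols y y
```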

Sami


* Re: [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-02-25 16:02 ` [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass Masahiro Yamada
@ 2021-02-25 18:46   ` kernel test robot
  2021-02-25 20:06     ` Masahiro Yamada
  2021-02-25 18:56   ` kernel test robot
  1 sibling, 1 reply; 17+ messages in thread
From: kernel test robot @ 2021-02-25 18:46 UTC (permalink / raw)
  To: Masahiro Yamada, linux-kbuild
  Cc: kbuild-all, Christoph Hellwig, Jessica Yu, Nicolas Pitre,
	Sami Tolvanen, linux-kernel, linux-arch, Masahiro Yamada

[-- Attachment #1: Type: text/plain, Size: 2134 bytes --]

Hi Masahiro,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on next-20210225]
[cannot apply to kbuild/for-next asm-generic/master arm64/for-next/core m68k/for-next openrisc/for-next hp-parisc/for-next arc/for-next uclinux-h8/h8300-next nios2/for-linus v5.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 29c395c77a9a514c5857c45ceae2665e9bd99ac7
config: powerpc-mpc8313_rdb_defconfig (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/014940331790a8cd9bee92c7201494ec3217201e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
        git checkout 014940331790a8cd9bee92c7201494ec3217201e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> scripts/module.lds.S:7:5: warning: "CONFIG_TRIM_UNUSED_KSYMS" is not defined, evaluates to 0 [-Wundef]
       7 | #if CONFIG_TRIM_UNUSED_KSYMS
         |     ^~~~~~~~~~~~~~~~~~~~~~~~


vim +/CONFIG_TRIM_UNUSED_KSYMS +7 scripts/module.lds.S

   > 7	#if CONFIG_TRIM_UNUSED_KSYMS
     8	#include <generated/keep-ksyms.h>
     9	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 19255 bytes --]


* Re: [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-02-25 16:02 ` [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass Masahiro Yamada
  2021-02-25 18:46   ` kernel test robot
@ 2021-02-25 18:56   ` kernel test robot
  1 sibling, 0 replies; 17+ messages in thread
From: kernel test robot @ 2021-02-25 18:56 UTC (permalink / raw)
  To: Masahiro Yamada, linux-kbuild
  Cc: kbuild-all, Christoph Hellwig, Jessica Yu, Nicolas Pitre,
	Sami Tolvanen, linux-kernel, linux-arch, Masahiro Yamada

[-- Attachment #1: Type: text/plain, Size: 2220 bytes --]

Hi Masahiro,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on next-20210225]
[cannot apply to kbuild/for-next asm-generic/master arm64/for-next/core m68k/for-next openrisc/for-next hp-parisc/for-next arc/for-next uclinux-h8/h8300-next nios2/for-linus v5.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 29c395c77a9a514c5857c45ceae2665e9bd99ac7
config: arc-randconfig-r031-20210225 (attached as .config)
compiler: arc-elf-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/014940331790a8cd9bee92c7201494ec3217201e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
        git checkout 014940331790a8cd9bee92c7201494ec3217201e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from drivers/of/unittest-data/testcases.dtb.S:1:
>> include/asm-generic/vmlinux.lds.h:53:5: warning: "CONFIG_TRIM_UNUSED_KSYMS" is not defined, evaluates to 0 [-Wundef]
      53 | #if CONFIG_TRIM_UNUSED_KSYMS
         |     ^~~~~~~~~~~~~~~~~~~~~~~~


vim +/CONFIG_TRIM_UNUSED_KSYMS +53 include/asm-generic/vmlinux.lds.h

  > 53	#if CONFIG_TRIM_UNUSED_KSYMS
    54	#include <generated/keep-ksyms.h>
    55	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25862 bytes --]


* Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS
  2021-02-25 17:19 ` [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS Nicolas Pitre
@ 2021-02-25 18:57   ` Masahiro Yamada
  2021-02-25 19:24     ` Nicolas Pitre
  0 siblings, 1 reply; 17+ messages in thread
From: Masahiro Yamada @ 2021-02-25 18:57 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Linux Kbuild mailing list, Christoph Hellwig, Linus Torvalds,
	Jessica Yu, Sami Tolvanen, Linux Kernel Mailing List, linux-arch

On Fri, Feb 26, 2021 at 2:20 AM Nicolas Pitre <nico@fluxnic.net> wrote:
>
> On Fri, 26 Feb 2021, Masahiro Yamada wrote:
>
> >
> > Now CONFIG_TRIM_UNUSED_KSYMS is revived, but Linus is still unhappy
> > about the build speed.
> >
> > I re-implemented this feature, and the build time cost is now
> > almost unnoticeable level.
> >
> > I hope this makes Linus happy.
>
> :-)
>
> I'm surprised to see that Linus is using this feature. When disabled
> (the default) this should have had no impact on the build time.

Linus does not use this feature himself, but he does run build tests.
After he pulled the module subsystem pull request in this merge window,
allmodconfig started to enable CONFIG_TRIM_UNUSED_KSYMS.


> This feature provides a nice security advantage by significantly
> reducing the kernel input surface. People are also using it to better
> control what third-party vendors can and cannot do with a distro kernel,
> etc. But that's not the reason why I implemented this feature in the
> first place.
>
> My primary goal was to efficiently reduce the kernel binary size using
> LTO even with kernel modules enabled.


Clang LTO landed in this merge window.

Do you think it will reduce the kernel binary size?
No, quite the opposite.

CONFIG_LTO_CLANG cannot trim any code, even code that
is obviously unused.
Hence, it never reduces the kernel binary size.
Rather, it produces a bigger kernel.

The reason is that Clang LTO was implemented against
relocatable ELF (vmlinux.o).

I pointed out this flaw in the review process, but
it was dismissed.

This is the main reason why I did not give any Ack
(but it was merged via Kees Cook's tree).


So, the help text of this option should be revised:

          This option allows for unused exported symbols to be dropped from
          the build. In turn, this provides the compiler more opportunities
          (especially when using LTO) for optimizing the code and reducing
          binary size.  This might have some security advantages as well.

Clang LTO does the opposite of what you expect.



> Each EXPORT_SYMBOL() created a
> symbol dependency that prevented LTO from optimizing out the related
> code even though only a tiny fraction of those exported symbols were needed.
>
> The idea behind the recursion was to catch those cases where disabling
> an exported symbol within a module would optimize out references to more
> exported symbols that, in turn, could be disabled and possibly trigger
> yet more code elimination. There is no way that can be achieved without
> extra compiler passes in a recursive manner.

I do not understand.

Modules are relocatable ELF.
Clang LTO cannot eliminate any code.
GCC LTO does not work with relocatable ELF
in the first place.


Are you describing how this would work in an ideal world?
Even so, I do not see how LTO could eliminate dead code
from relocatable ELF.




- Current implementation

  Clang LTO works against vmlinux.o,
  so it is completely useless for the purpose of
  eliminating dead code.

  So, this case does not matter.
  TRIM_UNUSED_KSYMS removes only the metadata of EXPORT_SYMBOL(),
  and no further optimization happens anyway.


- What if Clang LTO had been implemented in the final link?
   (this means LTO runs 3 times if KALLSYMS_ALL is enabled)

  With a proper linker script that uses /DISCARD/,
  the metadata of EXPORT_SYMBOL() would be dropped,
  and LTO should be able to do further dead code elimination.
  So, I guess we would not need to turn EXPORT_SYMBOL() into a no-op via CPP
  (unless I am missing something).
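
For reference, the per-symbol KEEP() list that such a linker script would
consume can be sketched as below. The emit helper mirrors the one in
scripts/gen-keep-ksyms.sh from patch 4/4, rewritten with printf so that the
trailing backslashes are portable across shells; the symbol names printk and
kfree are only sample input:

```shell
#!/bin/sh
# Print a CPP macro ($1) that expands to one KEEP() entry per symbol in
# the space-separated list $3, each matching section "<prefix><symbol>"
# where the prefix is $2.
emit() {
	macro="$1"
	prefix="$2"
	syms="$3"

	printf '#define %s \\\n' "$macro"
	for s in $syms
	do
		printf '\tKEEP(*(%s%s)) \\\n' "$prefix" "$s"
	done
	# Blank line terminates the backslash-continued macro body.
	echo
}

emit "KSYMTAB" "___ksymtab+" "printk kfree"
```

Any ___ksymtab+<sym> input section that is not named in the generated macro
then falls through to the wildcard patterns listed under /DISCARD/.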






--
Best Regards
Masahiro Yamada


* Re: [PATCH 1/4] kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO
  2021-02-25 17:45   ` Sami Tolvanen
@ 2021-02-25 19:08     ` Masahiro Yamada
  0 siblings, 0 replies; 17+ messages in thread
From: Masahiro Yamada @ 2021-02-25 19:08 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: linux-kbuild, Christoph Hellwig, Linus Torvalds, Jessica Yu,
	Nicolas Pitre, LKML, linux-arch, Arnd Bergmann

On Fri, Feb 26, 2021 at 2:46 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> Hi Masahiro,
>
> On Thu, Feb 25, 2021 at 8:03 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> >
> > Commit fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
> > does not work as expected if the .config file has already specified
> > CONFIG_UNUSED_KSYMS_WHITELIST="my/own/white/list" before enabling
> > CONFIG_LTO_CLANG.
> >
> > So, the user-supplied whitelist and LTO-specific whitelist must be
> > independent of each other.
> >
> > I refactored the shell script so CONFIG_MODVERSIONS and CONFIG_LTO_CLANG
> > handle whitelists in the same way.
> >
> > Fixes: fbe078d397b4 ("kbuild: lto: add a default list of used symbols")
> > Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> > ---
> >
> >  init/Kconfig                    |  1 -
> >  scripts/gen_autoksyms.sh        | 33 ++++++++++++++++++++++++---------
> >  scripts/lto-used-symbollist.txt |  5 -----
> >  3 files changed, 24 insertions(+), 15 deletions(-)
> >  delete mode 100644 scripts/lto-used-symbollist.txt
>
> > +
> > +ksym_wl=
> >  if [ -n "$CONFIG_UNUSED_KSYMS_WHITELIST" ]; then
> >         # Use 'eval' to expand the whitelist path and check if it is relative
> >         eval ksym_wl="$CONFIG_UNUSED_KSYMS_WHITELIST"
> > @@ -40,16 +57,14 @@ cat > "$output_file" << EOT
> >  EOT
> >
> >  [ -f modules.order ] && modlist=modules.order || modlist=/dev/null
> > -sed 's/ko$/mod/' $modlist |
> > -xargs -n1 sed -n -e '2{s/ /\n/g;/^$/!p;}' -- |
> > -cat - "$ksym_wl" |
> > +
> > +{
> > +       sed 's/ko$/mod/' $modlist | xargs -n1 sed -n -e '2p'
> > +       echo "$needed_symbols"
> > +       [ -n "$ksym_wl" ] && cat "$ksym_wl"
> > +} | sed -e 's/ /\n/g' | sed -n -e '/^$/!p' |
> >  # Remove the dot prefix for ppc64; symbol names with a dot (.) hold entry
> >  # point addresses.
> >  sed -e 's/^\.//' |
> >  sort -u |
> >  sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$output_file"
> > -
> > -# Special case for modversions (see modpost.c)
> > -if [ -n "$CONFIG_MODVERSIONS" ]; then
> > -       echo "#define __KSYM_module_layout 1" >> "$output_file"
> > -fi
> > diff --git a/scripts/lto-used-symbollist.txt b/scripts/lto-used-symbollist.txt
> > deleted file mode 100644
> > index 38e7bb9ebaae..000000000000
> > --- a/scripts/lto-used-symbollist.txt
> > +++ /dev/null
> > @@ -1,5 +0,0 @@
> > -memcpy
> > -memmove
> > -memset
> > -__stack_chk_fail
> > -__stack_chk_guard
> > --
> > 2.27.0
> >
> >
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 0bf5b340b80e..351161326e3c 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -2277,7 +2277,6 @@ config TRIM_UNUSED_KSYMS
> >  config UNUSED_KSYMS_WHITELIST
> >         string "Whitelist of symbols to keep in ksymtab"
> >         depends on TRIM_UNUSED_KSYMS
> > -       default "scripts/lto-used-symbollist.txt" if LTO_CLANG
> >         help
> >           By default, all unused exported symbols will be un-exported from the
> >           build when TRIM_UNUSED_KSYMS is selected.
> > diff --git a/scripts/gen_autoksyms.sh b/scripts/gen_autoksyms.sh
> > index d54dfba15bf2..b74d5949fea6 100755
> > --- a/scripts/gen_autoksyms.sh
> > +++ b/scripts/gen_autoksyms.sh
> > @@ -19,7 +19,24 @@ esac
> >  # We need access to CONFIG_ symbols
> >  . include/config/auto.conf
> >
> > -ksym_wl=/dev/null
> > +needed_symbols=
> > +
> > +# Special case for modversions (see modpost.c)
> > +if [ -n "$CONFIG_MODVERSIONS" ]; then
> > +       needed_symbols="$needed_symbols module_layout"
> > +fi
> > +
> > +# With CONFIG_LTO_CLANG, LLVM bitcode has not yet been compiled into a binary
> > +# when the .mod files are generated, which means they don't yet contain
> > +# references to certain symbols that will be present in the final binaries.
> > +if [ -n "$CONFIG_LTO_CLANG" ]; then
> > +       # intrinsic functions
> > +       needed_symbols="$needed_symbols memcpy memmove memset"
> > +       # stack protector symbols
> > +       needed_symbols="$needed_symbols __stack_chk_fail __stack_chk_guard"
> > +fi
>
> Thank you for the patch!
>
> Arnd just reported that _mcount is also needed with some
> configurations. Would you mind including that in the next version?
>
> https://lore.kernel.org/r/20210225143456.3829513-1-arnd@kernel.org/

Sure, I can even pick it up
although that patch was not addressed to me or kbuild ML.
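
For readers following the quoted hunk, the symbol-list pipeline can be
tried outside the kernel tree. This is only a sketch: the sample symbols
and the temporary whitelist file are made up for illustration, and
tr(1) stands in for the hunk's sed 's/ /\n/g' so that it does not rely
on GNU sed.

```shell
#!/bin/sh
# Sketch of the symbol-list pipeline from the quoted gen_autoksyms.sh hunk.
# All inputs below are illustrative, not taken from a real build.

needed_symbols=
needed_symbols="$needed_symbols module_layout"
needed_symbols="$needed_symbols memcpy memmove memset"

# Hypothetical whitelist: one duplicate and one ppc64-style dot symbol.
ksym_wl=$(mktemp)
printf '%s\n' 'memcpy' '.foo_entry' > "$ksym_wl"

gen_ksym_defines() {
	{
		echo "$needed_symbols"
		[ -n "$ksym_wl" ] && cat "$ksym_wl"
	} |
	# One symbol per line.
	tr ' ' '\n' |
	# Drop empty lines.
	sed -n -e '/^$/!p' |
	# Remove the dot prefix used on ppc64.
	sed -e 's/^\.//' |
	# Deduplicate; the C locale keeps the order stable.
	LC_ALL=C sort -u |
	sed -e 's/\(.*\)/#define __KSYM_\1 1/'
}

ksym_defines=$(gen_ksym_defines)
echo "$ksym_defines"
rm -f "$ksym_wl"
```

Each surviving symbol becomes one "#define __KSYM_<name> 1" line, and
the duplicate memcpy entry collapses into a single define.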



-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS
  2021-02-25 18:57   ` Masahiro Yamada
@ 2021-02-25 19:24     ` Nicolas Pitre
  2021-03-09  7:28       ` Masahiro Yamada
  0 siblings, 1 reply; 17+ messages in thread
From: Nicolas Pitre @ 2021-02-25 19:24 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Linux Kbuild mailing list, Christoph Hellwig, Linus Torvalds,
	Jessica Yu, Sami Tolvanen, Linux Kernel Mailing List, linux-arch

On Fri, 26 Feb 2021, Masahiro Yamada wrote:

> On Fri, Feb 26, 2021 at 2:20 AM Nicolas Pitre <nico@fluxnic.net> wrote:
> >
> > On Fri, 26 Feb 2021, Masahiro Yamada wrote:
> >
> > >
> > > Now CONFIG_TRIM_UNUSED_KSYMS is revived, but Linus is still unhappy
> > > about the build speed.
> > >
> > > I re-implemented this feature, and the build time cost is now
> > > almost unnoticeable level.
> > >
> > > I hope this makes Linus happy.
> >
> > :-)
> >
> > I'm surprised to see that Linus is using this feature. When disabled
> > (the default) this should have had no impact on the build time.
> 
> Linus is not using this feature, but does build tests.
> After pulling the module subsystem pull request in this merge window,
> CONFIG_TRIM_UNUSED_KSYMS was enabled by allmodconfig.

If CONFIG_TRIM_UNUSED_KSYMS is enabled then build time will increase.
That comes with the feature.

> > This feature provides a nice security advantage by significantly
> > reducing the kernel input surface. And people are using that also to
> > better control what third-party vendors can and cannot do with a distro kernel,
> > etc. But that's not the reason why I implemented this feature in the
> > first place.
> >
> > My primary goal was to efficiently reduce the kernel binary size using
> > LTO even with kernel modules enabled.
> 
> 
> Clang LTO landed in this MW.
> 
> Do you think it will reduce the kernel binary size?
> No, opposite.

LTO ought to reduce binary size. It is rather broken otherwise.
Having a global view before optimizing allows for the compiler to do 
project wide constant propagation and dead code elimination.

> CONFIG_LTO_CLANG cannot trim any code even if it
> is obviously unused.
> Hence, it never reduces the kernel binary size.
> Rather, it produces a bigger kernel.

Then what's the point?

> The reason is Clang LTO was implemented against
> relocatable ELF (vmlinux.o) .

That's not true LTO then.

> I pointed out this flaw in the review process, but
> it was dismissed.
> 
> This is the main reason why I did not give any Ack
> (but it was merged via Kees Cook's tree).

> So, the help text of this option should be revised:
> 
>           This option allows for unused exported symbols to be dropped from
>           the build. In turn, this provides the compiler more opportunities
>           (especially when using LTO) for optimizing the code and reducing
>           binary size.  This might have some security advantages as well.
> 
> Clang LTO is opposite to your expectation.

Then Clang LTO is a misnomer. That is the option to revise not this one.

> > Each EXPORT_SYMBOL() created a
> > symbol dependency that prevented LTO from optimizing out the related
> > code even though a tiny fraction of those exported symbols were needed.
> >
> > The idea behind the recursion was to catch those cases where disabling
> > an exported symbol within a module would optimize out references to more
> > exported symbols that, in turn, could be disabled and possibly trigger
> > yet more code elimination. There is no way that can be achieved without
> > extra compiler passes in a recursive manner.
> 
> I do not understand.
> 
> Modules are relocatable ELF.
> Clang LTO cannot eliminate any code.
> GCC LTO does not work with relocatable ELF
> in the first place.

I don't think I follow you here. What does relocatable ELF have to do with LTO?

I've successfully used gcc LTO on the kernel quite a while ago.

For a reference about binary size reduction with LTO and 
CONFIG_TRIM_UNUSED_KSYMS please read this article:

https://lwn.net/Articles/746780/


Nicolas

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-02-25 18:46   ` kernel test robot
@ 2021-02-25 20:06     ` Masahiro Yamada
  2021-02-25 21:20       ` Sami Tolvanen
  0 siblings, 1 reply; 17+ messages in thread
From: Masahiro Yamada @ 2021-02-25 20:06 UTC (permalink / raw)
  To: kernel test robot
  Cc: Linux Kbuild mailing list, kbuild-all, Christoph Hellwig,
	Jessica Yu, Nicolas Pitre, Sami Tolvanen,
	Linux Kernel Mailing List, linux-arch

On Fri, Feb 26, 2021 at 3:47 AM kernel test robot <lkp@intel.com> wrote:
>
> Hi Masahiro,
>
> I love your patch! Perhaps something to improve:
>
> [auto build test WARNING on linus/master]
> [also build test WARNING on next-20210225]
> [cannot apply to kbuild/for-next asm-generic/master arm64/for-next/core m68k/for-next openrisc/for-next hp-parisc/for-next arc/for-next uclinux-h8/h8300-next nios2/for-linus v5.11]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url:    https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 29c395c77a9a514c5857c45ceae2665e9bd99ac7
> config: powerpc-mpc8313_rdb_defconfig (attached as .config)
> compiler: powerpc-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # https://github.com/0day-ci/linux/commit/014940331790a8cd9bee92c7201494ec3217201e
>         git remote add linux-review https://github.com/0day-ci/linux
>         git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
>         git checkout 014940331790a8cd9bee92c7201494ec3217201e
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All warnings (new ones prefixed by >>):
>
> >> scripts/module.lds.S:7:5: warning: "CONFIG_TRIM_UNUSED_KSYMS" is not defined, evaluates to 0 [-Wundef]

Thanks. This should be #ifdef, of course.
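
The difference can be checked with the preprocessor alone. Kconfig
leaves a disabled bool option undefined instead of defining it to 0, so
"#if CONFIG_FOO" trips -Wundef while "#ifdef CONFIG_FOO" does not. The
sketch below assumes that "cc" is GCC or Clang; the helper name
check_guard is made up for illustration.

```shell
#!/bin/sh
# Compare '#if' and '#ifdef' on a macro that is not defined.
# Assumption: 'cc' accepts -E, -x c and -Werror=undef (GCC and Clang do).

# check_guard DIRECTIVE: preprocess a tiny snippet guarded by DIRECTIVE.
# Succeeds only if no undefined-macro diagnostic is raised.
check_guard() {
	printf '%s\nint guarded;\n#endif\n' "$1" |
		cc -E -Werror=undef -x c - >/dev/null 2>&1
}

if command -v cc >/dev/null 2>&1; then
	if check_guard '#if CONFIG_TRIM_UNUSED_KSYMS'; then
		if_status=clean
	else
		if_status=undef
	fi

	if check_guard '#ifdef CONFIG_TRIM_UNUSED_KSYMS'; then
		ifdef_status=clean
	else
		ifdef_status=undef
	fi

	echo "#if: $if_status, #ifdef: $ifdef_status"
else
	echo "no C compiler found, skipping"
fi
```

-Werror=undef is used here only to turn the warning into an exit status
that the script can test. The W=1 build in the report emits it as a
plain warning.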



>        7 | #if CONFIG_TRIM_UNUSED_KSYMS
>          |     ^~~~~~~~~~~~~~~~~~~~~~~~
>
>
> vim +/CONFIG_TRIM_UNUSED_KSYMS +7 scripts/module.lds.S
>
>    > 7  #if CONFIG_TRIM_UNUSED_KSYMS
>      8  #include <generated/keep-ksyms.h>
>      9
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org



-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-02-25 20:06     ` Masahiro Yamada
@ 2021-02-25 21:20       ` Sami Tolvanen
  2021-02-26  7:04         ` Masahiro Yamada
  0 siblings, 1 reply; 17+ messages in thread
From: Sami Tolvanen @ 2021-02-25 21:20 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: kernel test robot, Linux Kbuild mailing list, kbuild-all,
	Christoph Hellwig, Jessica Yu, Nicolas Pitre,
	Linux Kernel Mailing List, linux-arch

Hi Masahiro,

On Thu, Feb 25, 2021 at 12:07 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> On Fri, Feb 26, 2021 at 3:47 AM kernel test robot <lkp@intel.com> wrote:
> >
> > Hi Masahiro,
> >
> > I love your patch! Perhaps something to improve:
> >
> > [auto build test WARNING on linus/master]
> > [also build test WARNING on next-20210225]
> > [cannot apply to kbuild/for-next asm-generic/master arm64/for-next/core m68k/for-next openrisc/for-next hp-parisc/for-next arc/for-next uclinux-h8/h8300-next nios2/for-linus v5.11]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch]
> >
> > url:    https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 29c395c77a9a514c5857c45ceae2665e9bd99ac7
> > config: powerpc-mpc8313_rdb_defconfig (attached as .config)
> > compiler: powerpc-linux-gcc (GCC) 9.3.0
> > reproduce (this is a W=1 build):
> >         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> >         chmod +x ~/bin/make.cross
> >         # https://github.com/0day-ci/linux/commit/014940331790a8cd9bee92c7201494ec3217201e
> >         git remote add linux-review https://github.com/0day-ci/linux
> >         git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
> >         git checkout 014940331790a8cd9bee92c7201494ec3217201e
> >         # save the attached .config to linux build tree
> >         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot <lkp@intel.com>
> >
> > All warnings (new ones prefixed by >>):
> >
> > >> scripts/module.lds.S:7:5: warning: "CONFIG_TRIM_UNUSED_KSYMS" is not defined, evaluates to 0 [-Wundef]
>
> Thanks. This should be #ifdef, of course.

I applied this series and changed these from #if to #ifdef, but I
still see the following build error with TRIM_UNUSED_KSYMS +
OF_UNITTEST:

In file included from drivers/of/unittest-data/testcases.dtb.S:1:
../include/asm-generic/vmlinux.lds.h:54:10: fatal error:
'generated/keep-ksyms.h' file not found
#include <generated/keep-ksyms.h>
         ^~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

This is with x86_64_defconfig and scripts/config -e OF -e OF_UNITTEST
-e TRIM_UNUSED_KSYMS.

Sami

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-02-25 21:20       ` Sami Tolvanen
@ 2021-02-26  7:04         ` Masahiro Yamada
  0 siblings, 0 replies; 17+ messages in thread
From: Masahiro Yamada @ 2021-02-26  7:04 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: kernel test robot, Linux Kbuild mailing list, kbuild-all,
	Christoph Hellwig, Jessica Yu, Nicolas Pitre,
	Linux Kernel Mailing List, linux-arch

On Fri, Feb 26, 2021 at 6:20 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> Hi Masahiro,
>
> On Thu, Feb 25, 2021 at 12:07 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
> >
> > On Fri, Feb 26, 2021 at 3:47 AM kernel test robot <lkp@intel.com> wrote:
> > >
> > > Hi Masahiro,
> > >
> > > I love your patch! Perhaps something to improve:
> > >
> > > [auto build test WARNING on linus/master]
> > > [also build test WARNING on next-20210225]
> > > [cannot apply to kbuild/for-next asm-generic/master arm64/for-next/core m68k/for-next openrisc/for-next hp-parisc/for-next arc/for-next uclinux-h8/h8300-next nios2/for-linus v5.11]
> > > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > > And when submitting patch, we suggest to use '--base' as documented in
> > > https://git-scm.com/docs/git-format-patch]
> > >
> > > url:    https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
> > > base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 29c395c77a9a514c5857c45ceae2665e9bd99ac7
> > > config: powerpc-mpc8313_rdb_defconfig (attached as .config)
> > > compiler: powerpc-linux-gcc (GCC) 9.3.0
> > > reproduce (this is a W=1 build):
> > >         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > >         chmod +x ~/bin/make.cross
> > >         # https://github.com/0day-ci/linux/commit/014940331790a8cd9bee92c7201494ec3217201e
> > >         git remote add linux-review https://github.com/0day-ci/linux
> > >         git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvment-of-CONFIG_TRIM_UNUSED_KSYMS/20210226-000929
> > >         git checkout 014940331790a8cd9bee92c7201494ec3217201e
> > >         # save the attached .config to linux build tree
> > >         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc
> > >
> > > If you fix the issue, kindly add following tag as appropriate
> > > Reported-by: kernel test robot <lkp@intel.com>
> > >
> > > All warnings (new ones prefixed by >>):
> > >
> > > >> scripts/module.lds.S:7:5: warning: "CONFIG_TRIM_UNUSED_KSYMS" is not defined, evaluates to 0 [-Wundef]
> >
> > Thanks. This should be #ifdef, of course.
>
> I applied this series and changed these from #if to #ifdef, but I
> still see the following build error with TRIM_UNUSED_KSYMS +
> OF_UNITTEST:
>
> In file included from drivers/of/unittest-data/testcases.dtb.S:1:
> ../include/asm-generic/vmlinux.lds.h:54:10: fatal error:
> 'generated/keep-ksyms.h' file not found
> #include <generated/keep-ksyms.h>
>          ^~~~~~~~~~~~~~~~~~~~~~~~
> 1 error generated.
>
> This is with x86_64_defconfig and scripts/config -e OF -e OF_UNITTEST
> -e TRIM_UNUSED_KSYMS.
>
> Sami

Thanks. I will fix it.
I will come back with v2
probably after v5.12-rc1 is tagged.





-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS
  2021-02-25 19:24     ` Nicolas Pitre
@ 2021-03-09  7:28       ` Masahiro Yamada
  2021-03-09 16:49         ` Nicolas Pitre
  0 siblings, 1 reply; 17+ messages in thread
From: Masahiro Yamada @ 2021-03-09  7:28 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Linux Kbuild mailing list, Christoph Hellwig, Linus Torvalds,
	Jessica Yu, Sami Tolvanen, Linux Kernel Mailing List, linux-arch

On Fri, Feb 26, 2021 at 4:24 AM Nicolas Pitre <nico@fluxnic.net> wrote:
>
> On Fri, 26 Feb 2021, Masahiro Yamada wrote:
>
> > On Fri, Feb 26, 2021 at 2:20 AM Nicolas Pitre <nico@fluxnic.net> wrote:
> > >
> > > On Fri, 26 Feb 2021, Masahiro Yamada wrote:
> > >
> > > >
> > > > Now CONFIG_TRIM_UNUSED_KSYMS is revived, but Linus is still unhappy
> > > > about the build speed.
> > > >
> > > > I re-implemented this feature, and the build time cost is now
> > > > almost unnoticeable level.
> > > >
> > > > I hope this makes Linus happy.
> > >
> > > :-)
> > >
> > > I'm surprised to see that Linus is using this feature. When disabled
> > > (the default) this should have had no impact on the build time.
> >
> > Linus is not using this feature, but does build tests.
> > After pulling the module subsystem pull request in this merge window,
> > CONFIG_TRIM_UNUSED_KSYMS was enabled by allmodconfig.
>
> If CONFIG_TRIM_UNUSED_KSYMS is enabled then build time will increase.
> That comes with the feature.


This patch set intends to change this.
TRIM_UNUSED_KSYMS will build without additional cost,
like LD_DEAD_CODE_DATA_ELIMINATION.



>
> > > This feature provides a nice security advantage by significantly
> > > reducing the kernel input surface. And people are using that also to
> > > better control what third-party vendors can and cannot do with a distro kernel,
> > > etc. But that's not the reason why I implemented this feature in the
> > > first place.
> > >
> > > My primary goal was to efficiently reduce the kernel binary size using
> > > LTO even with kernel modules enabled.
> >
> >
> > Clang LTO landed in this MW.
> >
> > Do you think it will reduce the kernel binary size?
> > No, opposite.
>
> LTO ought to reduce binary size. It is rather broken otherwise.
> Having a global view before optimizing allows for the compiler to do
> project wide constant propagation and dead code elimination.
>
> > CONFIG_LTO_CLANG cannot trim any code even if it
> > is obviously unused.
> > Hence, it never reduces the kernel binary size.
> > Rather, it produces a bigger kernel.
>
> Then what's the point?


Presumably, reducing the size is not
the main interest for Googlers.


>
> > The reason is Clang LTO was implemented against
> > relocatable ELF (vmlinux.o) .
>
> That's not true LTO then.


This is the same as what I said in the review process.
:-)

https://lore.kernel.org/linux-kbuild/CAK7LNASQPOGohtUyzBM6n54pzpLN35kDXC7VbvWzX8QWUmqq9g@mail.gmail.com/




>
> > I pointed out this flaw in the review process, but
> > it was dismissed.
> >
> > This is the main reason why I did not give any Ack
> > (but it was merged via Kees Cook's tree).
>
> > So, the help text of this option should be revised:
> >
> >           This option allows for unused exported symbols to be dropped from
> >           the build. In turn, this provides the compiler more opportunities
> >           (especially when using LTO) for optimizing the code and reducing
> >           binary size.  This might have some security advantages as well.
> >
> > Clang LTO is opposite to your expectation.
>
> Then Clang LTO is a misnomer. That is the option to revise not this one.
>
> > > Each EXPORT_SYMBOL() created a
> > > symbol dependency that prevented LTO from optimizing out the related
> > > code even though a tiny fraction of those exported symbols were needed.
> > >
> > > The idea behind the recursion was to catch those cases where disabling
> > > an exported symbol within a module would optimize out references to more
> > > exported symbols that, in turn, could be disabled and possibly trigger
> > > yet more code elimination. There is no way that can be achieved without
> > > extra compiler passes in a recursive manner.
> >
> > I do not understand.
> >
> > Modules are relocatable ELF.
> > Clang LTO cannot eliminate any code.
> > GCC LTO does not work with relocatable ELF
> > in the first place.
>
> I don't think I follow you here. What does relocatable ELF have to do with LTO?



What is important is that
GCC LTO is a feature of gcc, not binutils.
That is, LD_FINAL is $(CC).

GCC LTO can be implemented for the final link stage
by using $(CC) as the linker driver.
Then, it can determine which code is unreachable.
In other words, GCC LTO works only when building
the final executable.


On the other hand, a relocatable ELF is created
by $(LD) -r by combining some objects together.
The relocatable ELF can be fed to another $(LD) -r,
or the final link stage.


vmlinux is an executable ELF.
modules (*.ko files) are relocatable ELFs.


You can confirm it easily
by using the 'file' command.

masahiro@oscar:~/ref/linux$ file vmlinux
vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
statically linked,
BuildID[sha1]=ee0cef2ff3d9f490e0f5ee1d7e74b19aa167933b, not stripped
masahiro@oscar:~/ref/linux$ file  net/ipv4/netfilter/iptable_nat.ko
net/ipv4/netfilter/iptable_nat.ko: ELF 64-bit LSB relocatable, x86-64,
version 1 (SYSV),
BuildID[sha1]=4829e82f9b9e7fd65be3c19c1cf0e16a7ddf0967, not stripped
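
The same distinction can be reproduced without a kernel tree. The
sketch below assumes a Linux host with cc, ld and od installed. The file
names and the elf_type helper are made up for illustration. It reads
e_type, the 16-bit field at byte offset 16 of the ELF header, where
1 means relocatable (ET_REL), 2 means executable (ET_EXEC) and 3 means
shared object or PIE (ET_DYN).

```shell
#!/bin/sh
# Show that 'ld -r' yields relocatable ELF, which can be fed to another
# 'ld -r' or to the final link. Assumes a Linux host with cc, ld and od.

# elf_type FILE: print the e_type field of FILE's ELF header.
elf_type() {
	od -An -tu2 -j16 -N2 "$1" | tr -d ' \n'
}

if [ "$(uname -s)" = Linux ] &&
   command -v cc >/dev/null 2>&1 && command -v ld >/dev/null 2>&1; then
	workdir=$(mktemp -d)
	printf 'int a(void) { return 1; }\n' > "$workdir/a.c"
	printf 'int b(void) { return 2; }\n' > "$workdir/b.c"
	printf 'int a(void);\nint b(void);\nint main(void) { return a() + b() - 3; }\n' \
		> "$workdir/main.c"

	for f in a b main; do
		cc -c -o "$workdir/$f.o" "$workdir/$f.c"
	done

	# Partial link: combine two objects into one relocatable object.
	ld -r -o "$workdir/ab.o" "$workdir/a.o" "$workdir/b.o"
	# The result can be fed to another 'ld -r' ...
	ld -r -o "$workdir/all.o" "$workdir/ab.o" "$workdir/main.o"
	# ... or to the final link stage.
	cc -o "$workdir/prog" "$workdir/all.o"

	ab_type=$(elf_type "$workdir/ab.o")
	all_type=$(elf_type "$workdir/all.o")
	prog_type=$(elf_type "$workdir/prog")
	echo "ab.o: $ab_type, all.o: $all_type, prog: $prog_type"
	rm -rf "$workdir"
else
	echo "Linux toolchain not found, skipping"
fi
```

Both ld -r outputs report type 1. The final link reports 2 or 3,
depending on whether the toolchain builds PIE by default.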



Modules are not filled with addresses yet
since we do not know which memory address
the module will be loaded to.
The addresses are resolved at modprobe time.

As I said above, modules are created by $(LD) -r.
It is not possible to implement GCC LTO for modules.



In contrast, Clang LTO is the ability of $(LD).
So, it can be implemented not only for executable ELFs,
but also for relocatable ELFs.
The problem is Clang LTO cannot determine which code is
unreachable if it is implemented for a relocatable ELF,
since it is not a final image.

Did I answer your question?





> I've successfully used gcc LTO on the kernel quite a while ago.
>
> For a reference about binary size reduction with LTO and
> CONFIG_TRIM_UNUSED_KSYMS please read this article:
>
> https://lwn.net/Articles/746780/


Thanks for the great articles.

Just out of curiosity, I think you used GCC LTO from
Andy's GitHub.


In the article, you took stm32_defconfig as an example,
but ARM does not select ARCH_SUPPORTS_LTO.

Did you add some local hacks to make LTO work
for ARM?

I tried the lto-5.8.1 branch, but
I did not even succeed in building x86 + LTO.






>
> Nicolas



-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS
  2021-03-09  7:28       ` Masahiro Yamada
@ 2021-03-09 16:49         ` Nicolas Pitre
  0 siblings, 0 replies; 17+ messages in thread
From: Nicolas Pitre @ 2021-03-09 16:49 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Linux Kbuild mailing list, Christoph Hellwig, Linus Torvalds,
	Jessica Yu, Sami Tolvanen, Linux Kernel Mailing List, linux-arch

On Tue, 9 Mar 2021, Masahiro Yamada wrote:

> On Fri, Feb 26, 2021 at 4:24 AM Nicolas Pitre <nico@fluxnic.net> wrote:
> >
> > If CONFIG_TRIM_UNUSED_KSYMS is enabled then build time will increase.
> > That comes with the feature.
> 
> This patch set intends to change this.
> TRIM_UNUSED_KSYMS will build without additional cost,
> like LD_DEAD_CODE_DATA_ELIMINATION.

OK... I do see how you're going about it.

> > > Modules are relocatable ELF.
> > > Clang LTO cannot eliminate any code.
> > > GCC LTO does not work with relocatable ELF
> > > in the first place.
> >
> > I don't think I follow you here. What does relocatable ELF have to do with LTO?
> 
> What is important is that
> GCC LTO is a feature of gcc, not binutils.
> That is, LD_FINAL is $(CC).

Exactly.

> GCC LTO can be implemented for the final link stage
> by using $(CC) as the linker driver.
> Then, it can determine which code is unreachable.
> In other words, GCC LTO works only when building
> the final executable.

Yes. And it does so by filling .o files with its intermediate code 
representation and not ELF code.

> On the other hand, a relocatable ELF is created
> by $(LD) -r by combining some objects together.
> The relocatable ELF can be fed to another $(LD) -r,
> or the final link stage.

You still can create relocatable ELF using LTO. But LTO stops there. 
From that point on, .o files will no longer contain data that LTO can 
use if you further combine those object files together. But until that 
point, LTO is still usable.

> As I said above, modules are created by $(LD) -r.
> It is not possible to implement GCC LTO for modules.

If I remember correctly (that was a while ago) the problem with LTO and 
the kernel had to do with the fact that every subdirectory was gathering 
object files in built-in.o using ld -r. At some point we switched to 
gathering object files into built-in.a files where no linking is taking 
place. The real linking happens in vmlinux.o where LTO may now do its 
magic.
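
To illustrate the archive point: ar only collects its members and does
no linking, so every object stays available to a later link step. The
sketch below assumes cc and ar are installed and uses made-up file
names. It builds a regular archive for simplicity and makes no claim
about the exact ar flags kbuild uses.

```shell
#!/bin/sh
# Gather objects into an archive and list its members.
# Assumes cc and ar are available; file names are illustrative.

if command -v cc >/dev/null 2>&1 && command -v ar >/dev/null 2>&1; then
	workdir=$(mktemp -d)
	printf 'int a(void) { return 1; }\n' > "$workdir/a.c"
	printf 'int b(void) { return 2; }\n' > "$workdir/b.c"
	cc -c -o "$workdir/a.o" "$workdir/a.c"
	cc -c -o "$workdir/b.o" "$workdir/b.c"

	# 'r' inserts the members, 'c' creates the archive without a warning.
	ar rc "$workdir/built-in.a" "$workdir/a.o" "$workdir/b.o"

	# 't' lists the members: both objects are still separate entries.
	ar t "$workdir/built-in.a"
	member_count=$(ar t "$workdir/built-in.a" | wc -l | tr -d ' ')
	echo "members: $member_count"
	rm -rf "$workdir"
else
	echo "cc or ar not found, skipping"
fi
```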

The same is true for modules. Compiling foo_module.c into foo_module.o 
will create a .o file with LTO data rather than executable code. But 
when you create the final .o for the module then LTO takes place and 
produces the relocatable ELF executable.

> > I've successfully used gcc LTO on the kernel quite a while ago.
> >
> > For a reference about binary size reduction with LTO and
> > CONFIG_TRIM_UNUSED_KSYMS please read this article:
> >
> > https://lwn.net/Articles/746780/
> 
> Thanks for the great articles.
> 
> Just out of curiosity, I think you used GCC LTO from
> Andy's GitHub.

Right. I provided the reference in the preceding article:
https://lwn.net/Articles/744507/ 

> In the article, you took stm32_defconfig as an example,
> but ARM does not select ARCH_SUPPORTS_LTO.
> 
> Did you add some local hacks to make LTO work
> for ARM?

Of course. This article was written in 2017 and no LTO support at all 
was in mainline back then. But, besides adding CONFIG_LTO, very little 
was needed to make it compile, and I did upstream most changes such as 
commit 75fea300d7, commit a85b2257a5, commit 5d48417592, commit 
19c233b79d, etc.

> I tried the lto-5.8.1 branch, but
> I did not even succeed in building x86 + LTO.

My latest working LTO branch (i.e. last time I worked on it) is much 
older than that.

Maybe people aren't very excited about LTO because it makes recompiling 
the kernel take many times longer: gcc reruns its optimization passes 
on the whole kernel even if you modify a single file.


Nicolas

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-03-09 16:50 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-25 16:02 [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS Masahiro Yamada
2021-02-25 16:02 ` [PATCH 1/4] kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO Masahiro Yamada
2021-02-25 17:45   ` Sami Tolvanen
2021-02-25 19:08     ` Masahiro Yamada
2021-02-25 16:02 ` [PATCH 2/4] export.h: make __ksymtab_strings per-symbol section Masahiro Yamada
2021-02-25 16:02 ` [PATCH 3/4] kbuild: separate out vmlinux.lds generation Masahiro Yamada
2021-02-25 16:02 ` [PATCH 4/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass Masahiro Yamada
2021-02-25 18:46   ` kernel test robot
2021-02-25 20:06     ` Masahiro Yamada
2021-02-25 21:20       ` Sami Tolvanen
2021-02-26  7:04         ` Masahiro Yamada
2021-02-25 18:56   ` kernel test robot
2021-02-25 17:19 ` [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS Nicolas Pitre
2021-02-25 18:57   ` Masahiro Yamada
2021-02-25 19:24     ` Nicolas Pitre
2021-03-09  7:28       ` Masahiro Yamada
2021-03-09 16:49         ` Nicolas Pitre
