linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS
@ 2021-03-09 15:17 Masahiro Yamada
  2021-03-09 15:17 ` [PATCH v2 1/4] export.h: make __ksymtab_strings per-symbol section Masahiro Yamada
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Masahiro Yamada @ 2021-03-09 15:17 UTC
  To: linux-kbuild
  Cc: Christoph Hellwig, Linus Torvalds, Jessica Yu, Nicolas Pitre,
	linux-kernel, linux-arch, Masahiro Yamada


Now CONFIG_TRIM_UNUSED_KSYMS is revived, but Linus is still unhappy
about the build speed.

I re-implemented this feature, and the build time cost is now
almost unnoticeable.


(no changes since v1)

Masahiro Yamada (4):
  export.h: make __ksymtab_strings per-symbol section
  kbuild: separate out vmlinux.lds generation
  kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in
    one-pass
  kbuild: remove guarding from TRIM_UNUSED_KSYMS

 Makefile                                      | 34 ++++-----
 arch/alpha/kernel/Makefile                    |  3 +-
 arch/arc/kernel/Makefile                      |  3 +-
 arch/arm/kernel/Makefile                      |  3 +-
 arch/arm64/kernel/Makefile                    |  3 +-
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S             |  1 +
 arch/csky/kernel/Makefile                     |  3 +-
 arch/h8300/kernel/Makefile                    |  2 +-
 arch/hexagon/kernel/Makefile                  |  3 +-
 arch/ia64/kernel/Makefile                     |  3 +-
 arch/m68k/kernel/Makefile                     |  2 +-
 arch/microblaze/kernel/Makefile               |  3 +-
 arch/mips/kernel/Makefile                     |  3 +-
 arch/nds32/kernel/Makefile                    |  3 +-
 arch/nios2/kernel/Makefile                    |  2 +-
 arch/openrisc/kernel/Makefile                 |  3 +-
 arch/parisc/kernel/Makefile                   |  3 +-
 arch/powerpc/kernel/Makefile                  |  2 +-
 arch/powerpc/kernel/vdso32/vdso32.lds.S       |  1 +
 arch/powerpc/kernel/vdso64/vdso64.lds.S       |  1 +
 arch/riscv/kernel/Makefile                    |  2 +-
 arch/s390/kernel/Makefile                     |  3 +-
 arch/s390/purgatory/purgatory.lds.S           |  1 +
 arch/sh/kernel/Makefile                       |  3 +-
 arch/sparc/kernel/Makefile                    |  2 +-
 arch/um/kernel/Makefile                       |  2 +-
 arch/x86/kernel/Makefile                      |  2 +-
 arch/xtensa/kernel/Makefile                   |  3 +-
 include/asm-generic/export.h                  | 25 +------
 include/asm-generic/vmlinux.lds.h             | 13 ++--
 include/linux/export.h                        | 56 ++++----------
 include/linux/ksyms.lds.h                     | 22 ++++++
 init/Kconfig                                  |  3 +-
 scripts/Makefile.build                        |  7 +-
 scripts/Makefile.lib                          |  1 +
 scripts/adjust_autoksyms.sh                   | 73 -------------------
 .../{gen_autoksyms.sh => gen-keep-ksyms.sh}   | 43 ++++++++---
 scripts/gen_ksymdeps.sh                       | 25 -------
 scripts/module.lds.S                          | 23 +++---
 39 files changed, 152 insertions(+), 238 deletions(-)
 create mode 100644 include/linux/ksyms.lds.h
 delete mode 100755 scripts/adjust_autoksyms.sh
 rename scripts/{gen_autoksyms.sh => gen-keep-ksyms.sh} (66%)
 delete mode 100755 scripts/gen_ksymdeps.sh

-- 
2.27.0



* [PATCH v2 1/4] export.h: make __ksymtab_strings per-symbol section
  2021-03-09 15:17 [PATCH v2 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS Masahiro Yamada
@ 2021-03-09 15:17 ` Masahiro Yamada
  2021-03-09 15:17 ` [PATCH v2 2/4] kbuild: separate out vmlinux.lds generation Masahiro Yamada
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Masahiro Yamada @ 2021-03-09 15:17 UTC
  To: linux-kbuild
  Cc: Christoph Hellwig, Linus Torvalds, Jessica Yu, Nicolas Pitre,
	linux-kernel, linux-arch, Masahiro Yamada

Each export symbol table entry is placed in its own section
(__ksymtab*+<sym>), and the sections are sorted by SORT (an alias of
SORT_BY_NAME) because the module subsystem uses binary search for
symbol resolution.

We did not have a good reason to do the same for __ksymtab_strings,
but now we do.

To make CONFIG_TRIM_UNUSED_KSYMS work in one pass, the linker needs
to trim the strings of unused symbols and namespaces. To allow a
per-symbol keep/drop choice, __ksymtab_strings must be split into
per-symbol sections as well.

This keeps the string unification introduced by commit ce2b617ce8cb
("export.h: reduce __ksymtab_strings string duplication by using "MS"
section flags").

For example, the empty namespaces share the same address.

  $ nm -n vmlinux
  [ snip ]
  ffffffff8233b6aa r __kstrtabns_IO_APIC_get_PCI_irq_vector
  ffffffff8233b6aa r __kstrtabns_I_BDEV
  ffffffff8233b6aa r __kstrtabns_LZ4_decompress_fast
  ffffffff8233b6aa r __kstrtabns_LZ4_decompress_fast_continue
  ffffffff8233b6aa r __kstrtabns_LZ4_decompress_fast_usingDict
    ...

I confirmed no size change in vmlinux.
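
The address sharing shown above can be checked mechanically. The
following is only a minimal sketch, assuming `nm -n`-style lines
("address type name") on stdin; the helper name
count_shared_kstrtabns is hypothetical and not part of this series.

```shell
# Count how many __kstrtabns_* symbols share each address.
# Input: lines in "address type name" form, as printed by `nm -n`.
# count_shared_kstrtabns is a hypothetical helper name.
count_shared_kstrtabns() {
	awk '$3 ~ /^__kstrtabns_/ { n[$1]++ }
	     END { for (a in n) print a, n[a] }' | LC_ALL=C sort
}
```

It could be used as `nm -n vmlinux | count_shared_kstrtabns`; an
address with a large count would be consistent with the merged empty
namespace string.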

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---

(no changes since v1)

 include/asm-generic/export.h      | 2 +-
 include/asm-generic/vmlinux.lds.h | 2 +-
 include/linux/export.h            | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h
index 07a36a874dca..e847f1fde367 100644
--- a/include/asm-generic/export.h
+++ b/include/asm-generic/export.h
@@ -39,7 +39,7 @@
 __ksymtab_\name:
 	__put \val, __kstrtab_\name
 	.previous
-	.section __ksymtab_strings,"aMS",%progbits,1
+	.section __ksymtab_strings+\name,"aMS",%progbits,1
 __kstrtab_\name:
 	.asciz "\name"
 	.previous
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 0331d5d49551..6ce6dcabdccf 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -513,7 +513,7 @@
 									\
 	/* Kernel symbol table: strings */				\
         __ksymtab_strings : AT(ADDR(__ksymtab_strings) - LOAD_OFFSET) {	\
-		*(__ksymtab_strings)					\
+		*(__ksymtab_strings+*)					\
 	}								\
 									\
 	/* __*init sections */						\
diff --git a/include/linux/export.h b/include/linux/export.h
index 6271a5d9c988..01e6ab19b226 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -99,7 +99,7 @@ struct kernel_symbol {
 	extern const char __kstrtab_##sym[];					\
 	extern const char __kstrtabns_##sym[];					\
 	__CRC_SYMBOL(sym, sec);							\
-	asm("	.section \"__ksymtab_strings\",\"aMS\",%progbits,1	\n"	\
+	asm("	.section \"__ksymtab_strings+" #sym "\",\"aMS\",%progbits,1\n"	\
 	    "__kstrtab_" #sym ":					\n"	\
 	    "	.asciz 	\"" #sym "\"					\n"	\
 	    "__kstrtabns_" #sym ":					\n"	\
-- 
2.27.0



* [PATCH v2 2/4] kbuild: separate out vmlinux.lds generation
  2021-03-09 15:17 [PATCH v2 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS Masahiro Yamada
  2021-03-09 15:17 ` [PATCH v2 1/4] export.h: make __ksymtab_strings per-symbol section Masahiro Yamada
@ 2021-03-09 15:17 ` Masahiro Yamada
  2021-03-09 15:17 ` [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass Masahiro Yamada
  2021-03-09 15:17 ` [PATCH v2 4/4] kbuild: remove guarding from TRIM_UNUSED_KSYMS Masahiro Yamada
  3 siblings, 0 replies; 13+ messages in thread
From: Masahiro Yamada @ 2021-03-09 15:17 UTC
  To: linux-kbuild
  Cc: Christoph Hellwig, Linus Torvalds, Jessica Yu, Nicolas Pitre,
	linux-kernel, linux-arch, Masahiro Yamada

This is a preparation for the CONFIG_TRIM_UNUSED_KSYMS improvement.

In the new implementation of CONFIG_TRIM_UNUSED_KSYMS (next commit),
unused export symbols will be trimmed at the link stage. Kbuild will
need to traverse the tree to know which symbols are needed by modules.

The list of sections that need linking will be generated after the
directory traversal, and included from vmlinux.lds.S and module.lds.S.

The build rule of module.lds is already separated as modules_prepare.

The build of vmlinux.lds must be delayed because such a list is not yet
available while Kbuild is visiting arch/$(SRCARCH)/kernel/Makefile.

Separate the build rule of vmlinux.lds, and invoke it from the top
Makefile.

I guarded the $(warning ) in scripts/Makefile.build; otherwise a
false-positive warning would be displayed, for example when building
ARCH=ia64 with CONFIG_IA64_PALINFO=m. Ideally, vmlinux.lds.S could be
moved to a different directory, but I am just making less invasive
changes for now.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---

(no changes since v1)

 Makefile                        | 8 ++++++--
 arch/alpha/kernel/Makefile      | 3 ++-
 arch/arc/kernel/Makefile        | 3 ++-
 arch/arm/kernel/Makefile        | 3 ++-
 arch/arm64/kernel/Makefile      | 3 ++-
 arch/csky/kernel/Makefile       | 3 ++-
 arch/h8300/kernel/Makefile      | 2 +-
 arch/hexagon/kernel/Makefile    | 3 ++-
 arch/ia64/kernel/Makefile       | 3 ++-
 arch/m68k/kernel/Makefile       | 2 +-
 arch/microblaze/kernel/Makefile | 3 ++-
 arch/mips/kernel/Makefile       | 3 ++-
 arch/nds32/kernel/Makefile      | 3 ++-
 arch/nios2/kernel/Makefile      | 2 +-
 arch/openrisc/kernel/Makefile   | 3 ++-
 arch/parisc/kernel/Makefile     | 3 ++-
 arch/powerpc/kernel/Makefile    | 2 +-
 arch/riscv/kernel/Makefile      | 2 +-
 arch/s390/kernel/Makefile       | 3 ++-
 arch/sh/kernel/Makefile         | 3 ++-
 arch/sparc/kernel/Makefile      | 2 +-
 arch/um/kernel/Makefile         | 2 +-
 arch/x86/kernel/Makefile        | 2 +-
 arch/xtensa/kernel/Makefile     | 3 ++-
 scripts/Makefile.build          | 2 ++
 25 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/Makefile b/Makefile
index 31dcdb3d61fa..89862b9f45d7 100644
--- a/Makefile
+++ b/Makefile
@@ -1186,6 +1186,9 @@ quiet_cmd_autoksyms_h = GEN     $@
 $(autoksyms_h):
 	$(call cmd,autoksyms_h)
 
+$(KBUILD_LDS): prepare FORCE
+	$(Q)$(MAKE) $(build)=$(patsubst %/,%,$(dir $@)) $@
+
 ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
 
 # Final link of vmlinux with optional arch pass after final link
@@ -1193,14 +1196,15 @@ cmd_link-vmlinux =                                                 \
 	$(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)";    \
 	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
-vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) FORCE
+vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(KBUILD_LDS) \
+			$(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS) FORCE
 	+$(call if_changed,link-vmlinux)
 
 targets := vmlinux
 
 # The actual objects are generated when descending,
 # make sure no implicit rule kicks in
-$(sort $(vmlinux-deps) $(subdir-modorder)): descend ;
+$(sort $(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS) $(subdir-modorder)): descend ;
 
 filechk_kernel.release = \
 	echo "$(KERNELVERSION)$$($(CONFIG_SHELL) $(srctree)/scripts/setlocalversion $(srctree))"
diff --git a/arch/alpha/kernel/Makefile b/arch/alpha/kernel/Makefile
index 5a74581bf0ee..6e2baaebdee3 100644
--- a/arch/alpha/kernel/Makefile
+++ b/arch/alpha/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the linux kernel.
 #
 
-extra-y		:= head.o vmlinux.lds
+extra-y		:= head.o
+targets		+= vmlinux.lds
 asflags-y	:= $(KBUILD_CFLAGS)
 ccflags-y	:= -Wno-sign-compare
 
diff --git a/arch/arc/kernel/Makefile b/arch/arc/kernel/Makefile
index 8c4fc4b54c14..0a06c018f0cd 100644
--- a/arch/arc/kernel/Makefile
+++ b/arch/arc/kernel/Makefile
@@ -31,4 +31,5 @@ else
 obj-y += ctx_sw_asm.o
 endif
 
-extra-y := vmlinux.lds head.o
+targets += vmlinux.lds
+extra-y := head.o
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index ae295a3bcfef..7483916c034d 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -106,4 +106,5 @@ endif
 
 obj-$(CONFIG_HAVE_ARM_SMCCC)	+= smccc-call.o
 
-extra-y := $(head-y) vmlinux.lds
+extra-y := $(head-y)
+targets += vmlinux.lds
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index ed65576ce710..32e530c22cba 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -64,7 +64,8 @@ obj-$(CONFIG_COMPAT_VDSO)		+= vdso32-wrap.o
 
 obj-y					+= probes/
 head-y					:= head.o
-extra-y					+= $(head-y) vmlinux.lds
+extra-y					+= $(head-y)
+targets					+= vmlinux.lds
 
 ifeq ($(CONFIG_DEBUG_EFI),y)
 AFLAGS_head.o += -DVMLINUX_PATH="\"$(realpath $(objtree)/vmlinux)\""
diff --git a/arch/csky/kernel/Makefile b/arch/csky/kernel/Makefile
index 6c0f36010ed0..06acc85a2640 100644
--- a/arch/csky/kernel/Makefile
+++ b/arch/csky/kernel/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds
 
 obj-y += entry.o atomic.o signal.o traps.o irq.o time.o vdso.o vdso/
 obj-y += power.o syscall.o syscall_table.o setup.o
diff --git a/arch/h8300/kernel/Makefile b/arch/h8300/kernel/Makefile
index 307aa51576dd..7ef912ee576f 100644
--- a/arch/h8300/kernel/Makefile
+++ b/arch/h8300/kernel/Makefile
@@ -3,7 +3,7 @@
 # Makefile for the linux kernel.
 #
 
-extra-y := vmlinux.lds
+targets += vmlinux.lds
 
 obj-y := process.o traps.o ptrace.o \
 	 signal.o setup.o syscalls.o \
diff --git a/arch/hexagon/kernel/Makefile b/arch/hexagon/kernel/Makefile
index fae3dce32fde..9765301d2672 100644
--- a/arch/hexagon/kernel/Makefile
+++ b/arch/hexagon/kernel/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds
 
 obj-$(CONFIG_SMP) += smp.o
 
diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile
index 78717819131c..02575e838e7a 100644
--- a/arch/ia64/kernel/Makefile
+++ b/arch/ia64/kernel/Makefile
@@ -7,7 +7,8 @@ ifdef CONFIG_DYNAMIC_FTRACE
 CFLAGS_REMOVE_ftrace.o = -pg
 endif
 
-extra-y	:= head.o vmlinux.lds
+extra-y	:= head.o
+targets	+= vmlinux.lds
 
 obj-y := entry.o efi.o efi_stub.o gate-data.o fsys.o ia64_ksyms.o irq.o irq_ia64.o	\
 	 irq_lsapic.o ivt.o pal.o patch.o process.o ptrace.o sal.o		\
diff --git a/arch/m68k/kernel/Makefile b/arch/m68k/kernel/Makefile
index dbac7f8743fc..b054f4198e63 100644
--- a/arch/m68k/kernel/Makefile
+++ b/arch/m68k/kernel/Makefile
@@ -12,7 +12,7 @@ extra-$(CONFIG_HP300)	:= head.o
 extra-$(CONFIG_Q40)	:= head.o
 extra-$(CONFIG_SUN3X)	:= head.o
 extra-$(CONFIG_SUN3)	:= sun3-head.o
-extra-y			+= vmlinux.lds
+targets			+= vmlinux.lds
 
 obj-y	:= entry.o irq.o module.o process.o ptrace.o
 obj-y	+= setup.o signal.o sys_m68k.o syscalltable.o time.o traps.o
diff --git a/arch/microblaze/kernel/Makefile b/arch/microblaze/kernel/Makefile
index 15a20eb814ce..cdf98cbfcce9 100644
--- a/arch/microblaze/kernel/Makefile
+++ b/arch/microblaze/kernel/Makefile
@@ -12,7 +12,8 @@ CFLAGS_REMOVE_ftrace.o = -pg
 CFLAGS_REMOVE_process.o = -pg
 endif
 
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds
 
 obj-y += dma.o exceptions.o \
 	hw_exception_handler.o irq.o \
diff --git a/arch/mips/kernel/Makefile b/arch/mips/kernel/Makefile
index b4a57f1de772..f2e82faa06c4 100644
--- a/arch/mips/kernel/Makefile
+++ b/arch/mips/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the Linux/MIPS kernel.
 #
 
-extra-y		:= head.o vmlinux.lds
+extra-y		:= head.o
+targets		+= vmlinux.lds
 
 obj-y		+= branch.o cmpxchg.o elf.o entry.o genex.o idle.o irq.o \
 		   process.o prom.o ptrace.o reset.o setup.o signal.o \
diff --git a/arch/nds32/kernel/Makefile b/arch/nds32/kernel/Makefile
index 394df3f6442c..ec061f18f00f 100644
--- a/arch/nds32/kernel/Makefile
+++ b/arch/nds32/kernel/Makefile
@@ -19,7 +19,8 @@ obj-$(CONFIG_OF)		+= devtree.o
 obj-$(CONFIG_CACHE_L2)		+= atl2c.o
 obj-$(CONFIG_PERF_EVENTS) += perf_event_cpu.o
 obj-$(CONFIG_PM)		+= pm.o sleep.o
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds
 
 CFLAGS_fpu.o += -mext-fpu-sp -mext-fpu-dp
 
diff --git a/arch/nios2/kernel/Makefile b/arch/nios2/kernel/Makefile
index 0b645e1e3158..1ec4be68462e 100644
--- a/arch/nios2/kernel/Makefile
+++ b/arch/nios2/kernel/Makefile
@@ -4,7 +4,7 @@
 #
 
 extra-y	+= head.o
-extra-y	+= vmlinux.lds
+targets	+= vmlinux.lds
 
 obj-y	+= cpuinfo.o
 obj-y	+= entry.o
diff --git a/arch/openrisc/kernel/Makefile b/arch/openrisc/kernel/Makefile
index 2d172e79f58d..6be5c65ea3e9 100644
--- a/arch/openrisc/kernel/Makefile
+++ b/arch/openrisc/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the linux kernel.
 #
 
-extra-y	:= head.o vmlinux.lds
+extra-y	:= head.o
+targets	+= vmlinux.lds
 
 obj-y	:= setup.o or32_ksyms.o process.o dma.o \
 	   traps.o time.o irq.o entry.o ptrace.o signal.o \
diff --git a/arch/parisc/kernel/Makefile b/arch/parisc/kernel/Makefile
index 068d90950d93..31e5109251aa 100644
--- a/arch/parisc/kernel/Makefile
+++ b/arch/parisc/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for arch/parisc/kernel
 #
 
-extra-y			:= head.o vmlinux.lds
+extra-y		:= head.o
+targets		+= vmlinux.lds
 
 obj-y	     	:= cache.o pacache.o setup.o pdt.o traps.o time.o irq.o \
 		   pa7300lc.o syscall.o entry.o sys_parisc.o firmware.o \
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 6084fa499aa3..c7576957f05a 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -101,7 +101,7 @@ extra-$(CONFIG_40x)		:= head_40x.o
 extra-$(CONFIG_44x)		:= head_44x.o
 extra-$(CONFIG_FSL_BOOKE)	:= head_fsl_booke.o
 extra-$(CONFIG_PPC_8xx)		:= head_8xx.o
-extra-y				+= vmlinux.lds
+targets				+= vmlinux.lds
 
 obj-$(CONFIG_RELOCATABLE)	+= reloc_$(BITS).o
 
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 3dc0abde988a..c74c495d4f56 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -10,7 +10,7 @@ CFLAGS_REMOVE_sbi.o	= $(CC_FLAGS_FTRACE)
 endif
 
 extra-y += head.o
-extra-y += vmlinux.lds
+targets += vmlinux.lds
 
 obj-y	+= soc.o
 obj-y	+= cpu.o
diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile
index c97818a382f3..15d3ee771f22 100644
--- a/arch/s390/kernel/Makefile
+++ b/arch/s390/kernel/Makefile
@@ -42,7 +42,8 @@ obj-y	+= entry.o reipl.o relocate_kernel.o kdebugfs.o alternative.o
 obj-y	+= nospec-branch.o ipl_vmparm.o machine_kexec_reloc.o unwind_bc.o
 obj-y	+= smp.o
 
-extra-y				+= head64.o vmlinux.lds
+extra-y				+= head64.o
+targets				+= vmlinux.lds
 
 obj-$(CONFIG_SYSFS)		+= nospec-sysfs.o
 CFLAGS_REMOVE_nospec-branch.o	+= $(CC_FLAGS_EXPOLINE)
diff --git a/arch/sh/kernel/Makefile b/arch/sh/kernel/Makefile
index aa0fbc9202b1..e8384889f5f0 100644
--- a/arch/sh/kernel/Makefile
+++ b/arch/sh/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the Linux/SuperH kernel.
 #
 
-extra-y	:= head_32.o vmlinux.lds
+extra-y	:= head_32.o
+targets	+= vmlinux.lds
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not profile debug and lowlevel utilities
diff --git a/arch/sparc/kernel/Makefile b/arch/sparc/kernel/Makefile
index d3a0e072ebe8..685669edb9f8 100644
--- a/arch/sparc/kernel/Makefile
+++ b/arch/sparc/kernel/Makefile
@@ -12,7 +12,7 @@ extra-y     := head_$(BITS).o
 # Undefine sparc when processing vmlinux.lds - it is used
 # And teach CPP we are doing $(BITS) builds (for this case)
 CPPFLAGS_vmlinux.lds := -Usparc -m$(BITS)
-extra-y              += vmlinux.lds
+targets              += vmlinux.lds
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not profile debug and lowlevel utilities
diff --git a/arch/um/kernel/Makefile b/arch/um/kernel/Makefile
index 5aa882011e04..76eea4cc00f0 100644
--- a/arch/um/kernel/Makefile
+++ b/arch/um/kernel/Makefile
@@ -12,7 +12,7 @@ CPPFLAGS_vmlinux.lds := -DSTART=$(LDS_START)		\
                         -DELF_ARCH=$(LDS_ELF_ARCH)	\
                         -DELF_FORMAT=$(LDS_ELF_FORMAT)	\
 			$(LDS_EXTRA)
-extra-y := vmlinux.lds
+targets += vmlinux.lds
 
 obj-y = config.o exec.o exitcode.o irq.o ksyms.o mem.o \
 	physmem.o process.o ptrace.o reboot.o sigio.o \
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 2ddf08351f0b..7d6fce044f97 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -7,7 +7,7 @@ extra-y	:= head_$(BITS).o
 extra-y	+= head$(BITS).o
 extra-y	+= ebda.o
 extra-y	+= platform-quirks.o
-extra-y	+= vmlinux.lds
+targets	+= vmlinux.lds
 
 CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
 
diff --git a/arch/xtensa/kernel/Makefile b/arch/xtensa/kernel/Makefile
index d4082c6a121b..79be7bfdf989 100644
--- a/arch/xtensa/kernel/Makefile
+++ b/arch/xtensa/kernel/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the Linux/Xtensa kernel.
 #
 
-extra-y := head.o vmlinux.lds
+extra-y := head.o
+targets += vmlinux.lds
 
 obj-y := align.o coprocessor.o entry.o irq.o platform.o process.o \
 	 ptrace.o setup.o signal.o stacktrace.o syscall.o time.o traps.o \
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 1b6094a13034..7df96bfe694e 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -62,12 +62,14 @@ ifndef obj
 $(warning kbuild: Makefile.build is included improperly)
 endif
 
+ifeq ($(filter-out %.mod, $(MAKECMDGOALS)),)
 ifeq ($(need-modorder),)
 ifneq ($(obj-m),)
 $(warning $(patsubst %.o,'%.ko',$(obj-m)) will not be built even though obj-m is specified.)
 $(warning You cannot use subdir-y/m to visit a module Makefile. Use obj-y/m instead.)
 endif
 endif
+endif
 
 # ===========================================================================
 
-- 
2.27.0



* [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-03-09 15:17 [PATCH v2 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS Masahiro Yamada
  2021-03-09 15:17 ` [PATCH v2 1/4] export.h: make __ksymtab_strings per-symbol section Masahiro Yamada
  2021-03-09 15:17 ` [PATCH v2 2/4] kbuild: separate out vmlinux.lds generation Masahiro Yamada
@ 2021-03-09 15:17 ` Masahiro Yamada
  2021-03-09 17:36   ` Nicolas Pitre
  2021-03-17 15:48   ` kernel test robot
  2021-03-09 15:17 ` [PATCH v2 4/4] kbuild: remove guarding from TRIM_UNUSED_KSYMS Masahiro Yamada
  3 siblings, 2 replies; 13+ messages in thread
From: Masahiro Yamada @ 2021-03-09 15:17 UTC
  To: linux-kbuild
  Cc: Christoph Hellwig, Linus Torvalds, Jessica Yu, Nicolas Pitre,
	linux-kernel, linux-arch, Masahiro Yamada

Commit a555bdd0c58c ("Kbuild: enable TRIM_UNUSED_KSYMS again, with some
guarding") re-enabled this feature, but Linus is still unhappy about
the build time.

The reason for the slowness is the recursion: the build basically
runs in two loops.

In the first loop, Kbuild builds the entire tree based on a temporary
autoksyms.h, which contains macro definitions that control whether
each corresponding EXPORT_SYMBOL() is enabled, and also gathers all
symbols required by modules. After the tree traversal, Kbuild updates
autoksyms.h and triggers the second loop to rebuild the source files
whose EXPORT_SYMBOL() needs flipping.
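
As a rough illustration of the old scheme, the header can be thought
of as one define per needed symbol. This is only a sketch, assuming
one symbol name per line on stdin; the __KSYM_<sym> macro name comes
from the code removed below, while the helper name gen_ksym_defines
is hypothetical.

```shell
# Turn a list of needed symbols into autoksyms.h-style defines.
# Input: one symbol name per line; duplicates and blank lines are dropped.
# gen_ksym_defines is a hypothetical helper name.
gen_ksym_defines() {
	LC_ALL=C sort -u |
	sed -e '/^$/d' -e 's/^\(.*\)$/#define __KSYM_\1 1/'
}
```

Any change in this list alters the header, which is why a second
rebuild loop was needed for the affected source files.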

This commit re-implements CONFIG_TRIM_UNUSED_KSYMS to make it work in
one pass. In the new design, unneeded EXPORT_SYMBOL() instances are
trimmed by the linker instead of the preprocessor.

After the tree traversal, a linker script snippet
<generated/keep-ksyms.h> is generated. It feeds the list of necessary
sections to vmlinux.lds.S and module.lds.S. The other sections fall
into /DISCARD/.
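
The idea can be sketched as follows. This is not the actual output
format of scripts/gen-keep-ksyms.sh, only an assumption for
illustration: it emits one input-section pattern per needed symbol
for two of the tables, using the ___ksymtab+<sym> and
__ksymtab_strings+<sym> section names seen in this series. The helper
name emit_keep_patterns is hypothetical.

```shell
# Emit linker-script input-section patterns for the needed symbols.
# Input: one symbol name per line on stdin.
# The output layout is an assumed, simplified format, not the real
# <generated/keep-ksyms.h>; emit_keep_patterns is a hypothetical name.
emit_keep_patterns() {
	LC_ALL=C sort -u | while IFS= read -r sym; do
		# Skip blank lines in the symbol list.
		[ -n "$sym" ] || continue
		printf 'KEEP(*(___ksymtab+%s))\n' "$sym"
		printf '*(__ksymtab_strings+%s)\n' "$sym"
	done
}
```

Sections of symbols that are not listed match no pattern here, so
they can be left to a catch-all /DISCARD/ rule.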

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---

Changes in v2:
  - Fix build errors
  - Add LC_ALL=C so the script will not be affected by the user's locale

 Makefile                                      | 30 +++-----
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S             |  1 +
 arch/powerpc/kernel/vdso32/vdso32.lds.S       |  1 +
 arch/powerpc/kernel/vdso64/vdso64.lds.S       |  1 +
 arch/s390/purgatory/purgatory.lds.S           |  1 +
 include/asm-generic/export.h                  | 23 ------
 include/asm-generic/vmlinux.lds.h             | 13 ++--
 include/linux/export.h                        | 54 +++-----------
 include/linux/ksyms.lds.h                     | 22 ++++++
 scripts/Makefile.build                        |  5 --
 scripts/Makefile.lib                          |  1 +
 scripts/adjust_autoksyms.sh                   | 73 -------------------
 .../{gen_autoksyms.sh => gen-keep-ksyms.sh}   | 43 ++++++++---
 scripts/gen_ksymdeps.sh                       | 25 -------
 scripts/module.lds.S                          | 23 +++---
 15 files changed, 105 insertions(+), 211 deletions(-)
 create mode 100644 include/linux/ksyms.lds.h
 delete mode 100755 scripts/adjust_autoksyms.sh
 rename scripts/{gen_autoksyms.sh => gen-keep-ksyms.sh} (66%)
 delete mode 100755 scripts/gen_ksymdeps.sh

diff --git a/Makefile b/Makefile
index 89862b9f45d7..25a5c0c3fb7d 100644
--- a/Makefile
+++ b/Makefile
@@ -1162,29 +1162,23 @@ export KBUILD_LDS          := arch/$(SRCARCH)/kernel/vmlinux.lds
 # used by scripts/Makefile.package
 export KBUILD_ALLDIRS := $(sort $(filter-out arch/%,$(vmlinux-alldirs)) LICENSES arch include scripts tools)
 
-vmlinux-deps := $(KBUILD_LDS) $(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS)
+targets :=
 
-# Recurse until adjust_autoksyms.sh is satisfied
-PHONY += autoksyms_recursive
 ifdef CONFIG_TRIM_UNUSED_KSYMS
 # For the kernel to actually contain only the needed exported symbols,
 # we have to build modules as well to determine what those symbols are.
 # (this can be evaluated only once include/config/auto.conf has been included)
 KBUILD_MODULES := 1
 
-autoksyms_recursive: descend modules.order
-	$(Q)$(CONFIG_SHELL) $(srctree)/scripts/adjust_autoksyms.sh \
-	  "$(MAKE) -f $(srctree)/Makefile vmlinux"
-endif
-
-autoksyms_h := $(if $(CONFIG_TRIM_UNUSED_KSYMS), include/generated/autoksyms.h)
+quiet_cmd_gen_used_ksyms = GEN     $@
+      cmd_gen_used_ksyms = $(CONFIG_SHELL) $(srctree)/scripts/gen-keep-ksyms.sh $< > $@
 
-quiet_cmd_autoksyms_h = GEN     $@
-      cmd_autoksyms_h = mkdir -p $(dir $@); \
-			$(CONFIG_SHELL) $(srctree)/scripts/gen_autoksyms.sh $@
+include/generated/keep-ksyms.h: modules.order FORCE
+	$(call if_changed,gen_used_ksyms)
+targets += include/generated/keep-ksyms.h
 
-$(autoksyms_h):
-	$(call cmd,autoksyms_h)
+$(KBUILD_LDS) modules_prepare: include/generated/keep-ksyms.h
+endif
 
 $(KBUILD_LDS): prepare FORCE
 	$(Q)$(MAKE) $(build)=$(patsubst %/,%,$(dir $@)) $@
@@ -1196,11 +1190,11 @@ cmd_link-vmlinux =                                                 \
 	$(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)";    \
 	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
-vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(KBUILD_LDS) \
+vmlinux: scripts/link-vmlinux.sh $(KBUILD_LDS) \
 			$(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS) FORCE
 	+$(call if_changed,link-vmlinux)
 
-targets := vmlinux
+targets += vmlinux
 
 # The actual objects are generated when descending,
 # make sure no implicit rule kicks in
@@ -1229,7 +1223,7 @@ scripts: scripts_basic scripts_dtc
 PHONY += prepare archprepare
 
 archprepare: outputmakefile archheaders archscripts scripts include/config/kernel.release \
-	asm-generic $(version_h) $(autoksyms_h) include/generated/utsrelease.h \
+	asm-generic $(version_h) include/generated/utsrelease.h \
 	include/generated/autoconf.h
 
 prepare0: archprepare
@@ -1515,7 +1509,7 @@ endif # CONFIG_MODULES
 # make distclean Remove editor backup files, patch leftover files and the like
 
 # Directories & files removed with 'make clean'
-CLEAN_FILES += include/ksym vmlinux.symvers \
+CLEAN_FILES += vmlinux.symvers \
 	       modules.builtin modules.builtin.modinfo modules.nsdeps \
 	       compile_commands.json .thinlto-cache
 
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
index cd119d82d8e3..0b0407317756 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
@@ -6,6 +6,7 @@
  * Linker script used for partial linking of nVHE EL2 object files.
  */
 
+#define NO_TRIM_KSYMS
 #include <asm/hyp_image.h>
 #include <asm-generic/vmlinux.lds.h>
 #include <asm/cache.h>
diff --git a/arch/powerpc/kernel/vdso32/vdso32.lds.S b/arch/powerpc/kernel/vdso32/vdso32.lds.S
index a4b806b0d618..5c519a1e1538 100644
--- a/arch/powerpc/kernel/vdso32/vdso32.lds.S
+++ b/arch/powerpc/kernel/vdso32/vdso32.lds.S
@@ -3,6 +3,7 @@
  * This is the infamous ld script for the 32 bits vdso
  * library
  */
+#define NO_TRIM_KSYMS
 #include <asm/vdso.h>
 #include <asm/page.h>
 #include <asm-generic/vmlinux.lds.h>
diff --git a/arch/powerpc/kernel/vdso64/vdso64.lds.S b/arch/powerpc/kernel/vdso64/vdso64.lds.S
index 2f3c359cacd3..cab126eff255 100644
--- a/arch/powerpc/kernel/vdso64/vdso64.lds.S
+++ b/arch/powerpc/kernel/vdso64/vdso64.lds.S
@@ -3,6 +3,7 @@
  * This is the infamous ld script for the 64 bits vdso
  * library
  */
+#define NO_TRIM_KSYMS
 #include <asm/vdso.h>
 #include <asm/page.h>
 #include <asm-generic/vmlinux.lds.h>
diff --git a/arch/s390/purgatory/purgatory.lds.S b/arch/s390/purgatory/purgatory.lds.S
index 482eb4fbcef1..09c2d1544081 100644
--- a/arch/s390/purgatory/purgatory.lds.S
+++ b/arch/s390/purgatory/purgatory.lds.S
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
+#define NO_TRIM_KSYMS
 #include <asm-generic/vmlinux.lds.h>
 
 OUTPUT_FORMAT("elf64-s390", "elf64-s390", "elf64-s390")
diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h
index e847f1fde367..b9be5b1dd7e6 100644
--- a/include/asm-generic/export.h
+++ b/include/asm-generic/export.h
@@ -57,30 +57,7 @@ __kstrtab_\name:
 #endif
 .endm
 
-#if defined(CONFIG_TRIM_UNUSED_KSYMS)
-
-#include <linux/kconfig.h>
-#include <generated/autoksyms.h>
-
-.macro __ksym_marker sym
-	.section ".discard.ksym","a"
-__ksym_marker_\sym:
-	 .previous
-.endm
-
-#define __EXPORT_SYMBOL(sym, val, sec)				\
-	__ksym_marker sym;					\
-	__cond_export_sym(sym, val, sec, __is_defined(__KSYM_##sym))
-#define __cond_export_sym(sym, val, sec, conf)			\
-	___cond_export_sym(sym, val, sec, conf)
-#define ___cond_export_sym(sym, val, sec, enabled)		\
-	__cond_export_sym_##enabled(sym, val, sec)
-#define __cond_export_sym_1(sym, val, sec) ___EXPORT_SYMBOL sym, val, sec
-#define __cond_export_sym_0(sym, val, sec) /* nothing */
-
-#else
 #define __EXPORT_SYMBOL(sym, val, sec) ___EXPORT_SYMBOL sym, val, sec
-#endif
 
 #define EXPORT_SYMBOL(name)					\
 	__EXPORT_SYMBOL(name, KSYM_FUNC(name),)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 6ce6dcabdccf..a2c2eb6f70ea 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -50,6 +50,8 @@
  *               [__nosave_begin, __nosave_end] for the nosave data
  */
 
+#include <linux/ksyms.lds.h>
+
 #ifndef LOAD_OFFSET
 #define LOAD_OFFSET 0
 #endif
@@ -486,34 +488,34 @@
 	/* Kernel symbol table: Normal symbols */			\
 	__ksymtab         : AT(ADDR(__ksymtab) - LOAD_OFFSET) {		\
 		__start___ksymtab = .;					\
-		KEEP(*(SORT(___ksymtab+*)))				\
+		KSYMTAB							\
 		__stop___ksymtab = .;					\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__ksymtab_gpl     : AT(ADDR(__ksymtab_gpl) - LOAD_OFFSET) {	\
 		__start___ksymtab_gpl = .;				\
-		KEEP(*(SORT(___ksymtab_gpl+*)))				\
+		KSYMTAB_GPL						\
 		__stop___ksymtab_gpl = .;				\
 	}								\
 									\
 	/* Kernel symbol table: Normal symbols */			\
 	__kcrctab         : AT(ADDR(__kcrctab) - LOAD_OFFSET) {		\
 		__start___kcrctab = .;					\
-		KEEP(*(SORT(___kcrctab+*)))				\
+		KCRCTAB							\
 		__stop___kcrctab = .;					\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__kcrctab_gpl     : AT(ADDR(__kcrctab_gpl) - LOAD_OFFSET) {	\
 		__start___kcrctab_gpl = .;				\
-		KEEP(*(SORT(___kcrctab_gpl+*)))				\
+		KCRCTAB_GPL						\
 		__stop___kcrctab_gpl = .;				\
 	}								\
 									\
 	/* Kernel symbol table: strings */				\
         __ksymtab_strings : AT(ADDR(__ksymtab_strings) - LOAD_OFFSET) {	\
-		*(__ksymtab_strings+*)					\
+		KSYMTAB_STRINGS						\
 	}								\
 									\
 	/* __*init sections */						\
@@ -999,6 +1001,7 @@
 	/DISCARD/ : {							\
 	EXIT_DISCARDS							\
 	EXIT_CALL							\
+	KSYM_DISCARDS							\
 	COMMON_DISCARDS							\
 	}
 
diff --git a/include/linux/export.h b/include/linux/export.h
index 01e6ab19b226..f9cc13cd2c8c 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -76,9 +76,18 @@ struct kernel_symbol {
 };
 #endif
 
-#ifdef __GENKSYMS__
+#if !defined(CONFIG_MODULES) || defined(__DISABLE_EXPORTS)
+
+/*
+ * Allow symbol exports to be disabled completely so that C code may
+ * be reused in other execution contexts such as the UEFI stub or the
+ * decompressor.
+ */
+#define __EXPORT_SYMBOL(sym, sec, ns)
+
+#elif defined(__GENKSYMS__)
 
-#define ___EXPORT_SYMBOL(sym, sec, ns)	__GENKSYMS_EXPORT_SYMBOL(sym)
+#define __EXPORT_SYMBOL(sym, sec, ns)	__GENKSYMS_EXPORT_SYMBOL(sym)
 
 #else
 
@@ -94,7 +103,7 @@ struct kernel_symbol {
  * section flag requires it. Use '%progbits' instead of '@progbits' since the
  * former apparently works on all arches according to the binutils source.
  */
-#define ___EXPORT_SYMBOL(sym, sec, ns)						\
+#define __EXPORT_SYMBOL(sym, sec, ns)						\
 	extern typeof(sym) sym;							\
 	extern const char __kstrtab_##sym[];					\
 	extern const char __kstrtabns_##sym[];					\
@@ -107,45 +116,6 @@ struct kernel_symbol {
 	    "	.previous						\n");	\
 	__KSYMTAB_ENTRY(sym, sec)
 
-#endif
-
-#if !defined(CONFIG_MODULES) || defined(__DISABLE_EXPORTS)
-
-/*
- * Allow symbol exports to be disabled completely so that C code may
- * be reused in other execution contexts such as the UEFI stub or the
- * decompressor.
- */
-#define __EXPORT_SYMBOL(sym, sec, ns)
-
-#elif defined(CONFIG_TRIM_UNUSED_KSYMS)
-
-#include <generated/autoksyms.h>
-
-/*
- * For fine grained build dependencies, we want to tell the build system
- * about each possible exported symbol even if they're not actually exported.
- * We use a symbol pattern __ksym_marker_<symbol> that the build system filters
- * from the $(NM) output (see scripts/gen_ksymdeps.sh). These symbols are
- * discarded in the final link stage.
- */
-#define __ksym_marker(sym)	\
-	static int __ksym_marker_##sym[0] __section(".discard.ksym") __used
-
-#define __EXPORT_SYMBOL(sym, sec, ns)					\
-	__ksym_marker(sym);						\
-	__cond_export_sym(sym, sec, ns, __is_defined(__KSYM_##sym))
-#define __cond_export_sym(sym, sec, ns, conf)				\
-	___cond_export_sym(sym, sec, ns, conf)
-#define ___cond_export_sym(sym, sec, ns, enabled)			\
-	__cond_export_sym_##enabled(sym, sec, ns)
-#define __cond_export_sym_1(sym, sec, ns) ___EXPORT_SYMBOL(sym, sec, ns)
-#define __cond_export_sym_0(sym, sec, ns) /* nothing */
-
-#else
-
-#define __EXPORT_SYMBOL(sym, sec, ns)	___EXPORT_SYMBOL(sym, sec, ns)
-
 #endif /* CONFIG_MODULES */
 
 #ifdef DEFAULT_SYMBOL_NAMESPACE
diff --git a/include/linux/ksyms.lds.h b/include/linux/ksyms.lds.h
new file mode 100644
index 000000000000..fdb7a04053b0
--- /dev/null
+++ b/include/linux/ksyms.lds.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KSYMS_LDS_H
+#define __KSYMS_LDS_H
+
+#if defined(CONFIG_TRIM_UNUSED_KSYMS) && !defined(NO_TRIM_KSYMS)
+#include <generated/keep-ksyms.h>
+
+#define KSYM_DISCARDS		*(___ksymtab+*) \
+				*(___ksymtab_gpl+*) \
+				*(___kcrctab+*) \
+				*(___kcrctab_gpl+*) \
+				*(__ksymtab_strings+*)
+#else
+#define KSYMTAB			KEEP(*(SORT(___ksymtab+*)))
+#define KSYMTAB_GPL		KEEP(*(SORT(___ksymtab_gpl+*)))
+#define KCRCTAB			KEEP(*(SORT(___kcrctab+*)))
+#define KCRCTAB_GPL		KEEP(*(SORT(___kcrctab_gpl+*)))
+#define KSYMTAB_STRINGS		*(__ksymtab_strings+*)
+#define KSYM_DISCARDS
+#endif
+
+#endif /* __KSYMS_LDS_H */
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 7df96bfe694e..71ffe1d06265 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -244,16 +244,12 @@ objtool_dep = $(objtool_obj)					\
 			 include/config/stack/validation.h)
 
 ifdef CONFIG_TRIM_UNUSED_KSYMS
-cmd_gen_ksymdeps = \
-	$(CONFIG_SHELL) $(srctree)/scripts/gen_ksymdeps.sh $@ >> $(dot-target).cmd
-
 # List module undefined symbols
 undefined_syms = $(NM) $< | $(AWK) '$$1 == "U" { printf("%s%s", x++ ? " " : "", $$2) }';
 endif
 
 define rule_cc_o_c
 	$(call cmd_and_fixdep,cc_o_c)
-	$(call cmd,gen_ksymdeps)
 	$(call cmd,checksrc)
 	$(call cmd,checkdoc)
 	$(call cmd,objtool)
@@ -263,7 +259,6 @@ endef
 
 define rule_as_o_S
 	$(call cmd_and_fixdep,as_o_S)
-	$(call cmd,gen_ksymdeps)
 	$(call cmd,objtool)
 	$(call cmd,modversions_S)
 endef
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index eee59184de64..f3da140191fb 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -311,6 +311,7 @@ DTC_FLAGS += $(DTC_FLAGS_$(basetarget))
 quiet_cmd_dt_S_dtb= DTB     $@
 cmd_dt_S_dtb=						\
 {							\
+	echo '\#define NO_TRIM_KSYMS';			\
 	echo '\#include <asm-generic/vmlinux.lds.h>'; 	\
 	echo '.section .dtb.init.rodata,"a"';		\
 	echo '.balign STRUCT_ALIGNMENT';		\
diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh
deleted file mode 100755
index d8f6f9c63043..000000000000
--- a/scripts/adjust_autoksyms.sh
+++ /dev/null
@@ -1,73 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: GPL-2.0-only
-
-# Script to update include/generated/autoksyms.h and dependency files
-#
-# Copyright:	(C) 2016  Linaro Limited
-# Created by:	Nicolas Pitre, January 2016
-#
-
-# Update the include/generated/autoksyms.h file.
-#
-# For each symbol being added or removed, the corresponding dependency
-# file's timestamp is updated to force a rebuild of the affected source
-# file. All arguments passed to this script are assumed to be a command
-# to be exec'd to trigger a rebuild of those files.
-
-set -e
-
-cur_ksyms_file="include/generated/autoksyms.h"
-new_ksyms_file="include/generated/autoksyms.h.tmpnew"
-
-info() {
-	if [ "$quiet" != "silent_" ]; then
-		printf "  %-7s %s\n" "$1" "$2"
-	fi
-}
-
-info "CHK" "$cur_ksyms_file"
-
-# Use "make V=1" to debug this script.
-case "$KBUILD_VERBOSE" in
-*1*)
-	set -x
-	;;
-esac
-
-# Generate a new symbol list file
-$CONFIG_SHELL $srctree/scripts/gen_autoksyms.sh "$new_ksyms_file"
-
-# Extract changes between old and new list and touch corresponding
-# dependency files.
-changed=$(
-count=0
-sort "$cur_ksyms_file" "$new_ksyms_file" | uniq -u |
-sed -n 's/^#define __KSYM_\(.*\) 1/\1/p' | tr "A-Z_" "a-z/" |
-while read sympath; do
-	if [ -z "$sympath" ]; then continue; fi
-	depfile="include/ksym/${sympath}.h"
-	mkdir -p "$(dirname "$depfile")"
-	touch "$depfile"
-	# Filesystems with coarse time precision may create timestamps
-	# equal to the one from a file that was very recently built and that
-	# needs to be rebuild. Let's guard against that by making sure our
-	# dep files are always newer than the first file we created here.
-	while [ ! "$depfile" -nt "$new_ksyms_file" ]; do
-		touch "$depfile"
-	done
-	echo $((count += 1))
-done | tail -1 )
-changed=${changed:-0}
-
-if [ $changed -gt 0 ]; then
-	# Replace the old list with tne new one
-	old=$(grep -c "^#define __KSYM_" "$cur_ksyms_file" || true)
-	new=$(grep -c "^#define __KSYM_" "$new_ksyms_file" || true)
-	info "KSYMS" "symbols: before=$old, after=$new, changed=$changed"
-	info "UPD" "$cur_ksyms_file"
-	mv -f "$new_ksyms_file" "$cur_ksyms_file"
-	# Then trigger a rebuild of affected source files
-	exec $@
-else
-	rm -f "$new_ksyms_file"
-fi
diff --git a/scripts/gen_autoksyms.sh b/scripts/gen-keep-ksyms.sh
similarity index 66%
rename from scripts/gen_autoksyms.sh
rename to scripts/gen-keep-ksyms.sh
index da320151e7c3..306e9b88aae9 100755
--- a/scripts/gen_autoksyms.sh
+++ b/scripts/gen-keep-ksyms.sh
@@ -1,13 +1,23 @@
 #!/bin/sh
 # SPDX-License-Identifier: GPL-2.0-only
 
-# Create an autoksyms.h header file from the list of all module's needed symbols
-# as recorded on the second line of *.mod files and the user-provided symbol
-# whitelist.
-
 set -e
 
-output_file="$1"
+modlist=$1
+
+emit ()
+{
+	local macro="$1"
+	local prefix="$2"
+	local syms="$3"
+
+	echo "#define $macro \\"
+	for s in $syms
+	do
+		echo "	KEEP(*($prefix$s)) \\"
+	done
+	echo
+}
 
 # Use "make V=1" to debug this script.
 case "$KBUILD_VERBOSE" in
@@ -51,15 +61,14 @@ fi
 
 # Generate a new ksym list file with symbols needed by the current
 # set of modules.
-cat > "$output_file" << EOT
+cat << EOF
 /*
  * Automatically generated file; DO NOT EDIT.
  */
 
-EOT
-
-[ -f modules.order ] && modlist=modules.order || modlist=/dev/null
+EOF
 
+syms=$(
 {
 	sed 's/ko$/mod/' $modlist | xargs -n1 sed -n -e '2p'
 	echo "$needed_symbols"
@@ -68,5 +77,17 @@ EOT
 # Remove the dot prefix for ppc64; symbol names with a dot (.) hold entry
 # point addresses.
 sed -e 's/^\.//' |
-sort -u |
-sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$output_file"
+# Sorting is essential because the module subsystem uses binary search for
+# symbol resolution. For CONFIG_TRIM_UNUSED_KSYMS=n, this is done by the
+# linker's SORT command (an alias of SORT_BY_NAME). For CONFIG_TRIM_UNUSED_KSYMS=y,
+# symbols are linked in the same order as this script outputs.
+# Add LC_ALL=C to make it work irrespective of the build environment.
+LC_ALL=C sort -u |
+sed -e 's/\(.*\)/\1/'
+)
+
+emit "KSYMTAB"		"___ksymtab+"		"$syms"
+emit "KSYMTAB_GPL"	"___ksymtab_gpl+"	"$syms"
+emit "KCRCTAB"		"___kcrctab+"		"$syms"
+emit "KCRCTAB_GPL"	"___kcrctab_gpl+"	"$syms"
+emit "KSYMTAB_STRINGS"	"__ksymtab_strings+"	"$syms"
diff --git a/scripts/gen_ksymdeps.sh b/scripts/gen_ksymdeps.sh
deleted file mode 100755
index 1324986e1362..000000000000
--- a/scripts/gen_ksymdeps.sh
+++ /dev/null
@@ -1,25 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: GPL-2.0
-
-set -e
-
-# List of exported symbols
-ksyms=$($NM $1 | sed -n 's/.*__ksym_marker_\(.*\)/\1/p' | tr A-Z a-z)
-
-if [ -z "$ksyms" ]; then
-	exit 0
-fi
-
-echo
-echo "ksymdeps_$1 := \\"
-
-for s in $ksyms
-do
-	echo $s | sed -e 's:^_*:    $(wildcard include/ksym/:' \
-			-e 's:__*:/:g' -e 's/$/.h) \\/'
-done
-
-echo
-echo "$1: \$(ksymdeps_$1)"
-echo
-echo "\$(ksymdeps_$1):"
diff --git a/scripts/module.lds.S b/scripts/module.lds.S
index 168cd27e6122..ab96471141f0 100644
--- a/scripts/module.lds.S
+++ b/scripts/module.lds.S
@@ -3,16 +3,15 @@
  * Archs are free to supply their own linker scripts.  ld will
  * combine them automatically.
  */
-SECTIONS {
-	/DISCARD/ : {
-		*(.discard)
-		*(.discard.*)
-	}
 
-	__ksymtab		0 : { *(SORT(___ksymtab+*)) }
-	__ksymtab_gpl		0 : { *(SORT(___ksymtab_gpl+*)) }
-	__kcrctab		0 : { *(SORT(___kcrctab+*)) }
-	__kcrctab_gpl		0 : { *(SORT(___kcrctab_gpl+*)) }
+#include <linux/ksyms.lds.h>
+
+SECTIONS {
+	__ksymtab		0 : { KSYMTAB }
+	__ksymtab_gpl		0 : { KSYMTAB_GPL }
+	__kcrctab		0 : { KCRCTAB }
+	__kcrctab_gpl		0 : { KCRCTAB_GPL }
+	__ksymtab_strings	0 : { KSYMTAB_STRINGS }
 
 	.init_array		0 : ALIGN(8) { *(SORT(.init_array.*)) *(.init_array) }
 
@@ -41,6 +40,12 @@ SECTIONS {
 	}
 
 	.text : { *(.text .text.[0-9a-zA-Z_]*) }
+
+	/DISCARD/ : {
+		*(.discard)
+		*(.discard.*)
+		KSYM_DISCARDS
+	}
 }
 
 /* bring in arch-specific sections */
-- 
2.27.0



* [PATCH v2 4/4] kbuild: remove guarding from TRIM_UNUSED_KSYMS
  2021-03-09 15:17 [PATCH v2 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS Masahiro Yamada
                   ` (2 preceding siblings ...)
  2021-03-09 15:17 ` [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass Masahiro Yamada
@ 2021-03-09 15:17 ` Masahiro Yamada
  2021-03-09 19:54   ` Linus Torvalds
  2021-03-10 12:55   ` kernel test robot
  3 siblings, 2 replies; 13+ messages in thread
From: Masahiro Yamada @ 2021-03-09 15:17 UTC (permalink / raw)
  To: linux-kbuild
  Cc: Christoph Hellwig, Linus Torvalds, Jessica Yu, Nicolas Pitre,
	linux-kernel, linux-arch, Masahiro Yamada

Now that the build time cost of this option is at an unnoticeable level,
revert the following two:

  a555bdd0c58c ("Kbuild: enable TRIM_UNUSED_KSYMS again, with some guarding")
  5cf0fd591f2e ("Kbuild: disable TRIM_UNUSED_KSYMS option")

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---

Changes in v2:
  - New patch

 init/Kconfig | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 22946fe5ded9..0cbdc20b9322 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2265,8 +2265,7 @@ config MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS
 	  If unsure, say N.
 
 config TRIM_UNUSED_KSYMS
-	bool "Trim unused exported kernel symbols" if EXPERT
-	depends on !COMPILE_TEST
+	bool "Trim unused exported kernel symbols"
 	help
 	  The kernel and some modules make many symbols available for
 	  other modules to use via EXPORT_SYMBOL() and variants. Depending
-- 
2.27.0



* Re: [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-03-09 15:17 ` [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass Masahiro Yamada
@ 2021-03-09 17:36   ` Nicolas Pitre
  2021-03-09 18:11     ` Masahiro Yamada
  2021-03-17 15:48   ` kernel test robot
  1 sibling, 1 reply; 13+ messages in thread
From: Nicolas Pitre @ 2021-03-09 17:36 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linux-kbuild, Christoph Hellwig, Linus Torvalds, Jessica Yu,
	linux-kernel, linux-arch

On Wed, 10 Mar 2021, Masahiro Yamada wrote:

> Commit a555bdd0c58c ("Kbuild: enable TRIM_UNUSED_KSYMS again, with some
> guarding") re-enabled this feature, but Linus is still unhappy about
> the build time.
> 
> The reason of the slowness is the recursion - this basically works in
> two loops.
> 
> In the first loop, Kbuild builds the entire tree based on the temporary
> autoksyms.h, which contains macro defines to control whether their
> corresponding EXPORT_SYMBOL() is enabled or not, and also gathers all
> symbols required by modules. After the tree traverse, Kbuild updates
> autoksyms.h and triggers the second loop to rebuild source files whose
> EXPORT_SYMBOL() needs flipping.
> 
> This commit re-implements CONFIG_TRIM_UNUSED_KSYMS to make it work in
> one pass. In the new design, unneeded EXPORT_SYMBOL() instances are
> trimmed by the linker instead of the preprocessor.
> 
> After the tree traverse, a linker script snippet <generated/keep-ksyms.h>
> is generated. It feeds the list of necessary sections to vmlinux.lds.S
> and module.lds.S. The other sections fall into /DISCARD/.
> 
> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

I'm not sure I do understand every detail here, especially since it is 
so far away from the version that I originally contributed. But the 
concept looks good.

I still think that there is no way around a recursive approach to get 
the maximum effect with LTO, but given that true LTO still isn't applied 
to mainline after all those years, the recursive approach brings 
nothing. Maybe that could be revisited if true LTO ever makes it into 
mainline, and the desire to reduce the binary size is still relevant 
enough to justify it.

Acked-by: Nicolas Pitre <nico@fluxnic.net>


Nicolas


* Re: [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-03-09 17:36   ` Nicolas Pitre
@ 2021-03-09 18:11     ` Masahiro Yamada
  2021-03-09 19:54       ` Nicolas Pitre
  0 siblings, 1 reply; 13+ messages in thread
From: Masahiro Yamada @ 2021-03-09 18:11 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Linux Kbuild mailing list, Christoph Hellwig, Linus Torvalds,
	Jessica Yu, Linux Kernel Mailing List, linux-arch

On Wed, Mar 10, 2021 at 2:36 AM Nicolas Pitre <nico@fluxnic.net> wrote:
>
> On Wed, 10 Mar 2021, Masahiro Yamada wrote:
>
> > Commit a555bdd0c58c ("Kbuild: enable TRIM_UNUSED_KSYMS again, with some
> > guarding") re-enabled this feature, but Linus is still unhappy about
> > the build time.
> >
> > The reason of the slowness is the recursion - this basically works in
> > two loops.
> >
> > In the first loop, Kbuild builds the entire tree based on the temporary
> > autoksyms.h, which contains macro defines to control whether their
> > corresponding EXPORT_SYMBOL() is enabled or not, and also gathers all
> > symbols required by modules. After the tree traverse, Kbuild updates
> > autoksyms.h and triggers the second loop to rebuild source files whose
> > EXPORT_SYMBOL() needs flipping.
> >
> > This commit re-implements CONFIG_TRIM_UNUSED_KSYMS to make it work in
> > one pass. In the new design, unneeded EXPORT_SYMBOL() instances are
> > trimmed by the linker instead of the preprocessor.
> >
> > After the tree traverse, a linker script snippet <generated/keep-ksyms.h>
> > is generated. It feeds the list of necessary sections to vmlinux.lds.S
> > and module.lds.S. The other sections fall into /DISCARD/.
> >
> > Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
>
> I'm not sure I do understand every detail here, especially since it is
> so far away from the version that I originally contributed. But the
> concept looks good.
>
> I still think that there is no way around a recursive approach to get
> the maximum effect with LTO, but given that true LTO still isn't applied
> to mainline after all those years, the recursive approach brings
> nothing. Maybe that could be revisited if true LTO ever makes it into
> mainline, and the desire to reduce the binary size is still relevant
> enough to justify it.

Hmm, I am confused.

Does this patch change the behavior in the
combination with the "true LTO"?


Please let me borrow this sentence from your article:
"But what LTO does is more like getting rid of branches that simply
float in the air without being connected to anything or which have
become loose due to optimization."
(https://lwn.net/Articles/746780/)


This patch throws unneeded EXPORT_SYMBOL metadata
into the /DISCARD/ section of the linker script.

The approach is different (preprocessor vs linker), but
we will still get the same result; the unneeded
EXPORT_SYMBOLs are disconnected from the main trunk.
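
As a rough sketch of what the linker-side approach amounts to (this is
not the actual script; it is modeled on emit() in
scripts/gen-keep-ksyms.sh from this patch, the symbol list "baz bar baz"
is made up for illustration, and printf is substituted for echo so the
trailing backslash is handled the same way by every shell):

```shell
#!/bin/sh
# Sketch only: mimics the emit() helper from scripts/gen-keep-ksyms.sh.
# Prints a linker-script macro that KEEP()s one per-symbol section for
# each symbol in the list; sections not listed are left for /DISCARD/.
emit() {
	macro="$1"	# macro name, e.g. KSYMTAB
	prefix="$2"	# section prefix, e.g. ___ksymtab+
	syms="$3"	# whitespace-separated symbol names

	printf '#define %s \\\n' "$macro"
	for s in $syms; do
		printf '\tKEEP(*(%s%s)) \\\n' "$prefix" "$s"
	done
	echo
}

# Hypothetical list of symbols needed by modules; duplicates are removed
# and the order is fixed by a locale-independent sort, as in the patch.
syms=$(printf '%s\n' baz bar baz | LC_ALL=C sort -u)

emit "KSYMTAB" "___ksymtab+" "$syms"
```

This prints a "#define KSYMTAB \" line followed by one KEEP() line each
for bar and baz, in that order. Any other ___ksymtab+<sym> input section
is not named anywhere and so ends up matched by KSYM_DISCARDS.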

Then, the true LTO will remove branches floating in the air,
right?

So, what will be lost by this patch?



>
> Acked-by: Nicolas Pitre <nico@fluxnic.net>
>
>
> Nicolas



-- 
Best Regards
Masahiro Yamada


* Re: [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-03-09 18:11     ` Masahiro Yamada
@ 2021-03-09 19:54       ` Nicolas Pitre
  2021-03-09 20:11         ` Rasmus Villemoes
  0 siblings, 1 reply; 13+ messages in thread
From: Nicolas Pitre @ 2021-03-09 19:54 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Linux Kbuild mailing list, Christoph Hellwig, Linus Torvalds,
	Jessica Yu, Linux Kernel Mailing List, linux-arch

On Wed, 10 Mar 2021, Masahiro Yamada wrote:

> On Wed, Mar 10, 2021 at 2:36 AM Nicolas Pitre <nico@fluxnic.net> wrote:
> >
> > On Wed, 10 Mar 2021, Masahiro Yamada wrote:
> >
> > > Commit a555bdd0c58c ("Kbuild: enable TRIM_UNUSED_KSYMS again, with some
> > > guarding") re-enabled this feature, but Linus is still unhappy about
> > > the build time.
> > >
> > > The reason of the slowness is the recursion - this basically works in
> > > two loops.
> > >
> > > In the first loop, Kbuild builds the entire tree based on the temporary
> > > autoksyms.h, which contains macro defines to control whether their
> > > corresponding EXPORT_SYMBOL() is enabled or not, and also gathers all
> > > symbols required by modules. After the tree traverse, Kbuild updates
> > > autoksyms.h and triggers the second loop to rebuild source files whose
> > > EXPORT_SYMBOL() needs flipping.
> > >
> > > This commit re-implements CONFIG_TRIM_UNUSED_KSYMS to make it work in
> > > one pass. In the new design, unneeded EXPORT_SYMBOL() instances are
> > > trimmed by the linker instead of the preprocessor.
> > >
> > > After the tree traverse, a linker script snippet <generated/keep-ksyms.h>
> > > is generated. It feeds the list of necessary sections to vmlinux.lds.S
> > > and module.lds.S. The other sections fall into /DISCARD/.
> > >
> > > Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> >
> > I'm not sure I do understand every detail here, especially since it is
> > so far away from the version that I originally contributed. But the
> > concept looks good.
> >
> > I still think that there is no way around a recursive approach to get
> > the maximum effect with LTO, but given that true LTO still isn't applied
> > to mainline after all those years, the recursive approach brings
> > nothing. Maybe that could be revisited if true LTO ever makes it into
> > mainline, and the desire to reduce the binary size is still relevant
> > enough to justify it.
> 
> Hmm, I am confused.
> 
> Does this patch change the behavior in the
> combination with the "true LTO"?
> 
> Please let me borrow this sentence from your article:
> "But what LTO does is more like getting rid of branches that simply
> float in the air without being connected to anything or which have
> become loose due to optimization."
> (https://lwn.net/Articles/746780/)
> 
> This patch throws unneeded EXPORT_SYMBOL metadata
> into the /DISCARD/ section of the linker script.
> 
> The approach is different (preprocessor vs linker), but
> we will still get the same result; the unneeded
> EXPORT_SYMBOLs are disconnected from the main trunk.
> 
> Then, the true LTO will remove branches floating in the air,
> right?
> 
> So, what will be lost by this patch?

Let's say you have this in module_foo:

int foo(int x)
{
	return 2 + bar(x);
}
EXPORT_SYMBOL(foo);

And module_bar:

int bar(int y)
{
	return 3 * baz(y);
}
EXPORT_SYMBOL(bar);

And this in the main kernel image:

int baz(int z)
{
	return plonk(z);
}
EXPORT_SYMBOL(baz);

Now we build the kernel and modules. Then we realize that nothing 
references symbol "foo". We can trim the "foo" export. But it would be 
necessary to recompile module_foo with LTO (especially if there is 
some other code in that module) to realize that nothing 
references foo() any longer and optimize away the reference to bar(). 
With another round, we now realize that the "bar" export is no longer 
necessary. But that will require another compile round to optimize away 
the reference to baz(). And then a final compilation round with 
LTO to possibly optimize plonk() out of the kernel.

I don't see how you can propagate all this chain reaction with only one 
pass.
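
A toy model of that chain reaction, for illustration only: the
survivors() helper and the "caller:callee" encoding below are made up
and have nothing to do with how Kbuild actually tracks symbols. Each
round can only see the references that the previous "rebuild" removed.

```shell
#!/bin/sh
# survivors LIVE EDGES MAX_ROUNDS
#   LIVE:  whitespace-separated symbol names
#   EDGES: whitespace-separated "caller:callee" references
# Prints the symbols still present after at most MAX_ROUNDS rounds of
# "trim what nobody references, then rebuild".
survivors() {
	live="$1"
	edges="$2"
	max="$3"
	round=0

	while [ "$round" -lt "$max" ]; do
		# A symbol is dead if no remaining edge points at it.
		dead=""
		for s in $live; do
			used=no
			for e in $edges; do
				if [ "${e#*:}" = "$s" ]; then used=yes; fi
			done
			if [ "$used" = no ]; then dead="$dead $s"; fi
		done
		[ -n "$dead" ] || break

		# "Rebuild": drop dead symbols and the references they made.
		new_live=""
		for s in $live; do
			case " $dead " in
			*" $s "*) ;;
			*) new_live="$new_live $s" ;;
			esac
		done
		new_edges=""
		for e in $edges; do
			case " $dead " in
			*" ${e%%:*} "*) ;;
			*) new_edges="$new_edges $e" ;;
			esac
		done
		live="$new_live"
		edges="$new_edges"
		round=$((round + 1))
	done
	echo $live
}

# The example above: nothing references foo; foo -> bar -> baz -> plonk.
survivors "foo bar baz plonk" "foo:bar bar:baz baz:plonk" 1
survivors "foo bar baz plonk" "foo:bar bar:baz baz:plonk" 4
```

With a single round only foo goes away, so bar, baz and plonk all
survive. It takes four rounds before the whole chain is gone.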


Nicolas


* Re: [PATCH v2 4/4] kbuild: remove guarding from TRIM_UNUSED_KSYMS
  2021-03-09 15:17 ` [PATCH v2 4/4] kbuild: remove guarding from TRIM_UNUSED_KSYMS Masahiro Yamada
@ 2021-03-09 19:54   ` Linus Torvalds
  2021-03-10 12:55   ` kernel test robot
  1 sibling, 0 replies; 13+ messages in thread
From: Linus Torvalds @ 2021-03-09 19:54 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Linux Kbuild mailing list, Christoph Hellwig, Jessica Yu,
	Nicolas Pitre, Linux Kernel Mailing List, linux-arch

On Tue, Mar 9, 2021 at 7:18 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> Now that the build time cost of this option is at an unnoticeable level,
> revert the following two:

It might still be a good idea to make it depend on EXPERT. Otherwise
you'll have problems with external modules..

Also, can you actually specify that "unnoticeable level" with numbers?

             Linus


* Re: [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-03-09 19:54       ` Nicolas Pitre
@ 2021-03-09 20:11         ` Rasmus Villemoes
  2021-03-09 20:45           ` Nicolas Pitre
  0 siblings, 1 reply; 13+ messages in thread
From: Rasmus Villemoes @ 2021-03-09 20:11 UTC (permalink / raw)
  To: Nicolas Pitre, Masahiro Yamada
  Cc: Linux Kbuild mailing list, Christoph Hellwig, Linus Torvalds,
	Jessica Yu, Linux Kernel Mailing List, linux-arch

On 09/03/2021 20.54, Nicolas Pitre wrote:
> On Wed, 10 Mar 2021, Masahiro Yamada wrote:
> 

>>> I'm not sure I do understand every detail here, especially since it is
>>> so far away from the version that I originally contributed. But the
>>> concept looks good.
>>>
>>> I still think that there is no way around a recursive approach to get
>>> the maximum effect with LTO, but given that true LTO still isn't applied
>>> to mainline after all those years, the recursive approach brings
>>> nothing. Maybe that could be revisited if true LTO ever makes it into
>>> mainline, and the desire to reduce the binary size is still relevant
>>> enough to justify it.
>>
>> Hmm, I am confused.
>>
>> Does this patch change the behavior in the
>> combination with the "true LTO"?
>>
>> Please let me borrow this sentence from your article:
>> "But what LTO does is more like getting rid of branches that simply
>> float in the air without being connected to anything or which have
>> become loose due to optimization."
>> (https://lwn.net/Articles/746780/)
>>
>> This patch throws unneeded EXPORT_SYMBOL metadata
>> into the /DISCARD/ section of the linker script.
>>
>> The approach is different (preprocessor vs linker), but
>> we will still get the same result; the unneeded
>> EXPORT_SYMBOLs are disconnected from the main trunk.
>>
>> Then, the true LTO will remove branches floating in the air,
>> right?
>>
>> So, what will be lost by this patch?
> 
> Let's say you have this in module_foo:
> 
> int foo(int x)
> {
> 	return 2 + bar(x);
> }
> EXPORT_SYMBOL(foo);
> 
> And module_bar:
> 
> int bar(int y)
> {
> 	return 3 * baz(y);
> }
> EXPORT_SYMBOL(bar);
> 
> And this in the main kernel image:
> 
> int baz(int z)
> {
> 	return plonk(z);
> }
> EXPORT_SYMBOL(baz);
> 
> Now we build the kernel and modules. Then we realize that nothing 
> references symbol "foo". We can trim the "foo" export. But it would be 
> necessary to recompile module_foo with LTO (especially if there is 
> some other code in that module) to realize that nothing 
> references foo() any longer and optimize away the reference to bar(). 

But, does LTO even do that to modules? Sure, the export metadata for foo
vanishes, so there's no function pointer reference to foo, but given
that modules are just -r links, the compiler/linker can't really assume
that the generated object won't later be linked with something that does
require foo? At least for the simpler case of --gc-sections, ld docs say:

'--gc-sections'
...

    This option can be set when doing a partial link (enabled with
     option '-r').  In this case the root of symbols kept must be
     explicitly specified either by one of the options '--entry',
     '--undefined', or '--gc-keep-exported' or by a 'ENTRY' command in
     the linker script.

and I would assume that for LTO, --gc-keep-exported would be the only
sane semantics (keep any external symbol with default visibility).

Can you point me at a tree/set of LTO patches and a toolchain where the
previous implementation would actually eventually eliminate bar() from
module_bar?

Rasmus


* Re: [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-03-09 20:11         ` Rasmus Villemoes
@ 2021-03-09 20:45           ` Nicolas Pitre
  0 siblings, 0 replies; 13+ messages in thread
From: Nicolas Pitre @ 2021-03-09 20:45 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Masahiro Yamada, Linux Kbuild mailing list, Christoph Hellwig,
	Linus Torvalds, Jessica Yu, Linux Kernel Mailing List,
	linux-arch

On Tue, 9 Mar 2021, Rasmus Villemoes wrote:

> On 09/03/2021 20.54, Nicolas Pitre wrote:
> > On Wed, 10 Mar 2021, Masahiro Yamada wrote:
> > 
> 
> >>> I'm not sure I do understand every detail here, especially since it is
> >>> so far away from the version that I originally contributed. But the
> >>> concept looks good.
> >>>
> >>> I still think that there is no way around a recursive approach to get
> >>> the maximum effect with LTO, but given that true LTO still isn't applied
> >>> to mainline after all those years, the recursive approach brings
> >>> nothing. Maybe that could be revisited if true LTO ever makes it into
> >>> mainline, and the desire to reduce the binary size is still relevant
> >>> enough to justify it.
> >>
> >> Hmm, I am confused.
> >>
> >> Does this patch change the behavior in the
> >> combination with the "true LTO"?
> >>
> >> Please let me borrow this sentence from your article:
> >> "But what LTO does is more like getting rid of branches that simply
> >> float in the air without being connected to anything or which have
> >> become loose due to optimization."
> >> (https://lwn.net/Articles/746780/)
> >>
> >> This patch throws unneeded EXPORT_SYMBOL metadata
> >> into the /DISCARD/ section of the linker script.
> >>
> >> The approach is different (preprocessor vs linker), but
> >> we will still get the same result; the unneeded
> >> EXPORT_SYMBOLs are disconnected from the main trunk.
> >>
> >> Then, the true LTO will remove branches floating in the air,
> >> right?
> >>
> >> So, what will be lost by this patch?
> > 
> > Let's say you have this in module_foo:
> > 
> > int foo(int x)
> > {
> > 	return 2 + bar(x);
> > }
> > EXPORT_SYMBOL(foo);
> > 
> > And module_bar:
> > 
> > int bar(int y)
> > {
> > 	return 3 * baz(y);
> > }
> > EXPORT_SYMBOL(bar);
> > 
> > And this in the main kernel image:
> > 
> > int baz(int z)
> > {
> > 	return plonk(z);
> > }
> > EXPORT_SYMBOL(baz);
> > 
> > Now we build the kernel and modules. Then we realize that nothing 
> > references symbol "foo". We can trim the "foo" export. But it would be 
> > necessary to recompile module_foo with LTO (especially if there is 
> > some other code in that module) to realize that nothing 
> > references foo() any longer and optimize away the reference to bar(). 
> 
> But, does LTO even do that to modules? Sure, the export metadata for foo
> vanishes, so there's no function pointer reference to foo, but given
> that modules are just -r links, the compiler/linker can't really assume
> that the generated object won't later be linked with something that does
> require foo? At least for the simpler case of --gc-sections, ld docs say:
> 
> '--gc-sections'
> ...
> 
>     This option can be set when doing a partial link (enabled with
>      option '-r').  In this case the root of symbols kept must be
>      explicitly specified either by one of the options '--entry',
>      '--undefined', or '--gc-keep-exported' or by a 'ENTRY' command in
>      the linker script.
> 
> and I would assume that for LTO, --gc-keep-exported would be the only
> sane semantics (keep any external symbol with default visibility).
> 
> Can you point me at a tree/set of LTO patches and a toolchain where the
> previous implementation would actually eventually eliminate bar() from
> module_bar?

All that I readily have is a link to the article I wrote with the 
results I obtained at the time: https://lwn.net/Articles/746780/.
The toolchain and kernel tree are rather old at this point and some 
effort would be required to modernize everything.

I don't remember if there was something special to do LTO on modules. 
Maybe Andi Kleen had something in his patchset for that: 
https://github.com/andikleen/linux-misc/blob/lto-415-2/Documentation/lto-build
He mentions that LTO isn't very effective with modules enabled, but what 
I demonstrated in my article is that LTO becomes very effective with or 
without modules as long as CONFIG_TRIM_UNUSED_KSYMS is enabled.

Having CONFIG_TRIM_UNUSED_KSYMS in one pass is likely to still be pretty 
effective even if possibly not optimal. And maybe people don't really 
care for the missing 10% anyway (I'm just throwing a number in the air 
here).


Nicolas


* Re: [PATCH v2 4/4] kbuild: remove guarding from TRIM_UNUSED_KSYMS
  2021-03-09 15:17 ` [PATCH v2 4/4] kbuild: remove guarding from TRIM_UNUSED_KSYMS Masahiro Yamada
  2021-03-09 19:54   ` Linus Torvalds
@ 2021-03-10 12:55   ` kernel test robot
  1 sibling, 0 replies; 13+ messages in thread
From: kernel test robot @ 2021-03-10 12:55 UTC (permalink / raw)
  To: Masahiro Yamada, linux-kbuild
  Cc: kbuild-all, clang-built-linux, Christoph Hellwig, Jessica Yu,
	Nicolas Pitre, linux-kernel, linux-arch, Masahiro Yamada

[-- Attachment #1: Type: text/plain, Size: 27347 bytes --]

Hi Masahiro,

I love your patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on linus/master v5.12-rc2 next-20210309]
[cannot apply to kbuild/for-next asm-generic/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvement-of-CONFIG_TRIM_UNUSED_KSYMS/20210309-232117
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: s390-randconfig-r012-20210308 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 820f508b08d7c94b2dd7847e9710d2bc36d3dd45)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install s390 cross compiling tool for clang build
        # apt-get install binutils-s390x-linux-gnu
        # https://github.com/0day-ci/linux/commit/16dfb9e33ee3b9d411560c44c016edc6a3e27e47
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvement-of-CONFIG_TRIM_UNUSED_KSYMS/20210309-232117
        git checkout 16dfb9e33ee3b9d411560c44c016edc6a3e27e47
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=s390 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from drivers/spi/spi-mux.c:10:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:464:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
>> include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:36:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:105:32: note: expanded from macro '__swab16'
           (__builtin_constant_p((__u16)(x)) ?     \
                                         ^
   In file included from drivers/spi/spi-mux.c:10:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:36:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:106:21: note: expanded from macro '__swab16'
           ___constant_swab16(x) :                 \
                              ^
   include/uapi/linux/swab.h:15:12: note: expanded from macro '___constant_swab16'
           (((__u16)(x) & (__u16)0x00ffU) << 8) |                  \
                     ^
   In file included from drivers/spi/spi-mux.c:10:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:36:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:106:21: note: expanded from macro '__swab16'
           ___constant_swab16(x) :                 \
                              ^
   include/uapi/linux/swab.h:16:12: note: expanded from macro '___constant_swab16'
           (((__u16)(x) & (__u16)0xff00U) >> 8)))
                     ^
   In file included from drivers/spi/spi-mux.c:10:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:36:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:107:12: note: expanded from macro '__swab16'
           __fswab16(x))
                     ^
   In file included from drivers/spi/spi-mux.c:10:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:118:32: note: expanded from macro '__swab32'
           (__builtin_constant_p((__u32)(x)) ?     \
                                         ^
   In file included from drivers/spi/spi-mux.c:10:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:19:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0x000000ffUL) << 24) |            \
                     ^
   In file included from drivers/spi/spi-mux.c:10:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:20:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0x0000ff00UL) <<  8) |            \
                     ^
   In file included from drivers/spi/spi-mux.c:10:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:21:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0x00ff0000UL) >>  8) |            \
                     ^
   In file included from drivers/spi/spi-mux.c:10:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:22:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0xff000000UL) >> 24)))
                     ^
   In file included from drivers/spi/spi-mux.c:10:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:120:12: note: expanded from macro '__swab32'
           __fswab32(x))
                     ^
   In file included from drivers/spi/spi-mux.c:10:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
>> include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
>> include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
>> include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsb(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
>> include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsw(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
>> include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsl(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
>> include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesb(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesw(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesl(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   20 warnings generated.
--
   In file included from drivers/spi/spi-sc18is602.c:11:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:464:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
>> include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:36:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:105:32: note: expanded from macro '__swab16'
           (__builtin_constant_p((__u16)(x)) ?     \
                                         ^
   In file included from drivers/spi/spi-sc18is602.c:11:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:36:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:106:21: note: expanded from macro '__swab16'
           ___constant_swab16(x) :                 \
                              ^
   include/uapi/linux/swab.h:15:12: note: expanded from macro '___constant_swab16'
           (((__u16)(x) & (__u16)0x00ffU) << 8) |                  \
                     ^
   In file included from drivers/spi/spi-sc18is602.c:11:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:36:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:106:21: note: expanded from macro '__swab16'
           ___constant_swab16(x) :                 \
                              ^
   include/uapi/linux/swab.h:16:12: note: expanded from macro '___constant_swab16'
           (((__u16)(x) & (__u16)0xff00U) >> 8)))
                     ^
   In file included from drivers/spi/spi-sc18is602.c:11:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:36:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:107:12: note: expanded from macro '__swab16'
           __fswab16(x))
                     ^
   In file included from drivers/spi/spi-sc18is602.c:11:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:118:32: note: expanded from macro '__swab32'
           (__builtin_constant_p((__u32)(x)) ?     \
                                         ^
   In file included from drivers/spi/spi-sc18is602.c:11:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:19:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0x000000ffUL) << 24) |            \
                     ^
   In file included from drivers/spi/spi-sc18is602.c:11:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:20:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0x0000ff00UL) <<  8) |            \
                     ^
   In file included from drivers/spi/spi-sc18is602.c:11:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:21:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0x00ff0000UL) >>  8) |            \
                     ^
   In file included from drivers/spi/spi-sc18is602.c:11:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:22:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0xff000000UL) >> 24)))
                     ^
   In file included from drivers/spi/spi-sc18is602.c:11:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:120:12: note: expanded from macro '__swab32'
           __fswab32(x))
                     ^
   In file included from drivers/spi/spi-sc18is602.c:11:
   In file included from include/linux/spi/spi.h:15:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:80:
>> include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
>> include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
>> include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
>> include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsb(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
>> include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsw(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
>> include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsl(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
>> include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesb(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesw(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesl(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   drivers/spi/spi-sc18is602.c:265:12: warning: cast to smaller integer type 'enum chips' from 'const void *' [-Wvoid-pointer-to-enum-cast]
                   hw->id = (enum chips)of_device_get_match_data(&client->dev);
                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   21 warnings generated.
..

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 29802 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass
  2021-03-09 15:17 ` [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass Masahiro Yamada
  2021-03-09 17:36   ` Nicolas Pitre
@ 2021-03-17 15:48   ` kernel test robot
  1 sibling, 0 replies; 13+ messages in thread
From: kernel test robot @ 2021-03-17 15:48 UTC (permalink / raw)
  To: Masahiro Yamada, linux-kbuild
  Cc: kbuild-all, clang-built-linux, Christoph Hellwig, Jessica Yu,
	Nicolas Pitre, linux-kernel, linux-arch, Masahiro Yamada

[-- Attachment #1: Type: text/plain, Size: 6106 bytes --]

Hi Masahiro,

I love your patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on linus/master v5.12-rc3]
[cannot apply to kbuild/for-next asm-generic/master next-20210317]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Masahiro-Yamada/kbuild-build-speed-improvement-of-CONFIG_TRIM_UNUSED_KSYMS/20210309-232117
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: x86_64-randconfig-a015-20210317 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 8ef111222a3dd12a9175f69c3bff598c46e8bdf7)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/331032950fb793dce926a30d68897756d504c4a9
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Masahiro-Yamada/kbuild-build-speed-improvement-of-CONFIG_TRIM_UNUSED_KSYMS/20210309-232117
        git checkout 331032950fb793dce926a30d68897756d504c4a9
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/staging/comedi/drivers/cb_pcidas64.c:232:19: warning: unused function 'analog_trig_low_threshold_bits' [-Wunused-function]
   static inline u16 analog_trig_low_threshold_bits(u16 threshold)
                     ^
>> drivers/staging/comedi/drivers/cb_pcidas64.c:383:28: warning: unused function 'dma_chain_flag_bits' [-Wunused-function]
   static inline unsigned int dma_chain_flag_bits(u16 prepost_bits)
                              ^
   2 warnings generated.
--
>> drivers/staging/rts5208/xd.c:34:19: warning: unused function 'xd_check_err_code' [-Wunused-function]
   static inline int xd_check_err_code(struct rtsx_chip *chip, u8 err_code)
                     ^
   1 warning generated.
--
>> drivers/video/fbdev/tridentfb.c:1127:20: warning: unused function 'shadowmode_off' [-Wunused-function]
   static inline void shadowmode_off(struct tridentfb_par *par)
                      ^
   1 warning generated.
--
>> drivers/video/fbdev/via/via-core.c:62:19: warning: unused function 'viafb_mmio_read' [-Wunused-function]
   static inline int viafb_mmio_read(int reg)
                     ^
   1 warning generated.
--
   mm/compaction.c:56:27: warning: unused variable 'HPAGE_FRAG_CHECK_INTERVAL_MSEC' [-Wunused-const-variable]
   static const unsigned int HPAGE_FRAG_CHECK_INTERVAL_MSEC = 500;
                             ^
>> mm/compaction.c:462:20: warning: unused function 'isolation_suitable' [-Wunused-function]
   static inline bool isolation_suitable(struct compact_control *cc,
                      ^
>> mm/compaction.c:468:20: warning: unused function 'pageblock_skip_persistent' [-Wunused-function]
   static inline bool pageblock_skip_persistent(struct page *page)
                      ^
>> mm/compaction.c:473:20: warning: unused function 'update_pageblock_skip' [-Wunused-function]
   static inline void update_pageblock_skip(struct compact_control *cc,
                      ^
   4 warnings generated.
--
>> mm/z3fold.c:287:37: warning: unused function 'handle_to_z3fold_header' [-Wunused-function]
   static inline struct z3fold_header *handle_to_z3fold_header(unsigned long h)
                                       ^
   1 warning generated.
--
>> security/apparmor/file.c:150:20: warning: unused function 'is_deleted' [-Wunused-function]
   static inline bool is_deleted(struct dentry *dentry)
                      ^
   1 warning generated.
--
>> security/apparmor/label.c:1258:20: warning: unused function 'label_is_visible' [-Wunused-function]
   static inline bool label_is_visible(struct aa_profile *profile,
                      ^
   1 warning generated.
--
>> drivers/hwmon/sis5595.c:158:18: warning: unused function 'DIV_TO_REG' [-Wunused-function]
   static inline u8 DIV_TO_REG(int val)
                    ^
   1 warning generated.
--
>> drivers/mfd/max8925-core.c:472:40: warning: unused function 'irq_to_max8925' [-Wunused-function]
   static inline struct max8925_irq_data *irq_to_max8925(struct max8925_chip *chip,
                                          ^
   1 warning generated.
--
>> drivers/misc/hpilo.c:395:19: warning: unused function 'is_device_reset' [-Wunused-function]
   static inline int is_device_reset(struct ilo_hwinfo *hw)
                     ^
   1 warning generated.
..


vim +/is_deleted +150 security/apparmor/file.c

6380bd8ddf613b John Johansen 2010-07-29  143  
aebd873e8d3e34 John Johansen 2017-06-09  144  /**
aebd873e8d3e34 John Johansen 2017-06-09  145   * is_deleted - test if a file has been completely unlinked
aebd873e8d3e34 John Johansen 2017-06-09  146   * @dentry: dentry of file to test for deletion  (NOT NULL)
aebd873e8d3e34 John Johansen 2017-06-09  147   *
e37986097ba63c Zou Wei       2020-04-28  148   * Returns: true if deleted else false
aebd873e8d3e34 John Johansen 2017-06-09  149   */
aebd873e8d3e34 John Johansen 2017-06-09 @150  static inline bool is_deleted(struct dentry *dentry)
aebd873e8d3e34 John Johansen 2017-06-09  151  {
aebd873e8d3e34 John Johansen 2017-06-09  152  	if (d_unlinked(dentry) && d_backing_inode(dentry)->i_nlink == 0)
e37986097ba63c Zou Wei       2020-04-28  153  		return true;
e37986097ba63c Zou Wei       2020-04-28  154  	return false;
aebd873e8d3e34 John Johansen 2017-06-09  155  }
aebd873e8d3e34 John Johansen 2017-06-09  156  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 38529 bytes --]


Thread overview: 13+ messages
2021-03-09 15:17 [PATCH v2 0/4] kbuild: build speed improvement of CONFIG_TRIM_UNUSED_KSYMS Masahiro Yamada
2021-03-09 15:17 ` [PATCH v2 1/4] export.h: make __ksymtab_strings per-symbol section Masahiro Yamada
2021-03-09 15:17 ` [PATCH v2 2/4] kbuild: separate out vmlinux.lds generation Masahiro Yamada
2021-03-09 15:17 ` [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass Masahiro Yamada
2021-03-09 17:36   ` Nicolas Pitre
2021-03-09 18:11     ` Masahiro Yamada
2021-03-09 19:54       ` Nicolas Pitre
2021-03-09 20:11         ` Rasmus Villemoes
2021-03-09 20:45           ` Nicolas Pitre
2021-03-17 15:48   ` kernel test robot
2021-03-09 15:17 ` [PATCH v2 4/4] kbuild: remove guarding from TRIM_UNUSED_KSYMS Masahiro Yamada
2021-03-09 19:54   ` Linus Torvalds
2021-03-10 12:55   ` kernel test robot
