* [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections
@ 2016-08-24 12:29 Nicholas Piggin
  2016-08-24 12:29 ` [PATCH 1/3] kbuild: allow architectures to use thin archives instead of ld -r Nicholas Piggin
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Nicholas Piggin @ 2016-08-24 12:29 UTC (permalink / raw)
  To: Michal Marek, linux-kbuild
  Cc: Nicholas Piggin, linux-arch, Sam Ravnborg, Stephen Rothwell,
	Arnd Bergmann, Nicolas Pitre, Segher Boessenkool, Alan Modra

Hi Michal,

I decided to do a v3 because I had accumulated several changes, as
described in the individual patches.

I've also left off the powerpc arch patches -- they can be found
in previous posts, for reference.

I've again tested ARM and it seems to be building okay and without
performance regression with my configurations. I think it's going
to be a matter of some toolchain options for them to go through.
arm64, x86, powerpc, and arm for me all built fine with thin archives
and --gc-sections enabled, so I can't see there being a fundamental
issue that can't be solved. Worst case, the incremental link option
can remain for a time.

Thanks,
Nick

Nicholas Piggin (2):
  kbuild: allow archs to select link dead code/data elimination
  kbuild: add arch specific post-link Makefile

Stephen Rothwell (1):
  kbuild: allow architectures to use thin archives instead of ld -r

 Documentation/kbuild/makefiles.txt | 16 +++++++++
 Makefile                           | 19 ++++++++--
 arch/Kconfig                       | 26 ++++++++++++++
 include/asm-generic/vmlinux.lds.h  | 52 ++++++++++++++++------------
 include/linux/compiler.h           | 23 ++++++++++++
 include/linux/export.h             | 30 ++++++++--------
 include/linux/init.h               | 38 +++++++-------------
 init/Makefile                      |  2 ++
 scripts/Makefile.build             | 23 +++++++++---
 scripts/Makefile.modpost           | 14 +++++---
 scripts/link-vmlinux.sh            | 71 ++++++++++++++++++++++++++++++++------
 11 files changed, 228 insertions(+), 86 deletions(-)

-- 
2.8.1



* [PATCH 1/3] kbuild: allow architectures to use thin archives instead of ld -r
  2016-08-24 12:29 [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections Nicholas Piggin
@ 2016-08-24 12:29 ` Nicholas Piggin
  2016-08-24 12:29 ` [PATCH 2/3] kbuild: allow archs to select link dead code/data elimination Nicholas Piggin
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2016-08-24 12:29 UTC (permalink / raw)
  To: Michal Marek, linux-kbuild
  Cc: Nicholas Piggin, linux-arch, Sam Ravnborg, Stephen Rothwell,
	Arnd Bergmann, Nicolas Pitre, Segher Boessenkool, Alan Modra

From: Stephen Rothwell <sfr@canb.auug.org.au>

ld -r is an incremental link used to create built-in.o files in build
subdirectories. It produces relocatable object files containing all
its input files, and these are then pulled together and relocated
in the final link. Aside from the bloat, this constrains the final
link relocations, which has bitten large powerpc builds with
unresolvable relocations in the final link.

Alan Modra has recommended the kernel use thin archives for linking.
This is an alternative and means that the linker has more information
available to it when it links the kernel.

This patch enables a config option architectures can select, which
causes all built-in.o files to be built as thin archives. built-in.o
files in subdirectories do not get symbol table or index attached,
which improves speed and size. The final link pass creates a
built-in.o archive in the root output directory which includes the
symbol table and index. The linker then takes this file as its input.

The --whole-archive linker option is required, because the linker now
has visibility to every individual object file, and it will otherwise
just completely avoid including those without external references
(consider a file with EXPORT_SYMBOL or initcall or hardware exceptions
as its only entry points). The traditional build works "by luck" as
built-in.o files are large enough that they're going to get external
references. However, this optimisation is unpredictable for the kernel
(due to the entry points described above), ineffective at culling unused
code, and costly because the .o files have to be searched for references.
Superior alternatives for link-time culling should be used instead.
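
The difference in link inputs between the two modes can be sketched in
shell. This is a simplified illustration rather than the code from the
patch: link_inputs is a hypothetical helper, and the directory names in
the usage lines are made-up example values.

```shell
#!/bin/sh
# Sketch of how the link inputs differ between the two modes.
# link_inputs is a hypothetical helper, not a function from the patch.
link_inputs()
{
	if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
		# One thin archive references every object; --whole-archive
		# forces the linker to include members nothing else refers to.
		echo "--whole-archive built-in.o"
	else
		# Traditional link: ld -r outputs are passed directly; the
		# group lets the linker rescan any archives in the list.
		echo "${KBUILD_VMLINUX_INIT} --start-group ${KBUILD_VMLINUX_MAIN} --end-group"
	fi
}

# Example values, assumed for illustration only.
KBUILD_VMLINUX_INIT="init/built-in.o"
KBUILD_VMLINUX_MAIN="kernel/built-in.o mm/built-in.o"

CONFIG_THIN_ARCHIVES=
link_inputs
CONFIG_THIN_ARCHIVES=y
link_inputs
```

With the option unset the helper prints the grouped object list; with it
set, the whole input collapses to the single root archive.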

Build characteristics for inclink vs thinarc, on a small powerpc64le
pseries VM with a modest .config:

                                  inclink       thinarc
sizes
vmlinux                        15 618 680    15 625 028
sum of all built-in.o          56 091 808     1 054 334
sum excluding root built-in.o                   151 430

find -name built-in.o | xargs rm ; time make vmlinux
real                              22.772s       21.143s
user                              13.280s       13.430s
sys                                4.310s        2.750s

- Final kernel pulled in only about 6K more, which shows how
  ineffective the object file culling is.
- Build performance looks improved due to less pagecache activity.
  On IO constrained systems it could be a bigger win.
- Build size saving is significant.
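
The first and last points can be checked directly from the table. The
sketch below only redoes the arithmetic on the figures quoted above; the
variable names are chosen for this illustration.

```shell
#!/bin/sh
# Figures copied from the table above (bytes).
vmlinux_inclink=15618680
vmlinux_thinarc=15625028
builtin_inclink=56091808
builtin_thinarc=1054334

# Extra bytes pulled into vmlinux when every object is linked in.
vmlinux_growth=$((vmlinux_thinarc - vmlinux_inclink))

# How many times smaller the built-in.o files are in total (integer part).
builtin_ratio=$((builtin_inclink / builtin_thinarc))

echo "vmlinux growth: ${vmlinux_growth} bytes"
echo "built-in.o total: about ${builtin_ratio}x smaller"
```

This reports a growth of 6348 bytes and a roughly 53-fold reduction in
the combined size of the built-in.o files.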

Side note: the toolchain understands archives, so there are some tricks:
$ ar t built-in.o          # list all files you linked with
$ size built-in.o          # and their sizes
$ objdump -d built-in.o    # disassembly (unrelocated) with filenames
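
Since a built-in.o may now be either an ld -r object or an archive, it
can be handy to tell them apart without the toolchain. The sketch below
looks at the magic string at the start of the file: "!<thin>" for thin
archives and "!<arch>" for regular ones. archive_kind is a hypothetical
helper, and the stand-in files hold only the magic line, which is enough
to exercise it.

```shell
#!/bin/sh
# archive_kind is a hypothetical helper: it classifies a file by the
# 7-byte magic string that starts every ar archive.
archive_kind()
{
	magic=$(head -c 7 "$1")
	case "$magic" in
	'!<thin>') echo thin ;;
	'!<arch>') echo regular ;;
	*)         echo other ;;
	esac
}

# Stand-in files for illustration; real built-in.o files would be used
# in a build tree.
tmp=$(mktemp -d)
printf '!<thin>\n' > "$tmp/thin.o"
printf '!<arch>\n' > "$tmp/regular.o"
printf 'not an archive\n' > "$tmp/other.o"

for f in thin.o regular.o other.o; do
	echo "$f: $(archive_kind "$tmp/$f")"
done
rm -rf "$tmp"
```

An ELF relocatable produced by ld -r would fall into the "other" case.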

Implementation by sfr, minor tweaks by npiggin.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

---
Changes since v1
- Fixed um build breakage
- Fixed linking against lib.a archives (add symbol table)
- Tested x86 builds
- Tested arm64 defconfig thin archives cross compile
  inclink - 385s 2.7GB; thinarc - 377s 1.8GB

Changes since v2
- Add symbol index for lib.a archives, same as incremental link build

 arch/Kconfig            |  6 +++++
 scripts/Makefile.build  | 23 +++++++++++++---
 scripts/link-vmlinux.sh | 71 +++++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 85 insertions(+), 15 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index d794384..1330bf4 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -424,6 +424,12 @@ config CC_STACKPROTECTOR_STRONG
 
 endchoice
 
+config THIN_ARCHIVES
+	bool
+	help
+	  Select this if the architecture wants to use thin archives
+	  instead of ld -r to create the built-in.o files.
+
 config HAVE_CONTEXT_TRACKING
 	bool
 	help
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 0d1ca5b..578ded1 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -358,12 +358,22 @@ $(sort $(subdir-obj-y)): $(subdir-ym) ;
 # Rule to compile a set of .o files into one .o file
 #
 ifdef builtin-target
-quiet_cmd_link_o_target = LD      $@
+
+ifdef CONFIG_THIN_ARCHIVES
+  cmd_make_builtin = rm -f $@; $(AR) rcST$(KBUILD_ARFLAGS)
+  cmd_make_empty_builtin = rm -f $@; $(AR) rcST$(KBUILD_ARFLAGS)
+  quiet_cmd_link_o_target = AR      $@
+else
+  cmd_make_builtin = $(LD) $(ld_flags) -r -o
+  cmd_make_empty_builtin = rm -f $@; $(AR) rcs$(KBUILD_ARFLAGS)
+  quiet_cmd_link_o_target = LD      $@
+endif
+
 # If the list of objects to link is empty, just create an empty built-in.o
 cmd_link_o_target = $(if $(strip $(obj-y)),\
-		      $(LD) $(ld_flags) -r -o $@ $(filter $(obj-y), $^) \
+		      $(cmd_make_builtin) $@ $(filter $(obj-y), $^) \
 		      $(cmd_secanalysis),\
-		      rm -f $@; $(AR) rcs$(KBUILD_ARFLAGS) $@)
+		      $(cmd_make_empty_builtin) $@)
 
 $(builtin-target): $(obj-y) FORCE
 	$(call if_changed,link_o_target)
@@ -389,7 +399,12 @@ $(modorder-target): $(subdir-ym) FORCE
 #
 ifdef lib-target
 quiet_cmd_link_l_target = AR      $@
-cmd_link_l_target = rm -f $@; $(AR) rcs$(KBUILD_ARFLAGS) $@ $(lib-y)
+
+ifdef CONFIG_THIN_ARCHIVES
+  cmd_link_l_target = rm -f $@; $(AR) rcsT$(KBUILD_ARFLAGS) $@ $(lib-y)
+else
+  cmd_link_l_target = rm -f $@; $(AR) rcs$(KBUILD_ARFLAGS) $@ $(lib-y)
+endif
 
 $(lib-target): $(lib-y) FORCE
 	$(call if_changed,link_l_target)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index f0f6d9d..2f8a615 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -37,12 +37,40 @@ info()
 	fi
 }
 
+# Thin archive build here makes a final archive with
+# symbol table and indexes from vmlinux objects, which can be
+# used as input to linker.
+#
+# Traditional incremental style of link does not require this step
+#
+# built-in.o output file
+#
+archive_builtin()
+{
+	if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
+		info AR built-in.o
+		rm -f built-in.o;
+		${AR} rcsT${KBUILD_ARFLAGS} built-in.o			\
+					${KBUILD_VMLINUX_INIT}		\
+					${KBUILD_VMLINUX_MAIN}
+	fi
+}
+
 # Link of vmlinux.o used for section mismatch analysis
 # ${1} output file
 modpost_link()
 {
-	${LD} ${LDFLAGS} -r -o ${1} ${KBUILD_VMLINUX_INIT}                   \
-		--start-group ${KBUILD_VMLINUX_MAIN} --end-group
+	local objects
+
+	if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
+		objects="--whole-archive built-in.o"
+	else
+		objects="${KBUILD_VMLINUX_INIT}				\
+			--start-group					\
+			${KBUILD_VMLINUX_MAIN}				\
+			--end-group"
+	fi
+	${LD} ${LDFLAGS} -r -o ${1} ${objects}
 }
 
 # Link of vmlinux
@@ -51,18 +79,36 @@ modpost_link()
 vmlinux_link()
 {
 	local lds="${objtree}/${KBUILD_LDS}"
+	local objects
 
 	if [ "${SRCARCH}" != "um" ]; then
-		${LD} ${LDFLAGS} ${LDFLAGS_vmlinux} -o ${2}                  \
-			-T ${lds} ${KBUILD_VMLINUX_INIT}                     \
-			--start-group ${KBUILD_VMLINUX_MAIN} --end-group ${1}
+		if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
+			objects="--whole-archive built-in.o ${1}"
+		else
+			objects="${KBUILD_VMLINUX_INIT}			\
+				--start-group				\
+				${KBUILD_VMLINUX_MAIN}			\
+				--end-group				\
+				${1}"
+		fi
+
+		${LD} ${LDFLAGS} ${LDFLAGS_vmlinux} -o ${2}		\
+			-T ${lds} ${objects}
 	else
-		${CC} ${CFLAGS_vmlinux} -o ${2}                              \
-			-Wl,-T,${lds} ${KBUILD_VMLINUX_INIT}                 \
-			-Wl,--start-group                                    \
-				 ${KBUILD_VMLINUX_MAIN}                      \
-			-Wl,--end-group                                      \
-			-lutil -lrt -lpthread ${1}
+		if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
+			objects="-Wl,--whole-archive built-in.o ${1}"
+		else
+			objects="${KBUILD_VMLINUX_INIT}			\
+				-Wl,--start-group			\
+				${KBUILD_VMLINUX_MAIN}			\
+				-Wl,--end-group				\
+				${1}"
+		fi
+
+		${CC} ${CFLAGS_vmlinux} -o ${2}				\
+			-Wl,-T,${lds}					\
+			${objects}					\
+			-lutil -lrt -lpthread
 		rm -f linux
 	fi
 }
@@ -119,6 +165,7 @@ cleanup()
 	rm -f .tmp_kallsyms*
 	rm -f .tmp_version
 	rm -f .tmp_vmlinux*
+	rm -f built-in.o
 	rm -f System.map
 	rm -f vmlinux
 	rm -f vmlinux.o
@@ -162,6 +209,8 @@ case "${KCONFIG_CONFIG}" in
 	. "./${KCONFIG_CONFIG}"
 esac
 
+archive_builtin
+
 #link vmlinux.o
 info LD vmlinux.o
 modpost_link vmlinux.o
-- 
2.8.1



* [PATCH 2/3] kbuild: allow archs to select link dead code/data elimination
  2016-08-24 12:29 [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections Nicholas Piggin
  2016-08-24 12:29 ` [PATCH 1/3] kbuild: allow architectures to use thin archives instead of ld -r Nicholas Piggin
@ 2016-08-24 12:29 ` Nicholas Piggin
  2016-08-24 12:29 ` [PATCH 3/3] kbuild: add arch specific post-link Makefile Nicholas Piggin
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2016-08-24 12:29 UTC (permalink / raw)
  To: Michal Marek, linux-kbuild
  Cc: Nicholas Piggin, linux-arch, Sam Ravnborg, Stephen Rothwell,
	Arnd Bergmann, Nicolas Pitre, Segher Boessenkool, Alan Modra

Introduce LD_DEAD_CODE_DATA_ELIMINATION option for architectures to
select to build with -ffunction-sections, -fdata-sections, and link
with --gc-sections. It requires some work (documented) to ensure all
unreferenced entrypoints are live, and requires toolchain and build
verification, so it is made a per-arch option for now.
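
Part of that work is in the linker script: the DATA_DATA pattern in the
vmlinux.lds.h hunk has to pick up the .data.<identifier> sections that
-fdata-sections generates without also swallowing the kernel's
.data..<name> sections. The sketch below imitates that pattern with a
shell glob, which behaves closely enough for this illustration.
data_section_class is a hypothetical helper, and .data.my_counter is an
invented section name.

```shell
#!/bin/sh
# data_section_class is a hypothetical helper that mimics the input
# pattern "*(.data .data.[0-9a-zA-Z_]*)" using shell globs.
data_section_class()
{
	case "$1" in
	.data|.data.[0-9a-zA-Z_]*)
		# Plain .data, or a per-variable section from -fdata-sections.
		echo merged ;;
	*)
		# Anything else, e.g. .data..shared_aligned, is placed by
		# its own rule in the linker script.
		echo separate ;;
	esac
}

for s in .data .data.my_counter .data..shared_aligned; do
	echo "$s -> $(data_section_class "$s")"
done
```

The double dot is what keeps the special sections out: the character
after ".data." must be alphanumeric or an underscore for the pattern to
match.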

On a random powerpc64le build, this yields a significant size saving.
It boots and runs fine, but there is a lot I haven't tested yet, so
these savings may be reduced if there are bugs in the link.

    text      data       bss        dec   filename
11169741   1180744   1923176   14273661   vmlinux
10445269   1004127   1919707   13369103   vmlinux.dce

~700K text, ~170K data, 6% removed from kernel image size.
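
The rounded figures follow from the size output above. The sketch below
redoes the subtraction with integer shell arithmetic; the variable names
are chosen for this illustration.

```shell
#!/bin/sh
# Figures copied from the size output above (bytes).
text_before=11169741; data_before=1180744; dec_before=14273661
text_after=10445269;  data_after=1004127;  dec_after=13369103

text_saved=$((text_before - text_after))
data_saved=$((data_before - data_after))
total_saved=$((dec_before - dec_after))

# Percentage scaled by 100 so two decimals survive integer division.
pct_x100=$((total_saved * 10000 / dec_before))

echo "text saved:  $((text_saved / 1024)) KiB"
echo "data saved:  $((data_saved / 1024)) KiB"
printf 'total saved: %d bytes (%d.%02d%%)\n' \
	"$total_saved" $((pct_x100 / 100)) $((pct_x100 % 100))
```

This prints 707 KiB of text, 172 KiB of data, and 904558 bytes (6.33%)
in total, in line with the rounded numbers quoted.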

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

---
Since v1
- More descriptive config option name
- Improve some comments
- Tested x86 builds, boot in KVM

Since v2
- Kill unnecessary changes to remove CONFIG_LTO remnants in arch/x86

 Makefile                          |  9 +++++++
 arch/Kconfig                      | 13 ++++++++++
 include/asm-generic/vmlinux.lds.h | 52 ++++++++++++++++++++++-----------------
 include/linux/compiler.h          | 23 +++++++++++++++++
 include/linux/export.h            | 30 +++++++++++-----------
 include/linux/init.h              | 38 ++++++++++------------------
 init/Makefile                     |  2 ++
 7 files changed, 104 insertions(+), 63 deletions(-)

diff --git a/Makefile b/Makefile
index b409076..b29c6c0 100644
--- a/Makefile
+++ b/Makefile
@@ -618,6 +618,11 @@ include arch/$(SRCARCH)/Makefile
 
 KBUILD_CFLAGS	+= $(call cc-option,-fno-delete-null-pointer-checks,)
 
+ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
+KBUILD_CFLAGS	+= $(call cc-option,-ffunction-sections,)
+KBUILD_CFLAGS	+= $(call cc-option,-fdata-sections,)
+endif
+
 ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
 KBUILD_CFLAGS	+= -Os $(call cc-disable-warning,maybe-uninitialized,)
 else
@@ -819,6 +824,10 @@ LDFLAGS_BUILD_ID = $(patsubst -Wl$(comma)%,%,\
 KBUILD_LDFLAGS_MODULE += $(LDFLAGS_BUILD_ID)
 LDFLAGS_vmlinux += $(LDFLAGS_BUILD_ID)
 
+ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
+LDFLAGS_vmlinux	+= $(call ld-option, --gc-sections,)
+endif
+
 ifeq ($(CONFIG_STRIP_ASM_SYMS),y)
 LDFLAGS_vmlinux	+= $(call ld-option, -X,)
 endif
diff --git a/arch/Kconfig b/arch/Kconfig
index 1330bf4..94138e5 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -430,6 +430,19 @@ config THIN_ARCHIVES
 	  Select this if the architecture wants to use thin archives
 	  instead of ld -r to create the built-in.o files.
 
+config LD_DEAD_CODE_DATA_ELIMINATION
+	bool
+	help
+	  Select this if the architecture wants to do dead code and
+	  data elimination with the linker by compiling with
+	  -ffunction-sections -fdata-sections and linking with
+	  --gc-sections.
+
+	  This requires that the arch annotates or otherwise protects
+	  its external entry points from being discarded. Linker scripts
+	  must also merge .text.*, .data.*, and .bss.* correctly into
+	  output sections.
+
 config HAVE_CONTEXT_TRACKING
 	bool
 	help
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 6a67ab9..a66ffe9 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -196,9 +196,14 @@
 	*(.dtb.init.rodata)						\
 	VMLINUX_SYMBOL(__dtb_end) = .;
 
-/* .data section */
+/*
+ * .data section
+ * -fdata-sections generates .data.identifier which needs to be pulled in
+ * with .data, but don't want to pull in .data..stuff which has its own
+ * requirements. Same for bss.
+ */
 #define DATA_DATA							\
-	*(.data)							\
+	*(.data .data.[0-9a-zA-Z_]*)					\
 	*(.ref.data)							\
 	*(.data..shared_aligned) /* percpu related */			\
 	MEM_KEEP(init.data)						\
@@ -312,76 +317,76 @@
 	/* Kernel symbol table: Normal symbols */			\
 	__ksymtab         : AT(ADDR(__ksymtab) - LOAD_OFFSET) {		\
 		VMLINUX_SYMBOL(__start___ksymtab) = .;			\
-		*(SORT(___ksymtab+*))					\
+		KEEP(*(SORT(___ksymtab+*)))				\
 		VMLINUX_SYMBOL(__stop___ksymtab) = .;			\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__ksymtab_gpl     : AT(ADDR(__ksymtab_gpl) - LOAD_OFFSET) {	\
 		VMLINUX_SYMBOL(__start___ksymtab_gpl) = .;		\
-		*(SORT(___ksymtab_gpl+*))				\
+		KEEP(*(SORT(___ksymtab_gpl+*)))				\
 		VMLINUX_SYMBOL(__stop___ksymtab_gpl) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: Normal unused symbols */		\
 	__ksymtab_unused  : AT(ADDR(__ksymtab_unused) - LOAD_OFFSET) {	\
 		VMLINUX_SYMBOL(__start___ksymtab_unused) = .;		\
-		*(SORT(___ksymtab_unused+*))				\
+		KEEP(*(SORT(___ksymtab_unused+*)))			\
 		VMLINUX_SYMBOL(__stop___ksymtab_unused) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only unused symbols */		\
 	__ksymtab_unused_gpl : AT(ADDR(__ksymtab_unused_gpl) - LOAD_OFFSET) { \
 		VMLINUX_SYMBOL(__start___ksymtab_unused_gpl) = .;	\
-		*(SORT(___ksymtab_unused_gpl+*))			\
+		KEEP(*(SORT(___ksymtab_unused_gpl+*)))			\
 		VMLINUX_SYMBOL(__stop___ksymtab_unused_gpl) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: GPL-future-only symbols */		\
 	__ksymtab_gpl_future : AT(ADDR(__ksymtab_gpl_future) - LOAD_OFFSET) { \
 		VMLINUX_SYMBOL(__start___ksymtab_gpl_future) = .;	\
-		*(SORT(___ksymtab_gpl_future+*))			\
+		KEEP(*(SORT(___ksymtab_gpl_future+*)))			\
 		VMLINUX_SYMBOL(__stop___ksymtab_gpl_future) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: Normal symbols */			\
 	__kcrctab         : AT(ADDR(__kcrctab) - LOAD_OFFSET) {		\
 		VMLINUX_SYMBOL(__start___kcrctab) = .;			\
-		*(SORT(___kcrctab+*))					\
+		KEEP(*(SORT(___kcrctab+*)))				\
 		VMLINUX_SYMBOL(__stop___kcrctab) = .;			\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__kcrctab_gpl     : AT(ADDR(__kcrctab_gpl) - LOAD_OFFSET) {	\
 		VMLINUX_SYMBOL(__start___kcrctab_gpl) = .;		\
-		*(SORT(___kcrctab_gpl+*))				\
+		KEEP(*(SORT(___kcrctab_gpl+*)))				\
 		VMLINUX_SYMBOL(__stop___kcrctab_gpl) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: Normal unused symbols */		\
 	__kcrctab_unused  : AT(ADDR(__kcrctab_unused) - LOAD_OFFSET) {	\
 		VMLINUX_SYMBOL(__start___kcrctab_unused) = .;		\
-		*(SORT(___kcrctab_unused+*))				\
+		KEEP(*(SORT(___kcrctab_unused+*)))			\
 		VMLINUX_SYMBOL(__stop___kcrctab_unused) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only unused symbols */		\
 	__kcrctab_unused_gpl : AT(ADDR(__kcrctab_unused_gpl) - LOAD_OFFSET) { \
 		VMLINUX_SYMBOL(__start___kcrctab_unused_gpl) = .;	\
-		*(SORT(___kcrctab_unused_gpl+*))			\
+		KEEP(*(SORT(___kcrctab_unused_gpl+*)))			\
 		VMLINUX_SYMBOL(__stop___kcrctab_unused_gpl) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: GPL-future-only symbols */		\
 	__kcrctab_gpl_future : AT(ADDR(__kcrctab_gpl_future) - LOAD_OFFSET) { \
 		VMLINUX_SYMBOL(__start___kcrctab_gpl_future) = .;	\
-		*(SORT(___kcrctab_gpl_future+*))			\
+		KEEP(*(SORT(___kcrctab_gpl_future+*)))			\
 		VMLINUX_SYMBOL(__stop___kcrctab_gpl_future) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: strings */				\
         __ksymtab_strings : AT(ADDR(__ksymtab_strings) - LOAD_OFFSET) {	\
-		*(__ksymtab_strings)					\
+		KEEP(*(__ksymtab_strings))				\
 	}								\
 									\
 	/* __*init sections */						\
@@ -416,7 +421,7 @@
 #define SECURITY_INIT							\
 	.security_initcall.init : AT(ADDR(.security_initcall.init) - LOAD_OFFSET) { \
 		VMLINUX_SYMBOL(__security_initcall_start) = .;		\
-		*(.security_initcall.init) 				\
+		KEEP(*(.security_initcall.init))			\
 		VMLINUX_SYMBOL(__security_initcall_end) = .;		\
 	}
 
@@ -424,7 +429,7 @@
  * during second ld run in second ld pass when generating System.map */
 #define TEXT_TEXT							\
 		ALIGN_FUNCTION();					\
-		*(.text.hot .text .text.fixup .text.unlikely)		\
+		*(.text.hot .text .text.fixup .text.unlikely .text.*)	\
 		*(.ref.text)						\
 	MEM_KEEP(init.text)						\
 	MEM_KEEP(exit.text)						\
@@ -519,6 +524,7 @@
 
 /* init and exit section handling */
 #define INIT_DATA							\
+	KEEP(*(SORT(___kentry+*)))					\
 	*(.init.data)							\
 	MEM_DISCARD(init.data)						\
 	KERNEL_CTORS()							\
@@ -581,7 +587,7 @@
 		BSS_FIRST_SECTIONS					\
 		*(.bss..page_aligned)					\
 		*(.dynbss)						\
-		*(.bss)							\
+		*(.bss .bss.[0-9a-zA-Z_]*)				\
 		*(COMMON)						\
 	}
 
@@ -664,12 +670,12 @@
 
 #define INIT_CALLS_LEVEL(level)						\
 		VMLINUX_SYMBOL(__initcall##level##_start) = .;		\
-		*(.initcall##level##.init)				\
-		*(.initcall##level##s.init)				\
+		KEEP(*(.initcall##level##.init))			\
+		KEEP(*(.initcall##level##s.init))			\
 
 #define INIT_CALLS							\
 		VMLINUX_SYMBOL(__initcall_start) = .;			\
-		*(.initcallearly.init)					\
+		KEEP(*(.initcallearly.init))				\
 		INIT_CALLS_LEVEL(0)					\
 		INIT_CALLS_LEVEL(1)					\
 		INIT_CALLS_LEVEL(2)					\
@@ -683,21 +689,21 @@
 
 #define CON_INITCALL							\
 		VMLINUX_SYMBOL(__con_initcall_start) = .;		\
-		*(.con_initcall.init)					\
+		KEEP(*(.con_initcall.init))				\
 		VMLINUX_SYMBOL(__con_initcall_end) = .;
 
 #define SECURITY_INITCALL						\
 		VMLINUX_SYMBOL(__security_initcall_start) = .;		\
-		*(.security_initcall.init)				\
+		KEEP(*(.security_initcall.init))			\
 		VMLINUX_SYMBOL(__security_initcall_end) = .;
 
 #ifdef CONFIG_BLK_DEV_INITRD
 #define INIT_RAM_FS							\
 	. = ALIGN(4);							\
 	VMLINUX_SYMBOL(__initramfs_start) = .;				\
-	*(.init.ramfs)							\
+	KEEP(*(.init.ramfs))						\
 	. = ALIGN(8);							\
-	*(.init.ramfs.info)
+	KEEP(*(.init.ramfs.info))
 #else
 #define INIT_RAM_FS
 #endif
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 793c082..b79a66f 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -184,6 +184,29 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
 # define unreachable() do { } while (1)
 #endif
 
+/*
+ * KENTRY - kernel entry point
+ * This can be used to annotate symbols (functions or data) that are used
+ * without their linker symbol being referenced explicitly. For example,
+ * interrupt vector handlers, or functions in the kernel image that are found
+ * programmatically.
+ *
+ * Not required for symbols exported with EXPORT_SYMBOL, or initcalls. Those
+ * are handled in their own way (with KEEP() in linker scripts).
+ *
+ * KENTRY can be avoided if the symbols in question are marked as KEEP() in the
+ * linker script. For example an architecture could KEEP() its entire
+ * boot/exception vector code rather than annotate each function and data.
+ */
+#ifndef KENTRY
+# define KENTRY(sym)						\
+	extern typeof(sym) sym;					\
+	static const unsigned long __kentry_##sym		\
+	__used							\
+	__attribute__((section("___kentry" "+" #sym ), used))	\
+	= (unsigned long)&sym;
+#endif
+
 #ifndef RELOC_HIDE
 # define RELOC_HIDE(ptr, off)					\
   ({ unsigned long __ptr;					\
diff --git a/include/linux/export.h b/include/linux/export.h
index 2f9ccbe..0d1ccdd 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -1,5 +1,6 @@
 #ifndef _LINUX_EXPORT_H
 #define _LINUX_EXPORT_H
+
 /*
  * Export symbols from the kernel to modules.  Forked from module.h
  * to reduce the amount of pointless cruft we feed to gcc when only
@@ -42,27 +43,26 @@ extern struct module __this_module;
 #ifdef CONFIG_MODVERSIONS
 /* Mark the CRC weak since genksyms apparently decides not to
  * generate a checksums for some symbols */
-#define __CRC_SYMBOL(sym, sec)					\
-	extern __visible void *__crc_##sym __attribute__((weak));		\
-	static const unsigned long __kcrctab_##sym		\
-	__used							\
-	__attribute__((section("___kcrctab" sec "+" #sym), unused))	\
+#define __CRC_SYMBOL(sym, sec)						\
+	extern __visible void *__crc_##sym __attribute__((weak));	\
+	static const unsigned long __kcrctab_##sym			\
+	__used								\
+	__attribute__((section("___kcrctab" sec "+" #sym), used))	\
 	= (unsigned long) &__crc_##sym;
 #else
 #define __CRC_SYMBOL(sym, sec)
 #endif
 
 /* For every exported symbol, place a struct in the __ksymtab section */
-#define ___EXPORT_SYMBOL(sym, sec)				\
-	extern typeof(sym) sym;					\
-	__CRC_SYMBOL(sym, sec)					\
-	static const char __kstrtab_##sym[]			\
-	__attribute__((section("__ksymtab_strings"), aligned(1))) \
-	= VMLINUX_SYMBOL_STR(sym);				\
-	extern const struct kernel_symbol __ksymtab_##sym;	\
-	__visible const struct kernel_symbol __ksymtab_##sym	\
-	__used							\
-	__attribute__((section("___ksymtab" sec "+" #sym), unused))	\
+#define ___EXPORT_SYMBOL(sym, sec)					\
+	extern typeof(sym) sym;						\
+	__CRC_SYMBOL(sym, sec)						\
+	static const char __kstrtab_##sym[]				\
+	__attribute__((section("__ksymtab_strings"), aligned(1)))	\
+	= VMLINUX_SYMBOL_STR(sym);					\
+	static const struct kernel_symbol __ksymtab_##sym		\
+	__used								\
+	__attribute__((section("___ksymtab" sec "+" #sym), used))	\
 	= { (unsigned long)&sym, __kstrtab_##sym }
 
 #if defined(__KSYM_DEPS__)
diff --git a/include/linux/init.h b/include/linux/init.h
index aedb254..813f9d1 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -156,24 +156,8 @@ extern bool initcall_debug;
 
 #ifndef __ASSEMBLY__
 
-#ifdef CONFIG_LTO
-/* Work around a LTO gcc problem: when there is no reference to a variable
- * in a module it will be moved to the end of the program. This causes
- * reordering of initcalls which the kernel does not like.
- * Add a dummy reference function to avoid this. The function is
- * deleted by the linker.
- */
-#define LTO_REFERENCE_INITCALL(x) \
-	; /* yes this is needed */			\
-	static __used __exit void *reference_##x(void)	\
-	{						\
-		return &x;				\
-	}
-#else
-#define LTO_REFERENCE_INITCALL(x)
-#endif
-
-/* initcalls are now grouped by functionality into separate 
+/*
+ * initcalls are now grouped by functionality into separate
  * subsections. Ordering inside the subsections is determined
  * by link order. 
  * For backwards compatibility, initcall() puts the call in 
@@ -181,12 +165,16 @@ extern bool initcall_debug;
  *
  * The `id' arg to __define_initcall() is needed so that multiple initcalls
  * can point at the same handler without causing duplicate-symbol build errors.
+ *
+ * Initcalls are run by placing pointers in initcall sections that the
+ * kernel iterates at runtime. The linker can do dead code / data elimination
+ * and remove that completely, so the initcall sections have to be marked
+ * as KEEP() in the linker script.
  */
 
 #define __define_initcall(fn, id) \
 	static initcall_t __initcall_##fn##id __used \
-	__attribute__((__section__(".initcall" #id ".init"))) = fn; \
-	LTO_REFERENCE_INITCALL(__initcall_##fn##id)
+	__attribute__((__section__(".initcall" #id ".init"))) = fn;
 
 /*
  * Early initcalls run before initializing SMP.
@@ -222,15 +210,15 @@ extern bool initcall_debug;
 
 #define __initcall(fn) device_initcall(fn)
 
-#define __exitcall(fn) \
+#define __exitcall(fn)						\
 	static exitcall_t __exitcall_##fn __exit_call = fn
 
-#define console_initcall(fn) \
-	static initcall_t __initcall_##fn \
+#define console_initcall(fn)					\
+	static initcall_t __initcall_##fn			\
 	__used __section(.con_initcall.init) = fn
 
-#define security_initcall(fn) \
-	static initcall_t __initcall_##fn \
+#define security_initcall(fn)					\
+	static initcall_t __initcall_##fn			\
 	__used __section(.security_initcall.init) = fn
 
 struct obs_kernel_param {
diff --git a/init/Makefile b/init/Makefile
index 7bc47ee..c4fb455 100644
--- a/init/Makefile
+++ b/init/Makefile
@@ -2,6 +2,8 @@
 # Makefile for the linux kernel.
 #
 
+ccflags-y := -fno-function-sections -fno-data-sections
+
 obj-y                          := main.o version.o mounts.o
 ifneq ($(CONFIG_BLK_DEV_INITRD),y)
 obj-y                          += noinitramfs.o
-- 
2.8.1



* [PATCH 3/3] kbuild: add arch specific post-link Makefile
  2016-08-24 12:29 [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections Nicholas Piggin
  2016-08-24 12:29 ` [PATCH 1/3] kbuild: allow architectures to use thin archives instead of ld -r Nicholas Piggin
  2016-08-24 12:29 ` [PATCH 2/3] kbuild: allow archs to select link dead code/data elimination Nicholas Piggin
@ 2016-08-24 12:29 ` Nicholas Piggin
  2016-08-24 13:32   ` Nicholas Piggin
  2016-08-24 13:06 ` [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections Arnd Bergmann
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2016-08-24 12:29 UTC (permalink / raw)
  To: Michal Marek, linux-kbuild
  Cc: Nicholas Piggin, linux-arch, Sam Ravnborg, Stephen Rothwell,
	Arnd Bergmann, Nicolas Pitre, Segher Boessenkool, Alan Modra

Allow architectures to create arch/xxx/Makefile.postlink with targets
for vmlinux, modules.ko, and clean, which will be invoked after final
linking of vmlinux and modules.

powerpc will use this to check vmlinux linker relocations for sanity,
and may use it to fix up alternate instruction patch branch addresses.
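
The decision made after each final link can be sketched in shell. This
is an illustration of the logic only: postlink_cmd is a hypothetical
helper that prints the command instead of running it, and x86 merely
stands in for an architecture that does not provide the file.

```shell
#!/bin/sh
# postlink_cmd is a hypothetical helper that mirrors the Makefile logic:
# invoke the arch makefile on the target if it exists, otherwise no-op.
postlink_cmd()
{
	srctree=$1
	srcarch=$2
	target=$3
	postlink="${srctree}/arch/${srcarch}/Makefile.postlink"

	if [ -f "${postlink}" ]; then
		echo "make -f ${postlink} ${target}"
	else
		echo "true"
	fi
}

# Throwaway tree in which only one architecture provides the file.
tree=$(mktemp -d)
mkdir -p "${tree}/arch/powerpc" "${tree}/arch/x86"
: > "${tree}/arch/powerpc/Makefile.postlink"

postlink_cmd "${tree}" powerpc vmlinux
postlink_cmd "${tree}" x86 vmlinux
rm -rf "${tree}"
```

Keying on the presence of the file means an architecture opts in simply
by adding arch/xxx/Makefile.postlink.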

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

---
Since v1,
- Switched to a more flexible arch makefile invocation.
- Provide a powerpc patch to use it to help existing build issue
  (rather than only justification being out-of-tree patch).

Since v2
- Depend on existence of Makefile.postlink, rather than config option.
- Add a clean target
- Move post-vmlinux invocation into Makefile rather than link-vmlinux.sh
- Arch postlink must always be done after final link, not on an if_changed
  basis, because the vmlinux itself is not a dependency.

 Documentation/kbuild/makefiles.txt | 16 ++++++++++++++++
 Makefile                           | 10 +++++++---
 arch/Kconfig                       |  7 +++++++
 scripts/Makefile.modpost           | 14 +++++++++-----
 4 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/Documentation/kbuild/makefiles.txt b/Documentation/kbuild/makefiles.txt
index 13f888a..16841a7 100644
--- a/Documentation/kbuild/makefiles.txt
+++ b/Documentation/kbuild/makefiles.txt
@@ -41,6 +41,7 @@ This document describes the Linux kernel Makefiles.
 	   --- 6.8 Custom kbuild commands
 	   --- 6.9 Preprocessing linker scripts
 	   --- 6.10 Generic header files
+	   --- 6.11 Post-link pass
 
 	=== 7 Kbuild syntax for exported headers
 		--- 7.1 header-y
@@ -1236,6 +1237,21 @@ When kbuild executes, the following steps are followed (roughly):
 	to list the file in the Kbuild file.
 	See "7.4 generic-y" for further info on syntax etc.
 
+--- 6.11 Post-link pass
+
+	If the file arch/xxx/Makefile.postlink exists, this makefile
+	will be invoked for post-link objects (vmlinux and modules.ko)
+	for architectures to run post-link passes on. Must also handle
+	the clean target.
+
+	This pass runs after kallsyms generation. If the architecture
+	needs to modify symbol locations, rather than manipulate the
+	kallsyms, it may be easier to add another postlink target for
+	.tmp_vmlinux? targets to be called from link-vmlinux.sh.
+
+	For example, powerpc uses this to check relocation sanity of
+	the linked vmlinux file.
+
 === 7 Kbuild syntax for exported headers
 
 The kernel includes a set of headers that is exported to userspace.
diff --git a/Makefile b/Makefile
index b29c6c0..9c6992f 100644
--- a/Makefile
+++ b/Makefile
@@ -967,9 +967,12 @@ endif
 include/generated/autoksyms.h: FORCE
 	$(Q)$(CONFIG_SHELL) $(srctree)/scripts/adjust_autoksyms.sh true
 
-# Final link of vmlinux
-      cmd_link-vmlinux = $(CONFIG_SHELL) $< $(LD) $(LDFLAGS) $(LDFLAGS_vmlinux)
-quiet_cmd_link-vmlinux = LINK    $@
+ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
+
+# Final link of vmlinux with optional arch pass after final link
+    cmd_link-vmlinux =                                                 \
+	$(CONFIG_SHELL) $< $(LD) $(LDFLAGS) $(LDFLAGS_vmlinux) ;       \
+	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
 vmlinux: scripts/link-vmlinux.sh vmlinux_prereq $(vmlinux-deps) FORCE
 	+$(call if_changed,link-vmlinux)
@@ -1268,6 +1271,7 @@ $(clean-dirs):
 
 vmlinuxclean:
 	$(Q)$(CONFIG_SHELL) $(srctree)/scripts/link-vmlinux.sh clean
+	$(Q)$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) clean)
 
 clean: archclean vmlinuxclean
 
diff --git a/arch/Kconfig b/arch/Kconfig
index 94138e5..c8a1677 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -424,6 +424,13 @@ config CC_STACKPROTECTOR_STRONG
 
 endchoice
 
+config BUILD_ARCH_POSTLINK
+	bool
+	help
+	  Select this if the architecture wants to have a Makefile invoked
+	  on modules and vmlinux after they are linked. The architecture
+	  must provide arch/?/Makefile.postlink
+
 config THIN_ARCHIVES
 	bool
 	help
diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
index 1366a94..16923ba 100644
--- a/scripts/Makefile.modpost
+++ b/scripts/Makefile.modpost
@@ -115,14 +115,18 @@ $(modules:.ko=.mod.o): %.mod.o: %.mod.c FORCE
 
 targets += $(modules:.ko=.mod.o)
 
-# Step 6), final link of the modules
+ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
+
+# Step 6), final link of the modules with optional arch pass after final link
 quiet_cmd_ld_ko_o = LD [M]  $@
-      cmd_ld_ko_o = $(LD) -r $(LDFLAGS)                                 \
-                             $(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE) \
-                             -o $@ $(filter-out FORCE,$^)
+      cmd_ld_ko_o =                                                     \
+	$(LD) -r $(LDFLAGS)                                             \
+                 $(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE)             \
+                 -o $@ $(filter-out FORCE,$^) ;                         \
+	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
 $(modules): %.ko :%.o %.mod.o FORCE
-	$(call if_changed,ld_ko_o)
+	+$(call if_changed,ld_ko_o)
 
 targets += $(modules)
 
-- 
2.8.1
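
For readers following the thread, here is a minimal sketch of what an
arch/xxx/Makefile.postlink could look like under this patch. It is not
taken from the series: the check_vmlinux.sh script and its path are
hypothetical placeholders, and it assumes that srctree and CONFIG_SHELL
are exported by the top-level Makefile. It only illustrates the three
targets the documentation hunk asks for (vmlinux, modules, clean).

```makefile
# arch/xxx/Makefile.postlink (hypothetical sketch, not part of the patch)
PHONY := __archpost
__archpost:

# Post-link pass on vmlinux. check_vmlinux.sh is a made-up placeholder
# for whatever check or fixup the architecture needs.
vmlinux: FORCE
	@echo "  POSTLINK $@"
	@$(CONFIG_SHELL) $(srctree)/arch/xxx/tools/check_vmlinux.sh $@

# Modules are passed in as the target name; nothing to do in this sketch.
%.ko: FORCE
	@true

# Called from the top-level vmlinuxclean target.
clean:
	@true

PHONY += FORCE clean
FORCE:

.PHONY: $(PHONY)
```

Because the top-level rules run "$(MAKE) -f $(ARCH_POSTLINK) $@", the
target name is the linked object itself, so a FORCE prerequisite is
used here to make the recipe run even though the file already exists.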



* Re: [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections
  2016-08-24 12:29 [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections Nicholas Piggin
                   ` (2 preceding siblings ...)
  2016-08-24 12:29 ` [PATCH 3/3] kbuild: add arch specific post-link Makefile Nicholas Piggin
@ 2016-08-24 13:06 ` Arnd Bergmann
  2016-08-24 15:13   ` Nicolas Pitre
  2016-08-24 14:21 ` David Howells
  2016-09-09  0:23 ` Nicolas Pitre
  5 siblings, 1 reply; 13+ messages in thread
From: Arnd Bergmann @ 2016-08-24 13:06 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Michal Marek, linux-kbuild, linux-arch, Sam Ravnborg,
	Stephen Rothwell, Nicolas Pitre, Segher Boessenkool, Alan Modra

On Wednesday, August 24, 2016 10:29:18 PM CEST Nicholas Piggin wrote:
> Hi Michal,
> 
> I ended up deciding to do a v3, because I had several changes
> accumulated, as described in patches.
> 
> I've also left off the powerpc arch patches -- they can be found
> in previous posts, for reference.
> 
> I've again tested ARM and it seems to be building okay and without
> performance regression with my configurations. I think it's going
> to be a matter of some toolchain options for them to go through.
> arm64, x86, powerpc, and arm for me all built fine with thin archives
> and --gc-sections enabled, so I can't see there being a fundamental
> issue that can't be solved. Worst case, the incremental link option
> can remain for a time.
> 

I clearly want to use this on ARM, and I am still investigating
the performance regression that I see on my build box, but that
is no reason to delay your patches.

Everything looks good to me,

Acked-by: Arnd Bergmann <arnd@arndb.de>


* Re: [PATCH 3/3] kbuild: add arch specific post-link Makefile
  2016-08-24 12:29 ` [PATCH 3/3] kbuild: add arch specific post-link Makefile Nicholas Piggin
@ 2016-08-24 13:32   ` Nicholas Piggin
  0 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2016-08-24 13:32 UTC (permalink / raw)
  To: Michal Marek, linux-kbuild
  Cc: linux-arch, Sam Ravnborg, Stephen Rothwell, Arnd Bergmann,
	Nicolas Pitre, Segher Boessenkool, Alan Modra

On Wed, 24 Aug 2016 22:29:21 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> Allow architectures to create arch/xxx/Makefile.postlink with targets
> for vmlinux, modules.ko, and clean, which will be invoked after final
> linking of vmlinux and modules.
> 
> powerpc will use this to check vmlinux linker relocations for sanity,
> and may use it to fix up alternate instruction patch branch addresses.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>


> diff --git a/arch/Kconfig b/arch/Kconfig
> index 94138e5..c8a1677 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -424,6 +424,13 @@ config CC_STACKPROTECTOR_STRONG
>  
>  endchoice
>  
> +config BUILD_ARCH_POSTLINK
> +	bool
> +	help
> +	  Select this if the architecture wants to have a Makefile invoked
> +	  on modules and vmlinux after they are linked. The architecture
> +	  must provide arch/?/Makefile.postlink
> +
>  config THIN_ARCHIVES
>  	bool
>  	help

Argh! Sorry, this hunk crept back in again. There is no other
reference to BUILD_ARCH_POSTLINK in the patch, so simply removing
the hunk won't invalidate testing.

Thanks,
Nick


* Re: [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections
  2016-08-24 12:29 [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections Nicholas Piggin
                   ` (3 preceding siblings ...)
  2016-08-24 13:06 ` [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections Arnd Bergmann
@ 2016-08-24 14:21 ` David Howells
  2016-08-24 15:21   ` Nicolas Pitre
  2016-09-09  0:23 ` Nicolas Pitre
  5 siblings, 1 reply; 13+ messages in thread
From: David Howells @ 2016-08-24 14:21 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: dhowells, Michal Marek, linux-kbuild, linux-arch, Sam Ravnborg,
	Stephen Rothwell, Arnd Bergmann, Nicolas Pitre,
	Segher Boessenkool, Alan Modra

Out of interest, does using LTO also fix the problem?

David


* Re: [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections
  2016-08-24 13:06 ` [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections Arnd Bergmann
@ 2016-08-24 15:13   ` Nicolas Pitre
  0 siblings, 0 replies; 13+ messages in thread
From: Nicolas Pitre @ 2016-08-24 15:13 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Nicholas Piggin, Michal Marek, linux-kbuild, linux-arch,
	Sam Ravnborg, Stephen Rothwell, Segher Boessenkool, Alan Modra

On Wed, 24 Aug 2016, Arnd Bergmann wrote:

> On Wednesday, August 24, 2016 10:29:18 PM CEST Nicholas Piggin wrote:
> > Hi Michal,
> > 
> > I ended up deciding to do a v3, because I had several changes
> > accumulated, as described in patches.
> > 
> > I've also left off the powerpc arch patches -- they can be found
> > in previous posts, for reference.
> > 
> > I've again tested ARM and it seems to be building okay and without
> > performance regression with my configurations. I think it's going
> > to be a matter of some toolchain options for them to go through.
> > arm64, x86, powerpc, and arm for me all built fine with thin archives
> > and --gc-sections enabled, so I can't see there being a fundamental
> > issue that can't be solved. Worst case, the incremental link option
> > can remain for a time.
> > 
> 
> I clearly want to use this on ARM, and I am still investigating
> the performance regression that I see on my build box, but that
> is no reason to delay your patches.
> 
> Everything looks good to me,
> 
> Acked-by: Arnd Bergmann <arnd@arndb.de>

Ditto here.

Acked-by: Nicolas Pitre <nico@linaro.org>


Nicolas


* Re: [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections
  2016-08-24 14:21 ` David Howells
@ 2016-08-24 15:21   ` Nicolas Pitre
  2016-08-25  2:56     ` Nicholas Piggin
  0 siblings, 1 reply; 13+ messages in thread
From: Nicolas Pitre @ 2016-08-24 15:21 UTC (permalink / raw)
  To: David Howells
  Cc: Nicholas Piggin, Michal Marek, linux-kbuild, linux-arch,
	Sam Ravnborg, Stephen Rothwell, Arnd Bergmann,
	Segher Boessenkool, Alan Modra

On Wed, 24 Aug 2016, David Howells wrote:

> Out of interest, does using LTO also fix the problem?

With those patches in place, that would be the next thing to try.
Reducing our reliance on 'ld -r' will greatly help promoting LTO for the
kernel.

Personally I'd like to have the choice between LTO and -gc-sections at 
configure time.  They both have advantages of their own.


Nicolas


* Re: [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections
  2016-08-24 15:21   ` Nicolas Pitre
@ 2016-08-25  2:56     ` Nicholas Piggin
  0 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2016-08-25  2:56 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: David Howells, Michal Marek, linux-kbuild, linux-arch,
	Sam Ravnborg, Stephen Rothwell, Arnd Bergmann,
	Segher Boessenkool, Alan Modra

On Wed, 24 Aug 2016 11:21:33 -0400 (EDT)
Nicolas Pitre <nicolas.pitre@linaro.org> wrote:

> On Wed, 24 Aug 2016, David Howells wrote:
> 
> > Out of interest, does using LTO also fix the problem?  
> 
> With those patches in place, that would be the next thing to try.
> Reducing our reliance on 'ld -r' will greatly help promoting LTO for the
> kernel.
> 
> Personally I'd like to have the choice between LTO and -gc-sections at 
> configure time.  They both have advantages of their own.


We discussed this in previous rounds of patches, but just to
expand on Nicolas' answer with some overview:

- Thin archives are needed for linking large kernels on some ISAs.
  I believe the linker becomes constrained in where it can use
  branch stubs when linking large inputs, but I haven't looked into
  the internals exactly. A number of other ways have been proposed
  to solve it, but archives are well understood by the toolchain
  and look like quite an elegant solution (with other benefits such
  as reduced build output size and helping to enable LTO).

- The gc-sections patch is mainly to address some small regressions
  in binary size with the first patch, but I think the results
  make it stand on its own. It's very fast, mature, does not
  transform code, and gives surprisingly good size savings without
  much enablement work. The work that is required (e.g., to
  annotate entry points) is mostly shared with LTO if we add that
  in future, so it's not dead-end cruft.

- LTO: Nicolas has posted far more significant size improvements
  with his more advanced work with reference annotation and LTO.
  I'm surprised there has not been more interest, but I hope if
  we get this merged, it might give him motivation to look at it
  again. Discussion seemed to die down last time with people
  saying we should look at gc-sections first. 
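
To make the first two points concrete, here is a small stand-alone
sketch outside of kbuild. The file names (main.c, a.c, b.c, prog) are
hypothetical, and it assumes GNU ar and GNU ld driven through the
compiler driver. It is only roughly analogous to what the series does
for built-in objects and vmlinux.

```makefile
# Stand-alone sketch; main.c, a.c, b.c and prog are hypothetical names.
all: prog

# Put each function and data object in its own section so the linker
# can discard unreferenced ones.
CFLAGS += -ffunction-sections -fdata-sections
objs   := a.o b.o

# GNU ar 'T' modifier: build a thin archive that references the member
# object files on disk instead of copying them (in place of 'ld -r').
built-in.a: $(objs)
	rm -f $@
	$(AR) rcsT $@ $^

# --whole-archive pulls in every member, as an 'ld -r' object would be;
# --gc-sections then drops sections unreachable from the entry point.
prog: main.o built-in.a
	$(CC) -o $@ main.o -Wl,--whole-archive built-in.a \
		-Wl,--no-whole-archive -Wl,--gc-sections
```

In a linker script, sections that nothing references directly but that
must survive (entry points, tables walked at runtime) would need
KEEP(), which is the kind of annotation work mentioned above.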

I see this as an enabler of LTO rather than a replacement. LTO is
a bigger change to the build and less mature, but long term it is
the right way to go IMO. Once we iron out the toolchains, and
perhaps have an LTO option for just static elimination rather than
transformations (for high speed / low optimization builds, etc.),
gc-sections would be completely supplanted and could be removed.
Using sections for dce is a nice linker hack, but real LTO
information seems cleaner in the end (and of course allows more
optimization).

Thanks,
Nick


* Re: [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections
  2016-08-24 12:29 [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections Nicholas Piggin
                   ` (4 preceding siblings ...)
  2016-08-24 14:21 ` David Howells
@ 2016-09-09  0:23 ` Nicolas Pitre
  2016-09-09 10:59   ` Michal Marek
  5 siblings, 1 reply; 13+ messages in thread
From: Nicolas Pitre @ 2016-09-09  0:23 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Michal Marek, linux-kbuild, linux-arch, Sam Ravnborg,
	Stephen Rothwell, Arnd Bergmann, Segher Boessenkool, Alan Modra

On Wed, 24 Aug 2016, Nicholas Piggin wrote:

> Hi Michal,
> 
> I ended up deciding to do a v3, because I had several changes
> accumulated, as described in patches.

What is the prospect of those patches going upstream?

I ask because I didn't see them in the kbuild repo yet, nor in next for 
that matter.


Nicolas


* Re: [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections
  2016-09-09  0:23 ` Nicolas Pitre
@ 2016-09-09 10:59   ` Michal Marek
  2016-09-10  4:05     ` Nicolas Pitre
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Marek @ 2016-09-09 10:59 UTC (permalink / raw)
  To: Nicolas Pitre, Nicholas Piggin
  Cc: linux-kbuild, linux-arch, Sam Ravnborg, Stephen Rothwell,
	Arnd Bergmann, Segher Boessenkool, Alan Modra

On 9.9.2016 at 02:23, Nicolas Pitre wrote:
> On Wed, 24 Aug 2016, Nicholas Piggin wrote:
> 
>> Hi Michal,
>>
>> I ended up deciding to do a v3, because I had several changes
>> accumulated, as described in patches.
> 
> What is the prospect of those patches going upstream?
> 
> I ask because I didn't see them in the kbuild repo yet, nor in next for 
> that matter.

I merged the patches into the kbuild.git repository now.

Michal



* Re: [PATCH 0/3 v3] kbuild changes, thin archives, --gc-sections
  2016-09-09 10:59   ` Michal Marek
@ 2016-09-10  4:05     ` Nicolas Pitre
  0 siblings, 0 replies; 13+ messages in thread
From: Nicolas Pitre @ 2016-09-10  4:05 UTC (permalink / raw)
  To: Michal Marek
  Cc: Nicholas Piggin, linux-kbuild, linux-arch, Sam Ravnborg,
	Stephen Rothwell, Arnd Bergmann, Segher Boessenkool, Alan Modra

On Fri, 9 Sep 2016, Michal Marek wrote:

> On 9.9.2016 at 02:23, Nicolas Pitre wrote:
> > On Wed, 24 Aug 2016, Nicholas Piggin wrote:
> > 
> >> Hi Michal,
> >>
> >> I ended up deciding to do a v3, because I had several changes
> >> accumulated, as described in patches.
> > 
> > What is the prospect of those patches going upstream?
> > 
> > I ask because I didn't see them in the kbuild repo yet, nor in next for 
> > that matter.
> 
> I merged the patches into the kbuild.git repository now.

Excellent, thanks.


Nicolas
